An efficient action detection from first person vision with attention model

Bibliographic Details
Main authors: Straminsky, Axel; Jacobo, Julio; Buemi, María Elena
Format: Conference object
Language: English
Published: 2021
Online access: http://sedici.unlp.edu.ar/handle/10915/141291
http://50jaiio.sadio.org.ar/pdfs/saiv/SAIV-08.pdf
Description
Summary: The goal of this work is to propose possible improvements to one of the latest video action recognition models based on existing attention mechanisms. We took a model architecture that uses two sub-models in parallel, one based on optical flow and the other on the video itself, and proposed the following improvements: adding mixed precision to the training loop, using the Ranger optimizer instead of SGD, and expanding the attention mechanism. The video database used for this work was EGTEA+, an action database of first-person videos of daily activities.