An efficient action detection from first person vision with attention model
| Main authors: | , , |
|---|---|
| Format: | Conference object |
| Language: | English |
| Published: | 2021 |
| Subjects: | |
| Online access: | http://sedici.unlp.edu.ar/handle/10915/141291 http://50jaiio.sadio.org.ar/pdfs/saiv/SAIV-08.pdf |
| Contributed by: |
| Summary: | The goal of this work is to propose possible improvements to one of the latest models for video action recognition, based on currently existing attention mechanisms. We took a model architecture that uses two sub-models in parallel, one based on optical flow and the other on the video itself, and proposed the following improvements: adding mixed precision to the training loop, using the Ranger optimizer instead of SGD, and expanding the attention mechanism. The video dataset used for this work was EGTEA+, an action dataset of first-person videos of daily activities. |
|---|