ConvAtt Network: a low parameter approach for sign language recognition

Bibliographic Details
Main authors: Ríos, Gastón Gustavo, Dal Bianco, Pedro Alejandro, Ronchetti, Franco, Ponte Ahón, Santiago Andrés, Stanchi, Oscar Agustín, Hasperué, Waldo
Format: Article
Language: English
Published: 2024
Online access: http://sedici.unlp.edu.ar/handle/10915/173739
Description
Summary: Despite recent advances in Large Language Models for text processing, Sign Language Recognition (SLR) remains an unresolved task. This is due in part to limitations in the available data. In this paper, we investigate combining 1D convolutions with transformer layers to capture local features and global interactions in a low-parameter SLR model. We experimented with multiple data augmentation and regularization techniques to classify signs of French Belgian Sign Language. We achieved a top-1 accuracy of 42.7% and a top-10 accuracy of 81.9% across 600 different signs. The model is competitive with the current state of the art while using significantly fewer parameters.
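
The summary reports top-1 and top-10 accuracy. As a minimal illustration of how such a metric is usually computed, the sketch below counts a prediction as correct when the true class is among the k highest-scoring classes. The function name `top_k_accuracy` and the toy scores are hypothetical and are not taken from the paper; the authors' evaluation code may differ.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-sample lists, one score per class (hypothetical layout).
    labels: list of true class indices, one per sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    hits = 0
    for sample_scores, label in zip(scores, labels):
        # Rank class indices by descending score and keep the first k.
        ranked = sorted(
            range(len(sample_scores)),
            key=lambda c: sample_scores[c],
            reverse=True,
        )
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy data: 4 samples, 5 classes. These numbers are made up for illustration.
toy_scores = [
    [0.10, 0.60, 0.15, 0.10, 0.05],  # true class 1 is ranked first
    [0.50, 0.15, 0.30, 0.04, 0.01],  # true class 2 is ranked second
    [0.05, 0.02, 0.13, 0.30, 0.50],  # true class 0 is ranked fourth
    [0.40, 0.25, 0.20, 0.10, 0.05],  # true class 0 is ranked first
]
toy_labels = [1, 2, 0, 0]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
top4 = top_k_accuracy(toy_scores, toy_labels, k=4)
```

With 600 sign classes, as in the summary, top-10 accuracy credits the model whenever the correct sign appears among its ten best guesses, which is why it is much higher than top-1.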