nano-JEPA: Democratizing Video Understanding with Personal Computers
The Video Joint Embedding Predictive Architecture (V-JEPA) has shown great promise in self-supervised video representation learning. However, its substantial computational demands, which often necessitate powerful GPU clusters, limit accessibility for many researchers. We introduce nano-JEPA, a streamlin...
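| Main authors: | Rostagno, Adrián; Iparraguirre, Javier; Ermantraut, Joel; Tobio, Lucas; Foissac, Segundo; Aggio, Santiago; Friedrich, Guillermo Rodolfo |
|---|---|
| Format: | Conference object |
| Language: | English |
| Published: | 2024 |
| Subjects: | |
| Online access: | http://sedici.unlp.edu.ar/handle/10915/176281 |
| Contributed by: | |
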
| id | I19-R120-10915-176281 |
|---|---|
| record_format | dspace |
| spelling | I19-R120-10915-176281; 2025-02-07T20:05:00Z; http://sedici.unlp.edu.ar/handle/10915/176281; 2024-10; 2024; 2025-02-07T16:57:48Z; en; Red de Universidades con Carreras en Informática; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); application/pdf; 94-103 |
| institution | Universidad Nacional de La Plata |
| institution_str | I-19 |
| repository_str | R-120 |
| collection | SEDICI (UNLP) |
| language | English |
| topic | Ciencias Informáticas feature prediction unsupervised learning visual representations video joint-embedding predictive architecture |
| description | The Video Joint Embedding Predictive Architecture (V-JEPA) has shown great promise in self-supervised video representation learning. However, its substantial computational demands, which often necessitate powerful GPU clusters, limit accessibility for many researchers. We introduce nano-JEPA, a streamlined adaptation of V-JEPA designed to run efficiently on resource-constrained personal computers, even those with only CPUs. Additionally, we present the nano-datasets repository, which facilitates the creation of manageable subsets from large public video datasets. Our work aims to democratize research in this field, enabling broader participation and experimentation with V-JEPA-like models. We demonstrate that nano-JEPA, trained on smaller datasets and more modest hardware, can still achieve reasonable performance on downstream tasks, opening doors for further exploration and innovation. |
| format | Conference object |
| author | Rostagno, Adrián; Iparraguirre, Javier; Ermantraut, Joel; Tobio, Lucas; Foissac, Segundo; Aggio, Santiago; Friedrich, Guillermo Rodolfo |
| title | nano-JEPA: Democratizing Video Understanding with Personal Computers |
| publishDate | 2024 |
| url | http://sedici.unlp.edu.ar/handle/10915/176281 |