Teaching SQL New Tricks: Efficient Vector Indexing with Trigrams
With the growing use of vector embeddings in areas like natural language processing and recommendation systems, the need for effective storage and retrieval methods is increasingly important. However, deploying specialized databases for vector indexing can be challenging due to resource limitations...
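| Main authors: | Rodríguez-Betancourt, Esteban; Casasola-Murillo, Edgar |
|---|---|
| Format: | Conference object |
| Language: | English |
| Published: | 2024 |
| Subjects: | Computer Science; Databases; Indexes; Natural Language Processing; Word Embeddings |
| Online access: | http://sedici.unlp.edu.ar/handle/10915/177179 |
| Contributed by: | |
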
| id | I19-R120-10915-177179 |
|---|---|
| record_format | dspace |
| spelling | I19-R120-10915-177179; 2025-03-07T20:07:00Z; http://sedici.unlp.edu.ar/handle/10915/177179; Teaching SQL New Tricks: Efficient Vector Indexing with Trigrams; Rodríguez-Betancourt, Esteban; Casasola-Murillo, Edgar; 2024-08; 2024; 2025-03-07T16:45:47Z; en; Computer Science; Databases; Indexes; Natural Language Processing; Word Embeddings; With the growing use of vector embeddings in areas like natural language processing and recommendation systems, the need for effective storage and retrieval methods is increasingly important. However, deploying specialized databases for vector indexing can be challenging due to resource limitations or operational constraints. This paper introduces a novel approach that utilizes existing trigram indexes within SQL databases to efficiently manage vector embeddings. By adapting traditional relational databases to handle high-dimensional data, organizations can use their existing infrastructure without the need to invest in new database systems. This method reduces management complexity and costs associated with maintaining separate systems for vector data. We outline the process of converting vector embeddings for trigram indexing and evaluate the performance and recall through empirical analysis. This paper aims to offer a practical solution for researchers and practitioners seeking to integrate advanced vector-based queries into their current database systems, thereby enhancing the functionality and accessibility of vector embeddings in mainstream applications.; Sociedad Argentina de Informática e Investigación Operativa; Conference object; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); application/pdf; 150-157 |
| institution | Universidad Nacional de La Plata |
| institution_str | I-19 |
| repository_str | R-120 |
| collection | SEDICI (UNLP) |
| language | English |
| topic | Computer Science; Databases; Indexes; Natural Language Processing; Word Embeddings |
| spellingShingle | Computer Science; Databases; Indexes; Natural Language Processing; Word Embeddings; Rodríguez-Betancourt, Esteban; Casasola-Murillo, Edgar; Teaching SQL New Tricks: Efficient Vector Indexing with Trigrams |
| topic_facet | Computer Science; Databases; Indexes; Natural Language Processing; Word Embeddings |
| description | With the growing use of vector embeddings in areas like natural language processing and recommendation systems, the need for effective storage and retrieval methods is increasingly important. However, deploying specialized databases for vector indexing can be challenging due to resource limitations or operational constraints. This paper introduces a novel approach that utilizes existing trigram indexes within SQL databases to efficiently manage vector embeddings. By adapting traditional relational databases to handle high-dimensional data, organizations can use their existing infrastructure without the need to invest in new database systems. This method reduces management complexity and costs associated with maintaining separate systems for vector data. We outline the process of converting vector embeddings for trigram indexing and evaluate the performance and recall through empirical analysis. This paper aims to offer a practical solution for researchers and practitioners seeking to integrate advanced vector-based queries into their current database systems, thereby enhancing the functionality and accessibility of vector embeddings in mainstream applications. |
| format | Conference object |
| author | Rodríguez-Betancourt, Esteban; Casasola-Murillo, Edgar |
| author_facet | Rodríguez-Betancourt, Esteban; Casasola-Murillo, Edgar |
| author_sort | Rodríguez-Betancourt, Esteban |
| title | Teaching SQL New Tricks: Efficient Vector Indexing with Trigrams |
| title_short | Teaching SQL New Tricks: Efficient Vector Indexing with Trigrams |
| title_full | Teaching SQL New Tricks: Efficient Vector Indexing with Trigrams |
| title_fullStr | Teaching SQL New Tricks: Efficient Vector Indexing with Trigrams |
| title_full_unstemmed | Teaching SQL New Tricks: Efficient Vector Indexing with Trigrams |
| title_sort | teaching sql new tricks: efficient vector indexing with trigrams |
| publishDate | 2024 |
| url | http://sedici.unlp.edu.ar/handle/10915/177179 |
| work_keys_str_mv | AT rodriguezbetancourtesteban teachingsqlnewtricksefficientvectorindexingwithtrigrams AT casasolamurilloedgar teachingsqlnewtricksefficientvectorindexingwithtrigrams |
| _version_ | 1847925348812980224 |