Hash2Vec: Feature Hashing for Word Embeddings

In this paper we propose the application of feature hashing to create word embeddings for natural language processing. Feature hashing has been used successfully to create document vectors in related tasks like document classification. In this work we show that feature hashing can be applied to obta...
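To illustrate the idea summarized above, here is a minimal sketch of feature hashing applied to word contexts. It is not the authors' implementation: the names `hashed_embeddings` and `cosine`, the `dim` and `window` parameters, the MD5-based hash functions, and the signed-hashing scheme are all assumptions made for illustration. Each word's vector is built in a single pass by hashing its neighbouring words into a fixed number of buckets, so no training step is involved and the work grows linearly with the number of tokens.

```python
import hashlib
import math
from collections import defaultdict


def _hash(token, salt):
    # Deterministic integer hash of a token (assumed choice: MD5 with a salt).
    digest = hashlib.md5((salt + token).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def hashed_embeddings(tokens, dim=16, window=2):
    # One pass over the corpus: each token touches at most 2 * window
    # neighbours, so the cost is linear in len(tokens).
    vectors = defaultdict(lambda: [0.0] * dim)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j == i:
                continue
            ctx = tokens[j]
            index = _hash(ctx, "index") % dim  # bucket for this context word
            sign = 1.0 if _hash(ctx, "sign") % 2 == 0 else -1.0  # signed hashing
            vectors[word][index] += sign
    return dict(vectors)


def cosine(u, v):
    # Cosine similarity; returns 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


if __name__ == "__main__":
    corpus = "the cat sat on the mat the dog sat on the rug".split()
    emb = hashed_embeddings(corpus, dim=16, window=2)
    print(cosine(emb["cat"], emb["dog"]))
```

Words that occur in similar contexts accumulate weight in the same buckets, which is why their vectors tend to be close under cosine similarity. With a small `dim`, unrelated context words can collide in one bucket, so a realistic setting would presumably use a much larger dimension and corpus.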


Bibliographic Details
Main Authors: Argerich, Luis; Cano, Matías J.; Torre Zaffaroni, Joaquín
Format: Conference object
Language: English
Published: 2016
Subjects: Computer Science; feature hashing; word embedding; Natural Language Processing
Online Access: http://sedici.unlp.edu.ar/handle/10915/56977
http://45jaiio.sadio.org.ar/sites/default/files/ASAI-10_0.pdf
Contributed by:
id I19-R120-10915-56977
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
feature hashing
word embedding
Natural Language Processing
spellingShingle Computer Science
feature hashing
word embedding
Natural Language Processing
Argerich, Luis
Cano, Matías J.
Torre Zaffaroni, Joaquín
Hash2Vec: Feature Hashing for Word Embeddings
topic_facet Computer Science
feature hashing
word embedding
Natural Language Processing
description In this paper we propose the application of feature hashing to create word embeddings for natural language processing. Feature hashing has been used successfully to create document vectors in related tasks like document classification. In this work we show that feature hashing can be applied to obtain word embeddings in time linear in the size of the data. The results show that this algorithm, which does not need training, is able to capture the semantic meaning of words. We compare the results against GloVe, showing that they are similar. As far as we know, this is the first application of feature hashing to the word embeddings problem, and the results indicate this is a scalable technique with practical results for NLP applications.
format Conference object
author Argerich, Luis
Cano, Matías J.
Torre Zaffaroni, Joaquín
author_facet Argerich, Luis
Cano, Matías J.
Torre Zaffaroni, Joaquín
author_sort Argerich, Luis
title Hash2Vec: Feature Hashing for Word Embeddings
title_short Hash2Vec: Feature Hashing for Word Embeddings
title_full Hash2Vec: Feature Hashing for Word Embeddings
title_fullStr Hash2Vec: Feature Hashing for Word Embeddings
title_full_unstemmed Hash2Vec: Feature Hashing for Word Embeddings
title_sort hash2vec: feature hashing for word embeddings
publishDate 2016
url http://sedici.unlp.edu.ar/handle/10915/56977
http://45jaiio.sadio.org.ar/sites/default/files/ASAI-10_0.pdf
work_keys_str_mv AT argerichluis hash2vecfeaturehashingforwordembeddings
AT canomatiasj hash2vecfeaturehashingforwordembeddings
AT torrezaffaronijoaquin hash2vecfeaturehashingforwordembeddings
bdutipo_str Repositories
_version_ 1764820476773793795