Publication
NAACL 2019
Conference paper
Word-node2vec: Improving word embedding with document-level non-local word co-occurrences
Abstract
Standard word embedding algorithms, such as word2vec and GloVe, make the restrictive assumption that words are likely to be semantically related only if they co-occur locally within a window of fixed size. This assumption may fail to capture the semantic association between words that co-occur frequently but non-locally within documents. To alleviate this restriction, in this paper we propose a graph-based word embedding method, named 'word-node2vec'. By relaxing the strong locality constraint, our method is able to capture both local and non-local co-occurrences. Word-node2vec constructs a weighted graph, where each node represents a word and the weight of an edge between two nodes combines both local (e.g. word2vec-style) and document-level co-occurrences. Our experiments show that word-node2vec outperforms word2vec and GloVe on a range of tasks, such as word-pair similarity prediction, word analogy and concept categorization.
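To illustrate the idea described in the abstract, the following is a minimal sketch of building a weighted word graph that mixes window-based (local) and document-level (non-local) co-occurrence evidence. It is not the authors' implementation: the count-based weighting, the mixing parameter alpha, and the window size are illustrative assumptions; the paper defines its own edge-weight formulation, and node2vec-style random walks would then be run on the resulting graph to learn embeddings.

```python
from collections import defaultdict
from itertools import combinations

def build_word_graph(documents, window=5, alpha=0.5):
    """Sketch: combine local and document-level co-occurrence counts.

    documents: list of tokenized documents (each a list of word strings).
    Returns a dict mapping word pairs (sorted tuples) to edge weights.
    """
    local = defaultdict(float)      # counts within a sliding window
    non_local = defaultdict(float)  # counts of co-occurrence anywhere in a document

    for doc in documents:
        # Local (window-based) co-occurrences, as in word2vec-style contexts.
        for i, w in enumerate(doc):
            context = doc[max(0, i - window):i] + doc[i + 1:i + 1 + window]
            for c in context:
                if c != w:
                    local[tuple(sorted((w, c)))] += 1.0
        # Document-level (non-local) co-occurrences: every distinct word pair
        # appearing in the same document.
        for w, c in combinations(sorted(set(doc)), 2):
            non_local[(w, c)] += 1.0

    # Edge weight as a convex combination of the two evidence sources
    # (alpha is an assumed mixing parameter, not taken from the paper).
    edges = {}
    for pair in set(local) | set(non_local):
        edges[pair] = alpha * local.get(pair, 0.0) + (1 - alpha) * non_local.get(pair, 0.0)
    return edges
```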