Improvement: Previous methods don’t account for documents’ meaning and treat similar words as unique. Word2Vec is a neural network based technique to create word representations which allows measurement of similarity between words. So words “king” and “emperor” will not be treated as completely different words.
Technical: The 2 common techniques under this paradigm are
- CBoW (Continuous Bag of Words) – a neural network tries to predict the missing word given the words around it
- SkipGram – a neural network, given a word, tries to predict the words around it