word2vec code example

Continuing from yesterday’s post, this one documents an example word2vec notebook built with Gensim. I went for a super basic example with the following steps:

  • Queried PubMed for the keyword "Neonicotinoids" using pymed, a Python wrapper for the PubMed API, with the maximum number of results set to 5000
  • Used each article's lower-cased keywords as a sentence for training the word2vec word embeddings
  • Built both a CBOW and a Skip-gram model and checked how the vectors differed for a single word (imidacloprid)
  • Printed out the top 10 words most similar to imidacloprid under both methods

Figure: CBOW - most similar word vectors to imidacloprid

Figure: Skip-gram - most similar word vectors to imidacloprid
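
The steps above can be sketched roughly as follows. This is a minimal sketch, not the exact code from my notebook: it assumes Gensim 4.x (where the embedding size parameter is called vector_size) and the pymed PubMed class. The helper names, the tool name, the email placeholder, and the hyperparameters (vector_size, window, min_count) are all made up for illustration, and keeping multi-word keywords as single tokens is an assumption too.

```python
def keywords_to_sentences(keyword_lists):
    """Turn each article's keyword list into one lower-cased 'sentence'.

    Articles with no keywords (None or empty) are skipped, and so are
    empty or None entries inside a keyword list.
    """
    sentences = []
    for keywords in keyword_lists:
        tokens = [kw.strip().lower() for kw in (keywords or []) if kw and kw.strip()]
        if tokens:
            sentences.append(tokens)
    return sentences


def fetch_keyword_lists(query="Neonicotinoids", max_results=5000):
    """Query PubMed through pymed and collect the keywords of each article."""
    # Imported here so the pure-Python helper above works without pymed installed.
    from pymed import PubMed

    # tool and email are placeholders (hypothetical values); use your own.
    pubmed = PubMed(tool="word2vec-example", email="you@example.com")
    results = pubmed.query(query, max_results=max_results)
    # Some result types may not carry keywords, hence the getattr fallback.
    return [getattr(article, "keywords", None) for article in results]


def train_models(sentences):
    """Train one CBOW (sg=0) and one Skip-gram (sg=1) model on the same data."""
    # Imported here so the pure-Python helper above works without Gensim installed.
    from gensim.models import Word2Vec

    # Hyperparameters are illustrative assumptions, not tuned values.
    params = dict(vector_size=100, window=5, min_count=1)
    cbow = Word2Vec(sentences=sentences, sg=0, **params)
    skipgram = Word2Vec(sentences=sentences, sg=1, **params)
    return cbow, skipgram


def main(word="imidacloprid"):
    sentences = keywords_to_sentences(fetch_keyword_lists())
    cbow, skipgram = train_models(sentences)
    for name, model in (("CBOW", cbow), ("Skip-gram", skipgram)):
        print(f"{name} - most similar word vectors to {word}")
        # most_similar returns (word, cosine similarity) pairs.
        for neighbour, score in model.wv.most_similar(word, topn=10):
            print(f"  {neighbour}: {score:.3f}")
```

Calling main() runs the whole pipeline (it needs network access for the PubMed query). The only difference between the two models is the sg flag, so any difference in the nearest neighbours comes from the training objective rather than from the data or the other settings.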

A detailed example, along with citations for the parts of the code adapted from other sources, can be found in my notebook here.