word2vec code example
Continuing from yesterday’s post, this one documents an example word2vec notebook using Gensim. I went for a super basic example with the following steps:
- Queried the keyword `Neonicotinoids` through `pymed`, a Python-based PubMed API, with the maximum query size set to 5000
- Used the lower-cased keywords of each article as the sentence for training word2vec word embeddings
- Built both a CBOW and a Skip-gram model and checked how the vector for one single word (`imidacloprid`) differed between the two
- Printed out the top 10 most similar word vectors to `imidacloprid` under both methods
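The steps above can be sketched roughly as follows. This is a minimal sketch, not the notebook's exact code: the function names, the `tool` and `email` placeholders, and the hyperparameters (`vector_size`, `window`, `min_count`, `seed`) are my assumptions, and the argument names assume Gensim 4.x (in 3.x, `vector_size` was called `size`). Each keyword is kept as one token, which may differ from how the notebook tokenizes.

```python
def keywords_to_sentence(keywords):
    """Turn one article's keyword list into a lower-cased token list."""
    # Some records have no keywords or contain empty entries; skip those.
    return [kw.strip().lower() for kw in (keywords or []) if kw and kw.strip()]


def fetch_keyword_sentences(query="Neonicotinoids", max_results=5000):
    """Query PubMed through pymed and return one 'sentence' per article."""
    # Imported here so the helper above works without pymed installed.
    from pymed import PubMed

    # tool and email are placeholder values identifying the client to PubMed.
    pubmed = PubMed(tool="word2vec-demo", email="you@example.com")
    sentences = []
    for article in pubmed.query(query, max_results=max_results):
        # Not every record type exposes keywords, hence getattr.
        sentence = keywords_to_sentence(getattr(article, "keywords", None))
        if sentence:
            sentences.append(sentence)
    return sentences


def train_models(sentences, seed=1):
    """Train a CBOW (sg=0) and a Skip-gram (sg=1) model on the same data."""
    from gensim.models import Word2Vec  # Gensim 4.x argument names

    # Illustrative hyperparameters, not necessarily the notebook's values.
    params = dict(vector_size=100, window=5, min_count=1, workers=1, seed=seed)
    cbow = Word2Vec(sentences=sentences, sg=0, **params)
    skipgram = Word2Vec(sentences=sentences, sg=1, **params)
    return cbow, skipgram


def main(word="imidacloprid"):
    """Run the whole pipeline; needs network access, pymed and gensim."""
    sentences = fetch_keyword_sentences()
    cbow, skipgram = train_models(sentences)
    for name, model in [("CBOW", cbow), ("Skip-gram", skipgram)]:
        if word not in model.wv:
            print(f"{name}: '{word}' is not in the vocabulary")
            continue
        # Same word, different vector under each training method.
        print(name, "first 5 dimensions:", model.wv[word][:5])
        for neighbour, score in model.wv.most_similar(word, topn=10):
            print(f"  {neighbour}\t{score:.3f}")
```

Calling `main()` runs everything end to end. The only difference between the two models is the `sg` flag, so any difference in the neighbours of `imidacloprid` comes from the training objective rather than the data.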
CBOW - most similar word vectors to imidacloprid
Skip-gram - most similar word vectors to imidacloprid
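Both rankings are ordered by cosine similarity, which is what Gensim's `most_similar` uses to score neighbours. As a rough illustration of that ranking, here is a pure-Python sketch; the 3-dimensional vectors are invented for the example and are not taken from the trained models.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=10):
    """Rank every other word by cosine similarity to `word`, highest first."""
    target = vectors[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


# Toy vectors, made up purely for illustration.
toy_vectors = {
    "imidacloprid": [1.0, 0.0, 0.0],
    "thiamethoxam": [0.9, 0.1, 0.0],
    "honey bees": [0.0, 1.0, 0.0],
}

ranking = most_similar("imidacloprid", toy_vectors, topn=2)
for neighbour, score in ranking:
    print(f"{neighbour}\t{score:.3f}")
```

A word whose vector points in nearly the same direction as the query scores close to 1, while an orthogonal vector scores 0, regardless of vector length.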
The detailed example, along with citations for parts of the example code, can be found in my notebook here.