Word2Vec With TensorFlow

Transcription

Word2Vec with TensorFlow. Deep Learning, Department of Computer Science, National Tsing Hua University, Taiwan.

Word embedding. How to represent words in machine learning? "My favorite pet is cat."
- Treat words as discrete atomic symbols, so 'cat' may be represented as '1' and 'pet' as '2' (see the sketch after this list).
- However, this embedding carries no semantics, and models learning from such a representation are hard to train.
- Better solutions?
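To make the first bullet concrete, here is a minimal Python sketch of the discrete-symbol representation; the vocabulary and the specific IDs are illustrative assumptions, not taken from the slides.

```python
# Minimal sketch: words as discrete atomic symbols (illustrative IDs, not from the slides).
word_to_id = {"my": 0, "favorite": 1, "pet": 2, "is": 3, "cat": 4}

sentence = ["my", "favorite", "pet", "is", "cat"]
ids = [word_to_id[w] for w in sentence]
print(ids)  # [0, 1, 2, 3, 4]

# The integer IDs are arbitrary: nothing about 4 ('cat') says it is more similar
# to 2 ('pet') than to 3 ('is'), so a model gets no semantic signal from them.
```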

Word embedding. Intuition: model the relationships between words, such as semantic similarity and grammar / syntax. Two models:
- CBOW (Continuous Bag-of-Words model)
- Skip-Gram

CBOW & Skip-Gram. Assumption in CBOW & Skip-Gram:
- Words with similar meaning / domain appear in similar contexts.
"My favorite pet is ___." could be filled by cat or dog.

CBOW & Skip-Gram. CBOW: predicting the target word from its context words. Skip-Gram: predicting the context words from the target word.

CBOW & Skip-Gram. For example, given the sentence "the quick brown fox jumped over the lazy dog":
- CBOW will be trained on the dataset: ([the, brown], quick), ([quick, fox], brown), ...
- Skip-Gram will be trained on the dataset: (quick, the), (quick, brown), (brown, quick), (brown, fox)
A data-generation sketch follows after this list.
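The sketch below builds Skip-Gram (target, context) pairs for the example sentence. Using tf.keras.preprocessing.sequence.skipgrams, the window size of 1, and the vocabulary indexing are my assumptions, not necessarily how the course material does it.

```python
import tensorflow as tf

# Sketch: build Skip-Gram (target, context) pairs for the example sentence.
sentence = "the quick brown fox jumped over the lazy dog".split()
vocab = sorted(set(sentence))
word_to_id = {w: i + 1 for i, w in enumerate(vocab)}   # index 0 reserved for padding
ids = [word_to_id[w] for w in sentence]

pairs, labels = tf.keras.preprocessing.sequence.skipgrams(
    ids,
    vocabulary_size=len(word_to_id) + 1,
    window_size=1,          # one context word on each side, matching the slide
    negative_samples=0.0,   # positive pairs only
    shuffle=False)

id_to_word = {i: w for w, i in word_to_id.items()}
print([(id_to_word[t], id_to_word[c]) for t, c in pairs[:4]])
# e.g. ('the', 'quick'), ('quick', 'the'), ('quick', 'brown'), ('brown', 'quick')
```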

Picture from https://kavita-ganesan.com/

Skip-Gram. Picture from https://lilianweng.github.io/

Softmax & word classification. Recap the softmax function:

$$\sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$$

While training the embedding, we need to sum over the entire vocabulary in the denominator, which is costly:

$$p(w_i \mid x) = \frac{\exp(h^\top v'_{w_i})}{\sum_{j \in V} \exp(h^\top v'_{w_j})}$$

where V (the vocabulary size) ranges from thousands to millions of words (see the sketch below).
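A small sketch of the bottleneck; the sizes below are assumptions for illustration. Computing the softmax denominator requires one score per vocabulary word at every training step.

```python
import tensorflow as tf

# Sketch of the full-softmax cost (sizes are illustrative assumptions).
V, D = 50_000, 128                      # vocabulary size, embedding dimension
h = tf.random.normal([1, D])            # hidden representation of the context
W_out = tf.random.normal([D, V])        # output vectors v'_w, one column per word

logits = tf.matmul(h, W_out)            # shape (1, V): scores for every word in V
probs = tf.nn.softmax(logits)           # the denominator sums over all V entries
print(probs.shape)                      # (1, 50000)
```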

Noise Contrastive Estimation. Intuition: estimate the parameters by learning to discriminate between the data $w_t$ and some artificially generated noise $\tilde{w}$ drawn from a noise distribution $Q$ (noise contrastive estimation in place of maximum likelihood estimation). Objective function:

$$\arg\max \; \log p(d = 1 \mid w_t, x) + \sum_{i=1}^{N} \log p(d = 0 \mid \tilde{w}_i, x)$$

Derived form:

$$\arg\max \; \log \frac{p(w_t \mid x)}{p(w_t \mid x) + N\,q(w_t)} + \sum_{i=1}^{N} \log \frac{N\,q(\tilde{w}_i)}{p(\tilde{w}_i \mid x) + N\,q(\tilde{w}_i)}$$

Noise Contrastive Estimation (continued). After training with the NCE objective, evaluate with the softmax function over the full vocabulary (a sketch follows below).
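A minimal sketch of this recipe in TensorFlow, training with tf.nn.nce_loss and using the full softmax only at evaluation time. The sizes, variable names, and the toy batch are my assumptions for illustration.

```python
import tensorflow as tf

V, D, num_sampled = 50_000, 128, 64     # illustrative sizes

embeddings = tf.Variable(tf.random.uniform([V, D], -1.0, 1.0))              # input vectors v_w
nce_weights = tf.Variable(tf.random.truncated_normal([V, D], stddev=0.1))   # output vectors v'_w
nce_biases = tf.Variable(tf.zeros([V]))

center_ids = tf.constant([7, 99])                         # toy batch of target words
context_ids = tf.constant([[12], [345]], dtype=tf.int64)  # true context words, shape (batch, 1)

h = tf.nn.embedding_lookup(embeddings, center_ids)        # (batch, D)

# Training: only the true word plus num_sampled noise words are scored per example.
loss = tf.reduce_mean(tf.nn.nce_loss(
    weights=nce_weights, biases=nce_biases,
    labels=context_ids, inputs=h,
    num_sampled=num_sampled, num_classes=V))

# Evaluation after training: the full softmax over the whole vocabulary.
logits = tf.matmul(h, nce_weights, transpose_b=True) + nce_biases  # (batch, V)
probs = tf.nn.softmax(logits)
```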

Homework. Functional API:
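The slide's code is not reproduced in the transcription; below is a hedged sketch of what a Skip-Gram-style model could look like with the Keras Functional API. The layer choices, names, and the negative-sampling-style binary objective are my assumptions, not the homework's reference solution.

```python
import tensorflow as tf

V, D = 50_000, 128                                   # illustrative sizes

target_in = tf.keras.Input(shape=(1,), name="target")
context_in = tf.keras.Input(shape=(1,), name="context")

# One shared embedding table; each input word is looked up and flattened to (batch, D).
embedding = tf.keras.layers.Embedding(V, D, name="word_embedding")
target_vec = tf.keras.layers.Flatten()(embedding(target_in))
context_vec = tf.keras.layers.Flatten()(embedding(context_in))

# Dot-product score of the (target, context) pair, trained as a binary classifier
# against sampled negative pairs (a negative-sampling-style stand-in for NCE).
score = tf.keras.layers.Dot(axes=1)([target_vec, context_vec])
output = tf.keras.layers.Activation("sigmoid")(score)

model = tf.keras.Model(inputs=[target_in, context_in], outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```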

Homework. Model subclassing:
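And a hedged sketch of the same idea with model subclassing; the class name, the separate target/context embedding tables, and the logits-based loss are my assumptions.

```python
import tensorflow as tf

class SkipGramModel(tf.keras.Model):
    """Toy Skip-Gram model: scores whether a (target, context) pair is real."""

    def __init__(self, vocab_size, embed_dim):
        super().__init__()
        self.target_embedding = tf.keras.layers.Embedding(vocab_size, embed_dim)
        self.context_embedding = tf.keras.layers.Embedding(vocab_size, embed_dim)

    def call(self, inputs):
        target_ids, context_ids = inputs                  # each of shape (batch, 1)
        t = self.target_embedding(target_ids)             # (batch, 1, D)
        c = self.context_embedding(context_ids)           # (batch, 1, D)
        # Dot-product similarity as the logit for "real pair vs. sampled noise".
        return tf.reduce_sum(t * c, axis=-1)              # (batch, 1)

model = SkipGramModel(vocab_size=50_000, embed_dim=128)
model.compile(optimizer="adam",
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True))
```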

Reference
- Noise contrastive estimation
- https://lilianweng.github.io/ (learning-word-embedding.html)
- https://aegis4048.github.io/ (optimize computational efficiency of skip-gram with negative sampling)
- https://www.tensorflow.org/api_docs/python/tf/nn/nce_loss
