MatrixFactorization - Cs.cmu.edu

Transcription

Matrix  Factorization1

Recovering  latent  factors  in  a  matrixn rowsm columnsv11 vij vnm2

Recovering  latent  factors  in  a  matrixK*my1a1a2. amx2y2b1b2 bm.n*Kx1 v11 vij xnyn vnm3

What  is  this  for?K*my1a1a2. amx2y2b1b2 bm.n*Kx1 v11 vij xnyn vnm4

MF  for  collaborative  @iltering5

What  is  collaborative  @iltering?

What  is  collaborative  @iltering?

What  is  collaborative  @iltering?

What  is  collaborative  @iltering?

Recovering  latent  factors  in  a  matrixn usersm moviesv11 vij vnmV[i,j] user i’s rating of movie j11

Recovering  latent  factors  in  a  matrixm moviesm moviesy1a1a2. amx2y2b1b2 bm.n usersx1 v11 vij xnyn vnmV[i,j] user i’s rating of movie j12

13

MF  for  image  modeling14

15

MF  for  imagesPC1y1a1a2. amx2y2b1b2 bm.1000 imagesx11000 * 10,000,0010,000 pixels2 prototypes v11 vij xnynPC2 vnmV[i,j] pixel j in image i

MF  for  modeling  text17

The Neatest Little Guide to Stock Market Investing Investing For Dummies, 4th Edition The Little Book of Common Sense Investing: The OnlyWay to Guarantee Your Fair Share of Stock MarketReturns The Little Book of Value Investing Value Investing: From Graham to Buffett and Beyond Rich Dad’s Guide to Investing: What the Rich Invest in,That the Poor and the Middle Class Do Not! Investing in Real Estate, 5th Edition Stock Investing For Dummies Rich Dad’s Advisors: The ABC’s of Real EstateInvesting: The Secrets of Finding Hidden Profits MostInvestors tent-semantic-analysis-lsa-tutorial/

TFIDF counts would be latent-semantic-analysis-lsa-tutorial/

Recovering  latent  factors  in  a  matrixdoc term matrixm termsy1a1a2. amx2y2b1b2 bm.n documentsx1 v11 vij xnyn vnmV[i,j] TFIDF score of term j in doc i20

Investing forreal estateRich Dad’sAdvisor’s: TheABCs of RealEstateInvestment

The little bookof commonsense investing: Neatest LittleGuide to StockMarketInvesting

MF  is  like  clustering24

k- ‐means  as  MFindicators for rclusters1a1a210b1b2.n examples0Zoriginal data setcluster meansM. am bm v11 vij xnynX vnm

How  do  you  do  it?K*my1a1a2. amx2y2b1b2 bm.n*Kx1 v11 vij xnyn vnm26

KDD 2011talk pilfered from à .27

28

Recovering  latent  factors  in  a  matrixry1a1a2x2y2b1b2.n usersx1m moviesm moviesW xnynH. am bm v11 vijV vnmV[i,j] user i’s rating of movie j29

30

31

32

for image denoising33

Matrix  factorization  as  SGDstep sizewhy does this work?34

Matrix  factorization  as  SGD  - ‐  why  doesthis  work?    Here’s  the  key  claim:35

Checking  the  claimThink for SGD for logistic regression LR loss compare y and ŷ dot(w,x) similar but now update w (user weights) and x (movie weight)36

What  loss  functions  are  possible?N1,  N2  - ‐  diagonalmatrixes,  sort  of  like  IDFfactors  for  the  users/movies“generalized”  KL- ‐divergence37

What  loss  functions  are  possible?38

What  loss  functions  are  possible?39

ALS alternating least squares40

KDD 2011talk pilfered from à .41

42

43

44

Similar to McDonnell et alwith perceptron learning45

Slow convergence .46

47

48

49

50

51

52

More  detail . Randomly  permute  rows/cols  of  matrix Chop  V,W,H  into  blocks  of  size  d  x  d– m/d  blocks  in  W,  n/d  blocks  in  H Group  the  data:– Pick  a  set  of  blocks  with  no  overlapping  rowsor  columns  (a  stratum)– Repeat  until  all  blocks  in  V  are  covered Train  the  SGD– Process  strata  in  series– Process  blocks  within  a  stratum  in  parallel53

More  detail .Z was V54

More  detail .M Initialize  W,H  randomly– not  at  zero  J Choose  a  random  ordering  (random  sort)  of  the  pointsin  a  stratum  in  each  “sub- ‐epoch” Pick  strata  sequence  by  permuting  rows  and  columnsof  M,  and  using  M’[k,i]  as  column  index  of  row  i  insubepoch  k Use  “bold  driver”  to  set  step  size:– increase  step  size  when  loss  decreases  (in  an  epoch)– decrease  step  size  when  loss  increases Implemented  in  Hadoop  and  R/Snowfall55

56

Wall  Clock  Time8  nodes,  64  cores,  R/snow57

58

59

60

61

Number  of  Epochs62

63

64

65

66

Varying  rank100  epochs  for  all67

Hadoop  scalabilityHadoop processsetup time startsto dominate68

Hadoop  scalability69

70

Rich Dad's Guide to Investing: What the Rich Invest in, That the Poor and the Middle Class Do Not! Investing in Real Estate, 5th Edition Stock Investing For Dummies Rich Dad's Advisors: The ABC's of Real Estate Investing: The Secrets of Finding Hidden Profits Most Investors Miss