DimStiller: Workflows For Dimensional Analysis And Reduction

Transcription

DimStiller: Workflows for Dimensional Analysis and Reduction
Stephen Ingram, Tamara Munzner, Veronika Irvine, Melanie Tory, Steven Bergner, Torsten Möller

Overview: Dimensionality Reduction, Users, Related Work, Guidance, DimStiller

Dimension(ality) Reduction

Dimension Reduction
Operations: Filter, Cull, Synthesize, Collect
Complex combinations of input dimensions (nuttiness, fruitiness)
Dimensions are uninteresting (weight of spoon)
Dimensions are redundant (caffeine ...
Image sources: ...s/2009/03/coffee tongue.jpg, http://commons.wikimedia.org/wiki/File:A small cup of

Synthetic DR Example
Face image dataset: 700 faces, 35 x 35 = 1225 dimensions, giving a 700 x 1225 dataset
Image: http://isomap.stanford.edu/web3.jpg

Synthetic DR Example
New dataset: 700 faces, 2 dimensions, giving a 700 x 2 dataset
Image: http://isomap.stanford.edu/web3.jpg
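
The change in shape on these two slides, from 700 x 1225 to 700 x 2, can be sketched in a few lines. This is only an illustration under stated assumptions: random placeholder values stand in for the face images, and plain PCA (computed via SVD) is swapped in as a simple synthesis method, which is not necessarily the method behind the slide's figure. The names `project_to_2d` and `faces` are hypothetical.

```python
import numpy as np


def project_to_2d(data: np.ndarray) -> np.ndarray:
    """Project each row of `data` onto its top two principal components."""
    centered = data - data.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Placeholder data (assumption): 700 "images" of 35 x 35 = 1225 pixels each.
rng = np.random.default_rng(0)
faces = rng.random((700, 35 * 35))

embedding = project_to_2d(faces)
print(faces.shape, "->", embedding.shape)
```

Each of the two output dimensions is a weighted combination of all 1225 input dimensions, which is what makes this a synthesized rather than a culled or filtered reduction.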

USERS

Visual High Dimensional Analysis (VHDA) User Map. Axes: Math / Stats, Data Knowledge

VHDA User Map. Math / Stats: Best Paper at NIPS; Took Stats in Undergrad; What’s a mean?

VHDA User Map. Data Knowledge: Total Information Awareness; Dropped in lap

VHDA User Map: Pedagogical

VHDA User Map: Don’t Need Analysis

VHDA User Map: Well Defined Tasks

VHDA User Map: Middle Ground Users

RELATED WORK

Other Systems (Tool: Target Users / Limitations)
Matlab, R, etc.: Needs Power Users
DR Toolkits: Only Less Programming
XMDVTool, GGobi: No Guidance Beyond Vis
Johansson & Johansson 2009: No Synthetic DR

Hole in Previous Work: access to a range of DR algorithms; guidance for middle ground users

Contributions

Design and Implementation of DimStiller

Global and Local Guidance. Global : : Operators. View:SPLOM

GUIDANCE

Operator Space: Sloppy, Misunderstood; Compact, Evocative

Global Guidance: Which Operations and What ...
Figure labels: Operator, SPLOM

Local Guidance: What to do with a given operator?
Filter
PCA: How many principal components? What do they ...
Sloppy, Misunderstood; Compact, Evocative

DIMSTILLER

DimStiller

DimStiller: Workflow Selector

DimStiller: Expression Tree

DimStiller: Operator Control

DimStiller: Operator Views

EXAMPLE

5000 points, 294 dimensions (294 dims)

Select the Reduce:PCA workflow. View the operator list here (294 dims)

Cull:Variance operator: scree plot of variances (294 dims)

Log scale for better visibility (294 dims)

List of culled dims. Choose first nonzero dimension (31) (264 dims)
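
The step from 294 to 264 dimensions can be illustrated with a variance-based cull. The sketch below assumes the operator drops every dimension whose variance is at or below a chosen cutoff (zero here); the function name `cull_by_variance` and the toy data are hypothetical, and this is not DimStiller's own implementation.

```python
import numpy as np


def cull_by_variance(data: np.ndarray, threshold: float = 0.0):
    """Drop columns whose variance is <= threshold.

    Returns the reduced data and the indices of the culled columns.
    """
    variances = data.var(axis=0)
    keep = variances > threshold
    return data[:, keep], np.flatnonzero(~keep)


# Toy data (assumption): 6 dimensions, two of which are constant.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 6))
data[:, 1] = 3.0
data[:, 4] = 0.0

reduced, culled = cull_by_variance(data)
print("culled dims:", culled.tolist(), "remaining:", reduced.shape[1])
```

Sorting `data.var(axis=0)` and plotting it on a log scale would give the kind of variance scree plot the slides describe, with the cutoff placed at the first nonzero value.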

Data:Norm operator (264 dims)
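
The slide does not say what kind of normalization is applied. Assuming it means per-dimension standardization to zero mean and unit variance, a common preparation step before correlation and PCA, a sketch might look like this. The name `standardize` is hypothetical.

```python
import numpy as np


def standardize(data: np.ndarray) -> np.ndarray:
    """Scale every column to zero mean and unit variance (assumed meaning of 'Norm')."""
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    # Guard against division by zero for constant columns.
    std[std == 0] = 1.0
    return (data - mean) / std


rng = np.random.default_rng(0)
raw = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
z = standardize(raw)
print(z.mean(axis=0).round(6), z.std(axis=0).round(6))
```

The dimension count stays the same after this step, which matches the unchanged 264 on the slide.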

Correlation slider set to 1.0 (146 dims)

Correlation slider set to 0.9 (37 dims)
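
Lowering the correlation slider merges more dimensions, which is why the count falls from 146 to 37. The sketch below shows one way such a collect step could work, assuming a greedy grouping on absolute Pearson correlation that keeps the first member of each group as its representative. The grouping strategy and the name `collect_by_correlation` are assumptions, not DimStiller's algorithm.

```python
import numpy as np


def collect_by_correlation(data: np.ndarray, threshold: float = 0.9):
    """Greedily group columns whose |Pearson r| with a seed column meets the threshold.

    Returns the data restricted to one representative per group, plus the groups.
    Constant columns should be culled first, since their correlation is undefined.
    """
    corr = np.abs(np.corrcoef(data, rowvar=False))
    n_dims = corr.shape[0]
    assigned = np.zeros(n_dims, dtype=bool)
    groups = []
    for i in range(n_dims):
        if assigned[i]:
            continue
        # Small tolerance so a threshold of 1.0 still catches exact duplicates.
        members = [i] + [
            j for j in range(i + 1, n_dims)
            if not assigned[j] and corr[i, j] >= threshold - 1e-12
        ]
        assigned[members] = True
        groups.append(members)
    representatives = [group[0] for group in groups]
    return data[:, representatives], groups


# Toy data (assumption): columns 3 and 4 are linear copies of columns 0 and 1.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 5))
data[:, 3] = 2.0 * data[:, 0] + 1.0
data[:, 4] = -data[:, 1]

collected, groups = collect_by_correlation(data, threshold=0.9)
print(groups, collected.shape)
```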

Eigenvalue scree plot: values die off around 16 (16 dims)
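
The values behind an eigenvalue scree plot are the eigenvalues of the data's covariance matrix, sorted from largest to smallest. A minimal sketch, assuming toy data with three underlying factors so that the eigenvalues drop sharply after the third; the names `scree_eigenvalues` and `explained_fraction` are hypothetical.

```python
import numpy as np


def scree_eigenvalues(data: np.ndarray) -> np.ndarray:
    """Eigenvalues of the covariance matrix, largest first."""
    cov = np.cov(data, rowvar=False)
    # eigvalsh returns ascending eigenvalues of a symmetric matrix.
    return np.linalg.eigvalsh(cov)[::-1]


def explained_fraction(eigenvalues: np.ndarray) -> np.ndarray:
    """Cumulative share of total variance covered by the leading components."""
    return np.cumsum(eigenvalues) / eigenvalues.sum()


# Toy data (assumption): 10 observed dimensions driven by 3 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 10))
data = latent @ mixing + 0.01 * rng.normal(size=(500, 10))

eigenvalues = scree_eigenvalues(data)
print(eigenvalues.round(4))
print(explained_fraction(eigenvalues).round(4))
```

Reading off where the sorted values flatten out is how a cutoff such as the 16 on this slide would be chosen.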

Manageable SPLOM (16 dims)

Operators & Workflows

Operator Families (Family Name: Operators)
Cull: Variance, Name
Collect: Pearson’s
Reduce: PCA, MDS
View: SPLOM, Histo
Attrib: Color, Cluster
Filter: Value

Custom Workflows
Three workflows given
Freeform experimenting with operators
Custom workflows after success
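
One way to picture a workflow is as an ordered list of operators, each taking a table and returning a new one. The sketch below is a hypothetical model of that idea, not DimStiller's architecture: `run_workflow`, `drop_constant`, `center`, and `top_components` are all illustrative names.

```python
from typing import Callable, List, Tuple

import numpy as np

Operator = Callable[[np.ndarray], np.ndarray]


def drop_constant(data: np.ndarray) -> np.ndarray:
    """Cull-style step: remove zero-variance columns."""
    return data[:, data.var(axis=0) > 0]


def center(data: np.ndarray) -> np.ndarray:
    """Data-style step: subtract each column's mean."""
    return data - data.mean(axis=0)


def top_components(k: int) -> Operator:
    """Reduce-style step: project onto the first k right singular vectors."""
    def op(data: np.ndarray) -> np.ndarray:
        _, _, vt = np.linalg.svd(data, full_matrices=False)
        return data @ vt[:k].T
    return op


def run_workflow(data: np.ndarray, operators: List[Operator]) -> Tuple[np.ndarray, List[int]]:
    """Apply operators in order, recording the dimension count after each step."""
    history = [data.shape[1]]
    for op in operators:
        data = op(data)
        history.append(data.shape[1])
    return data, history


rng = np.random.default_rng(0)
table = rng.normal(size=(50, 8))
table[:, 2] = 1.0
table[:, 5] = -4.0

result, history = run_workflow(table, [drop_constant, center, top_components(3)])
print(history)
```

Keeping operators as independent steps is what lets a user experiment freely and then save a sequence that worked as a custom workflow.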

Conclusions
Presented the design and implementation of the DimStiller software
Provided global and local guidance to open up dimensionality reduction for middle ground users, beyond experts in math AND data

Thanks!
Download DimStiller at http://www.cs.ubc.ca/~sfingram/dimstiller
Doing dim reduction? Let me know! sfingram@cs.ubc.ca
Funded by NSERC
