IBM SPSS Modeler 14.2 User’s Guide


Note: Before using this information and the product it supports, read the general information under Notices on p. 250.

This edition applies to IBM SPSS Modeler 14 and to all subsequent releases and modifications until otherwise indicated in new editions.

Adobe product screenshot(s) reprinted with permission from Adobe Systems Incorporated.

Microsoft product screenshot(s) reprinted with permission from Microsoft Corporation.

Licensed Materials - Property of IBM

Copyright IBM Corporation 1994, 2011.

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Preface

IBM SPSS Modeler is the IBM Corp. enterprise-strength data mining workbench. SPSS Modeler helps organizations to improve customer and citizen relationships through an in-depth understanding of data. Organizations use the insight gained from SPSS Modeler to retain profitable customers, identify cross-selling opportunities, attract new customers, detect fraud, reduce risk, and improve government service delivery.

SPSS Modeler's visual interface invites users to apply their specific business expertise, which leads to more powerful predictive models and shortens time-to-solution. SPSS Modeler offers many modeling techniques, such as prediction, classification, segmentation, and association detection algorithms. Once models are created, IBM SPSS Modeler Solution Publisher enables their delivery enterprise-wide to decision makers or to a database.

About IBM Business Analytics

IBM Business Analytics software delivers complete, consistent and accurate information that decision-makers trust to improve business performance. A comprehensive portfolio of business intelligence, predictive analytics, financial performance and strategy management, and analytic applications provides clear, immediate and actionable insights into current performance and the ability to predict future outcomes. Combined with rich industry solutions, proven practices and professional services, organizations of every size can drive the highest productivity, confidently automate decisions and deliver better results.

As part of this portfolio, IBM SPSS Predictive Analytics software helps organizations predict future events and proactively act upon that insight to drive better business outcomes. Commercial, government and academic customers worldwide rely on IBM SPSS technology as a competitive advantage in attracting, retaining and growing customers, while reducing fraud and mitigating risk. By incorporating IBM SPSS software into their daily operations, organizations become predictive enterprises – able to direct and automate decisions to meet business goals and achieve measurable competitive advantage. For further information or to reach a representative visit http://www.ibm.com/spss.

Technical support

Technical support is available to maintenance customers. Customers may contact Technical Support for assistance in using IBM Corp. products or for installation help for one of the supported hardware environments. To reach Technical Support, see the IBM Corp. web site at http://www.ibm.com/support. Be prepared to identify yourself, your organization, and your support agreement when requesting assistance.

Contents

1 About IBM SPSS Modeler
    IBM SPSS Modeler Server
    IBM SPSS Modeler Options
    IBM SPSS Text Analytics
    IBM SPSS Modeler Documentation
    Application Examples
    Demos Folder

2 New Features
    New and Changed Features in IBM SPSS Modeler 14.2
    New Nodes in IBM SPSS Modeler 14.2

3 IBM SPSS Modeler Overview
    Getting Started
    Starting IBM SPSS Modeler
        Launching from the Command Line
        Connecting to IBM SPSS Modeler Server
        Changing the Temp Directory
        Starting Multiple IBM SPSS Modeler Sessions
    IBM SPSS Modeler Interface at a Glance
        IBM SPSS Modeler Stream Canvas
        Nodes Palette
        IBM SPSS Modeler Managers
        IBM SPSS Modeler Projects
        IBM SPSS Modeler Toolbar
        Customizing the Toolbar
        Customizing the IBM SPSS Modeler Window
        Using the Mouse in IBM SPSS Modeler
        Using Shortcut Keys
    Printing
    Automating IBM SPSS Modeler

4 Understanding Data Mining
    Data Mining Overview
    Assessing the Data
    A Strategy for Data Mining
    The CRISP-DM Process Model
    Types of Models
    Data Mining Examples

5 Building Streams
    Stream-Building Overview
    Building Data Streams
        Working with Nodes
        Working with Streams
        Stream Descriptions
        Running Streams
        Working with Models
        Adding Comments and Annotations to Nodes and Streams
        Saving Data Streams
        Loading Files
        Mapping Data Streams
    Tips and Shortcuts

6 Handling Missing Values
    Overview of Missing Values
    Handling Missing Values
        Handling Records with Missing Values
        Handling Fields with Missing Values
    Imputing or Filling Missing Values
    CLEM Functions for Missing Values

7 Building CLEM Expressions
    About CLEM
    CLEM Examples
    Values and Data Types
    Expressions and Conditions
    Stream, Session, and SuperNode Parameters
    Working with Strings
    Handling Blanks and Missing Values
    Working with Numbers
    Working with Times and Dates
    Summarizing Multiple Fields
    Working with Multiple-Response Data
    The Expression Builder
        Accessing the Expression Builder
        Creating Expressions
        Selecting Functions
        Selecting Fields, Parameters, and Global Variables
        Viewing or Selecting Values
        Checking CLEM Expressions
    Find and Replace

8 CLEM Language Reference
    CLEM Reference Overview
    CLEM Datatypes
        Integers
        Reals
        Characters
        Strings
        Lists
        Fields
        Dates
        Time
    CLEM Operators
    Functions Reference
        Conventions in Function Descriptions
        Information Functions
        Conversion Functions
        Comparison Functions
        Logical Functions
        Numeric Functions
        Trigonometric Functions
        Probability Functions
        Bitwise Integer Operations
        Random Functions
        String Functions
        SoundEx Functions
        Date and Time Functions
        Sequence Functions
        Global Functions
        Functions Handling Blanks and Null Values
        Special Fields

9 Using IBM SPSS Modeler with a Repository
    About the IBM SPSS Collaboration and Deployment Services Repository
    Storing and Deploying IBM SPSS Collaboration and Deployment Services Repository Objects
    Connecting to the IBM SPSS Collaboration and Deployment Services Repository
        Entering Credentials for the IBM SPSS Collaboration and Deployment Services Repository
    Browsing the IBM SPSS Collaboration and Deployment Services Repository Contents
    Storing Objects in the IBM SPSS Collaboration and Deployment Services Repository
        Setting Object Properties
        Storing Streams
        Storing Projects
        Storing Nodes
        Storing Output Objects
        Storing Models and Model Palettes
    Retrieving Objects from the IBM SPSS Collaboration and Deployment Services Repository
        Choosing an Object to Retrieve
        Selecting an Object Version
    Searching for Objects in the IBM SPSS Collaboration and Deployment Services Repository
    Modifying IBM SPSS Collaboration and Deployment Services Repository Objects
        Creating, Renaming, and Deleting Folders
        Locking and Unlocking IBM SPSS Collaboration and Deployment Services Repository Objects
        Deleting IBM SPSS Collaboration and Deployment Services Repository Objects
    Managing Properties of IBM SPSS Collaboration and Deployment Services Repository Objects
        Viewing Folder Properties
        Viewing and Editing Object Properties
        Managing Object Version Labels
    Deploying Streams
        Stream Deployment Options
        The Scoring Branch

10 Exporting to External Applications
    About Exporting to External Applications
    Opening a Stream in IBM SPSS Modeler Advantage
    Predictive Applications 4.x Wizard
        Before Using the Predictive Applications Wizard
        Exporting Binary Predictions as Propensity Scores
        Step 1: Predictive Applications Wizard Overview
        Step 2: Selecting a Terminal Node
        Step 3: Selecting a UCV Node
        Step 4: Specifying a Package
        Step 5: Generating the Package
        Step 6: Summary
    Importing and Exporting Models as PMML
        Model Types Supporting PMML

11 Projects and Reports
    Introduction to Projects
        CRISP-DM View
        Classes View
    Building a Project
        Creating a New Project
        Adding to a Project
        Transferring Projects to the IBM SPSS Collaboration and Deployment Services Repository
        Setting Project Properties
        Annotating a Project
        Object Properties

IBM SPSS Text Analytics

IBM SPSS Text Analytics is a fully integrated add-on for SPSS Modeler that uses advanced linguistic technologies and Natural Language Processing (NLP) to rapidly process a large vari