Talend Open Studio - Courses


Talend Open Studio for Data Integration User Guide 5.0 b

Talend Open Studio
Talend Open Studio: User Guide
Adapted for Talend Open Studio for Data Integration v5.0.x. Supersedes previous User Guide releases.

Copyleft
This documentation is provided under the terms of the Creative Commons Public License (CCPL). For more information about what you can and cannot do with this documentation in accordance with the CCPL, please read:

Notices
All brands, product names, company names, trademarks and service marks are the properties of their respective owners.

Table of Contents

Preface
  1. General information
    1.1. Purpose
    1.2. Audience
    1.3. Typographical conventions
  2. History of changes
  3. Feedback and Support

Chapter 1. Data integration and Talend Studio
  1.1. Data analytics
  1.2. Operational integration
  1.3. Execution monitoring

Chapter 2. Getting started with Talend Studio
  2.1. Important concepts in Talend Open Studio for Data Integration
  2.2. Launching Talend Open Studio for Data Integration
    2.2.1. How to launch the Studio for the first time
    2.2.2. How to set up a project
  2.3. Working with different workspace directories
    2.3.1. How to create a new workspace directory
  2.4. Working with projects
    2.4.1. How to create a project
    2.4.2. How to import the demo project
    2.4.3. How to import projects
    2.4.4. How to open a project
    2.4.5. How to delete a project
    2.4.6. How to export a project
    2.4.7. Migration tasks
  2.5. Setting Talend Open Studio for Data Integration preferences
    2.5.1. Java Interpreter path
    2.5.2. External or User components
    2.5.3. Exchange preferences
    2.5.4. Language preferences
    2.5.5. Debug and Job execution preferences
    2.5.6. Designer preferences
    2.5.7. Adding code by default
    2.5.8. Performance preferences
    2.5.9. Documentation preferences
    2.5.10. Displaying special characters for schema columns
    2.5.11. SQL Builder preferences
    2.5.12. Schema preferences
    2.5.13. Libraries preferences
    2.5.14. Type conversion
    2.5.15. Usage Data Collector preferences
  2.6. Customizing project settings
    2.6.1. Palette Settings
    2.6.2. Version management
    2.6.3. Status management
    2.6.4. Job Settings
    2.6.5. Stats & Logs
    2.6.6. Context settings
    2.6.7. Project Settings use
    2.6.8. Status settings
    2.6.9. Security settings

Chapter 3. Designing a Business Model
  3.1. What is a Business Model
  3.2. Opening or creating a Business Model
    3.2.1. How to open a Business Model
    3.2.2. How to create a Business Model
  3.3. Modeling a Business Model
    3.3.1. Shapes
    3.3.2. Connecting shapes
    3.3.3. How to comment and arrange a model
    3.3.4. Business Models
  3.4. Assigning repository elements to a Business Model
  3.5. Editing a Business Model
    3.5.1. How to rename a Business Model
    3.5.2. How to copy and paste a Business Model
    3.5.3. How to move a Business Model
    3.5.4. How to delete a Business Model
  3.6. Saving a Business Model

Chapter 4. Designing a data integration Job
  4.1. What is a Job design
  4.2. Getting started with a basic Job design
    4.2.1. How to create a Job
    4.2.2. How to drop components to the workspace
    4.2.3. How to search components in the Palette
    4.2.4. How to connect components together
    4.2.5. How to drop components in the middle of a Row link
    4.2.6. How to define component properties
    4.2.7. How to run a Job
    4.2.8. How to customize your workspace
  4.3. Using connections
    4.3.1. Connection types
    4.3.2. How to define connection settings
  4.4. Using the Metadata Manager
    4.4.1. How to centralize the Metadata items
    4.4.2. How to centralize contexts and variables
    4.4.3. How to use the SQL Templates
  4.5. Handling Jobs: advanced subjects
    4.5.1. How to map data flows
    4.5.2. How to create queries using the SQLBuilder
    4.5.3. How to download/upload Talend Community components
    4.5.4. How to install external modules
    4.5.5. How to launch a Job periodically
    4.5.6. How to use the tPrejob and tPostjob components
    4.5.7. How to use the Use Output Stream feature
  4.6. Handling Jobs: miscellaneous subjects
    4.6.1. How to share a database connection
    4.6.2. How to define the Start component
    4.6.3. How to handle error icons on components or Jobs
    4.6.4. How to add notes to a Job design
    4.6.5. How to display the code or the outline of your Job
    4.6.6. How to manage the subjob display
    4.6.7. How to define options on the Job view
    4.6.8. How to find components in Jobs
    4.6.9. How to set default values in the schema of a component

Chapter 5. Managing data integration Jobs
  5.1. Activating/Deactivating a Job or a sub-job
    5.1.1. How to disable a Start component
    5.1.2. How to disable a non-Start component
  5.2. Importing/exporting items or Jobs
    5.2.1. How to import items
    5.2.2. How to export Jobs to an archive
    5.2.3. How to export items
    5.2.4. How to change context parameters in Jobs
  5.3. Managing repository items
    5.3.1. How to handle updates in repository items
  5.4. Searching a Job in the repository
  5.5. Managing Job versions
  5.6. Documenting a Job
    5.6.1. How to generate HTML documentation
    5.6.2. How to update the documentation on the spot
  5.7. Handling Job execution
    5.7.1. How to deploy a Job on SpagoBI server

Chapter 6. Mapping data flows
  6.1. tMap and tXMLMap interfaces
  6.2. tMap operation
    6.2.1. Setting the input flow in the Map Editor
    6.2.2. Mapping variables
    6.2.3. Using the expression editor
    6.2.4. Mapping the Output setting
    6.2.5. Setting schemas in the Map Editor
    6.2.6. Solving memory limitation issues in tMap use
    6.2.7. Handling Lookups
  6.3. tXMLMap operation
    6.3.1. Using the document type to create the XML tree
    6.3.2. Defining the output mode
    6.3.3. Editing the XML tree schema

Chapter 7. Managing Metadata
  7.1. Objectives
  7.2. Setting up a DB connection
    7.2.1. Step 1: General properties
    7.2.2. Step 2: Connection
    7.2.3. Step 3: Table upload
    7.2.4. Step 4: Schema definition
  7.3. Setting up a JDBC schema
    7.3.1. Step 1: General properties
    7.3.2. Step 2: Connection
    7.3.3. Step 3: Table upload
    7.3.4. Step 4: Schema definition
  7.4. Setting up a SAS connection
    7.4.1. Prerequisites
    7.4.2. Step 1: General properties
    7.4.3. Step 2: Connection
  7.5. Setting up a File Delimited schema
    7.5.1. Step 1: General properties
    7.5.2. Step 2: File upload
    7.5.3. Step 3: Schema definition
    7.5.4. Step 4: Final schema
  7.6. Setting up a File Positional schema
    7.6.1. Step 1: General properties
    7.6.2. Step 2: Connection and file upload
    7.6.3. Step 3: Schema refining
    7.6.4. Step 4: Finalizing the end schema
  7.7. Setting up a File Regex schema
    7.7.1. Step 1: General properties
    7.7.2. Step 2: File upload
    7.7.3. Step 3: Schema definition
    7.7.4. Step 4: Finalizing the end schema
  7.8. Setting up an XML file schema
    7.8.1. Setting up an XML schema for an input file
    7.8.2. Setting up an XML schema for an output file
  7.9. Setting up a File Excel schema
    7.9.1. Step 1: General properties
    7.9.2. Step 2: File upload
    7.9.3. Step 3: Schema refining
    7.9.4. Step 4: Finalizing the end schema
  7.10. Setting up a File LDIF schema
    7.10.1. Step 1: General properties
    7.10.2. Step 2: File upload
    7.10.3. Step 3: Schema definition
    7.10.4. Step 4: Finalizing the end schema
  7.11. Setting up an LDAP schema
    7.11.1. Step 1: General properties
    7.11.2. Step 2: Server connection
    7.11.3. Step 3: Authentication and DN fetching
    7.11.4. Step 4: Schema definition
    7.11.5. Step 5: Finalizing the end schema
  7.12. Setting up a Salesforce connection
    7.12.1. Step 1: General properties
    7.12.2. Step 2: Connection to a Salesforce account
    7.12.3. Step 3: Retrieving Salesforce modules
    7.12.4. Step 4: Retrieving Salesforce schemas
    7.12.5. Step 5: Finalizing the end schema
  7.13. Setting up a Generic schema
    7.13.1. Step 1: General properties
    7.13.2. Step 2: Schema definition
  7.14. Setting up an MDM connection
    7.14.1. Step 1: Setting up the connection
    7.14.2. Step 2: Defining MDM schema
  7.15. Setting up a Web Service schema
    7.15.1. Setting up a simple schema
  7.16. Setting up an FTP connection
    7.16.1. Step 1: General properties
    7.16.2. Step 2: Connection
  7.17. Exporting Metadata as context

Chapter 8. Managing routines
  8.1. What are routines
  8.2. Accessing the System Routines
  8.3. Customizing the system routines
  8.4. Managing user routines
    8.4.1. How to create user routines
    8.4.2. How to edit user routines
    8.4.3. How to edit user routine libraries
  8.5. Calling a routine from a Job
  8.6. Use case: Creating a file for the current date

Chapter 9. Using SQL templates
  9.1. What is ELT
  9.2. Introducing Talend SQL templates
  9.3. Managing Talend SQL templates
    9.3.1. Types of system SQL templates
    9.3.2. How to access a system SQL template
    9.3.3. How to create user-defined SQL templates
    9.3.4. A use case of system SQL Templates

Appendix A. GUI
  A.1. Main window
  A.2. Menu bar and Toolbar
    A.2.1. Menu bar of Talend Open Studio for Data Integration
    A.2.2. Toolbar of Talend Open Studio for Data Integration
  A.3. Repository tree view
  A.4. Design workspace
  A.5. Palette
  A.6. Configuration tabs
  A.7. Outline and code summary panel
  A.8. Shortcuts and aliases

Appendix B. Theory into practice: Job examples
  B.1. tMap Job example
    B.1.1. Introducing the scenario
    B.1.2. Translating the scenario into a Job
  B.2. Using the output stream feature
    B.2.1. Introducing the scenario
    B.2.2. Translating the scenario into a Job

Appendix C. System routines
  C.1. Numeric Routines
    C.1.1. How to create a Sequence
    C.1.2. How to convert an Implied Decimal
  C.2. Relational Routines
  C.3. StringHandling Routines
    C.3.1. How to store a string in alphabetical order
    C.3.2. How to check whether a string is alphabetical
    C.3.3. How to replace an element in a string
    C.3.4. How to check the position of a specific character or substring, within a string
    C.3.5. How to calculate the length of a string
    C.3.6. How to delete blank characters
  C.4. TalendDataGenerator Routines
    C.4.1. How to generate fictitious data
  C.5. TalendDate Routines
    C.5.1. How to format a Date
    C.5.2. How to check a Date
    C.5.3. How to compare Dates
    C.5.4. How to configure a Date
    C.5.5. How to parse a Date
    C.5.6. How to retrieve part of a Date
    C.5.7. How to format the Current Date
  C.6. TalendString Routines
    C.6.1. How to format an XML string
    C.6.2. How to trim a string
    C.6.3. How to remove accents from a string

Appendix D. SQL template writing rules
  D.1. SQL statements
  D.2. Comment lines
  D.3. The <%...%> syntax
  D.4. The <%=...%> syntax
  D.5. The </.../> syntax
  D.6. Code to access the component schema elements
  D.7. Code to access the component matrix properties


Preface

1. General information

1.1. Purpose
This User Guide explains how to manage Talend Open Studio for Data Integration functions in a normal operational context.
Information presented in this document applies to Talend Open Studio for Data Integration releases beginning with 5.0.x.

1.2. Audience
This guide is for users and administrators of Talend Open Studio for Data Integration.
The layout of GUI screens provided in this document may vary slightly from your actual GUI.

1.3. Typographical conventions
This guide uses the following typographical conventions:
- text in bold: window and dialog box buttons and fields, keyboard keys, menus, and menu options,
- text in [bold]: window, wizard, and dialog box titles,
- text in courier: system parameters typed in by the user,
- text in italics: file, schema, column, row, and variable names,
- The icon indicates an item that provides additional information about an important point. It is also used to add comments related to a table or a figure,
- The icon indicates a message that gives information about the execution requirements or recommendation type. It is also used to refer to situations or information the end-user needs to be aware of or pay special attention to.

2. History of changes
The following table lists changes made in the Talend Open Studio for Data Integration User Guide.

Version | Date | History of Changes

v4.2 a | 19/05/2011 | Updates in Talend Open Studio for Data Integration User Guide include:
- Created a User Guide for the new Talend Open Studio for Data Integration.
- Updated the Copyright variable in cover files
- Updated chapter: Getting Started with Talend Open Studio for Data Integration
- Updated chapter: Mapping data flows
- Updated appendix: System routines
- Updated chapter: Managing Metadata
- Updated chapter: Designing a data integration Job
- Updated chapter: Managing data integration Jobs

v4.2 b | 12/07/2011 | Updates in Talend Open Studio for Data Integration User Guide include:
- Updated chapter: Getting Sta
