Introduction to Talend Open Studio for Data Integration
Dimitar Zahariev
BI / DI Consultant
dimitar@zahariev.pro
@shekeriev

Disclaimer
Please keep in mind that:
I'm not related in any way to Talend
Everything stated from now on is my personal opinion and it doesn't reflect in any way the position of my employer or other related parties

Agenda
General definitions
Business case
Demo

General definitions
Just to be sure that we are on the same page

Main definitions
Workspace: Local directory that stores one or more projects
Project: Logical grouping of one or more jobs
Job: The smallest executable unit. It is a group of one or more components. Typically implements a data flow or integration process
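The containment described above (a workspace holds projects, a project holds jobs, a job groups components) can be sketched in a few lines of Python. This is only an illustration of the relationships, not how Talend stores or runs anything. Every class, function, and path name below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]
# Hypothetical stand-in for a component: takes rows in, hands rows on.
Component = Callable[[Iterable[Row]], Iterable[Row]]


@dataclass
class Job:
    """Smallest executable unit: an ordered group of components."""
    name: str
    components: List[Component] = field(default_factory=list)

    def run(self, rows: Iterable[Row]) -> List[Row]:
        # Pass the rows through each component in turn (a simple data flow).
        for component in self.components:
            rows = component(rows)
        return list(rows)


@dataclass
class Project:
    """Logical grouping of one or more jobs."""
    name: str
    jobs: Dict[str, Job] = field(default_factory=dict)


@dataclass
class Workspace:
    """Local directory that stores one or more projects."""
    path: str
    projects: Dict[str, Project] = field(default_factory=dict)


def drop_empty_names(rows: Iterable[Row]) -> Iterable[Row]:
    return (row for row in rows if row.get("name"))


def upper_case_names(rows: Iterable[Row]) -> Iterable[Row]:
    return ({**row, "name": row["name"].upper()} for row in rows)


job = Job("clean_products", [drop_empty_names, upper_case_names])
project = Project("demo_project", {job.name: job})
workspace = Workspace("/tmp/demo_workspace", {project.name: project})

result = job.run([{"name": "sticker"}, {"name": ""}, {"name": "hat"}])
print(result)
```

The job is the only object here that can be executed; the project and workspace only group things, which mirrors the definitions on the slide.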
General look and feel

General look and feel
Repository
Gives us access to the Repository, where we can create Jobs and manage metadata

General look and feel
Design Workspace
Provides us with a playground to design our Jobs

General look and feel
Configuration Tabs
Allow us to control the components' behavior and execute Jobs

General look and feel
Outline and Code tabs
The Outline tab lists the components that have been added to the design workspace. The Code tab displays the code associated with each component

General look and feel
Palette
Contains the different components we use to build our Jobs

Main sections of the Repository
Job Designs: Stores the Jobs we work on. Jobs can also be organized into folders
Contexts: Contains sets of global or job-specific variables
Metadata: Holds descriptive information about our data sources and targets, grouped by type

Building blocks of a data warehouse
Dimensions: A dimension is a structure that categorizes facts and measures in order to enable users to answer business questions. Commonly used dimensions are people, products, place and time. Historical changes in dimensions are usually tracked by SCD (slowly changing dimension) management methodologies referred to as Type 0 through 6.
Facts: A fact is a value or measurement, which represents a fact about the managed entity or system.
Source: Wikipedia
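To make the dimension, fact and SCD terms more concrete, here is a minimal sketch of one of those methodologies, Type 2, which keeps history by closing the current dimension row and adding a new one. It uses plain Python lists instead of database tables. The table layout, column names, and sample values are all made up for illustration.

```python
from datetime import date

# Hypothetical customer dimension: one row per version of a customer.
customer_dim = [
    {"customer_sk": 1, "customer_id": "C1", "city": "Belgrade",
     "valid_from": date(2016, 1, 1), "valid_to": None, "is_current": True},
]

# Hypothetical fact table: each sale points at a dimension row by surrogate key.
sales_fact = [{"customer_sk": 1, "sale_date": date(2016, 3, 1), "amount": 12.5}]


def apply_scd2(dim, customer_id, new_city, change_date):
    """Record a city change as a new dimension row (SCD Type 2)."""
    current = next(
        (row for row in dim
         if row["customer_id"] == customer_id and row["is_current"]),
        None,
    )
    if current is not None and current["city"] == new_city:
        return current["customer_sk"]  # nothing changed, reuse the row
    if current is not None:
        # Close the old version instead of overwriting it.
        current["valid_to"] = change_date
        current["is_current"] = False
    new_sk = max((row["customer_sk"] for row in dim), default=0) + 1
    dim.append({"customer_sk": new_sk, "customer_id": customer_id,
                "city": new_city, "valid_from": change_date,
                "valid_to": None, "is_current": True})
    return new_sk


new_sk = apply_scd2(customer_dim, "C1", "Novi Sad", date(2016, 9, 1))
sales_fact.append({"customer_sk": new_sk, "sale_date": date(2016, 9, 10),
                   "amount": 20.0})

# Older facts still point at the old version, so history is preserved.
for sale in sales_fact:
    version = next(r for r in customer_dim
                   if r["customer_sk"] == sale["customer_sk"])
    print(sale["sale_date"], sale["amount"], version["city"])
```

The facts carry the measurements (the amounts), while the dimension rows carry the descriptive context used to slice them (here, the customer's city at the time of the sale).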
Business case
What is the problem and how to deal with it?

The customer
LinuxGoods.rs is a local Serbian online shop for Linux- and Unix-related merchandise such as:
Badges
Stickers
T-shirts
Hats, etc.

The case
As their business grew, they realized that they needed a way to analyze what was going on. It would allow them to sustain the trend.
So they decided to build a small data warehouse to meet their growing need for an analytical overview of the business.

The landscape
Three source systems and one target: a database. Input data comes in three forms: plain text files, Excel files, and XML files. Some of the processed files should be moved to another folder for archiving purposes.
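The flow described above (read files of different types, load them into one database, move processed files to an archive folder) can be sketched with the Python standard library alone. This is a rough outline under stated assumptions, not the Talend job from the demo: the file names, the column and attribute names, the `sales` table, and the use of an in-memory SQLite database are all invented for the example. Excel input is left out because the standard library has no reader for it, and the sketch archives every file it loads, whereas the case only asks for some of them.

```python
import csv
import shutil
import sqlite3
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def read_text_file(path):
    """Yield (product, quantity) pairs from a delimited text file."""
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            yield row["product"], int(row["quantity"])


def read_xml_file(path):
    """Yield (product, quantity) pairs from <sale> elements."""
    for sale in ET.parse(path).getroot().iter("sale"):
        yield sale.get("product"), int(sale.get("quantity"))


# One reader per supported file type (hypothetical mapping).
READERS = {".csv": read_text_file, ".xml": read_xml_file}


def load_folder(inbox, archive, connection):
    """Load every supported file into the target table, then archive it."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales (product TEXT, quantity INTEGER)")
    archive.mkdir(exist_ok=True)
    for path in sorted(inbox.iterdir()):
        reader = READERS.get(path.suffix.lower())
        if reader is None:
            continue  # unsupported type: leave the file where it is
        rows = list(reader(path))
        connection.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        connection.commit()
        shutil.move(str(path), str(archive / path.name))


with tempfile.TemporaryDirectory() as tmp:
    inbox = Path(tmp) / "inbox"
    archive = Path(tmp) / "archive"
    inbox.mkdir()
    (inbox / "shop.csv").write_text(
        "product,quantity\nsticker,3\nhat,1\n", encoding="utf-8")
    (inbox / "partner.xml").write_text(
        '<sales><sale product="badge" quantity="5"/></sales>',
        encoding="utf-8")

    connection = sqlite3.connect(":memory:")
    load_folder(inbox, archive, connection)

    total = connection.execute(
        "SELECT SUM(quantity) FROM sales").fetchone()[0]
    archived = sorted(p.name for p in archive.iterdir())
    remaining = list(inbox.iterdir())
    connection.close()

print(total, archived, remaining)
```

Keeping one small reader per input format and a single shared load step resembles, in spirit, a job with several input components feeding one database output.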
The solution

Demo
Talend Open Studio for Data Integration in action

Resources
Useful stuff to help us on our journey with Talend

Official resources
A short list of helpful resources:
Software and -open-studio#t4
Talend knowledge base
https://help.talend.com/display/HOME/Knowledge Base
Talend community site
https://www.talendforge.org/
Talend demo project (available within the studio)

Additional resources
A very good book on the subject:
Getting Started with Talend Open Studio for Data Integration by Jonathan Bowen
Resources prepared by me:
Pre-built Linux VMs for VirtualBox with Talend installed
https://zahariev.pro/balccon2k16
Articles on the subject (more will be added over time)
https://zahariev.pro/category/talend

Thank you!
Dimitar Zahariev
BI / DI Consultant
dimitar@zahariev.pro
@shekeriev