SAP FORUM HANA & Hadoop

Transcription

Rumbo 2020
SAP FORUM: HANA & Hadoop
Javier Fernandez Leon
February 2016
Intel Inside. Powerful Solution Outside. Powered by Intel Xeon processor.
More information: www.descubrefujitsu.com/SAPforum
FTS INTERNAL

HANA & Hadoop: Intro
CONTENTS
- Challenges of distributed Big Data
- What is Apache Hadoop? Features
- Comparison: HANA vs. Hadoop
- HANA & Apache Spark
- HANA & Hadoop combined: scenarios
- Use cases
- HANA & Hadoop managed service
- Pay-per-use model for HANA & Hadoop
Copyright 2014 FUJITSU LIMITED

Challenges of distributed Big Data
WE ARE DROWNING IN OUR OWN DATA
- Inefficient data processing: Real-time drill-down interaction is impossible when data is distributed across thousands of nodes and processed in batches.
- Lack of business alignment: Business decisions need to track changing external market conditions, which means processing data in business systems together with Hadoop data lakes.
- Costly management of Big Data: Large volumes of data clog business systems when that data could be archived more efficiently to less expensive systems.

Gap between the Enterprise & Big Data Frameworks
WE ARE DROWNING IN OUR OWN DATA
Diagram labels: Complexity; Performance; Enterprise core systems; Unable to work together; Big Data frameworks & tools.
Objectives: standardize, simplify, and automate both worlds.

What is Apache Hadoop?
Apache Hadoop is open source software that enables reliable, scalable, distributed computing on clusters of inexpensive servers.
- RELIABLE: The software is fault tolerant; it expects and handles hardware and software failures.
- SCALABLE: Designed for massive scale of processors, memory, and locally attached storage, up to petabytes.
- DISTRIBUTED: Handles replication and offers a massively parallel programming model, MapReduce.
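The MapReduce programming model named on this slide can be illustrated with a minimal, single-process sketch in plain Python. It is not Hadoop code and does not use any Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are hypothetical, chosen only to show the map, shuffle, and reduce steps that a real cluster would spread across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data lake"]))
```

In a real deployment each map call could run on the node that holds its input block, which is where the scalability described above comes from.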

Hadoop Logical Components

What does Hadoop bring to the table?
Cost-efficient data storage and processing for large volumes of structured, semi-structured, and unstructured data such as web logs, machine data, text data, call data records, audio, and video data.
- BATCH PROCESSING: Where fast response times are less critical than reliability and scalability.
- COMPLEX INFORMATION PROCESSING: Enables heavily recursive algorithms, machine learning, and queries that cannot be easily expressed in SQL.
- LOW-VALUE DATA ARCHIVE: Data stays available, though access is slower. Scales up to petabytes.
- POST-HOC ANALYSIS: Mine raw data that is either schema-less or whose schema changes over time.

Who uses Hadoop?
- FACEBOOK: Facebook runs the world’s largest Hadoop cluster. Just one of several Hadoop clusters operated by the company spans more than 4,000 machines and houses over 100 petabytes of data. Uses: Facebook messaging (HBase) and generating reports for advertisers who need to track the effectiveness of campaigns.
- YAHOO: Yahoo runs Hadoop on 42,000 servers (1,200 racks) in four data centers. Its largest Hadoop cluster was 4,000 nodes. Uses: indexing of web crawl results.
- TWITTER: Twitter uses Hadoop for product analysis, social graph analysis, generating indices for people search, natural language processing, and many other applications.

Comparison Hadoop & HANA
Aspect | HADOOP | SAP HANA
Data architecture | Unstructured data and files on disk | Structured data in memory
Data structures | No predefined schema | Predefined schema & models
Performance | Very slow data access (seconds to hours) | Very fast access (1 ms)
Scalability | Scale-out to thousands of low-cost servers | Scale-up / scale-out to many servers
Data consistency | BASE (basic availability, soft state, eventual consistency) | ACID (atomicity, consistency, isolation, durability)
Licensing costs | Free open source or commercial distros | Many options: cloud, enterprise
OLTP | No OLTP | Excellent OLTP
OLAP | Slow OLAP | Excellent OLAP
Server failover | Query & server failover | Server failover
Enterprise admin tools | Small | Excellent

Combination of HANA & Hadoop
- SAP HANA: instant results
- Hadoop: infinite storage, raw data
- SAP & Hadoop: instant access, infinite scale

Connection to HANA
SMART DATA ACCESS (SDA)
Benefits:
- Enables access to remote data just like a “local” table
- Smart query processing, including query decomposition with predicate push-down and functional compensation
- Supports data-location-agnostic development
- No special syntax to access heterogeneous data sources
- Not restricted to Hadoop
Heterogeneous data sources:
- Oracle, MS SQL, Teradata, DB2, Netezza
- Hadoop: Hive, vUDF, Spark
- SAP HANA (BWoH, SoH)
- SAP Sybase ASE, IQ, MaxDB
- SAP Sybase ESP, SQLA
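Predicate push-down, listed among the SDA benefits, means the filter is evaluated at the remote source so that only matching rows travel over the network. The sketch below shows the idea only and is not how SDA is implemented: RemoteSource, its scan method, and the rows_transferred counter are hypothetical names invented for this example.

```python
class RemoteSource:
    """Hypothetical stand-in for a remote table (for example, a Hive table)."""

    def __init__(self, rows):
        self._rows = rows
        self.rows_transferred = 0  # counts rows sent back to the caller

    def scan(self, predicate=None):
        """Return rows; a predicate, if given, is evaluated on the remote side."""
        result = [r for r in self._rows if predicate is None or predicate(r)]
        self.rows_transferred += len(result)
        return result


def query_without_pushdown(source, predicate):
    """Fetch every row, then filter locally."""
    return [r for r in source.scan() if predicate(r)]


def query_with_pushdown(source, predicate):
    """Send the filter to the source so only matching rows are transferred."""
    return source.scan(predicate)


if __name__ == "__main__":
    rows = [{"region": "EU", "amount": i} for i in range(5)]
    rows += [{"region": "US", "amount": i} for i in range(95)]
    is_eu = lambda r: r["region"] == "EU"

    plain, pushed = RemoteSource(rows), RemoteSource(rows)
    assert query_without_pushdown(plain, is_eu) == query_with_pushdown(pushed, is_eu)
    print(plain.rows_transferred, pushed.rows_transferred)
```

Both queries return the same rows; the difference is the number of rows moved, which matters when the remote side holds far more data than the result.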

Example scenario for bringing both worlds together: POS
SCENARIO HADOOP - HANA

Spark
APACHE SPARK
- A very fast in-memory data-processing framework, up to 100x faster than Hadoop MapReduce for in-memory workloads.
- Unlike Hadoop, it supports both batch and streaming analysis: a single framework for batch and near-real-time use cases.
- Spark requires:
  1) Cluster management: standalone, Hadoop YARN, Apache .
  2) A distributed storage system: supports HDFS, Cassandra, OpenStack Swift, Amazon S3. All Hadoop connectors can be leveraged in Spark.
- If you are going to start with Hadoop now, you should do it with Spark.

SAP HANA Vora
WHAT IS INSIDE?
HANA Vora is an in-memory query engine that leverages and extends the Apache Spark execution framework to provide enriched interactive analytics on Hadoop.
- HANA Spark adapter for improved performance between distributed systems
- Compiled queries enable applications and data analysis to work more efficiently across nodes
- Familiar OLAP experience on Hadoop to derive business insights from Big Data, such as drill-down into HDFS data
- Integration of SAP data with data lakes
- HANA connectivity on Hadoop
- Enterprise analytics (hierarchies) and interactive SQL on Hadoop data
- Data tiering from HANA to Hadoop for OLAP scenarios using DLM
- Archiving of ERP data to Hadoop using ILM

SAP HANA Vora
USE CASE: IoT for a turbine
- Sensors stream data continuously
- Sensors are typically structured in a hierarchy
- Information about the hierarchy is typically stored in the ERP system
- Information important for error detection: two sensors
ROLE OF HANA VORA
- Provides OLAP capabilities: joining the hierarchy with IoT data
- Bridges the gap between enterprise systems and the cluster: the turbine's BOM is easily accessible
- Performance of in-memory computing for both enterprise and cluster processing
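Joining a hierarchy with IoT data, as in the turbine use case, can be pictured as rolling each sensor reading up to every component above it. The following is a plain-Python sketch of that idea, not Vora or Spark code: the PARENT mapping, the component names, and the readings are all hypothetical sample data.

```python
# Hypothetical hierarchy as it might be exported from an ERP bill of material:
# each key is a child node, each value is its parent.
PARENT = {
    "sensor_a": "bearing",
    "sensor_b": "bearing",
    "sensor_c": "gearbox",
    "bearing": "turbine",
    "gearbox": "turbine",
}


def ancestors(node, parent=PARENT):
    """Yield the node itself, then every ancestor up to the root."""
    while node is not None:
        yield node
        node = parent.get(node)


def roll_up_max(readings, parent=PARENT):
    """Join readings with the hierarchy and keep the maximum value per node."""
    result = {}
    for sensor, value in readings:
        for node in ancestors(sensor, parent):
            result[node] = max(value, result.get(node, value))
    return result


if __name__ == "__main__":
    stream = [("sensor_a", 71.0), ("sensor_b", 93.5), ("sensor_c", 64.2)]
    print(roll_up_max(stream))
```

With the result keyed by component, a drill-down from the turbine to the bearing to an individual sensor becomes a lookup rather than a rescan of the raw stream.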

Key Scenarios

Key Scenarios
EXAMPLE SCENARIOS
- Flexible data store: Using Hadoop as a flexible store of data captured from multiple sources, including SAP and non-SAP software, enterprise software, and externally sourced data
- Simple database: Using Hadoop as a simple database for storing and retrieving data in very large data sets
- Processing engine: Using the computation engine in Hadoop to execute business logic or some other process
- Data analytics: Mining data held in Hadoop for business intelligence and analytics

Key Scenarios - Architecture
EXAMPLE OF USE SCENARIOS

Key Scenarios – Hadoop as Flexible Data Store
EXAMPLE OF USE SCENARIOS
Social Media
  Description: Real-time capture of data from social media sites, especially of unstructured text
  Sample use cases: Comments on products on Twitter, Facebook, and Amazon
  Comment: Combine social media data with other data, for example CRM data or product data, in real time to gain insight
Data Stream Capture
  Description: Real-time capture of high-volume, rapidly arriving data streams
  Sample use cases: Smart meters, factory floor machines, real-time web logs, sensors in vehicles
Data Archive
  Description: Capture of archive logs that would otherwise be sent to off-line storage
  Sample use cases: Archive data or computer system logs
OLTP Transaction Data
  Description: Long-term persistence of transactional data from historical online transaction processing (OLTP)
  Sample use cases: Call center, inventory
  Comment: Lower costs when compared with conventional solutions

Key Scenarios – Hadoop as Flexible Data Store
EXAMPLE OF USE SCENARIOS
Reference Data
  Description: Copy of existing large reference data sets
  Sample use cases: Census surveys, GIS, large industry-specific data sets, weather measurement and tracking systems
  Comment: Store reference data alongside other data in one place to make it easier to combine for analytic purposes
E-mail Histories
  Description: Capture logs of the e-mail correspondence a company sends and receives
  Sample use cases: Fulfillment of legal requirements for e-mail persistence, and use in analytics
  Comment: Combine data from e-mail with other data to support, for example, risk management
Document & Multimedia Storage
  Description: Capture of business documents generated and received by the business; BLOBs
  Sample use cases: Healthcare, insurance, and other businesses that generate or use large volumes of documents that must be kept for extended periods
  Comment: Store an unlimited number of documents in Hadoop, for example using HBase

Key Scenarios – Hadoop as Processing Engine
EXAMPLE OF USE SCENARIOS
Use Hadoop as a data processing engine for ETL rationalization to feed SAP HANA:
- MapReduce programs execute process logic
- Pig for data analysis
- Mahout for data mining and machine learning
- Replicate master data to Hadoop for data processing
- Feed results to SAP HANA with Data Services and merge with the conformed model

Key Scenarios – Hadoop as Processing Engine
EXAMPLE OF USE SCENARIOS
ETL Rationalization
  Description: Low-latency ingestion of data from operational systems
  Sample use cases: Tiered storage: high-value data loaded and transformed in HANA; in parallel, offload preprocessing to Hadoop
Identify Differences
  Description: Differences in large but similar sets of data
  Sample use cases: DNA analysis
  Comment: Hadoop using MapReduce
Risk Analysis
  Description: Look for known patterns in data in Hadoop that suggest risky behavior
  Sample use cases: Risk in credit cards; rogue traders
Data Cleansing and Enrichment
  Description: Fix data issues. Enhance with additional information
  Sample use cases: Add demographic or other data to, for example, customer web logs
Data Mining
  Description: Look for patterns, data clusters, and correlations in Hadoop
  Sample use cases: Analyze machine data to predict; correlate customer behaviour
  Comment: Requires Mahout

Key Scenarios – Hadoop & HANA for Analytics
EXAMPLE OF USE SCENARIOS
- The volume of data stored in Hadoop is sometimes so large that it can't be replicated into SAP HANA in a cost-effective or timely manner
- Some of the analysis must be done in Hadoop as well as in SAP HANA
- Hadoop queries require longer processing times than SAP HANA queries
- Analysis will likely require combining data from Hadoop, SAP HANA, and other sources
Two approaches:
- Two-phase analytics: run analysis continually on Hadoop, then send periodic updates to SAP HANA for fast interactive query response
- Federated queries: split the analysis into parts and run them asynchronously on Hadoop and SAP HANA; federate the results in SAP HANA or BI
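The federated-query approach can be sketched as two partial queries that run concurrently and are then joined. This is an illustration in plain Python only: query_hadoop and query_hana are hypothetical placeholders that return hard-coded sample results rather than calling any real Hadoop or SAP HANA interface.

```python
from concurrent.futures import ThreadPoolExecutor


def query_hadoop():
    """Hypothetical slow part: click counts per customer from raw web logs."""
    return {"c1": 120, "c2": 45, "c3": 9}


def query_hana():
    """Hypothetical fast part: revenue per customer from ERP data."""
    return {"c1": 1000.0, "c2": 250.0}


def federated_query():
    """Run both parts concurrently, then federate (join) the results by customer."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        clicks_future = pool.submit(query_hadoop)
        revenue_future = pool.submit(query_hana)
        clicks = clicks_future.result()
        revenue = revenue_future.result()
    # Inner join: keep only customers present in both partial results.
    return {
        customer: {"clicks": clicks[customer], "revenue": revenue[customer]}
        for customer in sorted(clicks.keys() & revenue.keys())
    }


if __name__ == "__main__":
    print(federated_query())
```

Two-phase analytics differs mainly in timing: the Hadoop part would run on a schedule and write its result into SAP HANA, so that interactive queries never wait on the slower system.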

Key Scenarios – Hadoop & HANA for Analytics
EXAMPLE OF USE SCENARIOS – Two-Phase Analytics

Key Scenarios – Hadoop & HANA for Analytics
EXAMPLE OF USE SCENARIOS – Federated Queries

Use Cases - Healthcare
USE CASES

Use Cases - Healthcare
EXAMPLE

Use Cases – Predictive Maintenance
EXAMPLE OF USE SCENARIOS
Business challenges:
- A computer server manufacturer wants to implement effective preventative maintenance by identifying problems as they arise, then taking prompt action to prevent the problem from occurring at other customer sites
Technical challenges:
- Identifying problems by analyzing text data from call centers and customer questionnaires together with server logs generated by their hardware
- Combining results with CRM, sales, and manufacturing data to predict which servers are likely to have problems in the future
Solution:
- Use SAP Data Services to analyze call center data and questionnaires stored in Hadoop and identify potential problems
- Use HANA to merge results from Hadoop with server logs to identify indicators of potential problems in those logs
- Combine with CRM, bill of material, and production/manufacturing data to identify cases where preventative maintenance would help

Pay-per-use models for HANA & Hadoop

Service model defined by 5 parameters
EXAMPLE: production SAP ERP 6.0 system
Five standard parameters define the SAP service (diagram labels: Qualitative; Quantitative):
- Availability class: 99.5%
- Managed operations: 24x7
- Disaster-recovery class: DR, local HA, ...
- Managed performance: dialog response time 90% 1 sec.
- Additional certification(s): ISAE3402 (SOX), SAS70
These parameters reflect the SLAs.
These parameters reflect the usage.
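An availability class of 99.5% translates directly into a downtime budget. The short calculation below works that out; the function name and the choice of a 30-day month and a 365-day year as reference periods are assumptions made for this example, not definitions from the service model. Under those assumptions the budget is roughly 3.6 hours per month and 43.8 hours per year.

```python
def allowed_downtime_hours(availability_pct, period_hours):
    """Maximum downtime, in hours, that still meets the availability target."""
    return (1 - availability_pct / 100) * period_hours


if __name__ == "__main__":
    # Assumed reference periods: a 30-day month and a 365-day year.
    monthly = allowed_downtime_hours(99.5, 30 * 24)
    yearly = allowed_downtime_hours(99.5, 365 * 24)
    print(f"per month: {monthly:.1f} h, per year: {yearly:.1f} h")
```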

SLAs verifiable from SAP
Transactions represent the actual utilization of the SAP system and are tied to the business.

And what about SAP HANA?

HANA in the cloud in pay-per-use mode: vHANA
vHANA CLOUD
SERVICES INCLUDED
MONTHLY PAYMENT BASED ON THE MEMORY CONSUMED IN HANA

Hadoop in Pay Per Use based on OpenStack
Service governance (service desk, service management)
- Level 5: Hadoop integration with SAP HANA (administration, connectivity)
- Level 4: Hadoop platform services (administration/monitoring, backup & recovery, patches, upgrades)
- Level 3: OpenStack system services (administration/monitoring, patches, upgrades, ...)
- Level 2: OpenStack framework (Ceph, Neutron, Nova, Heat, ...)
- Level 1: Data center and network services (administration, monitoring, capacity management)

Hadoop in Pay Per Use based on OpenStack
HADOOP CLOUD
SERVICES INCLUDED
MONTHLY PAYMENT FOR THE MANAGED SERVICE BASED ON THE MEMORY/CPU/ CONSUMED BY HADOOP

Take Aways

Summary
TAKE AWAYS
- Hadoop excels at very high scale, low cost per TB, and data type flexibility
- SAP HANA excels at speed and structure, and is fully integrated with Business Suite (enterprise logic)
- Leverage the strengths of both platforms in data store, data processing, and analytics scenarios
- Carefully evaluate your requirements and use case against these scenarios
- If you are about to start with Hadoop, use Apache Spark & Vora
- Both can be deployed in a simple pay-per-use model by Fujitsu
