Introduction To Graph Analytics And Oracle Cloud Service

Transcription

Introduction to Graph Analytics and Oracle Cloud Service
Hans Viehmann, Product Manager EMEA, Oracle (@SpatialHannes)
Jean Ihm, Product Manager US, Oracle (@JeanIhm)
Korbi Schmid, Engineering Manager, Oracle
October 22, 2018
Copyright 2018, Oracle and/or its affiliates. All rights reserved.

Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation.

Program Agenda
1 Product Introduction
2 Use Cases
3 Feature Overview
4 Demo

[Figure: Twitter follower graph. Legend: Following, no follow back; Follower, no follow back; Follow each other. Source: https://twitter.jeffprod.com]

Oracle’s Spatial and Graph Strategy
Enabling Spatial and Graph use cases on every platform
Oracle Database: Spatial and Graph
Oracle Big Data: Spatial and Graph
Oracle Cloud: Spatial and Graph in Database Cloud Service, Exadata Cloud Service

Two Graph Data Models
Property Graph Model (social network analysis)
– Use cases: Path Analytics, Social Network Analysis, Entity analytics
– Industry domains: Financial; Retail, Marketing; Social Media; Smart Manufacturing
RDF Data Model (Linked Data, Semantic Web)
– Use cases: Data federation, Knowledge representation, Semantic Web
– Industry domains: Life Sciences, Health Care, Publishing, Finance

Graph Database Features:
Scalability, Performance, Security
Graph Analytics
Graph Query Language
Graph Visualization
Standard Interfaces
Integration with Machine Learning tools
[Figures: visualization screenshots, courtesy Linkurious and Tom Sawyer Perspectives]

Oracle Products Supporting Property Graphs
Oracle Big Data Spatial and Graph
Available for Big Data platform
– Hadoop, HBase, Oracle NoSQL
Supported both on BDA and commodity hardware
– CDH and Hortonworks
Database connectivity through Big Data Connectors or Big Data SQL
Part of Big Data Cloud Service
Oracle Spatial and Graph (DB option)
Available with Oracle 12.2/18c (EE)
Using tables for graph persistence
In-database graph analytics
– Sparsification, shortest path, page rank, triangle counting, WCC, sub graph generation
SQL and PGQL queries possible
Included in Database Cloud Services
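Two of the analytics named on this slide, WCC (weakly connected components) and triangle counting, can be illustrated with a minimal Python sketch. It is not the Oracle implementation; the function names and the toy graph are assumptions made up for illustration.

```python
from collections import defaultdict


def weakly_connected_components(vertices, edges):
    """Group vertices into components, ignoring edge direction (union-find)."""
    parent = {v: v for v in vertices}

    def find(v):
        # Walk up to the root, shortening the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for src, dst in edges:
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            parent[root_src] = root_dst

    components = defaultdict(set)
    for v in vertices:
        components[find(v)].add(v)
    return sorted(sorted(members) for members in components.values())


def count_triangles(edges):
    """Count triangles in the undirected view of the graph."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        if src != dst:
            neighbours[src].add(dst)
            neighbours[dst].add(src)
    total = 0
    for u in neighbours:
        for v in neighbours[u]:
            if u < v:
                # Each triangle u < v < w is counted exactly once.
                total += sum(1 for w in neighbours[u] & neighbours[v] if v < w)
    return total


# Hypothetical toy graph: one triangle, one separate edge, one isolated vertex.
vertices = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
wcc = weakly_connected_components(vertices, edges)
triangles = count_triangles(edges)
```

Here `wcc` holds three components and `triangles` is 1.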

Use Cases

Graph Analysis for Business Insight
Identify Influencers
Discover Graph Patterns in Big Data
Generate Recommendations

Banco de Galicia
Customer profitability analysis
– Part of larger Hadoop/Big Data project
Analysis of banking transactions
– Focus on corporate customers
Identification of undesired behavioural patterns, e.g.
– Customers using other banks to make large numbers of transactions
– Many of which flow back to Banco Galicia
Increase fees, terminate contracts, or move activities to Banco Galicia
Implemented by Oracle Consulting

Romanian Police Force
BIG DATA since 2012
Creating Knowledge Graphs from all kinds of content
– Social media networks, documents, images, audio, video, structured data
– Using machine learning (text analysis, classification, entity extraction, face recognition, speech2text, ...)
Enabling relationship analysis and semantic search
bigCONNECT platform built by mWARE
– Running on Big Data Appliance, Big Data Cloud Service or commodity Hadoop

Ministry of Finance, Eastern Europe
Ingesting accounting data in SAF-T format
– Hadoop-based processing (Oozie, Spark, Hive)
– Terabytes of data, rapidly growing
Identifying suspicious patterns
– Circular money transfers
– Connections (existing path/shortest path) to companies in tax havens
Detecting relationships between people, accounts, companies
– Similar to Paradise Papers
Interactive graph analysis in Apex with Cytoscape.js
[Figure: EU VAT fraud. Diagram labels: Importer, „fake” company, Buffer, Reseller, Exporter, Company in other EU member country, Tax authority; Paying VAT, Not paying VAT, Getting VAT refund; BORDER 0% VAT]
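The two suspicious patterns named on this slide, circular money transfers and existing/shortest paths to companies in tax havens, can be sketched with a plain breadth-first search. This is a minimal illustration, not the project's code; the company names, the transfer graph and both function names are hypothetical.

```python
from collections import deque


def shortest_path(adjacency, source, target):
    """Breadth-first search; returns the vertices of a shortest path or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbour in adjacency.get(current, ()):
            if neighbour not in previous:
                previous[neighbour] = current
                queue.append(neighbour)
    return None


def find_cycle_through(adjacency, start):
    """Return a shortest circular transfer chain through `start`, or None."""
    best = None
    for first_hop in adjacency.get(start, ()):
        back = shortest_path(adjacency, first_hop, start)
        if back is not None and (best is None or len(back) + 1 < len(best)):
            best = [start] + back
    return best


# Hypothetical money transfers: payer -> list of payees.
transfers = {
    "importer": ["buffer"],
    "buffer": ["reseller"],
    "reseller": ["exporter", "holding"],
    "exporter": ["importer"],
    "holding": ["offshore"],
}
cycle = find_cycle_through(transfers, "importer")
path_to_haven = shortest_path(transfers, "importer", "offshore")
```

In this toy data, `cycle` is the circular chain importer, buffer, reseller, exporter, importer, and `path_to_haven` is the four-hop connection from the importer to the (assumed) tax-haven company.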

Mazda
Management of Bill-of-materials
– Automotive manufacturing process
– Supporting high variance and short innovation cycles
Data coming from various sources
Complex PGQL queries to associate parts and subcomponents
– Performance as key requirement
– Happy with response times and scalability

Feature Overview

Oracle Graph Analytics Architecture
[Figure: layered architecture diagram with the following labels]
Graph Analytics
– In-memory Analytic Engine
– Java APIs
– REST Web Service
– Python, Perl, PHP, Ruby, Javascript, ...
– Cytoscape Plug-in
– R Integration (OAAgraph)
– Spark integration
Graph Storage Management
– Blueprints & SolrCloud / Lucene
– Java APIs/JDBC/SQL/PLSQL
Scalable and Persistent Storage
– Property Graph Support on Apache HBase, Oracle NoSQL or Oracle 12.2

Interacting with the Graph
On-premise product geared towards data scientists and developers
Access through APIs
– Implementation of Apache TinkerPop Blueprints APIs
– Based on Java, REST plus SolrCloud/Lucene support for text search
Scripting
– Groovy, Python, Javascript, ...
– Apache Zeppelin integration, Javascript (Node.js) language binding
Graphical UIs
– Cytoscape, plug-in available
– Commercial tools such as Tom Sawyer Perspectives

Example: Betweenness Centrality in Big Data
[Code fragment: ...etTopKValues(15)]
[Figure: example graph with vertices A to K]
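To make the metric on this slide concrete, here is a minimal Python sketch of betweenness centrality using Brandes' algorithm for unweighted graphs, followed by a top-k selection. It is not the product's API: `betweenness_centrality`, `top_k` and the five-vertex path graph are assumptions for illustration only.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm on {vertex: [neighbours]}; every neighbour must be a key.

    Scores count ordered (source, target) pairs, so an undirected graph stored
    with both edge directions yields twice the usual undirected value.
    """
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        stack = []
        predecessors = {v: [] for v in adjacency}
        path_count = {v: 0 for v in adjacency}
        path_count[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:  # BFS that also counts shortest paths
            v = queue.popleft()
            stack.append(v)
            for w in adjacency[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        dependency = {v: 0.0 for v in adjacency}
        while stack:  # accumulate dependencies, farthest vertices first
            w = stack.pop()
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    return centrality


def top_k(scores, k):
    """Return the k highest-scoring (vertex, score) pairs, ties broken by name."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical path graph a - b - c - d - e, stored in both directions.
path_graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
scores = betweenness_centrality(path_graph)
top_two = top_k(scores, 2)
```

The middle vertex `c` lies on the most shortest paths and therefore ranks first.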

Pattern matching in Property Graphs using PGQL
Finding a given pattern in graph
– Fraud detection
– Anomaly detection
– Subgraph extraction
– ...
SQL-like syntax but with graph pattern description and property access
– Interactive (real-time) analysis
– Supporting aggregates, comparison, such as max, min, order by, group by
Proposed for standardization by Oracle
– Specification available on-line
– Open-sourced front-end (i.e. parser): https://github.com/oracle/pgql-lang

Basic graph pattern matching
Find all instances of a given pattern/template in the data graph
Query: Find all people who are known by friends of ‘Amber’.
    SELECT v3.name, v3.age
    FROM socialNetworkGraph
    MATCH (v1:Person) -[:friendOf]-> (v2:Person) -[:knows]-> (v3:Person)
    WHERE v1.name = 'Amber'
[Figure: socialNetworkGraph]
Vertices
– 100 :Person, name ‘Amber’, age 25
– 200 :Person, name ‘Paul’, age 30
– 300 :Person, name ‘Heather’, age 27
– 777 :Company, name ‘Oracle’, location ‘Redwood City’
Edges
– :worksAt {1831}, startDate ’09/01/2015’
– :friendOf {1173}
– :friendOf {2513}, since ’08/01/2014’
– :knows {2200}
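The way such a fixed two-hop pattern is evaluated can be sketched in a few lines of Python. This is an illustration only, not how PGX or the database evaluates PGQL. The vertex properties come from the slide, but the edge endpoints are not recoverable from the transcription and are assumptions, as are the helper names.

```python
# Vertex properties as listed on the slide.
vertices = {
    100: {"label": "Person", "name": "Amber", "age": 25},
    200: {"label": "Person", "name": "Paul", "age": 30},
    300: {"label": "Person", "name": "Heather", "age": 27},
    777: {"label": "Company", "name": "Oracle", "location": "Redwood City"},
}

# (source, edge label, destination). The endpoints are ASSUMED for this sketch.
edges = [
    (100, "friendOf", 200),
    (200, "friendOf", 100),
    (200, "knows", 300),
    (100, "worksAt", 777),
]


def out_neighbours(vertex_id, edge_label, vertex_label):
    """Yield destinations of matching outgoing edges with the given vertex label."""
    for src, label, dst in edges:
        if src == vertex_id and label == edge_label and vertices[dst]["label"] == vertex_label:
            yield dst


def known_by_friends_of(name):
    """(v1:Person) -[:friendOf]-> (v2:Person) -[:knows]-> (v3:Person), v1.name = name."""
    rows = []
    for v1, props in vertices.items():
        if props["label"] != "Person" or props["name"] != name:
            continue
        for v2 in out_neighbours(v1, "friendOf", "Person"):
            for v3 in out_neighbours(v2, "knows", "Person"):
                rows.append((vertices[v3]["name"], vertices[v3]["age"]))
    return rows


result = known_by_friends_of("Amber")
```

Under the assumed edges, `result` contains the single row for Heather, age 27.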



Regular path expressions
Matching a pattern repeatedly
– Define a PATH expression at the top of a query
– Instantiate the expression in the MATCH clause
– Match repeatedly, e.g. zero or more times (*) or one or more times (+)
    PATH has_parent AS (child) -[:has_father|has_mother]-> (parent)
    SELECT x.name, y.name, ancestor.name
    FROM snGraph
    MATCH (x:Person) -/:has_parent+/-> (ancestor),
          (y) -/:has_parent+/-> (ancestor)
    WHERE x.name = 'Peter' AND x <> y
Result set
    x.name   y.name   ancestor.name
    Peter    Retta    Paul
    Peter    Dwight   Paul
    Peter    Dwight   Retta
[Figure: snGraph with :Person vertices Amber (age 29), Paul (age 64), Retta (age 43), Dwight (age 15) and Peter (age 12), connected by :likes (with since dates), :has_father and :has_mother edges]
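The "one or more times" semantics of the path expression amounts to a transitive closure over the parent edges. The Python sketch below illustrates that idea; it is not the PGQL engine. The edge endpoints are not legible in the transcription, so the four parent edges here are assumptions chosen to be consistent with the result set on the slide.

```python
# (child, edge label, parent). Endpoints are ASSUMED; only the labels and the
# result set are given on the slide.
parent_edges = [
    ("Dwight", "has_father", "Paul"),
    ("Dwight", "has_mother", "Retta"),
    ("Retta", "has_father", "Paul"),
    ("Peter", "has_mother", "Retta"),
]
people = ["Amber", "Paul", "Retta", "Dwight", "Peter"]


def ancestors(person):
    """Vertices reachable via one or more has_father/has_mother edges."""
    found = set()
    frontier = [person]
    while frontier:
        current = frontier.pop()
        for child, label, parent in parent_edges:
            if child == current and label in ("has_father", "has_mother") and parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


def common_ancestor_rows(x_name):
    """Rows (x, y, ancestor) where x and y differ and share the ancestor."""
    rows = []
    x_ancestors = ancestors(x_name)
    for y in people:
        if y == x_name:
            continue
        for ancestor in sorted(x_ancestors & ancestors(y)):
            rows.append((x_name, y, ancestor))
    return rows


rows = common_ancestor_rows("Peter")
```

With these assumed edges, `rows` reproduces the three rows of the result set above.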


Support for Graph Pattern Matching
[Figure: layered architecture diagram with the following labels]
Graph Analytics
– In-memory Analytic Engine: PGQL in PGX
– Java APIs
– REST Web Service
– Python, Perl, PHP, Ruby, Javascript, ...
– Cytoscape Plug-in
– R Integration (OAAgraph)
– Spark integration
Graph Storage Management
– Blueprints & SolrCloud / Lucene
– PGQL-to-SQL
– Java APIs/JDBC/SQL/PLSQL
Scalable and Persistent Storage
– Property Graph Support on Apache HBase, Oracle NoSQL or Oracle 12.2

Path Query (Parallel Recursive With)
Find the pairs of people who are connected to a common person through the “knows” relation
PGQL:
    PATH knows_path := () -[:knows]-> ()
    SELECT s1.fname, s2.fname
    WHERE (s1) -/:knows_path*/-> (o) <-/:knows_path*/- (s2)
    ORDER BY s1, s2
SQL:
    SELECT T2.T AS "s1.fname$T", T2.V AS "s1.fname$V", T2.VN AS "s1.fname$VN", T2.VT AS "s1.fname$VT",
           T3.T AS "s2.fname$T", T3.V AS "s2.fname$V", T3.VN AS "s2.fname$VN", T3.VT AS "s2.fname$VT"
    FROM (/*Path[*/
      SELECT DISTINCT SVID, DVID FROM (
        SELECT VID AS SVID, VID AS DVID FROM "GRAPH1VT$"
        UNION ALL
        SELECT SVID, DVID FROM (
          WITH RW (ROOT, SVID, DVID, LVL) AS (
            SELECT ROOT, SVID, DVID, LVL FROM (
              SELECT SVID ROOT, SVID, DVID, 1 LVL FROM (
                SELECT T0.SVID AS SVID, T0.DVID AS DVID FROM "GRAPH1GT$" T0 WHERE (T0.EL = n'knows')))
            UNION ALL
            SELECT DISTINCT RW.ROOT, R.SVID, R.DVID, RW.LVL+1 FROM (
              SELECT T1.SVID AS SVID, T1.DVID AS DVID FROM "GRAPH1GT$" T1 WHERE (T1.EL = n'knows')) R, RW
            WHERE RW.DVID = R.SVID)
          CYCLE SVID SET cycle_col TO 1 DEFAULT 0
          SELECT ROOT SVID, DVID FROM RW))
    /*]Path*/) T6,
    (/*Path[*/
      SELECT DISTINCT SVID, DVID FROM (
        SELECT VID AS SVID, VID AS DVID FROM "GRAPH1VT$"
        UNION ALL
        SELECT SVID, DVID FROM (
          WITH RW (ROOT, SVID, DVID, LVL) AS (
            SELECT ROOT, SVID, DVID, LVL FROM (
              SELECT SVID ROOT, SVID, DVID, 1 LVL FROM (
                SELECT T4.SVID AS SVID, T4.DVID AS DVID FROM "GRAPH1GT$" T4 WHERE (T4.EL = n'knows')))
            UNION ALL
            SELECT DISTINCT RW.ROOT, R.SVID, R.DVID, RW.LVL+1 FROM (
              SELECT T5.SVID AS SVID, T5.DVID AS DVID FROM "GRAPH1GT$" T5 WHERE (T5.EL = n'knows')) R, RW
            WHERE RW.DVID = R.SVID)
          CYCLE SVID SET cycle_col TO 1 DEFAULT 0
          SELECT ROOT SVID, DVID FROM RW))
    /*]Path*/) T7,
    "GRAPH1VT$" T2, "GRAPH1VT$" T3
    WHERE T2.K = n'fname' AND T3.K = n'fname'
      AND T6.SVID = T2.VID AND T6.DVID = T7.DVID AND T7.SVID = T3.VID
    ORDER BY T6.SVID ASC NULLS LAST, T7.SVID ASC NULLS LAST
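The core idea of the translation, expressing a zero-or-more-hops path as a recursive WITH query and then joining two reachability sets on a common end vertex, can be tried out on a much smaller scale. The sketch below uses Python's built-in `sqlite3` module rather than Oracle Database, with a simplified, hypothetical schema (`person`, `knows`) and made-up data. Unlike the generated SQL on the slide, it relies on `UNION` for cycle handling and deduplicates the resulting pairs.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE person (vid INTEGER PRIMARY KEY, fname TEXT);
    CREATE TABLE knows (svid INTEGER, dvid INTEGER);
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (4, 'Dee');
    INSERT INTO knows VALUES (1, 3), (2, 3), (3, 1);
""")

query = """
WITH RECURSIVE reach(svid, dvid) AS (
    SELECT vid, vid FROM person                 -- zero hops: every vertex reaches itself
    UNION                                       -- UNION drops duplicates, so cycles terminate
    SELECT reach.svid, knows.dvid               -- extend a known path by one 'knows' edge
    FROM reach JOIN knows ON knows.svid = reach.dvid
)
SELECT DISTINCT p1.fname AS s1_fname, p2.fname AS s2_fname
FROM reach AS r1
JOIN reach AS r2 ON r1.dvid = r2.dvid           -- both paths end at a common person
JOIN person AS p1 ON p1.vid = r1.svid
JOIN person AS p2 ON p2.vid = r2.svid
ORDER BY s1_fname, s2_fname
"""

pairs = connection.execute(query).fetchall()
connection.close()
```

In this toy data, Ann and Bob both reach Cy, so they form a pair, while Dee knows nobody and is only paired with herself.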

Notebook integration
Multi-purpose notebook for data analysis and visualization
– Browser-based script and query execution
For documentation and interactive analysis
– Typically used by data scientists
Interpreters for graph analysis and graph pattern matching
– PGX, PGQL, Markdown
Graph visualization
Integrated with Graph Cloud Service

Combining Graph Analytics and Machine Learning
Graph Analytics
– Compute graph metric(s)
– Explore graph or compute new metrics using ML result
Machine Learning
– Build predictive model using graph metric
– Build model(s) and score or classify data
[Arrows between the two sides: Add to structured data; Add to graph]
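The "compute graph metric, add to structured data" step can be sketched as follows. The example computes PageRank by power iteration and attaches the score as an extra column on tabular records that a model could then be trained on. It is a minimal sketch, not the OAAgraph or PGX workflow; the graph, the customer records and all names are hypothetical, and no model is actually trained.

```python
def pagerank(adjacency, damping=0.85, iterations=50):
    """Power iteration on {vertex: [out-neighbours]}; dangling mass is spread evenly."""
    vertices = list(adjacency)
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in vertices if not adjacency[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in vertices}
        for v in vertices:
            for w in adjacency[v]:
                new_rank[w] += damping * rank[v] / len(adjacency[v])
        rank = new_rank
    return rank


# Hypothetical graph: who transfers money to whom.
graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)

# Hypothetical structured data; the graph metric becomes one more feature column.
customers = [
    {"id": "a", "balance": 120.0},
    {"id": "b", "balance": 40.0},
    {"id": "c", "balance": 990.0},
    {"id": "d", "balance": 15.0},
]
for row in customers:
    row["pagerank"] = scores[row["id"]]

most_central = max(scores, key=scores.get)
```

The enriched `customers` rows can be passed to whichever ML library is in use; going the other way, model scores could be written back as vertex properties ("Add to graph").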

OAAgraph integration with R
OAAgraph integrates in-memory engine into ORE and ORAAH
Adds powerful graph analytics and querying capa
