Query Real-Time Kafka Streams With Oracle SQL

Transcription

Query Real-Time Kafka Streamswith Oracle SQLMelli AnnamalaiSenior Principal Product ManagerOracleOctober 23, 2018melliyal.Annamalai@oracle.comCopyright 2018, Oracle and/or its affiliates. All rights reserved.

Safe Harbor StatementThe following is intended to outline our general product direction. It is intended forinformation purposes only, and may not be incorporated into any contract. It is not acommitment to deliver any material, code, or functionality, and should not be relied uponin making purchasing decisions. The development, release, timing, and pricing of anyfeatures or functionality described for Oracle’s products may change and remains at thesole discretion of Oracle Corporation.Copyright 2018, Oracle and/or its affiliates. All rights reserved.

Agenda Demo: Monitor temperature readings from food distribution centers Apache Kafka Concepts SQL Connector for Kafka– Database objects– SQL syntax examples– Performance and scale RoadmapCopyright 2018, Oracle and/or its affiliates. All rights reserved. 3

Streaming DataIn real-time: Process datastreams as they occurCopyright 2018, Oracle and/or its affiliates. All rights reserved. 4

Streaming Data Examples Sensor data Clickstream data Machine/equipmentHigh Velocity Network monitoringHigh Volume Pipeline monitoringExample: 200k/second, 24x7 Fleet monitoring Copyright 2018, Oracle and/or its affiliates. All rights reserved. 5

DemoCopyright 2018, Oracle and/or its affiliates. All rights reserved. 6

Monitoring Temperature at Food Distribution CentersCopyright 2018, Oracle and/or its affiliates. All rights reserved. 7

Apache Kafka ConceptsCopyright 2018, Oracle and/or its affiliates. All rights reserved. 8

Apache KafkaProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. ConsumersApplications thatread and process messages9

Kafka TopicsMessages with acommon formatTopic 1Topic 2Topic 3Topic 4Topic 5ProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. ConsumersApplications thatread and process messages11

Kafka Topic ht 2018, Oracle and/or its affiliates. All rights reserved. 12

Kafka Topic PartitionsSubset of messagesin a Topic, managed by a brokerProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. ConsumersApplications thatread and process messages13

Kafka Topic PartitionsPartition 1Partition 2Partition 3Partition 4ProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. ConsumersApplications thatread and process messages14

A Kafka Consumer ApplicationShould read all partitionsPartition 1Partition 2Partition 3Partition 4ProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. ConsumersApplications thatread and process messages15

Kafka Topic PartitionsPartition 1Partition 2Partition 3Partition 4ProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. ConsumersApplications thatread and process messages16

Kafka Topic Partition OffsetsLogical Position of aMessage within a Partition34.3 36.2 34.2 33.4 34.5 31.9 32.1ProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. ConsumersApplications thatread and process messages17

Logical Position (Offset) Maps to Sequence Numbers . 103 104 105 106 107 108 109 110 111 112 Offset uniquely identifies a message . 34.3 36.2 34.2 33.4 34.5 31.9 32.1 32.6 33.2 33.1 Consumers can read from an Offset Starting offset is 0(at the beginning of a topic)Copyright 2018, Oracle and/or its affiliates. All rights reserved. 18

Using Data from Kafka Topics Kafka Consumer application Kafka Streams API Stream Kafka topic data into HDFS/Object store/databases using Kafkaconnectors KSQL: Streaming SQL engine for real-time data processing of Kafka topicsCopyright 2018, Oracle and/or its affiliates. All rights reserved. 19

Kafka Consumer ExampleWith Automatic Offset CommitsFrom: lCopyright 2018, Oracle and/or its affiliates. All rights reserved. 20

Oracle Database as a Kafka ConsumerEnable Oracle SQL access to Kafka TopicsPartition 1Partition 2Partition 3Partition 4ProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. Oracle DatabaseExternal tables and views21

Oracle SQL and Kafka1. Integrate Kafka messages with Oracle Database applications2. Enrich Kafka data with Oracle Database table data– Feed enriched topics back to KafkaCopyright 2018, Oracle and/or its affiliates. All rights reserved. 22

1. Integrate Kafka with Database Applications Query Kafka messages– Integrate and analyze with Oracle Database data– Use the full richness of Oracle SQL Join data in a Kafka time interval with an Oracle Database table Load into Oracle Database table using Oracle SQLCopyright 2018, Oracle and/or its affiliates. All rights reserved. 23

2. Enrich Kafka Data with Oracle Database Table Data . 34.3 36.2 34.2 33.4 34.5 31.9 33.1 . 34.3, Seattle, LastOutageDate; 36.2, Austin, LastOutageDate; 34.2, Philadelphia, LastOutageDate Copyright 2018, Oracle and/or its affiliates. All rights reserved. 24

SQL Connector for KafkaCopyright 2018, Oracle and/or its affiliates. All rights reserved. 25

Apache Kafka Oracle DatabasePartition 1Partition 2Partition 3Partition 4Kafka ClusterExternaltablesViews onthe externaltablesCopyright 2018, Oracle and/or its affiliates. All rights reserved. Database applicationsquery views to accessKafka messages26

Database Objects Views over Kafka TopicsNameNull?------------------- -------KAFKA PARTITIONKAFKA OFFSETKAFKA P(6)NUMBERNUMBERNUMBER(6,3)NUMBER(6,3) Views are created per Topic There can be multiple sets of views per Topic: One set per application(Consumer Group)Copyright 2018, Oracle and/or its affiliates. All rights reserved. 27

Database Objects Metadata tables– DBMS KAFKA CLUSTER– DBMS KAFKA PARTITION– DBMS KAFKA LOAD METRICS– DBMS KAFKA APPLICATION Message schema tableNameNull?------------------- -------MSGNUMBERNOT R(6,3)Copyright 2018, Oracle and/or its affiliates. All rights reserved. 28

Query Streaming Data with a ViewNo Need to Persist or Stage Data in a Table . 34.3 36.2 34.2 33.4 34.5 31.9 33.1 NameNull?------------------- -------KAFKA PARTITIONKAFKA OFFSETKAFKA P(6)NUMBERNUMBERNUMBER(6,3)NUMBER(6,3)Copyright 2018, Oracle and/or its affiliates. All rights reserved. 29

Join Kafka Topic View with Oracle Database TableNameNull?------------------- -------KAFKA PARTITIONKAFKA OFFSETKAFKA P(6)NUMBERNUMBERNUMBER(6,3)NUMBER(6,3)View on Kafka TopicsNameNull?------------------- -------SENSORUNITIDNOT )NUMBERNUMBERVARCHAR2(30)Database tableFind cities where the temperature fluctuation is more than normalCopyright 2018, Oracle and/or its affiliates. All rights reserved. 30

Offset Management0123434.2 34.5 34.1 34.6 34.3 . . . .Copyright 2018, Oracle and/or its affiliates. All rights reserved. 31

Offset ManagementCommit offset0123434.2 34.5 34.1 34.6 34.3RECORD OFFSET() . . . .Copyright 2018, Oracle and/or its affiliates. All rights reserved. 32

Offset ManagementSET OFFSET()Start locationfor next operation012345678910111213 . 34.2 34.5 34.1 34.6 34.3 36.2 34.2 33.4 34.5 31.9 33.1 34.5 31.6 34.5 . Offsets of each Consumer Group application are managed independentlyCopyright 2018, Oracle and/or its affiliates. All rights reserved. 33

SQL Syntax ExamplesCopyright 2018, Oracle and/or its affiliates. All rights reserved. 34

1. REGISTER CLUSTERSQL BEGINdbms kafka.register cluster(‘SENS2',‘ Zookeeper URL :2181',‘ Kafka broker URL :9092','DBMSKAFKA DEFAULT DIR','DBMSKAFKA LOCATION DIR‘'Testing DBMS KAFKA');END;Copyright 2018, Oracle and/or its affiliates. All rights reserved. 35

2. Create ViewsDECLAREviews created INTEGER;view prefixVARCHAR2(128);BEGINDBMS KAFKA.CREATE KAFKA VIEWS(‘SENS2',‘MONITORAPP', ‘sensor2', ‘sensor messages shape‘,views created, view prefix);END;/One view per partitionCopyright 2018, Oracle and/or its affiliates. All rights reserved. 36

3. Query with Oracle SQLSQL select count(*) from KV SENS2 MONITORAPP SENSOR2 0;SQL select timestamp, sensorunitid, temperaturereadingfrom KV SENS2 MONITORAPP SENSOR2 0;Copyright 2018, Oracle and/or its affiliates. All rights reserved. 37

Record Offset and Set OffsetSQL execute dbms kafka.record offset('KV SENS2 MONITORAPP SENSOR2 0');Record offset0134.2 34.52334.1 34.6SQL execute dbms kafka.set offset('KV SENS2 MONITORAPP SENSOR2 0');0134.2 34.5Copyright 2018, Oracle and/or its affiliates. All rights reserved. 2432.1Set offset334.1 34.64532.1 32.238

Loading into Oracle DatabaseDECLARErows loaded number;BEGINdbms kafka.load table('SENS2','LOADAPP','sensor2','sensor messages shape',rows loaded);dbms output.put line('rows loaded: ' rows loaded);END;/Copyright 2018, Oracle and/or its affiliates. All rights reserved. 39

Performance and ScaleCopyright 2018, Oracle and/or its affiliates. All rights reserved. 40

Scaling in Kafka Increase number of Partitions Increase Consumer instancesCopyright 2018, Oracle and/or its affiliates. All rights reserved. 41

One Consumer reads from all PartitionsKafka Consumer GroupsPartition 1Partition 2Partition 3Partition 4Topic with 4 PartitionsProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. ConsumersApplications thatread and process messages42

One Consumer reads from two PartitionsKafka Consumer GroupsPartition 1Partition 2Partition 3Partition 4Topic with 4 PartitionsProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. ConsumersApplications thatread and process messages43

One Consumer per PartitionKafka Consumer GroupsPartition 1Partition 2Partition 3Partition 4Topic with 4 PartitionsProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. ConsumersApplications thatread and process messages44

One view reads from all PartitionsKafka Consumer GroupsPartition 1Partition 2Partition 3Partition 4Topic with 4 PartitionsProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. Database views45

Each view reads from two PartitionsKafka Consumer GroupsPartition 1Partition 2Partition 3Partition 4Topic with 4 PartitionsProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. Database views46

Four views, one per PartitionKafka Consumer GroupsPartition 1Partition 2Partition 3Partition 4Topic with 4 PartitionsProducersEntities producingstreaming dataKafka ClusterStores and manages streaming datain a distributed, replicated,fault-tolerant clusterCopyright 2018, Oracle and/or its affiliates. All rights reserved. Database views47

Kafka Partitions and Database Views Number of views can be 1 total number of partitions Scale by increasing the number of views Can have multiple sets of views for the same Topic– A set of views is a ‘Consumer Group’– Offsets for each Consumer Group managed independentlyCopyright 2018, Oracle and/or its affiliates. All rights reserved. 48

Factors that Impact Performance Degree of parallelism in the database Network bandwidth– Between Kafka brokers and database systemCopyright 2018, Oracle and/or its affiliates. All rights reserved. 49

Kafka and Oracle Products and Cloud ServicesCopyright 2018, Oracle and/or its affiliates. All rights reserved. 50

Oracle Big Data SQLKafka ClusterHadoop Cluster with Apache HiveBig Data SQL external tableShipping since 2017Copyright 2018, Oracle and/or its affiliates. All rights reserved. 51

Query Hive table fromOracle DatabaseHive table reads fromKafkaCREATE TABLE TRAFFIC JSON TAB(ID VARCHAR2(256),CLIENTID VARCHAR2(256), ),EVENTTIMEASSTRING VARCHAR2(256),SENDER VARCHAR2(256),TYPE VARCHAR2(256),PROPERTIES VARCHAR2(256),DIRECTION VARCHAR2(256),RECEIVEDTIME NUMBER(38,0),RECEIVEDTIMEASSTRING VARCHAR2(256),PAYLOAD VARCHAR2(4000))ORGANIZATION EXTERNAL(TYPE ORACLE HIVE DEFAULT DIRECTORY DEFAULT DIRACCESS PARAMETERS(com.oracle.bigdata.cluster bigdatalitecom.oracle.bigdata.tablename oracletest.traffic json tab))Copyright 2018, Oracle and/or its affiliates. All rights reserved. 52

Oracle Event Hub Cloud ServiceEnterprise Kafka features in Event Hub Scalable and highly availableMessaging/Event brokerSecuritySchema registryConnectors:object storage, databases, other external systemsAccess from Database Cloud Servicevia Database viewsCopyright 2018, Oracle and/or its affiliates. All rights reserved. 53

Oracle Event Hub Cloud Service Use CasesSingle Infra for Data Movement, Application Messaging, Events Broker, Stream AnalyticsKafka Use casesOracle PaaS ServiceAdd-OnsData Movement into data lakes (Firehose) and DB Change propagation (Kafka Connect)Big Data CloudApplication messaging for Microservices; replacement for JMS; Events Broker (Kafka Connect,Kafka Streams)Integration CloudEvent Processing & Analytics (Kafka Streams and KSQL)Streaming AnalyticsIoT Events and Logs transport (Kafka Produce/Consume, Kafka Connect)Mobile CloudData Lake FirehoseApplication MessagingIoT Use CaseCopyright 2018, Oracle and/or its affiliates. All rights reserved. Database Change Use Case55

RoadmapCopyright 2018, Oracle and/or its affiliates. All rights reserved. 55

Planned Release Roadmap Addition to Big Data Connectors product suite BYOL for Cloud deployments– Note: Big Data Connectors are included in Big Data Cloud Service and Big DataCloud@Customer Cloud Managed Services– Autonomous Data Warehouse Cloud Service (ADWCS)Copyright 2018, Oracle and/or its affiliates. All rights reserved. 56

Peek into Planned Functionality Roadmap Support Kafka Streams– Enrich Kafka Topics using Oracle SQL and database tables Retrieval based on timestamp informationCopyright 2018, Oracle and/or its affiliates. All rights reserved. 57

Q/Amelliyal.Annamalai@oracle.comCopyright 2018, Oracle and/or its affiliates. All rights reserved. 58

Query Real-Time Kafka Streams with Oracle SQL Author: Oracle Subject: Oracle Open World 2018 Keywords: cloud;