Apache Kafka And The Rise Of Event-Driven Microservices


Apache Kafka and the Rise of Event-Driven Microservices
Jun Rao
Co-founder of Confluent

LinkedIn in 2010: World’s Largest Professional Network
Connecting talent and opportunity, at scale
- 200M members worldwide
- 2 new members per second
- 100M monthly unique visitors
- 2M company pages

It’s all about data!
(Diagram labels: User, Value, Virality, Data, Product, Insights, Signals, Science)

Initial database-driven architecture
(Diagram labels: web application, web application, database)

Realization #1: Event vs. State
- State: I work at Confluent
- Event: I changed jobs to work at Confluent
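The distinction between an event and state can be made concrete with a small sketch. The Java below is illustrative only: the class and method names (`EventVsState`, `JobChanged`, `currentEmployer`), the member id, and the earlier employer "Acme" are all hypothetical and do not come from the talk or from Kafka. It shows state as something derived by replaying events in order, while the events themselves are kept as immutable facts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; nothing here is a Kafka API.
final class EventVsState {

    // An event: an immutable record that something happened.
    static final class JobChanged {
        final String member;
        final String newEmployer;

        JobChanged(String member, String newEmployer) {
            this.member = member;
            this.newEmployer = newEmployer;
        }
    }

    // State: the current value, derived by applying events in order.
    // Returns null if no event exists for the member.
    static String currentEmployer(String member, List<JobChanged> events) {
        String employer = null;
        for (JobChanged e : events) {
            if (e.member.equals(member)) {
                employer = e.newEmployer; // later events overwrite earlier ones
            }
        }
        return employer;
    }

    public static void main(String[] args) {
        List<JobChanged> events = new ArrayList<>();
        events.add(new JobChanged("member-1", "Acme"));      // illustrative earlier job
        events.add(new JobChanged("member-1", "Confluent")); // "I changed jobs to work at Confluent"

        // State: "I work at Confluent"
        System.out.println(currentEmployer("member-1", events)); // prints Confluent
    }
}
```

A database row typically keeps only the final value; the list of events also preserves the fact that a job change occurred, which is the signal other services may want to react to.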

Event-driven microservices
(Diagram labels: member, recommendation, new job description, search index, graph engine)

Realization #2: leverage non-transactional data
- Business metrics: clicks, search keywords, page views
- Operational metrics: requests/sec, request types/sec
- Application logs: service calls, errors
- IoT

Database a mismatch for both!

Mismatch #1: no first-class API
(Diagram labels: …tion, search index, graph engine)
Tremendous load pressure on the database!

Mismatch #2: not suitable for non-transactional data
- 1000x more volume
- Different transactional needs
- A relational view is not always needed

Danger of Point-to-point Pipelines

Ideal Architecture

1st Attempt: Don’t Reinvent the Wheel
Why not messaging systems?

Version 1 of Kafka
High-throughput pub/sub
- Design 1: make the log a first-class citizen
- Design 2: distributed architecture

Design #1: log as a first-class citizen
(Diagram labels: database, table, log, long poll() API)
- Easy to optimize for throughput
- Persistence for lagging/rewinding consumption
- Ordered delivery to reduce consumer bookkeeping overhead
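The properties listed for the log design can be sketched with a toy in-memory log. This is a minimal illustration under stated assumptions, not Kafka's implementation or API: the class `SimpleLog` and its `append` and `poll` methods are hypothetical, and a real broker persists the log to disk and blocks (long-polls) when no data is available. The sketch shows why ordered, persistent records mean a consumer only has to remember a single offset, and why lagging or rewinding is just a matter of choosing a different offset.

```java
import java.util.ArrayList;
import java.util.List;

// A toy append-only log. Hypothetical sketch; not Kafka's implementation or API.
final class SimpleLog {
    private final List<String> records = new ArrayList<>();

    // Appends a record at the end and returns its offset.
    synchronized long append(String record) {
        records.add(record);
        return records.size() - 1;
    }

    // Returns up to maxRecords records starting at fromOffset, in append order.
    // Returns an empty list if fromOffset is at or past the end of the log.
    synchronized List<String> poll(long fromOffset, int maxRecords) {
        int start = (int) Math.min(Math.max(fromOffset, 0), records.size());
        int end = (int) Math.min((long) start + Math.max(maxRecords, 0), records.size());
        return new ArrayList<>(records.subList(start, end));
    }

    public static void main(String[] args) {
        SimpleLog log = new SimpleLog();
        log.append("a");
        log.append("b");
        log.append("c");

        long offset = 0; // the only bookkeeping the consumer keeps
        List<String> batch = log.poll(offset, 2);
        offset += batch.size();
        System.out.println(batch + " next=" + offset); // prints [a, b] next=2

        offset = 0; // rewinding: records are still there, so just reset the offset
        System.out.println(log.poll(offset, 10)); // prints [a, b, c]
    }
}
```

Because records are never removed on read, several consumers can read the same log at their own pace, each holding its own offset.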

Design #2: distributed architecture
(Diagram labels: producer, producer, producer; Kafka cluster: broker 1, broker 2, broker 3, broker …; consumer, consumer, consumer)

Kafka at LinkedIn in 2011
- 28 billion messages/day
- 460 thousand messages written/sec
- 2.3 million messages read/sec
- Tens of thousands of producers: every production service is a producer
- Data democracy!

Apache Kafka in 2011
- 6 of the top 10 travel companies
- 7 of the top 10 global banks
- 8 of the top 10 insurance companies
- 9 of the top 10 telecom companies

Royal Bank of Canada: Event-Driven Banking
(Diagram labels: Consumer Credit Services, Corporate Real Estate, Investor Services, Treasury Services, …se, SaaS, Fraud, Security)
- 30 use cases
- 50 apps
- 10 different lines of business
- Lowering anomaly detection from weeks to real-time

Carnival cruise line

Building the processing …ice
(Diagram labels: Kafka pub/sub, event-driven microservice)
- Transformation
- Enrichment
- Aggregation

Kafka Streams

    KStream<Integer, Integer> input = builder.stream("numbers-topic");

    // Stateless computation
    KStream<Integer, Integer> doubled = input.mapValues(v -> v * 2);

    // Stateful computation
    KStream<Integer, Integer> sumOfOdds = input
        .filter((k, v) -> v % 2 != 0)
        .selectKey((k, v) -> 1)
        .reduceByKey((v1, v2) -> v1 + v2, "sum-of-odds")
        .toStream();
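To see what the two computations on this slide produce without running a broker, the following plain-Java sketch applies the same logic to a finite list. It is an illustration only: the class `NumbersDemo` and its methods are hypothetical, it does not use or emulate the Kafka Streams API, and it assumes one update is emitted per odd input, whereas a real Kafka Streams application may batch or skip intermediate updates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; does not use the Kafka Streams API.
final class NumbersDemo {

    // Stateless: each output depends only on the current input value.
    static List<Integer> doubled(List<Integer> input) {
        List<Integer> out = new ArrayList<>();
        for (int v : input) {
            out.add(v * 2);
        }
        return out;
    }

    // Stateful: keeps a running sum of odd values across records
    // and emits the updated sum each time an odd value arrives.
    static List<Integer> runningSumOfOdds(List<Integer> input) {
        List<Integer> updates = new ArrayList<>();
        int sum = 0; // state carried from one record to the next
        for (int v : input) {
            if (v % 2 != 0) {
                sum += v;
                updates.add(sum);
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(doubled(input));          // prints [2, 4, 6, 8, 10]
        System.out.println(runningSumOfOdds(input)); // prints [1, 4, 9]
    }
}
```

The stateless step could run on any instance with no coordination; the stateful step needs its running sum kept somewhere durable, which is why the slide's version names a state store ("sum-of-odds").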

KSQL (from Confluent)

    CREATE STREAM vip_actions AS
      SELECT userid, page, action
      FROM clickstream c
      LEFT JOIN users u ON c.userid = u.user_id
      WHERE u.level = 'Platinum';

Event-driven …oservice
(Diagram labels: event-driven microservice, kstreams/ksql, Kafka pub/sub, event-driven microservice)

Still interesting work ahead
- Scalability in metadata
- Streaming database
- Cloud integration

Conclusion
- Business success depends not only on software, but also on how that software is built
- Apache Kafka offers a different kind of platform than the traditional database
- This is an exciting time to work on streams
