Transcription
Apache Kafka and the Rise of Event-Driven Microservices
Jun Rao, Co-founder of Confluent
LinkedIn in 2010: World’s Largest Professional Network
Connecting talent with opportunity, at scale
– 200M members worldwide
– 2 new members per second
– 100M monthly unique visitors
– 2M company pages
It’s all about data!
Diagram labels: User, Value, Virality, Data, Product, Insights, Signals, Science
Initial database driven architecture
Diagram labels: web application, web application, database
Realization #1: Event vs. State
– State: I work at Confluent
– Event: I changed jobs to work at Confluent
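To make the event/state distinction concrete, here is a minimal sketch in plain Java. The class names, the member id "member-1" and the earlier employer "PreviousCo" are made up for illustration and are not from the talk. It shows one common reading of the slide: state is what you get by replaying events in order.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical event: a member changed employer (names are illustrative).
final class JobChanged {
    final String member;
    final String newEmployer;

    JobChanged(String member, String newEmployer) {
        this.member = member;
        this.newEmployer = newEmployer;
    }
}

class EventVsState {
    // Derive state by replaying events in order: the latest event per member wins.
    static Map<String, String> currentEmployers(List<JobChanged> events) {
        Map<String, String> state = new HashMap<>();
        for (JobChanged e : events) {
            state.put(e.member, e.newEmployer);
        }
        return state;
    }

    public static void main(String[] args) {
        List<JobChanged> events = Arrays.asList(
                new JobChanged("member-1", "PreviousCo"),  // hypothetical earlier job
                new JobChanged("member-1", "Confluent"));  // the job change from the slide
        Map<String, String> state = currentEmployers(events);
        System.out.println("State: member-1 works at " + state.get("member-1"));
        System.out.println("Events available to downstream services: " + events.size());
    }
}
```

The state map only answers "where does this member work now", while the event list also records that a change happened, which is what downstream services would react to.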
Event driven microservices
Diagram labels: member recommendation, new job description, search index, graph engine
Realization #2: leverage nontransactional data
– Business metrics: clicks, search keywords, pageviews
– Operational metrics: requests/sec, request types/sec
– Application logs: service calls, errors
– IoT
Database a mismatch for both!
Mismatch #1: no first class API
Diagram labels (first label truncated in the source): [...]tion, search index, graph engine
Tremendous load pressure on database!
Mismatch #2: not suitable for nontransactional data
– 1000X more volume
– Different transactional needs
– Not always needing a relational view
Danger of Point-to-point Pipelines
Ideal Architecture
1st Attempt: Don’t Reinvent the Wheel
Why not messaging systems?
Version 1 of Kafka
High-throughput pub/sub
– Design 1: make the log a first-class citizen
– Design 2: distributed architecture
Design #1: log as a first-class citizen
Diagram labels: database, table, log, long poll() API
– Easy to optimize for throughput
– Persistence for lagging/rewinding consumption
– Ordered delivery to reduce consumer bookkeeping overhead
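The log-centric design can be illustrated with a toy in-memory log. This is a sketch under stated assumptions, not Kafka's implementation or client API: `ToyLog`, `append` and `poll(fromOffset, maxRecords)` are hypothetical names, and this `poll` returns immediately, whereas a long poll would wait for new data. The point is that records are kept in append order, so a consumer only has to remember one number, its next offset, and can rewind by resetting it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// A toy append-only log; names are illustrative, not Kafka's API.
class ToyLog {
    private final List<String> records = new ArrayList<>();

    // Append a record and return the offset (position) it was assigned.
    synchronized long append(String record) {
        records.add(record);
        return records.size() - 1;
    }

    // Return up to maxRecords records starting at fromOffset, in append order.
    synchronized List<String> poll(long fromOffset, int maxRecords) {
        if (maxRecords <= 0 || fromOffset < 0 || fromOffset >= records.size()) {
            return Collections.emptyList();
        }
        int start = (int) fromOffset;
        int end = (int) Math.min((long) records.size(), fromOffset + maxRecords);
        return new ArrayList<>(records.subList(start, end));
    }
}

class ToyLogDemo {
    public static void main(String[] args) {
        ToyLog log = new ToyLog();
        log.append("a");
        log.append("b");
        log.append("c");

        long offset = 0;                         // the consumer's only bookkeeping
        List<String> batch = log.poll(offset, 2);
        offset += batch.size();
        System.out.println(batch + ", next offset = " + offset);

        offset = 0;                              // rewind: records are still retained
        System.out.println(log.poll(offset, 10));
    }
}
```

Because reads do not delete anything, a lagging consumer and a rewinding consumer are served the same way as an up-to-date one.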
Design #2: distributed architecture
Diagram labels: producer, producer, producer; Kafka cluster: broker 1, broker 2, broker 3, broker [...]; consumer, consumer, consumer
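One way to picture how load spreads over several brokers is key hashing. The sketch below is a simplification and an assumption on my part, not something the slide shows. In Kafka itself, keyed records are assigned to partitions of a topic and partitions are placed on brokers, whereas this toy `ToyCluster` (a hypothetical name) keeps one list per broker and hashes with `String.hashCode`.

```java
import java.util.ArrayList;
import java.util.List;

// Toy cluster: one in-memory message list per broker (illustrative only).
class ToyCluster {
    private final List<List<String>> brokers = new ArrayList<>();

    ToyCluster(int numBrokers) {
        if (numBrokers <= 0) {
            throw new IllegalArgumentException("numBrokers must be positive");
        }
        for (int i = 0; i < numBrokers; i++) {
            brokers.add(new ArrayList<>());
        }
    }

    // The same key always maps to the same broker; floorMod keeps the result non-negative.
    static int brokerFor(String key, int numBrokers) {
        return Math.floorMod(key.hashCode(), numBrokers);
    }

    // Route the message by key and return the index of the broker that stored it.
    int send(String key, String message) {
        int broker = brokerFor(key, brokers.size());
        brokers.get(broker).add(message);
        return broker;
    }

    // Copy of the messages held by one broker, in arrival order.
    List<String> messagesOn(int broker) {
        return new ArrayList<>(brokers.get(broker));
    }
}

class ToyClusterDemo {
    public static void main(String[] args) {
        ToyCluster cluster = new ToyCluster(3);
        int first = cluster.send("member-1", "profile viewed");
        int second = cluster.send("member-1", "job changed");
        System.out.println("same broker for same key: " + (first == second));
        System.out.println(cluster.messagesOn(first));
    }
}
```

Routing by key keeps all messages for one key in order on one broker while different keys can be handled in parallel.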
Kafka at LinkedIn in 2011
– 28 billion messages/day
– 460 thousand messages written/sec
– 2.3 million messages read/sec
– Tens of thousands of producers (every production service is a producer)
Data democracy!
Apache Kafka in 2011
– 6 of the top 10 travel companies
– 7 of the top 10 global banks
– 8 of the top 10 insurance companies
– 9 of the top 10 telecom companies
Royal Bank of Canada: Event-Driven Banking
Diagram labels (partly truncated in the source): Consumer Credit Services, Corporate Real Estate, Investor Services, Treasury Services, [...]se, SaaS, Fraud, Security
– 30 use cases
– 50 apps
– 10 different lines of business
– Lowering anomaly detection from weeks to real-time
Carnival cruise line
Building the processing [...]ice
Diagram labels: Kafka pub/sub, event-driven microservice
– Transformation
– Enrichment
– Aggregation
Kafka Streams
KStream<Integer, Integer> input = builder.stream("numbers-topic");
// Stateless computation
KStream<Integer, Integer> doubled = input.mapValues(v -> v * 2);
// Stateful computation
KStream<Integer, Integer> sumOfOdds = input
    .filter((k, v) -> v % 2 != 0)
    .selectKey((k, v) -> 1)
    .reduceByKey((v1, v2) -> v1 + v2, "sum-of-odds")
    .toStream();
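To see what the two computations on this slide produce, here is a plain-Java sketch that runs the same logic over a small in-memory list instead of a topic. It uses no Kafka library. `NumbersDemo`, `doubled` and `runningSumOfOdds` are illustrative names, and emitting one update per odd input is a simplified picture of a continuously updated aggregate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class NumbersDemo {
    // Stateless: each output depends only on the current input value.
    static List<Integer> doubled(List<Integer> values) {
        return values.stream().map(v -> v * 2).collect(Collectors.toList());
    }

    // Stateful: keeps a running sum and records one update per odd input.
    static List<Integer> runningSumOfOdds(List<Integer> values) {
        List<Integer> updates = new ArrayList<>();
        int sum = 0;
        for (int v : values) {
            if (v % 2 != 0) {
                sum += v;
                updates.add(sum);
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(doubled(input));          // [2, 4, 6, 8, 10]
        System.out.println(runningSumOfOdds(input)); // [1, 4, 9]
    }
}
```

The stateless step needs no memory between records, while the running sum must be stored somewhere, which is why the slide's reduce step names a store ("sum-of-odds").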
KSQL (from Confluent)
CREATE STREAM vip_actions AS
SELECT userid, page, action
FROM clickstream c
LEFT JOIN users u ON c.userid = u.user_id
WHERE u.level = 'Platinum';
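The query on this slide joins each click against a users table and keeps only the clicks from Platinum users. A rough plain-Java equivalent over in-memory data might look like the sketch below. The `Click` class, the lookup map and the sample values are assumptions made for illustration, not KSQL internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shape of one clickstream record.
final class Click {
    final String userId;
    final String page;
    final String action;

    Click(String userId, String page, String action) {
        this.userId = userId;
        this.page = page;
        this.action = action;
    }
}

class VipActionsDemo {
    // For each click, look up the user's level (null if unknown, as in a left join)
    // and keep the click only when the level is "Platinum".
    static List<Click> vipActions(List<Click> clickstream, Map<String, String> levelByUserId) {
        List<Click> result = new ArrayList<>();
        for (Click c : clickstream) {
            String level = levelByUserId.get(c.userId);
            if ("Platinum".equals(level)) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> levels = new HashMap<>();
        levels.put("u1", "Platinum");
        levels.put("u2", "Gold");
        List<Click> clicks = Arrays.asList(
                new Click("u1", "/home", "view"),
                new Click("u2", "/jobs", "search"),
                new Click("u3", "/home", "view")); // u3 has no entry in the users table
        for (Click c : vipActions(clicks, levels)) {
            System.out.println(c.userId + " " + c.page + " " + c.action);
        }
    }
}
```

Clicks from users without a matching row pass through the left join with a null level and are then dropped by the filter, just like non-Platinum users.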
Event driven [...]oservice
Diagram labels: event-driven microservice, kstreams/ksql, Kafka pub/sub, event-driven microservice
Still interesting work ahead
– Scalability in metadata
– Streaming database
– Cloud integration
Conclusion
– The success of a business depends not only on software, but on how it builds software
– Apache Kafka offers a new kind of platform, different from the traditional database
– This is an exciting time to work on streams