Amazon Managed Streaming For Apache Kafka - Developer


Amazon Managed Streaming for Apache Kafka Developer Guide

Amazon Managed Streaming for Apache Kafka: Developer Guide

Copyright Amazon Web Services, Inc. and/or its affiliates. All rights reserved.

Amazon's trademarks and trade dress may not be used in connection with any product or service that is not Amazon's, in any manner that is likely to cause confusion among customers, or in any manner that disparages or discredits Amazon. All other trademarks not owned by Amazon are the property of their respective owners, who may or may not be affiliated with, connected to, or sponsored by Amazon.

Table of Contents

Welcome
What Is Amazon MSK?
Setting Up
Sign Up for AWS
Download Libraries and Tools
Getting Started
Step 1: Create a Cluster
Step 2: Create a Client Machine
Step 3: Create a Topic
Step 4: Produce and Consume Data
Step 5: View Metrics
Step 6: Delete the Resources
How It Works
Creating a Cluster
Broker types
Creating a cluster using the AWS Management Console
Creating a cluster using the AWS CLI
Creating a cluster with a custom MSK configuration using the AWS CLI
Creating a cluster using the API
Deleting a Cluster
Deleting a cluster using the AWS Management Console
Deleting a cluster using the AWS CLI
Deleting a cluster using the API
Getting the Apache ZooKeeper Connection String
Getting the Apache ZooKeeper connection string using the AWS Management Console
Getting the Apache ZooKeeper connection string using the AWS CLI
Getting the Apache ZooKeeper connection string using the API
Getting the Bootstrap Brokers
Getting the bootstrap brokers using the AWS Management Console
Getting the bootstrap brokers using the AWS CLI
Getting the bootstrap brokers using the API
Listing Clusters
Listing clusters using the AWS Management Console
Listing clusters using the AWS CLI
Listing clusters using the API
Provisioning Storage Throughput
Throughput bottlenecks
Measuring storage throughput
Configuration update
Provisioning storage throughput using the AWS Management Console
Provisioning storage throughput using the AWS CLI
Provisioning storage throughput using the API
Scaling Up Broker Storage
Automatic scaling
Manual scaling
Updating the Broker Type
Updating the broker type using the AWS Management Console
Updating the broker type using the AWS CLI
Updating the broker type using the API
Updating the Configuration of a Cluster
Updating the configuration of a cluster using the AWS CLI
Updating the configuration of a cluster using the API
Expanding a Cluster
Expanding a cluster using the AWS Management Console

Expanding a cluster using the AWS CLI
Expanding a cluster using the API
Updating Security
Updating a cluster's security settings using the AWS Management Console
Updating a cluster's security settings using the AWS CLI
Updating a cluster's security settings using the API
Rebooting a Broker for a Cluster
Rebooting a Broker Using the AWS Management Console
Rebooting a Broker Using the AWS CLI
Rebooting a Broker Using the API
Tagging a Cluster
Tag Basics
Tracking Costs Using Tagging
Tag Restrictions
Tagging Resources Using the Amazon MSK API
Configuration
Custom Configurations
Dynamic Configuration
Topic-Level Configuration
States
Default Configuration
Configuration Operations
Create Configuration
To update an MSK configuration
To delete an MSK configuration
To describe an MSK configuration
To describe an MSK configuration revision
To list all MSK configurations in your account for the current Region
MSK Serverless
Getting started tutorial
Step 1: Create a cluster
Step 2: Create an IAM role
Step 3: Create a client machine
Step 4: Create a topic
Step 5: Produce and consume data
Step 6: Delete resources
Configuration
Monitoring
MSK Connect
What is MSK Connect?
Getting Started
Step 1: Set up required resources
Step 2: Create custom plugin
Step 3: Create client machine and Apache Kafka topic
Step 4: Create connector
Step 5: Send data
Connectors
Capacity
Creating a connector
Plugins
Workers
Default worker configuration
Supported worker configuration properties
Creating a custom configuration
Managing connector offsets
Configuration providers
IAM Roles and Policies

Service Execution Role
Example Policies
Cross-service confused deputy prevention
AWS managed policies
Using Service-Linked Roles
Enabling internet access
Setting up a NAT gateway for Amazon MSK Connect
Logging
Preventing secrets from appearing in connector logs
Monitoring
Examples
Amazon S3 sink connector
Amazon Redshift sink connector
Debezium source connector
Cluster States
Security
Data Protection
Encryption
How Do I Get Started with Encryption?
Authentication and Authorization for Amazon MSK APIs
How Amazon MSK Works with IAM
Identity-Based Policy Examples
Service-Linked Roles
AWS managed policies
Troubleshooting
Authentication and Authorization for Apache Kafka APIs
IAM Access Control
Mutual TLS Authentication
SASL/SCRAM Authentication
Apache Kafka ACLs
Changing Security Groups
Controlling Access to Apache ZooKeeper
To place your Apache ZooKeeper nodes in a separate security group
Using TLS security with Apache ZooKeeper
Logging
Broker logs
CloudTrail events
Compliance Validation
Resilience
Infrastructure Security
Connecting to an MSK cluster
Public Access
Access from Within AWS
Amazon VPC Peering
AWS Direct Connect
AWS Transit Gateway
VPN Connections
REST Proxies
Multiple Region Multi-VPC Connectivity
EC2-Classic
Port information
Migration
Migrating Your Apache Kafka Cluster to Amazon MSK
Migrating From One Amazon MSK Cluster to Another
MirrorMaker 1.0 Best Practices
MirrorMaker 2.* Advantages
Monitoring a Cluster

Amazon MSK Metrics for Monitoring with CloudWatch
DEFAULT Level Monitoring
PER_BROKER Level Monitoring
PER_TOPIC_PER_BROKER Level Monitoring
PER_TOPIC_PER_PARTITION Level Monitoring
Viewing Amazon MSK Metrics Using CloudWatch
Consumer-Lag Monitoring
Open Monitoring with Prometheus
Creating an Amazon MSK Cluster with Open Monitoring Enabled
Enabling Open Monitoring for an Existing Amazon MSK Cluster
Setting Up a Prometheus Host on an Amazon EC2 Instance
Prometheus Metrics
Storing Prometheus metrics in Amazon Managed Service for Prometheus
Cruise Control
Quota
Amazon MSK quota
Quota for serverless clusters
MSK Connect quota
Resources
Apache Kafka Versions
Supported Apache Kafka versions
Apache Kafka version 2.8.1
Apache Kafka version 2.8.0
Apache Kafka version 2.7.2
Apache Kafka version 2.7.1
Apache Kafka version 2.6.3
Apache Kafka version 2.6.2 [Recommended]
Apache Kafka version 2.7.0
Apache Kafka version 2.6.1
Apache Kafka version 2.6.0
Apache Kafka version 2.5.1
Amazon MSK bug-fix version 2.4.1.1
Apache Kafka version 2.4.1 (use 2.4.1.1 instead)
Apache Kafka version 2.3.1
Apache Kafka version 2.2.1
Apache Kafka version 1.1.1 (for existing clusters only)
Updating the Apache Kafka version
Troubleshooting
Consumer group stuck in PreparingRebalance state
Static Membership Protocol
Identify and Reboot
Error delivering broker logs to Amazon CloudWatch Logs
No default security group
Cluster appears stuck in the CREATING state
Cluster state goes from CREATING to FAILED
Cluster state is ACTIVE but producers cannot send data or consumers cannot receive data
AWS CLI doesn't recognize Amazon MSK
Partitions go offline or replicas are out of sync
Disk space is running low
Memory running low
Producer Gets NotLeaderForPartitionException
Under-Replicated Partitions (URP) greater than zero
Cluster has topics called __amazon_msk_canary and __amazon_msk_canary_state
Partition replication fails
Unable to access cluster that has public access turned on
Unable to access cluster from within AWS: networking issues
Amazon EC2 client and MSK cluster in the same VPC

Amazon EC2 client and MSK cluster in different VPCs
On-premises client
AWS Direct Connect
Failed authentication: Too many connects
MSK Serverless: Cluster creation fails

lets you use Apache Kafka data-plane operations, such as those for producing and consuming data. It runs open-source versions of Apache Kafka. This means existing applications, tooling, and plugins from partners and the Apache Kafka community are