
Prerequisites

Before we can run the hands-on workshop, a working infrastructure in Confluent Cloud must exist:

  • an environment with Schema Registry enabled
  • a Kafka cluster in one of the regions supported by the Flink provider
  • three topics
  • events generated by the Sample Data (Datagen Source) connector

And of course, we need a working Confluent Cloud account to do all of this. Signing up for Confluent Cloud is easy, and new accounts come with a $400 usage budget, which is more than enough for this hands-on workshop. If you don't have a working Confluent Cloud account yet, please sign up for Confluent Cloud.

You now have two options for creating the Confluent Cloud resources for the hands-on workshop:

  1. Let Terraform create them: if you are comfortable running Terraform, follow this guide.
  2. Create all resources manually, as described in the rest of this document.

Either way, we expect the Confluent CLI to be installed on your desktop. You need the CLI to run the Flink SQL shell, which gives you a better experience during the workshop. Please bring the client up to the latest version (v3.53.0 at the time of writing):

confluent update

If you installed the CLI via brew install confluentinc/tap/cli, rerun the command to bring the CLI to the latest version.
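
If you are setting the CLI up from scratch, a minimal install-and-verify sequence could look like the following sketch; Homebrew is just one option, so use the installer that fits your platform:

# install the Confluent CLI via Homebrew (macOS/Linux)
brew install confluentinc/tap/cli

# or bring an existing installation to the latest version
confluent update

# verify the installed version (v3.53.0 or newer)
confluent version

# log in to Confluent Cloud before creating any resources
confluent login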

Confluent Cloud Resources for the Hands-on Workshop: Manual Setup

IMPORTANT TO KNOW FOR THE WORKSHOP: Flink on Confluent Cloud is available on AWS, Azure and GCP; currently 21 regions are supported. Please be aware that the cluster and the Flink compute pool must be in the same cloud provider and region.

You can create each Confluent Cloud resource with the Confluent CLI, the Confluent Cloud Control Plane GUI or the Confluent Terraform Provider. All of them use the Confluent Cloud API in the background. If you want to use the CLI, you must install it on your desktop. This workshop guide covers the GUI only.

Create Environment and Schema Registry

Log in to Confluent Cloud and create an environment with Schema Registry:

  • Click the Add cloud environment button
  • Enter a new environment name, e.g. handson-flink, and click the Create button
  • Choose the Essentials Stream Governance package

The environment is now ready to use; a Schema Registry will be created automatically in the region of the first cluster. image
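
If you prefer the CLI over the GUI, a roughly equivalent sketch looks like this (flag names assume a recent CLI version, so check confluent environment create --help):

# create the workshop environment with the Essentials Stream Governance package
confluent environment create handson-flink --governance-package essentials

# make it the active environment for the following commands
confluent environment use <environment-id>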

Create Kafka Cluster in Environment handson-flink

Next, create a Basic cluster in your new environment, keeping the region rule above in mind. Click the Create cluster button:

  • Choose BASIC and click the Begin configuration button to start the cluster configuration
  • Choose your preferred region with a single zone and click Continue
  • Give the cluster a name, e.g. cc_handson_cluster, check the rate card overview and configuration, then click Launch cluster

The cluster will be up and running in seconds. image
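
The CLI equivalent is roughly the following; cloud and region here are only examples, so pick any combination in which Flink is available:

# create a Basic cluster in a Flink-supported region (example: AWS, eu-central-1)
confluent kafka cluster create cc_handson_cluster --cloud aws --region eu-central-1 --type basic

# note the returned cluster ID and make it the active cluster
confluent kafka cluster use <cluster-id>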

Create topics in Kafka Cluster cc_handson_cluster

Now, we need three topics to store our events.

  • shoe_products
  • shoe_customers
  • shoe_orders

Via the GUI, topic creation is straightforward. Create a topic by clicking Topics in the left-hand menu and then clicking the Create topic button.

  • Topic name: shoe_products, Partitions: 1 and then click the Create with defaults button
  • Skip adding a data contract
  • Repeat the same steps for shoe_customers and shoe_orders

Three topics are created. image
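
With the CLI, the same three topics could be created like this, assuming the cluster from the previous step is the active one:

# create the three workshop topics with one partition each
confluent kafka topic create shoe_products --partitions 1
confluent kafka topic create shoe_customers --partitions 1
confluent kafka topic create shoe_orders --partitions 1

# verify that all three topics exist
confluent kafka topic list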

Create Sample Data connectors to fill the topics shoe_products, shoe_customers and shoe_orders

Confluent provides the Datagen connector, a test data generator. In Confluent Cloud, a range of Quickstart templates (predefined data sets) is available to generate data in a given format. NOTE: We use Datagen with the shoe-related templates: Shoes, Shoe customers and Shoe orders.

Choose the Connectors menu entry (left side) and search for Sample Data. Click on the Sample Data Icon.

  • Press "Additional configuration"
  • Choose a topic: shoe_products and click Continue
  • Click My Account (already selected by default) and download the API Key. Typically, you will configure the connector with restrictive access to your resources (what we did in the terraform setup). For the hands-on a global key is sufficient. Click Generate API Key & Download, enter a description Datagen Connector Products and click Continue
  • Select the format AVRO, because Flink requires AVRO for now, and a template (Show more Option) Shoes and click Continue
  • Check Summary, we will go with one Task (slider) and click Continue
  • Enter the name DSoC_products and finally click Continue

Now events are being produced into the topic shoe_products by the Datagen connector DSoC_products. image
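
For reference, a CLI-based sketch of the same connector is shown below. The config keys follow the managed Datagen Source connector, but the file name, the API key placeholders and the exact quickstart value are assumptions, so double-check them against the connector documentation:

# write a connector configuration for the shoe_products topic (replace the placeholders)
cat > datagen_products.json <<'EOF'
{
  "name": "DSoC_products",
  "connector.class": "DatagenSource",
  "kafka.auth.mode": "KAFKA_API_KEY",
  "kafka.api.key": "<api-key>",
  "kafka.api.secret": "<api-secret>",
  "kafka.topic": "shoe_products",
  "quickstart": "SHOES",
  "output.data.format": "AVRO",
  "tasks.max": "1"
}
EOF

# create the connector from the configuration file
confluent connect cluster create --config-file datagen_products.json

The two additional connectors created below differ only in the topic, the quickstart template and the connector name.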

Click Stream Lineage (left side) to see your current data pipeline. Click on the topic shoe_products and enter the description Shoe products. This is how you attach metadata to your data product. image

Go back to your cluster cc_handson_cluster and create two more Datagen connectors to fill the topics shoe_customers and shoe_orders: go to Connectors and click Add Connector. Pay attention when you select the template for the Datagen connector and make sure it corresponds to the selected topic, as shown below. Deviations in this step will result in invalid queries later in the workshop.

  • Connector Plug-in Sample Data, Topic shoe_customers, Global Access and Download API Key with Description Datagen Connector Customers, Format AVRO, template Shoe customers, 1 Task, Connector Name DSoC_customers
  • Connector Plug-in Sample Data, Topic shoe_orders, Global Access and Download API Key with Description Datagen Connector Orders, Format AVRO, template Shoe orders, 1 Task, Connector Name DSoC_orders

Three connectors are up and running and generating data for us. image
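
If you want to double-check from the command line, the CLI can list the connectors and their status (a quick sanity check, assuming the workshop cluster is still the active one):

# list all connectors in the active cluster; all three should report a RUNNING status
confluent connect cluster list

# inspect a single connector in more detail
confluent connect cluster describe <connector-id>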

All three connectors generate events in AVRO format and automatically register a schema for each of the three topics. You can have a look at the schemas in the Schema Registry. image

Or use the topic viewer, where you can

  • watch the events flowing in
  • inspect all metadata information
  • check the configs
  • and view the schemas as well

image
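
A similar check from the CLI might look like this; the subject name assumes the default TopicNameStrategy, and consuming Avro events will ask for a Schema Registry API key, so treat this as a sketch:

# show the value schema registered for the shoe_products topic
confluent schema-registry schema describe --subject shoe_products-value --version latest

# consume a few Avro events from the topic (Ctrl+C to stop)
confluent kafka topic consume shoe_products --value-format avro --from-beginning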

The preparation is finished, well done.

The infrastructure for the hands-on workshop is up and running, and we can now start to develop our loyalty-program use case in Flink SQL. image

End of prerequisites, continue with LAB 1.