AWS: Running a Flink Application on Kinesis Data Analytics (KDA) - Part 2

Arjun Sunil Kumar
Published in Cloud Engineering
3 min read · Apr 12, 2020

In Part 1, we saw how to build an uber/fat JAR for our Flink application. In this tutorial, we will cover how to deploy the app on KDA.

Prerequisites

  • Kinesis Stream
  • S3 Bucket
  • IAM Role

Kinesis Stream:

Let's create a Kinesis stream for feeding our Flink Application. You can also use Kafka or RabbitMQ as a source.
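If you prefer scripting over the console, the stream can also be created from Python. The snippet below is only a minimal sketch: the stream name `appname-input-stream-dev`, the shard count and the region are placeholders I picked for illustration, and boto3 is imported lazily so the request builder can be checked without AWS credentials.

```python
def stream_request(name: str, shard_count: int = 1) -> dict:
    """Build the keyword arguments for the Kinesis CreateStream call."""
    if shard_count < 1:
        raise ValueError("a provisioned stream needs at least one shard")
    return {"StreamName": name, "ShardCount": shard_count}


def create_stream(request: dict, region: str = "us-east-1") -> None:
    """Create the stream and wait until it exists (needs AWS credentials)."""
    import boto3  # imported here so the rest of the sketch runs without boto3

    kinesis = boto3.client("kinesis", region_name=region)
    kinesis.create_stream(**request)
    # The built-in waiter polls DescribeStream until the stream is ready.
    kinesis.get_waiter("stream_exists").wait(StreamName=request["StreamName"])


if __name__ == "__main__":
    # Hypothetical stream name; replace it with your own.
    print(stream_request("appname-input-stream-dev"))
```

Calling `create_stream(stream_request("appname-input-stream-dev"))` would perform the actual creation in your account.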

S3 Bucket:

Deploying a Flink application JAR to KDA requires an S3 bucket that acts as a repository. We upload our JAR to this bucket and then point the KDA application's code location to the S3 object URL. Let's name the S3 bucket appname-kda-repository-bucket-dev.
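The upload step can be scripted as well. This is a rough sketch, assuming a JAR at the hypothetical path `target/flink-app-1.0.jar`; only the bucket name comes from this article. The helper also returns the bucket ARN and object key, which is the form KDA expects when you point it at the code.

```python
BUCKET = "appname-kda-repository-bucket-dev"  # bucket name used in this article


def jar_location(bucket: str, local_jar: str, prefix: str = "") -> dict:
    """Return the bucket ARN and object key under which the JAR will live."""
    file_name = local_jar.replace("\\", "/").rsplit("/", 1)[-1]
    clean_prefix = prefix.strip("/")
    key = f"{clean_prefix}/{file_name}" if clean_prefix else file_name
    return {"BucketARN": f"arn:aws:s3:::{bucket}", "FileKey": key}


def upload_jar(local_jar: str, bucket: str, key: str) -> None:
    """Upload the JAR to S3 (needs AWS credentials and an existing bucket)."""
    import boto3  # lazy import keeps the helper above usable without boto3

    boto3.client("s3").upload_file(local_jar, bucket, key)


if __name__ == "__main__":
    # Hypothetical JAR path; use the fat JAR you built in Part 1.
    location = jar_location(BUCKET, "target/flink-app-1.0.jar")
    print(location)
    # upload_jar("target/flink-app-1.0.jar", BUCKET, location["FileKey"])
```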

IAM Role:

We need an IAM role that grants only the access our streaming job needs. For example, we can attach policies that limit the application to a particular Kinesis stream.
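To make "only the right amount of access" concrete, here is a sketch of the two policy documents such a role might carry: a trust policy that lets the KDA service assume the role, and an access policy scoped to one stream and one JAR object. The action list is my assumption of a reasonable starting point for a Kinesis source; your job may need more (for example CloudWatch logging or a sink), and the ARNs in the usage lines are placeholders.

```python
import json


def trust_policy() -> dict:
    """Allow the Kinesis Data Analytics service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "kinesisanalytics.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def access_policy(stream_arn: str, jar_object_arn: str) -> dict:
    """Limit the job to reading one stream and one application JAR."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadInputStream",
                "Effect": "Allow",
                "Action": [
                    "kinesis:DescribeStream",
                    "kinesis:GetShardIterator",
                    "kinesis:GetRecords",
                    "kinesis:ListShards",
                ],
                "Resource": stream_arn,  # a single stream, not "*"
            },
            {
                "Sid": "ReadApplicationJar",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": jar_object_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder account id, stream name and JAR key.
    policy = access_policy(
        "arn:aws:kinesis:us-east-1:123456789012:stream/appname-input-stream-dev",
        "arn:aws:s3:::appname-kda-repository-bucket-dev/flink-app-1.0.jar",
    )
    print(json.dumps(policy, indent=2))
```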

NOTE: An IAM role is created automatically if you choose that option while creating the KDA app in the AWS web console.

Create a KDA Application

Via AWS CLI:

This option works well if you want to automate KDA app creation. Many times I needed to delete the running KDA app and create a new one with the same configuration.

Instructions:

  • Fill in the necessary details in the KDA app-create-config JSON.
  • Run the create command to create the KDA app.
  • If you face any issue with the config, you can start fresh from the skeleton code.
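As a rough illustration of these steps, the sketch below builds a create-application request in the shape of the `kinesisanalyticsv2` CreateApplication API and prints it as JSON. It is my own reconstruction, not the original config: the application name, role ARN, JAR key, property group and the `FLINK-1_8` runtime are all placeholder assumptions, so substitute the values that match your setup.

```python
import json


def create_app_request(app_name, role_arn, bucket_arn, file_key,
                       runtime="FLINK-1_8", log_stream_arn=None,
                       properties=None) -> dict:
    """Build a CreateApplication request for a Flink JAR stored in S3."""
    request = {
        "ApplicationName": app_name,
        "RuntimeEnvironment": runtime,
        "ServiceExecutionRole": role_arn,
        "ApplicationConfiguration": {
            "ApplicationCodeConfiguration": {
                "CodeContent": {
                    "S3ContentLocation": {
                        "BucketARN": bucket_arn,
                        "FileKey": file_key,
                    }
                },
                # JARs stored in S3 are registered with the ZIPFILE type.
                "CodeContentType": "ZIPFILE",
            },
        },
    }
    if properties:
        # Runtime properties, grouped by a property group id.
        request["ApplicationConfiguration"]["EnvironmentProperties"] = {
            "PropertyGroups": [
                {"PropertyGroupId": group, "PropertyMap": dict(values)}
                for group, values in properties.items()
            ]
        }
    if log_stream_arn:
        request["CloudWatchLoggingOptions"] = [{"LogStreamARN": log_stream_arn}]
    return request


if __name__ == "__main__":
    # Every value below is a placeholder.
    request = create_app_request(
        app_name="appname-kda-dev",
        role_arn="arn:aws:iam::123456789012:role/appname-kda-role-dev",
        bucket_arn="arn:aws:s3:::appname-kda-repository-bucket-dev",
        file_key="flink-app-1.0.jar",
        properties={"ConsumerConfig": {"aws.region": "us-east-1"}},
    )
    print(json.dumps(request, indent=2))
```

If you save the printed JSON as `create_request.json`, it can be passed to the CLI with `aws kinesisanalyticsv2 create-application --cli-input-json file://create_request.json`. Running `aws kinesisanalyticsv2 create-application --generate-cli-skeleton` prints an empty skeleton of the same structure to start from.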

Via AWS Console (Web)

To create a KDA app via the web console, you can refer to the AWS docs.

Once everything is set, you can click Run and your application will start running. You will find your application's job graph in the KDA UI.

Job Graph

KDA UI

Since we used .startNewChain(), we will see a separate block for each operator, labelled with the name configured via .name().

sample chain job graph

If you hadn’t used .startNewChain(), you would see a single monolithic block that lumps all your operators together.

sample monolith job graph

HASH/FORWARD/REBALANCE/BROADCAST

You can see HASH/FORWARD/REBALANCE/BROADCAST on the operator arrows. What does that signify? You can read my answer on Stack Overflow.
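In short, these labels describe how records travel from one operator's parallel subtasks to the next operator's subtasks. The toy simulation below is a simplified sketch of the idea, not Flink's implementation: real Flink hashes keys into key groups and does not necessarily start round-robin at subtask 0, and all function names here are made up for illustration.

```python
import zlib
from itertools import count


def forward(upstream_subtask: int) -> list:
    """FORWARD: stay on the same subtask index (needs equal parallelism)."""
    return [upstream_subtask]


def rebalance(counter, parallelism: int) -> list:
    """REBALANCE: spread records round-robin over all downstream subtasks."""
    return [next(counter) % parallelism]


def hash_partition(key: str, parallelism: int) -> list:
    """HASH: the same key always lands on the same subtask (as after keyBy)."""
    return [zlib.crc32(key.encode("utf-8")) % parallelism]


def broadcast(parallelism: int) -> list:
    """BROADCAST: every downstream subtask receives a copy of the record."""
    return list(range(parallelism))


if __name__ == "__main__":
    parallelism = 3
    round_robin = count()
    for key in ["user-a", "user-b", "user-a"]:
        print(
            key,
            "forward:", forward(0),
            "rebalance:", rebalance(round_robin, parallelism),
            "hash:", hash_partition(key, parallelism),
            "broadcast:", broadcast(parallelism),
        )
```

Note how the two `user-a` records get the same HASH target but different REBALANCE targets.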

CloudWatch

You can find your application logs in the CloudWatch console (if logging was enabled while creating the KDA app). Search for the log group with the prefix /aws/kinesis-analytics.

The logs are in JSON format. You can make them more legible by using CloudWatch Logs Insights.

You can select your log group and use the filter below.

Sample Insight Query
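For reference, a query along these lines can also be built and run from Python. This is a sketch, not the original query from the screenshot: I am assuming the JSON log entries expose a `message` field, and the log group name in the usage comment is a placeholder. The pattern is inserted into a regex literal as-is, so escape any `/` it contains.

```python
import time

LOG_GROUP_PREFIX = "/aws/kinesis-analytics"  # prefix mentioned in this article


def insights_query(pattern: str = "ERROR", limit: int = 50) -> str:
    """Build a Logs Insights query that shows the newest matching log lines."""
    return "\n".join([
        "fields @timestamp, message",  # `message` is an assumed JSON field
        f"| filter @message like /{pattern}/",
        "| sort @timestamp desc",
        f"| limit {limit}",
    ])


def run_query(log_group: str, query: str, minutes: int = 60) -> list:
    """Run the query over the last `minutes` minutes (needs AWS credentials)."""
    import boto3  # lazy import so the query builder works without boto3

    logs = boto3.client("logs")
    now = int(time.time())
    query_id = logs.start_query(
        logGroupName=log_group,
        startTime=now - minutes * 60,
        endTime=now,
        queryString=query,
    )["queryId"]
    while True:
        result = logs.get_query_results(queryId=query_id)
        if result["status"] not in ("Scheduled", "Running"):
            return result["results"]
        time.sleep(1)  # poll until the query finishes


if __name__ == "__main__":
    print(insights_query("Exception", limit=20))
    # run_query(LOG_GROUP_PREFIX + "/appname-kda-dev", insights_query())
```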

Similarly, you can create a dashboard for monitoring metrics for your KDA app.

I have covered some findings on KDA and Flink, based on my experience using them. Hope this helps someone.

Found it Interesting?
Please show your support by 👏.
