
Exposing Remote JMX in your Kafka Setup

This article is the missing documentation on how to set up JMX for remote use. Proper monitoring and alerting are essential: they let developers understand a system and its subsystems by the numbers. JMX-to-Graphite and JMX-to-Prometheus bridges exist, but you may want to avoid putting these adapters on the same machine, to decouple the monitoring from the actual service.

Java Management Extensions (JMX) is an old technology, but it is still omnipresent when setting up data pipelines with the Kafka ecosystem (in this article, using the Confluent Community Platform). Proper monitoring is essential in production environments, not only to alert the team when things fail, but also to get a sense of how a system and its subsystems behave.

The metrics that Kafka, Kafka Streams, Schema Registry and KSQL expose as MBeans are diverse, and they let developers understand the inner workings by the numbers. Confluent provides good documentation on the JMX metrics. Accessing the JMX interface remotely, instead of placing a collector job on the same machine, decouples the service from the monitoring stack. Remote JMX access is not straightforward, however, and documentation for exactly that is missing.

Modifying the Service

The following setup assumes that you use Confluent's open source platform; if you install the individual packages instead, the environment variables might be the same. Here we use the latest version at the time of writing, Confluent Community Platform 5.1. After installing the platform, and assuming you use the systemd units to start your Kafka components, we need to set a JMX options variable (such as KAFKA_JMX_OPTS) in each unit to enable remote JMX. The exact variable name depends on the component.
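
The flags themselves are the same for every component; only the host address and the port vary. As a minimal sketch, the hypothetical shell helper `jmx_opts` below (it is not part of the Confluent packages) assembles the flag string used in the unit files in this article, so the value can be generated once and pasted into each file. The default port 1099 simply mirrors those files.

```shell
# Hypothetical helper, not part of the Confluent packages.
# Prints the JMX system properties for remote access without
# authentication or SSL, as used throughout this article.
#   $1: address that remote clients use to reach this host
#   $2: JMX/RMI port (defaults to 1099)
jmx_opts() {
  local host="$1" port="${2:-1099}"
  # printf reuses the '%s' format for every argument, so the
  # pieces are concatenated into a single line.
  printf '%s' \
    "-Djava.rmi.server.hostname=${host}" \
    " -Dcom.sun.management.jmxremote.local.only=false" \
    " -Dcom.sun.management.jmxremote.rmi.port=${port}" \
    " -Dcom.sun.management.jmxremote.port=${port}" \
    " -Dcom.sun.management.jmxremote=true" \
    " -Dcom.sun.management.jmxremote.authenticate=false" \
    " -Dcom.sun.management.jmxremote.ssl=false"
}
```

For example, `jmx_opts 10.0.0.5` prints the flags with 10.0.0.5 as the RMI hostname and 1099 as the port. Setting the RMI port and the JMX port to the same value means only one port has to be opened in a firewall.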

Kafka

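Let's start with the Kafka brokers. To avoid any confusion about which lines to add to the service file, the whole file is shown below. For the other services, such as KSQL Server or Zookeeper, I give the location of the file, followed by its contents. The original unit for the Kafka broker is located at /lib/systemd/system/confluent-kafka.service. In all files, ${hostip} is a placeholder: systemd does not expand variables inside Environment= values, so replace it with the address that monitoring clients use to reach the machine.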

[Unit]
Description=Apache Kafka - broker
Documentation=http://docs.confluent.io/
After=network.target confluent-zookeeper.target

[Service]
Type=simple
User=cp-kafka
Group=confluent
ExecStart=/usr/bin/kafka-server-start /etc/kafka/server.properties
Environment=KAFKA_JMX_OPTS="-Djava.rmi.server.hostname=${hostip} -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.rmi.port=1099 -Dcom.sun.management.jmxremote.port=1099 -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
Environment=KAFKA_HEAP_OPTS="-Xmx6G -Xms6G"

TimeoutStopSec=180
Restart=no

[Install]
WantedBy=multi-user.target
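
After editing the unit file, systemd has to re-read it (`systemctl daemon-reload`) and the broker has to be restarted before the port is exposed. To check the result from another machine, a client needs the JMX service URL. The sketch below builds that URL with a hypothetical helper, `jmx_url`. The commented example passes it to `kafka.tools.JmxTool`, assuming that tool is available in your installation; the host name `broker1.example.com` is an assumption, and the MBean is one of the broker's standard throughput metrics.

```shell
# Hypothetical helper: prints the standard RMI-based JMX service URL
# for a host and port (default 1099, as configured in the unit file).
jmx_url() {
  local host="$1" port="${2:-1099}"
  printf 'service:jmx:rmi:///jndi/rmi://%s:%s/jmxrmi' "$host" "$port"
}

# Example, to be run from the monitoring machine (host name is an assumption):
#   kafka-run-class kafka.tools.JmxTool \
#     --jmx-url "$(jmx_url broker1.example.com)" \
#     --object-name 'kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec'
```

The same URL can also be entered as a remote connection in JConsole or VisualVM.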

Zookeeper

The service script is located at /lib/systemd/system/confluent-zookeeper.service.

[Unit]
Description=Apache Kafka - ZooKeeper
Documentation=http://docs.confluent.io/
After=network.target

[Service]
Type=simple
User=cp-kafka
Group=confluent
ExecStart=/usr/bin/zookeeper-server-start /etc/kafka/zookeeper.properties
Environment=KAFKA_JMX_OPTS="-Djava.rmi.server.hostname=${hostip} -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.rmi.port=1099 -Dcom.sun.management.jmxremote.port=1099 -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
Environment=JMX_PORT=1099
TimeoutStopSec=180
Restart=no

[Install]
WantedBy=multi-user.target

KSQL Server

The service script is located at /lib/systemd/system/confluent-ksql.service.

[Unit]
Description=Streaming SQL engine for Apache Kafka
Documentation=http://docs.confluent.io/
After=network.target confluent-kafka.target confluent-schema-registry.target

[Service]
Type=simple
User=cp-ksql
Group=confluent
Environment="LOG_DIR=/var/log/confluent/ksql"
Environment=KSQL_OPTS="-Djava.rmi.server.hostname=${hostip} -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.rmi.port=1099 -Dcom.sun.management.jmxremote.port=1099 -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
Environment=JMX_PORT=1099
ExecStart=/usr/bin/ksql-server-start /etc/ksql/ksql-server.properties
TimeoutStopSec=180
Restart=no

[Install]
WantedBy=multi-user.target

Schema Registry

The service script is located at /lib/systemd/system/confluent-schema-registry.service.

[Unit]
Description=RESTful Avro schema registry for Apache Kafka
Documentation=http://docs.confluent.io/
After=network.target confluent-kafka.target

[Service]
Type=simple
User=cp-schema-registry
Group=confluent
Environment="LOG_DIR=/var/log/confluent/schema-registry"
ExecStart=/usr/bin/schema-registry-start /etc/schema-registry/schema-registry.properties
Environment=SCHEMA_REGISTRY_JMX_OPTS="-Djava.rmi.server.hostname=${hostip} -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.rmi.port=1099 -Dcom.sun.management.jmxremote.port=1099 -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
Environment=JMX_PORT=1099
TimeoutStopSec=180
Restart=no

[Install]
WantedBy=multi-user.target
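
The four unit files differ mainly in which environment variable carries the JMX flags. As a compact summary, the hypothetical helper `jmx_env_var` below maps each unit name used in this article to its variable; the mapping is taken directly from the files above and is not an official Confluent interface.

```shell
# Hypothetical lookup: which environment variable holds the JMX flags
# for each Confluent systemd unit shown in this article.
jmx_env_var() {
  case "$1" in
    confluent-kafka|confluent-zookeeper) echo "KAFKA_JMX_OPTS" ;;
    confluent-ksql)                      echo "KSQL_OPTS" ;;
    confluent-schema-registry)           echo "SCHEMA_REGISTRY_JMX_OPTS" ;;
    *) echo "unknown unit: $1" >&2; return 1 ;;
  esac
}
```

Note that all four files use port 1099. If several of these services run on the same machine, each JVM needs its own port, since two processes cannot listen on the same one.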
