...

Unpack to some location, like /opt/kafka, which will be referred to as <kafkaInstallationDir>.
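For example, assuming the Kafka 3.3.1 tarball built for Scala 2.13 (the filename is an assumption; use whatever you downloaded):

Code Block
languagejs
themeRDark
# unpack the downloaded tarball and move it to /opt/kafka
tar -xzf kafka_2.13-3.3.1.tgz -C /opt
mv /opt/kafka_2.13-3.3.1 /opt/kafka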

Kafka has historically leveraged the Zookeeper software for its controller layer, but as of Kafka 3.3.1 (Oct 2022) the native Kafka Raft ("KRaft") controller is recommended over Zookeeper. KRaft improves performance and consolidates the server configuration files and processes to one each (per node). NOTE: These instructions were tested with Kafka version 3.3.1.

Configure server.properties

In KRaft mode, the relevant configuration file is located at <kafkaInstallationDir>/config/kraft/server.properties.

Use a minimum server cluster size of 3 in SHRINE production environments, in which each server node functions as both a broker and controller. This is enabled by setting process.roles to broker,controller, as noted in the Server Basics section:
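A sketch of the corresponding Server Basics entries, assuming a 3-node cluster (node IDs, hostnames, and the controller port are placeholders):

Code Block
languagejs
themeRDark
titleserver.properties
# this node acts as both broker and controller
process.roles=broker,controller
# unique to each server node
node.id=1
# all three controller nodes, as id@host:port
controller.quorum.voters=1@kafka-1.yourDomain.com:9093,2@kafka-2.yourDomain.com:9093,3@kafka-3.yourDomain.com:9093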

...

where advertised.listeners is the second parameter (after node.id) that is unique to each server node.

SASL_SSL on the broker listener is required to enforce client/server user authentication and authorization, since that traffic traverses the public internet. SASL_SSL may also be enabled for the controller listener with properly configured keystores and truststores; however, if the server nodes communicate exclusively in private network space (as described above), then SASL_PLAINTEXT may be considered sufficient.
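A sketch of listener settings under those assumptions (listener names, hostnames, and ports are placeholders; SASL_SSL on the broker listener, SASL_PLAINTEXT on the controller listener):

Code Block
languagejs
themeRDark
titleserver.properties
# broker listener carries client traffic; controller listener stays on the private network
listeners=SASL_SSL://:9092,CONTROLLER://:9093
# unique to each server node
advertised.listeners=SASL_SSL://kafka-1.yourDomain.com:9092
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
inter.broker.listener.name=SASL_SSL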

...

log.dirs specifies not the location of server logs (those are in <kafkaInstallationDir>/logs), but actual topic data, so provide a reliable location outside the default /tmp; for example /var/opt/kafka.

...

Set the SASL mechanism parameters to PLAIN:
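For reference, a sketch using standard Kafka property names:

Code Block
languagejs
themeRDark
titleserver.properties
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.mechanism.controller.protocol=PLAIN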

...

Code Block
languagejs
themeRDark
export KAFKA_OPTS="-Djava.security.auth.login.config=<kafkaInstallationDir>/config/kraft/kafka_server_jaas.conf"
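A sketch of what kafka_server_jaas.conf might contain for the PLAIN mechanism (all usernames and passwords here are placeholders; each user_<name> entry defines an account that clients can authenticate as):

Code Block
languagejs
themeRDark
titlekafka_server_jaas.conf
KafkaServer {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="kafka-admin"
  password="yourKafkaAdminPassword"
  user_kafka-admin="yourKafkaAdminPassword"
  user_yourShrineHubUser="yourShrineHubUserPassword";
};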

...

Create Server Keystores and Truststores

Note: Catalyst runs its own CA with self-signed wildcard certs, enabling 1 shared server keystore and 1 shared truststore. Production clusters must get certs signed by a real CA with no wildcard CNs; this is yet untested on internal systems, so we have no proven documentation for it.

In order to secure traffic through the internet with TLS/SSL, Kafka requires client/server authentication via public key infrastructure (PKI). Each Kafka server needs a keystore for identifying itself, while each client and server needs a truststore to authenticate servers. As the Kafka administrator you may choose to have all server certificates signed by a true Certificate Authority (CA), or to manage a private CA within your organization and use it for signing.

In either case, each Kafka server node's keystore must store a unique private key (always kept secure) and certificate (which gets signed). Each server node's truststore must store a list of all signed server certificates, or alternatively a CA's own cert, in order to let server nodes behave as logical clients during inter-broker communication. The servers' truststore contents can be identical to that of clients. One benefit of managing a private CA is enabling all client and server truststores to identically contain only the CA's cert, in effect telling all systems in the network to trust the CA and every cert signed by it. Additionally this enables server nodes to join or leave the cluster without truststores needing cert addition or revocation. See here for comprehensive documentation on PKI architecture and keystore management in Kafka:

https://kafka.apache.org/33/documentation.html#security_ssl

https://docs.confluent.io/platform/current/security/security_tutorial.html

Create keystores and truststores in PKCS12 format. When creating them, you will be prompted for passwords. Add the keystore and truststore locations and passwords to the end of server.properties:

Code Block
languagejs
themeRDark
titleserver.properties
ssl.key.password=<thisServersKeystorePassword>
ssl.keystore.location=<path/to/this/servers/kafka_server_keystore.pkcs12>
ssl.keystore.password=<thisServersKeystorePassword>
ssl.truststore.location=<path/to/shared/kafka_server_truststore.pkcs12>
ssl.truststore.password=<sharedServerTruststorePassword>
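The keystore and truststore can be created with Java's keytool. A minimal sketch, assuming a private CA whose certificate is in ca-cert.pem (aliases, filenames, and the CN are placeholders; the CSR/signing round-trip with your CA is omitted):

Code Block
languagejs
themeRDark
# one keystore per server node, holding its private key and (signed) certificate
keytool -genkeypair -keystore kafka_server_keystore.pkcs12 -storetype PKCS12 \
  -alias kafka-1 -keyalg RSA -validity 365 -dname "CN=kafka-1.yourDomain.com"
# one shared truststore containing only the CA's certificate
keytool -importcert -keystore kafka_server_truststore.pkcs12 -storetype PKCS12 \
  -alias ca-root -file ca-cert.pem -noprompt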

...
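The uuid referenced below is generated once for the whole cluster, on any one node, typically via:

Code Block
languagejs
themeRDark
<kafkaInstallationDir>/bin/kafka-storage.sh random-uuid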

On all server nodes, format the storage directories using that uuid:

Code Block
languagejs
themeRDark
<kafkaInstallationDir>/bin/kafka-storage.sh format --cluster-id <uuid> --config <kafkaInstallationDir>/config/kraft/server.properties

The storage directory is provided to Kafka in server.properties as log.dirs. Kafka will create the directory if it does not exist.

...

Run Kafka

On all server nodes, run:
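The start command typically looks like the following (a sketch; -daemon runs the broker in the background, and KAFKA_OPTS must be exported first, as described above):

Code Block
languagejs
themeRDark
<kafkaInstallationDir>/bin/kafka-server-start.sh -daemon <kafkaInstallationDir>/config/kraft/server.properties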

...

Add your hub's Kafka user name (set in the Kafka servers' kafka_server_jaas.conf) and your client truststore location to shrine.conf:

Code Block
languagejs
themeRDark
titleshrine.conf
shrine {
...
  kafka {
    sasl.jaas.username = "yourShrineHubUser"
    ssl.truststore.location = "yourKafkaUserNamepath/to/your/kafka_client_truststore.pkcs12"
  }
...
}//shrine

Tomcat will need your hub's Kafka user password, as well as the client truststore password, to send and receive messages. Add them to password.conf:

Code Block
languagejs
themeRDark
titlepassword.conf
shrine.kafka.sasl.jaas.password = "yourShrineHubUserPassword"
shrine.kafka.ssl.truststore.password = "clientTruststorePassword"

Configure shrineLifecycle tool

The shrine-network-lifecycle-tool (aka "shrineLifecycle") needs the admin Kafka credentials to create, modify, and delete Kafka topics, as well as your client truststore. Add them to shrine-network-lifecycle-tool's conf/override.conf and conf/password.conf files:

Code Block
languagejs
themeRDark
titleoverride.conf
shrine.kafka.sasl.jaas.username = "kafka-admin"
shrine.kafka.ssl.truststore.location = "path/to/your/kafka_client_truststore.pkcs12"

Code Block
languagejs
themeRDark
titlepassword.conf
shrine.kafka.sasl.jaas.password = "yourKafkaAdminPassword"
shrine.kafka.ssl.truststore.password = "clientTruststorePassword"

Note: it's safe to mix the "curly-bracket" and "dot" syntax styles in the same file.
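For example, these two forms are equivalent:

Code Block
languagejs
themeRDark
// dot syntax
shrine.kafka.ssl.truststore.location = "path/to/your/kafka_client_truststore.pkcs12"
// curly-bracket syntax
shrine {
  kafka {
    ssl.truststore.location = "path/to/your/kafka_client_truststore.pkcs12"
  }
}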

network.conf

To use Kafka, specify in network.conf a Kafka section with the specifics to share with downstream nodes:

Code Block
languagejs
themeRDark
shrine {
  network {
    network {
      name = "Network Name"
      hubQueueName = "hub"
      adminEmail = "yourEmail@yourhospital.edu"
      momId = "hubUsernameyourShrineHubKafkaUser"
      kafka = {
        networkPrefix = "NetworkName"
        bootstrapServers = "kafka-1.yourDomain.com:9092,kafka-2.yourDomain.com:9092,kafka-3.yourDomain.com:9092"
        securityProtocol = "SASL_SSL"
        securityMechanism = "PLAIN"
      }
    }
    nodes = [
      {
        name = "Hub's nodeNetwork Name hub"
        key = "hub-nodeqep"
        userDomainName = "network-hub"
        queueName = "hubNodehubQep"
        sendQueries = "false"
        adminEmail = "yourEmail@yourhospital.edu"
        momId = "hubUsernameyourShrineHubKafkaUser"
      }
    ]
  }
}

Note: the current Catalyst development network uses a less secure configuration (SASL_PLAINTEXT, with no TLS), shown here for reference; production networks should use the SASL_SSL settings above:

Code Block
languagejs
themeRDark
       kafka = {
         networkPrefix = "shrinedev"
         bootstrapServers = "kafka-1.catalyst.harvard.edu:9092,kafka-2.catalyst.harvard.edu:9092,kafka-3.catalyst.harvard.edu:9092"
         securityProtocol = "SASL_PLAINTEXT"
         securityMechanism = "PLAIN"
       }


Choose a network prefix. This will be prepended to queue names to enable multiple networks on the same Kafka cluster. For momId, use the Kafka user name of the account that will receive the hub's messages.

Note that the network's momId is the same as the hub node's momId. You, the Kafka admin, will make a Kafka account for each downstream node and share the username and password through a secure channel.
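With the PLAIN mechanism, adding an account means adding a user_<name> entry to each server's kafka_server_jaas.conf and restarting the servers. A sketch, extending the earlier JAAS example (name and password are placeholders):

Code Block
languagejs
themeRDark
titlekafka_server_jaas.conf
KafkaServer {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="kafka-admin"
  password="yourKafkaAdminPassword"
  user_kafka-admin="yourKafkaAdminPassword"
  user_downstreamSiteUser="downstreamSitePassword";
};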

The bootstrapServers parameter is a list of some or all server nodes running in the cluster. Once a connection is made to one of them, the cluster's controller.quorum.voters setting takes precedence. At least 2 bootstrapServers are recommended, but the real number of server nodes can grow or shrink without clients needing to know.