Java EE Clustering in Docker

Andrea Scanzani
Digital Software Architecture
9 min read · Dec 11, 2020

--

In this article we will see how to containerize a Java EE Web Application on Docker, creating a cluster between the instances of our application and preserving the typical clustering mechanisms, such as HTTP Session Replication and EJB Clustering, between the application nodes.

Clustering, Scalability & HA (High Availability)

A Cluster is a set of connected machines that work in parallel; using a Cluster allows you to perform complex operations by distributing the processing across all the nodes that make up the cluster.

The benefits of introducing a Cluster are the following:

  • Scalability
  • High Availability

Scalability

It is the property of a system, or application, to handle growing amounts of work by increasing or decreasing resources according to demand.

There are two types of scalability:

  • Horizontal Scalability (Scale Out)

In this process, multiple server instances or nodes are added to the Cluster. All servers that make up the Cluster work in parallel, with load balancing so that requests can be distributed among all nodes.

The balancing of requests is done with Load Balancing tools.

Horizontal Scalability — Scale Out
  • Vertical Scalability (Scale Up)

A system scales vertically when resources (e.g. RAM, processors, etc.) are added to existing nodes.

Vertical Scalability — Scale Up

High Availability

The term High Availability (HA) describes the ability of a system to keep providing its services over a given period of time. High Availability guarantees an agreed level of operational continuity within a time window, expressed as the ratio between uptime and total time (uptime plus downtime). It is particularly relevant for IT applications subject to SLAs (Service Level Agreements), which may specify required availability levels for the application.
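To make the ratio concrete (the figures below are purely illustrative): availability = uptime / (uptime + downtime). Over a year of 8,760 hours, a 99.9% availability target allows roughly 0.001 × 8,760 ≈ 8.8 hours of cumulative downtime, while 99.99% allows only about 53 minutes.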

High Availability

As you can see from the image, in this scenario the Load Balancer implements High Availability by directing incoming requests only to the active nodes in the Cluster.

Cluster Management

In the real world almost all applications have state: for example, Web Applications must handle HTTP Sessions, and Java EE applications must manage the state of Stateful EJBs or messages on JMS queues.

For correct operation it is necessary to replicate or distribute the application states within the Cluster.

There are substantial differences between the two approaches:

  • Distributed → The state is divided into datasets and partitioned across the Cluster nodes. An example of distributed state is the Sticky Session mechanism, which routes HTTP requests to the same node that created the HTTP Session:
State — Distributed

Obviously this mechanism does not guarantee High Availability: if the node holding the HTTP Session goes down, the user loses their session state and encounters an error.

  • Replicated → Every node of the Cluster holds all the state information. This always guarantees High Availability, as the state (EJB, HTTP Session, JMS, …) is automatically replicated on all nodes.
State — Replicated

It is also possible to store the application state in a separate layer, for example in distributed cache tools such as Hazelcast or Redis.

State — Cache Layer

In this article we are going to create a Web Application in Java EE, with the use of these technologies:

  • JSF
  • CDI
  • EJB
  • JAX-RS

We are going to build a Cluster on Docker using Wildfly as the Application Server and Traefik as the Load Balancer, implementing High Availability and seeing how to scale the application by replicating the state across the various nodes.

Java EE Web Application

The full source code is available in the following Git repository:

First, let’s create a multi-module Maven project that generates an EAR containing a JAR for the EJBs and a WAR for the Web component.

The pom.xml file at the root of the multi-module Maven project is the following:

pom.xml

In the pom.xml we declare our three modules (module-ear, module-ejb and module-web), add the Java EE dependency (with scope = provided, since the implementation is supplied by Wildfly), and configure the maven-ear-plugin to generate the application EAR package.
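The original article embeds the full pom.xml as a gist; a minimal sketch of what the root POM might look like is shown below (the groupId and version numbers are illustrative, and the maven-ear-plugin configuration typically lives in the module-ear POM):

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <!-- Illustrative coordinates -->
    <groupId>com.example</groupId>
    <artifactId>JavaEEClusterApp</artifactId>
    <version>1.0.0</version>
    <packaging>pom</packaging>

    <!-- The three modules declared in the article -->
    <modules>
        <module>module-ejb</module>
        <module>module-web</module>
        <module>module-ear</module>
    </modules>

    <dependencies>
        <!-- Java EE API, scope provided: the implementation is supplied by Wildfly -->
        <dependency>
            <groupId>javax</groupId>
            <artifactId>javaee-api</artifactId>
            <version>8.0</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>
</project>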

In our EJB module, we create:

  • one Stateless EJB → generates a UUID value
  • one Stateful EJB → keeps count of how many parameters have been added to the session (see the sketch below)
EJBs
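The EJB sources are embedded in the original article; a minimal sketch of the two beans, with hypothetical class and method names, could look like this:

// UuidGeneratorBean.java — Stateless EJB that returns a new UUID on every call (sketch)
package com.example.ejb;   // illustrative package

import java.util.UUID;
import javax.ejb.Stateless;

@Stateless
public class UuidGeneratorBean {
    public String generateUuid() {
        return UUID.randomUUID().toString();
    }
}

// SessionCounterBean.java — Stateful EJB that counts the parameters added to the session;
// its conversational state is replicated across the cluster nodes (sketch)
package com.example.ejb;

import java.io.Serializable;
import javax.ejb.Stateful;

@Stateful
public class SessionCounterBean implements Serializable {

    private int count = 0;

    public void increment() {
        count++;
    }

    public int getCount() {
        return count;
    }
}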

Now let’s configure the web module, starting with the web.xml file (inside the web module):

web.xml

In the web.xml file we insert two servlet mappings to enable JSF (javax.faces.webapp.FacesServlet) and JAX-RS (javax.ws.rs.core.Application), indicate our welcome-file, and add the tag:

<distributable />

This tag tells the Application Server to enable session replication for our application!
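The full web.xml is embedded in the article; a sketch of how it might look, with illustrative servlet names and URL patterns, is the following:

<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee http://xmlns.jcp.org/xml/ns/javaee/web-app_4_0.xsd"
         version="4.0">

    <!-- Tell the Application Server to replicate the HTTP session across the cluster -->
    <distributable/>

    <!-- JSF -->
    <servlet>
        <servlet-name>Faces Servlet</servlet-name>
        <servlet-class>javax.faces.webapp.FacesServlet</servlet-class>
        <load-on-startup>1</load-on-startup>
    </servlet>
    <servlet-mapping>
        <servlet-name>Faces Servlet</servlet-name>
        <url-pattern>*.xhtml</url-pattern>
    </servlet-mapping>

    <!-- JAX-RS: mapping the default Application activates the REST endpoints -->
    <servlet-mapping>
        <servlet-name>javax.ws.rs.core.Application</servlet-name>
        <url-pattern>/api/*</url-pattern>
    </servlet-mapping>

    <welcome-file-list>
        <welcome-file>index.xhtml</welcome-file>
    </welcome-file-list>
</web-app>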

Let’s move on to the Java code and write our CDI Bean for the JSF page:

MyJSFBean.java

Our bean is @SessionScoped, so as to show how HTTP session replication is handled within the Cluster; we also inject a Stateful EJB (again, to show state replication) and a Stateless EJB.
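The bean source is embedded in the article; a minimal sketch, with illustrative field and property names that build on the EJB sketches above, might look like this:

// MyJSFBean.java — session-scoped CDI bean backing the JSF page (sketch)
package com.example.web;   // illustrative package

import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import javax.ejb.EJB;
import javax.enterprise.context.SessionScoped;
import javax.inject.Named;
import com.example.ejb.SessionCounterBean;
import com.example.ejb.UuidGeneratorBean;

@Named
@SessionScoped
public class MyJSFBean implements Serializable {

    @EJB
    private UuidGeneratorBean uuidGenerator;   // Stateless EJB

    @EJB
    private SessionCounterBean sessionCounter; // Stateful EJB, replicated with the session

    // Parameters added by the user; being held in a session-scoped bean,
    // they travel with the HTTP session when it is replicated
    private List<String> parameters = new ArrayList<>();
    private String newParameter;

    public void addParameter() {
        parameters.add(newParameter);
        sessionCounter.increment();
        newParameter = null;
    }

    public String getNodeName() {
        // The node name is set via -Djboss.node.name in the Dockerfile
        return System.getProperty("jboss.node.name");
    }

    // getters and setters for parameters, newParameter, uuid, counter and sessionId omitted
}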

Now let’s write our JSF page, which hooks into the Bean created above:

index.xhtml

On the page we have inserted the following elements (a sketch of the page follows the list):

  1. A paragraph with information about the Node the Load Balancer landed us on
  2. A paragraph with Session information (sessionId)
  3. A paragraph with the data recovered from the two EJBs
  4. A form to add parameters to the HTTP Session
  5. A list showing all the parameters present in the HTTP Session
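The page source is embedded in the article; a rough Facelets sketch covering the five points above (the EL properties assume the hypothetical bean sketched earlier) could be:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:h="http://xmlns.jcp.org/jsf/html"
      xmlns:ui="http://xmlns.jcp.org/jsf/facelets">
<h:head><title>Java EE Cluster App</title></h:head>
<h:body>
    <!-- 1. Node we landed on, 2. session id, 3. data from the two EJBs -->
    <p>Node Name: #{myJSFBean.nodeName}</p>
    <p>Session ID: #{myJSFBean.sessionId}</p>
    <p>Stateless EJB UUID: #{myJSFBean.uuid} / Stateful EJB counter: #{myJSFBean.counter}</p>

    <!-- 4. Form to add parameters to the HTTP session -->
    <h:form>
        <h:inputText value="#{myJSFBean.newParameter}"/>
        <h:commandButton value="Add" action="#{myJSFBean.addParameter}"/>
        <h:commandButton value="Refresh Page"/>
    </h:form>

    <!-- 5. List of all parameters currently in the session -->
    <ul>
        <ui:repeat value="#{myJSFBean.parameters}" var="p">
            <li>#{p}</li>
        </ui:repeat>
    </ul>
</h:body>
</html>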

To conclude the exercise, we also create a REST GET API that returns the UUID (Universally Unique Identifier) value and the execution node:

RestApi.java
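The article embeds RestApi.java; a minimal sketch of the endpoint (the path and field names are illustrative, and the node name is read from the jboss.node.name system property set in the Dockerfile) might be:

// RestApi.java — JAX-RS resource returning a UUID and the node that served the request (sketch)
package com.example.web;   // illustrative package

import java.util.HashMap;
import java.util.Map;
import javax.ejb.EJB;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import com.example.ejb.UuidGeneratorBean;

@Path("/uuid")
public class RestApi {

    @EJB
    private UuidGeneratorBean uuidGenerator;   // Stateless EJB from the sketch above

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public Response getUuid() {
        Map<String, String> result = new HashMap<>();
        result.put("uuid", uuidGenerator.generateUuid());
        result.put("nodeName", System.getProperty("jboss.node.name"));
        return Response.ok(result).build();
    }
}

With the /api/* mapping sketched in web.xml, this resource would answer at /JavaEEClusterApp/api/uuid, the URL invoked with curl later in the article.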

Our application is finished; from the root of the project we run the command:

mvn clean package

Maven builds the JavaEEClusterApp.ear.

Deploying it on a standalone Wildfly, we will have the following paths:

Deploy on Docker

In the root of our project we create the Dockerfile in order to generate the Docker image.

Dockerfile

Inside the Dockerfile we specify which Application Server image to use (jboss/wildfly:20.0.0.Final) and copy our EAR package into the Wildfly deployments directory; a consolidated sketch of the full Dockerfile is shown after the parameter list below.

Next we create the user for accessing the Wildfly administration console, with the command:

RUN /opt/jboss/wildfly/bin/add-user.sh admin Password1 --silent

Now we can access the console with admin / Password1.

Lastly, we indicate how to start Wildfly inside the Container:

ENTRYPOINT /opt/jboss/wildfly/bin/standalone.sh -b=0.0.0.0 -bmanagement=0.0.0.0 -Djboss.server.default.config=standalone-full-ha.xml -Djboss.node.name=$(hostname -i) -Djava.net.preferIPv4Stack=true -Djgroups.bind_addr=$(hostname -i) -Djboss.messaging.cluster.password=cluster_password1

The parameters we set are the following:

  • -b=0.0.0.0 → default configuration to bind Wildfly with 0.0.0.0
  • -bmanagement=0.0.0.0 → as above, but for the admin console
  • -Djboss.server.default.config=standalone-full-ha.xml → We indicate which profile Wildfly should use. full stands for the full set of Java EE components, while ha stands for High Availability.
  • -Djboss.node.name=$(hostname -i) → We set the value of the Wildfly node with the IP that Docker generates (inside the container)
  • -Djava.net.preferIPv4Stack=true → To tell Wildfly to use IPv4
  • -Djgroups.bind_addr=$(hostname -i) → To bind JGroups to the IP that Docker assigns to the container. The JGroups subsystem provides group communication support for HA services in the form of JGroups channels. This configuration is also used for Wildfly Multicast.
  • -Djboss.messaging.cluster.password=cluster_password1 → Sets the password for the messaging cluster (ActiveMQ Artemis, formerly HornetQ)
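Putting the pieces above together, a minimal version of the Dockerfile might look like the following (the base image, add-user command and ENTRYPOINT are taken from the article; the path of the EAR inside the project is an assumption):

FROM jboss/wildfly:20.0.0.Final

# Admin user for the Wildfly administration console
RUN /opt/jboss/wildfly/bin/add-user.sh admin Password1 --silent

# Copy the EAR produced by the Maven build into the deployments directory
COPY module-ear/target/JavaEEClusterApp.ear /opt/jboss/wildfly/standalone/deployments/

# Start Wildfly with the full-ha profile, bound to the container IP
ENTRYPOINT /opt/jboss/wildfly/bin/standalone.sh -b=0.0.0.0 -bmanagement=0.0.0.0 \
    -Djboss.server.default.config=standalone-full-ha.xml \
    -Djboss.node.name=$(hostname -i) \
    -Djava.net.preferIPv4Stack=true \
    -Djgroups.bind_addr=$(hostname -i) \
    -Djboss.messaging.cluster.password=cluster_password1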

Our application is now ready to be containerized:

mvn clean package
docker build -t javaee-clustered-app .

At the end of the process we will find our “javaee-clustered-app” image in the local Docker repository:

> docker image ls
REPOSITORY               TAG        IMAGE ID        CREATED       SIZE
javaee-clustered-app     latest     e0f2ff1a9003    2 days ago    766MB

Our image is ready and available, and we can test it with the creation of the container:

docker run -it -p 8080:8080 javaee-clustered-app

From the browser we will be able to see our application:

We stop the container with CTRL+C, and proceed to write the file to create the application Stack:

docker-compose.yaml

Our stack consists of two services:

  1. traefik → the application that acts as the Load Balancer. It listens on two ports: 80 for load balancing and 8080 for the administration console
  2. javaee_clusterd_app → our Java EE application. We add the labels needed to communicate with traefik (such as the PathPrefix and the indication of port 8080, to which traefik will route the HTTP traffic), as sketched below
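The compose file is embedded in the article; a sketch of what it might contain is shown below (the Traefik command flags and label keys follow Traefik v2 conventions, and the router/service names are illustrative):

version: "3.7"

services:
  traefik:
    image: traefik:v2.3
    command:
      - "--api.insecure=true"                # dashboard on port 8080
      - "--providers.docker=true"
      - "--providers.docker.swarmMode=true"  # read labels from swarm services
      - "--entrypoints.web.address=:80"
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

  javaee_clusterd_app:
    image: javaee-clustered-app
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.javaee.rule=PathPrefix(`/JavaEEClusterApp`)"
        - "traefik.http.routers.javaee.entrypoints=web"
        - "traefik.http.services.javaee.loadbalancer.server.port=8080"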

Let’s proceed with the stack deployment:

docker stack deploy --compose-file docker-compose.yaml javaee_stack

Once our stack has been created, we can verify the outcome:

# Check the stack
> docker stack ls
# Check the composition of our stack
> docker stack ps javaee_stack
# List the active services
> docker service ls

We should see the following result:

docker stack

Let’s connect to the Traefik Console to check the HTTP routing configuration for our Java EE application. From the browser we go to:

http://localhost:8080/dashboard

Under the HTTP Routers tab we should find our routing with the PathPrefix (‘/JavaEEClusterApp’) (the one specified in the docker-compose.yaml file):

Traefik — HTTP Routes

Under the HTTP Services tab we will find our backend consisting of our Java EE application:

Traefik — HTTP Services

Going into detail we will see how our Service consists of a single Server:

Traefik — HTTP Services — Detail

The address we see under Servers is Docker’s internal IP, not visible from the outside but only from the network created by the docker stack (of which Traefik is part).

We can test the configuration from our browser:

http://localhost/JavaEEClusterApp/

Now let’s create our Application Cluster, adding 2 more nodes to the existing one.

We use the following command:

docker service scale javaee_stack_javaee_clustered_app=3

From the console we will see Docker create two more instances/nodes of our Cluster, i.e. two more containers.

By running the verification commands again, we will see that our application now has 3 replicas:

docker stack replicated

We will also have the two new nodes under the Traefik balancer:

Traefik — HTTP Services — Detail Load Balancing

Testing the Cluster

Now that we have a Cluster composed of 3 nodes, we can start verifying the state replication mechanism of our Java EE application.

Let’s start by verifying that requests are correctly balanced across our 3 servers through invocations of our REST API.

Using the Curl tool we invoke our API 3 times:

curl -X GET http://localhost/JavaEEClusterApp/api/uuid

and we will see that each time the nodeName attribute has a different value, showing that the load is balanced across all nodes!

Load Balancing — Rest

Let’s proceed with testing the web part, verifying the replication mechanisms of the HTTP Sessions and of the Stateful EJBs.

From the browser we go to the page:

http://localhost/JavaEEClusterApp/

Let’s add some attributes to the session; clicking on the “Refresh Page” button, we can see that even as the node we land on changes (Node Name field), both the Session ID field and the list of session attributes always remain the same!

Note how the state maintained by the Stateful EJB also remains unchanged when the execution node changes!
