Clustering Tomcat is Easier than People Think
JGroups is all the rage for server clustering in the Java world at the moment. It is also a pain in the butt at times. Tomcat does something similar with its internal multicast session replication (backported from 5 to 4), but it is rather difficult to get working, and finicky once it does.
So, instead of fighting with multicast in-memory replication, remember that jk2 does server affinity very nicely with Tomcat behind it, and use a database to store session information. It works disturbingly well, and is much easier to configure.
It's quite easy to do this; let's look at a typical <Context /> element with a database-backed session manager:
```xml
<Context path="" docBase="/usr/local/my-web-app" debug="0">
  <Manager className="org.apache.catalina.session.PersistentManager"
           debug="0"
           saveOnRestart="true"
           maxActiveSessions="-1"
           minIdleSwap="30"
           maxIdleSwap="600"
           maxIdleBackup="0">
    <Store className="org.apache.catalina.session.JDBCStore"
           driverName="org.postgresql.Driver"
           connectionURL="jdbc:postgresql://db.example.com/tomcat?user=user&amp;password=password"
           sessionTable="tomcat_sessions"
           sessionIdCol="id"
           sessionDataCol="data"
           sessionValidCol="valid"
           sessionMaxInactiveCol="maxinactive"
           sessionLastAccessedCol="lastaccess"
           checkInterval="60"
           debug="0" />
  </Manager>
</Context>
```

The difference between this and other contexts is the <Manager /> element. It specifies a database to store sessions in, and that database requires a pretty basic schema:
```
tomcat=# \d tomcat_sessions
       Table "public.tomcat_sessions"
   Column    |          Type          | Modifiers
-------------+------------------------+-----------
 id          | character varying(100) | not null
 valid       | character(1)           | not null
 maxinactive | integer                | not null
 lastaccess  | bigint                 |
 data        | bytea                  |
 app         | bytea                  |
Indexes: tomcat_sessions_pkey primary key btree (id)
```

This can sit in pretty much any database. MySQL in uberfast mode (i.e., no transactions) is a good choice, as session storage is inherently single-threaded. I use Postgres because I like Postgres =)
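If you want to create that table from scratch, the equivalent DDL would be something like the following (derived from the \d output above; tweak the types for your database of choice):

```sql
-- Session table for Tomcat's JDBCStore
CREATE TABLE tomcat_sessions (
    id          VARCHAR(100) NOT NULL,   -- sessionIdCol
    valid       CHAR(1)      NOT NULL,   -- sessionValidCol
    maxinactive INTEGER      NOT NULL,   -- sessionMaxInactiveCol
    lastaccess  BIGINT,                  -- sessionLastAccessedCol
    data        BYTEA,                   -- sessionDataCol (serialized session)
    app         BYTEA,
    CONSTRAINT tomcat_sessions_pkey PRIMARY KEY (id)
);
```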
Now the fun part, making it really useful for a cluster. I will start with the presumption that we are using a small cluster, say four app servers: sam, frodo, pippin, and merry; two http sprayers: morgul and isengard; and a database server: db (I know, it breaks the naming convention).
The sprayers are running Apache HTTPD with jk2, with an entry like the following for each app server in workers2.properties:
```
[channel.socket:frodo.example.com:8009]
info=Ajp13 forwarding over socket
debug=0
group=lb
tomcatId=frodo
```

s/frodo/$server_name/g for each additional app server. The tomcatId is important.
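Spelled out for the whole cluster, the sprayers' jk2 configuration would look something like this (hostnames follow the naming above; the [uri:/*] mapping is an assumption about how the apps are deployed):

```
# One channel per app server; tomcatId must match that server's jvmRoute
[channel.socket:sam.example.com:8009]
info=Ajp13 forwarding over socket
debug=0
group=lb
tomcatId=sam

[channel.socket:frodo.example.com:8009]
info=Ajp13 forwarding over socket
debug=0
group=lb
tomcatId=frodo

[channel.socket:pippin.example.com:8009]
info=Ajp13 forwarding over socket
debug=0
group=lb
tomcatId=pippin

[channel.socket:merry.example.com:8009]
info=Ajp13 forwarding over socket
debug=0
group=lb
tomcatId=merry

# Send everything through the load-balanced group
[uri:/*]
group=lb
```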
Each app server needs to modify its <Engine /> element to include a jvmRoute attribute:

```xml
<Engine name="Standalone" defaultHost="localhost" debug="0" jvmRoute="frodo">
```

Works for frodo. The jvmRoute attribute needs to match the tomcatId for that server in the jk2 configuration. In addition, each app server needs the session manager described earlier.
Round-robin the external DNS between the HTTPD instances. They are stateless with respect to affinity -- all the information they need is contained in the JSESSIONID. If the app server a sprayer wants to send the request to is down, it picks another at random. That app server realizes it doesn't have the session, and you get a database hit to retrieve it. This only happens when a server goes down, however; normally you will have full session affinity.
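The reason the sprayers can be stateless: Tomcat appends the jvmRoute to the session id, so the cookie value looks like 0A1B2C3D9F.frodo (session id made up), and jk2 routes on the part after the last dot. A minimal sketch of that routing decision, in Python purely for illustration:

```python
def route_for(jsessionid):
    """Return the worker name encoded in a JSESSIONID, if any.

    Tomcat appends '.' + jvmRoute to the session id, so
    '0A1B2C3D9F.frodo' routes to the worker with tomcatId 'frodo'.
    """
    if "." in jsessionid:
        return jsessionid.rsplit(".", 1)[1]
    return None  # no route encoded: pick a worker at random
```

If the named worker is down, the sprayer falls back to another worker, which then pulls the session out of the database.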
The problem with this is a database update per request. That can really put the crimp on an already bogged-down database server. However, it is easily solved: use a separate database on a separate network. Each app server will have two or three NICs and be on two or three networks. The first is the network between the app servers and the sprayers. The second is a network between the app servers and the session database. The third is an administration network. This is overkill, but hey, NICs and switches are cheap.
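In practice that just means the connectionURL in the <Store /> element points at a hostname that resolves to the database's NIC on the session network. Assuming a hypothetical name db-sessions.example.com for that interface, the only change from the earlier configuration is the URL:

```xml
<Store className="org.apache.catalina.session.JDBCStore"
       driverName="org.postgresql.Driver"
       connectionURL="jdbc:postgresql://db-sessions.example.com/tomcat?user=user&amp;password=password"
       sessionTable="tomcat_sessions"
       sessionIdCol="id"
       sessionDataCol="data"
       sessionValidCol="valid"
       sessionMaxInactiveCol="maxinactive"
       sessionLastAccessedCol="lastaccess"
       checkInterval="60"
       debug="0" />
```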
This setup will handle a lot of load, and the chokepoint will normally become the RDBMS. C-JDBC might be the cost-effective way to solve that problem, but I don't yet trust it enough for production use. It may be there, but my understanding of it, and my faith in it, isn't. Postgres's recently released replication server can provide failover capabilities, but it still means you need a big, performant database. Anyone have a good solution for that?