The following steps describe a typical setup of an eXo Platform cluster:

Cluster settings are made in the configuration.properties file, which must be configured identically on all cluster nodes. First, point the exo.shared.dir variable to a directory shared between the cluster nodes.

Because the path is shared, all nodes need read/write access to it. Then, switch the JCR to cluster mode.

In cluster mode, the JCR enables automatic network replication and discovery between the cluster nodes.
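Before starting a node, it can be worth confirming that the directory set in exo.shared.dir is really readable and writable from that node. The following is a minimal shell sketch of such a check, not part of eXo Platform: the function name check_shared_dir and the example path in the usage note are illustrative assumptions.

```shell
#!/bin/sh
# check_shared_dir DIR (hypothetical helper, not shipped with eXo Platform):
# succeeds only if DIR exists and this node can create, read back,
# and delete a file in it.
check_shared_dir() {
    dir="$1"
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }

    # Probe file name includes the node name and PID to avoid clashes
    # when several nodes run the check at the same time.
    probe="$dir/.rw-probe-$(uname -n)-$$"

    echo "probe" > "$probe" || { echo "cannot write to $dir" >&2; return 1; }
    if [ "$(cat "$probe" 2>/dev/null)" != "probe" ]; then
        rm -f "$probe"
        echo "cannot read from $dir" >&2
        return 1
    fi
    rm -f "$probe"
    echo "shared directory OK: $dir"
}
```

Run on each node as, for example, check_shared_dir /mnt/exo-shared, where /mnt/exo-shared stands in for whatever directory you assigned to exo.shared.dir.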

You need to indicate the cluster kernel profile to eXo Platform. This can be done by editing the bin/gatein.sh startup script as below:

or use the start_eXo.sh script with the following parameters:

For the initial startup of your JCR cluster, start only a single node. This node will initialize the internal JCR database and create the system workspace. Once the initial node has fully started, you can start the other nodes.

The cluster mode is preconfigured to work out of the box. eXo Platform clustering relies entirely on JBossCache replication, which uses JGroups internally. The default JBossCache configuration is located in exo.portal.component.common-x.x.x.jar. Since eXo Platform 3.5, the JCR's JBossCache configuration has been externalized to the gatein.conf.dir configuration folder:

Q1. How to migrate from local to cluster mode?

If you intend to migrate your production system from local (non-cluster) mode to cluster mode, follow these steps:

1. Update the configuration to the cluster mode as explained above on your main server.

2. Use the same configuration on other cluster nodes.

3. Move the index and value storage to the shared file system.

4. Start the cluster.

Q2. Why does startup fail with the "Port value out of range" error?

On Linux, startup may fail with the following error:

[INFO] Caused by: java.lang.IllegalArgumentException: Port value out of range: 65536

This problem happens under specific circumstances when JGroups, the networking library behind the clustering, attempts to detect the IP address to use for communication with other nodes.

You need to verify:

  • The host name resolves to a valid IP address served by one of the network devices, such as eth0 or eth1.

  • The host name is NOT defined as localhost or 127.0.0.1.
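The two checks above can be sketched as a small shell script. This is only an illustration: the function names is_loopback and check_host_resolution are assumptions made for this example, and the script relies on the getent and awk utilities being available, as they are on most Linux systems.

```shell
#!/bin/sh
# is_loopback ADDR (hypothetical helper): succeeds if ADDR is "localhost"
# or an IPv4/IPv6 loopback address.
is_loopback() {
    case "$1" in
        127.*|::1|localhost) return 0 ;;
        *) return 1 ;;
    esac
}

# check_host_resolution (hypothetical helper): resolves this machine's
# host name and fails if it resolves to nothing or to a loopback address.
check_host_resolution() {
    name="$(uname -n)"
    # First address the system resolver (hosts file, DNS, ...) returns.
    addr="$(getent hosts "$name" | awk '{ print $1; exit }')"
    if [ -z "$addr" ]; then
        echo "host name '$name' does not resolve to any address" >&2
        return 1
    fi
    if is_loopback "$addr"; then
        echo "host name '$name' resolves to loopback address $addr" >&2
        return 1
    fi
    echo "host name '$name' resolves to $addr"
}
```

Calling check_host_resolution on a node reports the address its host name resolves to. The script does not confirm that this address belongs to a network device, so compare the result with the addresses assigned to eth0, eth1 and so on.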

Q3. How to solve the "failed sending message to null" error?

You may encounter the following error when starting up in cluster mode on Linux:

Dec 15, 2010 6:11:31 PM org.jgroups.protocols.TP down
SEVERE: failed sending message to null (44 bytes)
java.lang.Exception: dest=/228.10.10.10:45588 (47 bytes)

Be aware that clustering on Linux only works with IPv4. Therefore, when running a cluster on Linux, add the following property to the JVM parameters:

 -Djava.net.preferIPv4Stack=true 
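One way to add the property is through an environment variable read by the startup script. The sketch below assumes that your startup script honors JAVA_OPTS, which you should verify for your application server. The helper name add_jvm_opt is made up for this example.

```shell
#!/bin/sh
# add_jvm_opt OPT (hypothetical helper): appends OPT to JAVA_OPTS unless
# it is already there, so sourcing this snippet twice is harmless.
# Assumption: the startup script passes JAVA_OPTS to the JVM.
add_jvm_opt() {
    case " ${JAVA_OPTS:-} " in
        *" $1 "*) ;;    # already present, nothing to do
        *) JAVA_OPTS="${JAVA_OPTS:+$JAVA_OPTS }$1" ;;
    esac
    export JAVA_OPTS
}

add_jvm_opt "-Djava.net.preferIPv4Stack=true"
```

Source this snippet before launching the server so that the exported JAVA_OPTS is visible to the startup script.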

Q4. How to hide JGroups protocol warnings in the log?

In cluster mode, several Platform subsystems, such as the JCR, various caches and the organization service, use a shared JGroups transport. With the UDP transport, which is used by default, this may have a side effect: a large number of warnings like those below:

WARNING: discarded message from different group "gatein-idm-api-cluster" (our group is "gatein-idm-store-cluster"). Sender was 192.168.1.55:54232
Dec 16, 2011 4:46:09 PM org.jgroups.protocols.TP passMessageUp
WARNING: discarded message from different group "gatein-idm-store-cluster" (our group is "gatein-idm-api-cluster"). Sender was 192.168.1.55:63364
Dec 16, 2011 4:46:10 PM org.jgroups.protocols.TP passMessageUp

To hide such warnings, configure the application server logger appropriately:

  • Apache Tomcat: add the following lines to ${CATALINA_HOME}/conf/logging.properties:

org.jgroups.level = SEVERE
org.jgroups.handlers = java.util.logging.ConsoleHandler,6gatein.org.apache.juli.FileHandler

  • JBoss Application Server: for the "all" server profile, add the following to ${jboss_server}/server/all/conf/jboss-log4j.xml:

  <category name="org.jgroups.protocols.UDP">
    <priority value="ERROR"/>
  </category>

Copyright © 2009-2012. All rights reserved. eXo Platform SAS