Recent versions of Flux handle database and network failures more gracefully and recover from them without a restart. It is still useful to notify administrators when such failures occur so that they can act on an unscheduled outage. Flux depends on the database to maintain job states and schedules, so handling database outages well is critical.

BoneCP, a popular JDBC connection pool, has a useful feature that lets the pool recover automatically from database outages: it replays the affected transactions once a healthy connection becomes available. This feature has been available since the 0.6.5 release. I tested it and found it very useful when configured with Flux.

BoneCP lets you implement a connection hook that is triggered when a database failure occurs, and Flux is easy to configure to use BoneCP as its data source pool provider. When a database failure occurs, you may want to send an email notification or raise an SNMP trap so that downstream systems can respond accordingly.

Here is what a basic DatabaseShutdownHook would look like:

import com.jolbox.bonecp.ConnectionHandle;
import com.jolbox.bonecp.hooks.AbstractConnectionHook;
import com.jolbox.bonecp.hooks.AcquireFailConfig;

import java.util.Date;

public class DatabaseShutdownHook extends AbstractConnectionHook {
    @Override
    public boolean onConnectionException(ConnectionHandle connection, String state, Throwable t) {
        // handle notifications here: SNMP or SMTP
        System.out.println("Database down at " + new Date());
        return super.onConnectionException(connection, state, t);
    }

    @Override
    public boolean onAcquireFail(Throwable t, AcquireFailConfig acquireConfig) {
        // handle notifications here: SNMP or SMTP
        System.out.println("Failure to acquire connection at " + new Date() + ". Retry attempts remaining : " + acquireConfig.getAcquireRetryAttempts());
        return super.onAcquireFail(t, acquireConfig);
    }

}
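
Both hook methods can fire repeatedly during a single outage (onAcquireFail runs on every retry), so wiring an email or SNMP call directly into them may flood the recipients. The following is a minimal sketch of one way to limit that, assuming a hypothetical helper class named NotificationThrottle that is not part of BoneCP or Flux. It allows at most one alert per quiet period. The 60-second period and the sample timestamps are illustrative only.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper (not part of BoneCP or Flux): allows at most one alert per quiet period. */
final class NotificationThrottle {
    private final long quietPeriodMillis;
    // Timestamp of the last alert that was allowed; Long.MIN_VALUE means "none yet".
    private final AtomicLong lastSentMillis = new AtomicLong(Long.MIN_VALUE);

    NotificationThrottle(long quietPeriodMillis) {
        this.quietPeriodMillis = quietPeriodMillis;
    }

    /** Returns true if an alert should be sent for a failure observed at nowMillis. */
    boolean shouldNotify(long nowMillis) {
        while (true) {
            long last = lastSentMillis.get();
            if (last != Long.MIN_VALUE && nowMillis - last < quietPeriodMillis) {
                return false; // an alert already went out recently
            }
            // Only one thread wins the update, so concurrent failures produce a single alert.
            if (lastSentMillis.compareAndSet(last, nowMillis)) {
                return true;
            }
        }
    }

    public static void main(String[] args) {
        NotificationThrottle throttle = new NotificationThrottle(60_000L);
        // Illustrative failure times in milliseconds since the first failure.
        long[] failureTimesMillis = {0L, 2_000L, 14_000L, 26_000L, 75_000L};
        for (long t : failureTimesMillis) {
            System.out.println(t + " ms -> notify? " + throttle.shouldNotify(t));
        }
    }
}
```

In the hook, the SMTP or SNMP call would be guarded by a check such as throttle.shouldNotify(System.currentTimeMillis()), with the throttle held in a field of DatabaseShutdownHook.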

Let us now see how to configure BoneCP as a data source in Flux.

import com.jolbox.bonecp.BoneCPDataSource;
import flux.Configuration;
import flux.DatabaseType;
import flux.Engine;
import flux.Factory;

import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import java.sql.Connection;
import java.sql.SQLException;

public class FluxEngine {

    private static Context initialContext;

    static {
        try {
            System.setProperty(Context.INITIAL_CONTEXT_FACTORY, "org.apache.naming.java.javaURLContextFactory");
            System.setProperty(Context.URL_PKG_PREFIXES, "org.apache.naming");
            initialContext = new InitialContext();
            initializeDataSource();
        } catch (NamingException e) {
            // log exception
        } catch (SQLException e) {
            // log exception
        }
    }

    public static void initializeDataSource()  throws NamingException, SQLException {
        BoneCPDataSource ds = new BoneCPDataSource();
        ds.setJdbcUrl("jdbc:mysql://localhost:3306/flux710?relaxAutoCommit=true");
        ds.setUsername("flux");
        ds.setPassword("secret");
        ds.setMinConnectionsPerPartition(10);
        ds.setMaxConnectionsPerPartition(50);
        ds.setPartitionCount(1);
        ds.setConnectionHook(new DatabaseShutdownHook()); // required only if you need notifications
        ds.setTransactionRecoveryEnabled(true); // important: this should be enabled
        ds.setAcquireRetryAttempts(10); // default is 5
        ds.setAcquireRetryDelay(10000); // in milliseconds; default is 7 seconds
        ds.setReleaseHelperThreads(5);

        Connection con = ds.getConnection();
        if (con != null) {
            initialContext.rebind("FluxDataSource", ds);
            con.close();
        }
        System.out.println("DataSource configured.");
    }

    public static void main(String[] args) throws Exception {
        Factory f = Factory.makeInstance();
        Configuration c = f.makeConfiguration();
        c.setDatabaseType(DatabaseType.MYSQL);
        c.setDataSource("FluxDataSource");
        Engine engine = f.makeEngine(c);
        engine.start();
        System.out.println("Engine started.");
    }
}

Flux looks up data sources through JNDI, so they must be made available there. In this example I used Tomcat's JNDI support to expose the BoneCP data source. There are other ways to expose it via JNDI when running Flux as a standalone server, but Tomcat JNDI was easy to configure, as the code above shows. The BoneCP data source should have transaction recovery enabled, and you can set the number of acquire retry attempts and the retry delay.
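
The retry settings determine roughly how long an outage the pool can ride out before it gives up. The sketch below shows the arithmetic for the values used above (10 attempts, 10000 ms delay). It assumes the pool sleeps for the full delay after each failed attempt and ignores the time each failed connect takes, so the result is a lower bound rather than an exact figure. The class and method names are hypothetical.

```java
/** Hypothetical helper: estimates how long the pool keeps retrying before giving up. */
final class RetryWindow {

    /** Lower bound in milliseconds: attempts multiplied by the delay between attempts. */
    static long minimumRetryWindowMillis(int acquireRetryAttempts, long acquireRetryDelayMillis) {
        return acquireRetryAttempts * acquireRetryDelayMillis;
    }

    public static void main(String[] args) {
        // Values from the data source configuration above.
        long window = minimumRetryWindowMillis(10, 10_000L);
        System.out.println("Retry window: at least " + (window / 1000) + " s");
        // prints "Retry window: at least 100 s"
    }
}
```

If the database is expected to stay down longer than this window during maintenance, the attempts or the delay would need to be raised accordingly.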

I created a simple flow chart that has a Timer Trigger followed by a Java Action. The timer is configured to fire every 15 seconds, 5 times in total, as shown below.

[Figure: process_data.png]

Here is sample output from running this job in Flux configured with BoneCP and a MySQL database. I shut down the MySQL server while the job was running, and you can see BoneCP's recovery attempts. After a while I brought the MySQL server back up and BoneCP recovered successfully, after which Flux executed the last occurrence of the job.

DataSource configured.
Engine started.
Done processing data

Done processing data

Done processing data

Done processing data

Database down at Sun Oct 10 12:18:12 MDT 2010
Oct 10, 2010 12:18:12 PM com.jolbox.bonecp.ConnectionHandle markPossiblyBroken
SEVERE: Database access problem. Killing off all remaining connections in the connection pool. SQL State = 08007
Oct 10, 2010 12:18:12 PM com.jolbox.bonecp.MemorizeTransactionProxy invoke
SEVERE: Connection failed. Attempting to recover transaction on Thread #70
Oct 10, 2010 12:18:14 PM com.jolbox.bonecp.hooks.AbstractConnectionHook onAcquireFail
SEVERE: Failed to acquire connection Sleeping for 10000ms and trying again. Attempts left: 10. Exception: java.net.ConnectException: Connection refused: connect
Failure to acquire connection at Sun Oct 10 12:18:14 MDT 2010. Retry attempts remaining : 10
Oct 10, 2010 12:18:26 PM com.jolbox.bonecp.hooks.AbstractConnectionHook onAcquireFail
SEVERE: Failed to acquire connection Sleeping for 10000ms and trying again. Attempts left: 9. Exception: java.net.ConnectException: Connection refused: connect
Failure to acquire connection at Sun Oct 10 12:18:26 MDT 2010. Retry attempts remaining : 9
Oct 10, 2010 12:18:38 PM com.jolbox.bonecp.hooks.AbstractConnectionHook onAcquireFail
SEVERE: Failed to acquire connection Sleeping for 10000ms and trying again. Attempts left: 8. Exception: java.net.ConnectException: Connection refused: connect
Failure to acquire connection at Sun Oct 10 12:18:38 MDT 2010. Retry attempts remaining : 8
Oct 10, 2010 12:18:48 PM com.jolbox.bonecp.MemorizeTransactionProxy invoke
SEVERE: Recovery succeeded on Thread #70
Done processing data

The following BoneCP dependencies are required on the classpath:
bonecp-0.7.0.jar
guava-r07.jar
slf4j-api-1.6.1.jar
slf4j-jdk14-1.6.1.jar (Note: You can use any logger bindings supported by slf4j. Using JDK logger for simplicity and one less jar.)

To configure Tomcat JNDI, you need these jars on the classpath:
catalina.jar
tomcat-juli.jar

Let me know if you have any trouble setting this up in Flux.

(Update: 10/13): Updated to reference the latest stable 0.7.0 release, which now uses Google Guava instead of the retired Google Collections library, and fixed the slf4j dependency requirement.

2 Responses to “Handling Flux Database outages using BoneCP Connection Pool”

  • Wallace Wadge

    Thanks for the blog post. Just a minor correction: as per slf4j docs, you might have a different logger installed and thus slf4j-jdk14-1.6.1.jar is not strictly a dependency of bonecp.

  • Arul

    Hi Wallace,

    Thanks for catching that. I have updated the blog and thanks for developing this wonderful tool.

    -Arul