TDI Reconnect Rules

Tuesday, 17 April 2012
Network connections have a habit of dropping, whether because of network issues, server load or application faults.

TDI, fortunately, has some in-built mechanisms to help the TDI developer cope with such situations - Reconnect Rules. The theory is that should a connection to a data source "disappear" for some reason, TDI can attempt to reconnect to the data source and continue processing. This reconnection can be attempted a number of times and can also be configured to wait a certain amount of time before attempting to reconnect. But of course, I'm not really educating you seasoned TDI professionals by telling you this. You already understand the concepts and where to configure the rules, right?

So all is well in the world again. Our JDBC, LDAP and HTTP connectors can all be configured to be robust enough to withstand network glitches.

What about our Function Components? What about, for example, the Axis Easy Invoke Soap Web Service Function Component? There is certainly a tab for Connection Errors available which would suggest that we can avail of the same technology to ensure we have a robust Assembly Line.

Alas, it would seem not. I was recently confronted with a badly behaving Web Service data source that every now and again wouldn't bother to respond to my requests. I would get a timeout, much to my frustration. I didn't want to set the Function Component's timeout unreasonably high - I merely wanted to retry my operation, so I turned to my Connection Errors tab. Behaviourally? No change!

So what was going on here? Can it be as simple as Reconnect Rules only being appropriate for connectors? I simulated my issue to test the theory. I used an HTTP Client connector to connect to an HTTP Server connector that would never respond, and configured my HTTP Client connector with just a 2 second timeout. I got my timeout after 2 seconds, unsurprisingly. I also saw my connector wait for 2 seconds and attempt a reconnection (as per the configuration in the above diagram).

That said, I killed my client Assembly Line after a minute or so to find the following in my log file:

08:40:54,615 INFO  - CTGDIS087I Iterating.
08:40:54,615 INFO  - CTGDIS086I No iterator in AssemblyLine, will run single pass only.
08:40:54,615 INFO  - CTGDIS092I Using runtime provided entry as working entry (first pass only).
08:41:54,011 INFO  - CTGDIS100I Printing the Connector statistics.
08:41:54,161 INFO  -  [HTTPClientConnector] Reconnect attempts:13, Errors:14
08:41:54,161 INFO  - CTGDIS104I Total: Reconnect attempts:13, Errors:14.
08:41:54,161 INFO  - CTGDIS101I Finished printing the Connector statistics.
08:41:54,171 INFO  - CTGDIS080I Terminated successfully (14 errors).
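Those statistics are worth a quick sanity check. The sketch below assumes that each failed attempt costs roughly the 2 second timeout plus the 2 second retry delay (an assumption on my part; the log doesn't break it down) and that the first and last timestamps above bracket the run. The function and variable names are purely illustrative.

```javascript
// Convert a log timestamp such as "08:40:54,615" to milliseconds since midnight.
function toMillis(stamp) {
  var m = /^(\d{2}):(\d{2}):(\d{2}),(\d{3})$/.exec(stamp);
  return ((Number(m[1]) * 60 + Number(m[2])) * 60 + Number(m[3])) * 1000 + Number(m[4]);
}

// Timestamps taken from the log: "Iterating" and "Printing the Connector statistics".
var elapsedSeconds = (toMillis("08:41:54,011") - toMillis("08:40:54,615")) / 1000;

// Assumed cost of one failed attempt: 2 s timeout + 2 s retry delay.
var cycleSeconds = 2 + 2;

// Number of complete timeout-and-wait cycles that fit into the run.
var expectedAttempts = Math.floor(elapsedSeconds / cycleSeconds);

console.log(elapsedSeconds + " s elapsed, about " + expectedAttempts + " attempts");
```

Under those assumptions the run lasted just under a minute and had room for 14 attempts, which lines up with the 14 errors reported: one original attempt followed by 13 reconnects. So the counts are at least consistent with a connector that simply never stopped retrying.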

13 reconnection attempts? Can that be right? I only configured my Connection Errors tab to reconnect once! Odd. Maybe it has something to do with the fact that I'm receiving a timeout. I shut down my HTTP Server component, re-ran my client and got the following:

08:48:35,298 INFO  - [HTTPClientConnector] CTGDJP201I Finding entries with URL set to http://localhost:8091.
08:48:36,309 INFO  - [HTTPClientConnector] CTGDIS495I handleException , lookup, Connection refused: connect
08:48:38,322 INFO  - [HTTPClientConnector] CTGDJP201I Finding entries with URL set to http://localhost:8091.
08:48:39,304 INFO  - [HTTPClientConnector] CTGDIS495I handleException , lookup, Connection refused: connect
08:48:41,316 INFO  - [HTTPClientConnector] CTGDJP201I Finding entries with URL set to http://localhost:8091.
08:48:42,698 INFO  - [HTTPClientConnector] CTGDIS495I handleException , lookup, Connection refused: connect
08:48:44,821 INFO  - [HTTPClientConnector] CTGDJP201I Finding entries with URL set to http://localhost:8091.
08:48:45,813 INFO  - [HTTPClientConnector] CTGDIS495I handleException , lookup, Connection refused: connect

Reconnecting furiously on a refused connection now, and continually! Maybe that's an issue for another day though, as I'm really interested in what happens with my Function Component: no matter what I put into the Connection Errors tab, I always get an aborting Assembly Line.

The answer to the problem is probably as I suggested - Connection Errors are inappropriate for Function Components. I can also tell that this is the case because, if I examine the underlying XML for my Assembly Line, the Function Component has no Reconnect tag like the one I see in my HTTP Client connector:

   <parameter name="autoreconnect">true</parameter>
   <parameter name="initreconnect">true</parameter>
   <parameter name="numberOfRetries">1</parameter>
   <parameter name="retryDelay">2</parameter>

If I'm being brutally honest, I always knew that the reconnect rules wouldn't work on Function Components as the tab wasn't even present in TDI v6.1.1 for Function Components. It seems that it has appeared at some point without me really noticing - everything above was performed using TDI v7.1.5. So, a misleading tab has appeared. Ah well, the only way forward now is to handle the failure myself in my hooks. A simple catch for the timeout, a counter, a sleep and a system.skipTo() will do the trick nicely.
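The shape of that catch, counter and sleep recipe can be sketched in plain JavaScript. This is a minimal, self-contained illustration rather than TDI hook code: invokeWithRetry, callService, sleep and flakyService are hypothetical names standing in for the Function Component call and the pause. Inside a real Assembly Line the retry would be driven from the error hook with system.skipTo() rather than a loop.

```javascript
// Sketch only: plain JavaScript, not TDI hook code.
// invokeWithRetry, callService and sleep are hypothetical names.
function invokeWithRetry(callService, maxRetries, delaySeconds, sleep) {
  var attempts = 0;                 // the counter
  while (true) {
    attempts++;
    try {
      return { result: callService(), attempts: attempts };
    } catch (e) {                   // the catch
      if (attempts > maxRetries) {
        throw e;                    // retries exhausted: let the failure surface
      }
      sleep(delaySeconds);          // the sleep, then go round and try again
    }
  }
}

// Usage with a fake service that fails twice before answering.
var calls = 0;
function flakyService() {
  calls++;
  if (calls < 3) {
    throw new Error("simulated timeout");
  }
  return "response";
}

var pauses = [];                    // records each requested pause instead of really sleeping
var outcome = invokeWithRetry(flakyService, 3, 2, function (s) { pauses.push(s); });
```

Passing the sleep function in as a parameter keeps the sketch testable without real delays. The important design point is the cap on retries: without it you are back to the endless reconnect loop seen in the HTTP Client log above.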