This is one of a series of blog posts about the Spring Batch framework, based on lessons learned from building a number of batch jobs.
For a description of the Spring Batch framework, please take a look here.
With some Spring Batch jobs you end up processing thousands, if not millions, of records. It can be useful to log the progress, for two reasons:
1. You can see that the job is still running and how far it has got.
2. The log gives you a record of throughput that you can analyse afterwards to look at the performance of the job.
Thankfully, it's quite easy to set up using the listener interfaces provided by the Spring Batch framework. For some background info on the listeners available, take a look here.
What we need is a listener class to count the number of records read and output a log message every time the count reaches a multiple of a certain value, such as one thousand. The ChunkListener provides a means to be notified during the processing of records within a step.
Here's an example:
package my.package;

import java.text.MessageFormat;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.springframework.batch.core.ChunkListener;
import org.springframework.batch.core.scope.context.ChunkContext;

/**
 * Log the count of items processed at a specified interval.
 *
 * @author Jeremy Yearron
 */
public class ChunkCountListener implements ChunkListener {

    private static final Logger log = LogManager.getLogger(ChunkCountListener.class);

    private MessageFormat fmt = new MessageFormat("{0} items processed");

    private int loggingInterval = 1000;

    @Override
    public void beforeChunk(ChunkContext context) {
        // Nothing to do here
    }

    @Override
    public void afterChunk(ChunkContext context) {
        int count = context.getStepContext().getStepExecution().getReadCount();
        // If the number of records read so far is a multiple of the
        // logging interval, output a log message.
        if (count > 0 && count % loggingInterval == 0) {
            log.info(fmt.format(new Object[] { count }));
        }
    }

    @Override
    public void afterChunkError(ChunkContext context) {
        // Nothing to do here
    }

    public void setItemName(String itemName) {
        this.fmt = new MessageFormat("{0} " + itemName + " processed");
    }

    public void setLoggingInterval(int loggingInterval) {
        this.loggingInterval = loggingInterval;
    }
}
This class allows you to specify the interval at which messages are written to the log. I find that 1000 is a good default, but it can be changed to suit your circumstances. It also allows you to provide a name for the items being counted, so you can count the records for a number of steps and identify each set clearly in the log. The class uses the count of items read rather than the count of items written, because you may have a processor that filters out some of the records, in which case the number of items written will not increase in a consistent manner.
If you're configuring your Spring app using XML, then you add a bean like this:
<bean id="myCountListener" class="my.package.ChunkCountListener">
    <property name="itemName" value="Customers" />
    <property name="loggingInterval" value="10000" />
</bean>
Using this config, the listener will write out something like this:
10,000 Customers processed
20,000 Customers processed
...
Then you add the listener to the step:
<step id="myStep">
    <tasklet>
        <chunk reader="myReader" processor="myProcessor" writer="myListWriter" commit-interval="500"/>
        <listeners>
            <listener ref="myCountListener"/>
        </listeners>
    </tasklet>
</step>
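If you prefer Java configuration, a roughly equivalent setup might look like the sketch below. This is an assumption-laden fragment, not the original article's configuration: the bean names myReader, myProcessor and myListWriter, and the Customer item type, are hypothetical and would need to match your own beans.

```java
// Hypothetical Java-config equivalent of the XML above.
@Bean
public Step myStep(StepBuilderFactory stepBuilderFactory) {
    ChunkCountListener countListener = new ChunkCountListener();
    countListener.setItemName("Customers");
    countListener.setLoggingInterval(10000);

    return stepBuilderFactory.get("myStep")
            .<Customer, Customer>chunk(500)   // commit-interval
            .reader(myReader())
            .processor(myProcessor())
            .writer(myListWriter())
            .listener(countListener)
            .build();
}
```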
Note that the loggingInterval value for the listener must be a multiple of the commit-interval value for the step. The listener is only called at the end of each chunk, so the read count it sees is always a multiple of the commit-interval; if that rarely or never coincides with a multiple of the loggingInterval, the messages may be written far less often than expected, or not at all.
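The interaction between the two intervals can be illustrated with a small standalone sketch (a hypothetical class, not part of Spring Batch, that simulates the check performed in afterChunk()):

```java
import java.util.ArrayList;
import java.util.List;

public class IntervalCheck {

    public static void main(String[] args) {
        // afterChunk() only sees read counts that are multiples of the
        // commit-interval: 500, 1000, 1500, ...
        int commitInterval = 500;

        // loggingInterval 1000 (a multiple of 500): logs at 1000, 2000, ...
        System.out.println("loggingInterval=1000: " + logPoints(commitInterval, 1000, 5000));

        // loggingInterval 750 (not a multiple of 500): only logs where the
        // two intervals happen to coincide, far less often than intended.
        System.out.println("loggingInterval=750:  " + logPoints(commitInterval, 750, 5000));
    }

    // Return the read counts at which a message would be logged.
    static List<Integer> logPoints(int commitInterval, int loggingInterval, int totalReads) {
        List<Integer> points = new ArrayList<>();
        for (int count = commitInterval; count <= totalReads; count += commitInterval) {
            if (count % loggingInterval == 0) {
                points.add(count);
            }
        }
        return points;
    }
}
```

With a commit-interval of 500 and a loggingInterval of 750, messages only appear at 1500, 3000, 4500 and so on, instead of every 750 records.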
So now you can run your job and the progress will be recorded in the log file. But what if you want to take a closer look at the performance of the job?
How you approach this depends on your circumstances, but creating a class to parse the log file is straightforward. If you configure the logging so that the timestamp is included in each log record then you can calculate the time taken to process each thousand records or whatever your logging interval is. Then the time taken to process each interval could be plotted in a graph to illustrate how the speed of processing changed over the course of the execution and highlight any changes in performance.
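As a minimal sketch of that idea, the class below computes the time taken per interval from a few log lines. The "HH:mm:ss - message" layout and the sample lines are assumptions for illustration; you would adjust the parsing to match your actual log pattern.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;

public class ProgressLogParser {

    public static void main(String[] args) {
        // Sample lines in an assumed "HH:mm:ss - message" layout.
        List<String> lines = List.of(
                "10:00:05 - 10,000 Customers processed",
                "10:00:41 - 20,000 Customers processed",
                "10:01:30 - 30,000 Customers processed");

        List<Long> seconds = intervalSeconds(lines);
        for (int i = 0; i < seconds.size(); i++) {
            // The first line has no preceding interval, so offset by one.
            System.out.println(lines.get(i + 1).substring(11) + " in " + seconds.get(i) + "s");
        }
    }

    // Seconds elapsed between each pair of consecutive log timestamps.
    static List<Long> intervalSeconds(List<String> lines) {
        List<Long> result = new ArrayList<>();
        LocalTime previous = null;
        for (String line : lines) {
            LocalTime timestamp = LocalTime.parse(line.substring(0, 8));
            if (previous != null) {
                result.add(Duration.between(previous, timestamp).getSeconds());
            }
            previous = timestamp;
        }
        return result;
    }
}
```

Feeding the per-interval times into a spreadsheet or charting tool then gives you the performance graph described above.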