Tuesday, April 1, 2014

Extending Spring Data

What is Spring Data?

Spring Data is, by its own definition, a project that "makes it easy to use new data access technologies". It is a collection of database-specific subprojects. At its heart, Spring Data's goal is to make implementing a data access layer effortless.
Almost all CRUD applications are designed with the MVC pattern: the Model represents the data, the View the UI, and the Controller the business logic. Usually, you implement a data access layer that interfaces between the controller and the model. The job of the data access layer is to isolate the specifics of the database implementation from the business logic. This is a good way to keep your controller from getting bloated, except that it leads to two problems:
a) Data access requires a lot of boilerplate code
b) There is a common tendency for data access code to seep into the controller
Spring Data solves these problems by delegating the job of writing data access code to generated code. The idea is that the developer tells the framework what kind of data access queries s/he wants, and Spring Data will generate the actual code. So, for example, let's say we have an Employee entity (with an id of type Long) and the developer wants to implement a data access layer for it using Spring Data. All s/he has to do is this:
public interface EmployeeRepository extends JpaRepository<Employee, Long> {}
Boom! That's it. Spring scans the packages for interfaces that extend JpaRepository, automatically generates a bean that implements each such interface, and adds the autogenerated bean to the application context. Any other class that autowires an instance of EmployeeRepository will get the generated bean. JpaRepository declares the most common CRUD methods along with support for paging and sorting. The users of EmployeeRepository can use these methods right out of the box.
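Any class can then autowire the repository and use the inherited methods immediately. A minimal sketch (the EmployeeService class and its methods are my own illustration, not from the original post):

import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.stereotype.Service;

@Service
public class EmployeeService {

    @Autowired
    private EmployeeRepository employeeRepository; // the bean Spring Data generated

    public Employee hire(Employee employee) {
        return employeeRepository.save(employee); // inherited from JpaRepository, no code written
    }

    public List<Employee> firstPageOfEmployees() {
        Page<Employee> page = employeeRepository.findAll(new PageRequest(0, 20)); // paging for free
        return page.getContent();
    }
}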
If you want to add more ways to find an Employee, all you need to do is add a method to the interface:

public interface EmployeeRepository extends JpaRepository<Employee, Long> {
    @Query
    Page<Employee> findByName(String name, Pageable pageable);
}
That's it. Spring Data already knows how to generate the code to find employees by name and to return the results page by page. It figures out which column to match by parsing the method name. The developer can also provide the JPQL directly. To find out the different ways of adding finder queries to Spring Data, please consult the Spring documentation.
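For instance, providing the JPQL directly would look something like this (a sketch; it assumes the Employee entity has a name property):

public interface EmployeeRepository extends JpaRepository<Employee, Long> {

    // Spring Data binds ?1 to the first method parameter and runs this JPQL as-is,
    // instead of deriving the query from the method name
    @Query("select e from Employee e where e.name like ?1")
    Page<Employee> findByNameLike(String name, Pageable pageable);
}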

If you want to move from using JPA to using MongoDB, all you need to do is change the interface that EmployeeRepository extends, like this:
public interface EmployeeRepository extends MongoRepository<Employee, String> {
}
This tells Spring Data to generate code that talks to MongoDB instead of JPA.

How does it work?

A lot of the internals of Spring Data are undocumented, and even the Spring Data code is poorly documented. I went through some effort of reverse engineering Spring Data to understand this.* I am documenting it here so the next generation of hackers doesn't have to hack through Spring Data again.
*I believe I have seen Mike Bond do some very similar things in Lucy, and I am guessing he has gone through the same process.
As explained earlier, all of the Spring Data magic works by load-time code generation. Whenever a Spring context is loaded, Spring scans the packages for interfaces that extend the Spring Data interfaces and generates beans that implement these interfaces. However, to keep Spring Data extensible and debuggable, not all code is generated. Most of the code resides in classes implemented by the Spring Data developers. At load time, Spring Data generates a proxy class that delegates the calls to the "real" classes. The proxy simply acts as a bridge between the controller and the implementation provided by Spring.
The way it is designed, for each of the respective Spring Data interfaces there is a class that implements that interface. So, for example, there is a class named SimpleJpaRepository that implements JpaRepository. During load time, Spring Data scans the packages, and when it sees an interface that extends JpaRepository, it invokes a factory that
  1. Creates an instance of SimpleJpaRepository
  2. Generates a proxy class that implements EmployeeRepository. The proxy class wraps the instance of SimpleJpaRepository. The proxy is very lightweight and simply delegates to the wrapped instance (a sketch of the idea follows this list)
  3. Adds the proxy to Spring's application context. At this point, this bean is just like any other bean: it can be used like any other bean, which means it can be injected into other beans
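Conceptually, the generated proxy is no more than a JDK dynamic proxy whose invocation handler forwards everything to the wrapped instance. Here is a sketch of the idea (my own illustration, not Spring Data's actual source; the EntityManager is assumed to come from the context):

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

import javax.persistence.EntityManager;

import org.springframework.data.jpa.repository.support.SimpleJpaRepository;

// Conceptual sketch: every call on the repository interface is forwarded to the
// "real" implementation class.
class DelegatingHandler implements InvocationHandler {

    private final Object targetRepository; // e.g. an instance of SimpleJpaRepository

    DelegatingHandler(Object targetRepository) {
        this.targetRepository = targetRepository;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        return method.invoke(targetRepository, args); // pure delegation to the wrapped instance
    }

    // Roughly what the factory does at load time
    static EmployeeRepository createProxy(EntityManager entityManager) {
        return (EmployeeRepository) Proxy.newProxyInstance(
                EmployeeRepository.class.getClassLoader(),
                new Class<?>[] { EmployeeRepository.class },
                new DelegatingHandler(new SimpleJpaRepository<Employee, Long>(Employee.class, entityManager)));
    }
}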

Wait a minute! What about methods annotated with @Query?

Good question! Say you have an interface that looks like this:
public interface EmployeeRepository extends JpaRepository<Employee, Long> {
    @Query
    Page<Employee> findByName(String name, Pageable pageable);
}

What happens to the findByName method? Long answer short, this uses the same pattern of the proxy class delegating to another class that does the real query. During load time, Spring instantiates a class that implements the RepositoryQuery interface for every method that has the @Query annotation. The type of the concrete class depends on the type of query (for example, if it's JPQL, it will use SimpleJpaQuery). The instance of RepositoryQuery is initialized with enough information to run the query (for example, the JPQL will be passed to the SimpleJpaQuery). The proxy class contains a RepositoryQuery object for every method annotated with @Query, and it has the code to delegate to the correct RepositoryQuery.
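Extending the earlier proxy sketch, the invoke method would first consult a per-method map of RepositoryQuery objects before falling back to the wrapped implementation (again my own illustration of the mechanism, not the actual source):

// queries is a Map<Method, RepositoryQuery> built at load time, with one entry
// per interface method annotated with @Query
@Override
public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
    RepositoryQuery query = queries.get(method);
    if (query != null) {
        return query.execute(args);               // annotated finders run through their RepositoryQuery
    }
    return method.invoke(targetRepository, args); // everything else delegates to the real implementation
}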

How do you extend it?

Once you understand how Spring Data works, extending it is a simple matter of either extending existing Spring Data classes or implementing your own. The beauty of Spring Data is that you don't have to extend any of their repositories and can still get Spring Data-like functionality.

Let's go through a simple example: we are going to implement a Spring Data interface that can store data in CSV files. We are not extending existing Spring Data functionality; we are providing completely new interfaces.
These are the steps to follow.

Define interface

Define an interface that your users will extend. Your interface should declare all the methods for basic CRUD functionality. To make it easy, you can just extend one of the Spring Data provided interfaces. You don't have to; it just makes it easy.
public interface CSVBasedRepository<T, ID extends Serializable> extends PagingAndSortingRepository<T, ID> {
}
This is the interface that your users will extend, like this:
public interface EmployeeRepository extends CSVBasedRepository<Employee, Long> {
}
You can add more methods to CSVBasedRepository if you like. To keep things simple, we will keep it as is.

Provide implementation for interface

Now you need to provide an implementation of CSVBasedRepository. This class does the real work of reading and writing the CSV file.
public class CSVBasedRepositoryImpl<T, ID extends Serializable> implements CSVBasedRepository<T, ID> {

    @Override
    public T findOne(ID id) {
        // TODO add code here
        return null;
    }

    @Override
    public boolean exists(ID id) {
        // TODO
        return false;
    }

    // ... other methods declared by CSVBasedRepository
}
So, now you have an implementation of a CSV-based Spring repository that does most of the work of reading/writing records to CSV.
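To make this concrete, here is one hypothetical way findOne could be written, assuming one record per line with the id in the first column, a csvFile field on the class, java.io imports, and a mapToEntity helper that turns a row into an entity (none of this is prescribed by Spring Data; the file layout and helpers are made up for illustration):

@Override
public T findOne(ID id) {
    try (BufferedReader reader = new BufferedReader(new FileReader(csvFile))) {
        String line;
        while ((line = reader.readLine()) != null) {
            String[] columns = line.split(",");
            if (columns[0].equals(id.toString())) { // id assumed to be in the first column
                return mapToEntity(columns);         // hypothetical row-to-entity mapper
            }
        }
        return null; // not found
    } catch (IOException e) {
        throw new RuntimeException("Could not read " + csvFile, e);
    }
}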

Provide implementation for Queries

The first thing you need to decide is what kind of queries you want to support. JPA supports JPQL queries, named queries, and PartTree queries. You don't have to support all of these. You can support only one, or you can come up with your own type of query. The only requirement is that your users have to know what kind of queries you will support. Spring Data lets you define your own rules.
So, in our example, let's say we don't want to support JPQL queries and named queries because that's too difficult and unnecessary. Let's say we just want to support PartTree-style queries, where the method name determines the search. So the user can define the repository as:
public interface EmployeeRepository extends CSVBasedRepository<Employee, Long> {

    @Query
    public List<Employee> findByName(String name);

    @Query
    public List<Employee> findByNameAndJoinDate(String name, Date joinDate);
}
Here, the name of the method tells you which columns to search.
You need to implement a class that implements RepositoryQuery:

class CSVQuery implements RepositoryQuery {

    final Method method;
    final RepositoryMetadata metadata;
    final List<String> columnNames;

    public CSVQuery(Method method, RepositoryMetadata metadata, List<String> columnNames) {
        this.columnNames = new ArrayList<String>(columnNames);
        this.method = method;
        this.metadata = metadata;
    }

    @Override
    public Object execute(Object[] parameters) {
        // code to search the CSV file goes here. The columns to search were passed
        // to the constructor; the values to search for are in parameters
        return null; // TODO
    }

    @Override
    public QueryMethod getQueryMethod() {
        // don't think about this too much. Spring Data just wants it like this...
        // haven't figured out rhyme or reason why this is here
        return new QueryMethod(method, metadata);
    }
}
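For completeness, here is a hypothetical execute body, reusing the same naive CSV layout as before (csvFile, indexOfColumn and mapToEntity are made-up helpers, not Spring Data API):

@Override
public Object execute(Object[] parameters) {
    List<Object> results = new ArrayList<Object>();
    try (BufferedReader reader = new BufferedReader(new FileReader(csvFile))) {
        String line;
        while ((line = reader.readLine()) != null) {
            String[] row = line.split(",");
            boolean matches = true;
            for (int i = 0; i < columnNames.size(); i++) {
                int columnIndex = indexOfColumn(columnNames.get(i)); // hypothetical header lookup
                if (!row[columnIndex].equals(String.valueOf(parameters[i]))) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                results.add(mapToEntity(row)); // hypothetical row-to-entity mapper
            }
        }
    } catch (IOException e) {
        throw new RuntimeException("Could not read " + csvFile, e);
    }
    return results;
}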

Extend factory classes

So now you have implemented a class that provides basic CRUD functionality and a class that can run queries against your data source. Now you need to make the magic happen. The magic lies in factory classes that generate the proxy class. Just to make it simple for us common folk, Spring Data provides base classes with the basic functionality of the factory that you can extend. You don't have to generate the proxies yourself; just provide a factory that creates the real repository and query beans, and the base class takes care of generating the proxy.
Also, because Spring Data doesn't want things to be too easy for us, they decided the best way to do this is to have a factory that returns a factory that returns the bean. Don't think about it too much.
This is where all the magic behind Spring Data lies, and unfortunately this code is very badly documented and, frankly, very confusing. Anyway, the awesomeness of Spring Data kind of justifies the over-engineered code.

Implement QueryLookupStrategy

QueryLookupStrategy is the factory that creates an instance of RepositoryQuery for each query method. You have to implement one method named resolveQuery that returns a RepositoryQuery:
class CSVQueryLookupStrategy implements QueryLookupStrategy {

    @Override
    public RepositoryQuery resolveQuery(Method method, RepositoryMetadata metadata, NamedQueries namedQueries) {
        // method is the query method being implemented; metadata gives access to the repository interface
        List<String> columnNames = extractColumnNamesFromMethod(method);
        return new CSVQuery(method, metadata, columnNames);
    }
}
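Note that extractColumnNamesFromMethod is not a Spring Data API; it is a helper we have to write ourselves. A minimal sketch, assuming method names follow the findBy...And... convention:

// Hypothetical helper, not a Spring Data API: turns findByNameAndJoinDate
// into ["name", "joinDate"]
private List<String> extractColumnNamesFromMethod(Method method) {
    String name = method.getName();
    if (!name.startsWith("findBy")) {
        throw new IllegalArgumentException("Unsupported query method: " + name);
    }
    List<String> columns = new ArrayList<String>();
    for (String part : name.substring("findBy".length()).split("And")) {
        // lower-case the first letter: "JoinDate" -> "joinDate"
        columns.add(Character.toLowerCase(part.charAt(0)) + part.substring(1));
    }
    return columns;
}

Splitting on "And" is just enough for this example; a real implementation would be better off reusing Spring Data's own PartTree parser, which understands keywords like Or, Between and GreaterThan.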

Implement RepositoryFactorySupport

This is a class that acts as a factory for the repository implementation and a factory for QueryLookupStrategy (which makes it a factory for a factory).

class CSVRepositoryFactorySupport extends RepositoryFactorySupport {

    /**
     * Spring calls this to create the object that will get all the calls
     */
    @Override
    public Object getTargetRepository(final RepositoryMetadata metadata) {
        return new CSVBasedRepositoryImpl();
    }

    /**
     * Spring calls this to get the class that the proxy will wrap
     */
    @Override
    protected Class<?> getRepositoryBaseClass(final RepositoryMetadata metadata) {
        return CSVBasedRepositoryImpl.class;
    }

    /**
     * Spring calls this to get the factory that will create the query objects
     */
    @Override
    protected QueryLookupStrategy getQueryLookupStrategy(final Key key) {
        return new CSVQueryLookupStrategy();
    }
}

Implement RepositoryFactoryBeanSupport

This is a factory that creates the RepositoryFactorySupport. Yes, this makes it a factory that is a factory for a factory. No idea why everything has to be a factory.
public class CSVRepositoryFactoryBean<T extends Repository<S, ID>, S, ID extends Serializable>
        extends RepositoryFactoryBeanSupport<T, S, ID> {

    @Override
    protected RepositoryFactorySupport createRepositoryFactory() {
        return new CSVRepositoryFactorySupport();
    }
}
That's it. You have just implemented your own Spring Data.
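To see it all come together, you can exercise the factory directly; RepositoryFactorySupport exposes a getRepository method that builds the proxy. A sketch (in a real application, Spring does this wiring when it registers the factory bean):

// Build the proxy by hand to see the moving parts (normally Spring does this
// when it registers CSVRepositoryFactoryBean)
CSVRepositoryFactorySupport factory = new CSVRepositoryFactorySupport();
EmployeeRepository repository = factory.getRepository(EmployeeRepository.class);

// CRUD calls reach CSVBasedRepositoryImpl through the proxy...
Employee employee = repository.findOne(42L);

// ...and @Query methods are routed through CSVQuery, built by CSVQueryLookupStrategy
List<Employee> smiths = repository.findByName("Smith");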

What are the limitations?

The limitations are really around the @Query annotation. For some reason, all thoughts of extensibility were thrown out while designing the support for the @Query annotation.
  • The authors have decided to make SimpleJpaQuery final. So, if you wanted to make a custom JpaRepository that modifies the JPQL query, you are out of luck.
  • PartTreeJpaQuery is also not very extensible. It has a nested class that converts the method name to a JPA query. Unfortunately, you cannot extend it.

Wednesday, October 23, 2013

Using Java tools to diagnose and fix memory problems

In Sep 2011, I made the following blog post on my employer's internal blogging software because I wanted to use it as training material for offshore developers. I am replicating it here for my own use.

Introduction


A lot of the time, an application's performance is hampered by improper use of computing resources. These problems are difficult to diagnose because they are usually hidden when you test with small data sets. Also, it is hard to debug a performance problem when it takes a long time to execute a test. Fortunately, Java provides some excellent tools that help you monitor and profile applications, which helps in diagnosing these problems.

In this post, I will present a case study of how we improved the performance of cashflow generation in CFP Lite using out-of-the-box tools provided by Java.


Primer on CFP Lite Cashflow Generation


CFP Lite allows the user to generate cashflows by running loans through an Excel model. The loans that are imported into CFP Lite are fed into an Excel sheet, and the formulas on the sheet are executed. The formulas compute cashflows. CFP Lite extracts the cashflows and saves them to the database.

Since cashflow generation for each loan is independent of the others, CFP Lite generates cashflows in a thread pool. This is done to improve the throughput of the application. However, the problem was that the implementation wasn't designed with a large number of loans in mind, and performance would degrade exponentially with the number of loans. The cashflow generation code went through 2 iterations of fixes. Here, I will present how we diagnosed and fixed cashflow generation.

To make things easier for the reader to understand, I made an application that mimics what CFP Lite was doing. This reduces the amount of code that the reader has to understand, so the reader can grasp the issue without getting bogged down in the specifics of CFP Lite. I have attached the sample application to this post. The reader can run the application for him/herself and is encouraged to do so.

It is expected that the reader knows how thread pools work. For a detailed explanation of how thread pools work, please see my related blog post on thread pools.

Initial implementation

The initial implementation of CFP Lite had a main thread that started a thread pool. The main thread loaded the cohorts from the database and, for each cohort, submitted a task to the pool. The task would query the database for the loans belonging to the cohort, create an Excel model, generate cashflows using the model, and then return the cashflows.

The code

To mimic this behavior, I have implemented the application below


public class RunIt {
    static Logger logger = Logger.getLogger(RunIt.class.getName());

    /**
     * @param args
     */
    public static void main(String[] args) {
        long currentTime = System.currentTimeMillis();
        ExecutorService threadPool = Executors.newFixedThreadPool(5);
        List<Future<Cashflows>> futureCashflows = new ArrayList<Future<Cashflows>>(); // the list that contains the cashflows

        logger.info("Submitting cohorts to executor");
        // submit all the cohorts to the thread pool
        for (long i = 0; i < 1L * 1000; i++) {
            Cohort cohort = getNextCohort(i);
            CashflowGenerator callable = new CashflowGenerator(cohort);
            Future<Cashflows> futureCashflow = threadPool.submit(callable);
            futureCashflows.add(futureCashflow);
        }
        logger.info("Submitted loans to queue.. now wait for results");
        // now the loans are processing.. keep getting the results in parallel and process them
        long i = 0;
        for (Future<Cashflows> futureCashflow : futureCashflows) {
            i++;
            Cashflows cashflow;
            try {
                cashflow = futureCashflow.get(); // this blocks until the background threads have completed the task
                processCashflow(cashflow);
                if (i % 100 == 0) logger.info("Processed " + i + " cashflows");
            } catch (Exception e) {
                // log error and continue
            }
        }

        logger.info("Cashflow generation took " + (System.currentTimeMillis() - currentTime) + " ms");
        System.exit(0);
    }

    // ... (getNextCohort and processCashflow elided)
}

public class CashflowGenerator implements Callable<Cashflows> {
    final Cohort cohort;

    public CashflowGenerator(Cohort cohort) {
        super();
        this.cohort = cohort;
    }

    public Cashflows call() throws Exception {
        // first load loans from database
        getLoansFromDatabase();

        // generate cashflows
        Cashflows cashflows = generateCashflowsUsingModel();

        return cashflows;
    }

    private Cashflows generateCashflowsUsingModel() {
        Cashflows cashflows = new Cashflows();
        initializeModel();
        while (balance > 0) {
            Cashflow cashflow = generateCashflow(r, monthlypayment, balance);
            cashflows.getCashflows().add(cashflow); // add a cashflow to the cashflows
            balance -= cashflow.getSched(); // remaining balance
        }
        return cashflows;
    }

    // ... (initializeModel, generateCashflow and the model fields elided)
}

Here the RunIt class contains a main method that mimics the main thread of CFP Lite. It gets all the cohorts from the DB. For each cohort it creates a CashflowGenerator task and submits the task to the thread pool. The task queries the database for the loans, creates the Excel model, then generates cashflows using the model and returns the cashflows. The cashflows are stored by the executor in a Future object, and the main thread maintains a list of Future objects. The main thread loops through all the Future results and, when they are available, gets the cashflows from the result and stores them in the DB.

The problem

When you run this with 1000 cohorts, you get the following output


Sep 15, 2011 12:49:11 PM tryit.memoryleak.RunIt main
INFO: Submitting cohorts to executor
Sep 15, 2011 12:49:11 PM tryit.memoryleak.RunIt main
INFO: Submitted loans to queue.. now wait for results
Sep 15, 2011 12:49:15 PM tryit.memoryleak.RunIt main
INFO: Processed 100 cashflows
Sep 15, 2011 12:49:19 PM tryit.memoryleak.RunIt main
INFO: Processed 200 cashflows
Sep 15, 2011 12:49:22 PM tryit.memoryleak.RunIt main
INFO: Processed 300 cashflows
Sep 15, 2011 12:49:26 PM tryit.memoryleak.RunIt main
INFO: Processed 400 cashflows
Sep 15, 2011 12:49:30 PM tryit.memoryleak.RunIt main
INFO: Processed 500 cashflows
Sep 15, 2011 12:49:33 PM tryit.memoryleak.RunIt main
INFO: Processed 600 cashflows
Sep 15, 2011 12:49:37 PM tryit.memoryleak.RunIt main
INFO: Processed 700 cashflows
Sep 15, 2011 12:49:41 PM tryit.memoryleak.RunIt main
INFO: Processed 800 cashflows
Sep 15, 2011 12:49:44 PM tryit.memoryleak.RunIt main
INFO: Processed 900 cashflows
Sep 15, 2011 12:49:48 PM tryit.memoryleak.RunIt main
INFO: Processed 1000 cashflows
Sep 15, 2011 12:49:48 PM tryit.memoryleak.RunIt main
INFO: Cashflow generation took 36551 ms

It takes less than a second to submit the cohorts to the pool. It takes about 4 secs to generate cashflows for 100 cohorts, and 1000 cohorts execute in 36s (please note that CFP Lite ran much slower than this; the mimic app runs much faster because it's doing less, and also I didn't want people to fall asleep. The intent is to show the problem, not to focus on actual performance numbers).

Now, you would expect that 10K loans should take about 360s = 6 mins, right? Let's see what happens with 10K loans. Here is what we get in the log after running it for 6 minutes:


Sep 15, 2011 1:18:29 PM tryit.memoryleak.RunIt main
INFO: Submitting cohorts to executor
Sep 15, 2011 1:18:30 PM tryit.memoryleak.RunIt main
INFO: Submitted loans to queue.. now wait for results
Sep 15, 2011 1:18:34 PM tryit.memoryleak.RunIt main
INFO: Processed 100 cashflows
Sep 15, 2011 1:18:37 PM tryit.memoryleak.RunIt main
INFO: Processed 200 cashflows
Sep 15, 2011 1:18:41 PM tryit.memoryleak.RunIt main
INFO: Processed 300 cashflows
Sep 15, 2011 1:18:45 PM tryit.memoryleak.RunIt main
INFO: Processed 400 cashflows
Sep 15, 2011 1:18:48 PM tryit.memoryleak.RunIt main
INFO: Processed 500 cashflows
Sep 15, 2011 1:18:52 PM tryit.memoryleak.RunIt main
INFO: Processed 600 cashflows
Sep 15, 2011 1:18:56 PM tryit.memoryleak.RunIt main
INFO: Processed 700 cashflows
Sep 15, 2011 1:19:00 PM tryit.memoryleak.RunIt main
INFO: Processed 800 cashflows
Sep 15, 2011 1:19:04 PM tryit.memoryleak.RunIt main
INFO: Processed 900 cashflows
Sep 15, 2011 1:19:08 PM tryit.memoryleak.RunIt main
INFO: Processed 1000 cashflows
Sep 15, 2011 1:19:12 PM tryit.memoryleak.RunIt main
INFO: Processed 1100 cashflows
Sep 15, 2011 1:19:16 PM tryit.memoryleak.RunIt main
INFO: Processed 1200 cashflows
Sep 15, 2011 1:19:20 PM tryit.memoryleak.RunIt main
INFO: Processed 1300 cashflows
Sep 15, 2011 1:19:24 PM tryit.memoryleak.RunIt main
INFO: Processed 1400 cashflows
Sep 15, 2011 1:19:27 PM tryit.memoryleak.RunIt main
INFO: Processed 1500 cashflows
Sep 15, 2011 1:19:31 PM tryit.memoryleak.RunIt main
INFO: Processed 1600 cashflows
Sep 15, 2011 1:19:35 PM tryit.memoryleak.RunIt main
INFO: Processed 1700 cashflows
Sep 15, 2011 1:19:40 PM tryit.memoryleak.RunIt main
INFO: Processed 1800 cashflows
Sep 15, 2011 1:19:44 PM tryit.memoryleak.RunIt main
INFO: Processed 1900 cashflows
Sep 15, 2011 1:19:48 PM tryit.memoryleak.RunIt main
INFO: Processed 2000 cashflows
Sep 15, 2011 1:19:52 PM tryit.memoryleak.RunIt main
INFO: Processed 2100 cashflows
Sep 15, 2011 1:19:56 PM tryit.memoryleak.RunIt main
INFO: Processed 2200 cashflows
Sep 15, 2011 1:20:00 PM tryit.memoryleak.RunIt main
INFO: Processed 2300 cashflows
Sep 15, 2011 1:20:04 PM tryit.memoryleak.RunIt main
INFO: Processed 2400 cashflows
Sep 15, 2011 1:20:08 PM tryit.memoryleak.RunIt main
INFO: Processed 2500 cashflows
Sep 15, 2011 1:20:12 PM tryit.memoryleak.RunIt main
INFO: Processed 2600 cashflows
Sep 15, 2011 1:20:16 PM tryit.memoryleak.RunIt main
INFO: Processed 2700 cashflows
Sep 15, 2011 1:20:20 PM tryit.memoryleak.RunIt main
INFO: Processed 2800 cashflows
Sep 15, 2011 1:20:25 PM tryit.memoryleak.RunIt main
INFO: Processed 2900 cashflows
Sep 15, 2011 1:20:29 PM tryit.memoryleak.RunIt main
INFO: Processed 3000 cashflows
Sep 15, 2011 1:20:33 PM tryit.memoryleak.RunIt main
INFO: Processed 3100 cashflows
Sep 15, 2011 1:20:38 PM tryit.memoryleak.RunIt main
INFO: Processed 3200 cashflows
Sep 15, 2011 1:20:42 PM tryit.memoryleak.RunIt main
INFO: Processed 3300 cashflows
Sep 15, 2011 1:20:46 PM tryit.memoryleak.RunIt main
INFO: Processed 3400 cashflows
Sep 15, 2011 1:20:50 PM tryit.memoryleak.RunIt main
INFO: Processed 3500 cashflows
Sep 15, 2011 1:20:55 PM tryit.memoryleak.RunIt main
INFO: Processed 3600 cashflows
Sep 15, 2011 1:20:59 PM tryit.memoryleak.RunIt main
INFO: Processed 3700 cashflows
Sep 15, 2011 1:21:12 PM tryit.memoryleak.RunIt main
INFO: Processed 3800 cashflows
Sep 15, 2011 1:21:17 PM tryit.memoryleak.RunIt main
INFO: Processed 3900 cashflows
Sep 15, 2011 1:22:15 PM tryit.memoryleak.RunIt main
INFO: Processed 4000 cashflows
Sep 15, 2011 1:24:14 PM tryit.memoryleak.RunIt main
INFO: Processed 4100 cashflows

It processed only 4100 cohorts in 6 mins! What happened? It should be done by now! Also, notice that it starts off taking 4 secs per 100 cohorts as usual. However, right around the 4 min mark, it suddenly starts taking 1 min or more per 100 cohorts. Eventually, this thing slows down further and further. What happened after 3800 cohorts? Each task is doing exactly the same thing; it should take the same time.

ETA: I let it run while I was typing out this blog post, and 10K loans took 3851206 ms = 64 mins. More than a tenfold slowdown.

The analysis

So, to investigate this problem, I fired up JConsole. JConsole is an application that ships with Java; if you have java in your path, you should have JConsole in your path too. It helps you look inside an application and tells you how it is behaving. When you start JConsole, it gives you the list of Java applications that you can monitor, identified by the name of the class that has the main thread. Note that JConsole is itself a Java application, so it shows itself. If you have a Java-based IDE running, it will show that too (in my case, Eclipse). In my case, my main class is RunIt, so I select that and hit connect. If you are monitoring a web app running in JBoss, it will show up as the JBoss bootstrap class.
[Screenshot: jconsole-connect.JPG]

After you connect, it shows you a dashboard that gives you an overview of what is happening under the hood.

[Screenshot: jconsole-dashboard.JPG]

The dashboard shows the CPU, the memory, the threads, and the number of classes. The tabs above the screen show you a detailed view of each. Click on the Memory tab and you can see this:

[Screenshot: jconsole-memory.JPG]

Here you see a plot of heap memory usage by default, and it also shows you how much memory is used by eden space, survivor space, and old gen space (more on this later). Now, let's see what the heap memory looks like after you run 10K loans.

[Screenshot: jconsole-memory-full.JPG]

Tada! 5 points for anyone who can guess what the problem is at this point

The memory is getting full. As you can see, the application started with low memory usage, but the memory started filling up quickly right after the 3-4 min mark, at which point it is right under 1 GB. 1 GB is the maximum amount of memory that I gave to Java. Also, see that the Old Gen space is full (it's the third bar from the left). All these things indicate that the application is using too much memory. Another indicator is found by comparing the VM Summary and Memory screens.

VM summary: [Screenshot: jconsole-vmsummarry.JPG]
Memory: [Screenshot: jconsole GC time.JPG]

Note that the Process CPU time says 20 mins. This means that the CPU spent 20 mins in the application. Note that this isn't the time the app was running; that is uptime. This is CPU time: if you have multiple threads running on a multi-CPU machine, JConsole adds up the time spent by all the CPUs, which is why Process CPU time can be more than uptime. Also see that the GC time in the Memory tab is 10 minutes. Note: the application was running for 13 mins, during which it used 20 mins of CPU time, and 10 mins of that was spent in GC. That is excessive. The application would have been 100% faster if no GC were running!
So, by looking at the memory usage and the GC time, we get a clue that the problem is excessive memory usage. The question is where the memory is being used. But before we handle that, the question is: why does excessive memory usage slow down the application? Shouldn't it just run out of memory? Why should I even think about reducing memory usage when the problem reported to me is that the application is slow? Shouldn't I try to optimize the database queries or increase the number of threads? Yes, those things might help, but they won't solve the core problem of memory. (Actually, when this problem was reported, Atul K did try to increase the threads, and what happened was that it took more time, not less. Yes! The application ran slower when we increased the threads, which seems paradoxical.) Unless you open JConsole and look under the hood, you won't know exactly what is happening.

Why does using more memory slow down a Java application?

The answer has to do with how GC works. GC is the biggest boon and bane of Java. Before Java, when most high-performing applications were written in C/C++, the programmer had to manage his/her own memory. So, if memory was allocated, the code had to deallocate it. That led to huge problems, because the code would either a) forget to deallocate memory, which led to memory leaks, or b) use an object that was already deallocated, which led to crashes (this is also called the dangling pointer problem). It wasn't uncommon for an application to require a reboot every once in a while (to "clear" memory leaks), or to crash unexpectedly because of a dangling pointer. GC more or less solved these problems by handling the deallocation for the programmer. The code can allocate memory, but doesn't deallocate it. Once an object is out of scope and doesn't have any references from other objects, it becomes a candidate for deallocation. GC goes ahead and deallocates the memory when it runs.

The problem is that finding objects for deallocation is not a simple thing. Let's say you have a List A that contains an Object B. Internally, B contains 3 objects: C, D and E. Let's say D contains a back reference to the parent object B, and E contains a reference to itself. Now let's say you remove B from A. The programmer would expect B to get deallocated because there are no references to it. However, it's not that simple. GC has to figure out that when B is deallocated, it has to deallocate C, D and E too, so it has to walk through the object graph to find all the related objects that should be deallocated. Also, B --> D and D --> B form a circular reference. To properly deallocate B, GC has to handle that cycle rather than treat D's back reference as something keeping B alive, and it has to handle E's reference to itself the same way. This is why GC traces reachability from live roots, which requires walking the object graph. Proper implementation of GC is not a simple topic, and there are still people in academia doing research on improving GC implementations. It would be out of scope (and useless) for most of us programmers to know the nitty-gritty details. Suffice it to say that what we as programmers need to know is this: GC is costly.
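Here is the object graph from the example above in code (the classes are made up for illustration):

import java.util.ArrayList;
import java.util.List;

public class GcExample {
    static class C { }
    static class D { B parent; }  // back reference to B
    static class E { E self; }    // reference to itself
    static class B { C c = new C(); D d = new D(); E e = new E(); }

    public static void main(String[] args) {
        List<B> a = new ArrayList<B>();
        B b = new B();
        b.d.parent = b;   // B --> D and D --> B: a cycle
        b.e.self = b.e;   // E points at itself
        a.add(b);

        a.remove(b);
        b = null;
        // Nothing reachable points at b anymore. Despite the cycles, the whole
        // cluster (B, C, D, E) is now garbage, because GC traces reachability
        // from live roots rather than counting references.
    }
}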
Another thing to note is that GC becomes slower as you use more memory. When GC needs to find objects to collect, it has to walk through the object graph. If you have 1000 objects in memory that don't need to be collected, GC still has to walk through those objects. If you have 10000 objects, it's going to take roughly 10 times longer to run GC.
Since the people who design JVMs know that GC is costly, they make the JVM so that GC is lazy. Normally, in an application that is idle most of the time, GC runs periodically. It doesn't run every time an object goes out of scope. However, in an active application that is busily creating and releasing objects, GC runs only when it absolutely has to. What the JVM does is hold off on collecting objects until memory usage reaches a limit, and then run GC. If you look at the memory graph above, you see a pattern of memory increasing and then dropping slightly. Each drop is where GC ran a collection and released objects. This is done to improve performance. However, the side effect is that GC runs more often when you use more memory. If memory usage is running right under the maximum given to Java, the JVM is going to hit the limit more often and run GC more often. To draw an analogy, this is like bouncing a basketball. If you hold your hand at chest level and then start moving it down, the ball bounces faster and faster, even though you control the speed of the descent, because the ball has less distance to travel the closer it is to the ground. Similarly, GC "bounces" more often when memory usage is closer to the top.
So, the end result of GC being costly, becoming costlier as you use more memory, and running more often is what causes the application to slow down as it uses more memory. Eventually, if GC is not able to release any memory, an OutOfMemoryError is thrown. However, that happens long after the application has slowed to a crawl. The first clue for most memory problems in Java apps is almost always "my application is running slower".

Sometimes this means there can be unexpected side effects in typical applications. For example, let's say you have a bit of code that opens a connection and runs SQL queries in a loop. If the code inside the loop is using too much memory, it will get slower and slower with every iteration. At some point, the database connection times out, and the code gets a SQLException. The code logs the exception and exits. When it exits, all the memory that it allocated inside the loop becomes unreachable, and GC clears it up in the next collection. Suddenly the memory usage becomes low, and the out-of-memory problem "heals" itself. The sysadmin might look at the memory usage after the problem happens, see that memory usage is low, and mistakenly conclude that memory is not the problem. When the problem is reported to the developer, the stack trace says "Connection timed out" along with a note from the admin that says memory usage is low. So the developer might start looking at the SQL queries and may spend hours, if not days, tuning them. That does make things a bit faster, but it doesn't solve the root cause. The misleading stack trace leads the developer down the wrong rabbit hole.

The only way to catch this problem is for the developer to test performance under load him/herself, with JConsole running, before the application is released. Nothing beats running JConsole yourself.

The fix

After we determined that memory was an issue, we had to find out where exactly the memory was being used. Please note that the analysis done through JConsole doesn't pinpoint the exact problem, but it leads us down the correct path. There are several ways to find memory problems, and there are tools we can use to look inside the heap; I will cover those later. In this case, Atul K went through the code and found that the cashflows were being stored in the Future objects, and the Future objects weren't released until the entire cashflow processing was done. So, we changed the code so that each task writes its cashflows to a CSV file. The main thread takes the CSV files and uses SQL*Loader to load them into the database after all tasks are complete. This had 2 advantages:
a) the Future object doesn't need to hold the cashflows
b) part of the processing that was done by the main thread is now done inside the task. This means it is done in parallel, which makes the application faster

Actually, there was no need to keep a Future object anymore, and we could also have used Runnable instead of Callable. However, since we were under a time constraint (it was the evening before release), we decided to make the quick change that provided the most improvement and not remove the Future objects. A sketch of the reworked task is below.
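Here is a minimal sketch of what the reworked task looks like (the writeCashflowsToCsv helper and the file naming are illustrative, not the actual CFP Lite code):

public class CashflowGenerator implements Callable<Void> {
    final Cohort cohort;

    public CashflowGenerator(Cohort cohort) {
        this.cohort = cohort;
    }

    public Void call() throws Exception {
        getLoansFromDatabase();
        Cashflows cashflows = generateCashflowsUsingModel();

        // Write the result to disk instead of returning it, so the Future (and
        // therefore the heap) no longer holds on to the cashflows
        writeCashflowsToCsv(cashflows, "cashflows-" + cohort.getId() + ".csv"); // hypothetical helper and file name
        return null; // nothing is retained in the Future
    }

    // getLoansFromDatabase() and generateCashflowsUsingModel() are unchanged
    // from the original listing; writeCashflowsToCsv() is the new part
}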