Akshay Deo

Do(nt) Code Seriously

Be Careful While Using Executors.newCachedThreadPool()


I am currently writing a scalable backend server for AppSurfer that handles routing for the different components associated with a session. Recently I ran into trouble with the new version of the server: even with a single connection, CPU load was almost 100%. I was clueless about the cause and started digging into it.

My setup

  • Netty 4.x, using our home-cooked wrapper Transporter.
  • JDK at the latest update, number 40.
  • CentOS.

Case Study

We at AppSurfer solve a very tricky problem on the technology side. Our servers need to respond in almost real time while putting minimal load on the machine, because extra load might affect other sessions. For better scaling we moved from standard IO to Netty (i.e. NIO). That really helped us improve overall response time and use resources more efficiently. But during the move we faced one critical issue: 100% CPU usage. Some initial googling introduced me to this JVM issue, but it was already handled very well in Netty 3.x as well as Netty 4.x. So I started digging further.

CPU usage shown by the htop command:
[Screenshot from 2013-09-28 21:06:24]
CPU usage shown by the top command:
[Screenshot from 2013-09-28 21:12:48]

I was clueless about what was going wrong even after adding all the NIO-related fixes. Then the YourKit output showed one strange thing: a lot of garbage collection was happening during session runs. Digging further revealed that the main issue was Executors.newCachedThreadPool().

Garbage collection output:
[Screenshot from 2013-10-06 23:36:15]

What I read about newCachedThreadPool in Java documentation was:

“Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available. These pools will typically improve the performance of programs that execute many short-lived asynchronous tasks. Calls to execute will reuse previously constructed threads if available. If no existing thread is available, a new thread will be created and added to the pool. Threads that have not been used for sixty seconds are terminated and removed from the cache. Thus, a pool that remains idle for long enough will not consume any resources. Note that pools with similar properties but different details (for example, timeout parameters) may be created using ThreadPoolExecutor constructors.”

But I missed the crux of one line: “If no existing thread is available, a new thread will be created and added to the pool.” This means that if tasks are submitted at a very fast rate for, let's say, 2-3 minutes (and in our case that is throughout the session run), you end up with thousands of threads being created and destroyed. To be precise, during a single 50-second session YourKit showed around 10,500 threads. I always knew that in our case a lot of short-duration tasks would be submitted to the workers, and the Java documentation does say that “These pools will typically improve the performance of programs that execute many short-lived asynchronous tasks.” But that holds only when the number of tasks is under control and they are not all submitted virtually concurrently: the pool puts no upper bound on the number of threads, so every task that arrives while all existing threads are busy gets a brand new thread.
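To make the unbounded growth concrete, here is a minimal sketch (not our server code). The class name CachedPoolGrowth and the helper peakThreads are made up for illustration. It relies on the fact that Executors.newCachedThreadPool() returns a plain ThreadPoolExecutor with core size 0, maximum size Integer.MAX_VALUE and a SynchronousQueue, so work is never queued. Every task is held on a latch so that all of them overlap, which is an assumption standing in for a burst of concurrent submissions.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

public class CachedPoolGrowth {

    /** Submits `tasks` overlapping tasks and returns the peak thread count of the pool. */
    static int peakThreads(int tasks) {
        // The cached pool hands each task straight to an idle thread,
        // or creates a new thread when none is idle.
        ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newCachedThreadPool();
        final CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // stay busy until every task has been submitted
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int peak = pool.getLargestPoolSize();
        release.countDown(); // let all tasks finish
        pool.shutdown();
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("peak threads for 100 overlapping tasks: " + peakThreads(100));
    }
}
```

Because no thread is ever idle while the tasks are being submitted, the pool should report one thread per task. With real traffic the tasks finish at different times, so the count is lower, but nothing in the pool itself caps it.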

Solution

We wrote our own custom thread pool executor to control the number of threads spawned at runtime, and now CPU usage stays around 1% or 2% throughout the lifetime of the server.

// EfficientThreadPoolExecutor is our custom executor and PriorityThreadFactory is a helper class
// (neither is shown here).
public static ThreadPoolExecutor get(final int coreSize, final int maxSize, final int idleTimeout,
        final TimeUnit timeUnit, final int queueSize, final String namePrefix) {
    return new EfficientThreadPoolExecutor(coreSize, // core size
            maxSize, // max size: upper bound on the number of threads
            idleTimeout, // idle timeout
            timeUnit,
            new ArrayBlockingQueue<Runnable>(queueSize), // bounded queue of pending tasks
            new PriorityThreadFactory(namePrefix, Thread.NORM_PRIORITY));
}
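Since our executor and thread factory classes are not shown in this post, here is a rough, self-contained sketch of the same idea using only the standard ThreadPoolExecutor. The class name BoundedPoolSketch, the helpers namedFactory, newBoundedPool and peakThreads, and the choice of CallerRunsPolicy for a full queue are all assumptions made for illustration, not what our server actually uses.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedPoolSketch {

    /** Hypothetical stand-in for a naming thread factory: threads are called "<prefix>-<n>". */
    static ThreadFactory namedFactory(final String prefix) {
        final AtomicInteger counter = new AtomicInteger();
        return runnable -> {
            Thread t = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
            t.setPriority(Thread.NORM_PRIORITY);
            return t;
        };
    }

    static ThreadPoolExecutor newBoundedPool(int coreSize, int maxSize, long idleTimeout,
                                             TimeUnit unit, int queueSize, String prefix) {
        return new ThreadPoolExecutor(coreSize, maxSize, idleTimeout, unit,
                new ArrayBlockingQueue<Runnable>(queueSize), // bounded backlog of pending tasks
                namedFactory(prefix),
                // Assumption: when the queue is full, the submitting thread runs the task
                // itself, which slows producers down instead of spawning more threads.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Runs `tasks` short tasks and returns the peak number of pool threads. */
    static int peakThreads(int tasks, int maxSize) {
        ThreadPoolExecutor pool = newBoundedPool(2, maxSize, 30, TimeUnit.SECONDS, 64, "worker");
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(done::incrementAndGet);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return pool.getLargestPoolSize();
    }

    public static void main(String[] args) {
        System.out.println("peak threads: " + peakThreads(10000, 8));
    }
}
```

However many tasks are submitted, the pool never grows past maxSize. Extra work waits in the queue, and once the queue is full too, the rejection policy decides what happens next.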

Thanks to Norman Maurer, the most active contributor to Netty, for helping me solve this issue.

The Night When We Released Our AppSurfer Mobile App…


It's one of the biggest days for us, as a company, as entrepreneurs, as technology lovers!! We released the first version of the AppSurfer mobile app early in the morning of the 21st (like 1.30 am early :P). We were happy, and equally nervous. And at the same time Google Play was giving us a bit of a hard time :D. We kept polling our expected web page on Google Play and it kept returning 404 :-/. We were sleepy, exhausted and equally excited to see our product page on Google Play, and too bored to click refresh again and again. So the programmer within us came up with this script:

import time
import urllib.error
import urllib.request

APP_URL = "https://play.google.com/store/apps/details?id=main.java.com.appsurfer"

def read_page():
    print("reading source")
    req = urllib.request.Request(APP_URL)
    try:
        # urlopen raises HTTPError (a URLError subclass) while the page still returns 404
        urllib.request.urlopen(req).close()
    except urllib.error.URLError:
        print("no")
        return
    print("AppSurfer app is up \\m/ yohoooooo !!")

if __name__ == "__main__":
    #sendmail()
    while True:
        read_page()
        time.sleep(15)  # poll every 15 seconds

Output:

reading source
no
reading source
no
reading source
no
reading source
no
reading source
no
reading source
no
reading source
no
reading source
AppSurfer app is up \m/ yohoooooo !!
reading source
AppSurfer app is up \m/ yohoooooo !!

And finally we saw that AppSurfer was up; it was almost 3.25 am. Nevertheless, it's a big day for us. We are sure that you are gonna like the AppSurfer service, so stay tuned for exciting upcoming updates.

P.S. If you liked Inception, I am sure you are gonna like this page

While Adding Splash Screen for Your Android Application


Right now I am working on the Android application for AppSurfer. While designing a prototype for the application, I had to add a splash screen. Fragmentation is a well-known issue for Android developers, so we have to be careful about design components: they should not break on devices that are very large or very small. So I did a search and found the following set of dimensions for splash screens, which accommodates almost all screen sizes.

  1. LDPI
    Portrait : 200x320px
    Landscape : 320x200px
  2. MDPI
    Portrait : 320x480px
    Landscape : 480x320px
  3. HDPI
    Portrait : 480x800px
    Landscape : 800x480px
  4. XHDPI
    Portrait : 720x1280px
    Landscape : 1280x720px

Happy coding \m/

Using Libvirt Java API Bindings in Maven


Hello coders, I am writing this quick post to help the small bunch of people who are trying to use the libvirt Java API bindings. I was trying to build the .jar from the git repo mentioned on the libvirt site, but the

ant build

command was giving a bunch of errors, 100 errors to be precise. Then I decided to go with Maven, but there was no documentation. There is no rocket science in understanding Maven's architecture, but for coders looking for a ready-made template, here it is:

<repositories>
    <repository>
        <id>libvirt repo</id>
        <url>http://www.libvirt.org/maven2/</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>org.libvirt</groupId>
        <artifactId>libvirt</artifactId>
        <version>0.4.9</version>
    </dependency>
</dependencies>

Add this to your pom.xml and you are ready to go.
Thanks :)

Managing Multiple Versions of Python on a Linux Box


Hello coders, if you work with Python, or if any of your development tools is based on Python, you must have faced the problem of using multiple versions of Python. For instance, I develop web applications using Django, which has only beta-level support for Python 3, so I have to be on Python 2.7; the same goes for repo from Google, which does not run on Python 3. But Python 3 is required in some cases; for example, I am running Arch Linux, which needs Python 3 for some of its components. I came across an amazing tool named pythonbrew.

Pythonbrew provides very simple commands to install, uninstall and switch between multiple versions of Python.

To install a python version
pythonbrew install 3.2
To remove a python version
pythonbrew uninstall 3.2
To switch to a particular version of Python
pythonbrew switch 2.7

In action:
Installing a version

[akshay@amd_dev ~/Downloads/pythonbrew]$ pythonbrew install 3.2
Downloading Python-3.2.tgz as /home/akshay/.pythonbrew/dists/Python-3.2.tgz
######################################################################## 100.0%
Extracting Python-3.2.tgz into /home/akshay/.pythonbrew/build/Python-3.2

This could take a while. You can run the following command on another shell to track the status:
  tail -f "/home/akshay/.pythonbrew/log/build.log"

Patching Python-3.2
Installing Python-3.2 into /home/akshay/.pythonbrew/pythons/Python-3.2
Downloading distribute_setup.py as /home/akshay/.pythonbrew/dists/distribute_setup.py
######################################################################## 100.0%
Installing distribute into /home/akshay/.pythonbrew/pythons/Python-3.2
Installing pip into /home/akshay/.pythonbrew/pythons/Python-3.2

Installed Python-3.2 successfully. Run the following command to switch to Python-3.2.
  pythonbrew switch 3.2
[akshay@amd_dev ~/Downloads/pythonbrew]$ python
Python 2.7 (r27:82500, Aug 18 2012, 12:32:52) 
[GCC 4.7.1 20120721 (prerelease)] on linux3
Type "help", "copyright", "credits" or "license" for more information.
>>>

Switch to version 3.2, which was just installed

[akshay@amd_dev ~/Downloads/pythonbrew]$ pythonbrew switch 3.2
Switched to Python-3.2
[akshay@amd_dev ~/Downloads/pythonbrew]$ python
Python 3.2 (r32:88445, Aug 18 2012, 12:56:01) 
[GCC 4.7.1 20120721 (prerelease)] on linux3
Type "help", "copyright", "credits" or "license" for more information.
>>> 

Switch back to version 2.7

[akshay@amd_dev ~/Downloads/pythonbrew]$ pythonbrew switch 2.7
Switched to Python-2.7
[akshay@amd_dev ~/Downloads/pythonbrew]$ python
Python 2.7 (r27:82500, Aug 18 2012, 12:32:52) 
[GCC 4.7.1 20120721 (prerelease)] on linux3
Type "help", "copyright", "credits" or "license" for more information.
>>> 
[akshay@amd_dev ~/Downloads/pythonbrew]$ clear

Hope this will help some of you :)
Happy coding !!

Lesson Learnt : Switching From Desktop Application to Web Application Development


Just to give an introduction, I work on the core technology of AppSurfer, part of which is written in Django. Before starting my own venture I was part of an MNC where I worked on desktop applications written in C#/.NET. So when I started my first major web application, I had never been part of any web application project, and I struggled a lot to decide on its architecture. I am going to discuss one aspect of web applications here: database transactions during request processing.

The problem
When we develop a desktop application, we mainly concentrate on modularity of code. We identify entities, implement them, and go on expanding the system depending on the responsibilities of each entity. We don't really care about the number of DB writes and reads, subroutine calls, or levels of indirection (mostly because the application runs locally, and nowadays a normal desktop is fast enough that we can rely on the optimisations made by the compiler, databases and database connectors). Our main concern is to make the system more loosely coupled and modular.

What goes wrong
In the process of making my web application more modular and loosely coupled, I ended up with a lot of modules and a lot of persisted data, which means more interaction with the database. Roughly speaking, the business logic of a web application takes about 10% of the request time on average (based on statistics from my own web application) and DB queries take the other 90%. So, obviously, fewer DB queries mean lower response time, and hence faster execution, less load on the server, and so on.

What we can do
One of the things I did was remove unnecessary modules from my web application and write most of the application as a script. Secondly, once an object is fetched from the DB, perform all the operations related to it in one place and only then store it back into the DB. Use a separate DB and a UDP-based logging system to make your application independent of logging and error catching. These small changes made a huge difference in DB response time and overall app response time.
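The "fetch once, do everything, store once" idea can be sketched without any real database. The snippet below is only an illustration and is not taken from my Django code: CountingStore is a made-up in-memory stand-in that counts round trips, and the session fields are invented.

```java
import java.util.HashMap;
import java.util.Map;

public class RoundTripSketch {

    /** Hypothetical in-memory stand-in for a database that counts round trips. */
    static class CountingStore {
        private final Map<String, Map<String, String>> rows = new HashMap<>();
        int roundTrips = 0;

        Map<String, String> load(String id) {
            roundTrips++;
            Map<String, String> copy = new HashMap<>();
            Map<String, String> row = rows.get(id);
            if (row != null) {
                copy.putAll(row);
            }
            return copy;
        }

        void save(String id, Map<String, String> row) {
            roundTrips++;
            rows.put(id, new HashMap<>(row));
        }
    }

    /** Modular but chatty: every small change loads and saves the same row on its own. */
    static int chatty(CountingStore store) {
        String[][] changes = {{"status", "active"}, {"device", "nexus"}, {"region", "in"}};
        for (String[] change : changes) {
            Map<String, String> session = store.load("session-1");
            session.put(change[0], change[1]);
            store.save("session-1", session);
        }
        return store.roundTrips;
    }

    /** Load once, apply every change in memory, save once. */
    static int batched(CountingStore store) {
        Map<String, String> session = store.load("session-1");
        session.put("status", "active");
        session.put("device", "nexus");
        session.put("region", "in");
        store.save("session-1", session);
        return store.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println("chatty round trips:  " + chatty(new CountingStore()));  // 6
        System.out.println("batched round trips: " + batched(new CountingStore())); // 2
    }
}
```

Both versions leave the same data behind, but the batched one talks to the store a third as often, and that is where the response time goes when queries dominate the request.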

Consider only the marked points, as one snapshot is from production and the other is from the staging server (using New Relic analytics).
Overall DB response time before my changes:

Overall DB response time after my changes:

Hope you will find this post useful.
Thanks and keep coding :)

Significance of Final Class in Java


Hello guys, recently I have been working in Java on the Android application of AppSurfer. While coding, especially for mobile devices, you need to be concerned about the speed of your application. So this post is about improving execution speed to some extent by using final classes wherever applicable.

When to use a final class :
When you are sure that a class is not going to be extended any further, make it a final class. A final class does not allow any other class to extend it.

What is the advantage of that
When you make a class final, all of its methods behave as if they were final, because nothing can override them. The compiler and runtime can then be sure that those methods are not overridden anywhere else, and may be able to inline calls to them.

final class SquareMaker {
    public int getSquare(int input) {
        return input * input;
    }
}

public class test {
    public static void main(String[] args) {
        SquareMaker sqMkr = new SquareMaker();
        System.out.println(sqMkr.getSquare(10));
    }
}

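According to http://www.glenmccl.com/, this runs about twice as fast when the SquareMaker class is made final. Treat that figure as specific to the JVM it was measured on: modern JIT compilers can often inline methods of non-final classes too, so measure on your own target runtime.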
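As a small illustration of the rule that a final class cannot be extended, the sketch below checks the final modifier through reflection. FinalClassCheck, OpenSquareMaker and isFinalClass are names made up for this example, and it says nothing about speed.

```java
import java.lang.reflect.Modifier;

public class FinalClassCheck {

    static final class SquareMaker {
        int getSquare(int input) {
            return input * input;
        }
    }

    /** Same behaviour, but open for extension. */
    static class OpenSquareMaker {
        int getSquare(int input) {
            return input * input;
        }
    }

    // Uncommenting the next line breaks compilation, because a final class cannot be extended:
    // static class CubeMaker extends SquareMaker { }

    /** Returns true when the given class is declared final. */
    static boolean isFinalClass(Class<?> type) {
        return Modifier.isFinal(type.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println("SquareMaker final?     " + isFinalClass(SquareMaker.class));
        System.out.println("OpenSquareMaker final? " + isFinalClass(OpenSquareMaker.class));
    }
}
```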

Happy coding !! :)

Switching to Postgresql From MySql


I have been using MySQL as the back end since my first Django web application. But as my applications got more and more complex, MySQL started giving me a lot of issues. One of the biggest was with South migrations: migrations often affected or broke relationships, and the Django ORM would return None objects. Yet many times the same migrations ran correctly on my test PostgreSQL DB. So I finally decided to switch to PostgreSQL.

But initially I faced a lot of problems setting up PostgreSQL on my Ubuntu box. It is quite different from the way we use MySQL. So I gathered a lot of information from around the web and am aggregating it here. This was for me, but it is probably useful for a few of you, so I am making it public.

Install PostgreSQL

PostgreSQL server

sudo apt-get install postgresql

pgAdmin tool for managing databases

sudo apt-get install pgadmin3

Add password for default superuser postgres

The default superuser, called ‘postgres’, does not have a password by default, so we need to set one:

sudo -u postgres psql
postgres=# ALTER USER postgres WITH PASSWORD '<your password>';
postgres=# \q

Create a user and a database owned by that user

Create user

sudo -u postgres createuser -d -R -P <new username>

Create database

sudo -u postgres createdb -O <username> <database name>

These are the basic things anyone who wants to switch their Django apps from MySQL to PostgreSQL should know. Hope this post helps you. Do correct me if I got something wrong in this post :).
Happy coding \m/

Switching Branch With Repo for Android Source Code


I find working with repo as difficult as working with the Android source code itself. We faced a lot of trouble while working with different versions of the Android source code. I googled for help with the repo command but didn't find much, so I experimented a bit and found a good way of switching branches.

  1. Run git reset in every project to remove the changes that you have made

    $ repo forall -c git reset --hard

  2. Then initialize repo with the new branch. Suppose you have checked out version 4.0.4_r1.2 and you want to revert to 4.0.1_r1 (which was my case actually :P), then

    $ repo init -u https://android.googlesource.com/platform/manifest -b android-4.0.1_r1

  3. Sync repo

    $ repo sync

This works for any combination of versions.

And magic happens :).
Hope you will find this useful. Cheers and happy coding \m/.

Basecamp vs JIRA


Being part of a start-up, I continuously look out for products that can help set up an end-to-end workflow in the company. Issue management and content management are among the core parts of this workflow. If you google “issue tracking”, “project management”, “content management” or similar search terms, you will be presented with a bunch of solutions. I am going to give my views on two of the solutions that we have used at RainingClouds, Basecamp and JIRA, and explain why I feel JIRA is much better than Basecamp, specifically for the software industry.

We started with Basecamp for AppSurfer. Initially it felt good: minimal design, minimal clicks to get around, and an easy-to-understand product with a single view of all the issues/tasks of a project. We were part of the private beta of the new Basecamp, which made it more intuitive with good-quality icons and visuals. But it still remains a generic task management system (which is their aim, as far as I know), so it is not bad from that perspective. For us as a software company, though, Basecamp falls short in several respects:

  1. We report a lot of tasks and issues. The task list feature is fine, but it puts a hell of a lot of information on a single page.
  2. Every task goes through stages such as: Created | In progress | In Testing | Resolved | Build broken. Basecamp has no such facility.
  3. Once an issue is closed it becomes part of the archive, which is difficult to go through.
  4. Tracking the source code changes associated with each task is really important to us. I used a third-party tool, Zapier, but it was too inconvenient and inconsistent, as there was no single place where I could manage all these things.
  5. Linking tasks is not possible, as far as I understood Basecamp. Most of the time we needed to attach or link two issues with each other.
  6. The documentation side is also not that good: very few formatting tools, no support for code snippets in discussions, and no way to attach a document page (created within Basecamp) to an existing task. (It could be achieved with copy and paste, but that is not what one expects.)

After facing these issues, I started searching for a better solution and came across the Atlassian products. I was really impressed by two of them, JIRA and Confluence. So I started a trial account, and bang!! I am so impressed that we are almost sure we are going to continue with them. There are a lot of features still to be discovered, but a few of the most obvious and appealing points are:

  1. They know their target audience and hence provide all the features required for task management, like task workflows, task types, task linking etc.
  2. Customization of the issue workflow is amazing. We need an In Testing stage to show that a fix is ready for the issue and is under testing; we achieved that by customizing the workflow.
  3. We have our source code repositories on GitHub. JIRA provides a way to connect to my repositories on GitHub (and also on Bitbucket) and link each of my commits to issues. That gives me all the code changes associated with an issue on a single JIRA issue page. That is awesome for us!!
  4. Confluence is an amazing wiki replacement for us. We used MediaWiki for some time but were not too happy with it. Confluence looks amazing and integrates seamlessly with JIRA, so a task has its wiki page associated with it. I love it!!
  5. Sharing of Spaces (that is what they call them) is also a nice feature to have. Not all information can be disclosed to all people, so this provides a nice way of keeping things private.
  6. They also provide nice analysis of issues, deadlines and combinations of information, which is not very exciting but is useful.

I am still exploring these two products. But for sure, if you are a software firm and confused about which tool to use, I would recommend going for the Atlassian products. At least try their trial period; 99% of you won't regret it.

PS: This post contains my personal opinions and reflects the status of the products at the time I was writing it.