The Magic behind Burp and ZAP and other Proxies

If you build web applications and care about security, you have probably used the Burp and ZAP proxy security tools. These tools perform dynamic analysis on live web applications to identify security vulnerabilities. Burp and ZAP can discover issues with your applications as you navigate through them via a browser. Essentially, they are configured as a “man in the middle” to intercept all traffic between your browser and the web application. Have you ever wondered how it is possible to intercept encrypted traffic over HTTPS? This article explains how it is done and provides a basic framework for creating your own proxy software.

To get started with Burp and ZAP (from now on, I’ll refer to these as simply the “proxy”), you have to decide what port you want the proxy to listen on and configure your browser to use that port as a proxy. In Firefox, to use port 9000, your configuration might look like the following:

Note that the Use this proxy server for all protocols box should be checked. Your browser is now ready to send and receive data through a proxy. Let’s now put together some code to handle these browser requests.

First, we’ll need to set up a ServerSocket to listen for requests:

ServerSocketFactory serverSocketFactory = ServerSocketFactory.getDefault();
proxyServerSocket = serverSocketFactory.createServerSocket(LOCAL_PORT, 10000, InetAddress.getByName(LOCAL_INTERFACE));
ServerSocketHandler handler = new ServerSocketHandler(proxyServerSocket, false);

Here we use a ServerSocketHandler thread to handle the requests:

class ServerSocketHandler extends Thread {
    ServerSocket serverSocket;
    boolean secure;

    ServerSocketHandler(ServerSocket serverSocket, boolean secure) {
        this.serverSocket = serverSocket;
        this.secure = secure;
    }

    public void run() {
        try {
            while (true) {
                Socket socket = serverSocket.accept();
                RequestHandler handler = new RequestHandler(socket);
                handler.start();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This thread loops indefinitely. When it receives a request, it creates a RequestHandler to process it. The first thing the RequestHandler does is read the request:

String line;
boolean first = true;
while ( (line = br.readLine()) != null) {

    if (first) {
        first = false;
        String chunks[] = line.split(" ");
        method = chunks[0];
        urlText = chunks[1];
    }

    buffer.append(line + "\r\n");
    if (line.equals(""))
        break;
}

In the code that follows, there is an if statement that determines whether the request is proxied over HTTPS, followed by an else clause. Let’s start with the else clause, which shows how non-HTTPS traffic is handled:

else if (urlText != null) {
    System.out.println("ATTEMPTING URL: " + urlText);
    URL url = new URL(urlText);

    Request request = new Request(Message.parseMessage(new ByteArrayInputStream(buffer.toString().getBytes()), true));

    SocketFactory socketFactory = SocketFactory.getDefault();
    int port = url.getPort();
    Socket targetSocket = socketFactory.createSocket(url.getHost(), port < 0 ? 80 : port);
    OutputStream targetOs = targetSocket.getOutputStream();
    targetOs.write(buffer.toString().getBytes());
    targetOs.flush();

    InputStream targetIs = targetSocket.getInputStream();
    Response response = new Response(Message.parseMessage(targetIs, false));
    final RequestResponse rr = new RequestResponse(request, response);
    Thread thread = new Thread() {
        public void run() {
            try { DataManager.INSTANCE.add(rr); }
            catch (IOException e) { e.printStackTrace(); }
        }
    };
    thread.start();

    // finally, send the response back to the browser (os is the client socket's output stream)
    os.write(response.getBytes());
    os.flush();
}

In this block of code, we first create a Request object by parsing the bytes from the buffer object, which contains the contents of the HTTP request. The Request object stores an HTTP request: its contents are used to forward the original request to the target host, and to record the original request for our own purposes (just like Burp and ZAP do).

Next we create a socket using the target host address and send the request. We then read the response from the target host and parse that into a Response object. Now that we have a Request and Response, we can create a RequestResponse object and store it indefinitely using the DataManager class. We create a new thread to perform the actual storage because we don’t want to wait for this task to complete before sending the response back to the client. Finally we send the response back to the client.
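To make the forwarding logic concrete, here is a small, self-contained sketch of how a proxy can derive the target host and port from the first line of a proxied HTTP request. The class and method names here are my own for illustration and are not part of the proxy’s source:

```java
import java.net.URL;

// Hypothetical helper: derives the target host and port from the
// request line of a proxied HTTP request (e.g. "GET http://... HTTP/1.1").
public class RequestLineParser {

    // Holds the pieces the proxy needs to open the target socket.
    public static class Target {
        public final String method;
        public final String host;
        public final int port;

        Target(String method, String host, int port) {
            this.method = method;
            this.host = host;
            this.port = port;
        }
    }

    public static Target parse(String requestLine) throws Exception {
        String[] chunks = requestLine.split(" ");
        String method = chunks[0];
        URL url = new URL(chunks[1]);
        int port = url.getPort();
        // A proxied HTTP request carries an absolute URL; default to port 80
        return new Target(method, url.getHost(), port < 0 ? 80 : port);
    }

    public static void main(String[] args) throws Exception {
        Target t = parse("GET http://example.com:8080/index.html HTTP/1.1");
        System.out.println(t.method + " " + t.host + ":" + t.port);
    }
}
```

This mirrors the request-line split shown earlier: the proxy sees an absolute URL in the request line precisely because the browser knows it is talking to a proxy.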

There should be no major surprises in what we have discussed thus far. Just as you might imagine, a proxy intercepts a request and records it. It forwards the request to the target host and records the response before sending it back to the client. Proxying really only gets interesting when we start to examine what happens in an HTTPS request. You might think that the exact same process happens, except that it uses HTTPS. If you weren’t interested in capturing the unencrypted data, there would be some truth to this, but otherwise it gets quite complicated. You must consider that a client wishing to exchange encrypted data with a host expects to receive a certificate for that host, signed by a trusted Certificate Authority. If you have used Burp or ZAP before, then you probably already know that they generate their own certificates using their own Certificate Authority certificate, which must also be installed as a trusted certificate in your browser. If that wasn’t clear, here it is in bullet form:

1. In PKI encryption, senders and receivers of encrypted data use certificates to identify themselves. Each certificate must be issued by a Certificate Authority. Certificate Authorities have certificates pre-installed in your browser and are automatically trusted.

2. Our proxy can’t impersonate hosts using real certificates because it doesn’t have access to the Certificate Authorities’ private keys. Hence, we create our own Certificate Authority certificate, which we can use to issue our own certificates.

3. Our homegrown Certificate Authority certificate must be installed in the browser so that the browser trusts the site certificates we issue.

Our first problem to tackle is generating a Certificate Authority certificate. I put together a utility class for this called CreateCertificateAuthorityUtil. It must be run once, independently of the Proxy, to create the certificate. It has a main method, which invokes the createAndExportSelfSignedCertificateTo(…) method. Here they are:

public static void main(String[] args) throws Exception {
  createAndExportSelfSignedCertificateTo(CertUtil.KEY_PAIR_GENERATOR.generateKeyPair(), CA_KEYSTORE_FILE, CA_CERT_FILE);
}

public static X509Certificate createAndExportSelfSignedCertificateTo(KeyPair caKeyPair, String keyStoreLocation, String derFileLocation) throws Exception {
  X509Certificate caCert = CertUtil.createCertificate(CA_X500_NAME, CA_X500_NAME, caKeyPair.getPublic(), caKeyPair.getPrivate(), true);
  KeyStore ks = KeyStore.getInstance("JKS");
  ks.load(null, CA_KEYSTORE_PASSWORD.toCharArray());
  ks.setKeyEntry(CA_KEY_ALIAS, caKeyPair.getPrivate(), CA_KEYSTORE_PASSWORD.toCharArray(), new X509Certificate[]{caCert});

  try (PrintStream ps = new PrintStream(derFileLocation);
    FileOutputStream fos = new FileOutputStream(keyStoreLocation)) {
    CertUtil.printCertificateTo(caCert, ps);
    ks.store(fos, CA_KEYSTORE_PASSWORD.toCharArray());
  }

  return caCert;
}

These methods reference some class-scoped variables, some of which are initialized in a static initializer block:

static final File TMP_DIRECTORY = new File(System.getProperty("java.io.tmpdir"));
static final File DEFAULT_CA_KEY_STORE_FILE = new File(TMP_DIRECTORY, "cakeystore.ks");
static final File DEFAULT_CA_CERT_FILE = new File(TMP_DIRECTORY, "cacert.der");

static final String CA_KEY_STORE_FILE_PROPERTY_NAME = "ca-keystore-file";
static final String CA_CERT_FILE_PROPERTY_NAME = "ca-cert-file";

public static final String CA_KEYSTORE_PASSWORD = "password";
public static final String CA_KEY_ALIAS = "ca_alias";

public static final String X500_CN_COMMON_NAME_VALUE = "Fake Certificate Authority Inc.";
public static final String X500_OU_ORGANIZATIONAL_UNIT_VALUE = "Fake Certificate Authority OrgUnit";
public static final String X500_O_ORGANIZATIONAL_VALUE = "Fake Certificate Authority Org";
public static final String X500_L_LOCALITY_VALUE = "Fake Certificate Authority City";
public static final String X500_ST_STATE_PROVINCE_NAME_VALUE = "Fake Certificate Authority State";
public static final String X500_C_COUNTRY_NAME_VALUE = "Fake Certificate Authority Country";

public static final String CA_KEYSTORE_FILE;
public static final String CA_CERT_FILE;

public static final X500Name CA_X500_NAME;

static {
  String value = System.getProperty(CA_KEY_STORE_FILE_PROPERTY_NAME);
  CA_KEYSTORE_FILE = value == null ? DEFAULT_CA_KEY_STORE_FILE.getAbsolutePath() : value;
  value = System.getProperty(CA_CERT_FILE_PROPERTY_NAME);
  CA_CERT_FILE = value == null ? DEFAULT_CA_CERT_FILE.getAbsolutePath() : value;

  try {
    CA_X500_NAME = new X500Name(X500_CN_COMMON_NAME_VALUE, X500_OU_ORGANIZATIONAL_UNIT_VALUE, X500_O_ORGANIZATIONAL_VALUE, X500_L_LOCALITY_VALUE, X500_ST_STATE_PROVINCE_NAME_VALUE, X500_C_COUNTRY_NAME_VALUE);
  }
  catch (IOException e) {
    throw new Error("Couldn't create Certificate Authority X500 Name", e);
  }

  System.out.println("CA Keystore file: " + CA_KEYSTORE_FILE + ", and CA Cert file: " + CA_CERT_FILE);
  System.out.println("CA X500 Name: " + CA_X500_NAME);
}

The static initializer sets up the CA_KEYSTORE_FILE, CA_CERT_FILE, and CA_X500_NAME constants. Because they are static, they can be referenced by the proxy. The proxy uses CA_KEYSTORE_FILE to get the Certificate Authority certificate’s private key, which is used to create the certificate sent to the browser to impersonate the real host. It also uses CA_X500_NAME to create the certificate.

Let’s now take a closer look at the createAndExportSelfSignedCertificateTo(…) method. The bulk of the code creates a KeyStore and stores the Certificate Authority certificate. However, the first line creates the Certificate Authority certificate by calling a method in the CertUtil utility class, createCertificate(…). This method can be used to create Certificate Authority certificates or SSL certificates. It has a boolean isCertificateAuthority parameter to specify which type. Here is the code:

public static X509Certificate createCertificate(X500Name name, X500Name issuerName, PublicKey publicKey, PrivateKey signerPrivateKey, boolean isCertificateAuthority) throws Exception {
  X509CertInfo certInfo = new X509CertInfo();
  certInfo.set(X509CertInfo.SERIAL_NUMBER, new CertificateSerialNumber(new BigInteger(64, new SecureRandom())));
  certInfo.set(X509CertInfo.VERSION, new CertificateVersion(CertificateVersion.V3));

  // Validity
  Date validFrom = new Date();
  Date validTo = new Date(validFrom.getTime() + ONE_YEAR);
  certInfo.set(X509CertInfo.VALIDITY, new CertificateValidity(validFrom, validTo));

  boolean justName = isJavaAtLeast(1.8);

  if (justName) {
    certInfo.set(X509CertInfo.SUBJECT, name);
    certInfo.set(X509CertInfo.ISSUER, issuerName);
  } else {
    certInfo.set(X509CertInfo.SUBJECT, new CertificateSubjectName(name));
    certInfo.set(X509CertInfo.ISSUER, new CertificateIssuerName(issuerName));
  }

  if (isCertificateAuthority) {
    CertificateExtensions ext = new CertificateExtensions();
    ext.set(BasicConstraintsExtension.NAME, new BasicConstraintsExtension(Boolean.TRUE, Boolean.TRUE, 0));
    certInfo.set(X509CertInfo.EXTENSIONS, ext);
  }

  // Key and algorithm
  certInfo.set(X509CertInfo.KEY, new CertificateX509Key(publicKey));
  AlgorithmId algorithm = new AlgorithmId(AlgorithmId.sha1WithRSAEncryption_oid);
  certInfo.set(X509CertInfo.ALGORITHM_ID, new CertificateAlgorithmId(algorithm));

  // Create a new certificate and sign it
  X509CertImpl cert = new X509CertImpl(certInfo);
  cert.sign(signerPrivateKey, SHA1WITHRSA);

  return cert;
}

Here is an overview of what it does:

  1. Creates an X509CertInfo object and gives it a serial and version number
  2. Defines the period for which it is valid
  3. Sets the subject and issuer
  4. Sets appropriate properties if creating a Certificate Authority
  5. Sets the public key and algorithm
  6. Signs it
If we now jump back to the createAndExportSelfSignedCertificateTo(…) method, you’ll note it not only stores the certificate in the KeyStore, but also writes the certificate out to disk. This is so it can easily be imported by a browser. By default, the certificate is written to <>/cacert.der. Let’s now import it into Firefox.

First, click the Open Menu button and select Options:

Next, search for “Certificate” and click View Certificates:

In the result dialog, click the Authorities tab and then click the Import… button:

A file browser dialog will appear. Now select the <>/cacert.der file you created earlier. The following dialog will appear:

Click the first check box and click the OK button. You have now imported the Certificate Authority certificate into your browser.

We can now get back to understanding the proxy code, but first, let’s consider how a proxy could process an HTTPS request if the request is encrypted. Specifically, if HTTPS functioned the same way as HTTP, the proxy wouldn’t know the target address because it is embedded in an encrypted HTTP request. To solve this problem, HTTPS proxying uses a variation on HTTP: instead of embedding the target host and port in an encrypted request, the browser first sends a plain-text CONNECT request that names the target host and port. The proxy acknowledges the CONNECT request with a 200 response code. Here is the relevant code:

if ("connect".equalsIgnoreCase(method)) {
    String connectResponse = "HTTP/1.0 200 Connection established\r\n" +
        "Proxy-agent: ProxyServer/1.0\r\n" +
        "\r\n";
    // os is the output stream of the socket connected to the browser
    os.write(connectResponse.getBytes());
    os.flush();
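For reference, the CONNECT exchange looks roughly like this on the wire (the hostname is illustrative); everything after the 200 response is TLS handshake and encrypted application data:

```
CONNECT example.com:443 HTTP/1.1
Host: example.com:443

HTTP/1.0 200 Connection established
Proxy-agent: ProxyServer/1.0
```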

The next step is to create an SSLSocketFactory object to communicate with the real host over SSL:

SSLSocketFactory secureSocketFactory = CertUtil.getTunnelSSLSocketFactory(url.getHost());

You will notice the getTunnelSSLSocketFactory(…) method takes a parameter of the target host. We need this information to set up a fake certificate. Here is the code:

public static SSLSocketFactory getTunnelSSLSocketFactory(String hostname) throws Exception {
  SSLSocketFactory factory = factoryMap.get(hostname);
  if (factory != null)
    return factory;

  try {
      SSLContext ctx = SSLContext.getInstance("TLS");
      KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
      KeyStore ks = generateHostKeystore(hostname);

      kmf.init(ks, PASSWORD.toCharArray());
      SecureRandom random = new SecureRandom();
      ctx.init(kmf.getKeyManagers(), null, random);
      SSLSocketFactory tunnelSSLFactory = ctx.getSocketFactory();

      factoryMap.put(hostname, tunnelSSLFactory);

      return tunnelSSLFactory;
  } catch (Exception e) {
     throw new RuntimeException(e);
  }
}
The first line checks to see if we have set up an SSLSocketFactory for this host already. If so, we re-use it. The rest of this code is standard code for setting up the factory except for this line:

KeyStore ks = generateHostKeystore(hostname);

generateHostKeystore(…) is another utility method in CertUtil, which we need to set up the SSLSocketFactory. Here is the code:

public synchronized static KeyStore generateHostKeystore(String host) throws Exception {
  try (FileInputStream fis = new FileInputStream(CreateCertificateAuthorityUtil.CA_KEYSTORE_FILE)) {
    KeyStore caKeyStore = KeyStore.getInstance("JKS");
    caKeyStore.load(fis, CreateCertificateAuthorityUtil.CA_KEYSTORE_PASSWORD.toCharArray());

    PrivateKey caPrivateKey = (PrivateKey) caKeyStore.getKey(CreateCertificateAuthorityUtil.CA_KEY_ALIAS, CreateCertificateAuthorityUtil.CA_KEYSTORE_PASSWORD.toCharArray());
    KeyPair hostKeyPair = KEY_PAIR_GENERATOR.generateKeyPair();
    X500Name hostName = createName(host, X500_HOST_OU_ORGANIZATIONAL_UNIT_VALUE, X500_HOST_O_ORGANIZATIONAL_VALUE, X500_HOST_L_LOCALITY_VALUE, X500_HOST_ST_STATE_PROVINCE_NAME_VALUE, X500_HOST_C_COUNTRY_NAME_VALUE);
    X509Certificate cert = createCertificate(hostName, CreateCertificateAuthorityUtil.CA_X500_NAME, hostKeyPair.getPublic(), caPrivateKey, false);
    PrivateKey privateKey = hostKeyPair.getPrivate();

    KeyStore hostKeyStore = KeyStore.getInstance("JKS");
    hostKeyStore.load(null, PASSWORD.toCharArray());
    hostKeyStore.setKeyEntry(HOST_ALIAS, privateKey, PASSWORD.toCharArray(), new X509Certificate[] { cert });

    return hostKeyStore;
  }
}

Here is an overview of the code:

  1. Extract the Certificate Authority private key from the Certificate Authority keystore
  2. Generate a new KeyPair for the new SSL certificate
  3. Create the X500Name for the host
  4. Create the host certificate using the CertUtil.createCertificate(…) method
  5. Store the host certificate in an in-memory KeyStore
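One detail worth calling out from step 5: passing null as the input stream to KeyStore.load(…) initializes a new, empty, in-memory keystore, which is exactly how the host keystore above avoids touching the disk. A minimal sketch demonstrating just that API behavior:

```java
import java.security.KeyStore;

// Demonstrates that KeyStore.load(null, password) creates an
// empty in-memory keystore (no file on disk is involved).
public class InMemoryKeystoreDemo {

    public static KeyStore createEmptyKeystore(char[] password) throws Exception {
        KeyStore ks = KeyStore.getInstance("JKS");
        ks.load(null, password);   // null stream = start with an empty keystore
        return ks;
    }

    public static void main(String[] args) throws Exception {
        KeyStore ks = createEmptyKeystore("password".toCharArray());
        System.out.println("Entries: " + ks.size());
    }
}
```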

Let’s now go back to the proxy code. After we create the SSLSocketFactory, we have the following line:

SSLSocket sslSocket = (SSLSocket) secureSocketFactory.createSocket(socket, socket.getInetAddress().getHostAddress(), socket.getPort(), true);

This is a special version of the SSLSocketFactory.createSocket(…) method that takes a socket as a parameter. Essentially, it abstracts the encryption/decryption for us so we can read and write unencrypted data over the socket that the browser is connected to.

The next few lines configure the socket to act as the server side of the handshake (the proxy is the server from the browser’s perspective) and initiate the SSL handshake:

try {
    sslSocket.setUseClientMode(false);
    sslSocket.startHandshake();
} catch (Exception e) {
    System.err.println("Error on URL: " + urlText);
    throw new RuntimeException(e);
}

The rest of the code in this method is essentially the same as the code for handling HTTP in the else clause. You can now run the Proxy class and start proxying!

You should note that this code is only proof-of-concept quality and would need a vast amount of improvement before being production ready. For example, it only handles 200 response codes. It will throw exceptions for other response codes.

You may also want to note the following:

  1. The code uses restricted-access APIs from the JRE. In order to reference them in Eclipse, you will have to modify how Eclipse treats these APIs. Specifically, go to the Java Compiler > Errors/Warnings panel and change the Forbidden Reference (access rules) setting from Error to Warning or less.
  2. When testing your proxy code, it can be a little overwhelming to visit a real web page. Most web pages don’t generate a single request: with images, style sheets, and scripts, a single web page can generate many requests. It can be confusing to debug your proxy code when multiple requests are being created concurrently. To skirt this issue, I created a ProxyTest class, which uses HttpURLConnection to send a single request. Here is the code:
public static void main(String[] args) throws IOException {
    java.net.Proxy proxy = new java.net.Proxy(java.net.Proxy.Type.HTTP,
        new InetSocketAddress(Proxy.DEFAULT_LOCAL_INTERFACE, Proxy.DEFAULT_LOCAL_PORT));

//    URL url = new URL("");
//    URL url = new URL("");
//    URL url = new URL("");
    URL url = new URL("");

    HttpURLConnection c = (HttpURLConnection) url.openConnection(proxy);

    InputStream is = c.getInputStream();
    IOUtil.copy(is, System.out);
}

You should also be aware that ProxyTest calls HttpURLConnection.setFollowRedirects(false). Otherwise, a redirect response code would generate multiple requests, which can be confusing.

You can find the full source code for this project on GitHub and the javadocs here.


Practical Byte Code Engineering

Over the past few years, I have written a few blogs about how to use byte code engineering. My first article was a brief overview, while others discussed specific case studies. In hindsight, I think I overlooked covering the basic building blocks of byte code engineering: the Java agent and the Instrumentation API. Additionally, some downloadable, practical byte code engineering example projects might be helpful. This article aims to address both.

There are two main ways to instrument Java byte code. One way is to modify your target classes prior to run time and then adjust your classpath (and possibly boot classpath) accordingly to point to your instrumented classes. The other way is to instrument classes at run time: since Java 1.5, there is a specific Java API for instrumentation (among other things) called JVM TI (Tool Interface). JVM TI allows you to attach native or Java agents. This blog will focus only on Java agents (I tell people I prefer them for their platform portability, but the truth is my C programming skills are really rusty).

The Java agent deployment unit is a jar file. The jar file must have a manifest built to support agents. You can refer to the instrumentation package documentation for details for the manifest and other requirements, but here is the condensed version of the manifest attributes:

  • Premain-Class: Required to support attaching agents as the JVM is started. You can think of this class as similar to the class containing the “public static void main” method used to invoke any Java program.
  • Agent-Class: Required to support attaching agents dynamically after the JVM is already running. Again, you can think of this class as similar to the class containing “public static void main”.

You will probably want to define both properties and the values will probably point to the same class.

  • Boot-Class-Path: I never define this property. If I want to manipulate the bootclasspath, I prefer to do it using the Instrumentation API once the agent has loaded.
  • Can-Redefine-Classes: Indicates that this agent can call Instrumentation.redefineClasses(…). “Redefinition” replaces the definitions of classes that have already been loaded.
  • Can-Retransform-Classes: Indicates that this agent can call Instrumentation.retransformClasses(…). “Retransformation” re-runs the registered transformers on classes that have already been loaded.
  • Can-Set-Native-Method-Prefix: This blog is about non-native agents, but if interested, you can get a detailed description of this property in the Instrumentation.setNativeMethodPrefix method documentation.
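Putting these attributes together, a typical agent manifest (the class name below is a placeholder) might look like this:

```
Premain-Class: ca.discotek.example.MyAgent
Agent-Class: ca.discotek.example.MyAgent
Can-Redefine-Classes: true
Can-Retransform-Classes: true
```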

Next, let’s take a look at an agent “entry point” class. As specified in the java.lang.instrument package javadocs, the entry point method name must be either “premain” for agents that are attached when the JVM is initially invoked, or “agentmain” for agents that are attached after the JVM has started. Here are the valid method signatures:
  • public static void premain(String agentArgs, Instrumentation inst);
  • public static void premain(String agentArgs);
  • public static void agentmain(String agentArgs, Instrumentation inst);
  • public static void agentmain(String agentArgs);

The JVM will attempt to invoke the flavor with the Instrumentation parameter first and only invoke the other if the first doesn’t exist.
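A minimal entry-point class supporting both modes might delegate to a shared initialize method. This is a sketch with placeholder names, not code from the example projects:

```java
import java.lang.instrument.Instrumentation;

// Placeholder agent entry-point class: the same class serves both
// -javaagent startup attachment and dynamic attachment.
public class MyAgent {

    static Instrumentation instrumentation;
    static String lastArgs;

    // Invoked when the agent is attached at JVM startup via -javaagent.
    public static void premain(String agentArgs, Instrumentation inst) {
        initialize(agentArgs, inst);
    }

    // Invoked when the agent is attached to an already-running JVM.
    public static void agentmain(String agentArgs, Instrumentation inst) {
        initialize(agentArgs, inst);
    }

    static void initialize(String agentArgs, Instrumentation inst) {
        instrumentation = inst;
        lastArgs = agentArgs;
        System.out.println("Agent loaded with args: " + agentArgs);
    }
}
```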

To invoke an agent at JVM initialization time, you specify the following JVM parameter:

-javaagent:<path to agent jar>=<agent arguments>

Attaching an agent to a running JVM is a little more complicated. The Attach API‘s VirtualMachine.attach(String pid) method will allow one JVM to attach to another. Once attached, the VirtualMachine.loadAgent(…) methods can be used to load an agent into the JVM that you are attached to. You should note that the “pid” parameter of the attach method is the process ID of the target JVM. If you know the PID of the target JVM you could use the following (oversimplified) code to attach:

public static void main(String args[]) throws Exception {
    // args[0] is the target JVM's process ID; args[1] is the path to the agent jar
    VirtualMachine vm = VirtualMachine.attach(args[0]);
    vm.loadAgent(args[1]);
    vm.detach();
}

Assuming this method was in a class called ca.discotek.attachexample.Attacher, you would invoke this code like so:

java ca.discotek.attachexample.Attacher 1234 C:/agentdir/myagent.jar

Please note that the Attach API is part of tools.jar, which is only available in JDKs.

Getting back to building an agent class, if you want to do any class transformations, you will need to implement the java.lang.instrument.ClassFileTransformer interface, which defines the following method:

byte[] transform(ClassLoader loader,
                 String className,
                 Class<?> classBeingRedefined,
                 ProtectionDomain protectionDomain,
                 byte[] classfileBuffer)
                 throws IllegalClassFormatException;

This method returns a byte array containing the new byte code definition for the given class, or null to leave the class unchanged. Here are some notes regarding the parameters:

  1. loader: Will be null if the ClassLoader is the bootstrap ClassLoader, so don’t assume it will be non-null!
  2. className: Believe it or not, this value can be null sometimes, so don’t assume it will be non-null! Also, it will always use a forward slash as a package/class name separator (e.g. java/lang/String, not java.lang.String).
  3. classBeingRedefined: Will be null if the class is being loaded for the first time, so don’t assume it will be non-null!
  4. protectionDomain: I never use this parameter, so I don’t have anything to say about it.
  5. classfileBuffer: This parameter will never be null; it always contains the class file bytes being loaded or transformed.
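These caveats can be captured in a defensive transformer skeleton. The class below is an illustrative sketch, not code from the example projects:

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;

// Defensive ClassFileTransformer: tolerates null loader, className,
// and classBeingRedefined, and returns null to leave classes unchanged.
public class SafeTransformer implements ClassFileTransformer {

    public byte[] transform(ClassLoader loader, String className,
            Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
            byte[] classfileBuffer) throws IllegalClassFormatException {

        // className can be null (e.g. for some synthetic classes); bail out early
        if (className == null)
            return null;

        // Internal names use '/' separators; convert only for display
        String dotName = className.replace('/', '.');

        // A null loader means the bootstrap class loader, not an error
        String loaderName = loader == null ? "bootstrap" : loader.toString();
        System.out.println("Saw " + dotName + " via " + loaderName);

        // Returning null keeps the original class bytes
        return null;
    }
}
```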

Your agent class doesn’t have to be the class that implements this interface, but your agent class code will probably be responsible for activating any ClassFileTransformer implementation by calling instrumentation.addTransformer(…). This method comes in two flavors:

public void addTransformer(ClassFileTransformer transformer) 
public void addTransformer(ClassFileTransformer transformer, boolean canRetransform)

These methods register a ClassFileTransformer with the JVM. The second method has a boolean parameter, which flags the transformer as capable of retransforming classes that have already been loaded. A ClassFileTransformer added without this parameter set to true will still see classes as they are first loaded, but it will not participate when Instrumentation.retransformClasses(…) is called.

One of the main obstacles in front of developers wishing to learn more about agents is putting together a project, which does all of the above before getting to the fun part of implementing some agent functionality. I have put together the following projects to help smooth over these bumps. Please note, resources for these projects can be downloaded from the Practical Byte Code Engineering download page.

agent-example-0-basic [Download]

Builds an agent jar which just prints all the class names and their class loaders as they are loaded:

    public byte[] transform(ClassLoader loader, String className,
            Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
            byte[] classfileBuffer) throws IllegalClassFormatException {

        System.out.println("Basic Agent: " + className + " : " + loader);

        return null;
    }

To see it in action

  1. Run the default task of the build.xml ANT script at the root of the project.
  2. Run the BasicTest program (under test in the root of project) with the following JVM parameter: -javaagent:<path to project>/dist/myagent.jar

agent-example-1-attach [Download]

This project doesn’t have a lot of code, but it is somewhat complicated. It demonstrates how you can attach an agent to a running JVM. Let’s first look at the agent class. We are attaching to a running JVM, so we only need the agentmain method in our ca.discotek.agent.example.attach.MyAgent agent class:

    public static void agentmain(String agentArgs, Instrumentation inst) {
        initialize(agentArgs, inst, false);
    }

Let’s now look at the initialize method:

    public static void initialize(String agentArgs, Instrumentation inst, boolean isPremain) {
        MyAgent.instrumentation = inst;
        inst.addTransformer(new MyClassFileTransformer(), true);

        Runnable r = new Runnable() {
            public void run() {
                while (true) {
                    try {
                        Class classes[] = instrumentation.getAllLoadedClasses();
                        for (int i=0; i<classes.length; i++) {
                            if (classes[i].getName().equals("ca.discotek.agent.example.attach.test.AttachTest")) {
                                System.out.println("Reloading: " + classes[i].getName());
                                instrumentation.retransformClasses(classes[i]);
                            }
                        }
                        Thread.sleep(1000);
                    }
                    catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }
        };

        Thread t = new Thread(r);
        t.start();
    }

The initialize method does two main things:

  1. The second line registers a new instance of MyClassFileTransformer as a ClassFileTransformer with the JVM using the Instrumentation.addTransformer(…) method.
  2. Creates some looping code in a thread which will continually schedule the AttachTest class to be retransformed.

Next we have the ca.discotek.agent.example.attach.MyClassFileTransformer class. It implements the ClassFileTransformer interface and has the following transform method:

    public byte[] transform(ClassLoader loader, String className,
            Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
            byte[] classfileBuffer) throws IllegalClassFormatException {

        if (className != null && className.startsWith("ca/discotek/"))
            System.out.println("Attach Agent: " + className);

        return null;
    }

This method doesn’t do much except print out the name of any classes being loaded or transformed that start with ca/discotek/. This isn’t particularly interesting, but it demonstrates that the MyAgent agent was attached and successfully added the MyClassFileTransformer, which is invoked as classes are continually retransformed.

We also have a ca.discotek.agent.example.attach.Attacher class with a single method:

    public static void main(String[] args) throws Exception {
        VirtualMachine vm = VirtualMachine.attach(args[0]);
        vm.loadAgent(args[1]);
        vm.detach();
    }

This class takes the PID of a Java process as its first parameter and the path to an agent jar as its second parameter.

Lastly, we have ca.discotek.agent.example.attach.test.AttachTest, which is just used to create a JVM to demonstrate the attach/agent functionality:

    public static void main(String[] args) throws Exception {
        while (true) {
            Thread.sleep(1000);
        }
    }

To see this agent in action

  1. Run the default task of the build.xml ANT script at the root of the project.
  2. Run the AttachTest program. You can run this from your IDE or the command line.
  3. Run the Attacher program (same package as MyAgent) with the following parameters: <pid> <path to project>/dist/myagent.jar, where you will need to determine the pid using tools like jps, jconsole, or your operating system’s process manager. Remember, you will need to have the JDK’s tools.jar in your classpath. The command line will look something like this:
java -classpath .../discotek-agent-example-1-attach/bin;/java/jdkx.y.z/lib/tools.jar ca.discotek.agent.example.attach.Attacher 16948 .../discotek-agent-example-1-attach/dist/myagent.jar

You can confirm that you have correctly attached to the running JVM when the MyClassFileTransformer class is printing the following repeatedly:

Reloading: ca.discotek.agent.example.attach.test.AttachTest
Attach Agent: ca/discotek/agent/example/attach/test/AttachTest

agent-example-2-access-javassist [Download]

Builds an agent jar that uses Javassist to change the access modifiers of a test class from private to public. Its ClassFileTransformer.transform(…) implementation is as follows:

    public byte[] transform(ClassLoader loader, String className,
            Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
            byte[] classfileBuffer) throws IllegalClassFormatException {

        // className can be null, so check before using it
        if (className == null)
            return null;

        String dotName = className.replace('/', '.');
        if (transformPattern.matcher(dotName).matches()) {

            try {
                ClassPool pool = ClassPool.getDefault();
                pool.appendClassPath(new ByteArrayClassPath(dotName, classfileBuffer));
                CtClass ctClass = pool.get(dotName);
                int modifiers = ctClass.getModifiers();
                ctClass.setModifiers(Modifier.setPublic(modifiers));

                CtField ctField = ctClass.getDeclaredField("privateMessage");
                modifiers = ctField.getModifiers();
                ctField.setModifiers(Modifier.setPublic(modifiers));

                CtConstructor ctConstructor = ctClass.getDeclaredConstructor(new CtClass[]{});
                modifiers = ctConstructor.getModifiers();
                ctConstructor.setModifiers(Modifier.setPublic(modifiers));

                CtMethod ctMethod = ctClass.getDeclaredMethod("printMessage");
                modifiers = ctMethod.getModifiers();
                ctMethod.setModifiers(Modifier.setPublic(modifiers));

                return ctClass.toBytecode();
            }
            catch (Exception e) {
                throw new RuntimeException("Bug", e);
            }
        }

        return null;
    }

The project also has a test source folder, which contains classes for testing the agent. These classes are:

ca.discotek.agent.example.access.javassist.testee.PrivateTest, which has the following members:

    private String privateMessage = "This is a private message";

    private PrivateTest() {
        System.out.println("Private Constructor");
    }

    private void printMessage() {
        System.out.println("This is a private method");
    }
This class has a private field, a private constructor, and a private method. The class itself is also non-public.
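To see why the agent is needed at all, note that plain reflection refuses such access by default. The following sketch (PrivateExample is a hypothetical stand-in for PrivateTest, not code from the project) shows that Class.getField(…) cannot see a private field unless something has rewritten its access flags:

```java
import java.lang.reflect.Field;

class ReflectionAccessDemo {

    // Hypothetical stand-in for PrivateTest: a class with a private field.
    static class PrivateExample {
        private String privateMessage = "secret";
    }

    // getField(..) only returns public fields, so this reports false
    // unless an agent has rewritten the access flags to public.
    static boolean canSeeField() {
        try {
            Field f = PrivateExample.class.getField("privateMessage");
            return f != null;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Field visible via getField? " + canSeeField());
    }
}
```

Run without the agent, this prints false; with the access flags rewritten to public, getField(…) would succeed.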

The test source folder also has class ca.discotek.agent.example.access.javassist.tester.AccessJavassistTest with main method:

    public static void main(String[] args) throws Exception {

        Class c = null;
        try {
            c = Class.forName("ca.discotek.agent.example.access.javassist.testee.PrivateTest");
            System.out.println("Class is public? " + Modifier.isPublic(c.getModifiers()) );
        }
        catch (Throwable t) {
            t.printStackTrace();
        }

        Object o = null;
        try {
            Constructor constructor = c.getDeclaredConstructor(new Class[]{});
            System.out.println("Constructor is public? " + Modifier.isPublic(constructor.getModifiers()) );
            o = constructor.newInstance(new Object[]{});
        }
        catch (Throwable t) {
            t.printStackTrace();
        }

        try {
            Field field = c.getField("privateMessage");
            System.out.println("Field is public? " + Modifier.isPublic(field.getModifiers()) );
            Object value  = field.get(o);
            System.out.println("Field value: " +  value);
        }
        catch (Throwable t) {
            t.printStackTrace();
        }

        try {
            Method method = c.getMethod("printMessage", new Class[]{});
            System.out.println("Method is public? " + Modifier.isPublic(method.getModifiers()) );
            method.invoke(o, new Object[]{});
        }
        catch (Throwable t) {
            t.printStackTrace();
        }
    }

This method will test if any of the private entities in the PrivateTest class are accessible.

To see this agent in action:

  1. Run the default task of the build.xml ANT script at the root of the project.
  2. Run the ca.discotek.agent.example.access.javassist.tester.AccessJavassistTest program. You can run this from your IDE or the command line. You will need to add the -javaagent parameter. Here is what it might look like:
java -javaagent:.../discotek-agent-example-2-access-javassist/dist/agent-access-javassist.jar=.*discotek.*test.* -classpath .../discotek-agent-example-2-access-javassist/bin;.../discotek-agent-example-2-access-javassist/lib/javassist.jar ca.discotek.agent.example.access.javassist.tester.AccessJavassistTest

This assumes your IDE compiles classes into a bin directory at the root of your project (otherwise the -classpath in the example above would be incorrect).

agent-example-3-access-asm [Download]

Builds an agent jar that uses ASM to change the access modifiers of a test class from private to public. Its ClassFileTransformer.transform() implementation is as follows:

    public byte[] transform(ClassLoader loader, String className,
            Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
            byte[] classfileBuffer) throws IllegalClassFormatException {

        if (className != null && transformPattern.matcher(className.replace('/', '.')).matches()) {
            ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_MAXS);
            AccessClassVisitor accessClassVisitor = new AccessClassVisitor(cw);
            ClassReader cr = new ClassReader(classfileBuffer);
            cr.accept(accessClassVisitor, ClassReader.SKIP_FRAMES);

            return cw.toByteArray();
        }

        return null;
    }

We also have AccessClassVisitor, which performs the byte code modifications:

public class AccessClassVisitor extends ClassVisitor {

    static int convertToPublicAccess(int access) {
        access &= ~Opcodes.ACC_PRIVATE;
        access &= ~Opcodes.ACC_PROTECTED;
        access |= Opcodes.ACC_PUBLIC;
        return access;
    }

    public AccessClassVisitor(ClassVisitor cv) {
        super(Opcodes.ASM5, cv);
    }

    public void visit(int version, int access, String name, String signature, String superName, String[] interfaces) {
        super.visit(version, convertToPublicAccess(access), name, signature, superName, interfaces);
    }

    public MethodVisitor visitMethod(int access,
            String name,
            String desc,
            String signature,
            String[] exceptions) {
        return super.visitMethod(convertToPublicAccess(access), name, desc, signature, exceptions);
    }

    public FieldVisitor visitField(int access,
            String name,
            String desc,
            String signature,
            Object value) {
        return super.visitField(convertToPublicAccess(access), name, desc, signature, value);
    }
}
To see this agent in action:

  1. Run the default task of the build.xml ANT script at the root of the project.
  2. Run the ca.discotek.agent.example.access.asm.tester.AccessAsmTest program. You can run this from your IDE or the command line. You will need to add the -javaagent parameter. Here is what it might look like:
java -noverify -javaagent:.../discotek-agent-example-3-access-asm/dist/agent-access-asm.jar=.*discotek.*test.* -classpath .../discotek-agent-example-3-access-asm/bin;.../discotek-agent-example-3-access-asm/lib/asm-5.0.4.jar ca.discotek.agent.example.access.asm.tester.AccessAsmTest

This assumes your IDE compiles classes into a bin directory at the root of your project (otherwise the -classpath in the example above would be incorrect).

Please note the use of the -noverify JVM parameter. This is the first time this parameter has been introduced in these projects. If you are using a modern JVM, it is most likely required to avoid the JVM’s rules regarding stack map frames (explained here). Adding -noverify simply bypasses the byte code verifier, which avoids these rules. You should be careful about using -noverify in a production environment.

agent-example-4-asm-internal-types [Download]

This next project does not build an agent. It builds a program that can help anyone struggling to specify JVM internal descriptors correctly. Understanding JVM internal descriptors is very important for byte code engineering, and they are used very frequently with ASM. They are used by Javassist as well, but not as frequently. The main Javassist example I can think of is CtClass’s getConstructor(String descriptor) and getMethod(String name, String descriptor) methods.
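As a quick refresher (plain JDK, no ASM required): field descriptors use single letters for primitives (I for int, J for long) and L<internal name>; for object types, and the JDK itself reports descriptors for array classes via Class.getName():

```java
class DescriptorDemo {
    public static void main(String[] args) {
        // Array classes report their element descriptors directly:
        System.out.println(int[].class.getName());     // "[I"
        System.out.println(long[].class.getName());    // "[J"
        System.out.println(String[].class.getName());  // "[Ljava.lang.String;"
        // A method descriptor strings field descriptors together, e.g.
        // long f(String, int) has descriptor (Ljava/lang/String;I)J
    }
}
```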

I won’t paste the relevant code here, but here is a screen shot of it in action. Here it has loaded the jar built by running the default build of the ANT script in the root of this project.

To see this code in action either:

  • Run the code directly from your IDE using entry point class ca.discotek.agent.example.internaltypes.InternalTypeConverterView.
    1. Run the default task of the build.xml ANT script at the root of the project.
    2. Then invoke the program using a command like: java -classpath …/discotek-agent-example-4-asm-internal-types/dist/internal-types.jar ca.discotek.agent.example.internaltypes.InternalTypeConverterView

agent-example-5-objectsize-asm [Download]

Instead of byte code transformations, this next project uses Instrumentation’s getObjectSize(Object o) method to calculate the size of an arbitrary object in memory.

Here is a screen shot of the final product:

In the top half of the GUI there is a table with four columns:

  1. Type: There is a row for every basic Java type. Every class that you define will contain some combination of these types (possibly none).
  2. Count: The number of fields of the Type specified for a given row that would appear in an arbitrary class.
  3. Type Size: The size in bytes of the Type in each row.
  4. Calculated Subtotal: The calculated size that the types for a given row would consume (i.e. Count * Type Size).

Just below the table, there are Increment and Decrement buttons. These buttons will increase or decrease the number in the Count column for a given row.

At the bottom of the table, there is a text field which shows the calculated total object size and the size as returned by Instrumentation’s getObjectSize(Object o) method.
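The table’s arithmetic can be sketched in plain Java. The 8-byte object header and the per-type sizes below are assumptions for a typical HotSpot JVM (matching the calculated = 8 starting value in the project’s code); a real getObjectSize(…) result also reflects alignment padding:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ObjectSizeEstimate {

    static final int HEADER_SIZE = 8; // assumed object header size

    // Sum of Count * Type Size over every row, plus the header.
    static long estimate(Map<String, Integer> counts, Map<String, Integer> typeSizes) {
        long total = HEADER_SIZE;
        for (Map.Entry<String, Integer> e : counts.entrySet())
            total += (long) e.getValue() * typeSizes.get(e.getKey());
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> sizes = new LinkedHashMap<String, Integer>();
        sizes.put("boolean", 1);
        sizes.put("int", 4);
        sizes.put("long", 8);

        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        counts.put("boolean", 0);
        counts.put("int", 2);
        counts.put("long", 1);

        // 8 (header) + 0*1 + 2*4 + 1*8 = 24
        System.out.println("Calculated total: " + estimate(counts, sizes));
    }
}
```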

The code in the agent class is very simple:

    public static void premain(String agentArgs, Instrumentation inst) {
        initialize(agentArgs, inst, true);
    }

    public static void agentmain(String agentArgs, Instrumentation inst) {
        initialize(agentArgs, inst, false);
    }

    public static void initialize(String agentArgs, Instrumentation inst, boolean isPremain) {
        try {  ObjectSizeAnalyzerView.showObjectAnalyzerView(inst); }
        catch (Exception e) { e.printStackTrace(); }
    }

Most of the ObjectSizeAnalyzerView code isn’t very interesting either. However, it is worth noting that ASM is used to generate an object with the fields specified in the table. A single updateSize() method generates a new class with the specified fields and then instantiates an instance of that class. It then passes that object as a parameter to the Instrumentation.getObjectSize(Object o) method:

    void updateSize() {

        String className = "MyClass" + classIndex++;

        ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_MAXS);
        cw.visit(Opcodes.V1_6, Opcodes.ACC_PUBLIC, className, null, "java/lang/Object", null);

        // Generate a default constructor that calls super()
        Method m = Method.getMethod("void <init> ()");
        GeneratorAdapter mg = new GeneratorAdapter(Opcodes.ACC_PUBLIC, m, null, null, cw);
        mg.loadThis();
        mg.invokeConstructor(Type.getType(Object.class), m);
        mg.returnValue();
        mg.endMethod();

        // Add the fields specified in the table
        int fieldIndex = 0;
        Type type;
        int count;
        String desc;
        for (int i=0; i<TYPES.length; i++) {
            count = model.countList.get(i);
            for (int j=0; j<count; j++) {
                type = Type.getType(TYPE_CLASSES[i]);
                desc = getDescriptor(type);
                cw.visitField(Opcodes.ACC_PUBLIC, "field" + fieldIndex++, desc, null, null);
            }
        }

        MyClassLoader loader = new MyClassLoader(className, cw.toByteArray());
        try {
            Class c = loader.loadClass(className);
            currentObject = c.newInstance();
            long objectSize = instrumentation.getObjectSize(currentObject);
            StringBuilder buffer = new StringBuilder();
            long calculated = 8;
            Integer ints[] = model.countList.toArray(new Integer[model.countList.size()]);
            // ... (remainder elided: sums Count * Type Size into "calculated"
            // and updates the text field with both totals)
        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }

To see this agent in action:

  1. Run the default task of the build.xml ANT script at the root of the project.
  2. Run the ca.discotek.agent.example.objectsize.asm.test.ObjectSizeAsmTest program. You can run this from your IDE or the command line. You will need to add the -javaagent parameter. Here is what it might look like:
java -javaagent:.../discotek-agent-example-5-objectsize-asm/dist/agent-objectsize-asm.jar -classpath .../discotek-agent-example-5-objectsize-asm/bin;.../discotek-agent-example-5-objectsize-asm/lib/asm-5.0.4.jar;.../discotek-agent-example-5-objectsize-asm/lib/asm-commons-5.0.4.jar ca.discotek.agent.example.objectsize.asm.test.ObjectSizeAsmTest

agent-example-6-profile-javassist [Download]

This project produces an agent jar that demonstrates one of Javassist’s shortcomings. Specifically, if you add a local variable using CtBehavior’s insertBefore method, you cannot later reference it in code added using CtBehavior’s insertAfter method.
Javassist doesn’t know anything about the byte code you previously inserted. Let’s see what happens when we use Javassist to create an agent to instrument methods in order to profile execution time. The agent will instrument all methods in classes whose names match a regular expression provided in the agent arguments. Here is how the agent is initialized:

    public static void premain(String agentArgs, Instrumentation inst) {
        initialize(agentArgs, inst, true);
    }

    public static void agentmain(String agentArgs, Instrumentation inst) {
        initialize(agentArgs, inst, false);
    }

    static void initialize(String agentArgs, Instrumentation inst, boolean isPremain) {
        MyAgent.instrumentation = inst;
        inst.addTransformer(new MyAgent(Pattern.compile(agentArgs)), true);
    }

    public MyAgent(Pattern transformPattern) {
        this.transformPattern = transformPattern;
    }

The transform method filters out any unwanted classes and passes the byte code to an instrument(…) method for processing:

    void instrument(CtBehavior behaviour) throws CannotCompileException {
        String beforeCode = "long __start_time__ = System.currentTimeMillis();";
        behaviour.insertBefore(beforeCode);

        StringBuilder buffer = new StringBuilder();
        buffer.append("long __end_time__ = System.currentTimeMillis();");
        String method = "$class" + '.' + behaviour.getName() + behaviour.getSignature();
        buffer.append("System.out.println(\"Elapsed " + method + ": \" + (__end_time__ - __start_time__));");

        String afterCode = buffer.toString();
        behaviour.insertAfter(afterCode, true);

        behaviour.addLocalVariable("__start_time__", CtClass.longType);
        behaviour.addLocalVariable("__end_time__", CtClass.longType);
    }

Despite the fact this code will produce a run-time Javassist error, here are some points worth noting:

  • The instrument(…) method takes a CtBehavior object as a parameter. CtBehavior is the super class of both CtConstructor and CtMethod, so it can process either.
  • A constructor is a special type of method used to initialize an object. As such, there is a rule that no byte code instructions may invoke methods on the object under construction before the constructor’s call to super. Fortunately, Javassist’s CtBehavior.insertBefore is aware of this and will insert your code after the call to super.
  • You will notice that the names of the local variables inserted into each method (e.g. __start_time__, __end_time__) look a little strange. If you are inserting any construct into byte code that you are unfamiliar with, you need to make sure you don’t break rules about uniqueness. In this case, you can’t have two local variables with the same name. If we had used “start” as the variable name instead of “__start_time__”, a clash would be much more likely. I usually include “discotek” in the name of a variable to ensure uniqueness.

Lastly, here is our simple test client code:

public class ProfileJavassistTest {

    public static final void main(String[] args) throws Exception {
        System.out.println("Jon Snow, Ned Stark's bastard, likes to say \"Winter is coming.\"");
    }
}

To see this agent in action:

  1. Run the default task of the build.xml ANT script at the root of the project.
  2. Run the ca.discotek.agent.example.profile.javassist.test.ProfileJavassistTest program. You can run this from your IDE or the command line. You will need to add the -javaagent parameter. Here is what it might look like:
java -noverify -javaagent:.../discotek-agent-example-6-profile-javassist/dist/agent-profile-javassist.jar=.*discotek.*test.* -classpath .../discotek-agent-example-6-profile-javassist/bin;.../discotek-agent-example-6-profile-javassist/lib/javassist.jar ca.discotek.agent.example.profile.javassist.test.ProfileJavassistTest

Here is what the output will look like:

javassist.CannotCompileException: [source error] no such field: __start_time__
    at javassist.CtBehavior.insertAfter(CtBehavior.java)
    at ca.discotek.agent.example.profile.javassist.MyAgent.instrument(MyAgent.java)
    at ca.discotek.agent.example.profile.javassist.MyAgent.transform(MyAgent.java)
    at sun.instrument.TransformerManager.transform(Unknown Source)
    at sun.instrument.InstrumentationImpl.transform(Unknown Source)
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(Unknown Source)
    at java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.access$100(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.launcher.LauncherHelper.checkAndLoadMain(Unknown Source)
Caused by: compile error: no such field: __start_time__
    at javassist.compiler.TypeChecker.fieldAccess(TypeChecker.java)
    at javassist.compiler.TypeChecker.atFieldRead(TypeChecker.java)
    at javassist.compiler.TypeChecker.atMember(TypeChecker.java)
    at javassist.compiler.JvstTypeChecker.atMember(JvstTypeChecker.java)
    at javassist.compiler.ast.Member.accept(Member.java)
    at javassist.compiler.TypeChecker.atBinExpr(TypeChecker.java)
    at javassist.compiler.ast.BinExpr.accept(BinExpr.java)
    at javassist.compiler.TypeChecker.atPlusExpr(TypeChecker.java)
    at javassist.compiler.TypeChecker.atBinExpr(TypeChecker.java)
    at javassist.compiler.ast.BinExpr.accept(BinExpr.java)
    at javassist.compiler.JvstTypeChecker.atMethodArgs(JvstTypeChecker.java)
    at javassist.compiler.TypeChecker.atMethodCallCore(TypeChecker.java)
    at javassist.compiler.TypeChecker.atCallExpr(TypeChecker.java)
    at javassist.compiler.JvstTypeChecker.atCallExpr(JvstTypeChecker.java)
    at javassist.compiler.ast.CallExpr.accept(CallExpr.java)
    at javassist.compiler.CodeGen.doTypeCheck(CodeGen.java)
    at javassist.compiler.CodeGen.atStmnt(CodeGen.java)
    at javassist.compiler.ast.Stmnt.accept(Stmnt.java)
    at javassist.compiler.Javac.compileStmnt(Javac.java)
    at javassist.CtBehavior.insertAfterHandler(CtBehavior.java)
    at javassist.CtBehavior.insertAfter(CtBehavior.java)
    ... 17 more
Jon Snow, Ned Stark's bastard, likes to say "Winter is coming."

Here we see that when we tried to add code using the insertAfter method, it couldn’t find the __start_time__ local variable added via insertBefore. This is a colossal pain in the neck. The beauty of Javassist is that you can insert real Java code without understanding much about byte code, but this limitation undermines that convenience.

agent-example-7-profile-asm [Download]

This project produces an agent jar that demonstrates how you can add execution profiling code to byte code methods. The agent will instrument all methods in classes whose names match a regular expression provided in the agent arguments. Here is how the agent is initialized (exactly the same as the last project example):

    public static void premain(String agentArgs, Instrumentation inst) {
        initialize(agentArgs, inst, true);
    }

    public static void agentmain(String agentArgs, Instrumentation inst) {
        initialize(agentArgs, inst, false);
    }

    static void initialize(String agentArgs, Instrumentation inst, boolean isPremain) {
        MyAgent.instrumentation = inst;
        inst.addTransformer(new MyAgent(Pattern.compile(agentArgs)), true);
    }

    public MyAgent(Pattern transformPattern) {
        this.transformPattern = transformPattern;
    }

And here is the transform method:

    public byte[] transform(ClassLoader loader, String className,
            Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
            byte[] classfileBuffer) throws IllegalClassFormatException {

        if (className != null && transformPattern.matcher(className.replace('/', '.')).matches()) {
            ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_MAXS);
            ProfileClassVisitor profileClassVisitor = new ProfileClassVisitor(cw);
            ClassReader cr = new ClassReader(classfileBuffer);
            cr.accept(profileClassVisitor, ClassReader.SKIP_FRAMES);

            return cw.toByteArray();
        }

        return null;
    }

The transform method hands off the byte code processing to the ProfileClassVisitor class:

public class ProfileClassVisitor extends ClassVisitor {

    String classDotName;

    public ProfileClassVisitor(ClassVisitor cv) {
        super(Opcodes.ASM5, cv);
    }

    public void visit(int version, int access, String name, String signature, String superName, String[] interfaces) {
        super.visit(version, access, name, signature, superName, interfaces);
        this.classDotName = name.replace('/', '.');
    }

    public MethodVisitor visitMethod(int access,
            String name,
            String desc,
            String signature,
            String[] exceptions) {
        MethodVisitor mv = super.visitMethod(access, name, desc, signature, exceptions);
        return new ProfileMethodVisitor(mv, access, name, desc);
    }

    class ProfileMethodVisitor extends AdviceAdapter {

        String methodName = null;
        String desc = null;

        int startTimeVar = -1;

        Label timeStart = new Label();
        Label timeEnd = new Label();

        Label finallyStart = new Label();
        Label finallyEnd = new Label();

        String signature = null;

        public ProfileMethodVisitor(MethodVisitor mv, int access, String name, String desc) {
            super(Opcodes.ASM5, mv, access, name, desc);
            this.methodName = name;
            this.desc = desc;

            signature = classDotName + '.' + methodName + toParameterString(desc);
        }

        // Builds a readable parameter list, e.g. "(java.lang.String, int)"
        String toParameterString(String desc) {
            Type methodType = Type.getMethodType(desc);
            StringBuilder buffer = new StringBuilder();
            buffer.append('(');

            Type argTypes[] = methodType.getArgumentTypes();
            for (int i=0; i<argTypes.length; i++) {
                buffer.append(argTypes[i].getClassName());
                if (i<argTypes.length-1)
                    buffer.append(", ");
            }

            buffer.append(')');
            return buffer.toString();
        }

        public void visitCode() {
            super.visitCode();
            mv.visitLabel(timeStart);
            startTimeVar = newLocal(Type.LONG_TYPE);
            mv.visitMethodInsn(INVOKESTATIC, "java/lang/System", "currentTimeMillis", "()J", false);
            mv.visitVarInsn(LSTORE, startTimeVar);
            mv.visitLabel(finallyStart);
        }

        protected void onMethodExit(int opcode) {
            if (opcode != ATHROW)
                onFinally(opcode);
        }

        private void onFinally(int opcode) {
            // When entered from the exception handler, save the throwable
            // so it can be rethrown after the timing code runs.
            int throwableVarIndex = -1;
            if (opcode == ATHROW) {
                throwableVarIndex = newLocal(Type.getType(Throwable.class));
                mv.visitVarInsn(Opcodes.ASTORE, throwableVarIndex);
            }

            mv.visitFieldInsn(Opcodes.GETSTATIC, "java/lang/System", "out", "Ljava/io/PrintStream;");
            mv.visitLdcInsn(signature + ": ");
            mv.visitMethodInsn(Opcodes.INVOKEVIRTUAL, "java/io/PrintStream", "print", "(Ljava/lang/String;)V", false);

            mv.visitFieldInsn(Opcodes.GETSTATIC, "java/lang/System", "out", "Ljava/io/PrintStream;");
            mv.visitMethodInsn(INVOKESTATIC, "java/lang/System", "currentTimeMillis", "()J", false);
            mv.visitVarInsn(Opcodes.LLOAD, startTimeVar);
            mv.visitInsn(Opcodes.LSUB);
            mv.visitMethodInsn(Opcodes.INVOKEVIRTUAL, "java/io/PrintStream", "print", "(J)V", false);

            mv.visitFieldInsn(Opcodes.GETSTATIC, "java/lang/System", "out", "Ljava/io/PrintStream;");
            mv.visitLdcInsn("ms");
            mv.visitMethodInsn(Opcodes.INVOKEVIRTUAL, "java/io/PrintStream", "println", "(Ljava/lang/String;)V", false);

            if (opcode == ATHROW)
                mv.visitVarInsn(Opcodes.ALOAD, throwableVarIndex);
        }

        public void visitMaxs(int stack, int locals) {
            Label endFinally = new Label();
            mv.visitTryCatchBlock(finallyStart, endFinally, endFinally, null);
            mv.visitLabel(endFinally);
            onFinally(ATHROW);
            mv.visitInsn(Opcodes.ATHROW);
            mv.visitLabel(timeEnd);
            visitLocalVariable("_time_", Type.LONG_TYPE.getDescriptor(), null, timeStart, timeEnd, startTimeVar);

            super.visitMaxs(stack, locals);
        }
    }
}
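In source form, the instrumentation above effectively rewrites each method into the following try/finally shape (this is an illustrative Java equivalent, not code from the project):

```java
class ProfilingShapeDemo {

    static String lastRecord; // captures what the injected code prints

    // The shape the ASM advice gives every instrumented method:
    // record the start time, run the original body, and report the
    // elapsed time on every exit path (normal return or throw).
    static void work() throws InterruptedException {
        long __start_time__ = System.currentTimeMillis();
        try {
            Thread.sleep(50); // stands in for the original method body
        } finally {
            long elapsed = System.currentTimeMillis() - __start_time__;
            lastRecord = "ProfilingShapeDemo.work(): " + elapsed + "ms";
            System.out.println(lastRecord);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        work();
    }
}
```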

We also have test classes ProfileAsmTest:

public class ProfileAsmTest {

    public static void main(String[] args) throws Exception {
        ProfileTest test = new ProfileTest();
        test.sleep();
    }
}

…and ProfileTest:

public class ProfileTest {

    static final long DEFAULT_SLEEP = 500;

    static {
        System.out.println("in static initializer");
    }

    public ProfileTest() throws InterruptedException {
        sleep(100);
    }

    public void sleep() throws InterruptedException {
        sleep(DEFAULT_SLEEP);
    }

    public void sleep(long sleep) throws InterruptedException {
        Thread.sleep(sleep);
    }
}

To see this agent in action:

  1. Run the default task of the build.xml ANT script at the root of the project.
  2. Run the ca.discotek.agent.example.profile.asm.test.ProfileAsmTest program. You can run this from your IDE or the command line. You will need to add the -javaagent parameter. Here is what it might look like:
java -noverify -javaagent:.../discotek-agent-example-7-profile-asm/dist/agent-profile-asm.jar=.*discotek.*test.* -classpath .../discotek-agent-example-7-profile-asm/bin;.../discotek-agent-example-7-profile-asm/lib/asm-5.0.4.jar;.../discotek-agent-example-7-profile-asm/lib/asm-commons-5.0.4.jar ca.discotek.agent.example.profile.asm.test.ProfileAsmTest

The output should look like this:

in static initializer
ca.discotek.agent.example.profile.asm.test.ProfileTest.<clinit>(): 0ms
ca.discotek.agent.example.profile.asm.test.ProfileTest.sleep(long): 100ms
ca.discotek.agent.example.profile.asm.test.ProfileTest.<init>(): 100ms
ca.discotek.agent.example.profile.asm.test.ProfileTest.sleep(long): 501ms
ca.discotek.agent.example.profile.asm.test.ProfileTest.sleep(): 501ms
ca.discotek.agent.example.profile.asm.test.ProfileAsmTest.main(java.lang.String[]): 607ms

agent-example-8-profile-classpath-asm [Download]

This project is almost functionally the same as the last project, but illustrates how an agent might be assembled and deployed in a realistic scenario. First, let’s acknowledge that agent jars appear on the JVM’s classpath. This presents a problem when the application code running in your target JVM has class name clashes with the code in your agent. For example, the agent in the previous example uses ASM, but ASM is used by most modern application servers and web containers. If the application’s ASM appears earlier in the classpath than your agent’s, the application ASM classes would be loaded instead of the agent’s ASM classes. This issue is fairly easily resolved by rebundling the ASM code. Instead of referencing ASM classes in their original jar (like the previous project), the ASM source code (which is open source) can be added to this project itself. It can then be refactored to give every class a unique name space. In this project, the ASM package path org.objectweb.asm is refactored under ca.discotek.example.rebundled: ca.discotek gives the package path a name that is unique to the organization, and ca.discotek.example gives it a name that is unique to the agent. This is necessary when an organization has multiple agents that might be installed in the same JVM.

A second issue occurs when an agent needs to insert new classes into the application space. The previous example does not do this, but it is not a stretch of the imagination to understand how it might. Instead of calling System.out.println(…) to output the method profiling results, it would probably be more useful to use some API to record the results centrally. To this end, this project adds the ca.discotek.agent.example.profileclasspath.Recorder class:

public class Recorder {

    public static void record(String methodSignature, long start, long end) {
        StringBuilder buffer = new StringBuilder();
        buffer.append(methodSignature);
        buffer.append(": ");
        buffer.append(end - start);
        System.out.println(buffer.toString());
    }
}

This class has a single record method which uses System.out.println(…) to output the results, but could easily be modified to store the results in a database. We now need to change our instrumentation code slightly to call Recorder.record(…). The last line of the onFinally method now calls this method:

        private void onFinally(int opcode) {
            if (opcode == ATHROW) {
                // same throwable handling as the previous example
            }

            mv.visitLdcInsn(signature);
            mv.visitVarInsn(Opcodes.LLOAD, startTimeVar);
            mv.visitMethodInsn(INVOKESTATIC, "java/lang/System", "currentTimeMillis", "()J", false);
            mv.visitMethodInsn(Opcodes.INVOKESTATIC, Recorder.class.getName().replace('.', '/'), "record", "(Ljava/lang/String;JJ)V", false);
        }

Let’s now return to discussing the issue with inserting new classes into application space. In this scenario, we are modifying application methods to call Recorder.record(…). If we bundle the Recorder class directly within the agent jar, it will be discoverable by any classloader that uses the system classpath. However, classes loaded by the boot classloader or custom classloaders that don’t use the system classpath will not be able to find the Recorder class. Adding it to the boot classpath fixes this issue. We can do this by taking advantage of Instrumentation‘s appendToBootstrapClassLoaderSearch(…) method. This method accepts a JarFile as a parameter. This means we must bundle the Recorder class in its own jar and bundle that jar within the agent jar. At agent initialization time, we can extract this jar and invoke the appendToBootstrapClassLoaderSearch(…) method. Here is the relevant build.xml code:

    <target name="jar" depends="compile">

        <jar destfile="${build}/${boot-classpath-jar}" update="true" >
            <fileset dir="${classes}">
                <include name="ca/discotek/agent/example/profileclasspath/Recorder.class"/>
            </fileset>
        </jar>

        <jar destfile="${dist}/${agent-jar}" update="true" >
            <manifest>
                <attribute name="Premain-Class" value="${agent-class-name}"/>
                <attribute name="Agent-Class" value="${agent-class-name}"/>
                <attribute name="Can-Redefine-Classes" value="true"/>
                <attribute name="Can-Retransform-Classes" value="true"/>
            </manifest>

            <fileset dir="${classes}">
                <include name="ca/discotek/agent/example/profileclasspath/asm/*.class"/>
                <include name="ca/discotek/example/rebundled/**/*.class"/>
            </fileset>

            <fileset dir="${build}">
                <include name="${boot-classpath-jar}"/>
            </fileset>
        </jar>

        <jar destfile="${dist}/${test-jar}" update="true" >
            <fileset dir="${test-classes}">
                <include name="ca/discotek/agent/example/profileclasspath/**/*.class"/>
            </fileset>
        </jar>

    </target>

The first jar task creates the jar with the Recorder class, and the second jar task creates the agent jar, which bundles the Recorder jar.

Here is the initialize(…) method in MyAgent class which extracts the jar and adds it to the boot classpath:

    public static void initialize(String agentArgs, Instrumentation inst, boolean isPremain) {
        MyAgent.instrumentation = inst;

        try {
            URL url = MyAgent.class.getProtectionDomain().getCodeSource().getLocation();
            File file = new File(url.getFile());
            JarFile agentJar = new JarFile(file);
            ZipEntry entry = agentJar.getEntry("profile-classpath-boot-classpath.jar");
            InputStream is = agentJar.getInputStream(entry);

            File tmpDir = new File(System.getProperty("java.io.tmpdir"));
            File bootClassPathFile = new File(tmpDir, entry.getName());
            FileOutputStream fos = new FileOutputStream(bootClassPathFile);

            int length;
            byte bytes[] = new byte[10 * 1024];
            while ( (length = is.read(bytes)) > 0)
                fos.write(bytes, 0, length);
            is.close();
            fos.close();

            JarFile jar = new JarFile(bootClassPathFile);
            inst.appendToBootstrapClassLoaderSearch(jar);

            inst.addTransformer(new MyAgent(Pattern.compile(agentArgs)), true);
        }
        catch (Exception e) {
            System.err.println("Unexpected error occurred while installing agent. See following stack trace. Aborting.");
            e.printStackTrace();
        }
    }

You’ll note the above code uses MyAgent.class.getProtectionDomain().getCodeSource().getLocation(). This is a neat trick, but it may not work if you run the test code from your IDE. IDEs like Eclipse will attempt to hotswap your code; this applies to application code and agent code alike. The IDE will hotswap the MyAgent class after discovering it in the <project>/bin directory. If this happens, MyAgent.class.getProtectionDomain().getCodeSource().getLocation() will return a URL to the bin directory, not the agent jar.
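The trick generalizes to any class: a code source is available for classes loaded from the classpath (jar or directory), while bootstrap classes have none. A small sketch (the locate helper is illustrative, not from the project):

```java
import java.net.URL;
import java.security.CodeSource;

class AgentJarLocator {

    // Returns the jar or directory a class was loaded from,
    // or null for bootstrap classes, which have no code source.
    static URL locate(Class<?> c) {
        CodeSource cs = c.getProtectionDomain().getCodeSource();
        return cs == null ? null : cs.getLocation();
    }

    public static void main(String[] args) {
        System.out.println("this class loaded from: " + locate(AgentJarLocator.class));
        System.out.println("java.lang.String loaded from: " + locate(String.class));
    }
}
```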

To see this agent in action:

  1. Run the default task of the build.xml ANT script at the root of the project.
  2. Run the ca.discotek.agent.example.profile.asm.test.ProfileAsmTest program from a shell (not your IDE – see note immediately above). You will need to add the -javaagent parameter. Here is what it might look like:
java -noverify -javaagent:.../discotek-agent-example-8-profile-classpath-asm/dist/agent-profile-classpath-asm.jar=.*discotek.*test.* -classpath .../discotek-agent-example-8-profile-classpath-asm/dist/profile-classpath-test.jar ca.discotek.agent.example.profileclasspath.asm.test.ProfileClasspathAsmTest

This concludes the project examples. I hope they are easy to use and are informative. To finish off, here are a few tips for byte code engineering:

  • If you use Javassist, you may find yourself using ClassPool.getDefault() to create a ClassPool object. This can become a bad habit. The default ClassPool cannot be garbage collected, and every ClassPool retains (memory-wise) at least some part of each CtClass that it loads. If you are instrumenting a lot of classes, this can easily become a memory leak. Alternatively, instantiate a new ClassPool object and immediately call appendSystemPath() to achieve a similar goal.
  • If any sort of exception occurs in the ClassFileTransformer.transform(…) method, it will likely get silently swallowed and you will have no idea why your target byte code is not being transformed. To avoid this pitfall, wrap any code you place in the transform(…) method in a try/catch block that catches Throwable, and handle the exception in a way that informs you something went wrong.
  • This last tip might not make sense until it happens to you… byte code engineering libraries like ASM and Javassist sometimes need to find and parse classes that your code does not know how to find. For instance, if you instrument class X, you may get an error saying that it cannot find class Y, which is its super class. The answer is to use the ClassLoader parameter from the transform method. The full solution for ASM is a bit complicated and may require overriding ClassWriter's getCommonSuperClass(…) method. However, Javassist is a bit easier. You can call ClassPool.appendClassPath(…) to add a LoaderClassPath.
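The second tip above can be sketched as a transformer skeleton that catches Throwable, so a transformation failure is reported rather than silently swallowed. The error handling shown (printing the stack trace) is a minimal illustration; a real agent might log elsewhere.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;

class SafeTransformer implements ClassFileTransformer {
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        try {
            // ... real byte code transformation would go here ...
            return null; // returning null tells the JVM "no change"
        } catch (Throwable t) {
            // Without this, the JVM swallows the error and you never learn
            // why your class was not transformed.
            t.printStackTrace();
            return null;
        }
    }

    public static void main(String[] args) {
        SafeTransformer t = new SafeTransformer();
        // A no-op run: transform(...) returns null, meaning "leave the class as-is".
        System.out.println(t.transform(null, "com/example/Foo", null, null, new byte[0]));
    }
}
```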


All resources mentioned in this tutorial can be downloaded from the Practical Byte Code Engineering download page.


Posted in Byte Code Engineering

Instrumenting JBoss with javaagent jars

As the author of Feenix, a class reloading framework, I have recently been investigating how to reload web resources such as JSPs, JSF pages, images, CSS, and so on. Unlike class reloading, web resource reloading is unique to each web container, and consequently each container implementation must be researched and tested individually. JBoss has a large market share and warrants being one of the first supported containers. However, I always dread working with JBoss because of its incompatibilities with javaagent instrumentation. Googling tells me I am not the only developer or javaagent provider suffering through this problem. It has been such a pain to deal with that I finally did some of my own R&D to figure out what exactly is going on and how to resolve it.

There is no handbook for figuring out how each web container handles web requests. Further, even when the container is open source, it is not always clear how the code will execute (i.e. many if/else branches that depend on variables only known at run time). To help me over this hurdle, I use a home grown tool called Execution Profiler. This tool is extremely handy for determining the execution path for a given request. Execution Profiler is a javaagent which instruments the target classes such that each method invocation is recorded. Unfortunately, it doesn't play very nicely with JBoss (I am using 7.1.1) out of the box. Unlike most web containers (or their parent application servers), JBoss uses the OSGi model, which does some unexpected things with the system classpath. It would be too much of a tangent for me to research and document what OSGi is supposed to do classloader-wise, so I'll just be documenting how it behaved and how I was able to overcome these obstacles.

Here is how I configured Execution Profiler standalone.bat initially:

set JAVA_OPTS=%JAVA_OPTS% -javaagent:"C:\agents\discotek.execution-profiler-agent-1.0.jar"=classes="javax.servlet.*|*|org.apache.catalina.*|org.jboss.servlet.*|org.apache.tomcat.*|*|org.apache.naming.resources.*"

If you are reading this article, then you probably already know what set JAVA_OPTS=… is for. It allows one to modify the JBoss JVM environment. The above configuration will install the Execution Profiler and pass it a classes parameter whose value is a (Java) regular expression. This expression specifies the set of classes to be instrumented by Execution Profiler. Here is the error you get when you run JBoss with this configuration:

Exception in thread "main" java.lang.ExceptionInInitializerError
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(
        at java.lang.reflect.Method.invoke(
        at org.jboss.modules.Main.main(
Caused by: java.lang.IllegalStateException: The LogManager was not properly installed (you must set the "java.util.logging.manager" system property to "org.jboss.logmanager.LogManager")
        at org.jboss.logmanager.Logger.getLogger(
        at org.jboss.logmanager.log4j.BridgeRepositorySelector.(
        ... 7 more

The Caused by exception provides some fairly simple instructions for fixing this problem. Let's add this instruction to standalone.bat, so we now have:

set JAVA_OPTS=%JAVA_OPTS% -javaagent:"C:\agents\discotek.execution-profiler-agent-1.0.jar"=classes="javax.servlet.*|*|org.apache.catalina.*|org.jboss.servlet.*|org.apache.tomcat.*|*|org.apache.naming.resources.*"
set JAVA_OPTS=%JAVA_OPTS% -Djava.util.logging.manager=org.jboss.logmanager.LogManager

Running standalone.bat again now gives us this error:

Could not load Logmanager "org.jboss.logmanager.LogManager"
java.lang.ClassNotFoundException: org.jboss.logmanager.LogManager
        at Method)
        at java.lang.ClassLoader.loadClass(
        at sun.misc.Launcher$AppClassLoader.loadClass(
        at java.lang.ClassLoader.loadClass(
        at java.util.logging.LogManager$
        at Method)
        at java.util.logging.LogManager.<clinit>(
        at java.util.logging.Logger.getLogger(
        at sun.awt.SunToolkit.<clinit>(
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(
        at java.awt.Toolkit$
        at Method)
        at java.awt.Toolkit.getDefaultToolkit(
        at java.awt.SystemTray.isSupported(
        at ca.discotek.profiler.SystemTraySupport.install(
        at ca.discotek.profiler.ExecutionProfilerAgent.premain(
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(
        at java.lang.reflect.Method.invoke(
        at sun.instrument.InstrumentationImpl.loadClassAndStartAgent(
        at sun.instrument.InstrumentationImpl.loadClassAndCallPremain(


This error is not so helpful. However, there is some useful information here:

  1. The message tells us that the org.jboss.logmanager.LogManager class cannot be found
  2. The stack tells us that the error occurs when trying to install the Execution Profiler System Tray

This information is useful because it tells us that this error is a result of the two changes we made in standalone.bat. That is, the JVM is searching for the org.jboss.logmanager.LogManager class because we configured java.util.logging to use it, and installing the Execution Profiler agent caused JBoss to load classes that require the JBoss java.util.logging resources before the OSGi model makes them available. The next step is to make org.jboss.logmanager.LogManager (and other classes from the same jar) immediately available.


Googling tells me there is a JBoss system property called jboss.modules.system.pkgs, a comma-delimited list of packages that JBoss will allow to be loaded from any class loader. We could blindly set this property, but doing so might override packages set by this property elsewhere. Specifically, if you look at standalone.conf, you'll see the following:

if [ "x$JBOSS_MODULES_SYSTEM_PKGS" = "x" ]; then
    JBOSS_MODULES_SYSTEM_PKGS="org.jboss.byteman"
fi

I have no idea what byteman is, but I don’t want to remove it. Hence, we should include it in our configuration, which now looks like this:

set JAVA_OPTS=%JAVA_OPTS% -javaagent:"C:\agents\discotek.execution-profiler-agent-1.0.jar"=classes="javax.servlet.*|*|org.apache.catalina.*|org.jboss.servlet.*|org.apache.tomcat.*|*|org.apache.naming.resources.*"
set JAVA_OPTS=%JAVA_OPTS% -Djava.util.logging.manager=org.jboss.logmanager.LogManager
set JAVA_OPTS=%JAVA_OPTS% -Djboss.modules.system.pkgs=org.jboss.byteman,org.jboss.logmanager

Here is the output after running standalone.bat with the above configuration:

Could not load Logmanager "org.jboss.logmanager.LogManager"
java.lang.ClassNotFoundException: org.jboss.logmanager.LogManager
        at Method)
        at java.lang.ClassLoader.loadClass(
        at sun.misc.Launcher$AppClassLoader.loadClass(
        at java.lang.ClassLoader.loadClass(
        at java.util.logging.LogManager$
        at Method)
        at java.util.logging.LogManager.<clinit>(
        at java.util.logging.Logger.getLogger(
        at sun.awt.SunToolkit.<clinit>(
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(
        at java.awt.Toolkit$
        at Method)
        at java.awt.Toolkit.getDefaultToolkit(
        at java.awt.SystemTray.isSupported(
        at ca.discotek.profiler.SystemTraySupport.install(
        at ca.discotek.profiler.ExecutionProfilerAgent.premain(
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(
        at java.lang.reflect.Method.invoke(
        at sun.instrument.InstrumentationImpl.loadClassAndStartAgent(
        at sun.instrument.InstrumentationImpl.loadClassAndCallPremain(


This is the exact same stack trace as the previous run. We now know that JBoss will allow this package to be discovered, so perhaps we need to explicitly make it available using an -Xbootclasspath entry. Our configuration now looks like this:

set JAVA_OPTS=%JAVA_OPTS% -javaagent:"C:\agents\discotek.execution-profiler-agent-1.0.jar"=classes="javax.servlet.*|*|org.apache.catalina.*|org.jboss.servlet.*|org.apache.tomcat.*|*|org.apache.naming.resources.*"
set JAVA_OPTS=%JAVA_OPTS% -Djava.util.logging.manager=org.jboss.logmanager.LogManager
set JAVA_OPTS=%JAVA_OPTS% -Djboss.modules.system.pkgs=org.jboss.byteman,org.jboss.logmanager
set JAVA_OPTS=%JAVA_OPTS% -Xbootclasspath/a:C:\java\jboss-as-7.1.1.Final\modules\org\jboss\logmanager\main\jboss-logmanager-1.2.2.GA.jar

Here is the output after running standalone.bat with the above configuration:

Exception in thread "main" java.lang.NoClassDefFoundError: org/jboss/logmanager/log4j/BridgeRepositorySelector
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(
        at java.lang.reflect.Method.invoke(
        at org.jboss.modules.Main.main(
Caused by: java.lang.ClassNotFoundException: org.jboss.logmanager.log4j.BridgeRepositorySelector
        at Method)
        at java.lang.ClassLoader.loadClass(
        at sun.misc.Launcher$AppClassLoader.loadClass(
        at java.lang.ClassLoader.loadClass(
        at org.jboss.modules.ConcurrentClassLoader.performLoadClass(
        at org.jboss.modules.ConcurrentClassLoader.loadClass(
        at java.lang.ClassLoader.loadClassInternal(
        ... 7 more


This is an improvement. JBoss can now find the org.jboss.logmanager classes. It now cannot find the org.jboss.logmanager.log4j classes. Let’s now modify the above configuration so that this package is included in the jboss.modules.system.pkgs and the -Xbootclasspath entries:

set JAVA_OPTS=%JAVA_OPTS% -javaagent:"C:\agents\discotek.execution-profiler-agent-1.0.jar"=classes="javax.servlet.*|*|org.apache.catalina.*|org.jboss.servlet.*|org.apache.tomcat.*|*|org.apache.naming.resources.*"
set JAVA_OPTS=%JAVA_OPTS% -Djava.util.logging.manager=org.jboss.logmanager.LogManager
set JAVA_OPTS=%JAVA_OPTS% -Djboss.modules.system.pkgs=org.jboss.byteman,org.jboss.logmanager,org.jboss.logmanager.log4j
set JAVA_OPTS=%JAVA_OPTS% -Xbootclasspath/a:C:\java\jboss-as-7.1.1.Final\modules\org\jboss\logmanager\main\jboss-logmanager-1.2.2.GA.jar;C:\java\jboss-as-7.1.1.Final\modules\org\jboss\logmanager\log4j\main\jboss-logmanager-log4j-1.0.0.GA.jar

Here is the error with the next run:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/spi/RepositorySelector
        at java.lang.ClassLoader.findBootstrapClass(Native Method)
        at java.lang.ClassLoader.findBootstrapClass0(
        at java.lang.ClassLoader.loadClass(
        at java.lang.ClassLoader.loadClass(
        at sun.misc.Launcher$AppClassLoader.loadClass(
        at java.lang.ClassLoader.loadClass(
        at org.jboss.modules.ConcurrentClassLoader.performLoadClass(
        at org.jboss.modules.ConcurrentClassLoader.loadClass(
        at java.lang.ClassLoader.loadClassInternal(
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(
        at java.lang.reflect.Method.invoke(
        at org.jboss.modules.Main.main(


We are getting closer. We now need to repeat the above with the Log4j jar/package:

set JAVA_OPTS=%JAVA_OPTS% -javaagent:"C:\agents\discotek.execution-profiler-agent-1.0.jar"=classes="javax.servlet.*|*|org.apache.catalina.*|org.jboss.servlet.*|org.apache.tomcat.*|*|org.apache.naming.resources.*"
set JAVA_OPTS=%JAVA_OPTS% -Djava.util.logging.manager=org.jboss.logmanager.LogManager
set JAVA_OPTS=%JAVA_OPTS% -Djboss.modules.system.pkgs=org.jboss.byteman,org.jboss.logmanager,org.jboss.logmanager.log4j,org.apache.log4j
set JAVA_OPTS=%JAVA_OPTS% -Xbootclasspath/a:C:\java\jboss-as-7.1.1.Final\modules\org\jboss\logmanager\main\jboss-logmanager-1.2.2.GA.jar;C:\java\jboss-as-7.1.1.Final\modules\org\jboss\logmanager\log4j\main\jboss-logmanager-log4j-1.0.0.GA.jar;C:\java\jboss-as-7.1.1.Final\modules\org\apache\log4j\main\log4j-1.2.16.jar

Here is the error with the next run:

19:45:42,578 ERROR [] JBAS015956: Caught exception during boot: JBAS01676: Failed to parse configuration
        at [jboss-as-controller-7.1.1.Final.jar:7.1.1.Final]
        at [jboss-as-server-7.1.1.Final.jar:7.1.1.Final]
        at$ [jboss-as-controller-7.1.1.Final.jar:7.1.1.Final]
        at [rt.jar:1.7.0-ea]
Caused by: Failed to load module
        at [jboss-as-controller-7.1.1.Final.jar:7.1.1.Final]
        at [jboss-as-server-7.1.1.Final.jar:7.1.1.Final]
        at [jboss-as-server-7.1.1.Final.jar:7.1.1.Final]
        at [jboss-as-server-7.1.1.Final.jar:7.1.1.Final]
        at org.jboss.staxmapper.XMLMapperImpl.processNested( [staxmapper-1.1.0.Final.jar:1.1.0.Final]
        at org.jboss.staxmapper.XMLMapperImpl.parseDocument( [staxmapper-1.1.0.Final.jar:1.1.0.Final]
        at [jboss-as-controller-7.1.1.Final.jar:7.1.1.Final]
        ... 3 more
Caused by: java.util.concurrent.ExecutionException: java.util.ServiceConfigurationError: Provider could not be instantiated: java.lang.NoClassDefFoundError: ca/discotek/profiler/Recorder
        at java.util.concurrent.FutureTask$Sync.innerGet( [rt.jar:1.7.0-ea]
        at java.util.concurrent.FutureTask.get( [rt.jar:1.7.0-ea]
        at [jboss-as-controller-7.1.1.Final.jar:7.1.1.Final]
        ... 9 more


It may not be obvious from this error, but we have gotten over the initial hurdle of JBoss loading the JBoss java.util.logging implementation. The relevant information in this error is java.lang.NoClassDefFoundError: ca/discotek/profiler/Recorder. References to the Recorder class are inserted into classes instrumented by Execution Profiler. This error tells us that JBoss doesn’t know how to find the ca.discotek.profiler classes. We now have to repeat the above process for this package. However, javaagent classes are already on the system classpath, so we won’t need to modify -Xbootclasspath:

set JAVA_OPTS=%JAVA_OPTS% -javaagent:"C:\agents\discotek.execution-profiler-agent-1.0.jar"=classes="javax.servlet.*|*|org.apache.catalina.*|org.jboss.servlet.*|org.apache.tomcat.*|*|org.apache.naming.resources.*"
set JAVA_OPTS=%JAVA_OPTS% -Djava.util.logging.manager=org.jboss.logmanager.LogManager
set JAVA_OPTS=%JAVA_OPTS% -Djboss.modules.system.pkgs=org.jboss.byteman,org.jboss.logmanager,org.jboss.logmanager.log4j,org.apache.log4j,ca.discotek.profiler
set JAVA_OPTS=%JAVA_OPTS% -Xbootclasspath/a:C:\java\jboss-as-7.1.1.Final\modules\org\jboss\logmanager\main\jboss-logmanager-1.2.2.GA.jar;C:\java\jboss-as-7.1.1.Final\modules\org\jboss\logmanager\log4j\main\jboss-logmanager-log4j-1.0.0.GA.jar;C:\java\jboss-as-7.1.1.Final\modules\org\apache\log4j\main\log4j-1.2.16.jar

With these final changes, Execution Profiler now works with JBoss. I hope if you run into similar issues, the above information can help resolve your own problems with JBoss.

BTW, it is worth noting that I use a version of Execution Profiler that is not obfuscated, but the publicly available one is. The obfuscated version has package paths ca.discotek.profiler and discotek_exprof, whereas the version documented above uses only ca.discotek.profiler. To get the public/obfuscated version to work with JBoss, you will probably need to modify the working configuration above to include discotek_exprof in the jboss.modules.system.pkgs entry like this:

set JAVA_OPTS=%JAVA_OPTS% -javaagent:"C:\agents\discotek.execution-profiler-agent-1.0.jar"=classes="javax.servlet.*|*|org.apache.catalina.*|org.jboss.servlet.*|org.apache.tomcat.*|*|org.apache.naming.resources.*"
set JAVA_OPTS=%JAVA_OPTS% -Djava.util.logging.manager=org.jboss.logmanager.LogManager
set JAVA_OPTS=%JAVA_OPTS% -Djboss.modules.system.pkgs=org.jboss.byteman,org.jboss.logmanager,org.jboss.logmanager.log4j,org.apache.log4j,ca.discotek.profiler,discotek_exprof
set JAVA_OPTS=%JAVA_OPTS% -Xbootclasspath/a:C:\java\jboss-as-7.1.1.Final\modules\org\jboss\logmanager\main\jboss-logmanager-1.2.2.GA.jar;C:\java\jboss-as-7.1.1.Final\modules\org\jboss\logmanager\log4j\main\jboss-logmanager-log4j-1.0.0.GA.jar;C:\java\jboss-as-7.1.1.Final\modules\org\apache\log4j\main\log4j-1.2.16.jar




Posted in Byte Code Engineering, Developer Tool, JBoss

JRebel Alternative: Feenix 2.2 beta is ready!

Newcomers to this blog may not know about discotek.ca's Feenix project, but will almost certainly have heard about JRebel, the class and framework reloading software. discotek.ca's original class reloading project, Feenix, used the Instrumentation API and was vastly inferior to JRebel (further discussion here). However, Feenix has been entirely re-written and now supports class reloading in a fashion similar to JRebel. Further, Feenix is free and the first beta version is now available!

Before I go any further, I'd like to give praise to the original author of JRebel, Jevgeni Kabanov. As I developed this new version of Feenix, I had to overcome many complex problems. Some of these problems were so complex that I only persevered to find solutions because I knew they had already been solved by others. Not only did Jevgeni conceive the JRebel approach to class reloading, but he solved each problem without the certainty that it could be done. Needless to say, I am impressed, but then again, I am no PhD either.

The title of this blog is JRebel Alternative…, but JRebel does much more than Feenix. For now, Feenix just does class reloading; it does not reload resource bundles or frameworks. Adding support for resource bundles should not be difficult, but the thought of supporting every version of every framework is daunting. I’ll probably just add support for major frameworks like Spring, JSF, etc first (although, I am sure this is more easily said than done). Feenix also doesn’t handle anonymous inner classes (but until it does, this issue can be overcome by giving your inner classes a name). On the other hand, Feenix does support the following:

  1. Basic class reloading
  2. Provide access to fields in an object that only exist in future versions of the class used to instantiate the object (kinda mind blowing, right?)
  3. Invoke methods on an object that did not exist in the version of the class used to instantiate the object (again, very cool – am I right?)
  4. A remote class loading feature similar to the now defunct LiveRebel

The rest of this blog is going to explain a little about the magic behind bullets 2 and 3 in the preceding list, explain how to use Feenix, and show you where to download it. BTW, the first bullet above has already been covered in a previous blog post.

Adding Future Fields to a Class/Instance

By revealing this trick, I actually feel a bit like a magician who explains how to cut someone in half. It's actually really simple, and you will roll your eyes afterwards. Every type in Java can be referenced as either a primitive or a java.lang.Object, and every field in a class has a unique name. If we create a class with accessors for any type of value or object, and store those values and objects in an internal Map keyed by field name, we have a construct that can store values and objects that may be defined in future versions of a class. Here is what my implementation looks like:

public class PhantomFieldHolder {

    public static final byte DEFAULT_BYTE_VALUE = 0;
    public static final short DEFAULT_SHORT_VALUE = 0;
    public static final int DEFAULT_INT_VALUE = 0;
    public static final long DEFAULT_LONG_VALUE = 0L;
    public static final float DEFAULT_FLOAT_VALUE = 0.0f;
    public static final double DEFAULT_DOUBLE_VALUE = 0.0d;
    public static final char DEFAULT_CHAR_VALUE = '\u0000';
    public static final boolean DEFAULT_BOOLEAN_VALUE = false;

    Map<String, Object> map = new HashMap<String, Object>();

    public void setBoolean(String name, boolean value) {
        map.put(name, value);
    }

    public boolean getBoolean(String name) {
        Boolean value = (Boolean) map.get(name);
        return value == null ? DEFAULT_BOOLEAN_VALUE : value;
    }

    public void setByte(String name, byte value) {
        map.put(name, value);
    }

    public byte getByte(String name) {
        Byte value = (Byte) map.get(name);
        return value == null ? DEFAULT_BYTE_VALUE : value;
    }

    public void setShort(String name, short value) {
        map.put(name, value);
    }

    public short getShort(String name) {
        Short value = (Short) map.get(name);
        return value == null ? DEFAULT_SHORT_VALUE : value;
    }

    public void setInt(String name, int value) {
        map.put(name, value);
    }

    public int getInt(String name) {
        Integer value = (Integer) map.get(name);
        return value == null ? DEFAULT_INT_VALUE : value;
    }

    public void setFloat(String name, float value) {
        map.put(name, value);
    }

    public float getFloat(String name) {
        Float value = (Float) map.get(name);
        return value == null ? DEFAULT_FLOAT_VALUE : value;
    }

    public void setDouble(String name, double value) {
        map.put(name, value);
    }

    public double getDouble(String name) {
        Double value = (Double) map.get(name);
        return value == null ? DEFAULT_DOUBLE_VALUE : value;
    }

    public void setLong(String name, long value) {
        map.put(name, value);
    }

    public long getLong(String name) {
        Long value = (Long) map.get(name);
        return value == null ? DEFAULT_LONG_VALUE : value;
    }

    public void setChar(String name, char value) {
        map.put(name, value);
    }

    public char getChar(String name) {
        Character value = (Character) map.get(name);
        return value == null ? DEFAULT_CHAR_VALUE : value;
    }

    public void setObject(String name, Object value) {
        map.put(name, value);
    }

    public Object getObject(String name) {
        return map.get(name);
    }
}
You must also instrument every class (within the specified class reloading namespace) so that it has two PhantomFieldHolder fields (one for static fields and one for instance fields). The rest of the trick involves instrumenting the byte code of any class that accesses these fields: if the field is defined in a given class, access it normally; otherwise, access it through the PhantomFieldHolder.
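A rough sketch of that routing in plain Java, with illustrative names (MiniPhantomFieldHolder is a stripped-down stand-in for the holder above; in Feenix the field and the rerouted accesses are injected by instrumentation, not written by hand):

```java
import java.util.HashMap;
import java.util.Map;

class PhantomFieldDemo {
    // In the real framework this field would be injected by instrumentation.
    static final MiniPhantomFieldHolder holder = new MiniPhantomFieldHolder();

    public static void main(String[] args) {
        // Reading a "future" field before any write yields the type's default...
        System.out.println(holder.getInt("gear")); // 0
        // ...and a write through the holder behaves like a field assignment.
        holder.setInt("gear", 3);
        System.out.println(holder.getInt("gear")); // 3
    }
}

// Just the int accessors from the full PhantomFieldHolder pattern.
class MiniPhantomFieldHolder {
    private final Map<String, Object> map = new HashMap<String, Object>();

    void setInt(String name, int value) {
        map.put(name, value);
    }

    int getInt(String name) {
        Integer v = (Integer) map.get(name);
        return v == null ? 0 : v; // default value for an unset int field
    }
}
```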

Adding Future Methods to Class/Instance

There is even less magic to adding future methods. We have already seen half of this trick in a previous blog. Specifically, every method is instrumented such that when invoked, it checks to see if there is an updated version of itself. If there is, a special version of the updated class is generated, which implements an interface the original class is aware of (it contains a known invoke(…) method). The original class forwards execution via the invoke method. If there is no updated class, it simply continues execution as it normally would.

Clever readers may ask: what happens when newer code wants to invoke new methods on an old object? For example, if you instantiated an instance of Car with only an accelerate() method, but later updated that class definition such that it also has a brake() method, how can you invoke the new brake() method on the original Car object? The answer here is similar to the field access above. Classes must be instrumented such that when invoking a method in a reloadable class, if the method does not exist in the class you are invoking, it must exist in a newer version of the class. Hence, just as above, generate the special version of the newer class and invoke its invoke(…) method with the appropriate parameters. Admittedly, I am making it sound easier than it is, and I am not sure I implemented this functionality 100% correctly, but it seems to work well in my testing thus far.
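The forwarding idea can be sketched with hypothetical names (Feenix's real generated classes and interface are more involved than this; Reloaded and CarV2 exist only for illustration):

```java
class FutureMethodDemo {
    public static void main(String[] args) {
        Reloaded newVersion = new CarV2();
        // An instrumented caller: brake() did not exist in the old Car class,
        // so the call is routed through the generated class's invoke(...).
        newVersion.invoke("brake", new Object[0]); // prints "braking"
    }
}

// The known interface that old code is instrumented to call through.
interface Reloaded {
    Object invoke(String methodName, Object[] args);
}

// A generated stand-in for the updated class version.
class CarV2 implements Reloaded {
    public Object invoke(String methodName, Object[] args) {
        // Dispatch by name to the "future" method.
        if ("brake".equals(methodName)) {
            brake();
            return null;
        }
        throw new IllegalArgumentException("Unknown method: " + methodName);
    }

    void brake() {
        System.out.println("braking");
    }
}
```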

Configuring Feenix

Standalone Mode

In standalone mode, Feenix will reload classes of a given namespace (e.g. com.example.*) from a provided repository of classes. For now, the repository must be a file system directory. In the near future, jar/zip, war, and ear repositories will likely be added.

Feenix is a Java agent and is integrated using the -javaagent JVM parameter. The syntax is as follows:

java -javaagent:<path to Feenix agent jar>=<path to Feenix configuration file> [-noverify] <program> <program parameters>

In case the syntax is at all complicated, I’ll provide a real example. I built a simple text editor in Swing to help test Feenix. It is bundled with the Feenix distribution, which looks like this:

Let’s extract the distribution to /java/feenix/. Next, for the sake of this example, let’s assume you set up your IDE to output class files to /java/projects/feenix-editor/bin. To run the editor’s main class, ca.discotek.feenixtest.FeenixTestGui, you would invoke:

java -classpath /java/projects/feenix-editor/bin ca.discotek.feenixtest.FeenixTestGui

Before we can add Feenix agent parameters to this command line, we need to create a configuration file. Let’s create file /java/project/feenix-editor/ In this file we’ll add the following properties:

  1. project-name=testgui
  2. feenix_enabled=true
  3. class_inclusion=ca\.discotek\.feenixtest.*
  4. feenix_classpath_entry=/java/projects/feenix-editor/bin

The project-name property is used by Feenix internally (but make sure the value consists of valid characters for your file system; you may get unexpected results if you use a slash, colon, question mark, etc.). The feenix_enabled property allows you to turn off Feenix functionality without having to modify the Feenix configuration or program command line. You may have multiple class_inclusion=… definitions. This property's value is a Java regular expression representing the namespace of the classes you wish to reload. The feenix_classpath_entry property defines a directory location where Feenix can find new versions of your classes. You may also have multiple feenix_classpath_entry definitions. Not shown above is the class_exclusion property, which excludes classes from the included set.
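Putting these properties together, the configuration file for this example might look like the following (the class_exclusion value is purely illustrative):

```properties
project-name=testgui
feenix_enabled=true
class_inclusion=ca\.discotek\.feenixtest.*
class_exclusion=ca\.discotek\.feenixtest\.SomeIgnoredClass
feenix_classpath_entry=/java/projects/feenix-editor/bin
```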

To invoke the program with Feenix configured, you invoke:

java -classpath /java/projects/feenix-editor/bin -noverify -javaagent:/java/feenix/discotek.feenix-agent-2.2-beta.jar=/java/project/feenix-editor/ ca.discotek.feenixtest.FeenixTestGui

You'll notice in the syntax above there is an optional -noverify JVM parameter. If you want to have reloadable constructors, you will need to use the -noverify flag. The problem stems from Feenix needing to insert code before a constructor's required call to super(…). Specifically, Feenix must insert the "if there is a newer version of this class, forward the execution" code ahead of the call to super(…). Inserting instructions in this manner causes JVM verification errors at run time. If you don't include the -noverify parameter, Feenix will not add the required code to reload constructors.

Note, the GUI for the Remote Mode below comes with an editor for creating configuration files.

Remote Mode

Feenix allows you to develop your code on one machine and execute it on another (this is similar to JRebel’s LiveRebel software, for which they have discontinued support). To configure and run the server that provides your newly developed classes, run one of the following commands:

java -jar /java/feenix/discotek.feenix-gui-2.2-beta.jar


java -classpath /java/feenix/discotek.feenix-gui-2.2-beta.jar ca.discotek.feenix.gui.FeenixProjectManagerGui

Here is a screen shot of the server GUI:

Let’s create a new project and configure it to serve classes to remote clients. First, click the New… button to enter a new project name:

Next, let’s click the Class Inclusions Add… button:

This dialog may provide two options for adding inclusions: Manual Edit and Select from Running JVM. The Manual Edit option is always available, but Select from Running JVM is only available if you are running ca.discotek.feenix.gui.FeenixProjectManagerGui from a JDK's JRE and its <jdkhome>/lib/tools.jar file is on the classpath. You need to add it to the classpath yourself; the command line examples above do not include tools.jar.

If you select the Manual Edit option, you’ll be prompted with another dialog with a text field in which you can enter the class inclusion regular expression. If you select the Select from Running JVM option, you’ll be presented with a dialog that allows you to select a package from the class namespace of a running JVM. This is a convenience feature to help speed up configuration time and eliminate the errors that might occur while manually editing the regular expression. Be sure to have your target application running in a JVM before launching the dialog (of course if the target JVM is only running remotely, this feature may not be very helpful). In the following dialog…

…I have three Java processes running. 1884 is the FeenixProjectManagerGui class we are currently using to create a configuration. 1472 is the JVM running an instance of Eclipse. 1904 is the target JVM running the simple editor test code. When the 1904 process is selected, the JVM’s loaded classes will be displayed. The following dialog…

…shows you the package path expanded such that ca.discotek.feenixtest can be selected and the field below it shows the regular expression to be used as an inclusion. Note, you can select multiple packages and/or classes. Once you click Okay, the regular expression(s) will show up in the inclusion table.

You can edit exclusions in the exact same manner as inclusions.

To add a classpath entry, where Feenix will discover your new classes, click the Classpath Add… button. You will be presented with a dialog similar to the following, in which you can manually type the file system path or select it with a file system browser:

Click the Okay button and our Feenix configuration editor will look similar to this:

Next, let’s look at the Remote tab:

The Host Name field is used to enter the host address that the local ServerSocket will bind to. The Port value is the port the ServerSocket will bind to. The Poll Frequency field allows you to configure how often Feenix will poll the classpath for updates. The Feenix server functionality is implemented as follows:

  1. When the server is started, it will bind to the address and port as specified above.
  2. It will then wait for clients to connect to the server.
  3. The server will poll the classpath for updates at the specified frequency.
  4. When updates are discovered, they will be propagated to each client.
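The polling half of the steps above (steps 3 and 4) can be sketched in plain Java. This is a hypothetical illustration, not Feenix's actual code; the class and method names are mine. It scans a classpath directory at each poll and reports the .class files whose timestamps have changed since the last scan:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

// Illustrative classpath poller: remembers each .class file's last-modified
// timestamp and reports files that are new or changed since the previous poll.
public class ClasspathPoller {
    private final Map<Path, Long> lastModified = new HashMap<>();
    private final Path root;

    public ClasspathPoller(Path root) { this.root = root; }

    // Returns the .class files that are new or changed since the last poll.
    public List<Path> poll() throws IOException {
        List<Path> updated = new ArrayList<>();
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(p -> p.toString().endsWith(".class")).forEach(p -> {
                long stamp = p.toFile().lastModified();
                Long previous = lastModified.put(p, stamp);
                if (previous == null || previous != stamp) updated.add(p);
            });
        }
        return updated;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("poller-demo");
        ClasspathPoller poller = new ClasspathPoller(dir);
        poller.poll(); // baseline scan of an empty directory
        Files.createFile(dir.resolve("Foo.class"));
        System.out.println("updates: " + poller.poll().size());
    }
}
```

A real server would run this loop on a timer at the configured poll frequency and push each updated class file to every connected client socket.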

You can start the server by clicking the Start button in the GUI editor:

Once the server has started, you can connect a client. A client is simply the target JVM configured to use Feenix, but with an additional JVM property, feenix-remote-enabled, which explicitly tells Feenix it should find its new classes via a remote server. The following shows how the above standalone configuration command line would be modified to find classes remotely:

java -classpath /java/projects/feenix-editor/bin -noverify -Dfeenix-remote-enabled=true -javaagent:/java/feenix/discotek.feenix-agent-2.2-beta.jar=/java/project/feenix-editor/ ca.discotek.feenixtest.FeenixTestGui

In the standalone configuration, the configuration file requires properties for the location of new classes and properties for their inclusion and exclusion. In a remote configuration, the remote client doesn’t need these properties. It just needs the properties for discovering the server. The following is adequate:


However, if you wanted to use the server configuration file to configure the remote client, you can. The remote client will simply ignore polling frequency, classpath, inclusion, and exclusion properties when in remote mode. To this end, you can export the server configuration (File->Export Configuration…) for use on a remote client.

Final Notes

If you download Feenix and use its classloading functionality, I have some notes that may be worth reading first.

  • I’d love to hear your feedback. You can do that here:
  • In hindsight, the class inclusion and exclusion properties are probably not necessary and will likely disappear in future versions. Be sure to let me know if you think they are useful.
  • If you want to use the example code provided with the distribution to evaluate Feenix, I suggest you comment out everything except the main method of FeenixTestGui. From there, uncomment chunks of code and recompile to see Feenix’s reload functionality in action.
  • This software has a built-in expiry date of 6 months from the date it was built. This is to ensure that no beta versions of Feenix are kept around. Once the beta is over, there will be no expiry date and builds will never expire (unless there are future beta builds).
  • At this time, I would not recommend using Feenix (standalone or remote mode) in production.
  • Feenix is not open source at this time. It may be in the future, but it will always be free.
  • Feenix only does class reloading at this time (no resource bundles, web frameworks, EJBs etc)
  • At this time, Feenix does not handle anonymous inner classes (just name your inner classes so they aren’t anonymous).
  • When possible, Feenix will attempt to call Instrumentation.redefineClasses(…) to redefine the class for existing objects. This is mostly important for improving the efficiency of accessing fields and methods (i.e. it is more efficient to run classes that don’t require the reloading functionality discussed in this article). By default, Feenix does not tell you when this is not possible. If you need more information about when classes can and cannot be reloaded, you can add the following JVM parameter: -Dfeenix-verbose=true
  • Stack traces for reloaded classes will look a little strange, but the class names and line numbers they provide will match up with your source code.
  • Remember to use -noverify; otherwise, your constructors won’t be reloaded.
  • Lastly, go easy on me – JRebel has millions of dollars in venture capital and in 2012 had 50-100 developers, but apparently tripled in size in 2015. I am just one dude who starts coding after his wife and kids are asleep at night!


You can download the Feenix 2.2-beta distribution here.
Update! Feenix 3.0-beta is now available. Download it here.

If you liked this article and would like to be notified of future articles or Feenix releases, follow on twitter.


Find JVM Memory Leaks with Instrumentation and PhantomReferences

Over the past year or so there has been quite a lot of attention on finding memory leaks in a JVM. Memory leaks can cause havoc in a JVM. They can be unpredictable and result in costly performance degradation or even downtime during server restarts. Ideally, memory leaks are found long before an application runs in production. Unfortunately, testing in lower environments is often not rigorous enough to cause performance degradation or OutOfMemoryErrors. Sometimes, a leak slowly steals memory over weeks or months of uptime. This type of leak is difficult to detect before production, but not impossible. This article will outline one approach to finding such leaks without costing you a penny!

The nuts and bolts of this approach rely on PhantomReferences and instrumentation. PhantomReferences are not commonly used, so let’s take a closer look. In Java, there are three types of Reference: WeakReference, SoftReference, and PhantomReference. Most developers are familiar with WeakReferences. They simply do not prevent garbage collection of the objects they reference. SoftReferences are similar to WeakReferences, but will sometimes prevent their referent objects from being garbage collected. SoftReferences will likely prevent an object from being garbage collected if the available memory is deemed plentiful during garbage collection. Lastly, PhantomReferences are almost nothing like their weak and soft siblings: an application is never meant to access the referent through them at all. Instead, PhantomReferences serve as a notification mechanism for when an object is about to be garbage collected. The Javadoc says “Phantom references are most often used for scheduling pre-mortem cleanup actions in a more flexible way than is possible with the Java finalization mechanism.” We are not interested in performing any clean-up, but we will record when an object is garbage collected.
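The notification mechanism described above can be demonstrated in a few lines of self-contained JDK code. Once the referent's last strong reference is dropped and the object is collected, the PhantomReference appears on the ReferenceQueue. Note that System.gc() is only a request, so the demo polls the queue with a timeout in a short retry loop:

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

// Minimal PhantomReference demo: detect that an object was garbage collected.
// PhantomReference.get() always returns null, so the reference can only be
// used as a collection signal, never to resurrect the object.
public class PhantomDemo {
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);

        referent = null;          // drop the only strong reference
        boolean enqueued = false;
        for (int i = 0; i < 50 && !enqueued; i++) {
            System.gc();                          // request (not force) a collection
            enqueued = queue.remove(100) == ref;  // poll the queue with a timeout
        }
        System.out.println("collected: " + enqueued);
    }
}
```

The same pattern, one reference per tracked object plus a thread draining the queue, is the backbone of the leak detector built in this article.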

Instrumentation is the other integral functionality. Instrumentation is the process of altering the bytecode of a class before it is loaded by the VM. It is a powerful feature of Java and can be used for monitoring, profiling, and in our case, event logging. We will use instrumentation to modify application classes such that any time an object is instantiated, we will create a PhantomReference for it. The design for this memory leak detection mechanism should be shaping up now. We’ll use instrumentation to coerce classes into telling us when they create objects and we’ll use PhantomReferences to record when they are garbage collected. Lastly, we’ll use a data store to record this data. This data will be the basis for our analysis to determine if objects are being leaked.

Before we go any further, let’s skip to a screen shot of what we’ll be able to do at the end of this article.

This graph displays the number of objects that have been allocated, but not garbage collected over time. The running code is Plumbr’s own memory leak demo application. Plumbr added a couple of memory leaks to Spring framework‘s Pet Clinic sample application. The graph will make more sense when you see the relevant leaky code:

public class LeakingInterceptor extends HandlerInterceptorAdapter {

  static List<ImageEntry> lastUsedImages = 
      Collections.synchronizedList(new LinkedList<ImageEntry>());

  private final byte[] imageBytes;

  public LeakingInterceptor(Resource res) throws IOException {
    imageBytes = FileCopyUtils.copyToByteArray(res.getInputStream());
  }

  public boolean preHandle
      (HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
    byte[] image = new byte[imageBytes.length];
    System.arraycopy(imageBytes, 0, image, 0, image.length);
    lastUsedImages.add(new ImageEntry(image));
    return true;
  }
}
With every request, this interceptor leaks an ImageEntry object, which references a byte array. Hence, after only 10 refreshes of the PetClinic’s front page, you can already see the leaky trend in the graph.

Now, let’s start writing some code. The first class we are going to need defines the interface that the instrumented bytecode will call when a class is instantiated. Let’s call this class “Recorder” and create a static method for receiving objects:

class Recorder {
    public static void record(Object o) {
        // implementation details intentionally omitted
    }
}

The implementation details of this method are out of the scope of this article. Now that we have an interface, we can work on our instrumentation code. Instrumentation is a broad topic, but we’ll keep the scope narrowed to our project. We’ll use ASM (from ObjectWeb) to perform the bytecode manipulation. This guide assumes you are already familiar with ASM. If you are not, you may want to take some time to brush-up first.

Simply put, we want to modify any application code that instantiates a new object such that it will call our Recorder.record(…) method with the new object as a parameter. To identify “application code”, we’ll allow the user to provide a set of regular expressions that will specify the set of classes to be included and classes to be excluded. We’ll create a class called Configuration to load a configuration properties file, which contains this information. This class will be used to determine if a class should be instrumented. Later, we’ll use it to define some other properties as well.
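A minimal sketch of that Configuration idea follows. The class name Configuration and the isIncluded method appear later in the article; the constructor and everything else here are my own assumptions (the real class loads its regexes from a properties file, which is omitted). A class is instrumented if its dotted name matches any inclusion pattern and no exclusion pattern:

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Sketch of the inclusion/exclusion decision; property-file loading omitted.
public class Configuration {
    private final List<Pattern> inclusions;
    private final List<Pattern> exclusions;

    public Configuration(List<String> inclusions, List<String> exclusions) {
        this.inclusions = inclusions.stream().map(Pattern::compile).collect(Collectors.toList());
        this.exclusions = exclusions.stream().map(Pattern::compile).collect(Collectors.toList());
    }

    // True if the dotted class name matches an inclusion and no exclusion.
    public boolean isIncluded(String dotName) {
        boolean included = inclusions.stream().anyMatch(p -> p.matcher(dotName).matches());
        boolean excluded = exclusions.stream().anyMatch(p -> p.matcher(dotName).matches());
        return included && !excluded;
    }
}
```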

Instrumentation occurs at runtime as classes are loaded. Instrumentation is performed by “agents”, which are packaged in a jar file. If you are not familiar with agents, you can check out the java.lang.instrument package’s javadoc documentation. The entry point into an agent is the agent class. Here are the possible method signatures for the agent entry points:

	public static void premain(String agentArgs, Instrumentation inst);

	public static void agentmain(String agentArgs, Instrumentation inst);

The premain method is invoked when the agent is specified using the “-javaagent” parameter of the JVM command line at start up. The agentmain method is invoked if the agent is attached to an existing JVM. Our agent will be most useful if it is applied at start up. Conceivably, you could attach the agent after you have discovered a memory leak, but it can only provide memory leak data that it has recorded since it was attached. Further, it will only instrument classes that get loaded after it has attached. It is possible to force the JVM to redefine classes, but our component will not provide this functionality.

Let’s call our agent class HeapsterAgent and give the above methods a body each:

	public static void premain(String agentArgs, Instrumentation inst) {
		configure(agentArgs, inst);
	}

	public static void agentmain(String agentArgs, Instrumentation inst) {
		configure(agentArgs, inst);
	}

The initialization process for both entry points will be identical, so we’ll cover them in a single configure method. We’ll skip over most of the Configuration implementation details to focus on instrumentation. We’ll want our class to implement the java.lang.instrument.ClassFileTransformer interface. When a ClassFileTransformer is registered with the JVM, it is given the opportunity to modify classes as they are loaded. Our HeapsterAgent class now has this signature:

public class HeapsterAgent implements ClassFileTransformer

The configure method needs to register an instance of HeapsterAgent with the JVM in order to intercept the classes being loaded. Here is the code:

    inst.addTransformer(new HeapsterAgent());

“inst” is the Instrumentation parameter of the configure(…) method.

Clever readers might already be thinking “How will the application classloaders discover the new Recorder class?”. There are a couple of solutions to this problem. We could use the Instrumentation class’ appendToBootstrapClassLoaderSearch(JarFile jarfile) method to append the relevant classes to the boot classpath, where classes should be discoverable by the application classloaders. However, in order to discover leaked classes, the ClassLoader class itself must be instrumented. This can only be done effectively by creating your own jar containing java.lang.ClassLoader and superseding the JRE’s own java.lang.ClassLoader using the -Xbootclasspath/p parameter. Hence, we may as well pack the other supporting classes in the same jar.

Let’s now move on to the transform method. This method is provided in the ClassFileTransformer interface. Here is the full signature:

public byte[] transform(ClassLoader loader, String className,
            Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
            byte[] classfileBuffer) throws IllegalClassFormatException;

This is where the instrumentation magic starts happening. This method is invoked every time the JVM loads a class. The most important parameters for us are className and classfileBuffer. className will help us determine if the class is an application class, and classfileBuffer is the byte array of bytecode that we may wish to modify. Let’s take a look at how we’ll determine which classes to modify. Obviously we only want to modify application classes, so we’ll compare the className parameter to our inclusions and exclusions. Keep in mind that className is in its internal format, which uses slashes (/) instead of dots (.) as name separators. We also don’t want to instrument our agent code. We can control that by comparing the package path of className with our own code base. Lastly, while developing this code, I isolated several Oracle classes that should simply never be instrumented (there are probably more). However, in general, if you are looking for leaks in your application, you can probably ignore java.*, javax.*, sun.*, etc. I have hard-coded some of these into the transform method. If you think there is a bug in Oracle code, you can always disable this filtering. However, I recommend that you be sparing with the Oracle code you instrument from core packages like java.lang. It is very unlikely that you are the first to find a bug in these classes, and you can send your JVM into an unrecoverable tailspin.
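The name handling implied above can be isolated into a small helper. This is an illustrative sketch (the class and method names are mine, not the article's): convert the internal slash-separated name the JVM supplies into the dotted form the inclusion patterns use, and short-circuit on core packages that should never be instrumented:

```java
// Helper logic assumed by the transform method, shown in isolation.
public class Names {
    // "java/lang/String" -> "java.lang.String"
    public static String toDotName(String internalName) {
        return internalName.replace('/', '.');
    }

    // Core packages we never instrument; touching java.lang can wedge the JVM.
    public static boolean isCoreClass(String internalName) {
        return internalName.startsWith("java/")
            || internalName.startsWith("javax/")
            || internalName.startsWith("sun/");
    }
}
```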

The last part of the transform method is the actual transformation. Here is the important code:

    public byte[] transform(ClassLoader loader, String className,
        Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
        byte[] classfileBuffer) throws IllegalClassFormatException {

        if (className.startsWith("java") || className.startsWith("sun")) return null;

        String dotName = className.replace('/', '.');
        if (!isAgentClass(dotName) && configuration.isIncluded(dotName)) {
            ClassWriter writer = new ClassWriter(ClassWriter.COMPUTE_MAXS);
            HeapsterTransformer transformer = new HeapsterTransformer(writer);
            ClassReader reader = new ClassReader(classfileBuffer);
            reader.accept(transformer, ClassReader.EXPAND_FRAMES);
            return writer.toByteArray();
        }
        else return null;
    }

If the fully qualified class name starts with “java” or “sun”, we return null. Returning null is your agent’s way of saying “I am not interested in transforming this class”. Next we check to see if className matches an agent class by calling isAgentClass(…). Here is the implementation:

boolean isAgentClass(String className) {
    // one prefix covers both the agent's own classes and its repackaged ASM classes
    return className.startsWith("ca.discotek.");
}

You’ll notice in the above code snippet that I have changed the base package name for the ASM classes from org.objectweb.asm to a package under the agent’s own namespace. The agent classes will be made available on the boot classpath. If I had not changed the package namespace of the agent’s ASM classes, other tools or applications running in the JVM might unintentionally use the agent’s ASM classes.

The rest of the transform method is fairly basic ASM operations. However, we now need to take a close look at how the HeapsterTransformer class works. As you might guess, HeapsterTransformer extends the ClassVisitor class and overrides the visit method:

	public void visit(int version, int access, String name, String signature, String superName, String[] interfaces) {
		super.visit(version, access, name, signature, superName, interfaces);
		this.className = name;
		this.superName = superName;
	}

It records the class name and the super class name for later use.

The visitMethod method is also overridden:

	public MethodVisitor visitMethod
	    (int access, String name, String desc, String signature, String[] exceptions) {
		return new HeapsterMethodVisitor
		    (name, access, desc, super.visitMethod(access, name, desc, signature, exceptions));
	}

It returns our own MethodVisitor, called HeapsterMethodVisitor, to be visited. HeapsterMethodVisitor needs to add some local variables, so it subclasses ASM’s LocalVariablesSorter. The constructor parameters include the method name, which it records for later use. The other methods that HeapsterMethodVisitor overrides are visitMethodInsn, visitTypeInsn, and visitIntInsn. One might think we could get it all done in visitMethodInsn by adding code when we see an invocation of a constructor (&lt;init&gt;), but unfortunately, it is just not that simple. First, let’s review what we are trying to accomplish. We want to record each time an application object is instantiated. This can happen in a number of ways. The most obvious one is via a “new” operation. But what about Class.newInstance(), or an object deserialized from an ObjectInputStream via a readObject method? These paths do not use the “new” operator. Also, what about arrays? Creating an array is not a visitMethodInsn instruction, but we’ll want to record them too. Needless to say, assembling the code to capture all these events is tricky.
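The creation paths listed above can all be exercised in one runnable demo. This sketch is mine, not the article's code; its point is that an object can come into existence through several different bytecode shapes, each of which the agent must intercept separately:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// One object creation per mechanism: new, reflection, clone, arrays,
// and deserialization. Only the first compiles to a NEW instruction.
public class CreationPaths {
    static class Point implements Cloneable, Serializable {
        int x;
        public Point clone() throws CloneNotSupportedException { return (Point) super.clone(); }
    }

    public static void main(String[] args) throws Exception {
        Point a = new Point();                                        // NEW + <init>
        Point b = Point.class.getDeclaredConstructor().newInstance(); // reflective creation
        Point c = a.clone();                                          // clone: no constructor runs
        Point[] d = new Point[3];                                     // ANEWARRAY
        int[][] e = new int[2][2];                                    // MULTIANEWARRAY

        // Deserialization: readObject materializes an object without "new".
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(a);
        out.flush();
        Point f = (Point) new ObjectInputStream(
            new ByteArrayInputStream(bytes.toByteArray())).readObject();

        System.out.println(a != null && b != null && c != null
            && d.length == 3 && e.length == 2 && f != null);
    }
}
```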

Let’s first take a look at the visitMethodInsn method. Here is the first statement:

    if (opcode == Opcodes.INVOKESPECIAL && name.equals("<init>") &&  
        !isIgnorableConstructorCall(className, methodName, owner, superName)) {

Opcodes.INVOKESPECIAL indicates a call to a constructor, a private method, or a superclass method. We only care about calls to constructors. Additionally, we don’t care about all constructor calls. Specifically, we only care about the call to the first constructor, not the chain of constructor calls from a constructor to its super class constructor. This is why it was important to record the superName earlier. We use a method called isIgnorableConstructorCall to determine whether to instrument or not:

    boolean isIgnorableConstructorCall(String className, String containingMethodName, String owner, String superName) {
        if (owner.equals(className) && containingMethodName.equals("<init>")) return true;
        else if (owner.equals("java/lang/Object")) return true;
        else return superName.equals(owner);
    }

The first if-statement checks whether a constructor is calling another constructor within the same class (e.g. this(…)). The second checks whether the constructor being called belongs to java.lang.Object, which every constructor chain eventually reaches. Handling that case first also prevents a NullPointerException in the third line, which checks for a constructor call to the direct super class (e.g. super(…)): java.lang.Object itself has a null super class name.
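The decision table above is easy to check in isolation. This stand-alone copy of the method (the surrounding class name is mine) uses the internal slash-separated names that ASM provides:

```java
// Stand-alone rendering of isIgnorableConstructorCall for testing the cases:
// this(...) delegation, the java/lang/Object constructor, and super(...) calls.
public class ConstructorFilter {
    static boolean isIgnorableConstructorCall(
            String className, String containingMethodName, String owner, String superName) {
        // this(...) delegation inside the same class
        if (owner.equals(className) && containingMethodName.equals("<init>")) return true;
        // every constructor chain ends at java/lang/Object's constructor
        if (owner.equals("java/lang/Object")) return true;
        // super(...) call up to the direct superclass
        return superName != null && superName.equals(owner);
    }
}
```

Only a call that falls through all three checks, such as `new Baz()` appearing inside some unrelated method, is instrumented with a Recorder.record(…) call.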

Now that we have established which constructors we can ignore, let’s get back to visitMethodInsn. Once the constructor invocation has been completed, we can record the object:

    mv.visitMethodInsn(opcode,  owner,  name,  desc);
    addRecorderDotRecord();

The first line is identical to the original bytecode instruction. The second line calls addRecorderDotRecord(). This method contains the bytecode to call our Recorder class. We’ll reuse this several times, so it is in its own method. Here is the code:

    void addRecorderDotRecord() {
        mv.visitMethodInsn
            (Opcodes.INVOKESTATIC, Recorder.class.getName().replace('.', '/'), "record", "(Ljava/lang/Object;)V");
    }

This should all appear fairly straightforward if you understand ASM, but there is one unexplained omission that should be obvious to a bytecode expert. Java is stack based. When we called the original method:

    mv.visitMethodInsn(opcode,  owner,  name,  desc);

…it popped the new object off the stack. But addRecorderDotRecord‘s instruction expects the new object to still be on the stack, and when it completes, it pops the new object off. This won’t make sense until we examine the rest of the overridden methods. Let’s skip down to visitTypeInsn(…). Here is the first half:

    public void visitTypeInsn(int opcode, String type) {
        mv.visitTypeInsn(opcode,  type);
        if (opcode == Opcodes.NEW) {
            addDup();
        }

A visitTypeInsn with Opcodes.NEW as a parameter will immediately precede a call to the object’s constructor. Furthermore, the JVM specification disallows calling other methods on an object before it is initialized. By acting in visitTypeInsn first and visitMethodInsn second, we are able to add an extra reference to the object on the stack, which can be used as the parameter to our Recorder.record(…) method.

Now let’s take a look at the else-if-statement of visitMethodInsn. The methods newInstance, clone, and readObject are special methods that can instantiate an object without using the “new” operator. When we come across these methods, we create a duplicate of the object reference on the stack (using addDup()) and then call our Recorder.record(…) method, which will pop our duplicate object reference off the stack. Here is the addDup() method:

    void addDup() {
        mv.visitInsn(Opcodes.DUP);
    }

We have already partially examined the visitTypeInsn method, but let’s now review it in its entirety:

    public void visitTypeInsn(int opcode, String type) {   
        mv.visitTypeInsn(opcode,  type);
        if (opcode == Opcodes.NEW) {
            addDup();
        }
        else if (opcode == Opcodes.ANEWARRAY) {
            addDup();
            addRecorderDotRecord();
        }
    }

The first line of this method ensures the original instruction is executed. We have already discussed the if-statement, which is used to duplicate an object instantiated with the “new” operator before we add the call to Recorder.record(…). The else-if-statement handles creating 1-dimensional arrays of non-primitive types. In this case we add a duplicate array object reference on the stack and then call Recorder.record(…) which pops it off.

Next we have visitMultiANewArrayInsn:

    public void visitMultiANewArrayInsn(String desc, int dims) {
        mv.visitMultiANewArrayInsn(desc,  dims);
        addDup();
        addRecorderDotRecord();
    }

This method is fairly simple to understand. The first line creates a new array of multiple dimensions. The second line pushes a duplicate reference on the stack and the third line calls our Recorder.record(…) method which pops the duplicate off the stack.

Lastly, we have visitIntInsn:

        public void visitIntInsn(int opcode, int operand)  {
            mv.visitIntInsn(opcode,  operand);

            if (opcode == Opcodes.NEWARRAY || opcode == Opcodes.MULTIANEWARRAY) {
                addDup();
                addRecorderDotRecord();
            }
        }

This method handles the bytecode operation for creating an array of primitives (and defensively checks for multi-dimensional arrays). The if-statement identifies these operations; after the original instruction has executed, its body duplicates the array object reference on the stack and calls Recorder.record(…), which pops it off.

Let’s now change gears and review the ASM code that generates our custom java.lang.ClassLoader class. As mentioned earlier, we need to define our own java.lang.ClassLoader to record classes as they are defined. There is a ClassLoaderGenerator class, which does the grunt work of extracting the java.lang.ClassLoader class from the rt.jar file of our target JRE, but let’s drill into the ASM code in ClassLoaderClassVisitor. Much of the code in this class is not particularly interesting. Let’s go right to the visitMethodInsn method of its MethodVisitor class:

        public void visitMethodInsn
            (int opcode, String owner, String name, String desc, boolean isInterface) {
            mv.visitMethodInsn(opcode,  owner,  name,  desc, isInterface);

            if (opcode == Opcodes.INVOKEVIRTUAL &&
                includedMethodNameList.contains(name)) {
                System.out.println
                    ("Instrumenting method invocation: " + owner + "." + name + ": " + desc);

                int variableIndex = newLocal(Type.getType(Class.class));
                visitVarInsn(Opcodes.ASTORE, variableIndex);
                visitVarInsn(Opcodes.ALOAD, variableIndex);
                addRecorderDotRecord();
                visitVarInsn(Opcodes.ALOAD, variableIndex);
            }
        }

The third line invokes the original instruction. The if-statement identifies the instruction as a defineClass method. The defineClass methods (namely, defineClass0, defineClass1, defineClass2) are native methods that return the java.lang.Class object. By capturing these calls, we can capture when classes are created. The final instructions create a local variable to store the java.lang.Class object, insert the call to Recorder.record(…), and place the Class back on the stack. FYI, in other ASM code, I used a dup instruction to duplicate the reference, but when I ran the code it didn’t cooperate, which led me to use a local variable instead.

We have now covered all the required instrumentation. The other main concept to document is the use of PhantomReferences. We have already discussed how PhantomReferences can inform us when an object is garbage collected, but how does this help us track memory leaks? If we use PhantomReferences to reference each application object, we can eliminate objects as leaky if they are regularly garbage collected. The remaining set of objects becomes our candidate set. If we can observe a trend of increasing object counts for a particular type, it is quite likely that we have identified a leak. Note that trends that persist beyond major garbage collections are even more likely to be leaky. However, this code does not consider garbage collections at this time.
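The trend analysis described above reduces to bookkeeping: a per-type counter incremented on allocation and decremented when the object's PhantomReference is enqueued. This is a sketch of that idea under my own names, not the article's data store (which also records timestamps for graphing):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Live-instance counter per type. A counter that only climbs across samples
// marks that type as a leak candidate.
public class LiveCounts {
    private final Map<String, Long> counts = new ConcurrentHashMap<>();

    public void allocated(String type)        { counts.merge(type, 1L, Long::sum); }
    public void garbageCollected(String type) { counts.merge(type, -1L, Long::sum); }
    public long live(String type)             { return counts.getOrDefault(type, 0L); }
}
```

In the LeakingInterceptor example, the ImageEntry counter would climb by one per request and never fall, exactly the trend the graph makes visible.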

We’ll now return to the Recorder class to examine the PhantomReference functionality. The record method has the following code:

    long classId = dataStore.newObjectEvent(o, System.currentTimeMillis());
    PhantomReference<Object> ref = new PhantomReference<Object>(o, queue);
    map.put(ref, classId);

The first line references a variable called dataStore. The data store is an implementation detail. I have implemented an in-memory data store, but I want to focus on PhantomReferences, so we’ll ignore these details for now. dataStore is an instance of BufferedDataStore, which has the following method signature:

    public long newObjectEvent(Object o, long time) throws NameNotFoundException;

This method takes the newly instantiated object as a parameter, along with the time the object was created. The method returns a long value representing a unique identifier for the object’s type. The next step is to create the PhantomReference. We pass a ReferenceQueue into the PhantomReference‘s constructor, which registers the reference to be enqueued when its referent is garbage collected. Lastly, we store the reference and its associated class id in a map. These lines will make more sense after looking at the code that listens to the queue:

    static class RemoverThread extends Thread {
        public RemoverThread() {
            setDaemon(true);
        }

        public void run() {
            while (alive) {
                try {
                    PhantomReference ref = (PhantomReference) queue.remove();
                    Long classId = map.remove(ref);
                    dataStore.objectGcEvent(classId, System.currentTimeMillis());
                }
                catch (InterruptedException e) {}
            }
        }
    }

This is a class that is defined inside the Recorder class. It is a daemon thread, which means that it will not prevent the JVM from exiting even if it is still alive. The run method contains a while loop that will run forever unless the stop method is called to change the alive property. The ReferenceQueue.remove() method blocks until there is a PhantomReference to remove. Once a PhantomReference appears, we look up the classId from the map. Then we record the event by calling the dataStore‘s objectGcEvent method.

We have now covered how to instrument application classes to insert the Recorder.record(…) method, how to create PhantomReferences for these objects, and how to respond to their garbage collection events. We now have the ability to record when objects are created and when they are garbage collected. Once this core functionality is established, you can implement your leak detector in a variety of ways. This code base uses an in-memory data store. This type of data store consumes the same heap space as your application, so it is not recommended for long term leak detection (in other words, it is a memory leak itself!). A more sensible option for long term detection would be to store the data in a real database.

Another aspect of this leak detector is the approach to identifying a leak. Some leak detectors may tell you “I found a leak!”, but this one does not. It provides you with a graph of the top leak candidates and lets the user evaluate which objects are in fact leaky. This means you have to be proactive about discovering leaks. However, this code could easily be improved upon. You could develop an algorithm for isolating memory leaks that informs the user reactively. There are also other possible improvements. The user interface for this tool is a line graph of the objects with the most potential for leaking. It is quite normal for object counts to climb when there is plenty of memory. Hence, one improvement would be to record and draw the major garbage collections on the graph. Knowing that potentially leaky objects survive garbage collections is a very good indicator of a memory leak.
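One way to realize that improvement, offered as a sketch rather than part of the article's code, is to sample the JVM's garbage collector MXBeans and record collection counts alongside the object counts. Which beans represent "major" collections depends on the collector in use (e.g. "PS MarkSweep" for the parallel collector), so this sketch just totals all of them:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Samples cumulative GC counts via the platform MXBeans; a graphing layer
// could record this alongside object counts to mark collections on the graph.
public class GcSampler {
    public static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = bean.getCollectionCount(); // -1 if undefined for this bean
            if (count > 0) total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("collections so far: " + totalCollections());
    }
}
```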

There is a lot of code in this project we have not covered. For instance, the code to find trends, to graph the data, or how to query the data was not covered. The intent of this article was to demonstrate how memory leak data can be collected, which makes unrelated code outside of the scope. However, there is one more topic we’ll cover, which is how to configure and run this software.

The type of data store used by the agent in the target JVM is determined by a data-store-class property in the configuration file. For now, there is only an in-memory implementation, ca.discotek.heapster.datastore.MemoryDataStore, for storing leak data. As is, this is a terrible idea for long-term use because it is a leak itself: it has no eviction policy and will eventually cause an OutOfMemoryError. When the MemoryDataStore initializes, it sets up a server socket, which clients can use to request data. The MemoryDataStore uses the configuration file to get the server port number. It also uses it to set the log level (which you probably don’t need to adjust, but valid values are trace, info, warn, and error). The inclusion property is a Java regular expression used to specify the application classes to be instrumented for memory leak detection. You can also specify an exclusion property to exclude namespaces matched by the inclusion property.
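The missing eviction policy could be as simple as a size-bounded map that drops its oldest entries, trading complete history for a fixed memory footprint. This is a sketch of the idea, not the MemoryDataStore's code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A LinkedHashMap in access-order mode that evicts its least-recently-used
// entry once a size cap is exceeded, bounding the store's heap usage.
public class BoundedStore<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedStore(int maxEntries) {
        super(16, 0.75f, true); // access-order: evict least-recently-used first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // called after each put
    }
}
```

Capping the store this way does mean long-lived leak trends could age out of the window, which is why a real database remains the better option for long-term detection.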

To connect to your server, you’ll need to run a client. There is a generic ca.discotek.heapster.client.gui.ClientGui, which looks up the client-class property in a configuration file. Since our agent is configured to use the MemoryDataStore class, we want our client to use the ca.discotek.heapster.client.MemoryClient to connect to the MemoryDataStore server. The MemoryClient class looks up the server port in the configuration file. To keep my configuration simple, I put both the server and client properties in one test.cfg configuration file. If your target JVM is on a different machine than your client, you’d have to have separate configuration files. Here is what I have been using:


The inclusion property specifies the namespace of my test app called LeakTester, which can create various types of objects in various ways. Here’s a screen shot:

In order to override the JVM’s java.lang.ClassLoader, we’ll generate our own bootstrap jar and use the -Xbootclasspath/p JVM flag to prepend our bootstrap jar to the bootclasspath. We will have to perform this task for every JRE version of the target JVM. There may be internal API changes between versions that would break compatibility if you used the generated ClassLoader class from JRE X with JRE Y.

Let’s assume you have downloaded the memory leak distribution and extracted it to /temp/heapster. Let’s also assume that your target JVM’s JRE version is 1.6.0_05. First, we’ll create the directory /temp/heapster/1.6.0_05 to house the jar we are about to generate. Next, we’ll run the following command:

java -jar /temp/heapster/heapster-bootpath-generator.jar /java/jdk1.6.0_05/jre/lib/rt.jar /temp/heapster/1.6.0_05

The second and third program arguments specify the location of the rt.jar of the target JVM and the location where you want to store the generated jar. This command will generate a heapster-classloader.jar in /temp/heapster/1.6.0_05.

Assuming you want to run the LeakTester app bundled with this project, you would run the following command:

/java/jdk1.6.0_05/bin/java -Xbootclasspath/p:/temp/heapster/1.6.0_05/heapster-classloader.jar -javaagent:/temp/heapster/discotek-heapster-agent.jar=/temp/heapster/config/test.cfg ca.discotek.testheapster.LeakTester

Next up, let’s run the client:

/java/jdk1.6.0_05/bin/java -classpath /temp/heapster/discotek-heapster-client.jar;/temp/heapster/discotek-graph-1.0.jar ca.discotek.heapster.client.gui.ClientGui /temp/heapster/config/test.cfg

You should now see a window similar to the screen shot above. However, that screen shot uses the Plumbr sample application, not my LeakTester app. If you’d like to see a graph of the Plumbr sample app, you can do the following:

  1. Get the Plumbr sample app running as per their instructions.
  2. Open the demo/start.bat file in an editor.
  3. In the Java command line toward the bottom, replace -agentlib:plumbr -javaagent:..\..\plumbr.jar with -Xbootclasspath/p:/temp/heapster/1.6.0_05/heapster-classloader.jar -javaagent:/temp/heapster/discotek-heapster-agent.jar=/temp/heapster/config/test.cfg
  4. Save your changes.
  5. Open /temp/heapster/config/test.cfg in an editor.
  6. Change the inclusion property to inclusion=.*petclinic.*
  7. Save your changes.
  8. Run the Plumbr sample app as you did before.
  9. Start the ClientGui using the exact same command line as we used for the LeakTester scenario.

Note: you can generate traffic with the Plumbr demo in two ways: 1. use create_usage.bat to drive traffic with JMeter, or 2. open the app in a browser (http://localhost:18080). I recommend the browser, so you can control the traffic and observe the consequences of each page refresh.

This article has been an introduction to how one might find memory leaks with instrumentation and PhantomReferences. It is not meant to be a complete product. The following features could be added to improve the project:

  1. Indicate major garbage collections on the graph
  2. Allow stack traces to be collected when leaky objects are instantiated, to reveal the buggy source code
  3. Store the instantiation and garbage collection data in a database to avoid the application becoming a memory leak itself
  4. Have ClientGui graph the available heap and permgen (similar to JConsole), which might be useful to cross-reference against the object graphs
  5. Provide a mechanism to expunge instantiation and garbage collection data

If you liked this article and would like to read more, see the other byte code engineering articles on this site, or follow me on Twitter to be notified when my next article is ready, on how to instrument classes to gather performance statistics!


Posted in Byte Code Engineering

Web Security: Interview with an Expert

Most developers probably know a bit about security, but naively think they know enough to keep their apps secure. I don’t like to admit it, but I was in this camp until about a year ago. Unless terms like OWASP, Cross-site Scripting, Injection, Mitigating Factors, Dynamic and Static Code Analysis, and Penetration Testing are part of your vocabulary, you are probably in this camp too.

The topic of web security has never been more relevant with the recent hacking scandals, including the Impact Team hackers who stole data and posted it to the internet. This past week I had the chance to catch up with a web security expert colleague of mine, Sherif Koussa. Sherif has been in the security business for about eight years, working for big-name clients such as Wells Fargo, Desjardins, Carleton University, Discover, and various Canadian Government clientele. He is also a member of the Steering Committee for the GIAC‘s GSSP-JAVA and GSSP-NET Exams. In addition, Sherif has authored courseware for a few SANS courses and GIAC exams. If that wasn’t enough, Sherif also leads the Ottawa Chapter of OWASP and was the main force behind OWASP’s WebGoat 5.0. In other words, when Sherif talks security, I listen… and so should you. To this end, I am publishing our exchange so others can benefit from his insight.

What are the most common attacks?

This really depends on the lens you look through. From an application security perspective, the OWASP Top 10 still seems to be a true reflection of the vulnerabilities we find looking at different applications. Focusing on Java applications specifically, beyond injection attacks like SQL Injection and Command Injection, Cross-site Scripting in particular seems to be more widespread in Java applications than in other languages like .NET, because Java lacks the automated platform support that platforms like .NET or Ruby on Rails have.

From a different perspective, malware seems to be an increasingly dangerous and ever more complicated problem. Attackers use common delivery methods like email attachments, email links, games, and other means to deliver their payloads, and these payloads usually attack well-known vulnerabilities that have not yet been patched, or zero-days for which no patch is available.

Are there any languages/platforms that are particularly vulnerable?

Yes, there are platforms that offer more security protection and features out of the box. Newer platforms seem to be better than older ones. For example, the node.js stack seems to offer more than Ruby on Rails, which seems to offer more than .NET, which seems to offer more than Java, etc. However, this is not always an advantage, because there are always ways to turn off these features or work around them. To put things into perspective a bit, .NET comes with some protection against Cross-site Scripting out of the box. It is not great, but it is good enough to disable some cross-site scripting attack vectors, yet we see so many developers disabling it because it messes up some of their features. As another example, .NET offers a very simple CSRF mitigation that is simply not used by a lot of developers. Even in Java, it is very common to find developers using PreparedStatements in 99% of the cases, but guess what: attackers will find the 1% of statements that use String concatenation instead of PreparedStatements, exploit that, and gain unauthorized access to sensitive information or even compromise the whole network. Keep in mind, defenders are always at a disadvantage. They have to fix all the vulnerabilities, while attackers just need one vulnerability. Attackers have more time, tools, and skills than developers could ever have. Developers start to have an advantage when they understand what they are protecting, what they need to do to protect it, what the platform offers out of the box, and what they need to do to bridge the gap. Combined with understanding attackers’ ways and mindset, only then can they take back the advantage and control their applications and data rather than being controlled.
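The 1%-of-statements problem Sherif describes is easy to see with a toy example of my own (the table and input are made up). Naive string concatenation lets crafted input escape the string literal and change the query, whereas a PreparedStatement with setString would keep the same input as plain data:

```java
public class InjectionDemo {
    // The "1% case": building SQL by concatenation instead of a PreparedStatement.
    static String naive(String name) {
        return "SELECT * FROM users WHERE name = '" + name + "'";
    }

    public static void main(String[] args) {
        // A crafted input closes the string literal and injects a tautology:
        System.out.println(naive("x' OR '1'='1"));
        // SELECT * FROM users WHERE name = 'x' OR '1'='1'
        // With conn.prepareStatement("... WHERE name = ?") and ps.setString(1, input),
        // the same input would remain an ordinary value, not SQL.
    }
}
```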

What advice do you have for IT shops who are not application security experts?

While not all IT shops have internal application security expertise, this does not mean they can’t write secure applications. Secure code is comparable to quality code. To release quality, bug-free code, you have to have developers who write good code, but you also have to have good quality assurance engineers who test that code, or a really solid, extensive set of test cases; you can’t just code and release. You can’t depend on the quality assurance engineers to find everything. The same goes for security. Software developers have to write secure code – use prepared statements, filter and cleanse user input, etc. Then application security experts pick up where the developers left off. The role of security experts is to find the vulnerabilities that slipped through and to test your applications against the latest emerging threats and attacks. To build a solid internal security practice, a company should start by increasing awareness of security attacks and the risk they pose to the organization. Afterward, some simple steps should be followed:

  1. Add low-maintenance security touchpoints to the SDLC, such as a security design review, a light-weight security checklist, or a security-specific static code analysis tool.
  2. Gradually raise the bar for security bugs; this could happen through following customized security guidelines or through the company’s regular code reviews.
  3. Appointing a security champion whom people can ask questions has helped a lot of organizations.

These are just examples of activities that an organization could do without hiring extra staff or budget. Of course, how you go from there depends on the dynamics of each organization, where they are now, and where they want to go.

Are you working on anything other than consulting?

One of the cool things we are working on right now is a tool that helps organizations compare the risk that comes from using different open-source components and libraries. If you are choosing a new piece of open-source software, it also helps you pick the most secure option, understand where the risk comes from, and ultimately how to deal with it.

Lastly, can you describe a typical day in the life of a security consultant?

Part of the fun and challenge of being a security consultant is the variety of work. Most clients require assessment work, but they all have differing technology stacks (i.e., varying infrastructure, programming languages, frameworks, web vs. mobile, etc.). The assessment work usually includes secure code reviews (which include static code analysis and manual code review) and dynamic assessments (which include vulnerability scanning and penetration testing). In addition to the assessment work, I also carry out my own security research, create and update training materials, and of course, deliver these training materials.

Sherif can be reached through his consulting company Software Secured, or via his LinkedIn profile.

Posted in web security

JRebel Unloaded


Welcome to the second installment of the series on byte code engineering. The first article, an overview of byte code engineering, can be found here.

JRebel is indisputably the industry-leading class reloading software. It is a useful product that has earned its reputation by helping to expedite Java development for many organizations. How this product works is a mystery to most. I’d like to explain how I think it works and provide a basic prototype (with source code).

Since the adoption of application servers to insulate business logic from generic plumbing logic, developers have been suffering through the time-consuming process of building and redeploying before testing server-side code changes. The larger the application, the longer the build/redeploy cycle tends to be. For a developer who tests frequently, the time spent building and redeploying can consume a significant part of a work day. The actual cost to a project can be equated to the number of developers * salary per hour * number of hours spent building and redeploying. This figure does not have to be just a cost of doing business.
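To make the formula concrete, here is a back-of-the-envelope calculation with entirely made-up numbers (pick your own team size, rate, and redeploy time):

```java
public class RedeployCost {
    // cost = developers * hourly rate * hours spent redeploying per day * work days
    static double yearlyCost(int developers, double hourlyRate,
                             double hoursPerDay, int workDays) {
        return developers * hourlyRate * hoursPerDay * workDays;
    }

    public static void main(String[] args) {
        // Illustrative only: 10 developers at $60/hour losing 30 minutes
        // a day to build/redeploy cycles over a 230-day work year.
        System.out.println(yearlyCost(10, 60.0, 0.5, 230)); // 69000.0
    }
}
```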

Some time ago, when I was exploring instrumentation, I wrote a product called Feenix, which I thought would help people overcome the same class reloading limitations as JRebel does, but that didn’t happen. The product still exists on my web site, but I doubt anyone actually uses it. For now, I keep it there as a painful reminder of my failure, which should inspire me to build a better one. I didn’t understand why my product failed until Anton Arhipov (who has been directly involved with JRebel) provided some insightful criticism:

Feenix can do as much as the Java Instrumentation API allows it to do. Which basically means it doesn't really add value on top of standard HotSwap of the JVM.

There are several products that provide a mechanism to modify class functionality in a running JVM, but they are not all created equal. Probably the best known is Java’s built-in hotswap, which IDEs like Eclipse take advantage of in debug mode. Others, like Feenix, take advantage of Java’s built-in instrumentation API. Due to limitations of the JVM, most of these attempts fall short. Specifically, the JVM limits the types of changes allowed to a loaded class. For instance, it will not allow you to change the class schema: you cannot change the number of fields or methods or their signatures, and you cannot change the inheritance hierarchy. Nor can these tools alter the behavior of existing objects. Unfortunately, this dramatically diminishes their utility.

Enter JRebel. JRebel appears to be the most functional and praised class reloading product in the marketplace. It has very few shortcomings and appears to be extremely well supported. JRebel is a commercial product and is likely prohibitively expensive for most developers who pay for tools out of their own pockets. The JRebel team has published some articles discussing how they have solved various class reloading problems, but as a commercial product they naturally do not discuss implementation in detail. Knowing the details may lead to an alternative open source product. If there is enough interest, I’ll integrate JRebel-style class reloading into Feenix and open source it.

Creating a class reloading mechanism (CRM) must solve several problems:

  1. The CRM must be aware of where the new versions of classes are located. These classes may be on a local disk or in a remote location. They may be bundled in a jar, war, or ear.
  2. While not technically class loading, the CRM should also support the reloading of non-class resources like images or html files.
  3. The CRM should ensure that when a classloader loads a class for the first time, it loads the latest version. Even if a class has already been loaded by a classloader, the CRM should ensure new instances of the class use the functionality of its latest version.
  4. The CRM should ensure that existing objects use the functionality of the latest version of their class.
  5. While class reloading is clearly the core functionality required by any CRM, there are common frameworks used in many applications whose re-configuration would require a build/redeploy cycle. These changes ought to be less frequent than code changes, but there is still value in providing reload functionality of this kind.

The fourth problem above dwarfs the others in terms of complexity, but also usefulness. It is less expensive for application servers to reuse pooled objects than to always create new instances. Unless a CRM can make pooled instances aware of class changes, it will serve very little purpose. The JRebel developers claim to use “class versioning” to solve these problems, but leave much room for interpretation of the implementation. We know that class loaders may only load a class once. The exception to this rule is instrumentation, but we know this isn’t how JRebel has solved the problem (mainly because they are open about it, but also because instrumentation will not allow the class schema to be changed). Another approach to CRM design is commonly known as “throw-away classloaders”, which uses a new class loader to load each new version of a class. This design has many drawbacks, but above all it cannot introduce new functionality to existing objects.

To introduce new functionality to existing objects, their execution must be forwarded to a method which contains the new functionality. As a class loader may load a given class only once, the new functionality must be hosted in a class with a new, unique name. However, a class cannot know the name of its successor at compile- or run-time. We can use instrumentation to modify a class as it is loaded, but we won’t know the names of its successors until the CRM detects new compiled classes and makes them available to the JVM. Two mechanisms could be used to forward execution to a successor: reflection or an interface. Reflection can inspect a class’ methods and invoke the method with the matching name and signature, but reflection is known to be slow and is not suitable for every method invocation. Alternatively, an interface could be created which defines a method allowing generic invocation of any method in the successor class. Such a method might have the following name and signature:

public Object invoke(int methodId, Object invoker, Object args[]);

If the newer version of a given class implements this interface, execution can be forwarded to the appropriate method. The methodId parameter is used to determine the method. The invoker parameter provides access to the state (fields) of the original object, and the args parameter provides the new method with access to the original method’s arguments.
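To see why the reflective alternative is workable but unattractive, here is a simplified forwarder of my own (not Feenix code; the successor class, its extra "self" parameter, and the name/arity matching are illustrative). It pays the reflection cost on every call, which is exactly why the interface approach above wins:

```java
import java.lang.reflect.Method;

public class ReflectiveDispatch {
    // Hypothetical successor class: each method takes the original
    // instance as an extra leading parameter, standing in for 'this'.
    static class PrinterV2 {
        public String printMessage(Object self, String message) {
            return "v2: " + message;
        }
    }

    // Scan the successor for a public method with a matching name and
    // arity, then invoke it with the original instance prepended.
    static Object forward(Object successor, String name, Object self,
                          Object[] args) throws Exception {
        for (Method m : successor.getClass().getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == args.length + 1) {
                Object[] all = new Object[args.length + 1];
                all[0] = self;
                System.arraycopy(args, 0, all, 1, args.length);
                return m.invoke(successor, all);
            }
        }
        throw new NoSuchMethodException(name);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(forward(new PrinterV2(), "printMessage",
                new Object(), new Object[]{"hi"})); // v2: hi
    }
}
```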

A working solution has many more moving parts than the above outline. It also introduces two additional problems to solve. Each call to a reloaded object’s method will produce an extra, unexpected frame on the stack, which may confuse developers. And any use of reflection on reloaded classes may not behave properly (given that the class name has changed, an invoke method has been added, the inheritance hierarchy doesn’t exist, etc.). Identifying such problems is as important as providing working solutions. Solving all of the above problems in one article would probably lead to heavy eyelids. Instead, let’s focus on a rudimentary implementation of the class forwarding functionality. We can always revisit the other issues in another article if there is interest.

This article will cover the following functional parts of a class reloading mechanism:

  1. A central component for discovering and managing class versions
  2. Generate a successor class and the interface to reference it
  3. Modify an application class to forward method calls to its successors
  4. Modify java.lang.ClassLoader to install the above functionality

Before diving into the details, I’d like to warn you that I have re-written this article twice. Despite my keen interest in byte code engineering, even I was boring myself to tears writing explanations of the ASM code. Consequently, this third and hopefully final draft will contain much less ASM code than the others. It will focus more on how class reloading works, but you can always refer to the source code in the Resources section to see the implementation details.

Class Reloading Mechanism Design

The Class Version Manager (AKA ClassManager) is going to have several jobs:

  • Load a configuration which specifies the name-space of classes to reload and where to find them
  • Determine if a class version is out-dated
  • Provide the byte code for:
    • the new versions of a given class
    • the generic invokable interface class
    • the interface implementation class (which contains the new functionality)

If I discuss all of the above in detail, this article will be longer than War and Peace. Instead, I’ll gloss over the details that are not directly related to byte code engineering. For detailed information
on the configuration, you can look in ca.discotek.feenix.Configuraton and the static initializer of ca.discotek.feenix.ClassManager. Here is a sample configuration file:

 <feenix-configuration project-name="example">

         <!-- alternatively, you can use jar, war, and ear files -->

         <!--  Use the exclude tag to exclude namespaces. It uses a Java regular expression. -->

 </feenix-configuration>

To specify the location of the configuration file, use the feenix-config system property to specify the fully qualified path.

To determine if a class is outdated, we’ll use the following code found in ca.discotek.feenix.ClassManager:

    static Map<String, Long> classTimestampMap = new HashMap<String, Long>();

    static boolean isOutDated(String className, long timestamp) {
        Long l = classTimestampMap.get(className);
        if (l == null) {
            classTimestampMap.put(className, timestamp);
            return false;
        }
        else {
            classTimestampMap.put(className, timestamp);
            return timestamp > l;
        }
    }
The caller passes in the name of the class and the timestamp of the class they wish to test.
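The semantics are easy to check with a standalone copy of the method: the first sighting of a class records its timestamp and reports “not outdated”; only a strictly newer timestamp reports true.

```java
import java.util.HashMap;
import java.util.Map;

public class OutDatedDemo {
    static Map<String, Long> classTimestampMap = new HashMap<>();

    // Standalone copy of ClassManager.isOutDated for illustration.
    static boolean isOutDated(String className, long timestamp) {
        Long l = classTimestampMap.get(className);
        classTimestampMap.put(className, timestamp); // always remember latest
        if (l == null)
            return false;    // first sighting: nothing to compare against
        return timestamp > l;
    }

    public static void main(String[] args) {
        System.out.println(isOutDated("Printer", 100)); // false: first sighting
        System.out.println(isOutDated("Printer", 100)); // false: unchanged
        System.out.println(isOutDated("Printer", 200)); // true: newer timestamp
    }
}
```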

The last task of the Class Manager is to provide class byte code, but let’s first revisit exactly how classes will be reloaded. One important step is overriding the JVM’s java.lang.ClassLoader class such that it can instrument application classes as they are loaded. Each application class will have the following functionality inserted into the start of each method: if a new class version exists, forward execution to the corresponding method in an instance of that new class. Let’s look closer with a simple example of an application class:

class Printer {
    public void printMessage(String message) {
        System.out.println(message);
    }
}
The above class would be instrumented by our special java.lang.ClassLoader to look something like this:

class Printer {

    static Printer_interface printerInterface = null;

    static void check_update() {
        Printer_interface localPrinterInterface = ClassManager.getUpdate(ca.discotek.feenix.example.Printer.class);
        if (localPrinterInterface != null)
            printerInterface = localPrinterInterface;
    }

    public void printMessage(String message) {
        check_update();
        if (printerInterface != null) {
            printerInterface.invoke(0, this, new Object[]{message});
        }
        else {
            System.out.println(message);
        }
    }
}
The modified version of the Printer class has the following changes:

  • The Printer_interface printerInterface field was added.
  • The check_update method was added.
  • The printMessage method now has the following logic:
    1. Check for a class update
    2. If an update exists, invoke the corresponding method in the new class.
    3. Otherwise, execute the original code

The check_update method calls ClassManager.getUpdate(…), which determines whether an update is available and, if so, generates a new implementation class:

    public static Object getUpdate(Class type) {
        String dotClassName = type.getName();
        String slashClassName = dotClassName.replace('.', '/');

        File file = db.getFile(slashClassName + ".class");
        if (file != null && file.isFile()) {
            long lastModified = file.lastModified();
            if (isOutDated(dotClassName, lastModified)) {
                String newName = slashClassName + IMPLEMENTATION_SUFFIX + getNextVersion(slashClassName);
                byte bytes[] = getClassBytes(newName);
                try { 
                    Method method = ClassLoader.class.getDeclaredMethod("defineMyClass", new Class[]{String.class, byte[].class});
                    Class newType = (Class) method.invoke(type.getClassLoader(), new Object[]{newName.replace('/', '.'), bytes});
                    return newType.newInstance();
                }
                catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }

        return null;
    }

Once getUpdate(…) has called ClassManager.getClassBytes(…) to retrieve the raw bytes representing the class, it uses reflection to call a defineMyClass method in java.lang.ClassLoader. defineMyClass is a method we’ll add later when we generate a custom java.lang.ClassLoader class. To convert raw bytes to a java.lang.Class object, you need access to the defineClass methods in java.lang.ClassLoader, but they are all restricted to protected access. Hence, we add our own public method which forwards the call to a defineClass method. We need to access the method using reflection because it does not exist at compile time.
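The shape of that public hook is easy to demonstrate. Since we can’t edit java.lang.ClassLoader here, the sketch below (my own illustration, with assumed class names) uses an ordinary ClassLoader subclass instead; the generated method forwards to the protected defineClass in exactly the same way:

```java
import java.io.InputStream;

// Stand-in for the modified java.lang.ClassLoader: a public method that
// forwards to the protected defineClass(String, byte[], int, int).
class ExposingClassLoader extends ClassLoader {
    ExposingClassLoader() {
        super(null); // no parent, so delegation can't find our class first
    }

    public Class<?> defineMyClass(String name, byte[] bytes) {
        return defineClass(name, bytes, 0, bytes.length);
    }
}

public class DefineMyClassDemo {
    public static void main(String[] args) throws Exception {
        // Grab the raw bytes of an already-compiled class from the classpath...
        InputStream in = DefineMyClassDemo.class
                .getResourceAsStream("DefineMyClassDemo.class");
        byte[] bytes = in.readAllBytes();

        // ...and turn them into a live Class through the public hook.
        ExposingClassLoader loader = new ExposingClassLoader();
        Class<?> c = loader.defineMyClass("DefineMyClassDemo", bytes);
        System.out.println(c.getClassLoader() == loader); // true
    }
}
```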

The modified Printer class introduces the Printer_interface class, and the ClassManager.getUpdate(…) method introduces the new version of the Printer class, Printer_impl_0, which implements the Printer_interface interface. These classes do not exist on the application classpath; they are generated at run-time. We’ll override java.lang.ClassLoader‘s loadClass methods to call ClassManager.getClassBytes(…), which discovers new versions of our application classes and generates the interface and implementation classes as needed. Here is the getClassBytes(…) method:

    public static byte[] getClassBytes(String slashClassName) {
        if (isInterface(slashClassName)) 
            return InterfaceGenerator.generate(slashClassName, trimInterfaceSuffix(slashClassName));
        else if (isImplementation(slashClassName)) {
            String rootClassName = trimImplementationSuffix(slashClassName);
            File file = db.getFile(rootClassName.replace('.', '/') + ".class");
            if (file != null) 
                return ImplementationGenerator.generate(slashClassName, file);
        }
        else {
            File file = db.getFile(slashClassName + ".class");
            if (file != null) 
                return ModifyClassVisitor.generate(slashClassName, file);
        }

        return null;
    }

There are a lot of implementation details that are not obvious from this method. The isInterface and isImplementation methods examine the class name suffix to make their determinations. If the class name suffix does not match the known interface or implementation suffix formats, the request is for a regular class.

If the requested class is for the interface class that an implementation class implements, InterfaceGenerator.generate(…) is invoked to generate the interface class. Here is the generated interface’s invoke method for the Printer example:

public java.lang.Object __invoke__(int index, ca.discotek.feenix.example.gui.Printer__interface__, java.lang.Object[])

The ImplementationGenerator class is used to generate the class that implements the interface generated by InterfaceGenerator. This class is larger and more complicated than InterfaceGenerator. It does the following jobs:

  1. Generates the raw byte code for a class with a new name. The name will be the same as the original, but with a unique suffix appended.
  2. Copies all methods from the original class, but converts initializer methods to regular methods named __init__ and static initializers to methods named __clinit__.
  3. For non-static methods, adds a parameter whose type is the interface generated by InterfaceGenerator.
  4. Changes non-static methods that operate on this to operate on the parameter added in the previous step.
  5. For constructors, strips out calls to super.<init>, since regular methods cannot call instance initializers.
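Putting those jobs together, the generated pair for the Printer example might look roughly like the hand-written sketch below. This is my own simplification, not Feenix’s actual output: the names, the dispatch switch, and the one-field Printer are all illustrative.

```java
// Sketch of what InterfaceGenerator and ImplementationGenerator produce.
interface Printer_interface {
    Object invoke(int methodId, Object invoker, Object[] args);
}

class Printer {
    String prefix = "> "; // successor methods reach state through 'invoker'
}

class Printer_impl_0 implements Printer_interface {
    public Object invoke(int methodId, Object invoker, Object[] args) {
        switch (methodId) {
            case 0: return printMessage((Printer) invoker, (String) args[0]);
            default: throw new IllegalArgumentException("method id " + methodId);
        }
    }

    // Copied body of Printer.printMessage; 'self' stands in for 'this'.
    Object printMessage(Printer self, String message) {
        return self.prefix + message; // the "new" behavior
    }
}

public class GeneratedPairDemo {
    public static void main(String[] args) {
        Printer_interface successor = new Printer_impl_0();
        System.out.println(successor.invoke(0, new Printer(),
                new Object[]{"hello"})); // > hello
    }
}
```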

The InterfaceGenerator and ImplementationGenerator classes are useless without a way to modify application classes to take advantage of them. ModifyClassVisitor does this job. It adds the check_update method and modifies each method such that it will check for updated class versions and forward execution to them if they exist. It also changes all fields to be public and non-final. This is necessary so they can be accessed by implementation classes. These access modifiers are mostly enforced at compile time, but the changes may still affect applications that use reflection. Solving this problem will have to be put on the to-do list for now, but I suspect it is not all that difficult. The solution probably involves overriding the JRE’s reflection classes appropriately (by the way, doing so could also solve problems arising from reflective access to the methods and fields we have added to application classes).
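The field widening itself boils down to clearing and setting JVM access-flag bits. A minimal sketch of just the flag arithmetic (the constants are copied from the class file format; in the real visitor this happens inside ASM’s visitField):

```java
public class AccessWidener {
    // JVM access flags, as defined by the class file format.
    static final int ACC_PUBLIC    = 0x0001;
    static final int ACC_PRIVATE   = 0x0002;
    static final int ACC_PROTECTED = 0x0004;
    static final int ACC_FINAL     = 0x0010;

    // Make a field public and non-final, as ModifyClassVisitor does.
    static int widen(int access) {
        access &= ~(ACC_PRIVATE | ACC_PROTECTED | ACC_FINAL);
        return access | ACC_PUBLIC;
    }

    public static void main(String[] args) {
        // A private final field (0x0012) becomes plain public (0x0001).
        System.out.println(Integer.toHexString(widen(ACC_PRIVATE | ACC_FINAL))); // 1
    }
}
```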

Let’s now discuss how to modify java.lang.ClassLoader. JRebel generates a bootstrap jar, which contains a new java.lang.ClassLoader class (among others), and supersedes the JRE’s java.lang.ClassLoader using the JVM’s -Xbootclasspath/p: parameter. We’ll also take this approach, but note that you probably have to perform this task for every version of the target JVM you wish to run: there may be internal API changes between versions that would break compatibility if you used the generated ClassLoader class from JRE X with JRE Y.

To generate a new java.lang.ClassLoader, I have created three classes:

ClassLoaderGenerator performs some basic tasks and is the entry point into the program. Its main method requires the path to the target JRE’s rt.jar file and the output directory. It pulls the raw bytes of java.lang.ClassLoader from rt.jar, invokes ClassLoaderClassVisitor to produce the raw bytes of our modified java.lang.ClassLoader, and then bundles these bytes in a java/lang/ClassLoader.class entry of a feenix-classloader.jar file, which is deposited in the specified output directory.

ClassLoaderClassVisitor uses ASM to make byte code modifications directly, but it also pulls raw byte code from ClassLoaderTargeted. Specifically, I wrote methods in ClassLoaderTargeted that I wanted to appear in the generated version of java.lang.ClassLoader. While I do enjoy writing byte code instructions directly with ASM, it can be really tedious, especially if you are continually making incremental changes as you develop. By writing the code in Java, this process becomes more like regular Java development (as opposed to byte-code-level development). This approach may cause some folks to say “but why not use the ASMifier to generate the ASM code for you?” That approach is probably halfway between my approach and writing the ASM code from scratch, but running the ASMifier and copying the generated code into ClassLoaderClassVisitor is fairly tedious work too.

Let’s take a look under the hood of ClassLoaderClassVisitor. Its first job is to rename the defineClass and loadClass methods (we will add our own defineClass and loadClass methods later):

    public MethodVisitor visitMethod(int access,
            String name,
            String desc,
            String signature,
            String[] exceptions) {

        MethodVisitor mv = super.visitMethod(access, METHOD_NAME_UTIL.processName(name), desc, signature, exceptions);
        if (name.equals(LOAD_CLASS_METHOD_NAME) && desc.equals("(Ljava/lang/String;)Ljava/lang/Class;"))
            return new InvokeMethodNameReplacerMethodVisitor(mv, methodNameUtil);
        else if (name.equals(DEFINE_CLASS_METHOD_NAME))
            return new InvokeMethodNameReplacerMethodVisitor(mv, methodNameUtil);
        else
            return mv;
    }

The visitMethod method is called for each method defined in java.lang.ClassLoader. The METHOD_NAME_UTIL object is initialized to replace the names “defineClass” and “loadClass” with the same names prefixed by “_feenix_”. ClassLoader’s loadClass(String name) method calls loadClass(String name, boolean resolve), so the first InvokeMethodNameReplacerMethodVisitor updates the method instructions in the new _feenix_loadClass(String name) method such that _feenix_loadClass(String name, boolean resolve) is called instead. Similarly, the second ensures that the new _feenix_defineClass methods will always call other _feenix_defineClass methods and not the original defineClass methods.

The other interesting part of ClassLoaderClassVisitor is the visitEnd method:

    public void visitEnd() {
        try {
            InputStream is = 
                Thread.currentThread().getContextClassLoader().getResourceAsStream(ClassLoaderTargeted.class.getName().replace('.', '/') + ".class");
            ClassReader cr = new ClassReader(is);
            ClassNode node = new UpdateMethodInvocationsClassNode();
            cr.accept(node, ClassReader.SKIP_FRAMES);

            Iterator<MethodNode> it = node.methods.listIterator();
            MethodNode method;
            String exceptions[];
            while (it.hasNext()) {
                method = it.next();
                // Keep only the methods we want to add to java.lang.ClassLoader.
                // (The exact predicate was elided from the original listing; it
                // matches the defineClass, loadClass, and defineMyClass methods.)
                if (method.name.contains("defineClass") ||
                    method.name.contains("loadClass") ||
                    method.name.contains("defineMyClass")) {

                    exceptions = method.exceptions == null ? null : method.exceptions.toArray(new String[method.exceptions.size()]);
                    MethodVisitor mv = super.visitMethod(method.access, method.name, method.desc, method.signature, exceptions);
                    method.accept(mv);
                }
            }
        }
        catch (Exception e) {
            throw new Error("Unable to create classloader.", e);
        }
    }

This method reads all the methods defined in ClassLoaderTargeted and adds the methods we want (some are just there so that it will compile) to our java.lang.ClassLoader. The methods we want are all the defineClass, loadClass, and defineMyClass methods. There is just one problem with them: some of the method instructions will operate on ClassLoaderTargeted, not java.lang.ClassLoader, so we need to sweep through each method instruction and adjust it accordingly. You’ll notice that we use an UpdateMethodInvocationsClassNode object to read the ClassLoaderTargeted byte code. This class will update the method instructions as necessary.
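Conceptually, the adjustment amounts to substituting the owner of each method instruction. This hypothetical helper (the class name, method, and package are assumptions, not Feenix's actual code) shows the idea in isolation:

```java
// Hypothetical sketch: rewrite the internal owner name of copied method
// instructions so the byte code operates on java.lang.ClassLoader instead
// of ClassLoaderTargeted. The package of ClassLoaderTargeted is assumed.
class OwnerRewriter {
    static final String TARGETED = "ca/discotek/feenix/ClassLoaderTargeted";
    static final String CLASS_LOADER = "java/lang/ClassLoader";

    static String rewriteOwner(String owner) {
        // Only instructions that reference ClassLoaderTargeted change.
        return owner.equals(TARGETED) ? CLASS_LOADER : owner;
    }

    public static void main(String[] args) {
        System.out.println(rewriteOwner(TARGETED));           // java/lang/ClassLoader
        System.out.println(rewriteOwner("java/lang/Object")); // java/lang/Object
    }
}
```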

Class Reloading in Action

To try out Feenix 2.0 (BTW I am calling it 2.0 to distinguish it from the original 1.0 version, but by no means should this be considered a fully functioning finalized distribution) for yourself, do the following:

  1. Download the Feenix 2.0 distribution and unpack the zip. Let’s say you put it in /projects/feenix-2.0.
  2. Let’s assume your target JVM is located at /java/jdk1.7.0. Run the following command to generate the feenix-classloader.jar file in the /projects/feenix-2.0 directory:
        /java/jdk1.7.0/bin/java -jar /projects/feenix-2.0/discotek.feenix-2.0.jar /java/jdk1.7.0/jre/lib/rt.jar /projects/feenix-2.0
  3. Download the example project into directory /projects/feenix-example and unpack it into that directory.
  4. Create a project in your favourite IDE that you will use to edit the example project code.
  5. Configure the /projects/feenix-example/feenix.xml file to point to the directory that contains the project’s compiled classes. If you are using Eclipse, you can probably skip this step as it already points to the project’s bin directory.
  6. Using your IDE, run ca.discotek.feenix.example.Example with the following JVM options:
        -Xbootclasspath/p:C:\projects\feenix-2.0\feenix-classloader.jar;C:\projects\feenix-2.0\discotek.feenix-2.0.jar -noverify -Dfeenix-config=C:\projects\feenix-example\cfg\feenix.xml
  7. A window will appear with three buttons. Click each button to generate some baseline text.
    1. Print from Existing Printer. Demonstrates how you can alter the functionality of an existing object.
    2. Print from New Printer. Demonstrates how you can alter the functionality of new objects.
    3. Print Static. Demonstrates how you can alter the functionality of a static method.
  8. Navigate to the ca.discotek.feenix.example.gui.Printer class and modify the text for the message field. Navigate to ca.discotek.feenix.example.gui.ExampleGui and modify the Printer.printStatic‘s String parameter. Save your changes to cause the IDE to compile the new classes.
  9. Click each button in the window again and observe your changes.

This concludes our investigation into class reloading. You should keep in mind that this demonstration is a proof of concept and may not work as expected with your own project code (it is not thoroughly tested). You should also keep in mind the following points:

  • The -noverify JVM parameter is required in order to allow constructors to be reloaded.
  • The code to override java.lang.ClassLoader does not override defineTransformedClass.
  • There are still some outstanding issues (mainly related to reflection).
  • There is still a major problem with accessing fields or methods that only exist in new versions of a class.
  • Consider adding the synthetic modifier to any generated fields or methods.
  • Feenix uses a rebundled copy of ASM. It is rebundled with the ca.discotek.rebundled package prefix to avoid class clashes when an application requires ASM on the classpath for its own purposes.
  • Some of the Class Reloading Mechanism goals listed in the introduction were not addressed (does not reload non-class resources or framework configuration files).


Next Blog in the Series Teaser

I would be surprised if anyone who stays up with the latest Java news has not yet heard of Plumbr. Plumbr uses a java agent to identify memory leaks in your application. At the time of writing, Plumbr is “$139 per JVM per month”. OUCH! In my next byte code engineering blog, I’ll show you how you can identify memory leaks in your code for free using instrumentation and Phantom References.

If you enjoyed this article, you may wish to follow discotek on twitter.


jstack and jmap

When diagnosing performance issues, the JDK/bin tools, jstack and jmap, are two key sources of information for discovering what might be causing problems in your JVM(s). Running the jstack utility will cause each thread to dump its stack, and jmap will take a snapshot of your entire heap. The jstack utility is fairly straightforward, whereas a jmap dump provides much more information. I’ll write more on the usefulness of jmap dumps in the future. However, a comparison of the similar features of jstack and jmap is also worth discussing.
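As a point of comparison, the JVM exposes a programmatic (though less detailed) equivalent of a jstack dump via Thread.getAllStackTraces(); this standalone sketch prints output loosely resembling jstack's:

```java
import java.util.Map;

class StackDumpDemo {
    public static void main(String[] args) {
        // Thread.getAllStackTraces() returns a snapshot of every live thread
        // and its current stack trace, roughly the data jstack prints as text.
        Map<Thread, StackTraceElement[]> stacks = Thread.getAllStackTraces();
        for (Map.Entry<Thread, StackTraceElement[]> entry : stacks.entrySet()) {
            Thread thread = entry.getKey();
            System.out.println("\"" + thread.getName() + "\" daemon="
                    + thread.isDaemon() + " state=" + thread.getState());
            for (StackTraceElement element : entry.getValue())
                System.out.println("\tat " + element);
        }
    }
}
```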

In order to illustrate the output of jstack and jmap, we’ll need to run some sample code. The following snippet is a simple servlet, which outputs the date and time in the HTTP response. I have added methods methodOne, methodTwo, methodThree, and methodFour, which are functionally irrelevant to the program, but will be useful for demonstrating the power of jmap.

public class SampleServlet extends HttpServlet {

    SimpleDateFormat FORMAT = new SimpleDateFormat("dd-MM-yyyy hh:mm:ss SSS");

    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        PrintWriter writer = resp.getWriter();

        writer.println("The time is now " + FORMAT.format(new Date()));

        methodOne(2);
    }

    void methodOne(int a) {
        methodTwo("Hi");
    }

    void methodTwo(String b) {
        methodThree();
    }

    void methodThree() {
        ThreadLocal local = new ThreadLocal();
        local.set(" rocks!");
        methodFour();
    }

    void methodFour() {
        try { Thread.sleep(10 * 1000); }
        catch (Exception e) {}
        System.out.println("Waking up...");
    }
}
The above code is bundled in a war file, which you can download from the resources section at the end of this article. The war should be deployable to any web container. The URL may differ depending on your choice of web container. With JBoss, you can access the app at http://localhost:8080/discotek.sample-webapp/sample. Once the application is invoked, it will hang for ten seconds. This is intentional in order to give us time to run our utilities.

Before we can invoke the jstack and jmap commands, we will need to know the ID of the java process. We can get this information in a number of different ways:

  • On Unix/Linux boxes, we can use the ps command to list the running processes with their IDs. On a Windows box we can use Task Manager‘s Processes tab.
  • The JDK/bin jconsole utility has many features, but will also list the running Java processes with their IDs.
  • The JDK/bin jps command line utility will output a list of all running Java processes with their IDs.
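A process can also discover its own ID at run-time via the RuntimeMXBean. Note that the "pid@hostname" format of the runtime name is a HotSpot convention, not something the specification guarantees:

```java
import java.lang.management.ManagementFactory;

class PidDemo {
    public static void main(String[] args) {
        // On HotSpot JVMs the runtime name has the form "<pid>@<hostname>".
        String name = ManagementFactory.getRuntimeMXBean().getName();
        String pid = name.split("@")[0];
        System.out.println("Running as PID " + pid);
    }
}
```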

Now that we have the PID for the Java process, let’s start by running the jstack utility. The syntax is:

    jstack [-l] <pid>
        (to connect to running process)

    -l  long listing. Prints additional information about locks
    -h or -help to print this help message

jstack will dump a stack trace for each active thread in the JVM. I am using JBoss to run the sample web application, which has too many threads to display here. Here is the output for the thread we are interested in:

"http--" daemon prio=6 tid=0x35e3f400 nid=0x450 waiting on condition [0x36dcf000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at ca.discotek.sample.webapp.SampleServlet.methodFour(
        at ca.discotek.sample.webapp.SampleServlet.methodThree(
        at ca.discotek.sample.webapp.SampleServlet.methodTwo(
        at ca.discotek.sample.webapp.SampleServlet.methodOne(
        at ca.discotek.sample.webapp.SampleServlet.doGet(
        at javax.servlet.http.HttpServlet.service(
        at javax.servlet.http.HttpServlet.service(
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(
        at org.apache.catalina.core.StandardWrapperValve.invoke(
        at org.apache.catalina.core.StandardContextValve.invoke(
        at org.apache.catalina.core.StandardHostValve.invoke(
        at org.apache.catalina.valves.ErrorReportValve.invoke(
        at org.apache.catalina.valves.AccessLogValve.invoke(
        at org.apache.catalina.core.StandardEngineValve.invoke(
        at org.apache.catalina.connector.CoyoteAdapter.service(
        at org.apache.coyote.http11.Http11Processor.process(
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(
        at Source)

   Locked ownable synchronizers:
        - None

The java.lang.Thread.State enumeration class provides the states a thread may be in:

  • NEW: A thread that has not yet started is in this state.
  • RUNNABLE: A thread executing in the Java virtual machine is in this state.
  • BLOCKED: A thread that is blocked waiting for a monitor lock is in this state.
  • WAITING: A thread that is waiting indefinitely for another thread to perform a particular action is in this state.
  • TIMED_WAITING: A thread that is waiting for another thread to perform an action for up to a specified waiting time is in this state.
  • TERMINATED: A thread that has exited is in this state.

The Thread.sleep(10 * 1000); code causes our thread to be in a TIMED_WAITING state. Knowing the state of each thread is important for identifying many problems including infinite loops and excessive memory consumption/retention. In this example, we have purposely added a poorly performing method to the servlet, so there is nothing to diagnose.
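The state jstack reports is the same one Thread.getState() returns, so the TIMED_WAITING state is easy to reproduce in a small standalone program (this is a sketch independent of the servlet above):

```java
class ThreadStateDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread sleeper = new Thread(new Runnable() {
            public void run() {
                // Mirrors the servlet's methodFour: a timed sleep puts the
                // thread into the TIMED_WAITING state.
                try { Thread.sleep(10 * 1000); }
                catch (InterruptedException e) { /* woken early, exit */ }
            }
        }, "sleeper");
        sleeper.start();
        Thread.sleep(200); // give the thread a moment to enter sleep()
        System.out.println(sleeper.getState()); // TIMED_WAITING
        sleeper.interrupt(); // don't actually wait the full ten seconds
        sleeper.join();
    }
}
```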

Unfortunately, jstack information is not always useful. Specifically, if you are diagnosing an OutOfMemoryError, the code causing the memory leak may not be running in a thread at the time the jstack was executed. In this case, the stack traces might be misleading to the inexperienced eye.

The jmap utility will produce a heap dump of a given JVM. The jmap process is more heavyweight than jstack. Both utilities cause a “stop the world” pause in the JVM, but a jstack is almost instantaneous. The speed of jstack is particularly relevant to production servers where responsiveness is critical. A jmap’s speed depends on a couple of factors: the amount of heap consumed and the complexity of the object graph within the heap. Generally, I expect most admins would not hesitate to take a jstack, but would consider current volume, size of the heap, etc. before taking a jmap. Conceivably, under some circumstances, an immediate restart of the server may result in less loss to the business. Of course, the down-time of a single server would be more affordable in a clustered environment.

While jstacks are speedy, the amount and type of data you can get from a jmap heap dump is far superior to that of a jstack. First of all, it is possible to re-create each thread’s stack from a heap dump. Here is the Usage output from the jmap utility:

    jmap -histo <pid>
      (to connect to running process and print histogram of java object heap)
    jmap -dump:<dump-options> <pid>
      (to connect to running process and dump java heap)

    dump-options:
      format=b     binary default
      file=<file>  dump heap to <file>

    Example:       jmap -dump:format=b,file=heap.bin <pid>

The histo option will only waste your time (and in the same breath, I might as well add that the jhat utility will too). Instead, create heap dumps using a command similar to the usage example. You can skip the format=b parameter (b is binary, which is the default). I also prefer to use a file extension of “.hprof” for the file name rather than “.bin”, as it is more indicative of the file contents, whereas .bin could represent any number of file types.

The industry standard tool for analyzing jmap heap dumps is MAT (Memory Analyzer Tool). Alternative tools, like Oracle’s JVisualVM which is bundled with the JDK, are convenient, but simply don’t have the investigative power of MAT. One such example of MAT’s superiority is its ability to re-create each thread’s stack. Assuming you have MAT available, open the heap dump via the File->Open Heap Dump… menu item and select your heap dump file. Once loaded, you may be presented with a Getting Started Wizard, which will generate various reports. Hit cancel, then navigate to the Open Query Browser button’s drop down menu…

…and click through to Java Basics->Thread Overview and Stacks:

Just click the Finish button on the resulting dialog and you’ll be presented with a table similar to the following:

Let’s now get back to comparing jstack and jmap. The above screen shot illustrates how a jmap can provide the same thread name information, but it also provides the thread class name, the thread’s shallow and retained heap (infinitely useful, but isn’t the topic of this blog), the thread’s context classloader, and whether it is a daemon thread. Furthermore, the data is presented in a succinct table, whereas the jstack output is raw text which is more tedious to sift through.

Let’s now drill into a stack. An easy way to find the thread for our example servlet is to recall the name of the thread from the jstack output: “http–″. Here is the stack view:

This should look similar to the jstack output, but there are some subtle differences. The jmap doesn’t provide you with thread lock information. Wait – that statement may not be entirely true. The lock information is probably in the dump, but MAT doesn’t assemble it with the thread information. In any case, other than speed, this is probably the only advantage of jstack over jmap. The other difference is that the jmap provides the full method signatures, whereas jstack only provides the class and method names, plus line number information when it is available (line number data is optional and can be omitted at compile time). With the full method signature, there is no guess work required for overloaded methods when the line number information isn’t available. It should be noted that the method signature is provided using the JVM’s internal type and method descriptors. The descriptors are easy to understand, but explaining them is outside of scope. See this blog on byte code engineering for details.

In addition to the jstack information, you can also examine thread and local variables. The jstack information might help you understand what code is causing the problem, but it may leave you wondering why. The ability to examine the stack variables allows you to know the exact data that your methods were operating on. For example, it might answer the question “how could this possibly be an infinite loop?”.

We’ll now take a look at the variables on the stack that were created when the sample servlet was invoked. The doGet method calls methodOne with a parameter value of 2. methodOne names this parameter a. If we expand the methodOne stack element, we should expect to see the local variable a with value 2, but that does not happen:

We only see the SampleServlet object, which is the this object and is always a local variable for all non-static methods. Evidently, MAT is not clever enough to determine primitive local variables. You’ll notice that MAT only exposes objects, whether you are looking at stacks, histograms, or any other MAT view.

Let’s now take a look at methodTwo which takes an object type, java.lang.String, as a parameter:

As expected, the second local variable is a String of value “Hi”. methodThree declares a ThreadLocal variable, which programmatically makes no sense as this variable will become inaccessible once the method has exited. It only makes sense to declare ThreadLocal variables at class-scope. Placing it in a method was strictly for presentation. In any case, thread local variable values don’t show up in the stack either:

We see the ThreadLocal object, but it references no value because the value is associated with the thread. To find the value, we’ll have to examine the thread’s threadLocals instance variable. I am going to drill into the thread by right-clicking on it and clicking through to the pop-up menu item List object->with outgoing references:

The result will be a table with a single row. Expand the thread node to view a table similar to the following:

I have drawn a box around the last row. This row indicates there are more rows to expand. Right-click on the row and click the bottom menu item, Expand All, from the pop-up menu. Next, scroll toward the bottom of the table until you can see the threadLocals variable. Expand that node. Under this node, expand the table variable node. Here are all the ThreadLocal objects for this thread. To find the methodThree ThreadLocal variable, we’ll have to find the java.lang.ThreadLocal$ThreadLocalMap$Entry object with a referent variable whose memory address (0x2b87d20) matches the ThreadLocal instance we are looking for. Or, since we have the source code and know what the value will be, we can look for that instead. Here is the result:

Although we have just scratched the surface of what MAT is capable of, it should be obvious that it has tremendous benefit over jstack. jstack’s strengths are its speed and lock information, but if you can afford it, always choose to take a jmap heap dump to aid in diagnosing JVM problems.
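As an aside, the behavior we navigated in MAT above (ThreadLocal values living in each thread's own threadLocals map, not in the ThreadLocal object itself) is easy to demonstrate in code. This is a minimal standalone sketch, not part of the sample servlet:

```java
class ThreadLocalDemo {
    static final ThreadLocal<String> LOCAL = new ThreadLocal<String>();

    public static void main(String[] args) throws InterruptedException {
        LOCAL.set("main's value");
        Thread other = new Thread(new Runnable() {
            public void run() {
                // The value lives in each thread's own threadLocals map,
                // so what main set is invisible here: this prints null.
                System.out.println("other thread sees: " + LOCAL.get());
            }
        });
        other.start();
        other.join();
        System.out.println("main thread sees: " + LOCAL.get());
    }
}
```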

The truth is, I didn’t set out to write an article comparing jstack and jmap. Despite examining thread and heap dumps for years, I was inspired by something I learned just the other day, which I want to pass on. The jmap utility has a -F switch to force a heap dump (not documented in the usage output above; see the jmap documentation for details). Oracle’s Trouble Shooting Guide for HotSpot VM says:

If the jmap pid command does not respond because of a hung process, the -F option can be used 
(on Solaris OS and Linux only) to force the use of the Serviceability Agent.

This switch sounds like it ought to be useful, but it should be avoided if at all possible. The reality is, the -F switch tells the JVM to use the Serviceability Agent interface for extracting the heap data. This alternate interface is extremely slow. A dump that takes five minutes without the -F switch might take an hour with the -F switch. So, never use the -F switch, right? Conceivably, there might be a circumstance where the JVM is hung in such a way that the default interface does not work. If getting the heap dump is critical, then it might be worth trying the -F switch. You would have to balance down-time against the value of understanding the root cause of the problem.

I also learned something else that you should be aware of. If the user running the JVM process is different than the user running the jmap process, you may get an error that prevents you from taking the heap dump. In this scenario, using the -F switch may allow you to take the dump, despite the mis-matched users. Once again, only use the -F switch in an emergency production situation. If you find yourself requiring the -F switch to circumvent a user mis-match, consider using your operating system’s facilities to temporarily become another user while taking the dump.

If you enjoyed this article, please let me know in the comments and feel free to follow on twitter.



Byte Code Engineering


This blog entry is the first of a multi-part series of articles discussing the merits of byte code engineering and its application. Byte code engineering encompasses the creation of new byte code in the form of classes and the modification of existing byte code. Byte code engineering has many applications. It is used in tools for compilers, class reloading, memory leak detection, and performance monitoring. Also, most application servers use byte code libraries to generate classes at run-time. Byte code engineering is used more often than you think. As a matter of fact, you can find popular byte code engineering libraries bundled in the JRE, including BCEL and ASM. Despite its widespread usage, there appear to be very few university or college courses that teach byte code engineering. It is an aspect of programming that developers must learn on their own and, for those who don’t, it remains a mysterious black art. The truth is, byte code engineering libraries make learning this field easy and are a gateway to a deeper understanding of JVM internals. The intent of these articles is to provide a starting point and then document some advanced concepts, which will hopefully inspire readers to develop their own skills.


There are a few resources that anyone learning byte code engineering should have handy at all times. The first is the Java Virtual Machine Specification (FYI this page has links to both the language and JVM specifications). Chapter 4, The Class File Format, is indispensable. A second resource, which is useful for quick reference, is the Wikipedia page entitled Java bytecode instruction listings. In terms of byte code instructions, it is more concise and informative than the JVM specification itself. Another resource to have handy for the beginner is a table of the internal descriptor format for field types. This table is taken directly from the JVM specification.

BaseType Character    Type         Interpretation
B                     byte         signed byte
C                     char         Unicode character code point in the Basic Multilingual Plane, encoded with UTF-16
D                     double       double-precision floating-point value
F                     float        single-precision floating-point value
I                     int          integer
J                     long         long integer
L<ClassName>;         reference    an instance of class <ClassName>
S                     short        signed short
Z                     boolean      true or false
[                     reference    one array dimension

Most primitive field types simply use the field type's first initial to represent the type internally (i.e. I for int, F for float, etc.); however, a long is J and a boolean is Z. Object types are not as intuitive. An object type begins with the letter L and ends with a semi-colon. Between these characters is the fully qualified class name, with each name separated by forward slashes. For instance, the internal descriptor for the field type java.lang.Integer is Ljava/lang/Integer;. Lastly, array dimensions are indicated by the '[' character. For each dimension, insert a '[' character. For instance, a two-dimensional int array would be [[I, whereas a two-dimensional java.lang.Integer array would be [[Ljava/lang/Integer;.

Methods also have an internal descriptor format. The format is (<parameter types>)<return type>. All types use the above field type descriptor format above. A void return type is represented by the letter V. There is no separator for parameter types. Here are some examples:

  • A program entry point method of public static final void main(String args[]) would be ([Ljava/lang/String;)V
  • A constructor of the form public Info(int index, java.lang.Object types[], byte bytes[]) would be (I[Ljava/lang/Object;[B)V
  • A method with signature int getCount() would be ()I

Speaking of constructors, I should also mention that all constructors have an internal method name of <init>. Also, all static initializers in source code are placed into a single static initializer method with internal method name <clinit>.
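You can check your understanding of the array descriptors against the JVM itself: for array classes, Class.getName() returns the internal descriptor form, except that '.' replaces '/' as the package separator.

```java
class DescriptorDemo {
    public static void main(String[] args) {
        // Array classes report their internal descriptors directly,
        // with '.' instead of '/' in class names.
        System.out.println(int[][].class.getName());     // [[I
        System.out.println(long[].class.getName());      // [J
        System.out.println(boolean[].class.getName());   // [Z
        System.out.println(Integer[][].class.getName()); // [[Ljava.lang.Integer;
    }
}
```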


Before we discuss byte code engineering libraries, there is an essential learning tool bundled in the JDK bin directory called javap. Javap is a program which will disassemble byte code and provide a textual representation. Let's examine what it can do with the compiled version of the following code:

package ca.discotek.helloworld;

public class HelloWorld {

    static String message = 
            "Hello World!";

    public static void main(String[] args) {
        try { 
            System.out.println(message);
        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Here is the output from the javap -help command:

Usage: javap <options> <classes>...

where options include:
   -c                        Disassemble the code
   -classpath <pathlist>     Specify where to find user class files
   -extdirs <dirs>           Override location of installed extensions
   -help                     Print this usage message
   -J<flag>                  Pass <flag> directly to the runtime system
   -l                        Print line number and local variable tables
   -public                   Show only public classes and members
   -protected                Show protected/public classes and members
   -package                  Show package/protected/public classes
                             and members (default)
   -private                  Show all classes and members
   -s                        Print internal type signatures
   -bootclasspath <pathlist> Override location of class files loaded
                             by the bootstrap class loader
   -verbose                  Print stack size, number of locals and args for methods
                             If verifying, print reasons for failure

Here is the output when we use javap to disassemble the HelloWorld program:

javap.exe -classpath "C:\projects\sandbox2\bin" -c -private -s -verbose ca.discotek.helloworld.HelloWorld
Compiled from ""
public class ca.discotek.helloworld.HelloWorld extends java.lang.Object
  SourceFile: ""
  minor version: 0
  major version: 50
  Constant pool:
const #1 = class        #2;     //  ca/discotek/helloworld/HelloWorld
const #2 = Asciz        ca/discotek/helloworld/HelloWorld;
const #3 = class        #4;     //  java/lang/Object
const #4 = Asciz        java/lang/Object;
const #5 = Asciz        message;
const #6 = Asciz        Ljava/lang/String;;
const #7 = Asciz        <clinit>;
const #8 = Asciz        ()V;
const #9 = Asciz        Code;
const #10 = String      #11;    //  Hello World!
const #11 = Asciz       Hello World!;
const #12 = Field       #1.#13; //  ca/discotek/helloworld/HelloWorld.message:Ljava/lang/String;
const #13 = NameAndType #5:#6;//  message:Ljava/lang/String;
const #14 = Asciz       LineNumberTable;
const #15 = Asciz       LocalVariableTable;
const #16 = Asciz       <init>;
const #17 = Method      #3.#18; //  java/lang/Object."<init>":()V
const #18 = NameAndType #16:#8;//  "<init>":()V
const #19 = Asciz       this;
const #20 = Asciz       Lca/discotek/helloworld/HelloWorld;;
const #21 = Asciz       main;
const #22 = Asciz       ([Ljava/lang/String;)V;
const #23 = Field       #24.#26;        //  java/lang/System.out:Ljava/io/PrintStream;
const #24 = class       #25;    //  java/lang/System
const #25 = Asciz       java/lang/System;
const #26 = NameAndType #27:#28;//  out:Ljava/io/PrintStream;
const #27 = Asciz       out;
const #28 = Asciz       Ljava/io/PrintStream;;
const #29 = Method      #30.#32;        //  java/io/PrintStream.println:(Ljava/lang/String;)V
const #30 = class       #31;    //  java/io/PrintStream
const #31 = Asciz       java/io/PrintStream;
const #32 = NameAndType #33:#34;//  println:(Ljava/lang/String;)V
const #33 = Asciz       println;
const #34 = Asciz       (Ljava/lang/String;)V;
const #35 = Method      #36.#38;        //  java/lang/Exception.printStackTrace:()V
const #36 = class       #37;    //  java/lang/Exception
const #37 = Asciz       java/lang/Exception;
const #38 = NameAndType #39:#8;//  printStackTrace:()V
const #39 = Asciz       printStackTrace;
const #40 = Asciz       args;
const #41 = Asciz       [Ljava/lang/String;;
const #42 = Asciz       e;
const #43 = Asciz       Ljava/lang/Exception;;
const #44 = Asciz       StackMapTable;
const #45 = Asciz       SourceFile;
const #46 = Asciz;

static java.lang.String message;
  Signature: Ljava/lang/String;

static {};
  Signature: ()V
   Stack=1, Locals=0, Args_size=0
   0:   ldc     #10; //String Hello World!
   2:   putstatic       #12; //Field message:Ljava/lang/String;
   5:   return
   line 6: 0
   line 5: 2
   line 6: 5

public ca.discotek.helloworld.HelloWorld();
  Signature: ()V
   Stack=1, Locals=1, Args_size=1
   0:   aload_0
   1:   invokespecial   #17; //Method java/lang/Object."<init>":()V
   4:   return
   line 3: 0

   Start  Length  Slot  Name   Signature
   0      5      0    this       Lca/discotek/helloworld/HelloWorld;

public static void main(java.lang.String[]);
  Signature: ([Ljava/lang/String;)V
   Stack=2, Locals=2, Args_size=1
   0:   getstatic       #23; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   getstatic       #12; //Field message:Ljava/lang/String;
   6:   invokevirtual   #29; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   9:   goto    17
   12:  astore_1
   13:  aload_1
   14:  invokevirtual   #35; //Method java/lang/Exception.printStackTrace:()V
   17:  return
  Exception table:
   from   to  target type
     0     9    12   Class java/lang/Exception

   line 10: 0
   line 11: 9
   line 12: 12
   line 13: 13
   line 15: 17

   Start  Length  Slot  Name   Signature
   0      18      0    args       [Ljava/lang/String;
   13      4      1    e       Ljava/lang/Exception;

  StackMapTable: number_of_entries = 2
   frame_type = 76 /* same_locals_1_stack_item */
     stack = [ class java/lang/Exception ]
   frame_type = 4 /* same */


You should note that the -l flag to output line number information was purposely omitted. The -verbose flag outputs other relevant information including line numbers. If both are used, the line number information will be printed twice.

Here is an overview of the output:

Line Numbers Description
2 Command line to invoke javap. See javap -help output above for explanation of parameters.
3 Source code file provided by debug information included in byte code.
4 Class signature
5 Source code file provided by debug information included in byte code.
6-7 Major and Minor versions. 50.0 indicates the class was compiled with Java 6.
8-54 The class constant pool.
57-58 Declaration of the message field.
60 Declaration of the static initializer method.
61 Internal method descriptor for method.
63 Stack=1 indicates 1 slot is required on the operand stack. Locals=0 indicates no local variables are required.
Args_size=0 is the number of arguments to the method.
64-66 The byte code instructions to assign the String value Hello World! to the message field.
67-77 If compiled with debug information, each method will have a LineNumberTable. The format of each entry is
<line number of source code>: <starting instruction offset in byte code>. You'll notice that the LineNumberTable
has duplicate entries that are seemingly out of order (i.e. 6, 5, 6). It may not seem intuitive, but the compiler assembles byte code
instructions to target the stack-based JVM, which means it will often have to re-arrange instructions.
72 Default constructor signature
73 Default constructor internal method descriptor
75 Stack=1 indicates 1 slot is required on the operand stack. Locals=1 indicates there is one local variable. Method
parameters are treated as local variables. In this case, it is the implicit "this" reference.
Args_size=1 is the number of arguments to the method.
76-78 Default constructor code. Simply invokes the default constructor of the super class, java.lang.Object.
79-80 Although the default constructor is not explicitly defined, the LineNumberTable indicates that the
default constructor is associated with line 3, where the class signature resides.
82-84 You might be surprised to see an entry in a LocalVariableTable because the default constructor
defines no local variables and has no parameters. However, all non-static methods will define the "this" local
variable, which is what is seen here. The start and length values indicate the scope of the local variable within the method.
The start value indicates the index in the method's byte code array where the scope begins and the length value
indicates the location in the array where the scope ends (i.e. start + length = end). In the constructor, "this"
starts at index 0. This corresponds to the a_load0 instruction at line 78. The length is 5, which covers the entire method as
the last instruction is at index 4. The slot value indicates the order in which it is defined in the method. The name
attribute is the variable name as defined in the source code. The Signature attribute represents the type of variable.
You should note that local variable table information is added for debugging purposes. Assigning identifiers to chunks of memory
is entirely to help humans understand programs better. This information can be excluded from byte code.
86 Main method declaration
87 Main method internal descriptor.
89 Stack=2 indicates 2 slots are required on the operand stack. Locals=2 indicates two local variables are required
(The args and exception e from the catch block). Args_size=1 is the number of arguments to the method (args).
90-97 Byte code associated with printing the message and catching any exceptions.
98-100 Byte code does not have try/catch constructs, but it does have exception handling, which is implemented in the Exception table.
Each row in the table is an exception handling instruction. The from and to values indicate the range of instructions to
which the exception handling applies. If the given type of instruction occurs between the from and to instructions
(inclusively), execution will skip to the target instruction index. The value 12 represents the start of the catch block.
You'll also notice the goto instruction after the invokevirtual instruction, which cause execution to skip to the end
of the method if no exception occurs.
102-107 Main method's line number table which matches source code with byte code instructions.
109-112 Main methods' LocalVariableTable, which defines the scope of the args parameter and the e exception variable.
114-117 The JVM uses StackMapTable entries to verify type safety for each code block defined within a method. This information
can be ignored for now. It is most likely that your compiler or byte code engineering library will generate this byte code
for you.
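Given the fields, methods, and constants described above, the disassembled class can be reconstructed. The following is a sketch of what the original source presumably looked like (the class name HelloWorld is an assumption, not taken from the javap output):

```java
public class HelloWorld {

    // The inline assignment is compiled into the static initializer
    // method described above, along with the field declaration
    static String message = "Hello World!";

    // No constructor is declared; the compiler generates the default
    // constructor, which the LineNumberTable associates with the line
    // of the class signature

    public static void main(String[] args) {
        try {
            // The invokevirtual instruction guarded by the Exception table entry
            System.out.println(message);
        } catch (Exception e) {
            // 'e' accounts for the second local variable slot (Locals=2)
        }
    }
}
```

Compiling this class and running javap -verbose on it should produce output with the same shape as the listing explained above.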

Byte Code Engineering Libraries

The most popular byte code engineering libraries are BCEL, SERP, Javassist, and ASM. All of these libraries have their own merits, but overall, ASM is far superior for its speed and versatility. There are plenty of articles and blog entries discussing these libraries, in addition to the documentation on their web sites. Instead of duplicating these efforts, the following will provide links and hopefully other useful information.


The most obvious drawback of BCEL (Byte Code Engineering Library) has been its inconsistent support. If you look at the BCEL News and Status page, there have been releases in 2001, 2003, 2006, and 2011. Four releases spread over 10 years is not confidence-inspiring. However, it should be noted that there appears to be a version 6 release candidate, which can be downloaded from GitHub, but not Apache. Additionally, the enhancements and bug fixes discussed in the download's RELEASE-NOTES.txt file are substantial, including support for the language features of Java 6, 7, and 8.

BCEL is a natural starting place for the uninitiated byte code developer because it has the prestige of the Apache Software Foundation, and often it may serve the developer's purpose. One of BCEL's benefits is that it has APIs for both the SAX and DOM approaches to parsing byte code. However, when byte code manipulation becomes more complex, BCEL will likely end in frustration due to its sparse API documentation and limited community support. It should be noted that BCEL is bundled with a BCELifier utility, which parses byte code and outputs the Java code, written against the BCEL API, that would produce the parsed byte code. If you choose BCEL as your byte code engineering library, this utility will be invaluable (but note that ASM has an equivalent ASMifier).
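For a quick taste of BCEL's static ("DOM style") API, the following sketch looks up a class from the classpath and lists its method signatures. It assumes BCEL is on the classpath; the choice of java.lang.Object is arbitrary:

```java
import org.apache.bcel.Repository;
import org.apache.bcel.classfile.JavaClass;
import org.apache.bcel.classfile.Method;

public class BcelListMethods {
    public static void main(String[] args) throws Exception {
        // Resolve the class file from the classpath into BCEL's object model
        JavaClass javaClass = Repository.lookupClass("java.lang.Object");
        // Each Method exposes its name and internal descriptor,
        // mirroring what javap prints
        for (Method method : javaClass.getMethods()) {
            System.out.println(method.getName() + " " + method.getSignature());
        }
    }
}
```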


SERP is a lesser known library. My experience with it is limited, but I did find it useful for building a Javadoc-style tool for byte code. SERP was the only API that could give me program counter information so I could hyperlink branching instructions to their targets. Although the SERP release documentation indicates there is support for Java 8's invokedynamic instruction, it is not clear to me that it receives continuous support from the author and there is very little community support. The author also discusses its limitations which include issues with speed, memory consumption, and thread safety.


Javassist is the only library that provides some functionality not supported by ASM... and it's pretty awesome. Javassist allows you to insert Java source code into existing byte code. You can insert Java code before a method body or append it after the method body. You can also wrap a method body in a try-block and add your own catch-block (of Java code). You can also substitute an entire method body, or other smaller constructs, with your own Java source code. Lastly, you can add methods to a class which contain your own Java source code. This feature is extremely powerful because it allows a Java developer to manipulate byte code without requiring an in-depth understanding of the underlying byte code. However, this feature does have its limitations. For instance, if you introduce variables in an insertBefore() block of code, they cannot be referenced later in an insertAfter() block of code. Additionally, ASM is generally faster than Javassist, but the benefits of Javassist's simplicity may outweigh the gains in ASM's performance. Javassist is continually supported by the authors at JBoss and receives strong community support.
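The insertBefore()/insertAfter() feature described above might be used as in the following sketch, which injects entry and exit trace statements into a method. Javassist is assumed to be on the classpath, and the class and method names passed in are arbitrary examples:

```java
import javassist.ClassPool;
import javassist.CtClass;
import javassist.CtMethod;

public class TraceInjector {
    // Returns the modified byte code for the named class, with trace
    // statements wrapped around the named method's body
    public static byte[] injectTrace(String className, String methodName) throws Exception {
        ClassPool pool = ClassPool.getDefault();
        CtClass ctClass = pool.get(className);
        CtMethod ctMethod = ctClass.getDeclaredMethod(methodName);
        // These strings are Java source code; Javassist compiles them
        // and splices the result into the existing byte code
        ctMethod.insertBefore("{ System.out.println(\"entering " + methodName + "\"); }");
        ctMethod.insertAfter("{ System.out.println(\"exiting " + methodName + "\"); }");
        return ctClass.toBytecode();
    }
}
```

The returned byte array could then be written to disk or handed to a custom class loader.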


ASM has it all. It is well supported, it is fast, and it can do just about anything. ASM has both SAX and DOM style APIs for parsing byte code. ASM also has an ASMifier, which can parse byte code and generate the corresponding Java source code, which, when run, will produce the parsed byte code. This is an invaluable tool. It is expected that the developer has some knowledge of byte code, but ASM can update frame information for you when, for example, you add local variables. It also has many utility classes for common tasks in its commons package. Further, common byte code transformations are documented in exceptional detail. You can also get help from the ASM mailing list. Lastly, forums like StackOverflow provide additional support. Almost certainly, any problem you encounter has already been discussed in the ASM documentation or in a StackOverflow thread.
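As a minimal example of ASM's event-based ("SAX style") API, the following sketch prints the methods of a class without loading it. It assumes ASM 9 is on the classpath; the choice of java.lang.Runnable is arbitrary:

```java
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

public class AsmListMethods {
    public static void main(String[] args) throws Exception {
        // ClassReader locates the class file on the classpath and
        // fires events to the visitor as it parses the byte code
        ClassReader reader = new ClassReader("java.lang.Runnable");
        reader.accept(new ClassVisitor(Opcodes.ASM9) {
            @Override
            public MethodVisitor visitMethod(int access, String name, String descriptor,
                                             String signature, String[] exceptions) {
                // Print the method name and its internal descriptor
                System.out.println(name + descriptor);
                return null; // no need to visit the method's instructions
            }
        }, 0);
    }
}
```

Writing transformations is a matter of chaining a ClassWriter behind such a visitor and rewriting events as they pass through.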

Useful Links


Admittedly, this blog entry has not been particularly instructional. The intention is to give the beginner a place to start. In my experience, the best way to learn is to have a project in mind to which you'll apply what you are learning. Documenting a few basic byte code engineering tasks would only duplicate others' efforts. I developed my byte code skills from an interest in reverse engineering. I would prefer not to document those skills, as it would be counter-productive to my other efforts (I built a commercial byte code obfuscator called Modifly, which can perform obfuscation transformations at run-time). However, I am willing to share what I have learned by demonstrating how to apply byte code engineering to class reloading and memory leak detection (and perhaps other areas if there is interest).

Next Blog in the Series Teaser

Even if you don't use JRebel, you probably haven't escaped their ads. JRebel's home page claims "Reload Code Changes Instantly. Skip the build and redeploy process. JRebel reloads changes to Java classes, resources, and over 90 frameworks." Have you ever wondered how they do it? I'll show you exactly how they do it, with working code, in my next blog in this series.

If you enjoyed this blog, you may wish to follow me on Twitter.

Posted in Byte Code Engineering

Modifly is now available!

We are proud to announce the availability of Modifly. Modifly is a byte code obfuscator that can perform run-time transformations. This means you never run the same byte code twice, yet each run is functionally equivalent.

Be sure to read more about run-time transformations and Modifly's other exciting features here.

Posted in Obfuscation, Security