There are situations where you want to change Java code not at the source code level but at runtime. For example when you want to instrument code for logging purposes but do not want to change the source code (because you might not have access to it). This can be done by using a Java Agent. For example, Dynatrace uses a Java agent to collect data from inside the JVM. Another example is the GraalVM tracing agent (here) which helps you create configuration for the generation of native images. Logging is one use-case but you can also more dramatically alter runtime code to obtain a completely different runtime behavior.
This blog post is not a step by step introduction for creating Java Agents. For that please take a look at the following. In this blog post I have created a Java agent which rewrites synchronized methods to use a ReentrantLock instead (see here). The use-case for this is to allow applications to use Project Loom's Virtual Threads more efficiently.
You can find the code of the agent here.
The Java Agent
When I created the agent, I encountered several challenges, which I listed below and how I dealt with them. This might help you to overcome some initial challenges when writing your own Java Agent.
Keep the dependencies to a minimum
First, the JAR for the agent must include its dependencies. You often do not know in which context the agent is loaded and if required dependencies are supplied by another application. The maven-dependency-plugin with the descriptor reference jar-with-dependencies helps you generate a JAR file including dependencies.
When you include dependencies however in the JAR containing also your agent, they might cause issues with other classes which are loaded by the JVM (because you usually run an application or even application server). The specific challenge which I encountered was that I was using slf4j-simple and my application was using Logback for logging. The solution was simple; remove slf4-simple. More generally; use as few external dependencies as viable for the agent. I used only javassist. I needed this dependency to easily do bytecode manipulation.
Trigger class transformation for every class
There are several solutions available to obtain the correct bytecode. Be careful though that your bytecode transformer is triggered for every class that is loaded when it is loaded. Solutions which only look at currently loaded classes (instrumentation.getAllLoadedClasses()) or a scan based on reflection utils like javassist (example here) won't do that. You can register a transformer class that will be triggered when a specific class is loaded, but if you want your agent to be triggered for every class, it is better to register a transformer without having a specific class specified; instrumentation.addTransformer(new RemsyncTransformer());. The agent will be the first to be loaded when using the premain method / the javaagent JVM switch when starting your application.
Of course, if you want to attach the agent to an already running JVM (using the agentmain method), you still need to process the already loaded classes.
Obtain the bytecode
I've seen several examples on how to obtain the bytecode of the class to edit in the transformer class. The signature of the transform method of transformer class is:
public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
ProtectionDomain protectionDomain, byte[] classfileBuffer) throws IllegalClassFormatException
It is tempting to use the loader, className and classBeingRedefined to obtain the relevant CtClass instance. The CtClass instance can be edited using javassist. For example by doing something like:
String targetClassName = className.replaceAll("\\.", "/");
ClassPool cp = ClassPool.getDefault();
CtClass cc = cp.get(targetClassName);
I'm however not a great fan of string parsing / editing to obtain the correct name of a class and then fetch it. The classfileBuffer contains the bytecode of the class being loaded and this can be edited directly. I implemented this like below (based on this which contains a more elaborate explanation).
private final ScopedClassPoolFactoryImpl scopedClassPoolFactory = new ScopedClassPoolFactoryImpl();
ClassPool classPool = scopedClassPoolFactory.create(loader, ClassPool.getDefault(),ScopedClassPoolRepositoryImpl.getInstance());
CtClass ctClass = classPool.makeClass(new ByteArrayInputStream(classfileBuffer));
Editing the bytecode
I used the following code to add a private instance variable, remove the synchronized modifier and wrap the contents of the method in a try/finally block:
try {
ctField = ctClass.getDeclaredField("lockCustomAgent");
} catch (NotFoundException e) {
ctClass.addField(CtField.make("final java.util.concurrent.locks.Lock lockCustomAgent = new java.util.concurrent.locks.ReentrantLock();", ctClass));
}
method.instrument(new ExprEditor() {
public void edit(MethodCall m) throws CannotCompileException {
m.replace("{ lockCustomAgent.lock(); try { $_ = $proceed($$); } finally { lockCustomAgent.unlock(); } }");
}
});
modifier = Modifier.clear(modifier, Modifier.SYNCHRONIZED);
method.setModifiers(modifier);
What this did was change the following Java code
synchronized void hi() {
System.out.println("Hi");
}
to the following
final java.util.concurrent.locks.Lock lockCustomAgent = new java.util.concurrent.locks.ReentrantLock();
void hi() {
lockCustomAgentStatic.lock();
try {
System.out.println("Hi");
} finally {
lockCustomAgentStatic.unlock();
}
}
This is exactly as described here at 'Mitigating limitations'.
Generate your JAR manifest
A manifest file helps the JVM to know which agent classes to use inside the JAR file. You can supply this file and package it together with the compiled sources in the JAR or let it be generated upon build. I choose the second option. The following will generate a manifest indicating which agent class needs to be started by the JVM.
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<version>3.3.0</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<archive>
<manifestEntries>
<Premain-Class>nl.amis.smeetsm.agent.RemsyncInstrumentationAgent</Premain-Class>
<Agent-Class>nl.amis.smeetsm.agent.RemsyncInstrumentationAgent</Agent-Class>
<Can-Redefine-Classes>true</Can-Redefine-Classes>
<Can-Retransform-Classes>true</Can-Retransform-Classes>
</manifestEntries>
</archive>
</configuration>
</plugin>
Finally
The agent described here illustrated some of the options you have for editing bytecode. I noticed though when creating this agent, that there are many situations where the agent cannot correctly rewrite code. This is probably because my rewrite code is not elaborate enough to deal with all code constructs it may encounter. An agent is probably only suitable for simple changes and not for complex rewrites. One of the strengths of using a method like this is though that you do not need to change any source code to check out if a certain change does what you expect. I was surprised at how easy it was to create a working agent that changes Java code at runtime. If you have such a use-case, do check out what you can do with this. PS for Project Loom, just using virtual threads and rewriting synchronized modifiers did not give me better performance.
You can find the code of the agent here.
No comments:
Post a Comment