|2. Troubleshooting scenarios|
In the Java world most of us are used to GUI tools for all stages of Java application development: writing your code, debugging and profiling it. We often prefer to set up the server environment on our dev boxes and try to reproduce the problems locally using familiar tools. Unfortunately, it is often impossible to reproduce some issues locally for various reasons. For example, you may not be authorised to access the real client data which is processed by your server application.
In situations like this one you need to troubleshoot the application remotely on the server box. You should keep in mind that you can not properly troubleshoot an application with bare JRE in your hands: it contains all the troubleshooting functionality, but there is literally no way to access it. As a result, you need either a JDK or some 3rd party tools on the same box. This article will describe JDK tools, because you are likely to be allowed to use it on production boxes compared to any 3rd party tools which require security audit in many organizations.
In general case, it is sufficient just to unpack the JDK distribution to your box – you don’t need to install it properly for troubleshooting purposes (actually, proper installation may be undesirable in a lot of cases). For JMX based functionality you can install literally any Java 7/8 JDK, but some tools can not recognize the future JDK, so I advice you to install either the latest Java 7/8 JDK or the build exactly matching to server JRE – it allows you to dump the app heap for applications with no safepoints being accessed at the moment (some applications in the idle mode are the easy example of “no safepoints” applications).
2. Troubleshooting scenarios
2.1. Getting a list of running JVMs
In order to start working you nearly always need to get a list of running JVMs, their process IDs and command line arguments. Sometimes it may enough: you may find a second instance of the same application doing the same job concurrently (and damaging the output files / reopening sockets / doing some other stupid things).
jcmd without any arguments. It will print you a list of running JVMs:
3824 org.jetbrains.idea.maven.server.RemoteMavenServer 2196 780 sun.tools.jcmd.JCmd
Now you can see what diagnostic commands are available for a given JVM by running
jcmd <PID> help command. Here is a sample output for VisualVM:
>jcmd 3036 help 3036: The following commands are available: JFR.stop JFR.start JFR.dump JFR.check VM.native_memory VM.check_commercial_features VM.unlock_commercial_features ManagementAgent.stop ManagementAgent.start_local ManagementAgent.start GC.rotate_log Thread.print GC.class_stats GC.class_histogram GC.heap_dump GC.run_finalization GC.run VM.uptime VM.flags VM.system_properties VM.command_line VM.version help
jcmd <PID> <COMMAND_NAME> to either run a diagnostic command or get an error message asking for command arguments:
>jcmd 3036 GC.heap_dump 3036: java.lang.IllegalArgumentException: The argument 'filename' is mandatory.
You can get more information about a diganostic command arguments by using the following command:
jcmd <PID> help <COMMAND_NAME>. For example, here is the output for
>jcmd 3036 help GC.heap_dump 3036: GC.heap_dump Generate a HPROF format dump of the Java heap. Impact: High: Depends on Java heap size and content. Request a full GC unless the '-all' option is specified. Permission: java.lang.management.ManagementPermission(monitor) Syntax : GC.heap_dump [options] <filename> Arguments: filename : Name of the dump file (STRING, no default value) Options: (options must be specified using the <key> or <key>=<value> syntax) -all : [optional] Dump all objects, including unreachable objects (BOOLEAN, false)
2.2. Making a heap dump
jcmd provides you with a handy interface for making the heap dump in HPROF format. Simply run
jcmd <PID> GC.heap_dump <FILENAME>. Pay attention that the file name is relative to the running JVM current directory instead of your current directory, so you may want to specify the full path. It is a good idea to use
.hprof extension for the dump file name.
After thread dump completion you can copy a file to your own box and open it either in VisualVM (it is a part of JDK) and use its heap walker and query language functionality or load it into JOverflow plugin of Java Mission Control and analyze it for various memory issues.
Note 1: of course there is lots of other tools capable to handle hprof files: NetBeans, Eclipse Memory Analyzer, YourKit, etc. Use your favorite one once you have downloaded an .hprof file to your box.
Note 2: you can make a heap dump using
jmap tool as well:
jmap -dump:live,file=<FILE_NAME> <PID>. The problem with it is that it is officially documented as unsupported. Many of us were thinking that unsupported stuff in JDK will remain forever, but it turns out that this is no longer the case: JEP 240, JEP 241
2.3. Analyzing a class histogram
If you are looking for a memory leak, you are usually interested just in a number of live objects of certain type in a heap. For example, you may know that you can have only one object of a certain type at a time (some sort of main working class in your application). There also may be one or more instances of the same class in the old generation which are not garbage collected so far, but they should not be accessible from the application roots.
To print the class histogram run either of these commands (both commands print the number of live objects):
jcmd <PID> GC.class_histogram jmap -histo:live <PID>
Here are the top few lines of the example output:
num #instances #bytes class name ---------------------------------------------- 1: 5923 5976952 [I 2: 50034 4127704 [C 3: 49465 1187160 java.lang.String 4: 188 1069496 [J 5: 3985 1067240 [Ljava.util.HashMap$Node; 6: 8756 982872 java.lang.Class 7: 2855 835792 [B 8: 23570 754240 java.util.HashMap$Node 9: 13964 671440 [Ljava.lang.Object; 10: 9642 308544 java.util.Hashtable$Entry 11: 4453 213744 java.util.HashMap
Note that occupied size in bytes is a shallow size – it does not include any child objects. It is easy to notice this fact from
char (class name =
String stats – while the number of instances is similar (though there are always more
Strings, the size of
char-s is noticeably bigger, which can not be the case if
String size included the underlying
Now you can just grep/search for the class name you are interested in and check the number of live instances. If you see more instances than expected, make a heap dump and analyze it in any heap walker (see above).
2.4. Making a thread dump
Sometimes your application may be reported as “not doing anything/got stuck”. There are many kinds of getting “stuck” – a deadlock, high resource contention or simply an
O(N10) algorithm processing the requests of a few million users 🙂 In all these situations you should know what your application threads are executing and what locks do they hold.
There are 2 kinds of locks: original ones based on
synchronized keyword and
Object.wait/notifyAll methods and the
java.util.concurrent locks introduced in Java 5. The major difference between them is that former are bound to the stack frame where you enter the
synchronized section and were always available in the thread dumps. The latter (
java.util.concurrent), on the other hand, are not stack frame bound – you can enter the lock in one method and leave it in the different one. As a result, for some time they were not printed in the thread dump at all and even now they are still an option. Nevertheless, you need both sorts of locks in your thread dump to properly investigate the threading issues.
There are 3 ways to print the application thread dump. You can run
kill -3 <PID> on Linux. Or you can run one of the following commands on any platform:
jstack <PID> jcmd <PID> Thread.print
2.5. Running Java Flight Recorder
All tools mentioned up to this point in this article should be used only for quick investigation. For deeper analysis I recommend to rely on the built-in Java Flight Recorder. Take a look at my recent article about Java Mission Control for more details.
Running JFR is a 3 step process:
You need to create a JFR template file containing desired settings. In order to do this, run
jmcand go to Window -> Flight Recording Template Manager menu. Once the profile is ready, export it to the file and then send it to the box you are working on.
JFR requires JDK commercial license. I ask you to take a look again at my recent Java Mission Control article for details. I suppose you are happy with the license terms. Now you need to unlock commercial features on the required JVM:
jcmd <PID> VM.unlock_commercial_features
After that you can start JFR. Here is an example of the command line:
jcmd <PID> JFR.start name=test duration=60s settings=template.jfc filename=output.jfr
This command runs JFR immediately (
delayproperty is not set) and collects JVM information for 60 seconds using settings from
template.jfctemplate file and writing results into
output.jfr(it makes sense to use absolute paths for both files).
Once recording is done, you can copy the
.jfr file to your laptop and analyze it inside
jmc GUI. It contains nearly all information you need to troubleshoot JVM with the exception of the full heap dump, which you can make and copy to your box separately.