Tag Archives: Java 6

String.intern in Java 6, 7 and 8 – string pooling

by Mikhail Vorontsov

This article will describe how String.intern method was implemented in Java 6 and what changes were made in it in Java 7 and Java 8.

First of all I want to thank Yannis Bres for inspiring me to write this article.

07 June 2014 update: added 60013 as a default string pool size since Java 7u40 (instead of Java 8), added -XX:+PrintFlagsFinal and -XX:+PrintStringTableStatistics JVM parameter references, cleaned up a few typos.

This is an updated version of this article including -XX:StringTableSize=N JVM parameter description. This article is followed by String.intern in Java 6, 7 and 8 – multithreaded access article describing the performance characteristics of the multithreaded access to String.intern().

String pooling

String pooling (aka string canonicalisation) is a process of replacing several String objects with equal value but different identity with a single shared String object. You can achieve this goal by keeping your own Map<String, String> (with possibly soft or weak references depending on your requirements) and using map values as canonicalised values. Or you can use String.intern() method which is provided to you by JDK.

At times of Java 6 using String.intern() was forbidden by many standards due to a high possibility to get an OutOfMemoryException if pooling went out of control. Oracle Java 7 implementation of string pooling was changed considerably. You can look for details in http://bugs.sun.com/view_bug.do?bug_id=6962931 and http://bugs.sun.com/view_bug.do?bug_id=6962930.

String.intern() in Java 6

In those good old days all interned strings were stored in the PermGen – the fixed size part of heap mainly used for storing loaded classes and string pool. Besides explicitly interned strings, PermGen string pool also contained all literal strings earlier used in your program (the important word here is used – if a class or method was never loaded/called, any constants defined in it will not be loaded).

The biggest issue with such string pool in Java 6 was its location – the PermGen. PermGen has a fixed size and can not be expanded at runtime. You can set it using -XX:MaxPermSize=N option. As far as I know, the default PermGen size varies between 32M and 96M depending on the platform. You can increase its size, but its size will still be fixed. Such limitation required very careful usage of String.intern – you’d better not intern any uncontrolled user input using this method. That’s why string pooling at times of Java 6 was mostly implemented in the manually managed maps.

String.intern() in Java 7

Oracle engineers made an extremely important change to the string pooling logic in Java 7 – the string pool was relocated to the heap. It means that you are no longer limited by a separate fixed size memory area. All strings are now located in the heap, as most of other ordinary objects, which allows you to manage only the heap size while tuning your application. Technically, this alone could be a sufficient reason to reconsider using String.intern() in your Java 7 programs. But there are other reasons.

Continue reading