Changes to String internal representation made in Java 1.7.0_06

by Mikhail Vorontsov

This post was updated on 19 Nov 2013 in order to reflect Java 8 changes.

This post was updated on 28 Nov 2013 in order to reflect Java 7u40 changes (thanks to Sunny Chan and his colleagues for drawing my attention to this JDK update).

Sharing an underlying char[]

The original String implementation had four non-static fields: char[] value, holding the string characters; int offset and int count, holding the index of the first character to use from value and the number of characters to use; and int hash, holding the cached String hash code. In the vast majority of cases a String had offset = 0 and count = value.length. The only exceptions to this rule were strings created via String.substring calls and via API calls using this method internally (like Pattern.split).

String.substring created a String that shared the internal char[] value with the original String, which allowed you:

  1. To save some memory by sharing character data
  2. To run String.substring in constant time (O(1))
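
The sharing scheme can be illustrated with a minimal sketch. SharedString is a hypothetical class written for this post, not the JDK source; it models only the value, offset and count fields (the cached hash is left out) and the helper retainedChars() is an assumption added to make the sharing visible.

```java
// Hypothetical model of the old String layout; not the actual JDK code.
final class SharedString {
    private final char[] value; // character data, possibly shared between instances
    private final int offset;   // index of the first character used from value
    private final int count;    // number of characters used from value

    SharedString(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Constant time: no characters are copied, only the window into value changes.
    SharedString substring(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedString(value, offset + begin, end - begin);
    }

    int length() {
        return count;
    }

    // Length of the array this instance keeps alive (illustration only).
    int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }
}
```

With this model, new SharedString("hello world").substring(6, 11) reads as "world" but still refers to the full 11-character array.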

At the same time, this feature was a source of possible memory leaks: if you extracted a tiny substring from a huge original string and discarded that original string, you still held a live reference to the huge underlying char[] value taken from the original String. The only way to avoid this was to call the new String(String) constructor on the substring: it made a copy of the required section of the underlying char[], thus unlinking your shorter string from its longer “parent”.
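
As a rough illustration of that workaround, here is a small sketch. The class SubstringCopyDemo and the method detachedPrefix are hypothetical names chosen for this example; only String.substring and the new String(String) constructor come from the JDK.

```java
import java.util.Arrays;

final class SubstringCopyDemo {
    // Returns a short prefix that does not keep the backing array of 'huge' alive.
    static String detachedPrefix(String huge, int length) {
        String prefix = huge.substring(0, length);
        // Before Java 1.7.0_06, 'prefix' shared the char[] of 'huge';
        // the copy constructor trimmed it down to the required section.
        // On later JDKs the extra copy is redundant but harmless.
        return new String(prefix);
    }

    public static void main(String[] args) {
        char[] data = new char[1_000_000];
        Arrays.fill(data, 'x');
        String huge = new String(data);

        String tiny = detachedPrefix(huge, 3);
        System.out.println(tiny); // prints "xxx"
    }
}
```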

Starting from Java 1.7.0_06 (and in current versions of Java 8, as of Nov 2013), the offset and count fields were removed from String. This means that strings can no longer share a part of an underlying char[] value. You can now forget about the memory leak described above and no longer need the new String(String) constructor to avoid it. As a drawback, you have to remember that String.substring now has linear complexity instead of constant.
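
The new behaviour can be approximated with the following sketch. CopyingString is a hypothetical class, not the JDK implementation, and retainedChars() is again a helper added purely for illustration; the point is that each substring owns a fresh array, so its cost grows with the substring length.

```java
import java.util.Arrays;

// Hypothetical model of a String without offset and count; not the actual JDK code.
final class CopyingString {
    private final char[] value; // always exactly the characters of this string

    CopyingString(String s) {
        this.value = s.toCharArray();
    }

    private CopyingString(char[] value) {
        this.value = value;
    }

    // Linear time: copies (end - begin) characters into a new array.
    CopyingString substring(int begin, int end) {
        if (begin < 0 || end > value.length || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        return new CopyingString(Arrays.copyOfRange(value, begin, end));
    }

    // Length of the array this instance keeps alive (illustration only).
    int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value);
    }
}
```

Here a 5-character substring of an 11-character string retains only 5 characters, so discarding the original string frees its array.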
