Category Archives: Memory optimization

Primitive types to String conversion and String concatenation

by Mikhail Vorontsov

Primitive types to String conversion

From time to time you may need to build a string in your program from several values, some of which may be of primitive types. If two or more primitive type values start your string concatenation, you need to explicitly convert the first of them to a string (otherwise System.out.println( 1 + 'a' ) will print '98' rather than '1a'). Of course, there is a family of String.valueOf methods (and the corresponding wrapper type methods), but who needs them if there is another way which requires less typing?

Concatenating an empty string literal and the first of your primitive type variables (in our example, "" + 1) is the easiest idea. The result of this expression is a String, and after that you can safely concatenate any primitive type values to it – the compiler will take care of all implicit conversions to String.

Unfortunately, this is the worst way one can imagine. In order to understand why, we need to review how the string concatenation operator is translated in Java. If we have a String value (no matter which kind – a literal, a variable or a method call) followed by the + operator followed by an expression of any type:

string_exp + any_exp

the Java compiler will translate it to:

new StringBuilder().append( string_exp ).append( any_exp ).toString()

If you have more than one + operator in the expression, you will end up with several StringBuilder.append calls before the final toString call.
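
To make this translation concrete, here is a minimal sketch (class and variable names are illustrative) comparing what the empty-literal trick roughly compiles into with the explicit String.valueOf alternative:

public class ConcatExample {
    public static void main( final String[] args ) {
        final int id = 1;
        final char c = 'a';

        // "" + id + c is compiled roughly into the chain below:
        final String viaEmptyLiteral =
            new StringBuilder().append( "" ).append( id ).append( c ).toString();

        // The same result without the useless empty-string append:
        final String viaValueOf = String.valueOf( id ) + c;

        System.out.println( viaEmptyLiteral ); // 1a
        System.out.println( viaValueOf );      // 1a
    }
}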

Continue reading

String.intern in Java 6, 7 and 8 – multithreaded access

by Mikhail Vorontsov

This article follows the initial String.intern in Java 6, 7 and 8 – string pooling article describing the implementation and benefits of using the String.intern() method in Java 7 and 8. The original article was already getting too long, so I had to write this one in order to describe the performance characteristics of multithreaded access to String.intern.

The tests we will perform today call String.intern() from multiple threads. They emulate the behavior of a lot of modern server applications (for example, specialized web crawlers). For the tests I will use my new workstation with an Intel Xeon E5-2650 CPU (8 physical, 16 virtual cores @ 2 GHz) and 128 GB RAM. It will allow us to test rather high contention scenarios. In this article we will create 8 threads in order to utilize all physical cores.

There will be 4 tests:

  1. The benchmark one – a single thread calling String.intern() using the testLongLoop method from the previous article. It will show us how fast this configuration is without any contention.
  2. All 8 threads are calling String.intern() with unique values – each thread adds its own prefix to the interned string. This test will show us the synchronization overhead of String.intern(). It should be the theoretical worst case: it is highly unlikely that the only thing an actual application would do is call String.intern in a loop from many threads.
  3. Initially we start the first thread interning a set of strings. After a 2 second delay we start a second thread interning the same set of strings. We expect the following assumptions to hold: str.intern() == str for the first thread; str.intern() != str for the second thread. This will allow us to prove that there are no thread-local JVM string pools (a sketch of this test follows the list).
  4. All 8 threads will intern the same set of values. This scenario will be closer to the real situation – it will give us a rather likely mix of adding strings to the JVM pool and querying strings from it. Nevertheless, such a high read contention on the JVM string pool is still an unlikely event.
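
Below is a minimal sketch of the idea behind test 3 (illustrative code, not the benchmark harness from the article; it uses a latch instead of the 2 second delay, and the string values and counts are made up):

import java.util.concurrent.CountDownLatch;

public class InternIdentitySketch {
    public static void main( final String[] args ) throws InterruptedException {
        final CountDownLatch firstDone = new CountDownLatch( 1 );

        final Runnable firstPass = () -> {
            for ( int i = 0; i < 1_000_000; ++i ) {
                final String s = "value" + i;
                if ( s.intern() != s )  // expected: identity holds, we just added this string to the pool
                    System.out.println( "Unexpected: pooled copy already existed for " + s );
            }
            firstDone.countDown();
        };

        final Runnable secondPass = () -> {
            try { firstDone.await(); } catch ( InterruptedException e ) { return; }
            for ( int i = 0; i < 1_000_000; ++i ) {
                final String s = "value" + i;
                if ( s.intern() == s )  // expected: a different, already pooled copy comes back
                    System.out.println( "Unexpected: no pooled copy found for " + s );
            }
        };

        final Thread t1 = new Thread( firstPass );
        final Thread t2 = new Thread( secondPass );
        t1.start(); t2.start();
        t1.join();  t2.join();
    }
}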

Continue reading

String.intern in Java 6, 7 and 8 – string pooling

by Mikhail Vorontsov

This article will describe how the String.intern method was implemented in Java 6 and what changes were made to it in Java 7 and Java 8.

First of all I want to thank Yannis Bres for inspiring me to write this article.

07 June 2014 update: added 60013 as a default string pool size since Java 7u40 (instead of Java 8), added -XX:+PrintFlagsFinal and -XX:+PrintStringTableStatistics JVM parameter references, cleaned up a few typos.

This is an updated version of the article including the -XX:StringTableSize=N JVM parameter description. It is followed by the String.intern in Java 6, 7 and 8 – multithreaded access article describing the performance characteristics of multithreaded access to String.intern().

String pooling

String pooling (aka string canonicalisation) is the process of replacing several String objects that are equal by value but distinct by identity with a single shared String object. You can achieve this goal by keeping your own Map<String, String> (with possibly soft or weak references depending on your requirements) and using the map values as canonicalised values. Or you can use the String.intern() method which is provided to you by the JDK.
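
For illustration, a minimal sketch of such a manual pool, assuming a plain ConcurrentHashMap with strong references (a real implementation may prefer weak or soft references, or an eviction policy):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public final class StringCanonicalizer {
    private final ConcurrentMap<String, String> pool = new ConcurrentHashMap<String, String>();

    /** Returns the single shared copy for all strings equal to s. */
    public String canonicalize( final String s ) {
        final String existing = pool.putIfAbsent( s, s );
        return existing != null ? existing : s;
    }
}

A call like canonicalize( rawId ) then plays the same role as rawId.intern(), but the pool lives in a regular map that you fully control.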

In the days of Java 6, using String.intern() was forbidden by many standards due to the high possibility of getting an OutOfMemoryError if pooling went out of control. The Oracle Java 7 implementation of string pooling was changed considerably. You can look for details in http://bugs.sun.com/view_bug.do?bug_id=6962931 and http://bugs.sun.com/view_bug.do?bug_id=6962930.

String.intern() in Java 6

In those good old days all interned strings were stored in the PermGen – a fixed size part of the heap mainly used for storing loaded classes and the string pool. Besides explicitly interned strings, the PermGen string pool also contained all string literals earlier used in your program (the important word here is used – if a class or method was never loaded/called, any constants defined in it will not be loaded).

The biggest issue with such a string pool in Java 6 was its location – the PermGen. PermGen has a fixed size and can not be expanded at runtime. You can set it using the -XX:MaxPermSize=N option. As far as I know, the default PermGen size varies between 32M and 96M depending on the platform. You can increase its size, but it will still be fixed. Such a limitation required very careful usage of String.intern – you’d better not intern any uncontrolled user input using this method. That’s why string pooling in the days of Java 6 was mostly implemented in manually managed maps.

String.intern() in Java 7

Oracle engineers made an extremely important change to the string pooling logic in Java 7 – the string pool was relocated to the heap. It means that you are no longer limited by a separate fixed-size memory area. All strings are now located in the heap, like most other ordinary objects, which allows you to manage only the heap size while tuning your application. Technically, this alone could be a sufficient reason to reconsider using String.intern() in your Java 7 programs. But there are other reasons.

Continue reading

A few more memory saving techniques in Java

by Mikhail Vorontsov

This article will list a few more ideas on cheap memory saving techniques in Java. It follows the An overview of memory saving techniques in Java, Memory consumption of popular Java data types – part 1 and Memory consumption of popular Java data types – part 2 articles earlier published in this blog. I strongly recommend reading An overview of memory saving techniques in Java before reading this article. This article assumes that the size of Object references is equal to 4 bytes.

Static inner classes

Java lacks normal library tuple support like the one Scala has. This forces Java developers to create a lot of classes, usually inner ones, whose only purpose is to combine a few elements into a single entity for a collection. One really popular example of such an inner class is Pair, which is often implemented as a generic class.

Inner non-static classes have an important property – they store a reference to the instance of their enclosing class. Such a reference allows them to seamlessly call parent methods and access parent fields. Such a reference is often not needed for simple tuple-like classes. You can get rid of it by marking your class as static. This will save you 4 bytes and increase its chances to occupy 8 bytes less (see An overview of memory saving techniques in Java for more details).

You should generally mark all your inner classes as static unless there is a real need to remove that qualifier.
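
As an illustration, here is a hypothetical tuple-like class written the recommended way – static, so it carries no hidden reference to the enclosing instance (the enclosing class name is made up):

public class GeoCache {
    /** Static nested class: no synthetic this$0 field pointing at the GeoCache instance. */
    public static final class Pair<A, B> {
        public final A first;
        public final B second;

        public Pair( final A first, final B second ) {
            this.first = first;
            this.second = second;
        }
    }
}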

Continue reading

Memory consumption of popular Java data types – part 2

by Mikhail Vorontsov

This article will give you an overview of the memory consumption of some popular Java data types. It follows the An overview of memory saving techniques in Java and Memory consumption of popular Java data types – part 1 articles earlier published in this blog. I strongly recommend reading An overview of memory saving techniques in Java before reading this one. This article assumes that the size of references is equal to 4 bytes.

HashMap, THashMap

HashMap is the most popular map in the JDK. Provided that it contains objects with a quality hashCode method and its load factor is not too high, it will generally give you O(1) performance of the get and put methods (as well as similar methods like containsKey).

You can find a proper hash map description in one of the popular textbooks (“Algorithms (4th Edition)” or “Introduction to Algorithms”) or on Wikipedia. Here we will only discuss the JDK HashMap implementation.

HashMap is built on top of an array of Map.Entry objects. The implementation ensures that this array length is always at least max( size, capacity ) / load_factor. The default load factor for HashMap is 0.75 and the default capacity is 16. The load factor specifies which part of the array can be used for storage and is a value between 0 and 1. The higher the load factor, the less space is wasted, but the HashMap starts to work more slowly due to the increased rate of collisions. The smaller the load factor, the more memory is wasted, but the performance of a HashMap increases due to a smaller probability of collisions.

So, as you have seen, for the default settings (new HashMap<>()) the size of the array of entries is 16 / 0.75 = 21.33 ~ 22.

What is a HashMap.Entry? It contains a key, a value, the int hash of the key and a pointer to the next entry (remember that the entries array could be sparse). It means that an entry occupies 32 bytes (12 bytes header + 16 bytes data + 4 bytes padding). So, a HashMap with size = S has to spend 32 * S bytes for entry storage. Besides, it will use 4 * C bytes for the entries array, where C is the map capacity.

As you can see, if the map is sparsely populated (its size is less than 12.5% of its capacity), the size of the entries array will start dominating over the entries themselves.
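
A back-of-the-envelope sketch of this estimate (illustrative only: it counts just the entries and the array, ignores the keys and values themselves, and assumes 4 byte references):

public class HashMapMemoryEstimate {
    /** Rough shallow cost of a HashMap: 32 bytes per entry plus 4 bytes per array slot. */
    public static long estimateBytes( final int size, final int capacity ) {
        return 32L * size + 4L * capacity;
    }

    public static void main( final String[] args ) {
        // densely populated map: entries dominate
        System.out.println( estimateBytes( 1_000_000, 2_097_152 ) ); // 32,000,000 + 8,388,608 = 40,388,608 bytes
        // sparsely populated map (size < 12.5% of capacity): the array itself dominates
        System.out.println( estimateBytes( 100_000, 2_097_152 ) );   // 3,200,000 + 8,388,608 = 11,588,608 bytes
    }
}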

Continue reading

Memory consumption of popular Java data types – part 1

by Mikhail Vorontsov

This article will give you an overview of the memory consumption of some popular Java data types. It follows the An overview of memory saving techniques in Java article earlier published in this blog. I strongly recommend reading the previous article before reading this one.

Enum, EnumMap, EnumSet

Enums were added to Java 1.5. Technically speaking, each enum constant is an Object and has all the memory consumption properties of objects described in the previous article. Luckily, they have several useful properties:

All enum objects are singletons. This is guaranteed by the Java language – you can obtain an instance of an enum only via its definition. This means that you pay for an enum object once and then you only spend 4 bytes per reference to it (in this article I will assume that object references occupy 4 bytes). But what is the benefit of enums, besides type safety and better IDE support, if they consume as much memory as int variables and 4 times more than byte variables?

The answer is the ordinal() method implemented by every enum. It returns an increasing number starting from 0 and ending at the number of values in the given enum minus 1. This enum property allows us to use arrays for mapping from an enum to any other object: a value related to the first enum constant will be stored in array[0], the second enum constant will be mapped to array[1] and so on, according to the Enum.ordinal() result. By the way, this was a short description of the JDK EnumMap class implementation.
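
A sketch of that idea with a hypothetical enum (not the actual EnumMap source): values are kept in a plain array indexed by ordinal():

public class OrdinalMappingExample {
    private enum Severity { LOW, MEDIUM, HIGH }

    public static void main( final String[] args ) {
        // one slot per enum constant, indexed by ordinal()
        final String[] descriptions = new String[ Severity.values().length ];
        descriptions[ Severity.LOW.ordinal() ]    = "can wait";
        descriptions[ Severity.MEDIUM.ordinal() ] = "look at it today";
        descriptions[ Severity.HIGH.ordinal() ]   = "wake somebody up";

        System.out.println( descriptions[ Severity.HIGH.ordinal() ] ); // wake somebody up
    }
}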

If you need to implement a map from Object to enum, you won’t have a tailored JDK implementation at hand. Of course, you can use any JDK Object-to-Object map, but it wouldn’t be that efficient. The easy way here is to use Trove TObjectByteMap (or TObjectIntMap in the rare cases when you have more than 128 values in your enum) and map from the Object key to the Enum.ordinal() value. You will need a decoding method for getters in order to convert the byte back into an enum. This approach requires 1 byte per entry, which is the best we can do without paying a CPU penalty (of course, we could use less than 1 byte per enum if there are fewer than 128, 64, 32, etc. elements in the enum, but it may make your code more complicated for a very small memory gain).
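
A sketch of this approach, assuming Trove 3's TObjectByteHashMap and a hypothetical Status enum (class and field names are illustrative):

import gnu.trove.map.TObjectByteMap;
import gnu.trove.map.hash.TObjectByteHashMap;

public class ObjectToEnumMap {
    private enum Status { NEW, ACTIVE, CLOSED }

    // cache values() once - it clones the backing array on every call
    private static final Status[] STATUS_VALUES = Status.values();

    private final TObjectByteMap<String> idToStatus = new TObjectByteHashMap<String>();

    public void put( final String id, final Status status ) {
        idToStatus.put( id, (byte) status.ordinal() );   // 1 byte per entry instead of a reference
    }

    public Status get( final String id ) {
        // note: a missing key returns the map's no-entry value; check containsKey( id ) first if absence is possible
        return STATUS_VALUES[ idToStatus.get( id ) ];    // decode the ordinal back into the enum
    }
}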

With all this knowledge at hand, you may now realize that EnumSet is implemented similarly to BitSet. There are 2 EnumSet implementations in the JDK: RegularEnumSet and JumboEnumSet. The former is used for enums having less than 65 values (it covers 99.9999% of real-world enums), the latter is used for larger enumerations.

RegularEnumSet utilizes the fact that there are at most 64 values in the enum. This allows RegularEnumSet to use a single long to store all “enum present in the set” flags, rather than using a long[] (utilized by JumboEnumSet or BitSet). Using a single long instead of a long[1] saves 4 bytes on the long[] reference and 16 bytes on the long value (a long occupies 8 bytes, while a long[1] needs 12 bytes for the header, 8 bytes for the single long and 4 bytes for alignment).
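
A sketch of the underlying trick with a hypothetical enum: all membership flags are packed into a single long, much like RegularEnumSet does internally:

public class EnumBitMaskExample {
    private enum Day { MON, TUE, WED, THU, FRI, SAT, SUN }

    public static void main( final String[] args ) {
        long elements = 0L;                                  // the whole "set" lives in one long

        elements |= 1L << Day.TUE.ordinal();                 // add TUE
        elements |= 1L << Day.SAT.ordinal();                 // add SAT

        final boolean hasTue = ( elements & ( 1L << Day.TUE.ordinal() ) ) != 0;
        final boolean hasWed = ( elements & ( 1L << Day.WED.ordinal() ) ) != 0;

        System.out.println( hasTue ); // true
        System.out.println( hasWed ); // false
    }
}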

Continue reading

An overview of memory saving techniques in Java

by Mikhail Vorontsov

This article will give you general advice on memory consumption optimization in Java.

Memory usage optimization is the most important type of optimization in Java. Current systems are limited by memory access times rather than by CPU frequency (otherwise, why would CPU vendors implement all these L1, L2 and L3 caches?). It means that by reducing your application’s memory footprint you will most likely improve your program’s data processing speed by making your CPU wait for a smaller amount of data. Now let’s get back to Java.

General Java memory layout information

First of all, we have to revise the memory layout of Java objects: any Java Object occupies at least 16 bytes, 12 of which are occupied by the Java object header. Besides that, all Java objects are aligned on an 8 byte boundary. It means that, for example, an object containing 2 fields, an int and a byte, will occupy not 17 (12 + 4 + 1) but 24 bytes (17 aligned up to 8 bytes).
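
For example, a hypothetical class like the one below illustrates the arithmetic (the JVM may reorder fields, but the total stays the same):

public class LayoutExample {
    // 12 bytes: object header
    private int  count;   //  4 bytes
    private byte flag;    //  1 byte
                          // 12 + 4 + 1 = 17 bytes, padded up to the next 8 byte boundary = 24 bytes
}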

Each Object reference occupies 4 bytes if the Java heap is under 32G and -XX:+UseCompressedOops is turned on (it is on by default in recent Oracle JVM versions). Otherwise, Object references occupy 8 bytes.

All primitive data types occupy their exact byte size:

byte, boolean – 1 byte
short, char – 2 bytes
int, float – 4 bytes
long, double – 8 bytes

In essence, this information is sufficient for Java memory optimization. But it will be more convenient if you are also aware of the memory consumption of arrays, Strings and numeric wrappers.

Continue reading

Protobuf data encoding for numeric datatypes

by Mikhail Vorontsov

This article will give you a short overview of Google protobuf varint encoding and its applicability to financial data processing.

Google protobuf defines a data serialization format useful for both data storage and RPC calls. It is widely used inside Google itself (according to the Protobuf documentation). It defines serialization formats for all datatypes (check this page for more details), but here is an overview:

Protobuf encoding for the most used datatypes

Unsigned integral types (int32, int64, bool, enum)

Protobuf relies on the idea that average data contains more small numbers than large ones. A varint encoding consists of a sequence of bytes. The most significant bit (MSB) of each byte tells whether it is the last byte in the encoding (0) or whether more bytes follow (1). The first byte in the sequence contains the lowest 7 bits of the original number, the next byte contains the next 7 bits (bits 7-13) and so on. We stop encoding when there are no more set bits (=1) left in the number. For example, 34 (22 hex) will be encoded as 0 010 0010 = 22 hex = 34 in this scheme. On the other hand, 255 (1111 1111) will be encoded as a 2 byte sequence: 1111 1111, 0000 0001 – the first byte contains the 7 lowest bits of the original number, and its MSB tells us that one more byte follows. The second byte contains bit 7 of the original number; the MSB of the second byte is set to zero – no more bytes are expected.

You may notice that this encoding is similar (but not identical) to the UTF-8 encoding. It favors small values over big ones, which is the case for most real-world data (for example, the volume traded in a single trade on a stock exchange will tend to be small).

Signed integral types

You can not apply the previous encoding scheme to fields that may be negative. All negative numbers have the highest bit set, so you would always have to spend the maximal amount of space encoding negative values. Due to this problem, Protobuf uses ZigZag encoding, which transforms a signed 32/64 bit number into an unsigned one: 0 is converted into 0, -1 into 1, 1 into 2, -2 -> 3, 2 -> 4 and so on. This scheme still favors small (by absolute value) numbers at the cost of just a single bit. A number converted with ZigZag is then encoded using the varint encoding (a code sketch of both encodings follows this table).

Floating point numbers

These numbers are written as fixed length 32/64 bit values using Float.floatToIntBits / Double.doubleToLongBits. Varint encoding is not applicable to these numbers due to their different bit structure (sign-exponent-mantissa).

Strings

Strings are encoded as a varint-encoded string length followed by the UTF-8 encoded string contents (I have already written about a similar in-memory data compression technique in the String packing part 2: converting Strings to any other objects article).
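
Here is a minimal sketch of the two integer encodings described above (illustrative code, not the actual protobuf-java implementation):

import java.io.ByteArrayOutputStream;

public class VarintSketch {
    /** Varint: 7 payload bits per byte, MSB set while more bytes follow. */
    public static byte[] encodeVarint( long value ) {
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ( ( value & ~0x7FL ) != 0 ) {       // more than 7 significant bits left
            out.write( (int) ( ( value & 0x7F ) | 0x80 ) );
            value >>>= 7;
        }
        out.write( (int) value );                 // last byte, MSB = 0
        return out.toByteArray();
    }

    /** ZigZag: maps 0, -1, 1, -2, 2 ... to 0, 1, 2, 3, 4 ... */
    public static long zigZag( final long value ) {
        return ( value << 1 ) ^ ( value >> 63 );
    }

    public static void main( final String[] args ) {
        System.out.println( encodeVarint( 34 ).length );            // 1 byte: 0x22
        System.out.println( encodeVarint( 255 ).length );           // 2 bytes: 0xFF, 0x01
        System.out.println( encodeVarint( zigZag( -2 ) ).length );  // -2 -> 3 -> 1 byte
    }
}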

Protobuf defines support for repeated fields as well as nested messages, but these are outside the scope of this article. You can read the official documentation for more details.

Continue reading

Various types of memory allocation in Java

by Mikhail Vorontsov

This article discusses various types of memory buffer allocation in Java. We will see how to treat any sort of Java buffer uniformly using the sun.misc.Unsafe memory access methods. This article may be especially interesting for ex-C programmers who want to work with memory at the lowest possible level in Java.

If you are more interested in general Java memory optimization, take a look at An overview of memory saving techniques in Java article in this blog as well as its following parts: one, two.

Array allocation limitations

Array size in Java is limited by the use of int as an array index. This means that you can not allocate an array with more than Integer.MAX_VALUE ( = 2^31 - 1 ) elements. This doesn’t mean that the longest chunk of memory you can allocate in Java is 2 Gb. You can allocate an array of a wider element type instead. For example,

final long[] ar = new long[ Integer.MAX_VALUE ];

will allocate 16Gb - 8 bytes, provided that you have a sufficiently high -Xmx Java setting (usually you should have about 50% more memory in the heap – so in order to allocate a 16Gb buffer, you will have to specify -Xmx24G; this is a general rule, the actual required heap size may vary).

Unfortunately, you will be limited by your array element type in pure Java. The only useful class for working with such buffers is ByteBuffer, which offers methods for reading/writing various Java data types in the buffer (see Various methods of binary serialization in Java for more details). The disadvantage of a ByteBuffer is that you are limited to byte[] as the underlying array type, which means a limitation of 2Gb for your buffer.

Treating any arrays as a byte buffer

For a while let’s assume that 2Gb buffers are not sufficient for our needs, but a 16Gb buffer will make us happy. We have allocated a long[], but want to treat this buffer as a byte array. We need to use the best friend of a C programmer in Java – sun.misc.Unsafe. This class has 2 sets of methods: getN( Object, offset ), where N is a type name, for reading a value of the given type from the given offset in the object, and putN( Object, offset, value ) for writing a value at the given offset.

Unfortunately, these methods get or set only an individual value. If you want to copy data to/from an array, you will need one more Unsafe method: copyMemory( srcObject, srcOffset, destObject, destOffset, count ). It works similarly to System.arraycopy, but copies bytes instead of array elements.
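
As a hedged sketch (Oracle/OpenJDK specific; it obtains sun.misc.Unsafe via reflection on the theUnsafe field), here is how a single byte could be read out of a long[]:

import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeByteAccess {
    private static final Unsafe UNSAFE;
    private static final long LONG_ARRAY_BASE;

    static {
        try {
            final Field f = Unsafe.class.getDeclaredField( "theUnsafe" );
            f.setAccessible( true );
            UNSAFE = (Unsafe) f.get( null );
            LONG_ARRAY_BASE = UNSAFE.arrayBaseOffset( long[].class );
        } catch ( final Exception e ) {
            throw new ExceptionInInitializerError( e );
        }
    }

    /** Treats the long[] as a flat byte buffer and reads the byte at byteIndex. */
    public static byte getByte( final long[] buffer, final long byteIndex ) {
        return UNSAFE.getByte( buffer, LONG_ARRAY_BASE + byteIndex );
    }

    public static void main( final String[] args ) {
        final long[] buffer = new long[ 4 ];
        buffer[ 0 ] = 0x0807060504030201L;
        System.out.println( getByte( buffer, 0 ) ); // 1 on a little-endian platform
    }
}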

In order to access array data using sun.misc.Unsafe, you will need 2 components:

Continue reading

Use case: Optimizing memory footprint of a read only csv file (Trove, Unsafe, ByteBuffer, data compression)

by Mikhail Vorontsov

Suppose your application has to obtain some data from an auxiliary data source, which could be a csv file. Your csv file contains several fields, one of which is used as an ID field. You need to keep the whole file in memory and provide fast access to records by the ID field. Your additional target is to consume as little memory as possible, while keeping the access cost as low as possible. In this article we will process fake person records. Here is an example:

{id=idnum10, surname=Smith10, address=10 One Way Road, Springfield, NJ, DOB=01/11/1965, names=John Paul 10 Ringo}
    

All these records are generated by the following class:

private static class DataGenerator
{
    private int cnt = 1;

    public Map<String, String> getNextEntry()
    {
        final Map<String, String> res = new HashMap<String, String>( 8 );
        res.put( ID, "idnum" + cnt );
        res.put( "names", "John Paul " + cnt + " Ringo" );
        res.put( "surname", "Smith" + cnt );
        res.put( "DOB", "01/" + two((cnt % 31) + 1) + "/1965" ); //date of birth, MM/DD/YYYY
        res.put( "address", cnt + " One Way Road, Springfield, NJ" );
        ++cnt;
        return res;
    }

    private String two( final int val )
    {
        return val < 10 ? "0" + val : Integer.toString( val );
    }
}

Simple approach – map of maps

We always have to start from something simple and easy to support. In this case it may be a map of maps: the outer map is indexed by the ID field and the inner ones are indexed by field names.

/**
 * Initial version. Outer map indexed by a key field value, inner map is field name to field value.
 * All field names are interned, but we still have to pay for storing references to field names.
 */
private static class SimpleMapStorage extends SimpleStorage
{
    private final Map<String, Map<String, String>> m_data = new HashMap<String, Map<String, String>>( 1000 );

    public void addEntry( final Map<String, String> entry )
    {
        m_data.put( entry.get( ID ), entry );
    }

    public Map<String, String> getById( final String id )
    {
        return m_data.get( id );
    }
}

For testing purposes, all storage implementations will either implement the Storage interface or extend the SimpleStorage class. You will see the purpose of the pack method in the more advanced examples.

private interface Storage
{
    public void addEntry( final Map<String, String> entry );
    public Map<String, String> getById( final String id );
    public void pack();
}

public static abstract class SimpleStorage implements Storage
{
    public void pack()
    {
    }
}

All storage implementations will be tested by the following method:

private static void testStorage(final int recordCount, final Storage storage)
{
    final DataGenerator generator = new DataGenerator();
    for ( int i = 0; i < recordCount; ++i )
        storage.addEntry( generator.getNextEntry() );
    storage.pack();
    System.gc();
    final long mem = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
    System.out.println( storage.getClass().getName() + ": " + mem / 1024.0 / 1024.0 + " MB");
    System.out.println( storage.getById( "idnum10" ) ); //in order to retain storage in memory
}

For every implementation in this article we will try to create 1 million entries. SimpleMapStorage consumes 706 Mb to store 1M records. The actual data size is about 82 Mb, which means that this simple implementation wastes about 85% of consumed memory. Yes, straightforward solutions for big data storage do not work well in Java…

Continue reading