Various types of memory allocation in Java

by Mikhail Vorontsov

This article discusses various types of memory buffer allocation in Java. We will see how to treat any sort of Java buffer uniformly using the sun.misc.Unsafe memory access methods. It may be especially interesting for former C programmers who want to work with memory at the lowest possible level in Java.

If you are more interested in general Java memory optimization, take a look at the article An overview of memory saving techniques in Java in this blog, as well as its follow-up parts: one, two.

Array allocation limitations

Array size in Java is limited by the use of int as the array index type. This means that you cannot allocate an array with more than Integer.MAX_VALUE ( = 2^31 - 1 ) elements. It does not mean, however, that the longest chunk of memory you can allocate in Java is 2 GB. You can allocate an array of a wider element type instead. For example,

final long[] ar = new long[ Integer.MAX_VALUE ];

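will allocate 16 GB - 8 bytes, provided your -Xmx setting is high enough. As a rule of thumb, the heap should be about 50% larger than the array, so to allocate a 16 GB buffer you would specify -Xmx24G; the heap size actually required may vary. Some JVMs also reserve a few header words per array and reject lengths within a few elements of Integer.MAX_VALUE, so the practical maximum may be slightly lower.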
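The arithmetic behind the "16 GB - 8 bytes" figure can be checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and it counts the array payload only, ignoring the object header.

```java
// Hypothetical helper, not part of the original article.
final class ArraySizeMath {
    /** Payload size in bytes of a long[] of the given length (array header excluded). */
    static long longArrayPayloadBytes(int length) {
        // Widen to long before multiplying, otherwise the int product would overflow.
        return (long) Long.BYTES * length;
    }

    public static void main(String[] args) {
        long bytes = longArrayPayloadBytes(Integer.MAX_VALUE);
        long sixteenGiB = 16L * 1024 * 1024 * 1024;
        System.out.println(bytes);               // 17179869176
        System.out.println(sixteenGiB - bytes);  // 8
    }
}
```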

Unfortunately, in pure Java you are limited by your array element type: a long[] can only be read and written as longs. The only useful class for treating an array as raw bytes is ByteBuffer, which offers methods for reading and writing various Java data types in the buffer (see Various methods of binary serialization in Java for more details). The disadvantage of ByteBuffer is that its only possible backing array type is byte[], which limits your buffer to 2 GB.
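As a reminder of what ByteBuffer offers, here is a minimal sketch that stores values of several types in a byte[] and reads them back. The class and method names are illustrative only; the ByteBuffer calls themselves are standard java.nio API.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of the original article.
final class ByteBufferDemo {
    /** Writes an int, a long and a double into a byte[]-backed buffer and reads them back. */
    static String roundTrip() {
        byte[] storage = new byte[4 + 8 + 8];       // room for int + long + double
        ByteBuffer buf = ByteBuffer.wrap(storage);  // big-endian byte order by default
        buf.putInt(42).putLong(1L << 40).putDouble(2.5);
        buf.flip();                                 // switch from writing to reading
        return buf.getInt() + " " + buf.getLong() + " " + buf.getDouble();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip()); // 42 1099511627776 2.5
    }
}
```

The capacity of such a buffer is an int, which is exactly the 2 GB limitation discussed above.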

Treating any arrays as a byte buffer

For the moment, let’s assume that 2 GB buffers are not sufficient for our needs, but a 16 GB buffer will make us happy. We have allocated a long[], but want to treat it as a byte array. For that we need the C programmer's best friend in Java: sun.misc.Unsafe. This class has 2 sets of methods: getN( Object, offset ), which reads a value of type N from the given offset in the object, and putN( Object, offset, value ), which writes a value at the given offset.

Unfortunately, these methods get or set only an individual value. If you want to copy data to or from an array, you will need one more Unsafe method: copyMemory(srcObject, srcOffset, destObject, destOffset, count). It works similarly to System.arraycopy, but count is a number of bytes rather than array elements.
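A minimal sketch of copyMemory, copying the raw bytes of a long[] into a byte[]. The class and helper names are hypothetical. It assumes a JDK on which the sun.misc.Unsafe memory access methods are still usable (recent releases deprecate them), and it uses arrayBaseOffset, which is explained in the next paragraphs.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class, not part of the original article.
final class CopyMemoryDemo {
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    /** Copies the raw bytes of a (small) long[] into a new byte[] of the same total size. */
    static byte[] toBytes(long[] src) {
        // multiplyExact guards against int overflow; large arrays would not fit in a byte[] anyway.
        byte[] dst = new byte[Math.multiplyExact(src.length, Long.BYTES)];
        UNSAFE.copyMemory(src, UNSAFE.arrayBaseOffset(long[].class),
                          dst, UNSAFE.arrayBaseOffset(byte[].class),
                          dst.length);              // the last argument is a byte count
        return dst;
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new long[] { -1L, 0L });
        System.out.println(bytes.length);           // 16
    }
}
```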

In order to access array data using sun.misc.Unsafe, you will need 2 components:


  1. Offset of array data from the array object
  2. Offset of your element from the beginning of the array data

Arrays, like any other Java objects, have a header, which is located before the actual data. The length of this header can be obtained via unsafe.arrayBaseOffset( T[].class ), where T is the type of your array elements. The size of an array element can be obtained via unsafe.arrayIndexScale( T[].class ). This means that in order to access the N-th element of type T in your buffer, you will need to use offset = arrayOffset + N * arrayScale.
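The offset formula above can be sketched as follows for a long[]. The class, constant and method names are hypothetical, and the same caveat applies: this assumes a JDK where sun.misc.Unsafe is still accessible through reflection.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class, not part of the original article.
final class ArrayOffsetDemo {
    private static final Unsafe UNSAFE = loadUnsafe();
    private static final long BASE = UNSAFE.arrayBaseOffset(long[].class);   // header length
    private static final long SCALE = UNSAFE.arrayIndexScale(long[].class);  // element size

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    /** offset = arrayOffset + N * arrayScale */
    static long elementOffset(int n) {
        return BASE + n * SCALE;
    }

    /** Reads the n-th element without any bounds check: the caller must keep n in range. */
    static long read(long[] ar, int n) {
        return UNSAFE.getLong(ar, elementOffset(n));
    }

    public static void main(String[] args) {
        long[] ar = { 10, 20, 30 };
        System.out.println(read(ar, 2)); // 30
    }
}
```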

You can look at an example of Unsafe memory access in the UnsafeMemory class presented in the Various methods of binary serialization in Java article.

Now let’s write a simple example. We will allocate a long[] and update a few of its bytes. We will set the last array element to -1 (0xFFFF FFFF FFFF FFFF in hex) and then clear its bytes one by one:

final long[] ar = new long[ 1000 ];
final int index = ar.length - 1;
ar[ index ] = -1; //FFFF FFFF FFFF FFFF
 
System.out.println( "Before change = " + Long.toHexString( ar[ index ] ));
 
for ( long i = 0; i < 8; ++i )
{
    unsafe.putByte( ar, longArrayOffset + 8L * index + i, (byte) 0);
    System.out.println( "After change: i = " + i + ", val = "  +  Long.toHexString( ar[ index ] ));
}

Examples from this article require the following static code in your test class (plus imports of sun.misc.Unsafe and java.lang.reflect.Field):

private static final Unsafe unsafe;
static
{
    try
    {
        Field field = Unsafe.class.getDeclaredField("theUnsafe");
        field.setAccessible(true);
        unsafe = (Unsafe)field.get(null);
    }
    catch (Exception e)
    {
        throw new RuntimeException(e);
    }
}
 
private static final long longArrayOffset = unsafe.arrayBaseOffset(long[].class);

Here is the output of the byte-clearing example above. As you can see, we have successfully changed individual bytes in the long[]. The least significant byte is cleared first because this output was produced on a little-endian platform (such as x86); on a big-endian platform the bytes would be cleared starting from the most significant one.

Before change = ffffffffffffffff
After change: i = 0, val = ffffffffffffff00
After change: i = 1, val = ffffffffffff0000
After change: i = 2, val = ffffffffff000000
After change: i = 3, val = ffffffff00000000
After change: i = 4, val = ffffff0000000000
After change: i = 5, val = ffff000000000000
After change: i = 6, val = ff00000000000000
After change: i = 7, val = 0

sun.misc.Unsafe buffer allocation

As we have seen, pure Java limits the maximum buffer size. This limitation dates back to the original version of Java, a time when few people dared to think about multi-gigabyte memory buffers on commodity computers. Now, in the age of Big Data, we may need bigger memory buffers. There are 2 ways to get such buffers in Java:

  • Allocate several smaller buffers and treat them logically as one large buffer.
  • Use sun.misc.Unsafe.allocateMemory( long ) for memory buffer allocation.

The first approach is interesting only from an algorithmic point of view, so we will take a look at the second one.
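For completeness, the first approach can be sketched in pure Java as a set of fixed-size byte[] chunks addressed by a long index. This is only one possible layout; the class name and the 1 MB chunk size are arbitrary choices made for illustration.

```java
// Hypothetical sketch, not part of the original article.
final class ChunkedByteBuffer {
    private static final int CHUNK_BITS = 20;               // 1 MB chunks, an arbitrary choice
    private static final int CHUNK_SIZE = 1 << CHUNK_BITS;
    private static final int CHUNK_MASK = CHUNK_SIZE - 1;

    private final byte[][] chunks;
    private final long size;

    ChunkedByteBuffer(long size) {
        if (size < 0) throw new IllegalArgumentException("negative size");
        this.size = size;
        int count = (int) ((size + CHUNK_MASK) >>> CHUNK_BITS);  // round up
        chunks = new byte[count][];
        for (int i = 0; i < count; i++) {
            long remaining = size - ((long) i << CHUNK_BITS);
            chunks[i] = new byte[(int) Math.min(CHUNK_SIZE, remaining)];  // last chunk may be shorter
        }
    }

    long size() {
        return size;
    }

    // The high bits of the index select the chunk, the low bits select the byte inside it.
    byte get(long index) {
        return chunks[(int) (index >>> CHUNK_BITS)][(int) (index & CHUNK_MASK)];
    }

    void put(long index, byte value) {
        chunks[(int) (index >>> CHUNK_BITS)][(int) (index & CHUNK_MASK)] = value;
    }
}
```

Every access pays for the extra index arithmetic and an extra array dereference, but the memory stays on the Java heap and is garbage collected as usual.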

sun.misc.Unsafe provides a group of methods for memory allocation, reallocation and deallocation. They are very similar to the C malloc/realloc/free functions:

  • long Unsafe.allocateMemory(long size) - allocates a memory buffer. The buffer will contain garbage (it is not zeroed). This method throws java.lang.OutOfMemoryError if the allocation fails. Returns a non-zero buffer address (see below for a description).
  • long Unsafe.reallocateMemory(long address, long size) - reallocates a memory buffer, copying data from the old buffer (pointed to by address) to the new one. If address==0, this method works like Unsafe.allocateMemory. Returns the new buffer address.
  • void Unsafe.freeMemory(long address) - disposes of a memory buffer allocated by one of the previous methods. Does nothing if address==0.
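The three methods listed above can be exercised together in a short sketch. The class and method names are hypothetical; setMemory is an additional Unsafe method used here to zero the part of the buffer that reallocateMemory leaves uninitialized. As before, this assumes a JDK where the Unsafe memory access methods are still usable.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class, not part of the original article.
final class OffHeapLifecycle {
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    /** Allocates 8 bytes, grows the buffer to 16 bytes and returns both longs stored in it. */
    static long[] growAndRead() {
        long addr = UNSAFE.allocateMemory(8);          // contents are undefined at this point
        try {
            UNSAFE.putLong(addr, 0x1122334455667788L);
            addr = UNSAFE.reallocateMemory(addr, 16);  // may move the block; the old address is invalid now
            UNSAFE.setMemory(addr + 8, 8, (byte) 0);   // the new tail is not zeroed, so clear it explicitly
            return new long[] { UNSAFE.getLong(addr), UNSAFE.getLong(addr + 8) };
        } finally {
            UNSAFE.freeMemory(addr);                   // always release native memory
        }
    }

    public static void main(String[] args) {
        long[] values = growAndRead();
        System.out.println(Long.toHexString(values[0]) + " " + values[1]); // 1122334455667788 0
    }
}
```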

Buffers allocated by these methods should be used in the so-called single-register address mode: Unsafe implements a group of get/put methods that accept just an address (unlike the double-register mode, which requires an Object and an offset from the object). These buffers are allocated outside the Java heap, so they may consume more memory than was specified in the -Xmx Java parameter.

ATTENTION: Buffers allocated by Unsafe methods will NOT be garbage collected. You will have to manage them like any other resource and free them explicitly.
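One way to manage such a buffer like any other resource is to wrap it in an AutoCloseable class and use try-with-resources. The sketch below is an assumption about how this could be done, not an API from the article: OffHeapBuffer and all of its methods are hypothetical names.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical wrapper class, not part of the original article.
final class OffHeapBuffer implements AutoCloseable {
    private static final Unsafe UNSAFE = loadUnsafe();

    private final long size;
    private long address;   // 0 once the buffer has been freed

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    OffHeapBuffer(long size) {
        if (size <= 0) throw new IllegalArgumentException("size must be positive");
        this.size = size;
        this.address = UNSAFE.allocateMemory(size);
    }

    void putByte(long offset, byte value) {
        check(offset);
        UNSAFE.putByte(address + offset, value);
    }

    byte getByte(long offset) {
        check(offset);
        return UNSAFE.getByte(address + offset);
    }

    // Unsafe performs no checks itself, so the wrapper guards against stale and out-of-range access.
    private void check(long offset) {
        if (address == 0) throw new IllegalStateException("buffer already freed");
        if (offset < 0 || offset >= size) throw new IndexOutOfBoundsException("offset " + offset);
    }

    @Override
    public void close() {
        if (address != 0) {      // makes close() safe to call more than once
            UNSAFE.freeMemory(address);
            address = 0;
        }
    }

    public static void main(String[] args) {
        try (OffHeapBuffer buf = new OffHeapBuffer(64)) {
            buf.putByte(63, (byte) 5);
            System.out.println(buf.getByte(63)); // 5
        }
    }
}
```

This wrapper is not thread-safe; concurrent use would need additional synchronization.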

Here is an example of a buffer allocation using Unsafe.allocateMemory and checking that the whole buffer is read-write accessible:

final int size = Integer.MAX_VALUE / 2;
final long addr = unsafe.allocateMemory( size );
try
{
    System.out.println( "Unsafe address = " + addr );
    for ( int i = 0; i < size; ++i )
    {
        unsafe.putByte( addr + i, (byte) 123);
        if ( unsafe.getByte( addr + i ) != 123 )
            System.out.println( "Failed at offset = " + i );
    }
}
finally
{
    unsafe.freeMemory( addr );
}

As you can see, you can write rather generic memory access code using sun.misc.Unsafe: any sort of buffer allocated in Java can be treated as a buffer from which you can read, and to which you can write, any Java data type.
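The uniform treatment can be sketched with the double-register methods alone: when the Object argument is null, Unsafe interprets the offset as an absolute address, so the same code serves both Java arrays and Unsafe buffers. The class, method names and record layout below are hypothetical and chosen only for illustration; the sketch assumes a JDK where these Unsafe methods are still usable.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class, not part of the original article.
final class UniformAccess {
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    /** Writes a 16-byte record: an int at offset and a double at offset + 8 (kept 8-byte aligned). */
    static void writeRecord(Object base, long offset, int id, double value) {
        UNSAFE.putInt(base, offset, id);
        UNSAFE.putDouble(base, offset + 8, value);
    }

    static String readRecord(Object base, long offset) {
        return UNSAFE.getInt(base, offset) + ":" + UNSAFE.getDouble(base, offset + 8);
    }

    /** Stores the record in a Java array: base is the array, offset starts at the array data. */
    static String viaArray() {
        long[] heap = new long[2];                      // 16 bytes of payload
        long offset = UNSAFE.arrayBaseOffset(long[].class);
        writeRecord(heap, offset, 7, 1.5);
        return readRecord(heap, offset);
    }

    /** Stores the record in an Unsafe buffer: base is null, offset is the absolute address. */
    static String viaOffHeap() {
        long addr = UNSAFE.allocateMemory(16);
        try {
            writeRecord(null, addr, 7, 1.5);
            return readRecord(null, addr);
        } finally {
            UNSAFE.freeMemory(addr);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaArray());   // 7:1.5
        System.out.println(viaOffHeap()); // 7:1.5
    }
}
```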

Summary

  • Array size in Java is limited by the largest int value = 2^31 - 1. On the other hand, you are not limited to 2 GB - 1 bytes as the size of your array: you may allocate a long[], which occupies 8 times more memory (16 GB - 8 bytes).
  • You may use sun.misc.Unsafe.allocateMemory(long size) to allocate a buffer longer than 2 GB, but you will have to free such buffers yourself.
  • You can use the sun.misc.Unsafe memory access methods to read and write any Java data type from/to both Java arrays and Unsafe buffers in a uniform manner.

Recommended reading

If you want to know everything about garbage collection in Java and in general, take a look at one of the best books on this subject: The Garbage Collection Handbook: The Art of Automatic Memory Management (Chapman & Hall/CRC Applied Algorithms and Data Structures series). Besides memory allocation algorithms in garbage collected environments, it describes mark-sweep; mark-compact; copying garbage collection; reference counting; comparing garbage collectors; generational garbage collection and other partitioned schemes of garbage collection; parallel and concurrent garbage collection; concurrent mark-sweep; concurrent copying and compaction; concurrent reference counting and, finally, real-time garbage collection.

