Category Archives: Trivial

Regexp-related methods of String

by Mikhail Vorontsov

java.lang.String has several methods that are shortcuts for java.util.regex.Pattern and java.util.regex.Matcher methods. These are:

  • matches(String)
  • replaceAll(String, String), replaceFirst(String, String)
  • split(String), split(String, int)

These methods have 2 performance drawbacks:

  • they create and compile Pattern objects internally for each call
  • they are often used for simple cases where manual parsing would be sufficient

For example, the matches(String) method is implemented as Pattern.matches(regex, this), which tries to match the current string (this) against the given regex. Pattern.matches is in turn defined as Pattern.compile(regex).matcher(input).matches().

All other String regex methods look similar: they compile a temporary Pattern, create a Matcher for the current String and call the required method on the Matcher object. This is fine if you call these String methods once or rarely, but it is better to avoid them if you call regex methods for each piece of data you process.
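The usual way to avoid the repeated compilation is to compile the Pattern once and reuse it. The sketch below illustrates that idea; the class name, field names and the two regular expressions are hypothetical and chosen only for the example.

```java
import java.util.regex.Pattern;

// Hypothetical example class; all names here are illustrative only.
class PrecompiledPatternExample {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern DIGITS = Pattern.compile("\\d+");
    private static final Pattern COMMA = Pattern.compile(",");

    // Same result as value.matches("\\d+"), without compiling the regex on each call.
    static boolean isNumber(String value) {
        return DIGITS.matcher(value).matches();
    }

    // Same result as line.split(","), reusing the compiled pattern.
    static String[] splitFields(String line) {
        return COMMA.split(line);
    }

    public static void main(String[] args) {
        System.out.println(isNumber("12345"));            // true
        System.out.println(isNumber("12a45"));            // false
        System.out.println(splitFields("a,b,c").length);  // 3
    }
}
```

Whether this pays off depends on how often the method is called; for a one-off call the String shortcut is simpler to read.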


Inefficient byte[] to String constructor

by Mikhail Vorontsov

Everything written in this post is related to Java 6 only.

A String constructor was added in Java 6 in order to facilitate conversion from byte[] to String using a provided Charset:

public String(byte bytes[], int offset, int length, Charset charset)

It looks harmless at first sight, but it has a potential problem inside: it makes a temporary “defensive” copy of the provided byte[] inside the StringCoding.decode(Charset cs, byte[] ba, int off, int len) method. This copy may be needed in very few applications, but most applications will pay an unnecessary price for this method call.
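For illustration, here is a minimal sketch of a call to this constructor next to the older overload that takes a charset name. The class and method names are hypothetical. The excerpt above does not say whether the charset-name overload avoids the copy on Java 6, so treat that as an assumption to verify by measurement rather than as a recommendation.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical example class; names are illustrative only.
class BytesToStringExample {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // The Java 6 constructor discussed above.
    static String decodeWithCharset(byte[] data, int offset, int length) {
        return new String(data, offset, length, UTF8);
    }

    // The older overload taking a charset name (assumption: it may take a
    // different decoding path on Java 6; measure before relying on it).
    static String decodeWithCharsetName(byte[] data, int offset, int length) {
        try {
            return new String(data, offset, length, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello world".getBytes(UTF8);
        System.out.println(decodeWithCharset(data, 6, 5));     // world
        System.out.println(decodeWithCharsetName(data, 6, 5)); // world
    }
}
```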


java.lang.Byte, Short, Integer, Long, Character (boxing and unboxing)

by Mikhail Vorontsov

In this article we will discuss how boxing and unboxing are implemented in Java 1.5+ and what the implications of that implementation are. There are no differences between the Java 6 and 7 implementations.

Boxing is the conversion of a primitive value into an instance of the corresponding java.lang.Number subclass (java.lang.Byte, Short, Integer, Long, Float, Double). Boxing is done via the valueOf method. For example, for Integer it is:

static Integer valueOf( int i )

Unboxing can be done from any java.lang.Number subclass to any numeric primitive type. This is done via a set of methods defined in java.lang.Number:

public abstract int intValue();
public abstract long longValue();
public abstract float floatValue();
public abstract double doubleValue();
public byte byteValue() {
    return (byte)intValue();
}
public short shortValue() {
    return (short)intValue();
}
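As a small sketch of how these methods are used, the example below spells out the calls that autoboxing and auto-unboxing correspond to. The class and method names are hypothetical.

```java
// Hypothetical example class; names are illustrative only.
class BoxingExample {
    // Explicit form of "Integer boxed = value; int result = boxed;".
    static int roundTrip(int value) {
        Integer boxed = Integer.valueOf(value); // boxing
        return boxed.intValue();                // unboxing
    }

    // Any Number subclass can be unboxed to any numeric primitive type.
    static long sum(Integer a, Long b) {
        return a.longValue() + b.longValue();
    }

    // byteValue() narrows the value, so high-order bits are lost.
    static byte narrow(Integer value) {
        return value.byteValue();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));                      // 42
        System.out.println(sum(1, 2L));                         // 3
        System.out.println(narrow(Integer.valueOf(300)));       // 44
    }
}
```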

Implications on performance

A common mistake is made when converting a String to a primitive value. Which method is better to use: parse*(String) or valueOf(String)? The answer is parse* (parseInt, parseLong and so on), because it returns a primitive value. Remember that all valueOf methods return an Object (though it may sometimes be cached; see below).
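The difference can be sketched as follows. The class and method names are hypothetical; the only library behaviour relied on is that Integer.valueOf(int) returns cached instances for values from -128 to 127.

```java
// Hypothetical example class; names are illustrative only.
class ParseVsValueOfExample {
    // Returns a primitive: no wrapper object is involved.
    static int parsePrimitive(String text) {
        return Integer.parseInt(text);
    }

    // Returns an Integer object, which is then unboxed if an int is needed.
    static Integer parseBoxed(String text) {
        return Integer.valueOf(text);
    }

    public static void main(String[] args) {
        int primitive = parsePrimitive("100");
        int unboxed = parseBoxed("100"); // extra boxing followed by unboxing
        System.out.println(primitive == unboxed); // true

        // Values from -128 to 127 come from a cache, so the same object is returned.
        System.out.println(Integer.valueOf(100) == Integer.valueOf(100)); // true
    }
}
```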


Map.containsKey/Set.contains

by Mikhail Vorontsov

The java.util.Map.containsKey and java.util.Set.contains methods should rarely be needed in your code. Their functionality is covered by other Map/Set methods which you are likely to call right after the containsKey/contains call.

Sets

If you want to check whether a key is in your set and, if it is not present, add it and do something else, you may write code like this:

if ( !set.contains( key ) )
{
    set.add( key );
    //some extra code here
}

Instead, it is faster to rely on the check performed by the add method itself. It returns true if the given key was not present in the set before (you can treat true as “did something” and false as “did nothing”).

if ( set.add( key ) )
{
    //same extra code could be added here
}
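A self-contained sketch of the same pattern is shown below. The class name, method name and sample keys are hypothetical; the extra work on first occurrence is represented by a simple counter.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical example class; names are illustrative only.
class SetAddExample {
    // Counts keys seen for the first time, using only the result of Set.add.
    static int countFirstOccurrences(String[] keys) {
        Set<String> seen = new HashSet<String>();
        int firstOccurrences = 0;
        for (String key : keys) {
            if (seen.add(key)) {
                // add returned true: the key was not in the set before
                firstOccurrences++;
            }
        }
        return firstOccurrences;
    }

    public static void main(String[] args) {
        String[] keys = { "a", "b", "a", "c", "b" };
        System.out.println(countFirstOccurrences(keys)); // 3
    }
}
```

Each key is hashed and looked up once per iteration, instead of once in contains and again in add.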
