Use case: FIX messages processing. Part 2: writing and optimizing a FIX gateway

by Mikhail Vorontsov

Now that we have a fast parsing method from the first part of this article, let’s try to implement a FIX gateway. Its purpose is to filter out some messages based on some criteria. We won’t discuss filtering in this article: it is straightforward and very task-dependent. Instead, we will see what we can do to optimize the gateway processing loop. To begin with, let’s suppose that it is a “parse-filter-compose message” loop. We have a parser, you have written a filter, and now we need a compose method, which takes a list of parsed fields and returns a message string.

The following method composes a message using a StringBuilder with an initial capacity of 1024 characters (it makes sense to use either the average or the maximal message length for StringBuilder initialization). It is rather straightforward, but it has a problem with double values: Double.toString switches to scientific “E” notation for magnitudes of 10^7 and above (or below 10^-3), so the output may differ from the original message. Depending on the gateway clients, this may not be a problem, since general-purpose double parsers support scientific notation. You can read more about double -> String -> double conversion in the BigDecimal vs double article.

//compose a message from a list of parsed fields
private static String compose( final List<Field> fields )
{
    boolean first = true;
    final StringBuilder sb = new StringBuilder( 1024 );
    for ( final Field fld : fields )
    {
        if ( first )
            first = false;
        else
            sb.append( FIELD_SEPARATOR_CHAR );
        sb.append( fld.id ).append( VALUE_SEPARATOR_CHAR );
        if ( DATE_FIELDS.get( fld.id ) )
            sb.append( DATE_FORMAT.get().format( fld.value ) );
        else if ( INT_FIELDS.get( fld.id ) )
            sb.append( Integer.toString( ( Integer ) fld.value ) );
        else if ( DOUBLE_FIELDS.get( fld.id ) )
            sb.append( Double.toString( ( Double ) fld.value ) );
        else
            sb.append( fld.value );
    }
    return sb.toString();
}
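
The double formatting issue mentioned above can be seen in a few lines. This is only a sketch: the class name DoubleFormatDemo and the helper toPlain are hypothetical and not part of the original gateway code, and going through BigDecimal is just one possible way to get plain decimal output (it allocates, so it is likely slower than Double.toString).

```java
import java.math.BigDecimal;

public class DoubleFormatDemo {
    // Hypothetical helper: formats a double without exponent notation.
    // BigDecimal.valueOf goes through Double.toString, so no extra digits appear.
    public static String toPlain(final double value) {
        return BigDecimal.valueOf(value).toPlainString();
    }

    public static void main(final String[] args) {
        final String original = "12345678.5";
        final double parsed = Double.parseDouble(original);

        // Magnitudes of 10^7 and above are printed in scientific notation
        System.out.println(Double.toString(parsed)); // prints 1.23456785E7
        // Plain notation restores the original text for this value
        System.out.println(toPlain(parsed));         // prints 12345678.5
        // A general-purpose parser still reads the "E" form back to the same double
        System.out.println(Double.parseDouble("1.23456785E7") == parsed); // prints true
    }
}
```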

How fast does it work? It takes 28 sec for 10M calls on both Java 6 and Java 7. By the way, this test is a good way to compare StringBuffer and StringBuilder performance. The difference between them is that the former is synchronized (will you ever need that?) and the latter is not. Just replace StringBuilder with StringBuffer and run the test again. In my case, both Java 6 and 7 versions took 30.5 sec to complete. This difference is not as large as the one between the synchronized JDK version of ByteArrayOutputStream and its manually unsynchronized copy (see the first part of the Various methods of binary serialization in Java article), but it is still worth the effort of changing a class name in your code.
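
If you want to repeat the StringBuilder/StringBuffer comparison without the full gateway code, a rough harness could look like the one below. It is a sketch under assumptions: the class BuilderVsBuffer, the 20-field message shape and the "value" payload are made up for illustration, and a naive timing loop like this is sensitive to JIT warm-up and lock optimizations, so treat the numbers it prints as indicative only.

```java
public class BuilderVsBuffer {
    static final int FIELDS = 20; // hypothetical number of fields per message

    // Builds a FIX-like message with the unsynchronized StringBuilder
    public static String withBuilder() {
        final StringBuilder sb = new StringBuilder(1024);
        for (int i = 0; i < FIELDS; ++i) {
            if (i > 0)
                sb.append('\u0001');
            sb.append(i).append('=').append("value");
        }
        return sb.toString();
    }

    // Same logic, only the class name changes to the synchronized StringBuffer
    public static String withBuffer() {
        final StringBuffer sb = new StringBuffer(1024);
        for (int i = 0; i < FIELDS; ++i) {
            if (i > 0)
                sb.append('\u0001');
            sb.append(i).append('=').append("value");
        }
        return sb.toString();
    }

    public static void main(final String[] args) {
        final int calls = 10000000;
        long sink = 0; // consume the results so the loops are not optimized away
        for (int round = 0; round < 3; ++round) { // the first round doubles as warm-up
            long start = System.nanoTime();
            for (int i = 0; i < calls; ++i)
                sink += withBuilder().length();
            final long builderMs = (System.nanoTime() - start) / 1000000L;

            start = System.nanoTime();
            for (int i = 0; i < calls; ++i)
                sink += withBuffer().length();
            final long bufferMs = (System.nanoTime() - start) / 1000000L;

            System.out.println("round " + round + ": StringBuilder " + builderMs
                    + " ms, StringBuffer " + bufferMs + " ms");
        }
        System.out.println("checksum " + sink);
    }
}
```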

As a result, the “parse-compose” loop takes 42.5 sec for 10M messages (close to the sum of parsing and composing times). How can we optimize it? The best idea in this case is to do less.

Next step: optional field parsing

Let’s return to the concept of optional field parsing discussed earlier in this article. We will not develop optional parsing methods for the Field class. Instead, we will expect that its value field always contains the original unparsed string value. Now we will see how long it takes to compose a message out of such fields (note that this approach also avoids the conversion problems for double fields). The splitting method is very similar to the parse5 method: the only difference is that the makeField calls were replaced with direct invocations of the Field constructor. The composing method got simpler:

//compose a message from a list of not parsed fields
private static String composeNotParsed( final List<Field> fields )
{
    boolean first = true;
    final StringBuilder sb = new StringBuilder( 1024 );
    for ( final Field fld : fields )
    {
        if ( first )
            first = false;
        else
            sb.append( FIELD_SEPARATOR_CHAR );
        sb.append( fld.id ).append( VALUE_SEPARATOR_CHAR ).append( fld.value );
    }
    return sb.toString();
}
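
For readers who do not have the first part at hand, here is a minimal sketch of what splitting into unparsed fields might look like. It is not the article's parse5-based method: the class SplitDemo, its cut-down Field with a String value, and the separator values (SOH and '=', as is usual for FIX) are assumptions made for illustration. A copy of composeNotParsed is included only so that the round trip can be checked.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitDemo {
    public static final char FIELD_SEPARATOR_CHAR = '\u0001'; // SOH, assumed
    public static final char VALUE_SEPARATOR_CHAR = '=';      // assumed

    // Simplified stand-in for the article's Field: the value stays a raw string
    public static final class Field {
        public final int id;
        public final String value;

        public Field(final int id, final String value) {
            this.id = id;
            this.value = value;
        }
    }

    // Splits a message into fields without converting values to Java types
    public static List<Field> split(final String message) {
        final List<Field> fields = new ArrayList<Field>(32);
        final int len = message.length();
        int pos = 0;
        while (pos < len) {
            int end = message.indexOf(FIELD_SEPARATOR_CHAR, pos);
            if (end < 0)
                end = len; // last field has no trailing separator
            final int eq = message.indexOf(VALUE_SEPARATOR_CHAR, pos);
            if (eq < 0 || eq > end)
                throw new IllegalArgumentException("No value separator at position " + pos);
            final int id = Integer.parseInt(message.substring(pos, eq));
            fields.add(new Field(id, message.substring(eq + 1, end)));
            pos = end + 1;
        }
        return fields;
    }

    // Same logic as composeNotParsed above, adapted to the stand-in Field
    public static String composeNotParsed(final List<Field> fields) {
        boolean first = true;
        final StringBuilder sb = new StringBuilder(1024);
        for (final Field fld : fields) {
            if (first)
                first = false;
            else
                sb.append(FIELD_SEPARATOR_CHAR);
            sb.append(fld.id).append(VALUE_SEPARATOR_CHAR).append(fld.value);
        }
        return sb.toString();
    }
}
```

Because values are never converted, a double written as 1.0E7 or as 10000000.00 comes out exactly as it came in.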

Final solution: don’t compose messages at all!

It takes 8 sec to compose 10M messages out of such fields on both Java 6 and 7. A good improvement compared to 28 sec, isn’t it? No, it isn’t. Actually, we shouldn’t pay for composing a message at all. We simply need to store the original message alongside the parsed list of fields and use it later when needed. This advice applies to any gateway-like application: if you receive a message, use some of its fields (but do not modify it) and then pass it on to some other location, never discard the original message! Reuse it instead. It is always cheaper than composing it back: for example, it takes only 9.5 sec to split 10M messages into lists of unparsed fields on my laptop, which is over 4 times faster than the original optimized parse-compose loop (42.5 sec).
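
A sketch of this idea is shown below. Everything in it is hypothetical: the GatewayDemo class, the Message holder, the filter rule (pass only messages whose MsgType, tag 35, is "D") and the use of String.split with a Map are chosen for brevity, not speed. A real gateway would use the optimized splitting method and a list of fields, since FIX tags may repeat within a message.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GatewayDemo {
    // Keeps the text exactly as received next to its parsed view
    public static final class Message {
        public final String original;
        public final Map<Integer, String> fields; // tag -> raw value (simplification)

        public Message(final String original, final Map<Integer, String> fields) {
            this.original = original;
            this.fields = fields;
        }
    }

    // Simplistic split, assuming SOH between fields and '=' between tag and value
    public static Message receive(final String raw) {
        final Map<Integer, String> fields = new HashMap<Integer, String>();
        for (final String pair : raw.split("\u0001")) {
            final int eq = pair.indexOf('=');
            if (eq < 0)
                throw new IllegalArgumentException("Malformed field: " + pair);
            fields.put(Integer.parseInt(pair.substring(0, eq)), pair.substring(eq + 1));
        }
        return new Message(raw, fields);
    }

    // Hypothetical filter: accept only messages with MsgType (tag 35) equal to "D"
    public static boolean accept(final Message message) {
        return "D".equals(message.fields.get(35));
    }

    // Parse-filter loop: accepted messages are forwarded as received, nothing is composed
    public static List<String> process(final List<String> incoming) {
        final List<String> outgoing = new ArrayList<String>();
        for (final String raw : incoming) {
            final Message message = receive(raw);
            if (accept(message))
                outgoing.add(message.original);
        }
        return outgoing;
    }
}
```

The forwarded string is the very same object that was received, so the output path costs no conversions and no new allocations for the message text.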

Summary

Always try to avoid “binary/string -> Java type -> binary/string” conversions for short-lived objects. It is always better to store the original message (if you don’t modify it) or the original fields (if you modify only some of them) and reuse them when you have to compose an output message, rather than convert Java types back into a binary/text message. Besides saving CPU cycles on data conversions, you will also avoid unnecessary memory allocations for converted values.

Source code for FIX parser/gateway articles

