Single file vs multi file storage

by Mikhail Vorontsov

Introduction

This article discusses how badly your application may be affected by a data store containing a huge number of tiny files. While it is obvious that it is better to keep the same amount of data in a single file rather than in a million tiny files, we will see what modern operating systems offer to alleviate this problem.

This article describes a data layout problem I recently faced. Suppose we have a database containing a few million records with an average size of 10 KB, but no hard upper limit. Each of these records has a unique long id. Besides that, there is some other data, but its size is unlikely to exceed ~5% of the space occupied by the records.

For some reason it was decided to use ordinary files as the data store. The original implementation used one file per record. Files were spread across directories according to auxiliary attributes. All this data was read into memory on startup. There are also analytical batch tools which read the whole database into memory before processing it.

The system took a very long time to start up because of the need to read all these tiny files from disk. My task was to remove this bottleneck.

Solution

As profiling showed, most of the load time was spent locating and opening these tiny files. We decided to migrate the system to composite data files, each containing a large group of records (groups were defined based on other system requirements). We found that the number of such groups was nearly fixed (it grows very slowly), but the number of records inside a group was constantly growing.

We borrowed the NoSQL idea of appending data to these files: each file contains a list of records, each of them prefixed by an id, a "used" flag (as an optimization), a record timestamp and a record length. The "used" flag was the only departure from pure append-only semantics – it allows the system to skip historical records without parsing them.
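The record layout above can be sketched roughly as follows. Note that the field widths (a long id, a boolean flag, a long timestamp, an int length) are my assumptions for illustration – the article does not specify them:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class RecordAppender {
    /**
     * Encodes one record in the append-only layout described above:
     * id, used flag, timestamp, record length, then the payload itself.
     */
    public static byte[] encode( final long id, final boolean used,
                                 final long timestamp, final byte[] payload ) {
        final ByteArrayOutputStream buf = new ByteArrayOutputStream();
        final DataOutputStream out = new DataOutputStream( buf );
        try {
            out.writeLong( id );            //unique record id
            out.writeBoolean( used );       //'used' flag - lets readers skip historical records
            out.writeLong( timestamp );     //record timestamp
            out.writeInt( payload.length ); //record length - lets readers skip data without parsing it
            out.write( payload );
            out.flush();
        } catch ( final IOException e ) {
            throw new RuntimeException( e ); //cannot happen with an in-memory stream
        }
        return buf.toByteArray();
    }
}
```

An appender would simply write such encoded records to the end of the group's file; a loader walks the file sequentially, using the length field to jump over records whose "used" flag is cleared.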

As a result, the time to load the whole dataset into memory got very close to the time the SSD needs to read a sequential chunk of data of the same size. At the same time, sharding the data into groups allows the system to be distributed across several independent machines, should that be required later.

Testing

In order to test this problem, I used two computers:

Both these CPUs have nearly identical clock frequencies, which made these tests even less dependent on CPU speed.

The test compared two data storages:

  • The linked-list storage described above in this article, where each segment was 20K long and there were 50,000 segments in the file.
  • A set of 50,000 files, each of them 20K long. Files were split across two levels of directories (digits 1 and 2 for the first level, digits 3 and 4 for the second level, digit 5 for the file name), so accessing a file required descending two directories deep.
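The index-to-path mapping for the second storage can be sketched as follows (it mirrors the `indexToFile` helper in the test class at the end of the article):

```java
public class FilePathMapper {
    /**
     * Mirrors the directory scheme above: the index is zero-padded to a fixed
     * width, digits 1-2 name the first-level directory, digits 3-4 the
     * second-level one, and the remaining digit(s) name the file itself,
     * e.g. index 1234 at width 5 maps to "01/23/4".
     */
    public static String indexToPath( final int index, final int maxLength ) {
        String val = String.valueOf( index );
        while ( val.length() < maxLength )
            val = "0" + val; //zero-pad so every index maps to the same depth
        final StringBuilder path = new StringBuilder();
        while ( val.length() > 2 ) {
            path.append( val, 0, 2 ).append( '/' ); //take the next 2 digits as a directory
            val = val.substring( 2 );
        }
        return path.append( val ).toString(); //the leftover digit(s) name the file
    }
}
```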

The read tests were run on EXT4, FAT32 (cluster size = 64K) and NTFS (cluster size = 4K) file systems using the same external USB drive (unfortunately, a 3 year old one). Nothing else was kept on the disk, so this was the optimal situation in terms of file placement. FAT32 and NTFS were tested on both systems; EXT4 was tested only on Ubuntu 12.

An important note about disk read testing: you have to reboot your system before reading each test file set. Otherwise your operating system may (and in most situations will) cache all the files in memory and serve you the cached data instead of reading it from disk. (On Linux you can also drop the page cache without rebooting by writing to /proc/sys/vm/drop_caches.)

                     Ubuntu 12       Windows 7
  ext4, one file     35.547 sec      n/a
  ext4, 50K files    63.381 sec      n/a
  fat32, one file    29.342 sec      32.806 sec
  fat32, 50K files   148.746 sec     166.069 sec
  ntfs, one file     30.196 sec      38.383 sec
  ntfs, 50K files    54.629 sec      211.933 sec

As you can see, the time to read the single composite file was nearly identical across all file systems and operating systems. It was limited mostly by the drive's read speed and, to a lesser degree, by CPU and RAM speed, which is exactly the expected behavior.

The multiple-files test is much more interesting. We should look at the Ubuntu and Windows 7 results separately.

On Ubuntu it took about the same time to read the data from the EXT4 and NTFS file systems, but approximately three times longer from FAT32. I think the reason for this slowdown is that FAT32 was using 64K clusters while the files were only 20K long. This means the operating system has to read roughly three times more data (64K/20K) in order to read these small files from disk. If this assumption is correct, then we may assume that Ubuntu was prefetching data at the physical "cylinder-head-sector" level, thus hiding much of the expected performance degradation.
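The 64K/20K argument is just the read amplification caused by cluster rounding: a file always occupies a whole number of clusters, so the drive transfers more data than the file actually contains. A minimal sketch of this arithmetic:

```java
public class ReadAmplification {
    /**
     * Back-of-the-envelope estimate: a file occupies a whole number of
     * clusters, so reading a 20,000-byte file from a 64K-cluster FAT32
     * volume pulls ~3.3x more data off the disk than the file contains.
     */
    public static double amplification( final long fileSize, final long clusterSize ) {
        final long clusters = ( fileSize + clusterSize - 1 ) / clusterSize; //round up
        return ( double ) ( clusters * clusterSize ) / fileSize;
    }
}
```

With 4K clusters the same 20K file wastes almost nothing (amplification ≈ 1.02), which is consistent with the much smaller penalty seen on EXT4 and NTFS.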

Windows 7, on the other hand, seemed to do very little prefetching, even on its native NTFS file system, which is rather disappointing.

I want to repeat that all these tests were run on a clean drive. A real-life scenario would likely be worse, because all these small files would not be physically adjacent to each other on disk.

Summary

  • In general, avoid storing your data in a large number of small files. At the very least, limit the number of such files so that the growth of your data does not linearly increase the number of files you have to process.
  • If you still need to handle a large number of small files, use Linux and any of its native file systems that allow a 4K cluster size (thus limiting the storage and read/write overhead).
  • Once you have read the files (at least on Linux), you may expect to find them in the file cache, especially if overall memory pressure on your system is not too high. The same applies to the case of one application writing these files to disk and another one reading them – chances are high that the second program will read them from the OS file cache instead of actually hitting the disk.

Test class source code

import java.io.*;
import java.util.ArrayList;
import java.util.List;

public class MultipleFilesTest {
    private static final File ROOT = new File( "c:/_multiFileTest50" );
    public static final File COMPOSITE_FILE = new File( ROOT, "composite.dat" );
    private static final int SEGMENT_SIZE = 20000;
    private static final int MESSAGE_NUMBER = 50000;
 
    public static void main( String[] args ) throws IOException {
        //run these 2 methods to create a test set
        makeCompositeFile( MESSAGE_NUMBER );
        makeSingleFiles( MESSAGE_NUMBER );
        //then reboot your computer and run the following method. Do not run them in a single program - your data
        //will be read from the OS memory cache
        testReadSpeed( MESSAGE_NUMBER );
    }
 
    private static void testReadSpeed( final int count ) throws IOException {
        final long startComposite = System.currentTimeMillis();
        final List<Integer> resComposite = readCompositeFile();
        final long timeComposite = System.currentTimeMillis() - startComposite;
 
        final long startSingle = System.currentTimeMillis();
        final List<Integer> resSingle = readSingleFiles( count );
        final long timeSingle = System.currentTimeMillis() - startSingle;
 
        if ( !resComposite.equals( resSingle ) )
            System.out.println( "Results not equal!" );
 
        System.out.println( "Time to read composite file with " + count + " segments = " + ( timeComposite / 1000.0 ) + " sec" );
        System.out.println( "Time to read " + count + " single files = " + ( timeSingle / 1000.0 ) + " sec" );
    }
 
    private static List<Integer> readCompositeFile() throws IOException
    {
        final List<Integer> res = new ArrayList<Integer>( MESSAGE_NUMBER );
        final DataInputStream is = new DataInputStream( new BufferedInputStream( new FileInputStream( COMPOSITE_FILE ), 65536 ) );
        try
        {
            while ( true )
            {
                try
                {
                    final int id = is.readInt();
                    final int len = is.readInt();
                    final byte[] data = new byte[ len ];
                    is.readFully( data );
                    res.add( id );
                }
                catch ( EOFException ex ) {
                    return res;
                }
            }
        }
        finally {
            is.close();
        }
    }
 
    private static List<Integer> readSingleFiles( final int count ) throws IOException
    {
        final int maxLength = String.valueOf( count ).length();
        final List<Integer> res = new ArrayList<Integer>( count );
        for ( int i = 0; i < count; ++i )
        {
            final File file = indexToFile( i, maxLength );
            final DataInputStream dis = new DataInputStream( new BufferedInputStream( new FileInputStream( file ), SEGMENT_SIZE + 8 ) );
            try
            {
                final int id = dis.readInt();
                final int len = dis.readInt();
                final byte[] data = new byte[ len ];
                dis.readFully( data );
                res.add( id );
            }
            finally {
                dis.close();
            }
        }
        return res;
    }
 
    private static void makeCompositeFile( final int count ) throws IOException
    {
        ROOT.mkdirs(); //make sure the target directory exists
        if ( COMPOSITE_FILE.exists() )
            return;
        //this is segment data
        final byte[] data = getSegmentData();
        /*
        Composite file contains multiple segments. Each segment contains:
        id:int
        length:int
        data:byte[ length ]
        */
        final DataOutputStream os = new DataOutputStream( new FileOutputStream( COMPOSITE_FILE ) );
        try
        {
            for ( int i = 0; i < count; ++i )
            {
                os.writeInt( i ); //id
                os.writeInt( data.length ); //message length
                os.write( data ); //message content
            }
        }
        finally {
            os.close();
        }
    }
 
    private static void makeSingleFiles( final int count ) throws IOException
    {
        //this is segment data
        final byte[] data = getSegmentData();
        final int maxLength = String.valueOf( count ).length();
        //populate all files
        for ( int i = 0; i < count; ++i )
        {
            final File file = indexToFile( i, maxLength );
            if ( !file.getParentFile().exists() )
                file.getParentFile().mkdirs();
            if ( file.exists())
                continue;
            final DataOutputStream os = new DataOutputStream( new FileOutputStream( file ) );
            try
            {
                os.writeInt( i ); //some sort of id
                os.writeInt( data.length ); //message length
                os.write( data ); //message contents
            }
            finally {
                os.close();
            }
        }
    }
 
    private static byte[] getSegmentData() {
        final byte[] data = new byte[ SEGMENT_SIZE ];
        for ( int i = 0; i < data.length; ++i )
            data[ i ] = ( byte ) i;
        return data;
    }
 
    private static File indexToFile( final int index, final int maxLength )
    {
        String val = String.valueOf( index );
        while ( val.length() < maxLength )
            val = "0" + val;
        File curFile = ROOT;
        while ( val.length() > 2 )
        {
            curFile = new File( curFile, val.substring( 0, 2 ) );
            val = val.substring( 2 );
        }
        return new File( curFile, val );
    }
}