Saturday, June 26, 2010

Java IO - Busting the Buffered Streams Myth

Java I/O is at the core of the Java API, and every Java developer should understand how input and output streams work. Streams support many different kinds of data: simple bytes, primitive data types, localized characters, and objects. Some streams simply pass data on; others manipulate and transform it in useful ways.

At the core of Java I/O are InputStream and OutputStream, which are abstract classes; their most basic implementations are FileInputStream and FileOutputStream, used to read and write files respectively. The Java community recommends the buffered versions of these streams, BufferedInputStream and BufferedOutputStream, for their performance benefits: internally, the buffered streams maintain a default buffer of 8192 bytes and only touch the underlying stream once that buffer fills (or on flush/close). But of late, while writing a small Java program to copy some movies on the file system, I ran into serious performance problems with buffered streams. That made me look into the source code of these classes, and surprisingly the results were unexpected. Like other developers I had just been using the buffered streams without question, but found that the plain input and output streams were performing better in comparison.

But why is that so? The reason is synchronization. Astonishingly, the Javadoc of BufferedInputStream and BufferedOutputStream does not mention that the read and write methods are synchronized. The cost of acquiring and releasing those locks is such that in my test the plain streams performed at least three times better than the buffered ones.

Here are two programs to test the performance of the streams; you can try them out and look at the difference.

Buffered: Average - 170000 ms

package work.filemaker;

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Calendar;

public class BufferedReaderTest {

    public static void main(String[] args) throws IOException {

        long startMilliSec = Calendar.getInstance().getTimeInMillis();

        InputStream inputStream = new BufferedInputStream(new FileInputStream(new File("D:/3idiots.avi")));
        BufferedOutputStream os = new BufferedOutputStream(new FileOutputStream("D:/copy3idiots.avi"));

        // copy one byte at a time
        int c;
        while ((c = inputStream.read()) != -1) {
            os.write((byte) c);
        }

        inputStream.close();
        os.close();

        long endMilliSec = Calendar.getInstance().getTimeInMillis();

        System.out.println(endMilliSec - startMilliSec);
    }
}

Plain Streams: Average - 52000 ms

package work.filemaker;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Calendar;

public class FileReaderTest {

    public static void main(String[] args) throws IOException {

        long startMilliSec = Calendar.getInstance().getTimeInMillis();

        InputStream inputStream = new FileInputStream(new File("D:/3idiots.avi"));
        OutputStream os = new FileOutputStream("D:/copy3idiots.avi");

        // copy through an 8 KB buffer
        byte[] b = new byte[8192];
        int c;
        while ((c = inputStream.read(b)) != -1) {
            os.write(b);
        }

        inputStream.close();
        os.close();

        long endMilliSec = Calendar.getInstance().getTimeInMillis();

        System.out.println(endMilliSec - startMilliSec);
    }
}

Hence, think twice before using buffered streams, as they do not seem to improve performance. One thing that remains questionable is why the read and write methods in the buffered streams are synchronized at all. In conclusion, I would prefer to create my own buffer, as in the second example above, and use the plain I/O streams, unless somebody can convince me to do otherwise.

User comments are most welcome.

12 comments:

  1. This cannot be due to synchronization, as you are using the streams single-threaded. Most likely it's because you are copying the same bytes over and over (native -> InputStream -> BufferedInputStream -> BufferedOutputStream -> OutputStream -> native), but I'm 100% sure it's not because those methods are synchronized.

    Why doesn't the synchronization matter here? Without contention, even the heaviest (real) locking is close to immeasurable. The latest JREs have biased locking (no locking if you are the only thread), lock elision (no locking if the JVM sees you are the only thread that can see the object), etc.

    You should use FileChannel#transferTo when copying files. Note that it might not work if you try to transfer the whole file in one call; you might need to split it into chunks.
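A minimal sketch of that chunked transferTo loop; the class name, the 64 MB chunk size, and the D:/ paths (reused from the post) are illustrative choices, not a definitive implementation:

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

public class ChannelCopy {

    // Copy src to dst via FileChannel#transferTo, splitting the transfer
    // into chunks so a very large file is not requested in one call.
    static void copy(String src, String dst) throws IOException {
        FileChannel in = new FileInputStream(src).getChannel();
        FileChannel out = new FileOutputStream(dst).getChannel();
        try {
            long pos = 0;
            long size = in.size();
            long chunk = 64L * 1024 * 1024; // 64 MB per transferTo call
            while (pos < size) {
                pos += in.transferTo(pos, Math.min(chunk, size - pos), out);
            }
        } finally {
            in.close();
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        long start = System.currentTimeMillis();
        copy("D:/3idiots.avi", "D:/copy3idiots.avi");
        System.out.println(System.currentTimeMillis() - start + " ms");
    }
}
```

transferTo can hand the copy to the operating system, which is why it avoids shuffling the same bytes through several Java-level buffers.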

  2. I never tried it, but how about increasing the buffer size?

  3. Also, shouldn't you be writing to the output stream only as many bytes as you have just read in your while condition?

    It looks very unlikely that the files will be identical with your second method; only files where (sz % 8192 == 0) will come out right.

    os.write(b); should be changed to:
    os.write(b, 0, c);

    Other notes: you should use System.currentTimeMillis(), as going through Calendar instantiation requires synchronizing with a cache hashtable.

  4. Please notice the two copy actions are not the same: the FileStream test copied 8192 bytes at a time, while the BufferedStream test copied 1 byte at a time.
    The int-to-byte conversion may also decrease performance.

    I agree with joonas that synchronized is not the key. The buffered stream should be faster if you copy 8192 bytes at a time in both tests.

  5. I agree with the other commenters. You copy one byte at a time in your buffered version instead of using a buffer as in the unbuffered version.

    The overhead is not synchronization. The overhead is that you are making 8192 calls in one case for every 1 call in the other to copy the same amount of data. You should try the same thing but copying chunks of 8 KB; maybe even 64 KB would improve performance further.

    Otherwise you shouldn't expect a buffered stream to perform better than your own buffer implementation, unless you code yours badly. It is there just for convenience.
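Equalised like that, only the wrapping differs between the two variants. A sketch of such a fair comparison (the class name is mine; the D:/ paths are the ones from the post, and this is a rough timing, not a rigorous benchmark):

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class FairCopyTest {

    // Copy through the same 8 KB array in both variants; the only
    // remaining variable is whether the raw file streams are wrapped.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n); // write only the bytes actually read
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        String src = "D:/3idiots.avi";
        String dst = "D:/copy3idiots.avi";

        long t0 = System.currentTimeMillis();
        InputStream in = new FileInputStream(src);
        OutputStream out = new FileOutputStream(dst);
        copy(in, out);
        in.close();
        out.close();
        System.out.println("plain:    " + (System.currentTimeMillis() - t0) + " ms");

        t0 = System.currentTimeMillis();
        in = new BufferedInputStream(new FileInputStream(src));
        out = new BufferedOutputStream(new FileOutputStream(dst));
        copy(in, out);
        in.close();
        out.close();
        System.out.println("buffered: " + (System.currentTimeMillis() - t0) + " ms");
    }
}
```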

  6. In my opinion an 8 KB buffer size is best. You can also use Java NIO and try the ByteBuffer class, which is good for server applications.
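A sketch of the ByteBuffer approach with the suggested 8 KB buffer; the class name is illustrative, and the channel interfaces are used so the loop works for any channel pair:

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

public class ByteBufferCopy {

    // Copy between channels through a single reused 8 KB ByteBuffer.
    static void copy(ReadableByteChannel in, WritableByteChannel out) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(8192);
        while (in.read(buf) != -1) {
            buf.flip();                 // switch from filling to draining
            while (buf.hasRemaining()) {
                out.write(buf);         // may take several writes to drain
            }
            buf.clear();                // ready for the next read
        }
    }

    public static void main(String[] args) throws IOException {
        FileChannel in = new FileInputStream("D:/3idiots.avi").getChannel();
        FileChannel out = new FileOutputStream("D:/copy3idiots.avi").getChannel();
        try {
            copy(in, out);
        } finally {
            in.close();
            out.close();
        }
    }
}
```

The flip/drain/clear cycle is the standard ByteBuffer usage pattern; forgetting flip() is the classic bug here.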

    Javin
    10 practical Java logging tips

  7. Your two versions behave vastly differently. In the first case you copy every single byte individually; in the second case you buffer 8 KB.

    Just for fun, I wrote my own test case that copies a 1 GB file with and without buffered streams. In the buffered case I consistently get 5.6 s of runtime; without buffering it's almost as fast, 5.6-5.7 s.

  8. I respect all your comments, but what you can also try, to prove this, is to take the buffered stream source code from the java.io package, remove the synchronized keywords, and see the difference.

    Also, the buffered streams internally create a buffer, hence there was no need to create one in the first program.

  9. http://www.precisejava.com/javaperf/j2se/IO.htm

    Visit this link; I'm trying to prove the three key points mentioned at the end of it.

  10. Reading and writing one byte at a time will never be performant, period. Try this: repeat your second example with buffers larger than 8 KB (16 KB, 64 KB, 128 KB, and 256 KB) and put the results in a spreadsheet. Now rerun each of those tests, but with the input and output streams wrapped in their buffered versions. I would be very interested to see the results. I did something similar and found that on the Windows desktop I was using, 256 KB with buffered streams was by far the fastest; the test dataset was an entire directory of MP3s. Now that I think about it, you might want to make several copies of your AVI, one for each test, to keep operating-system or disk-subsystem caching from skewing the results, for example if the OS notices you keep accessing the same file.
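A sketch of that buffer-size sweep; the class name and the D:/ paths are illustrative, and each size is timed once per variant rather than averaged, so treat the numbers as rough:

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BufferSizeSweep {

    // Time one copy of src using an array of bufSize bytes, optionally
    // wrapping both ends in buffered streams.
    static long timeCopy(String src, String dst, int bufSize, boolean wrap) throws IOException {
        InputStream in = new FileInputStream(src);
        OutputStream out = new FileOutputStream(dst);
        if (wrap) {
            in = new BufferedInputStream(in);
            out = new BufferedOutputStream(out);
        }
        long t0 = System.currentTimeMillis();
        byte[] buf = new byte[bufSize];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        out.flush();
        long elapsed = System.currentTimeMillis() - t0;
        in.close();
        out.close();
        return elapsed;
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = {8 * 1024, 16 * 1024, 64 * 1024, 128 * 1024, 256 * 1024};
        for (int size : sizes) {
            // Ideally point each run at a fresh copy of the source file,
            // as suggested above, to dodge OS caching effects.
            System.out.println(size / 1024 + "K plain:    "
                    + timeCopy("D:/3idiots.avi", "D:/copy.avi", size, false) + " ms");
            System.out.println(size / 1024 + "K buffered: "
                    + timeCopy("D:/3idiots.avi", "D:/copy.avi", size, true) + " ms");
        }
    }
}
```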

  11. It's just not the same: with the extra overhead of reading byte by byte you lose the buffer advantage. You need to compare with an "extra buffer", reading into it in both cases.

    With the second example I found that, varying the buffer, the results are linear between 64 bytes and 1024 bytes of buffer size. A bigger buffer doesn't change performance.

    With the first example (modified to use an "extra buffer") I got the same performance with any buffer of 8 bytes or more. And guess what: those results are exactly the same as the second example's with a buffer of 1 KB or more, always ~20 s. Not slightly different, just the same. Setting the buffer size of the BufferedInputStream shows the same performance changes as changing the size of the buffer you use in the second example.

  12. Just take the source code of BufferedOutputStream, remove all the synchronized keywords from the class, run the program, and see the difference.
