
Why is Stream.CopyTo faster than FileStream.Write when writing to a FileStream?


I have a question and I can't find a reason for it. I'm creating a custom archive file. I'm using MemoryStream to store data and finally I use a FileStream to write the data to disk.

My disk is an SSD, but writing was very slow: writing only 95 MB to a file took 12 seconds!

I tried FileStream.Write and File.WriteAllBytes, but the result is the same.

In the end I tried copying the stream instead, and it was 100x faster!

I need to know why this is happening and what's wrong with the write functions.

Here's my code:

// First, create 150 MB of example data
Random randomgen = new Random();
byte[] new_byte_array = new byte[150000000];
randomgen.NextBytes(new_byte_array);

// Wrap the byte array in a MemoryStream
MemoryStream file1 = new MemoryStream(new_byte_array);
// HERE I DO SOME THINGS WITH THE MEMORYSTREAM


// Method 1: File.WriteAllBytes | 13,944 ms
byte[] output = file1.ToArray();
File.WriteAllBytes("output.test", output);

// Method 2: FileStream.Write | 8,471 ms
byte[] output = file1.ToArray();
FileStream outfile = new FileStream("outputfile", FileMode.Create, FileAccess.ReadWrite);
outfile.Write(output, 0, output.Length);

// Method 3: CopyTo into a FileStream | 147 ms !!!! :|
FileStream outfile = new FileStream("outputfile", FileMode.Create, FileAccess.ReadWrite);
file1.CopyTo(outfile);

Also, file1.ToArray() only takes 90 ms to convert the MemoryStream to bytes.

Why is this happening and what is the reason and logic behind it?


Answers

answered 1 week ago Johnny #1

Update

Dmytro Mukalov is right. The performance you gain by enlarging the FileStream internal buffer is lost again when the actual Flush happens. I dug a bit deeper and ran some benchmarks, and it seems that the difference between Stream.CopyTo and FileStream.Write is that Stream.CopyTo uses the I/O buffer more intelligently and improves performance by copying chunk by chunk. In the end, CopyTo uses Write under the hood. The optimum buffer size has been discussed here.

Optimum buffer size is related to a number of things: file system block size, CPU cache size, and cache latency. Most file systems are configured to use block sizes of 4096 or 8192. In theory, if you configure your buffer size so you are reading a few bytes more than the disk block, the operations with the file system can be extremely inefficient (i.e. if you configured your buffer to read 4100 bytes at a time, each read would require 2 block reads by the file system). If the blocks are already in cache, then you wind up paying the price of RAM -> L3/L2 cache latency. If you are unlucky and the blocks are not in cache yet, you pay the price of the disk->RAM latency as well.
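The 4100-byte example above can be checked with a little arithmetic. The following is a minimal sketch, assuming a 4096-byte block size and C# top-level statements; `BlocksTouched` is a hypothetical helper written for this illustration, not a .NET API.

```csharp
using System;

// Hypothetical helper: counts how many file system blocks a read of `count`
// bytes starting at byte `offset` overlaps, for the given block size.
static long BlocksTouched(long offset, long count, long blockSize)
{
    if (count <= 0) return 0;
    long firstBlock = offset / blockSize;
    long lastBlock = (offset + count - 1) / blockSize;
    return lastBlock - firstBlock + 1;
}

// A read that matches the block size touches exactly one block.
Console.WriteLine(BlocksTouched(0, 4096, 4096)); // 1

// A read that is only four bytes larger spills into a second block.
Console.WriteLine(BlocksTouched(0, 4100, 4096)); // 2
```

With a block-aligned buffer every read maps to whole blocks, which is why sizes such as 4096 or a multiple of it are the usual starting point.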

So, to answer your question: in your case you are using an unoptimized buffer size when you call Write and an optimized one when you call CopyTo. Or, better said, the Stream itself optimizes that for you.
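To make "copying chunk by chunk" concrete, here is a simplified sketch of such a copy loop. It is not the framework's actual source: `CopyInChunks`, the temp file name, and the 1 MB test payload are made up for this illustration, and the 81920-byte chunk size is my assumption about the default that the base Stream.CopyTo uses, so verify it against your runtime. Concrete stream types may also override CopyTo with different behavior.

```csharp
using System;
using System.IO;

// Hypothetical helper: reads from the source in fixed-size chunks and hands
// each chunk to destination.Write, instead of one huge Write call.
static void CopyInChunks(Stream source, Stream destination, int chunkSize)
{
    byte[] chunk = new byte[chunkSize];
    int read;
    while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
    {
        destination.Write(chunk, 0, read);
    }
}

// Small example payload (1 MB) so the sketch runs quickly.
byte[] data = new byte[1_000_000];
new Random(42).NextBytes(data);

// Hypothetical output path in the temp directory.
string path = Path.Combine(Path.GetTempPath(), "copy-in-chunks.test");

using (var source = new MemoryStream(data))
using (var outfile = new FileStream(path, FileMode.Create, FileAccess.Write))
{
    CopyInChunks(source, outfile, 81920); // assumed default copy buffer size
}

Console.WriteLine(new FileInfo(path).Length); // 1000000
```

The `using` blocks dispose the FileStream, which also flushes any data still sitting in its internal buffer.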

Generally, you could also force an unoptimized CopyTo by enlarging the FileStream internal buffer; in that case, the results should be comparably slow to the unoptimized Write.

FileStream outfile = new FileStream("outputfile",
    FileMode.Create, 
    FileAccess.ReadWrite,
    FileShare.Read,
    150000000); //internal buffer will lead to inefficient disk write
file1.CopyTo(outfile);
outfile.Flush(); //don't forget to flush data to disk

Original (wrong assumption)

I analyzed the Write methods of FileStream and MemoryStream. The point is that MemoryStream always uses an internal buffer to copy data, which is extremely fast. FileStream, on the other hand, switches behavior when the requested count >= bufferSize. That is true in your case, because you are using the default FileStream buffer, whose size is 4096. In that case FileStream doesn't use its buffer at all and calls the native Win32Native.WriteFile directly.

The trick is to force FileStream to use the buffer by overriding the default buffer size. Try this:

// Method 2 : FileStream | 8,471 ms
byte[] output = file1.ToArray();
FileStream outfile = new FileStream("outputfile",
    FileMode.Create,
    FileAccess.ReadWrite, 
    FileShare.Read,
    output.Length + 1); // important, the size of the buffer
outfile.Write(output, 0, output.Length);

N.B. I am not saying this is the optimal buffer size; it is just an explanation of what is going on. To find the best buffer size for FileStream, refer to link.
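If you want to measure this on your own machine, a rough timing loop along the following lines may help. It is only a sketch under stated assumptions: the 16 MB payload, the candidate chunk sizes, and the temp file name are arbitrary choices for illustration, and a single Stopwatch run per size is not a rigorous benchmark (OS caching and disk state will influence the numbers).

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Arbitrary example payload: 16 MB of random bytes.
byte[] payload = new byte[16 * 1024 * 1024];
new Random(1).NextBytes(payload);

// Candidate chunk sizes to compare (illustrative values, not recommendations).
int[] chunkSizes = { 4096, 81920, 1024 * 1024 };

// Hypothetical output path in the temp directory.
string benchPath = Path.Combine(Path.GetTempPath(), "chunk-size.test");

foreach (int chunkSize in chunkSizes)
{
    var watch = Stopwatch.StartNew();
    using (var outfile = new FileStream(benchPath, FileMode.Create, FileAccess.Write))
    {
        // Write the payload in pieces of chunkSize bytes.
        for (int offset = 0; offset < payload.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, payload.Length - offset);
            outfile.Write(payload, offset, count);
        }
        // Include the flush to disk in the measurement.
        outfile.Flush(true);
    }
    watch.Stop();
    Console.WriteLine($"chunkSize={chunkSize,8}: {watch.ElapsedMilliseconds} ms");
}
```

Calling Flush(true) inside the timed region matters: without it, part of the cost can be deferred until the stream is disposed and the comparison becomes misleading.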
