ZipArchive.Options.CompressorCount property

0 votes
asked Aug 12, 2015 by Chris6 (280 points)

Does the new ZIP property automatically determine the number of cores for optimal performance, or do I need to assign it manually?

I have done some tests (my computer has 4 cores) but the default value is 3?

I have also tried changing it to zero, one, two and so on, but the run time does not vary significantly.

=== Test code ===

        StringBuilder tmp = new StringBuilder();

        for (int i = 0; i < 5; i++)
        {
            Stopwatch stopWatch = new Stopwatch();
            stopWatch.Start();

            string zipfile = @"C:\Users\Chris\Documents\Visual Studio 2012\Projects\WindowsFormsApplication1\WindowsFormsApplication1\archive.zip";

            if (File.Exists(zipfile))
                File.Delete(zipfile);

            using (ZipArchive zip = new ZipArchive(zipfile))
            {
                zip.Options.CompressorCount = i;
                zip.Add(@"C:\Users\Chris\Dropbox\*", "/test");
            }

            if (File.Exists(zipfile))
                File.Delete(zipfile);

            stopWatch.Stop();

            // Get the elapsed time as a TimeSpan value.
            TimeSpan ts = stopWatch.Elapsed;

            // Format and display the TimeSpan value. 
            string elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}",
                ts.Hours, ts.Minutes, ts.Seconds,
                ts.Milliseconds / 10);
            tmp.AppendLine("RunTime " + elapsedTime + "(" + i + ")");
        }

        MessageBox.Show(tmp.ToString());


=== Output ===


RunTime 00:00:07.48(0)
RunTime 00:00:07.38(1)
RunTime 00:00:07.62(2)
RunTime 00:00:07.60(3)
RunTime 00:00:07.60(4)
Applies to: Rebex ZIP
commented Aug 12, 2015 by Chris6 (280 points)
I ran it on another computer with 6 cores (different data) and the results are much the same:

<quote>
RunTime 00:00:52.74(0)
RunTime 00:00:09.19(1)
RunTime 00:00:09.71(2)
RunTime 00:00:09.86(3)
RunTime 00:00:09.68(4)
RunTime 00:00:09.74(5)
RunTime 00:00:09.70(6)
RunTime 00:00:09.68(7)
RunTime 00:00:09.72(8)
RunTime 00:00:09.70(9)
</quote>

Is it recommended to leave this property at its default value?
commented Aug 12, 2015 by Lukas Pokorny (120,490 points)
What does your test data look like? Multi-core compression does not provide any speedup at all if the files you are compressing are not long enough. By default, each file is split into 512 KB blocks of data and each block is processed independently, which means that there is no gain when compressing a 512 KB file. The chunk size can be customized using the ZipArchive.Options.CompressorChunkSize property, but setting it to a small value is not recommended, because it would lead to more overhead and a worse compression ratio.
commented Aug 12, 2015 by Chris6 (280 points)
Hi Lukas,

The test data is photos all 2-3 MB (679MB in total).

The CPU utilisation does not go above 10%, but when I use WinRAR (3 seconds) or 7ZIP (1 second) the CPU gets maxed out.

It's just an observation; the current compression speed is good enough for us.

Chris

1 Answer

0 votes
answered Aug 12, 2015 by Lukas Pokorny (120,490 points)
selected Sep 1, 2015 by Chris6
 
Best answer

Yes, the default value is determined automatically, and it will always be slightly less than the number of CPU cores - there is always the main thread that manages the worker threads, and we have found that trying to utilize all the CPU cores is not worth the increase in overall CPU consumption.

The default value is derived from the number of virtual CPU cores:

  • less than 4 virtual cores => 1 compressor
  • 4 virtual cores => 2 compressors
  • 8 virtual cores => 3 compressors
  • 16 virtual cores => 6 compressors
  • more than 23 virtual cores => 8 compressors
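
The list above can be read as a step function. Here is a minimal sketch of that reading. The list gives only five data points, so the behavior between them (for example, 6 or 12 virtual cores) is an assumption: each count falls back to the nearest lower listed entry. `CompressorDefaults` and `Estimate` are hypothetical names, not part of the Rebex API, and this is not Rebex's actual implementation.

```csharp
using System;

// Hypothetical helper that mirrors the list above; NOT part of the Rebex API.
public static class CompressorDefaults
{
    // Assumption: core counts between the listed values use the nearest lower entry.
    public static int Estimate(int virtualCores)
    {
        if (virtualCores > 23) return 8;   // more than 23 virtual cores
        if (virtualCores >= 16) return 6;  // 16 virtual cores
        if (virtualCores >= 8) return 3;   // 8 virtual cores
        if (virtualCores >= 4) return 2;   // 4 virtual cores
        return 1;                          // less than 4 virtual cores
    }
}

public static class Program
{
    public static void Main()
    {
        // Environment.ProcessorCount reports logical (virtual) processors.
        int cores = Environment.ProcessorCount;
        Console.WriteLine("{0} virtual cores => roughly {1} compressor(s)",
            cores, CompressorDefaults.Estimate(cores));
    }
}
```

Under this reading, a machine with 4 physical cores and 8 virtual cores gets a default of 3, which matches the value reported in the question.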

That said, our multi-core compression works with 512 KB chunks (by default; this can be changed using ZipArchive.Options.CompressorChunkSize, but see my comment below), which means it does not provide any speed increase for short files.
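
To make the chunk arithmetic concrete, here is a small sketch that counts how many 512 KB chunks a file of a given length yields. It assumes plain ceiling division, with the last chunk allowed to be partial. `ChunkMath` and `ChunkCount` are hypothetical names, and the library's internal splitting may differ in detail.

```csharp
using System;

// Hypothetical helper for illustration; NOT part of the Rebex API.
public static class ChunkMath
{
    // 512 KB, the default chunk size mentioned above.
    public const long DefaultChunkSize = 512 * 1024;

    // Number of independently processed chunks, assuming simple ceiling division.
    public static long ChunkCount(long fileLength, long chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        if (fileLength <= 0) return 0;
        return (fileLength + chunkSize - 1) / chunkSize;
    }
}

public static class Program
{
    public static void Main()
    {
        long[] sizes = { 300L * 1024, 512L * 1024, 2L * 1024 * 1024, 3L * 1024 * 1024 };
        foreach (long size in sizes)
        {
            Console.WriteLine("{0} bytes => {1} chunk(s)",
                size, ChunkMath.ChunkCount(size, ChunkMath.DefaultChunkSize));
        }
    }
}
```

A file of 512 KB or less yields a single chunk, so there is nothing to distribute across compressors. A 2 MB file yields 4 chunks and a 3 MB file yields 6.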

We plan to do extensive benchmarking and publish a blog post soon outlining what speed increase you can expect in various scenarios. For now, I measured the duration of compressing 2 log files with an overall size of 410 MB:

Compression level 6 (default):
1 core … 45 seconds
2 cores … 25 seconds
3 cores … 17 seconds

Compression level 9 (highest):
1 core … 121 seconds
2 cores … 67 seconds
3 cores … 44 seconds

The time measurement indicates the total duration of the two ZipArchive.AddFile method calls.
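
The measurements above can be turned into speedup and per-core efficiency figures with simple arithmetic. The sketch below does only that. `Speedup`, `Factor` and `Efficiency` are hypothetical names, and the input numbers are the ones quoted above.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration only.
public static class Speedup
{
    // How many times faster the multi-core run is than the single-core run.
    public static double Factor(double singleCoreSeconds, double multiCoreSeconds)
    {
        return singleCoreSeconds / multiCoreSeconds;
    }

    // Speedup divided by the number of cores used (1.0 would be perfect scaling).
    public static double Efficiency(double singleCoreSeconds, double multiCoreSeconds, int cores)
    {
        return Factor(singleCoreSeconds, multiCoreSeconds) / cores;
    }
}

public static class Program
{
    private static void Report(string label, double t1, double tn, int cores)
    {
        Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
            "{0}, {1} cores: speedup {2:F2}x, efficiency {3:P0}",
            label, cores, Speedup.Factor(t1, tn), Speedup.Efficiency(t1, tn, cores)));
    }

    public static void Main()
    {
        // Durations in seconds, taken from the measurements above.
        Report("Level 6", 45, 25, 2);
        Report("Level 6", 45, 17, 3);
        Report("Level 9", 121, 67, 2);
        Report("Level 9", 121, 44, 3);
    }
}
```

For these figures the speedup is 1.8x with 2 cores at level 6 (45 / 25) and 2.75x with 3 cores at level 9 (121 / 44), so each added core contributes somewhat less than a full core's worth of throughput.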

These measurements were done on a Dell laptop with Intel Core i7-3740QM CPU (4 CPU cores, 8 virtual cores) at 2.7GHz, 16 GB RAM and an SSD disk drive.

Using 4 cores still improved the speed noticeably, but for 5, 6 or more cores the additional speed increase was not worth the increased overall CPU usage.
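
For anyone repeating this kind of comparison, here is a rough sketch of a timing harness that measures only the compression call itself, after one untimed warm-up pass. `CompressionBenchmark`, `Measure` and `Run` are hypothetical names. The warm-up pass is an assumption about good test hygiene (it may even out file-system caching effects between runs), not something measured in this thread. The workload in `Main` is a placeholder; the commented lines show where the ZipArchive code from the question would go.

```csharp
using System;
using System.Diagnostics;

// Hypothetical harness for illustration; NOT part of the Rebex API.
public static class CompressionBenchmark
{
    // Times one invocation of 'work'. Setup and cleanup stay outside the timed region.
    public static TimeSpan Measure(Action work)
    {
        Stopwatch sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        return sw.Elapsed;
    }

    // Runs 'compress' once untimed as a warm-up, then once per compressor count.
    public static TimeSpan[] Run(Action<int> compress, int[] counts)
    {
        TimeSpan[] results = new TimeSpan[counts.Length];
        if (counts.Length == 0) return results;

        compress(counts[0]); // warm-up pass, result discarded

        for (int i = 0; i < counts.Length; i++)
        {
            int count = counts[i];
            results[i] = Measure(delegate { compress(count); });
        }
        return results;
    }
}

public static class Program
{
    public static void Main()
    {
        int[] counts = { 1, 2, 3, 4 };

        TimeSpan[] times = CompressionBenchmark.Run(delegate(int count)
        {
            // Placeholder workload. In a real test, delete the old archive first
            // (outside the timed region if possible) and then call, for example:
            //   using (ZipArchive zip = new ZipArchive(zipfile))
            //   {
            //       zip.Options.CompressorCount = count;
            //       zip.Add(sourceMask, "/test");
            //   }
            System.Threading.Thread.Sleep(10);
        }, counts);

        for (int i = 0; i < counts.Length; i++)
        {
            Console.WriteLine("CompressorCount {0}: {1}", counts[i], times[i]);
        }
    }
}
```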

...