Ssh shell script execution performance suffers greatly in Parallel Task Library

0 votes
asked Aug 2, 2013 by hardywang (180 points)
edited Aug 12, 2013

I have following pieces of codes

    private static void ProcessOneRequestRebex(int threadId, long requestId)
    {
        Console.WriteLine("Thread {0}, {2}, request Id {1} received.", threadId, requestId, DateTime.Now.ToString("HH:mm:ss:ffff"));

        // prepare a tar file to upload
        string tgzFile = PrepareTarGzFile(requestId.ToString("D10"));
        Console.WriteLine("Thread {2}, {0}, File {1} TARed.", DateTime.Now.ToString("HH:mm:ss:ffff"), requestId, threadId);

        // establish the shared SSH connection
        using (var session = new Rebex.Net.SshSession())
        {
            session.Connect("my-unix-server");
            Console.WriteLine("Thread {2}, {0}, {1} session connected.", DateTime.Now.ToString("HH:mm:ss:ffff"), requestId, threadId);

            // authenticate
            session.Authenticate("my-userid", "my-password");
            Console.WriteLine("Thread {2}, {0}, {1} session authenticated.", DateTime.Now.ToString("HH:mm:ss:ffff"), requestId, threadId);

            // create client and connect
            using (Sftp sftpClient = new Sftp())
            {
                sftpClient.Bind(session);
                Console.WriteLine("Thread {2}, {0}, {1} sftp session bound.", DateTime.Now.ToString("HH:mm:ss:ffff"), requestId, threadId);

                sftpClient.PutFile(tgzFile, "/opt/some/some_in" + new FileInfo(tgzFile).Name);
                Console.WriteLine("Thread {2}, {0}, File {1} uploaded.", DateTime.Now.ToString("HH:mm:ss:ffff"), requestId, threadId);

                using (Ssh sshClient = new Ssh())
                {
                    sshClient.Bind(session);
                    Console.WriteLine("Thread {2}, {0}, {1} ssh client session bound.", DateTime.Now.ToString("HH:mm:ss:ffff"), requestId, threadId);

                    string result = sshClient.RunCommand("/opt/some/and_some/test.sh ../some_in/" + new FileInfo(tgzFile).Name);
                }
                Console.WriteLine("Thread {2}, {0}, Publish {1} command executed.", DateTime.Now.ToString("HH:mm:ss:ffff"), requestId, threadId);

                sftpClient.GetFile("/opt/some/some_out/output." + requestId.ToString("D10") + ".pdf", @"C:\temp\output." + requestId.ToString("D10") + ".pdf");
                Console.WriteLine("Thread {2}, {0}, File {1} downloaded.", DateTime.Now.ToString("HH:mm:ss:ffff"), requestId, threadId);
            }
        }
        Console.WriteLine("Thread {1},, {0} Successful.", requestId, threadId);
    }

And it is wrapped inside a Parallel.For

        Random rnd = new Random();
        Parallel.For(0, X, i =>
        {
            long requestId = rnd.NextLong(1000000000, 9999999999);
            ProcessOneRequestRebex(i, requestId);
        });

When I set the parallelism to 1 (only one thread) I pretty much can get end to end in about 2 seconds, see the Console output from code below:

Thread 0, 10:59:02:9018, request Id 3961681714 received.

Thread 0, 10:59:03:1258, File 3961681714 TARed.

Thread 0, 10:59:04:0228, 3961681714 session connected.

Thread 0, 10:59:04:2688, 3961681714 session authenticated.

Thread 0, 10:59:04:4738, 3961681714 sftp session bound.

Thread 0, 10:59:04:6228, File 3961681714 uploaded.

Thread 0, 10:59:04:6278, 3961681714 ssh client session bound.

Thread 0, 10:59:05:4438, Publish 3961681714 command executed.

Thread 0, 10:59:06:4098, File 3961681714 downloaded.

Thread 0,, 3961681714 Successful.

Please notice, the time between file uploaded and command executed is only about 1 second or so.

However, if I set the parallelism to 10 (10 concurrent thread, I have I7 CPU, 4 cores, Windows 7), I got something like following

Thread 0, 11:03:09:6928, request Id 3921126483 received.

Thread 6, 11:03:09:6968, request Id 1444956043 received.

Thread 2, 11:03:09:6978, request Id 5699697281 received.

Thread 8, 11:03:09:6968, request Id 1908922957 received.

Thread 4, 11:03:09:6978, request Id 2977101433 received.

Thread 6, 11:03:09:9088, File 1444956043 TARed.

Thread 8, 11:03:09:9088, File 1908922957 TARed.

Thread 2, 11:03:09:9098, File 5699697281 TARed.

Thread 4, 11:03:09:9178, File 2977101433 TARed.

Thread 0, 11:03:09:9178, File 3921126483 TARed.

Thread 1, 11:03:10:4948, request Id 3077438486 received.

Thread 1, 11:03:10:5168, File 3077438486 TARed.

Thread 3, 11:03:11:0478, request Id 8839964099 received.

Thread 0, 11:03:11:0568, 3921126483 session connected.

Thread 6, 11:03:11:0568, 1444956043 session connected.

Thread 8, 11:03:11:0568, 1908922957 session connected.

Thread 1, 11:03:11:0598, 3077438486 session connected.

Thread 2, 11:03:11:0618, 5699697281 session connected.

Thread 4, 11:03:11:0638, 2977101433 session connected.

Thread 3, 11:03:11:0758, File 8839964099 TARed.

Thread 1, 11:03:11:1938, 3077438486 session authenticated.

Thread 8, 11:03:11:1948, 1908922957 session authenticated.

Thread 2, 11:03:11:2168, 5699697281 session authenticated.

Thread 0, 11:03:11:2178, 3921126483 session authenticated.

Thread 4, 11:03:11:2568, 2977101433 session authenticated.

Thread 6, 11:03:11:2768, 1444956043 session authenticated.

Thread 3, 11:03:11:3668, 8839964099 session connected.

Thread 8, 11:03:11:3998, 1908922957 sftp session bound.

Thread 6, 11:03:11:3998, 1444956043 sftp session bound.

Thread 1, 11:03:11:3998, 3077438486 sftp session bound.

Thread 4, 11:03:11:4038, 2977101433 sftp session bound.

Thread 0, 11:03:11:3998, 3921126483 sftp session bound.

Thread 2, 11:03:11:3998, 5699697281 sftp session bound.

Thread 3, 11:03:11:4588, 8839964099 session authenticated.

Thread 6, 11:03:11:5228, File 1444956043 uploaded.

Thread 1, 11:03:11:5238, File 3077438486 uploaded.

Thread 4, 11:03:11:5298, File 2977101433 uploaded.

Thread 6, 11:03:11:5328, 1444956043 ssh client session bound.

Thread 4, 11:03:11:5338, 2977101433 ssh client session bound.

Thread 1, 11:03:11:5328, 3077438486 ssh client session bound.

Thread 0, 11:03:11:5308, File 3921126483 uploaded.

Thread 0, 11:03:11:5398, 3921126483 ssh client session bound.

Thread 8, 11:03:11:5238, File 1908922957 uploaded.

Thread 8, 11:03:11:5418, 1908922957 ssh client session bound.

Thread 2, 11:03:11:5508, File 5699697281 uploaded.

Thread 2, 11:03:11:5518, 5699697281 ssh client session bound.

Thread 3, 11:03:11:5788, 8839964099 sftp session bound.

Thread 3, 11:03:11:7238, File 8839964099 uploaded.

Thread 3, 11:03:11:7258, 8839964099 ssh client session bound.

Thread 5, 11:03:12:0008, request Id 1707773597 received.

Thread 5, 11:03:12:0448, File 1707773597 TARed.

Thread 5, 11:03:12:4028, 1707773597 session connected.

Thread 5, 11:03:12:6458, 1707773597 session authenticated.

Thread 5, 11:03:12:7648, 1707773597 sftp session bound.

Thread 5, 11:03:12:8938, File 1707773597 uploaded.

Thread 5, 11:03:12:8958, 1707773597 ssh client session bound.

Thread 7, 11:03:12:9978, request Id 7507612201 received.

Thread 7, 11:03:13:0348, File 7507612201 TARed.

Thread 7, 11:03:13:3238, 7507612201 session connected.

Thread 7, 11:03:13:4318, 7507612201 session authenticated.

Thread 7, 11:03:13:5528, 7507612201 sftp session bound.

Thread 7, 11:03:13:6458, File 7507612201 uploaded.

Thread 7, 11:03:13:6468, 7507612201 ssh client session bound.

Thread 9, 11:03:13:9978, request Id 3423911186 received.

Thread 9, 11:03:14:0368, File 3423911186 TARed.

Thread 9, 11:03:14:2968, 3423911186 session connected.

Thread 9, 11:03:14:4118

commented Aug 6, 2013 by Lukas Pokorny (88,550 points)
edited Aug 6, 2013

Would it be possible to create a communication log of this using SshSession object's LogWriter property (as described at http://www.rebex.net/kb/logging.aspx), possibly even in LogLevel.Verbose mode? That way, we will see what is actually going on during those 3 to 6 seconds. It's possible to make all the objects log into a single file. Also, when using Verbose mode, the log will contain your password, so either delete it or assign LogWriter after Authenticate has already been called.

commented Aug 6, 2013 by Lukas Pokorny (88,550 points)
edited Aug 6, 2013

FYI, when you create multiple SshSession objects, they should be completely independent from each other, which makes me wonder what is actually causing the issue. Hopefully, the Verbose log will tell.

commented Aug 6, 2013 by hardywang (180 points)
edited Aug 6, 2013

Hi Lucas, the log file is about 36mb, after 7z it is 2.3mb, how can I send it to you?

commented Aug 7, 2013 by Lukas Pokorny (88,550 points)
edited Aug 7, 2013

Please upload it via FTP - I just sent you the hostname, username and password to your e-mail. Thanks!

commented Aug 8, 2013 by hardywang (180 points)
edited Aug 8, 2013

Please check FTP, you can see 2 files. Actually they are the same. Please let me know what you can find.

1 Answer

0 votes
answered Aug 9, 2013 by Lukas Pokorny (88,550 points)
edited Aug 12, 2013

The log indeed shows lags of about 3 seconds - a command is executed and starts sending data, but this often starts lagging when another command or SFTP session (on a separate SshSession) is started on another channel. However, from the log, it looks like the lag occured at the server side and no data actually arrived during those 3 seconds.

Unfortunately, there was no indication of any kind of issue that might be causing this. A log produced by a network protocol analyzer such as Wireshark might be used to either confirm or disprove that the lag really occurs at the server side. We checked our SSH code as well, trying to find anything that might be causing the problem, but found nothing - multiple instances of SshSession really appear to be totally independent (as they should be).

Interestingly, testing with one of our SSH servers showed results similar to yours - performance really suffers greatly with multiple threads. In order to try ruling out the possibility of a bug in our library, we then tried a different approach - instead of parallelizing using Parallel.For, we instead spawned several instances of the same application to do some, which ensures that instances of SshSession really are independent of each other. And the results were actually quite similar to Parallel.For... If you give this a try as well, please let us know about your results.

To sum this up, we are currently inclined to believe this to be caused at the server side, not the client side.

By the way, SSH negotiation is very computationally intensive because it involves asymmetric cryptography (Diffie-Hellman and RSA or DSA). Have you considered or tried using a single instance SshSession for all the Sftp and Ssh objects? Or - even better - reusing those Sftp and Ssh objects as well? Both support multiple simultaneous operations. It might also be useful to limit the maximum degree of parallelism using ParallelOptions.MaxDegreeOfParallelism. I'm not sure what the default value is, but using a value higher that the number of CPU cores at the SSH server won't improve performance at all (and even a lower number might be too much if the bottleneck is not the CPU - in our test scenario, I believe it was a disk drive rather than CPU).

In any case, please let us know what you find. It might be useful both for other customers and for us as well. Thanks!

commented Aug 9, 2013 by hardywang (180 points)
edited Aug 9, 2013

Please see my update post. 2 issues - performance and an intermit exception Cannot open channel; administratively prohibited. Open failed. from at Rebex.Net.Sftp.ZZ(SshSession A, PEB B) at Rebex.Net.Sftp.Bind(SshSession session)

commented Aug 12, 2013 by Lukas Pokorny (88,550 points)
edited Aug 12, 2013

This exception was raised as a result of the SSH server refusing to open an SFTP channel - it sent a SSH_MSG_CHANNEL_OPEN_FAILURE packet with a reason of SSH_OPEN_ADMINISTRATIVELY_PROHIBITED (see RFC 4254). Perhaps the server has some limitations on maximum channels per sessions?

commented Aug 12, 2013 by hardywang (180 points)
edited Aug 12, 2013

Could be. So I may have to stick with one session per thread approach, thus I am not able to share the sessions/sftp/ssh connections.

...