0 votes
by (130 points)

Hello,

I don't know if I'm doing something wrong, and there's probably something that I don't understand...

This is with Rebex 5.0.7733 in a .NET Core 3.1 app.

I'm reading a file, byte by byte, from Azure Blob Storage. Of course, my end goal is not byte-by-byte transfers; I'm only doing that for testing purposes.

I then call PutFile for each byte, incrementing the offset. The first byte never causes a problem: the file is written on the SFTP server and contains my first character. But as soon as I increment the offset in PutFile, I get an error. If I always leave the offset at 0, the file is correctly overwritten by each byte. So the stream is OK and the length is OK; the problem seems to be with appending to the file.

The code:

 long offset = 0;
 long remainingBytes = 0;
 long length = 1;
 do
 {
     MemoryStream stream = new MemoryStream();
     azUtilities.GetFileStreamBlock(_AzureFileName, offset * length, ref stream, length, out remainingBytes);
     stream.Position = 0;
     client.PutFile(stream, _SftpFileName, offset * length, stream.Length);
     offset++;
     stream.Dispose();
 } while (remainingBytes > 0);

The error:

2021-04-08 18:42:49.494 DEBUG Sftp(1)[1] SSH: Authentication successful.
2021-04-08 18:42:49.502 DEBUG Sftp(1)[1] SSH: Opening channel 'session' (initial window size: 131072, max packet size: 129024).
2021-04-08 18:42:49.529 DEBUG Sftp(1)[1] SSH: Requesting subsystem 'sftp'.
2021-04-08 18:42:49.555 DEBUG Sftp(1)[6] SSH: Adjusted remote receive window size: 0 -> 1048576.
2021-04-08 18:42:49.569 INFO Sftp(1)[1] Command: SSH_FXP_INIT (4)
2021-04-08 18:42:49.588 INFO Sftp(1)[1] Response: SSH_FXP_VERSION (4, 1 extension)
2021-04-08 18:42:49.590 INFO Sftp(1)[1] Info: Using SFTP v4 on a Windows-like platform.
2021-04-08 18:42:49.594 INFO Sftp(1)[1] Command: SSH_FXP_REALPATH (1, '.')
2021-04-08 18:42:49.651 INFO Sftp(1)[1] Response: SSH_FXP_NAME (1, '/')
2021-04-08 18:42:49.652 INFO Sftp(1)[1] Info: Home directory is '/'.
2021-04-08 18:42:52.571 INFO Sftp(1)[1] Command: SSH_FXP_OPEN (2, '/c_DRV3546_Plateforme_Integration_Biztalk_A/UNIT/l2c/new-file.txt', 26)
2021-04-08 18:42:52.654 INFO Sftp(1)[1] Response: SSH_FXP_HANDLE (2, 0x255775CC9D22E8A6FC31)
2021-04-08 18:42:52.658 DEBUG Sftp(1)[1] Command: SSH_FXP_WRITE (3, 0x255775CC9D22E8A6FC31, 0, 1 byte)
2021-04-08 18:42:52.672 DEBUG Sftp(1)[1] Response: SSH_FXP_STATUS (3, 0, 'The write completed successfully')
2021-04-08 18:42:52.674 INFO Sftp(1)[1] Command: SSH_FXP_CLOSE (4, 0x255775CC9D22E8A6FC31)
2021-04-08 18:42:52.847 INFO Sftp(1)[1] Response: SSH_FXP_STATUS (4, 0, 'The operation completed')
2021-04-08 18:42:54.353 INFO Sftp(1)[1] Command: SSH_FXP_OPEN (5, '/c_DRV3546_Plateforme_Integration_Biztalk_A/UNIT/l2c/new-file.txt', 2)
2021-04-08 18:42:54.409 INFO Sftp(1)[1] Response: SSH_FXP_HANDLE (5, 0xC1CB976A22B3693F8D22)
2021-04-08 18:42:54.410 DEBUG Sftp(1)[1] Command: SSH_FXP_WRITE (6, 0xC1CB976A22B3693F8D22, 1, 1 byte)
2021-04-08 18:42:54.427 INFO Sftp(1)[1] Response: SSH_FXP_STATUS (6, 4, 'Cannot write to a file not opened for writing!')
2021-04-08 18:42:54.478 ERROR Sftp(1)[1] Info: Rebex.Net.SftpException: Failure; Cannot write to a file not opened for writing!.
   at hljlg.hzynn.jfphx(jbrvv p0, Type p1)
   at Rebex.Net.Sftp.aquhm(tnzlq p0, wwrjk p1, String p2, Stream p3, Int64 p4, Int64 p5, owqet p6)

Do I have to do something to keep the file open for writing?

Or something else?

Thank you,
Jessy

Applies to: Rebex SFTP

1 Answer

+1 vote
by (144k points)

Hello,

This seems to be caused by a server-side bug (or a limitation of its file storage; I'll get back to this later). The following snippet of the log shows that the file has been opened for writing (the flag value 2 indicates the SSH_FXF_WRITE open mode). Despite this, the server rejects the write request, claiming otherwise:

Command: SSH_FXP_OPEN (5, '/c_DRV3546_Plateforme_Integration_Biztalk_A/UNIT/l2c/new-file.txt', 2)
Response: SSH_FXP_HANDLE (5, 0xC1CB976A22B3693F8D22)
Command: SSH_FXP_WRITE (6, 0xC1CB976A22B3693F8D22, 1, 1 byte)
Response: SSH_FXP_STATUS (6, 4, 'Cannot write to a file not opened for writing!')

Please contact the server operators or vendor to find out more about this.

My guess is that the server simply doesn't support random-access writes and expects the whole file to be uploaded sequentially with no seeking (and reports an odd error if the client attempts otherwise).

You might try experimenting a bit with the Sftp object's stream-based API (the GetStream(string remotePath, FileMode mode, FileAccess access) method) to find out which operations are actually supported.
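For instance, a probe along these lines might reveal whether seek-and-write is allowed at all. This is only a sketch: the remote path is made up, it assumes an already-connected Sftp client, and the exact exception thrown depends on the server.

```csharp
using System;
using System.IO;

// Hypothetical probe: does this server allow random-access writes
// through a writable stream? (remote path is an example)
using (Stream stream = client.GetStream("/test/probe.txt", FileMode.Create, FileAccess.Write))
{
    byte[] first = { (byte)'A' };
    stream.Write(first, 0, first.Length);  // sequential write at offset 0

    try
    {
        // Now try writing at a non-zero offset via seeking.
        stream.Position = 1;
        byte[] second = { (byte)'B' };
        stream.Write(second, 0, second.Length);
        Console.WriteLine("Random-access writes appear to be supported.");
    }
    catch (Exception ex)
    {
        Console.WriteLine("Random-access write rejected: " + ex.Message);
    }
}
```

If the second write fails the same way, that would confirm the server only accepts strictly sequential uploads.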


However, you might be wondering why the file open mode is different (26 vs 2) when a zero remoteOffset is specified in the PutFile call. The reason is that PutFile behaves slightly differently in that case: instead of opening the file with the SSH_FXF_WRITE flag only, it opens it with SSH_FXF_WRITE | SSH_FXF_CREAT | SSH_FXF_TRUNC, which also causes the file to be truncated if it exists and created if it does not. In other words, it behaves like the PutFile(Stream sourceStream, string remotePath) overload.

We understand this might be undesirable in some scenarios, so there is an option that changes this behavior, causing PutFile with a zero remoteOffset not to use the SSH_FXF_TRUNC flag:

client.Settings.DisablePutFileZeroOffsetTruncate = true;

But this does not seem relevant in your scenario.

by (130 points)
Thank you for your answer.

The goal is to transfer files from Azure to an SFTP server, with the code running in a
Kubernetes container with limited resources. So what I wanted to do is get/put
the file MB by MB.

At the company where I work, we deal with tons of external SFTP vendors to do
transactions with them. So I HAVE to be compatible with all of them.

Any suggestions on how I could achieve this? Transfer or stream the file
chunk by chunk? I need to avoid downloading the complete file into memory,
and more importantly, avoid at all costs writing the file to disk temporarily,
because the files may contain personal/financial/medical data. And if a file
ends up stored on disk, I will have to comply with a lot of security measures
that I'm trying to avoid... :)

Thank you,
Jessy
by (144k points)
Perhaps the stream-based API (the GetStream or GetUploadStream method) might be the way to go? This makes it possible to open or create a remote file at an SFTP server as a writable stream, to which you can then write data chunk by chunk.

(Alternatively, consider the approach described at https://forum.rebex.net/2892/performance-issue-sftp-direct-stream-access-getuploadstream?show=2893#a2893 if it turns out this is not sufficiently fast.)
by (130 points)
Thanks a lot,
using the GetUploadStream, I was able to upload my file, writing in the Stream.

I don't know if you have the answer to this... but while doing so, what happens to memory consumption?

It may be a newbie question... it's been a while since I had to care about memory management this much. But while writing to the stream, where does my chunk go
once I have cleared the original stream? Let's say I have a file of 500 MB that I read in chunks of 100 MB... once I write to the stream and dispose my "azure stream" (depending on the GC, of course), will my memory consumption still increase?

using (var sftpStream = client.GetUploadStream(_SftpFileName))
{
    do
    {
        MemoryStream azureStream = new MemoryStream();
        azUtilities.GetFileStreamBlock(_AzureFileName, offset * length, ref azureStream, length, out remainingBytes);
        sftpStream.Write(azureStream.ToArray(), 0, System.Convert.ToInt32(azureStream.Length));
        sftpStream.Flush();
        offset++;
        azureStream.Dispose();
    } while (remainingBytes > 0);
}

Thank you
by (144k points)
The sftpStream.Write method constructs a series of small SFTP packets from the supplied data and sends them to the server. But each packet is only constructed once the previous one has been sent, so the additional memory usage stays very low. The packets are very small (they don't contain a copy of the data, just a reference to your MemoryStream's buffer). When the method returns, the data has already been written to the remote file, and the GC is easily able to take care of both our small packets and your MemoryStream.

To make the whole process even more memory-efficient, you can actually safely reuse the same MemoryStream for all chunks, and use GetBuffer() instead of ToArray() to avoid creating a copy of the data:

using (var sftpStream = client.GetUploadStream(_SftpFileName))
{
    MemoryStream azureStream = new MemoryStream();     
    do
    {
        azUtilities.GetFileStreamBlock(_AzureFileName, offset * length, ref azureStream, length, out remainingBytes);
        sftpStream.Write(azureStream.GetBuffer(), 0, System.Convert.ToInt32(azureStream.Length));
        sftpStream.Flush();
        offset++;
        azureStream.SetLength(0);
    } while (remainingBytes > 0);
    azureStream.Dispose();
}
by (130 points)
Thanks a lot Lukas!
by (144k points)
Oops, I just now noticed the "ref" keyword on the azureStream argument of the azUtilities.GetFileStreamBlock method... Does this mean the method might actually create a new MemoryStream instance and set azureStream to it? If not, the "ref" keyword is not needed. If it does, then you might as well keep your original code and just use GetBuffer() instead of ToArray() (unless GetFileStreamBlock creates the new MemoryStream in a way that prevents this from working).
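To illustrate the distinction (the method names below are hypothetical, not the real azUtilities API): "ref" only matters when the callee reassigns the parameter itself.

```csharp
using System.IO;

// Without "ref": the method can still write into the caller's existing
// MemoryStream (both variables reference the same object), but reassigning
// the parameter has no effect on the caller's variable.
static void FillBlock(MemoryStream stream)
{
    stream.Write(new byte[] { 1, 2, 3 }, 0, 3);  // visible to the caller
    stream = new MemoryStream();                 // caller's variable unchanged
}

// With "ref": reassignment replaces the caller's variable, so the caller
// ends up holding a brand-new MemoryStream instance after the call.
static void ReplaceBlock(ref MemoryStream stream)
{
    stream = new MemoryStream();                 // caller now sees this one
}
```

So if GetFileStreamBlock only writes into the stream it receives, the "ref" can simply be dropped.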
by (130 points)
You are right, no need for the "ref"; the stream is not created in the called method... Thanks again!