The Unix "split" command is very useful when transferring extremely large files, especially across slow or unreliable links.

This way, if sftp (or other file transfer) dies in the middle, one only has to retransmit one (fairly) small file, not start over again on something huge. This greatly reduces the frustration factor.

Quick Tutorial on How To Use Split:

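The 'split' command breaks a file into 1000-line segments by default, or into segments of any number of lines you choose. It works on binary files as well as ASCII ones, although in a binary file a "line" is just whatever bytes happen to fall between newline characters, so the piece sizes are unpredictable (splitting by size, described in step 3, is usually the better choice there). Often 1000 lines is too few (i.e., it would create too many subfiles), so it is advisable to decide on a suitable number.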
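
As a quick illustration of those defaults, here is a minimal sketch. The scratch directory, the file name sample.txt, and the 2500 lines of made-up data are assumptions chosen only for the demonstration.

```shell
# Default behavior sketch: no line count and no output base name given.
demo_dir=$(mktemp -d)     # hypothetical scratch directory
cd "$demo_dir"

# Generate a 2500-line sample file (made-up data).
awk 'BEGIN { for (i = 1; i <= 2500; i++) print "line", i }' > sample.txt

# With no options, split uses 1000 lines per piece and the base name 'x'.
split sample.txt

# Show the pieces and how many lines each one holds.
wc -l x*
```

With 2500 lines, this leaves two full 1000-line pieces and a shorter third one.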

Here's one way to use it:

  1. wc -l filename (to see how many lines the file has)
  2. Determine how many subfiles you would like to create, divide the line count reported by wc by this number, then round to something sane. (e.g., 13600 / 12 = 1133, so make the files 1100 lines long, and there will be a 13th file of 400 lines.)
  3. Now, split the file, using the number you chose in (2). If you do not specify an output filename base, the pieces will be named with a base of 'x' and a suffix of 'aa', 'ab', and so on (e.g., xaa, xab, xac).
    split -l 1100 filename output_filename_base

    You can also specify a piece size instead, with the -b (for 'bytes') flag. The 'm' suffix means units of 1048576 bytes, so if you want 100 MB pieces, use
    split -b 100m filename output_filename_base

  4. Now compute a checksum of every piece, saving the results into a file for cross-checking after the sftp upload:
    openssl sha1 output_filename_base* > filenames.sha1
  5. Place the output_filename_base* and filenames.sha1 files in your sharing directory.
  6. The user retrieves all of these files, using a binary transfer (for binary files).
  7. The user double-checks that all checksums are correct, either visually or by repeating (4) on their end and then doing a diff:
    openssl sha1 output_filename_base* > filenames.sha1.local
    diff filenames.sha1 filenames.sha1.local
  8. The user puts all the pieces back together:
    cat output_filename_base* > filename
  9. The user has now recreated the file in its original form. It may now be viewed, uncompressed, untarred, or whatever is appropriate for that particular file.
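
The steps above can be sketched end to end as one small shell script. This is only an illustration under stated assumptions: bigfile.txt, the piece_ base name, pieces.sum, rebuilt.txt, and the scratch directory are made-up names; the 13600-line file is generated test data; both ends of the "transfer" run on the same machine; and cksum is substituted as a fallback where openssl is not installed.

```shell
#!/bin/sh
# Round-trip sketch: split, checksum, re-check, reassemble, compare.
set -e

work_dir=$(mktemp -d)     # hypothetical scratch directory (not cleaned up)
cd "$work_dir"

# Stand-in for the large file: 13600 numbered lines (made-up data).
awk 'BEGIN { for (i = 1; i <= 13600; i++) print i }' > bigfile.txt

# Steps 1-2: count the lines, aim for about 12 pieces, round down to a
# multiple of 100 (13600 / 12 = 1133, rounded down to 1100).
total_lines=$(wc -l < bigfile.txt)
lines_per_piece=$(( total_lines / 12 / 100 * 100 ))

# Step 3: split into piece_aa, piece_ab, ...
split -l "$lines_per_piece" bigfile.txt piece_

# Step 4: checksum every piece. Assumption: fall back to cksum if
# openssl is not available on this machine.
if command -v openssl >/dev/null 2>&1; then
    sum_cmd="openssl sha1"
else
    sum_cmd="cksum"
fi
$sum_cmd piece_* > pieces.sum

# Step 7: the receiving side repeats the checksums and compares.
$sum_cmd piece_* > pieces.sum.local
diff pieces.sum pieces.sum.local

# Step 8: reassemble. The shell expands piece_* in sorted order, which
# matches the order in which split wrote the pieces.
cat piece_* > rebuilt.txt

# Confirm the rebuilt file is byte-for-byte identical to the original.
cmp bigfile.txt rebuilt.txt

# Count the pieces without parsing ls output.
set -- piece_*
piece_count=$#
echo "lines per piece: $lines_per_piece, pieces: $piece_count"
```

For the sample data this reproduces the arithmetic from step 2: twelve pieces of 1100 lines plus a 13th piece of 400 lines. In a real transfer, steps 4 and 7 would of course run on different machines, with pieces.sum travelling alongside the pieces.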
I know it seems like a lot of steps, but it really works and is sometimes the only way to transfer something!
David Friedlander
28 Aug 2001 (original), small updates in April 2020, May 2021.