
Why FTP sucks

First of all I want to make clear that I have always liked FTP. I still prefer it over uploading my files through a web interface, but I have now found quite a big reason why FTP really, really sucks.

And here it is: FTP has clearly not been designed for uploading 7000 files.
Right now I am uploading a Joomla website for a client. The package consists of around 7000 small files, and it has been uploading for around 4 hours so far.

The problem is not really that FTP uses two separate connections, which in itself already causes some problems (although those are quite manageable), but how the second one, the data connection, is used.
For every single file the data connection is opened, and once the file is done it is closed again.

Now that doesn't really matter if you transfer only a few files, or big files. But it does matter when you transfer lots of tiny files, because then opening and closing the connection may take just as long as, or even longer than, the actual transfer.

Looking at my transfer I see a lot of this:

Quote:

templates/beez/html/com_poll/index.html: 44.00 B 141.87 B/s
templates/beez/html/com_poll/poll/index.html: 44.00 B 149.99 B/s
templates/beez/html/mod_newsflash/index.html: 44.00 B 143.82 B/s
templates/beez/html/mod_search/index.html: 44.00 B 151.15 B/s
templates/beez/html/com_user/index.html: 44.00 B 143.46 B/s
templates/beez/html/com_user/remind/index.html: 44.00 B 156.59 B/s
templates/beez/html/com_user/login/index.html: 44.00 B 139.53 B/s
templates/beez/html/com_user/register/index.html: 44.00 B 155.67 B/s
templates/beez/html/com_user/user/index.html: 44.00 B 130.48 B/s
templates/beez/html/com_user/reset/index.html: 44.00 B 140.22 B/s
templates/beez/html/com_newsfeeds/index.html: 44.00 B 137.64 B/s

Lots of files that are only 44 bytes in size! Considering that the MTU is a lot higher, the whole file easily fits into a single packet. And that is quite an understatement, considering that PPPoE has an MTU of roughly 1500 bytes (1492, to be exact).

Now think about it:
Connecting to the server: 3 packets (three-way handshake)
Transmission: 1 packet
Closing the connection: at least 3 packets (the FIN/ACK exchange, usually 4)

Instead of just 1 packet, 7 or more need to be sent. That's an overhead of 600% or more, and it doesn't even count the commands and replies that cross the control connection for every single file!

Now how could this overhead be avoided? Easy: stream all the files in one go and let the server (or the client, when downloading) handle the splitting.
This could be done in two steps: transfer the file info first (filenames, file sizes and whatever other metadata you may want to send), and then send all the files.
That way the connection doesn't need to be opened and closed thousands of times, which would speed up transfers of many files a lot.
It probably wouldn't have much of an impact on the upload of big files, or of just a few files, but seeing how open-source packages like Joomla, Zen Cart, ... are getting more and more popular, and seeing how they consist of thousands of files, this could actually help people quite a bit.
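
Incidentally, the venerable tar format already works exactly like that: every file in the archive is preceded by a small header carrying its name, size and permissions, and the whole thing is written out as one continuous stream. A rough way to fake the "stream everything" idea today (the site/ directory is of course made up):

# pack per-file metadata and file data for the whole site into one continuous stream
tar cf site.tar site/
# only a single file then has to cross the FTP data connection:
# one connection setup and teardown instead of 7000

The catch, as the comments below show, is that something on the server still has to unpack that stream.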

FTP is old, and a technology adequate for today's file transfers needs to be found.
It may be SCP; I would need to check whether SCP uses only one connection for all the files, but I would say it does (see below). Another obvious advantage of SCP would of course be encryption.
But maybe we just need something completely new.
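
For instance, a recursive copy over a single SSH connection would look something like this (host and paths are placeholders):

scp -r site/ user@host:/var/www/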

Either way, the biggest problem would probably be getting hosters to offer it. Seeing how old SSH, and thus SCP, is, and how few hosters offer it, it's unlikely that we'll get rid of FTP any time soon, no matter what it's being replaced with.

Comments

Re: Why FTP sucks

 

FTP sure does have its limitations. If you check out FileCatalyst's webpage (http://www.filecatalyst.com - not a free solution on the server side, so Nuxified readers probably won't find any downloads of interest) you'll see a sidebar option for "FTP vs. FileCatalyst" which pulls up a comparison chart. In the "FTP" rows, the chart illustrates the overhead of FTP, including what you've already discovered plus other factors that make FTP slow. Naturally, the higher the latency and packet loss (packet loss is currently a static value for the comparison tool, as per the note below the chart), the more drastic the difference is. Higher link speeds really compound the problem (counter-intuitively... you'd think investing in a faster link would get you arithmetically faster speeds).

In terms of FTP's drawbacks, it's not even the number of packets that's a problem (though in your case, for the number of small files you're delivering it might actually end up being the biggest slowdown for you) but rather the TCP window. In addition to the problem you described, if you have any sort of latency/packet loss on the line, the TCP window starts to close up... and you need some pretty consistently good periods of transfer before it opens up again!

In short, we totally agree with you about FTP.

For your particular problem, you could do a monolithic zip (with or without compression) if you had a way to unpack it on the destination side. You should be able to do that with some sort of PHP or other server-side script even if you don't have shell access. At least there wouldn't be a setup/teardown for each of those 7000 files!
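
Something along these lines, assuming the site lives in a local site/ directory and the host gives you some way to extract the archive on the other end:

# one archive instead of 7000 files; use zip -r -0 to skip compression
zip -r site.zip site/
# upload site.zip over FTP, then extract it server-side, e.g. via the
# control panel's file manager or a small unpack script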

Greg

Re: Why FTP sucks

I wonder if the same problems apply to SFTP (the SSH file transfer protocol)?

Btw Greg, Nuxified is more about "free" in the sense of free to share, modify, etc., i.e. Open Source. But yeah, it looks like FileCatalyst isn't, although I don't oppose anyone using it if they choose.

Btw, I put this on a Networking Reddit.

Re: Why FTP sucks

 

Tarball it first!

Re: Why FTP sucks

 

Same as above

Re: Why FTP sucks

 

I was going to say it...but it has already been said: tar the files first!

Re: Why FTP sucks

 

rsync -axzSP -e ssh /pwd/from user@host:"/pwd/to/"

The sparse option -S does its magic on sparse files.
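
For reference: -a is archive mode (recursive, preserving permissions and timestamps), -x stays on one filesystem, -z compresses data in transit, and -P shows progress and keeps partial transfers. Everything travels over a single SSH connection, so there is no per-file connection setup and teardown.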

Re: Why FTP sucks

 

The better question is: why didn't you leave it compressed? Three others have said it so far, and I agree that it should be kept together in at least some form of compression. Any half-decent host offers at the very least a web client that will extract the archive for you, and many have SSH available on top of that. The other benefit of keeping it compressed is that you're transferring one reasonably sized file as opposed to hundreds or thousands of smaller files that add up to a much larger amount of data. I think Joomla is something like 4 or 5 MB as a tar.gz, whereas the last time I had it extracted on my local system it was closer to 14 MB or so. Why upload 3x the amount of data in 7000x the file count? Not that I'm disagreeing by any means; FTP can suck. In general, though, you can do without uploading 7000 small files, and can instead upload one and spend an extra minute extracting it.

Re: Why FTP sucks

 

Many hosting companies do not allow shell access, so keeping it tarred and zipped is not an option. PHP as a means of untarring and unzipping is a pain. Thus the only option is to transfer it via FTP, unzipped.

Re: Why FTP sucks

 

If the host happens to have Python installed and configured for your use, unzipping or untarring would be like 3 lines of code.
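
For example, something along these lines, with a made-up archive name, run however the host lets you execute commands:

# extract a zip archive into the current directory
python -c "import zipfile; zipfile.ZipFile('site.zip').extractall('.')"
# or a tarball
python -c "import tarfile; tarfile.open('site.tar.gz').extractall('.')"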

Re: Why FTP sucks

 

Do something along these lines:

ls | sed 's/^/put /' | ftp .....
Maybe add a cd templates/beez/html/ before the put commands.
Or use mput.

Figuring it out takes less time than waiting for the transfer to complete.

P.S.: Has anybody ever used xargs with ftp?

Re: Why FTP sucks

 

As said, not all hosters support shell access. Also, sometimes memory or time limits may get in the way of unpacking a file with a PHP script.

@The last anonymous user: The problem isn't uploading a bunch of files; NcFTP takes care of that nicely by allowing recursive uploads (roughly as shown below).

Also, if you guys don't have anything to say other than "see above", better not to waste everybody's bandwidth by posting just that. ;-)
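
For reference, the recursive upload looks roughly like this (host, user and paths are placeholders):

# -R uploads the whole directory tree over one control connection,
# but FTP still opens a separate data connection for every file
ncftpput -R -u user -p password ftp.example.com /remote/dir site/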

Re: Why FTP sucks

 

No, many hosts don't offer shell access, but just about ALL of them offer a web-based file manager in their control panels, and just about every one that I've ever seen or used includes the ability to unzip a file from said file manager. At least one host I've used actually recommends zipping large uploads like that specifically as a better alternative to FTPing 8000 files.

So it's not all about shell access. Or PHP scripts. Or Python scripts. Or anything other than the tools your host already provides for you in the control panel. The difference being, they're not always obvious solutions compared to the standard methods we're all used to.

Re: Why FTP sucks

 

ssh/scp supports persistent connections/tunnels to servers, which certainly gets around any open/close issues.

Re: Why FTP sucks

Aye, FTP is old, and slow, and it assumes that there is no NAT (okay, that's not actually true any more). The only reason you'd use FTP is that it's universally supported. Every decent file manager speaks FTP. Every web host speaks FTP. SFTP is obviously a better choice, which I do use if possible, but FTP is still the norm. If you have full SSH access, you do, of course, have an enormous amount of options open to you. You can use SFTP/SCP. You can rsync. You can do something in the style of tar c(...) | ssh (...) tar x(...). Or, if you don't like the encryption overhead, ssh in on a separate terminal and use tar/netcat.
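
Those last two could look roughly like this; host names and paths are made up, and depending on your netcat flavour the listener may be nc -l 9999 instead of nc -l -p 9999:

# tar over ssh: one encrypted stream, no per-file connection setup
tar cf - site/ | ssh user@host 'tar xf - -C /var/www'

# tar over netcat: skips the encryption overhead
ssh user@host 'nc -l -p 9999 | tar xf - -C /var/www'   # terminal 1: listener on the server
tar cf - site/ | nc host 9999                          # terminal 2: push the stream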
