Saturday, 23 February 2013

Setting up an FTP server on AWS

Recently, for testing some code, I had to host an FTP server. I tried it on my local machine first. It was easy: I just had to follow the Arch wiki for vsftpd. File transfers in both directions were working, so I decided to try it on an Amazon instance too.

My local machine ran Arch Linux, while the Amazon instance ran Fedora 8. After looking up the details of Fedora's package manager, and with some help from a friend, I installed vsftpd on it, applied the same config and started the FTP service. When we tested it, transfers worked from the command line but not from our code. From the command line we had been using active mode, while the code used passive mode, so we looked into the config to check the settings related to passive mode.

It turned out that passive mode is enabled by default. However, going through the various options we found one called pasv_address. From prior experience I knew that AWS machines have a private LAN IP and a separate public IP, and that the OS on the cloud instance is not aware of the public IP it is being served on. So we suspected that in its response the server was asking the client to connect to the private LAN IP, which would obviously fail. We set the pasv_address option to the public IP of the instance, and passive mode started working: we could connect and transfer files.

So we decided to use the server for testing our code. However, when we did, our application tried to post files and failed every time. The error each time said '500: Invalid Port command'.

The FTP protocol really goes funky with ports. It uses separate connections for control and data, and how the data connection is set up depends on the mode of operation. In active mode, the client tells the server which address and port it is listening on (the PORT command) and the server initiates the data connection to the client. In passive mode, the server picks a port, announces it in its reply to the PASV command, and the client initiates the data connection to that port. We were using the passive mode of operation, and the port the server had picked for the data connection later turned out to be blocked. To debug the situation, we connected to the FTP server with the command line utility 'ftp' using the following command.

ftp <IP address of FTP server>

To turn on passive mode and debug mode, use the commands 'passive' and 'debug' respectively. However, these only set options on the client without sending any control data to the server. To test the FTP service, try a command that actually sends control data. We went with 'ls', which sent the following FTP commands in sequence.

PASV
LIST
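The same check can be scripted. The following is a minimal sketch using Python's standard ftplib module, not the code we were actually testing; the function name, the anonymous login and the 10 second timeout are assumptions made for illustration.

```python
from ftplib import FTP


def list_remote_directory(host, user="anonymous", password="", passive=True):
    """Log in to an FTP server, issue LIST and return the listing lines.

    The function name and default credentials are hypothetical; replace
    them with whatever your server expects.
    """
    lines = []
    with FTP(host, timeout=10) as ftp:   # connects to the control port (21)
        ftp.set_debuglevel(1)            # print control commands and replies
        ftp.login(user, password)
        ftp.set_pasv(passive)            # True: passive mode, False: active mode
        ftp.retrlines("LIST", lines.append)
    return lines
```

Calling list_remote_directory with the server's IP address prints the control traffic and returns the directory listing. Passing passive=False makes ftplib use active mode instead, which is handy for comparing the two modes.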


The server's reply to the PASV command [1] contains a line like the following, indicating the address and port on which the data transfer will happen.

Entering Passive Mode (1,2,3,4,224,186)

The first four numbers are the IP address. The port has to be calculated from the last two numbers, n1 and n2, using the following formula.

n1 x 256 + n2
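The formula is easy to check with a few lines of Python. This is only a sketch: the function name parse_pasv_reply is made up for illustration, and it assumes the reply contains the six comma-separated numbers in parentheses, as in the line shown earlier.

```python
import re


def parse_pasv_reply(reply):
    """Extract (host, port) from a PASV reply such as
    'Entering Passive Mode (1,2,3,4,224,186)'."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError("no (h1,h2,h3,h4,n1,n2) group in reply: %r" % reply)
    numbers = [int(group) for group in match.groups()]
    if any(number > 255 for number in numbers):
        raise ValueError("every field must fit in one byte: %r" % reply)
    host = ".".join(str(number) for number in numbers[:4])
    port = numbers[4] * 256 + numbers[5]   # n1 x 256 + n2
    return host, port


host, port = parse_pasv_reply("Entering Passive Mode (1,2,3,4,224,186)")
print(host, port)  # 1.2.3.4 57530
```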

For the reply shown above, the port is 224 x 256 + 186 = 57530. Once we knew that the problem was that the port the FTP server had picked for the data connection was blocked, we decided to configure the server to pick its passive mode ports from within the open port range. This can be done by setting the pasv_min_port and pasv_max_port options in vsftpd.conf. Once the server was offering ports within the open range, the transfers worked fine.
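To make the resulting configuration concrete, here is a small Python sketch that prints the passive mode lines for vsftpd.conf. The option names are the ones discussed in this post; the helper name, the IP address 203.0.113.10 and the range 50000 to 50100 are placeholders, not the values we used, and the lower bound of 1024 is my own assumption to stay out of the privileged port range.

```python
def render_passive_settings(public_ip, min_port, max_port):
    """Return the passive mode lines for vsftpd.conf as one string.

    Hypothetical helper: the range should match whatever ports are
    actually open between the client and the server.
    """
    # Assumption: keep the range above the privileged ports.
    if not (1024 <= min_port <= max_port <= 65535):
        raise ValueError("expected 1024 <= min_port <= max_port <= 65535")
    return "\n".join([
        "pasv_enable=YES",                # passive mode (already the default)
        "pasv_address=%s" % public_ip,    # public IP announced to clients
        "pasv_min_port=%d" % min_port,    # lowest port offered in PASV replies
        "pasv_max_port=%d" % max_port,    # highest port offered in PASV replies
    ])


print(render_passive_settings("203.0.113.10", 50000, 50100))
```

The printed lines can be pasted into vsftpd.conf, after which the service needs a restart for them to take effect.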

[1] A reference of FTP commands.
