Anybody who has worked with AWS has had to deal with the arbitrary hostnames that AWS generates for each instance. These names are difficult to remember and say little about what tasks a particular node is assigned to.
The classic solution to this problem is to use a DNS server, but setting up and maintaining a DNS system can be a pain and, when you are a fast-moving startup, it’s a lot easier to spend your time maintaining only the bare necessities. Of course you’ll eventually need a proper setup, but there are plenty of workarounds that do the job until you’re big enough to have time to refine your infrastructure.
Setting up a private DNS is not a simple task. Using Amazon Route53 could be an option, but it doesn’t treat private IP addresses any differently, so unless you want to leak your private addresses to the outside world, you’ll have to set up your own DNS. On top of that, although there are many cheap private DNS services, nothing is cheaper than free.
The usual way to get by is to add entries to the /etc/hosts file, but a great advantage of the cloud is that instances come and go as you scale up and down. Changing every /etc/hosts file by hand, or pre-configuring them, is tedious and time consuming, and worst of all it’s hard to update quickly when something fails.
We’ve tried to work around the immediate necessity for a DNS server while still obtaining all of its benefits by using tags in AWS. The system works in two parts. On one side, you set a tag like dns: HOSTNAME on every machine that you want to reach with a simple hostname. On the other side, you need a simple script that builds an /etc/hosts file from the information obtained from the EC2 Tags service.
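As a rough illustration of the tag-reading half, here is a minimal Python sketch. It is not the script we run. It assumes the instance data arrives in the shape returned by the EC2 describe-instances call (Reservations, then Instances, each with a Tags list), and the function name, the record fields, and the sample data are all made up for the example.

```python
from typing import Dict, List


def records_from_response(response: dict, region: str,
                          tag_key: str = "dns") -> List[Dict[str, str]]:
    """Collect one record per instance that carries the `tag_key` tag.

    `response` is assumed to follow the describe-instances layout;
    `region` is whichever region the response was fetched from.
    """
    records = []
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            name = tags.get(tag_key)
            if not name:
                continue  # instances without the tag get no hosts entry
            records.append({
                "name": name,
                "region": region,
                "private_ip": instance.get("PrivateIpAddress", ""),
                "public_ip": instance.get("PublicIpAddress", ""),
            })
    return records


# Hand-written sample in the assumed response shape (not real data).
sample = {
    "Reservations": [{
        "Instances": [
            {"PrivateIpAddress": "10.0.0.5", "PublicIpAddress": "203.0.113.5",
             "Tags": [{"Key": "dns", "Value": "web1"}]},
            {"PrivateIpAddress": "10.0.0.6",
             "Tags": [{"Key": "Name", "Value": "scratch"}]},
        ],
    }],
}
records = records_from_response(sample, region="us-east-1")
print(records)
```

Only the instance tagged dns: web1 produces a record. The untagged one is skipped, which is what lets you opt machines in one at a time.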
This simple script will create and upload to S3 multiple versions of your /etc/hosts file to accommodate all the different regions that your infrastructure could be running in. Each regional hosts file contains private addresses for local instances and public addresses for remote ones, but in every case each instance is reachable through the same simple name.
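The per-region logic can be sketched in a few lines of Python. This is an illustration under assumptions, not our actual script: the record fields (name, region, private_ip, public_ip), the function names, and the "public" key are hypothetical, the addresses are documentation-range placeholders, and the upload to S3 is left out entirely.

```python
from typing import Dict, Iterable, Optional


def build_hosts(records: Iterable[Dict[str, str]],
                local_region: Optional[str]) -> str:
    """Render hosts-file lines as seen from `local_region`.

    Instances in the same region are listed with their private address,
    all others with their public one. Passing None matches no region,
    so every instance is listed with its public address.
    """
    lines = []
    for rec in sorted(records, key=lambda r: r["name"]):
        same_region = rec["region"] == local_region
        ip = rec["private_ip"] if same_region else rec["public_ip"]
        if ip:  # skip instances with no usable address
            lines.append(f"{ip}\t{rec['name']}")
    return "\n".join(lines) + "\n"


def build_all(records: Iterable[Dict[str, str]]) -> Dict[str, str]:
    """Return one hosts body per region, plus a fully public one."""
    records = list(records)
    regions = sorted({r["region"] for r in records})
    files = {region: build_hosts(records, region) for region in regions}
    files["public"] = build_hosts(records, None)
    return files


# Made-up records for two regions.
records = [
    {"name": "web1", "region": "us-east-1",
     "private_ip": "10.0.0.5", "public_ip": "203.0.113.5"},
    {"name": "db1", "region": "us-west-2",
     "private_ip": "10.1.0.7", "public_ip": "198.51.100.7"},
]
files = build_all(records)
print(files["us-east-1"])
```

In the us-east-1 file, web1 resolves to its private address and db1 to its public one. The us-west-2 file is the mirror image, and both use the same two names.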
At this point all of your instances can just download the script from S3 and execute it to obtain a proper and up-to-date version of the /etc/hosts file.
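One detail worth thinking about on the instance side is how to apply the downloaded entries without clobbering local ones such as localhost. The Python sketch below shows one possible approach, assuming the generated entries are kept between two marker comments. The markers and the function name are hypothetical, and fetching from S3 and writing to /etc/hosts are deliberately left out.

```python
# Hypothetical markers delimiting the generated part of the hosts file.
BEGIN = "# BEGIN ec2-tag-hosts (generated, do not edit)"
END = "# END ec2-tag-hosts"


def merge_hosts(current: str, generated: str) -> str:
    """Return `current` with its managed block replaced by `generated`.

    Lines outside the markers are kept as they are. If there is no
    managed block yet, one is appended at the end.
    """
    kept = []
    inside = False
    for line in current.splitlines():
        if line.strip() == BEGIN:
            inside = True
        elif line.strip() == END:
            inside = False
        elif not inside:
            kept.append(line)
    block = [BEGIN] + generated.strip().splitlines() + [END]
    return "\n".join(kept + block) + "\n"


current = "127.0.0.1\tlocalhost\n"
merged = merge_hosts(current, "10.0.0.5\tweb1\n")
# Running it again with fresh data replaces only the managed block.
refreshed = merge_hosts(merged, "10.0.0.9\tweb1\n")
print(refreshed)
```

Because the merge is idempotent, it can run from cron as often as you like. A real version would write the result to a temporary file and rename it over /etc/hosts so readers never see a half-written file.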
In addition, to simplify development, there is also a fully public version of the hosts file. By loading this version on your computer, you can access any machine in your cluster without needing to look up its AWS hostname.
Of course there are a number of alternative ways to solve this issue. By using this trick at AdRoll, we manage our current installation of more than one hundred instances without many headaches.
Soon we’ll need to look into a proper DNS server but until then…