Last Updated: 2006-08-19 13:18:26 UTC
by Swa Frantzen (Version: 1)
So let's assume we have a website with fairly static content, some feedback forms where people can inquire the status, a search option and a table in a database that needs to be published somewhat real-time on the website to spice things up a bit. We know from the past that our web traffic is only less than a 1 Mbps.
ConnectivityLet's start with the connectivity.
If we build this we'd rather set it in a place where we can will the contest should it come to a DDoS, so we'll preferably not set it in the HQ in a DMZ as we're likely to have much less bandwidth there. One of the solutions would be to outsource the hosting of our servers to a tier-1 ISP and have it at their location.
Make a contract with them that they need to help you during DoS attacks and filter the traffic away from your connection. Over-engineer the physical connection far beyond what you need for your visitors. But do not let the connection become so bug that it can overwhelm your servers. I'd suggest a 100Mbps full duplex link for modern solutions if you have traffic levels in the lower Mbits or less. This allows you to keep it simple.
At such hosting facilities they are likely to connect you on a set of redundant switches with either a IP address in a VLAN with a set of routers doing a failover protocol such as HSRP and a few other customers in the VLAN. Try to negotiate to be the only customer in that VLAN. Negotiating to be the only customer on the link and having an air-gapped switch (not a VLAN) will not work for most of us as ports in routers are really limited in numbers.
NetworkFor our switches we standardize on a single model of not so big switches from a single vendor. It must have private VLANs, ports that we can shutdown, limits on what MACs can be learned, etc. Traffic reporting needs to be available but we'll not use SNMP v.1/2. We'll manage the switches as much as possible out of band over the consoles. See also the Tip of the Day on switches.
Server hardwareFor server hardware we're going to standardize on a single model of hardware. We'd like it not to have an Intel CPU as the hackers have way too many exploits ready for it for comfort. moreover the bad guys seem to know hat CPU's architecture much better than the defenders so we'll skip on that if possible. Unfortunately that means we're limited in choices so we might need to concede on this point a bit. See also the Tip of the Day on diversity.
We want machines that are fully remote manageable. With a console we can get to easily form far away. Easy to swap hard-drives are a requirement. See further.
We want hardware based raid solutions such as mirrorring (raid 1), that's fully supported by our OS of choice.
Server OSWe want a well tested OS on the security side, developed by a small set of people who really get security and put security above usability, speed or anything else. We'd like the source code and the implementations to be vetted regularly. So we'll go for OpenBSD. There really is nothing else in the same league.
This further limits the hardware choices above as current versions of OpenBSD don't like "binary" blobs to be inserted in the OS by vendors of hardware, so we'll need to mix and match a bit to get out platform together.
On our productions servers we'll install the bare minimum of the OS, e.g. no X11, no compilers. So we'll need a machine back in the office that does have at least that compiler and we'd like a test-bed to test new versions and be able to enhance our contraption while the previous version is out there.
Web serverWell once we chose OpenBSD we're left with Apache that even comes in a chroot-ed jail on OpenBSD. But we're going for extremes here, it's our job and reputation that's on the line so we're building 2 machines:
- www.example.com will do static content only.
- cgi.example.com will do the form feedback only.
So we'll recompile apache from source and we'll remove all that we do not need in the source and create two binaries out of it:
- The one for www.example.com needs only to be able to display static content. It needs not to be able to display directory content, have a cgi interface or anything lile it. We do want to increase the number of possible processes that can be forked as we'd like to win a DoS attempt or two.
- The one for cgi.example.com needs to have a bit more abilities like doing cgi.
FilteringWe will use pf (packet filter) of OpenBSD, it's extremely powerful in what it can filter and write filters that allow the bare minimum our servers need to do. Future Tips of the Day might expand a bit on the ideas needed to get this working very well. Stay tuned.
And the database?Ah, yes the database link containing items to be displayed and update in near real time. We really do not want to expose our database. Nor do we want to -should something happen- on the webserver to allow them any connectivity to our database as that's a welcome mat for intruders.
So how do we solve it? We put up one of our machines where it can reach the database server, let it run the queries and generate html out of it in a static fashion and then keep the initiative and send the data over a management connection to the static webserver. Repeating this process every so often as desired and we have our content on the static website where it's best protected without exposing the database server in any way.
Should in a future update (yes they happen!) there be a need to have some form of feedback towards the database, we can use that same machine, let the cgo.example.com machine collect the feedback, fetch it over the management connection, scrutinize it again, and then insert it in the database. Keeping the initiative on the safe side is the critical part in making it much harder to attack it. Scrutinizing any and every bit of data and treating it as tainted till proven otherwise is the second critical part. And the final part is to create software like this in a right form the first time try. It's like building bridges, not in the typical trial and error fashion.
Management connectionsWe need a way to connect back to the managment of the server that are hosted out there.
We'll have a small set of trusted machines in our organization that are allowed to get to the machines and use ssh to get there. It's important to make sure the ssh ports aren't exposed (while at it, please do not run them on port 22 or 2222 or something predicatble like that) and to make sure the endpoints are well protected. The encryption only protects data while in transit! See also the Tip of the Day: using ssh keys.
We will add at the remote location a management network to connect to the servers out of band. We can also use this network for backup proposes. And we add a terminal server to it that connects to all the serial consoles of all network equipments and servers we have there.
Emergencies prepared.In an emergency we'd like to be able to put up a "sorry we're closed, will be back soon" website and be able to pull the original one off-line for further incident handling. One of the low-cost ways is to have a hard-disk ready and swap it in the server, another is to have spare server sitting ready to take over (this is better as you can update the server with patches). The "website" might be made not using apache. One of the reasons you failed might be that apache got a security problem. Alternate ways to hand out html are possible, so let's be unpredictable and e.g. use netcat (nc) to hand out content.
Logbooks have been discussed in a previous Tip of the Day, we're going to be religious about using them.
Fast recovery in case of hardware failures or other incidents is something we need.
Disaster Recovery is something we need to prepare and perhaps have contracts for.
Backups is something we need to prepare.
RedundancyAdding redundancy adds a lot of complexity to this kind of solution. We can do it but there are risks. OpenBSD has some features to do it, and you could buy off the shelf solution for it. The problem remains the complexity it introduces.
If it's acceptable to have a manual failover I'd strongly suggest to keep offline machine and swap them manually if something does goes wrong. It's much more KISS, and that's just one of those plain good engineering principles.
Having only one type of server and only one type of switch etc. allows us to minimize the support contracts, while allowing for a spare device ready to take over any of it's failed cousins in minutes.
Swa Frantzen -- Section 66