If you have a Disaster Recovery site (cloudy or otherwise), and your DR plan involves changing public addresses when you "declare", you might want to consider automating your DNS changes.
Why would you do this?
Long story short, the last step of most DR plans is "update the external DNS records". Assuming your firewall rules are up to date at the DR site - is that list of DNS changes also up to date?
Automating these DNS changes can take errors off the table (again, assuming that the list of changes is up to date).
"Can I even automate that?" you ask. Yup - most of the larger DNS providers give you an API, and you can script changes with PowerShell, Python or even curl in a shell script.
Since their API examples are so easy to implement in curl, let's go with that. I could (of course) write this in Python or PowerShell and make it a "whole thing", but the object of this example is to show how simple this can be, and to give you a decent example to build on (or just re-use) in your environment.
This script changes a set of A records (listed in an input file) from the prod IPs to the DR IPs (or back).
The dnsupd-dr.in input file moves me from prod to DR addresses. Note that each line is just the host name (the CN) followed by the IP address; the domain is buried in the script.
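To give a rough idea of how a script can consume a file in that format, here's a minimal sketch. To be clear, this is not the script from this post: the domain example.com, the endpoint under dns-api.example.invalid, the JSON body and the DNS_API_TOKEN variable are all placeholders I've made up for illustration, so swap in whatever your provider's API documentation actually specifies. It defaults to a dry run that only prints what it would change.

```shell
#!/bin/sh
# Sketch only: API_URL, the JSON body and the auth header are hypothetical
# placeholders, not any real provider's API. Check your provider's docs.
DOMAIN="${DOMAIN:-example.com}"
API_URL="${API_URL:-https://dns-api.example.invalid/v1/zones}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the planned changes, 0 = call the API

# Read "hostname ip" pairs from the file named in $1, one A record per line.
update_records() {
  # The "|| [ -n ... ]" part also handles a last line with no trailing newline.
  while read -r host ip rest || [ -n "$host" ]; do
    # Skip blank lines and comment lines.
    case "$host" in ''|\#*) continue ;; esac
    if [ -z "$ip" ]; then
      echo "skipping '$host': no IP address given" >&2
      continue
    fi
    if [ "$DRY_RUN" = "1" ]; then
      echo "would set $host.$DOMAIN A -> $ip"
    else
      # Hypothetical request shape; DNS_API_TOKEN must be set in the environment.
      curl -s -X PUT "$API_URL/$DOMAIN/records/A/$host" \
        -H "Authorization: Bearer $DNS_API_TOKEN" \
        -H "Content-Type: application/json" \
        -d "{\"data\":\"$ip\",\"ttl\":900}"
    fi
  done < "$1"
}
```

Calling `update_records dnsupd-dr.in` with the default DRY_RUN=1 just lists the planned changes, which is a cheap way to confirm the input file is still current before you ever need it in anger.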
Let's run the script, moving from prod to dr:
Gotchas? Like any DNS migration process, the key thing to do is set your TTLs appropriately BEFORE your migration. DNS is all about expiry times - the phrase "DNS propagation" is malarkey, even though it's still uttered by every DNS provider on the planet. What the TTL does is say "after being cached for xxxx seconds, expire that entry" - the instruction is for the DNS server making the request.
If your zone TTL is 7200 (2 hours), and the remote client is querying their DNS server, that entry will be cached for 7200 seconds after that server last fetched it from your zone. So if that fetch was 7199 seconds ago, the entry will expire in 1 second, and if the server fetched it just now, the client is stuck with the old address for the next 2 hours.
So if you have a business process that relies on a DNS change (like your DR process), you're going to want to keep this in mind. 2 hours is likely too long, but 5 minutes is likely too short - you don't want to be that "bad citizen" on the internet that forces everyone else to burn excessive resources on your behalf. 15 minutes (900 seconds) is a happy medium that lots of folks find reasonable - it's short enough that management can live with it, but it's not so short that you're "that company".
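The expiry arithmetic is simple enough to put in a couple of lines of shell. This is just an illustration of the math (the function name is mine): the time a cached record has left is the TTL minus the time since the caching server fetched it, and it never goes below zero.

```shell
# Seconds left before a cached record expires:
# the TTL minus the seconds since the caching server fetched it.
cache_remaining() {
  ttl="$1"
  age="$2"
  remaining=$((ttl - age))
  if [ "$remaining" -lt 0 ]; then
    remaining=0
  fi
  echo "$remaining"
}

cache_remaining 7200 7199   # prints 1
cache_remaining 7200 0      # prints 7200
```

The second case is the one to plan around: somebody's resolver always fetched your record a moment before you made the change, so the full TTL is your worst-case wait.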
So the right time to change your TTL was yesterday, or for a DR process, many years ago. The important thing is that it should be "short enough" when you pull the trigger (not after). If it's set for 86400 (1 day) or something silly, the best time to think about it is today - like planting a tree :-)
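If you want to check where you stand today, something like the sketch below works as a pre-flight test. It reads the answer lines that `dig +noall +answer` prints (name, TTL, class, type, data) and flags any A record whose TTL is over a limit. The 900 second default and the function name are my own choices for illustration, not part of the script in this post.

```shell
MAX_TTL="${MAX_TTL:-900}"   # limit in seconds; adjust to taste

# Reads dig answer lines ("name TTL class type data") on stdin, prints a
# warning for each A record whose TTL exceeds MAX_TTL, and returns non-zero
# if at least one record was flagged.
check_ttls() {
  awk -v max="$MAX_TTL" '
    $4 == "A" && $2 + 0 > max + 0 {
      printf "WARN %s TTL %d > %d\n", $1, $2, max
      bad = 1
    }
    END { exit bad + 0 }
  '
}

# Example use (host and name server names are placeholders):
#   dig +noall +answer www.example.com A @ns1.example.com | check_ttls
```

Point dig at one of your authoritative name servers as in the comment: a recursive resolver reports the time remaining on its cached copy, not the TTL configured in your zone.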
This script of course will evolve over time, and I'll likely update it for other DNS providers (as one client or another needs that) - check my GitHub for changes if you're interested - https://github.com/robvandenbrink. As always, TEST it for your organization and your situation and MODIFY IT AS NEEDED. This script is NOT meant to be a one-size-fits-all script that'll just work 100% for everyone without testing. For instance, you might choose to use CNAMEs instead of A records, or you might choose to have both sites active during PROD windows to spread load, and just delete the PROD addresses if you are in a DR situation. Or you might choose to use a GSLB (Global Server Load Balancer) with health checks instead of DNS changes to swing PROD traffic over to DR. Or if you have a different DNS provider, the API calls will of course be different.
If you find this useful, or if you have suggestions or updates to the script, by all means use our comment section - let's talk!!
Dec 17th 2021