THE SITUATION
Linux systems have a sneaking tendency to become more useful over time. That is, a machine initially set up as a firewall in one's basement will have services added as time goes by. My firewall machine, gerd, is no exception. What was originally a dedicated firewall had first a webserver (Apache) added, then a mail server, then many other services.
Much as the shoemaker's wife tends to have no shoes, the systems that network administrators run at home can be disorganized and poorly documented. While I rigidly document everything at work, and keep that documentation up to date, I am not so rigorous at home. It is easy to make "just one little change," and never document it anywhere.
In the space of eighteen months, my firewall had become a critical server--and one without a coherent list of services or versions on it. Worse, the version of Mandrake which it ran was deprecated. Thus, I could expect no security patches from Mandrake--I'd have to keep track of all the software versions myself.
The servers which did run on the system were not up to date. System libraries were aging rapidly, preventing me from installing various pieces of software, such as the latest gnucash.
There was only one thing to do: upgrade my server.
A more foolish^h^h^h^h^h^h^h brave individual might have simply thrown the OS CDs in, rebooted and had at it. But this machine was now my central fileserver, webserver and mailserver. Any email outages would not be tolerated by my most important user: my wife. I needed assurance that everything would go smoothly. I needed a plan.
THE RATIONALE
I'll be honest. I've been reading The Practice of System and Network Administration, by Thomas Limoncelli and Christine Hogan. I cannot recommend this book enough. It has chapters on how to actually be a good network/sysadmin. I've learned a lot of things the hard way, but this book taught me volumes. It's the best book I've ever read on network and system administration. Go buy it and read it now.
THE PLAN
Any good operation begins with good information. I needed a complete list of all services on my machine. I knew most of them off the top of my head, and the rest were divined by reviewing my firewall scripts. Why review those scripts? To make sure I knew all the ports that were open. Once I had a list of open ports, I could check what software was running on each one. From the list of running software, I examined configuration files to determine the exact status of each server. This was revealing: I hadn't realized I was actually receiving email for seven domains. I knew I had at least four, but the other three had been thrown in as quick hacks to assist friends or to make sure that all forwarded email would actually be received properly.
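The port-gathering step can be partly automated. What follows is a minimal sketch, not the script actually used here: it assumes the firewall script is made of iptables-style rules that name ports with --dport, and the helper name open_ports and the sample rules are made up for illustration. It ignores multiport rules and anything built from shell variables, so treat its output as a starting point for the manual review.

```python
import re

# Hypothetical helper: matches "--dport 25" or a range like "--dport 6000:6002".
PORT_RE = re.compile(r"--(?:dport|destination-port)\s+(\d+)(?::(\d+))?")


def open_ports(script_text):
    """Return the sorted ports named in ACCEPT rules of a firewall script."""
    ports = set()
    for line in script_text.splitlines():
        line = line.split("#", 1)[0]  # drop shell comments
        if "ACCEPT" not in line:
            continue  # only rules that let traffic in are of interest
        for low, high in PORT_RE.findall(line):
            start = int(low)
            end = int(high) if high else start
            ports.update(range(start, end + 1))
    return sorted(ports)


# Made-up sample rules, not taken from the real firewall script.
SAMPLE = """\
iptables -A INPUT -p tcp --dport 25 -j ACCEPT
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 6000:6002 -j DROP
# iptables -A INPUT -p tcp --dport 110 -j ACCEPT
"""

if __name__ == "__main__":
    print(open_ports(SAMPLE))
```

Each port in the result is then a prompt to ask "what is listening here, and where is its configuration?"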
Careful examination of my webserver configuration reminded me that I needed to preserve the mysql installation. I run geeklog for the Linux Users Group at the school where I teach. My short list of services looked like the first section of Appendix A.
From that list of services, I took a careful look at what files really needed to be preserved. If you look at Appendix A, you'll see the FILES WHICH MUST BE SAVED section. This was my best estimate of all the files I needed to preserve in order to keep things working. You'll notice that I actually recorded some hardware bindings in the firewall section, to ensure that I wouldn't need to change my firewall script. Note that many of the services require common files to be saved--those are recorded in the [0] section of the upgrade plan.
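A list of files to save is only useful if every entry on it really exists, and if symbolic links are noted so they can be recreated afterwards. The sketch below is one possible sanity check, not part of the original plan: the function name audit_manifest and the example paths are hypothetical, and the real list would come from Appendix A.

```python
import os


def audit_manifest(paths):
    """Split a save list into missing paths and symlinks (with their targets)."""
    missing = []
    links = {}
    for path in paths:
        if os.path.islink(path):
            # Record where the link points so it can be recreated later.
            links[path] = os.readlink(path)
        elif not os.path.exists(path):
            missing.append(path)
    return missing, links


if __name__ == "__main__":
    # Hypothetical save list, for illustration only.
    manifest = ["/etc/passwd", "/etc/hosts"]
    missing, links = audit_manifest(manifest)
    for path in missing:
        print("MISSING:", path)
    for path, target in links.items():
        print("LINK:", path, "->", target)
```

Running something like this before the maintenance window turns typos in the plan into a short list of paths to fix, rather than surprises after the reinstall.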
The next section is really freeform notes. In it, I list off some of the files which do not need to be saved, which directories will be backed up wholesale to another machine, and other caveats. I also threw in my filesystem map, just in case I would need it later.
Then, the plan section begins, listing off all the steps I would follow. Note that I have hardware changes to be incorporated as well. I decided to yank the LS120 drive and replace it with a 10GB Maxtor drive which I'd scavenged from a defunct machine. I'd also decided to add in an 8x8x32 CDRW, to work alongside the original 4x2x8 Mitsumi. Toward the end of the plan I have my verification procedure. Note that I have no backout plan. This is an all-or-nothing upgrade. I don't have a tape drive, so this had to work the first time.
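Part of a verification procedure can be scripted. The following is a minimal sketch and not the procedure from the plan: it only checks that something accepts a TCP connection on each port. The function name check_services and the port list are assumptions chosen for illustration.

```python
import socket


def check_services(host, ports, timeout=3.0):
    """Map each TCP port to True if a connection succeeds, else False."""
    results = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:  # refused, unreachable, or timed out
            results[port] = False
    return results


if __name__ == "__main__":
    # Hypothetical port list: SMTP, HTTP, POP3, MySQL.
    for port, ok in check_services("localhost", [25, 80, 110, 3306]).items():
        print(port, "ok" if ok else "NOT ANSWERING")
```

An answering port only shows that a daemon is up. It says nothing about whether mail is delivered or a web application renders correctly, so a check like this complements a manual walk through each service rather than replacing it.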
I set aside 10/15/2003 (the Wednesday before MEA weekend) as my upgrade date. Angela was scheduled to drop off the baby at her parents' while she went off to teach some lessons, so I'd have almost four hours of uninterrupted time. I'd start as soon as I got home: at 5:30 PM.
I "completed" the upgrade at 8:30 PM. That was the point at which I had found virtually everything wrong with the plan and fixed it. All but PHP and grepmail were up and running--and I didn't realize I'd missed those.
REALITY BITES
Any plan is less than complete. It's the nature of reality. One cannot conceive of all possibilities, not to mention the fact that humans make mistakes. My plan went quite well. I'll give myself a B. Here's what went wrong:
- I missed the POP3 service, because I don't use it. I know why I missed it: it wasn't listed in the firewall scripts, because only internal users (my wife) use it. I don't block internal users from any ports on my machine.
- Mandrake should NEVER be installed with anything more secure than the 'standard' level of security. Anything more than that, and it's absolutely unusable--at least for me. I'm not as familiar with Mandrake as I should be, apparently.
- I violated the plan. Halfway in, I decided to try to switch to postfix. Not a good idea. Stay with known quantities. Yes, sendmail is difficult to use and overly complex. But I have a working config for it, and the middle of a maintenance window is no time to be trying to learn new software. I wasted 30 minutes on this.
- Mandrake 9.1 ships with a secure kernel which does not allow setuid programs to run. This broke majordomo. I had to revert to the non-secure kernel.
- I neglected to list awstats on my critical software list. In doing so, I lost a year of webserver logs.
- I didn't back up my SSL certificates. They're only self-signed, so this isn't the end of the world.
- I didn't back up spamassassin. Big mistake.
- Majordomo failed because I didn't back up the link from /etc/smrsh to majordomo/wrapper. Whoops.
- The Mandrake iptables service completely prevented my homebrew script from running. I haven't looked into why, but it was fixed with a simple: chkconfig iptables off
- I missed PHP. This wasn't noticed until a week after the upgrade, when geeklog started misbehaving in interesting ways. Certain pages didn't work. Fixing this required quite a bit of compilation.
- I missed /etc/sysconfig on the initial plan, but did remember to back it up.
- I missed grepmail, which is an insanely handy little utility. This wasn't noticed until 12/08/2003 -- almost two months after the upgrade.
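Several of the misses above, POP3 in particular, were services that never appeared in the firewall script. One way to catch those is to compare the ports the firewall mentions against the ports the kernel reports as listening. The sketch below assumes a Linux /proc/net/tcp, where the local address is hex "IP:PORT" and state 0A means LISTEN. The function names and the trimmed sample text are made up for illustration, and IPv6 listeners (in /proc/net/tcp6) are not covered.

```python
def listening_ports(proc_net_tcp_text):
    """Return sorted TCP ports in LISTEN state from /proc/net/tcp content."""
    ports = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if state == "0A":  # 0A is the kernel's code for LISTEN
            ports.add(int(local_address.rsplit(":", 1)[1], 16))
    return sorted(ports)


def unlisted(listening, firewall_ports):
    """Ports that are listening but never mentioned in the firewall script."""
    return sorted(set(listening) - set(firewall_ports))


# Trimmed, made-up sample in the /proc/net/tcp layout.
SAMPLE = """\
  sl  local_address rem_address   st
   0: 00000000:0019 00000000:0000 0A
   1: 00000000:006E 00000000:0000 0A
   2: 0100007F:0050 0100007F:C000 01
"""

if __name__ == "__main__":
    with open("/proc/net/tcp") as handle:
        print(listening_ports(handle.read()))
```

In the sample, ports 25 and 110 are listening; if the firewall script only mentions 25 and 80, unlisted() reports 110, which is exactly the kind of internal-only service that slipped through here.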
WHAT WENT RIGHT
Virtually everything. Last time I did this, I dumped a bunch of my wife's email. I lost important files of my own. It was a week before I was correctly receiving all the email I should have been. I don't do any cluster computing, but that upgrade was certainly a cluster SOMETHING.
In fact, that scared me away from upgrades for a while. This is why that machine sat at Mandrake 8.2 right up until a month after I got the email saying "Mandrake 8.2 is no longer supported." Yikes!
None of that happened this time. IIRC, I even had Angela's email client working before she realized it had ever been broken!
ENHANCEMENTS
Well, what fun would an upgrade be if you couldn't sneak some improvements in at the same time? I rebuilt my file system to have more space everywhere. I added a 10GB drive as a sandbox for myself. I switched from POP3 to POP3S -- POP3 over SSL. I upgraded to a MUCH better version of spamassassin, complete with Bayesian filtering. And I upgraded the heck out of my plan.
The new plan can be seen at upgradeplan.html.
CONCLUSION
Every server should be fully documented. An upgrade plan is a good start on documenting what actually runs on a server, and should be continually updated. I hope you enjoy this, and I welcome comments and suggestions at the email address on the main page.
Oh, and don't EVER upgrade anything without having a plan!