Wed, 25 Apr 2007

natura upgraded to etch

Last week I started the final round of Debian upgrades for the servers I maintain here and there, and it is mostly complete as of today. I haven't been so lucky with upgrades this time, for a long list of different reasons. In the end, the smoothest upgrades were on the boxes I upgraded around the time etch froze.

natura.oskuro.net is the box serving these pages. It's an old, extremely noisy Pentium 150 which I've been intending to replace for a while now. I started the upgrade early on Thursday, knowing it would take a while (natura takes its time just reading the dpkg database), and it had apparently finished by the time I was ready to leave the office.

Three issues:

That same Thursday night, I decided to dist-upgrade the box which serves the Spanish Debian website mirror. That's the box's only purpose, so you can imagine the upgrade should have been pretty straightforward. And so it seemed until, in the middle of unpacking, dpkg died with a horrible I/O error and I was left with an unusable remote terminal in which no command worked. Fortunately, apache2 was still up and running, and the web service has been working without interruption since the hard drive crash, albeit with no syncs from www-master.

Today, Sergio visited the campus and had a look. It was an XFS crash, which was cleanly repaired using an install CD. We have an empty partition in the box, and will probably move the system to it temporarily and then back to the RAID, this time on ext3. When the box was back online, I just had to resume the upgrade process, make mdadm happy and update lilo.conf before rebooting into the new kernel.

This box uses LILO for some obscure reason I can't remember too clearly anymore. The box has just one partition on an md array, on two SCSI disks attached to an aic7xxx-based controller. Can anyone give me a hint as to why GRUB would have failed on us back in sarge, and whether any fixes in the etch version would make it work any better? Using LILO here is error-prone, and basically feels like a step back. Anyway, www.es.debian.org is now back up and running with updated content.

Sindominio.net had its bi-annual upgrading party last Monday, but unfortunately I wasn't able to help much: when I tried to log into the server, I must have caught the system in the middle of some key library upgrade or something, and again I was locked in an unusable shell which would only segfault. Given my previous experience, I assumed that something had gone wrong and that the box would need to be fixed at the console, and after 20 minutes I gave up helping on that front. It was only quite a long while later that I noticed I was still getting mail from the server. I managed to log in and discovered the upgrade was done, with just a few bits remaining. The major issues were with our pam and ldap setup, plus nscd kept dying, causing quite a lot of mayhem all over the place. Great work from Seajob, Syvic, nogates, apardo and the rest of the people who handled it! With etch, we can finally move back to an official Debian kernel, something we've been longing to do for a long time. The only pending upgrade issue is that we need to move from our old jabber server to either the traditional jabberd 1.x or ejabberd; our current implementation is no longer supported in Debian.

The last of the etch upgrade stories involves Softcatalà's servers. The box was running CentOS 4.4, which was moved away into a subdirectory just after booting Debian-Installer, and then lobotomised so it would run as a Linux-VServer guest under a new Debian etch install. I'll probably write more details about it soon, as it could be a possibly less scary alternative to Guillem's debtakeover.

Yay for etch!