natura.oskuro.net, the home server which still serves this
blog, has been suffering hardware problems for some weeks. Apparently the
hard drive is failing intermittently, so every now and then the kernel starts spewing
out noisy errors about its main disk dying. If I notice this quickly, it can
be rebooted and that normally fixes it for a few more days. But if I don't,
it'll end up giving nasty bus errors which will make remote
logins a challenge. Most processes still work, but the filesystem appears
to be gone. It's easy to tell what's going on if you visit the blog's URL
and get a 404, and in that case I can only phone my father
and tell him to press the reset button (I've tried sysrqd, but I need to
open the port in the router and haven't had a chance to do that yet).
So it was time to do something about it, and the other day I installed a dirty 40GB drive on the second IDE controller, in case I could find the time to work on it. Being stuck with an endless pharyngitis that doesn't seem to get cured entirely, I've had some time today to look at it. This evening, I was about to transfer the whole system to the new disk (it's half the size of the broken one, and probably slower, but it hopefully has no bad sectors), but I decided to upgrade the system first.
natura was first installed in late 1997 or at the beginning
of 1998, using the Debian bo install media on a Pentium 150MHz, and
has gone through seven dist-upgrades which, as far as I can remember, have
always worked out without major problems.
The upgrade to lenny hasn't been an exception. The server has gradually lost many of the services it once hosted, so there aren't too many left to take care of anymore. All the mail services I set up for my father ended up being deprecated as they started to get used to Hotmail, GMail and so on, and the frequent hardware crashes made me switch them to the Linksys-based DHCP server. In the end, the problems I saw after the upgrade were very similar to what I faced when I upgraded to etch:
- apache2 restored the 000-default symlink in the sites-enabled
dir, which resulted in my website showing the classic “It works” message for a while.
- Apache's suexec support had moved to its own package, and was very well documented in the NEWS file, but I somehow failed to notice for some time and kept wondering what was going on.
- The Python upgrade again affected my Pyblosxom install. Fortunately it was
a minor problem; I just had to add a coding=utf-8 line to the beginning of my
config.py to get it working.
- dhcp3-server apparently restored its init scripts and got started again. This time, it got removed.
Such an ancient install will clearly have old, obsolete packages. I
installed apt-show-versions to find out what didn't match my
package sources. I found I had every single version of cpp, gcc and g++ from
2.95 to 4.3, and a myriad of obsolete libs. But there were also real gems:
    defrag 0.73pjm1-7 installed: No available version in archive
    figlet 2.2.1-3 installed: No available version in archive
    ipmasqadm 0.4.2-2 installed: No available version in archive
    isapnptools 1.26-5 installed: No available version in archive
    ms-sys 2.1.0-1 installed: No available version in archive
    queso 0.980922b-3 installed: No available version in archive
    update 2.11-4 installed: No available version in archive
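For anyone wanting to dig out their own fossils, here is a minimal sketch of how that list can be filtered down to bare package names. It assumes the output format shown above (package name first, then the "No available version in archive" marker); the function name `obsolete_packages` is made up for illustration.

```shell
# Hypothetical helper: read apt-show-versions output on stdin and print
# only the names of packages that no configured source provides anymore.
# Assumes the line format quoted in this post: "<name> <version> installed: ...".
obsolete_packages() {
    grep 'No available version in archive' | awk '{ print $1 }'
}

# Intended usage (left as a comment so the sketch has no side effects):
#   apt-show-versions | obsolete_packages
```

The resulting names could then be reviewed by hand before purging anything; on an install this old I wouldn't feed them blindly into a removal command.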
Spaniards will remember “queso” because it was written by Jordi Murgó and became a classic tool to find out what OS was running on a remote host. “update” was apparently needed to flush your filesystems prior to Linux 2.2.8, and “defrag” is obvious, although it leaves me wondering why it was needed at the time.
With the upgrade done successfully, the next step is trying to get the system transferred to the spare hard drive. For this, I first partitioned it, creating a primary partition using up more or less half of the available space, and set up an LVM volume, leaving some free PEs in the volume group just in case I want to do snapshots in the future, and formatted it using ext3. I then transferred the system to the new disk and now face the boot challenge.
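The LVM part of that can be sketched roughly as follows. Every name and size here is an assumption for illustration, not what I actually typed: `/dev/hdc1` for the new partition, `natura` for the volume group, `root` for the logical volume, and a 15G volume so that some extents stay free for snapshots. The `run` wrapper only prints the commands unless `DRY_RUN` is set to 0.

```shell
# All values below are illustrative assumptions; adjust before real use.
DISK_PART=/dev/hdc1   # assumed: primary partition on the second IDE controller
VG=natura             # assumed volume group name
LV=root               # assumed logical volume name
LV_SIZE=15G           # assumed: smaller than the partition, leaving free PEs

# Print each command by default; execute only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_lvm_root() {
    run pvcreate "$DISK_PART"                   # mark the partition as a PV
    run vgcreate "$VG" "$DISK_PART"             # create the volume group on it
    run lvcreate -L "$LV_SIZE" -n "$LV" "$VG"   # carve out the root LV
    run mkfs.ext3 "/dev/$VG/$LV"                # format it as ext3
}
```

Calling `setup_lvm_root` as is only shows the plan; running it for real needs root and `DRY_RUN=0`, and will happily destroy whatever is on the partition.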
I haven't created a boot partition and that should be a double problem:
the BIOS is buggy and will only boot from the first 1024 cylinders, and my
root is on LVM and GRUB legacy might not like it (but I'm not sure). However,
I've become a big fan of GRUB2, and know I will be able to boot no
matter what my BIOS thinks of my disks and regardless of the complex root
partition setup I throw at it. The plan is to install GRUB onto the new
drive's MBR, and set it up using the ata module, which should
allow it to ignore what the BIOS says, and read beyond cylinder 1024 or even boot
from CD-ROM. However, this isn't a setup I've tried before, and a single
failure will result in me taking a train to fix it on-site.
So, GRUB experts out there, any suggestions? Of course, for now I guess I can install GRUB in the current drive's MBR and make it boot the old kernel using the new system as root, but that's dirty and would just postpone the problem.