Sun, 20 Nov 2005

Hard drive failure

On Thursday, I went up to my father's house to pick up my desktop. Now that I finally have Internet access at home, I can keep it somewhere more convenient and have permanent access to it. While I was there, I decided it would be a good idea to upgrade my dad's box.

While I was doing this, the NFS share of the Debian mirror on the home server that serves this blog stopped working, and I couldn't ssh in, although the console and ICMP appeared to work. Having no monitor attached to that box, I decided to reboot it and resolve the problem the lame way. Had I known my primary hard drive had decided enough was enough, I would have looked for a monitor instead.

After the reboot, the server started making a very loud and scary noise, which was not new to me: when the BIOS tried to probe the disk, the disk would not start up and would whine like that instead. When this happened in the past, switching the box off for a few minutes and trying again was enough. Not this time, though; it looked like it was the end.

Horrified by the fact that I had no really useful backup of /etc and /home, I tried over and over to get it working. I plugged the HD into another old AT box (which once served as my GNU Hurd playground), but I still got those scary noises. At least I knew it had nothing to do with the motherboard.

As it was getting late, I decided to take the HD with me, along with another Maxtor of the same size (but not the exact same model), to have a look at the office the next day. I thought I'd have to resort to transplanting the logic board from the good disk to the faulty one, but before that, I realised I could try mounting it in a USB cage. For some reason or another, this worked, and I quickly saved the two partitions it contained with no read errors.

After this, I realised I had not done the copy as root, so I had lost all my ownerships. I plugged the disk in again and re-rsynced. It still worked. Relieved, I went back home, copied the data to another old disk, and, I don't remember why, tried mounting the faulty disk again. I got scary noises even using the USB cage, and no further attempts have been successful, so I think it's completely dead now. It died right after the final mount that saved the data.
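A copy made as a normal user ends up owned by that user, which is how the ownerships got lost in the first pass. As a rough sketch (not what I ran at the time), a check like the following could confirm that a second copy kept the original uid/gid on every entry. The function name `ownership_mismatches` and both paths are hypothetical.

```python
import os


def ownership_mismatches(src_root, dst_root):
    """Return relative paths missing from dst_root or owned differently there.

    Hypothetical helper: compares the uid/gid of every file and directory
    under src_root with the entry at the same relative path in dst_root.
    """
    mismatches = []
    for dirpath, dirnames, filenames in os.walk(src_root):
        for name in dirnames + filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if not os.path.lexists(dst):
                # Entry was not copied at all.
                mismatches.append(rel)
                continue
            # lstat so symlinks are compared themselves, not their targets.
            s, d = os.lstat(src), os.lstat(dst)
            if (s.st_uid, s.st_gid) != (d.st_uid, d.st_gid):
                mismatches.append(rel)
    return sorted(mismatches)
```

Calling `ownership_mismatches("/mnt/olddisk/home", "/backup/home")` (example paths only) would return an empty list when every entry exists in the copy with the same owner and group.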

The box is now back online, using a Seagate disk I had stored in a drawer, with no notes on it about any kind of problem. I suspect I will need to do real backups now, because the drive isn't as safe as it looked...

hda: dma_timer_expiry: dma status == 0x20
hda: DMA timeout retry
hda: timeout waiting for DMA
hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }

hda: drive not ready for command
hda: status timeout: status=0xd0 { Busy }

hda: drive not ready for command
ide0: reset: success

My little P150 needs a 6GB drive. I'll have to find one somewhere.

This DMA problem is well known on Debian; just reconfigure your kernel.

Posted by SchAmane at Sun Nov 20 14:17:24 2005