From Lyceum


But then Pete changed it!

Saving Drives

Over the last few days there have been a lot of customers with failed hard drives. One customer in particular has no backups, and the drive has major seek errors and does not even show up in fdisk.

dd_rescue software is often able to recover most data to a clean drive. We regularly use it and have great success with it.

We used to charge $50 to image a drive, but as it's a hardware failure, and as it actually saves us time to clone it compared to all the work involved in reinstalling, migrating, importing and troubleshooting restore, we now offer it free of charge in the case of a dead drive. Migrating to a larger drive using other tools is still a $50 charge.

NOTE: Don't be led astray! There are three similar commands: dd, ddrescue and dd_rescue - just use dd_rescue, it is the best.

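You can use dd_rescue by booting Ubuntu, enabling the universe and multiverse repositories in /etc/apt/sources.list, and then running apt-get update followed by apt-get install ddrescue. (On Debian and Ubuntu the package named ddrescue has historically shipped the dd_rescue binary, while GNU ddrescue is packaged as gddrescue.)

You can run dd_rescue like so: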

dd_rescue /dev/sdb /dev/sda (where sdb is the bad drive)

You can also specify the -r option to read in reverse (so you can see how much is left).

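If using dd, you can use conv=noerror to continue past read errors if they occur, and specify the block size with bs. Add sync as well, so that unreadable blocks are padded with zeros and the data after them stays at the right offset:

dd if=/dev/sda of=/dev/sdb bs=4096 conv=notrunc,noerror,sync

(Here sda is the drive being read and sdb is the drive being written.)

Specifying a larger block size can make it run faster; the default is 512 bytes.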

For more options:

Please note:

When cloning one drive to another you must be alert and ensure the destination drive is truly equal to or larger than the source drive. That sounds obvious I know, but seriously - not all drives of the same capacity have the same geometry and exact same size.

For example, the dozens of RMA replacement Western Digital drives we have recently gotten in are not the same size as the original 160GB drives; they are slightly smaller.

If you use a smaller drive, dd will complete, but then you will start seeing drive errors such as:

Attempt to access beyond end of device


Lost page write due to IO error on sda

or other IO or fsck inconsistencies.
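
One way to catch this before you start is to compare the two sizes in bytes. The following is only a minimal sketch: the function names size_in_bytes and fits_on are made up for illustration, and it assumes a Linux system with blockdev (util-linux) and GNU stat.

```shell
# size_in_bytes PATH: print the size of a block device or a regular file.
# Assumes blockdev (util-linux) for devices and GNU stat for image files.
size_in_bytes() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"
    else
        stat -c %s "$1"
    fi
}

# fits_on SOURCE DEST: succeed only if DEST is at least as large as SOURCE.
fits_on() {
    src_bytes=$(size_in_bytes "$1") || return 2
    dst_bytes=$(size_in_bytes "$2") || return 2
    echo "source: $src_bytes bytes, destination: $dst_bytes bytes"
    [ "$dst_bytes" -ge "$src_bytes" ]
}
```

For example, fits_on /dev/sdb /dev/sda && dd_rescue /dev/sdb /dev/sda would only start the clone when the destination (sda here) is big enough.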

How to Match Geometry

Well, if it's a square hole don't use a round peg. Look at the drive geometry using hdparm, like this:

# hdparm -I /dev/sda


ATA device, with non-removable media
        Model Number:       WDC WD1200BB-00DAA1
        Serial Number:      WD-WMACM1072106
        Firmware Revision:  02.13B02
        Supported: 6 5 4
        Likely used: 6
        Logical         max     current
        cylinders       16383   16383   {----
        heads           16      16          | These are the CHS values
        sectors/track   63      63      {----
        CHS current addressable sectors:   16514064    {----
        LBA    user addressable sectors:  234441648        | You have to ensure these match too!
        LBA48  user addressable sectors:  234441648     {---
        device size with M = 1024*1024:      114473 MBytes
        device size with M = 1000*1000:      120034 MBytes (120 GB)

Notice the CHS, LBA and 48 bit LBA values above. The destination drive must have LBA values exactly equal to, or greater than the source.

Two 160GB drives may have identical CHS values but different LBA values, and will therefore be different sizes.

But hdparm will tell you what you need to know.

And knowing is half the battle.
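
If you would rather not compare the numbers by eye, the LBA48 count can be pulled out of the hdparm output and compared in a script. This is a rough sketch, assuming the output looks like the listing above; the function names lba48_sectors, sectors_to_bytes and can_hold are hypothetical, and the byte conversion assumes 512-byte sectors.

```shell
# lba48_sectors: read `hdparm -I` output on stdin and print the
# "LBA48 user addressable sectors" count (digits only).
lba48_sectors() {
    awk -F: '/LBA48 +user addressable sectors/ { gsub(/[^0-9]/, "", $2); print $2 }'
}

# sectors_to_bytes SECTORS: convert a sector count to bytes.
# Assumption: 512-byte logical sectors.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# can_hold SRC_SECTORS DST_SECTORS: succeed if the destination has at
# least as many addressable sectors as the source.
can_hold() {
    [ "$2" -ge "$1" ]
}

# Example use (as root, sdb = failing source, sda = new destination):
#   src=$(hdparm -I /dev/sdb | lba48_sectors)
#   dst=$(hdparm -I /dev/sda | lba48_sectors)
#   can_hold "$src" "$dst" || echo "destination is too small"
```

For the drive listed above, 234441648 sectors works out to 120034123776 bytes, which agrees with the 120034 MBytes that hdparm reports.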

Other Options

One limitation of dd_rescue is you can't dynamically expand the partitions if going to a larger drive. If you are taking the time to clone a drive, it might be nice to go from a 160GB to a 500GB, etc. dd just does a block wise copy though, so the partition tables, etc. are identical.

If the drive is mostly healthy you can use Acronis Drive Imaging, gparted or Clonezilla which have options for dynamically expanding the partitions. Acronis actually works quite well for this.

All Things dd_rescue and ddrescue

In addition to printing out status messages, dd_rescue will automatically adjust the block size based on errors encountered. dd will just use whatever static bs= argument (that's funny) you provide. Also, dd_rescue defaults to force reading through bad sectors; dd does not. Thus, dd_rescue is a bit better at cloning failing drives.

With dd a better syntax is:

#dd if=/dev/sda of=/dev/sdb bs=4096 conv=notrunc,noerror,sync

However, the above is basically the default used in dd_rescue.

Note that dd_rescue is slightly different and does not use if= and of=, but rather references block devices directly:

#dd_rescue /dev/sda /dev/sdb 

There is no confirmation - think before you pull the trigger.

(Note: There are two different programs with similar names: ddrescue and dd_rescue. In cases where one fails, the other may succeed. Info pages are wonderful things - they have examples and all kinds of good stuff.)

BOTH versions are available in RIP. Note you can make multiple recover passes to an IMAGE file and then dd the image file to a drive, etc. Fancy.
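
As a rough illustration of the last step, here is one way to put a finished image onto a drive with plain dd and check the result. It is a sketch under assumptions: write_image is a made-up helper name, and it relies on GNU stat and GNU cmp (for the -n byte limit).

```shell
# write_image IMAGE TARGET: copy IMAGE onto TARGET, then verify the copy.
# TARGET may be a drive (e.g. /dev/sdb) or another file.
write_image() {
    # conv=notrunc leaves anything beyond the end of the image untouched.
    dd if="$1" of="$2" bs=4096 conv=notrunc || return 1
    # TARGET may be larger than IMAGE, so compare only the image's length.
    cmp -n "$(stat -c %s "$1")" "$1" "$2"
}
```

As with dd_rescue there is no confirmation, so double-check the target before running something like write_image rescued.img /dev/sdb (rescued.img being whatever you named the image).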

Also, this is a great time to mention again that when cloning a drive it is ESSENTIAL to examine not only the CHS drive values, but also the LBA values. Two 120GB or 500GB drives are NOT necessarily the same size, even if from the same manufacturer - they change between production lines.

The way to determine this for sure is to follow the steps from:

Oh, and DON'T TRY TO FIX A FAILING DISK WITH FSCK. The best description I ever read of fsck was that it is a barbarian - it destroys what it does not understand. If you see bad sectors in smartctl -a, never run an fsck. Clone first, and repair the FS on the clone.
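
A quick way to apply that rule is to add up the sector-related counters in the smartctl attribute table before touching fsck. This is a hedged sketch for ATA drives: bad_sector_count and no_bad_sectors are hypothetical helper names, and it assumes the usual smartmontools attribute names (Reallocated_Sector_Ct, Current_Pending_Sector, Offline_Uncorrectable) with the raw value in the tenth column, which can differ between vendors.

```shell
# bad_sector_count: read `smartctl -a` output on stdin and print the sum
# of the raw values of the attributes that count remapped or unreadable
# sectors. Assumes the standard ATA attribute table layout.
bad_sector_count() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" { total += $10 }
         END { print total + 0 }'
}

# no_bad_sectors: succeed only if the sum above is zero.
no_bad_sectors() {
    [ "$(bad_sector_count)" -eq 0 ]
}

# Example use:
#   smartctl -a /dev/sda | no_bad_sectors || echo "clone first, no fsck"
```

A zero here does not prove the drive is healthy; it only means these particular counters have not tripped yet.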

Moral: Always use dd_rescue and always verify geometry when cloning. If you need, read the man and info pages on ddrescue and dd_rescue for more options - they are capable of much more than block device to device operations.

ddrescue Log Files

Always use a log file when running ddrescue. Reason: without one, if the operation is interrupted you have to start all over, which can greatly extend the time it takes to recover a drive.

Using log files with ddrescue (not dd_rescue) allows you to resume where you left off. Yep - nifty.

To do this in RIP / RescueCD simply use sshfs to mount a directory from another server and direct the log to be saved there.

Mounting a remote folder with sshfs (user@server stands in for your own login and host):

$ sshfs -p 12273 user@server:ddrescuelogs /mnt
user@server's password:
$ ls /mnt

Now, run ddrescue directing the log to /mnt

ddrescue /dev/sda /dev/sdb /mnt/ {--- Filename of log

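To specify the number of retry passes use -r (-1=infinite). Rerunning ddrescue with the same log file resumes where you left off; adding -C additionally restricts the run to the blocks already listed in the log file. For example: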

ddrescue -r 3 -C /dev/sda /dev/sdb /mnt/

This will pick up where the last attempt was interrupted.

Note you can run multiple passes and use the same log file - it will only attempt to get the bad sections on subsequent passes.
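
The multi-pass idea can be wrapped in a small script. The following is a sketch rather than a tested recipe: rescue_passes and the DDRESCUE override variable are made up for illustration, the log file name is whatever you choose, and the flags are GNU ddrescue options as I understand them (-n skips the slow scraping of bad areas, -r sets retry passes, -f is needed on recent versions when the output is a device rather than a regular file).

```shell
# Assumption: GNU ddrescue is on the PATH. Set DDRESCUE to something
# else (e.g. a stub function) to do a dry run.
DDRESCUE=${DDRESCUE:-ddrescue}

# rescue_passes SOURCE DEST LOGFILE
rescue_passes() {
    # Pass 1: copy everything that reads cleanly and skip the slow
    # scraping of bad areas, so the good data is safe as soon as possible.
    "$DDRESCUE" -f -n "$1" "$2" "$3" || return 1
    # Pass 2: same log file, so only the areas still marked bad are
    # attempted again, with up to 3 retry passes.
    "$DDRESCUE" -f -r 3 "$1" "$2" "$3"
}

# Example use (sda.log is a hypothetical file name on the sshfs mount):
#   rescue_passes /dev/sda /dev/sdb /mnt/sda.log
```

Because both passes share one log file, interrupting and rerunning the script does not throw away the work already done.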