Procedure for making a bootable recovery/rescue CD-ROM for Linux servers with non-standard vendor-supported hardware

Jim Ockers
Pason Systems USA Corp., 2001

Theory:

Many servers have non-standard kernels or hardware with vendor-provided device drivers. In some cases it is not possible to boot from the vendor-provided boot media (floppy, CD-ROM) and read data from the tape drive while simultaneously writing the data to the hard drives, due to driver incompatibility or other device support problems.

In cases where the vendor (Dell, IBM, etc.) provides a fully operational Linux system but no boot rescue media that fully support the hardware, it would be useful to clone the system from the hard drive to a bootable CD-ROM, to ensure that you can recover your system when something goes wrong and you desperately need to restore data from the tape.

Until the vendors start providing rescue media such as this you will need to make your own.

First consider the implications of having the "rescue" mode come up on the network at the same IP address as the system being rescued. If this rescue CD will be used for more than one system, you may want to disable the automatic network startup so that you don't step on the IP address of an important server while rescuing another server or system.

Procedure:

This procedure was developed on Red Hat Linux 6.x, but the general principles apply to any Linux distribution.
  1. Clone the system to an empty directory, such as /usr2 .

    THE REST OF THE INSTRUCTIONS PERTAIN TO THE CLONED SYSTEM WHICH SHOULD LIVE IN /usr2 IF YOU ARE GOING EXACTLY BY MY EXAMPLE.
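
    One way to do the cloning in step 1 is a tar pipe. This is only a sketch and not necessarily the command used originally: clone_tree is a hypothetical helper name, both paths are assumed to be absolute, and the --one-file-system and --exclude options assume GNU tar.

```shell
# clone_tree SRC DEST  (hypothetical helper, assumes GNU tar)
# Copy the tree under SRC into DEST, preserving permissions and links.
clone_tree() (
    src=$1
    dest=$2
    mkdir -p "$dest" || exit 1
    cd "$src" || exit 1
    # --one-file-system skips /proc and other mounted filesystems.
    # --exclude keeps DEST itself out of the copy when SRC is "/".
    tar --one-file-system --exclude="./${dest#/}" -cpf - . |
        (cd "$dest" && tar -xpf -)
)
```

    On the running server this would be invoked as root, for example: clone_tree / /usr2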

  2. There is invariably too much data to fit on one CD, so we need to delete some stuff. Delete the following directories, which you hopefully will not need on the CD (assuming your tape backup restore software works in text mode and does not need X11):

    If the system is old or has been in use for a while there might be a bunch of LARGE stuff in the /usr2/home/ directory tree. Consider deleting all home directories, since they will not be needed in rescue mode anyway. Note that root's home directory is not in the /usr2/home tree.

    Hopefully after doing the above deletions your disk usage is less than 650MB, the size of most CD-R media. ( cd /usr2 ; du -sc * )
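
    The size check can also be scripted so that it fails loudly when the tree is still too big. A minimal sketch, assuming a hypothetical helper named fits_on_cd and the 650MB figure mentioned above as the default limit:

```shell
# fits_on_cd DIR [LIMIT_MB]  (hypothetical helper)
# Succeeds if DIR uses fewer than LIMIT_MB megabytes (default 650).
fits_on_cd() (
    dir=$1
    limit_mb=${2:-650}
    # du -sk reports the total size of DIR in kilobytes.
    used_kb=$(du -sk "$dir" | awk '{print $1}')
    echo "$dir uses $((used_kb / 1024)) MB of $limit_mb MB"
    [ "$used_kb" -lt $((limit_mb * 1024)) ]
)
```

    For example: fits_on_cd /usr2 || echo "still too big, delete more"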

    Also delete most of the startup (S*) stuff in /etc/rc.d/rc3.d (assuming your initdefault runlevel is 3), except leave the following:
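
    Pruning the startup entries can be sketched as a small function that removes every S* entry except the ones you name. prune_startup is a hypothetical helper, and the names passed to it are whatever you decide to keep; it does not reproduce any particular keep list.

```shell
# prune_startup RCDIR KEEP...  (hypothetical helper)
# Remove every S* entry in RCDIR whose name is not listed in KEEP.
prune_startup() (
    rcdir=$1
    shift
    for link in "$rcdir"/S*; do
        # Skip the unexpanded pattern if there are no S* entries at all.
        [ -e "$link" ] || [ -L "$link" ] || continue
        name=${link##*/}
        keep=no
        for k in "$@"; do
            [ "$name" = "$k" ] && keep=yes
        done
        [ "$keep" = yes ] || rm -f "$link"
    done
)
```

    For example, with S10network and S30syslog standing in as placeholders for your own choices: prune_startup /usr2/etc/rc.d/rc3.d S10network S30syslog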

    You also need to create some directories in the new filesystem in case you need them in the future (the proc directory is needed by the runtime environment):

    Delete everything in /usr2/var/lock/subsys/* . Also delete /usr2/var/run/*.pid :
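
    The directory creation and the lock/pid cleanup might look like the following. This is a sketch: reset_runtime_state is a hypothetical helper, and proc is the only directory created because it is the only one named above; add any other empty mount points you expect to need.

```shell
# reset_runtime_state ROOT  (hypothetical helper)
reset_runtime_state() (
    root=$1
    # Empty mount point for /proc, needed by the runtime environment.
    mkdir -p "$root/proc"
    # Stale lock and pid files copied from the running system.
    rm -f "$root"/var/lock/subsys/* "$root"/var/run/*.pid
)
```

    For example: reset_runtime_state /usr2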

  3. Edit /usr2/etc/issue* to reflect rescue purpose of this CD. Say something like:
                   Pason Systems USA Corp.
                   Dell PowerEdge 4400 Linux Rescue CD
                   (c) 2001 Pason Systems USA Corp.
    
  4. Edit the /usr2/etc/inittab file to change the console getties. It turns out that mingetty REALLY hates having a read-only root filesystem and just refuses to work. Red Hat's default distribution comes with a /sbin/getty that works with read-only root fs. The new getty command in inittab for EACH tty needs to be:
    1:2345:respawn:/sbin/getty tty1 38400 linux
    2:2345:respawn:/sbin/getty tty2 38400 linux
    3:2345:respawn:/sbin/getty tty3 38400 linux
    4:2345:respawn:/sbin/getty tty4 38400 linux
    5:2345:respawn:/sbin/getty tty5 38400 linux
    6:2345:respawn:/sbin/getty tty6 38400 linux
    

    If you leave the inittab using mingetty you will have a perfectly functional system booted from the CD but you will be unable to log in.
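
    If you would rather not edit six lines by hand, the substitution can be done with sed. A minimal sketch, assuming the existing entries have the form "/sbin/mingetty ttyN"; mingetty_to_getty is a hypothetical helper name, and it prints the result rather than editing the file in place.

```shell
# mingetty_to_getty FILE  (hypothetical helper)
# Print FILE with each mingetty console line rewritten to use /sbin/getty.
mingetty_to_getty() {
    sed 's|/sbin/mingetty[[:space:]]\{1,\}\(tty[0-9]\{1,\}\)|/sbin/getty \1 38400 linux|' "$1"
}
```

    For example: mingetty_to_getty /usr2/etc/inittab > /tmp/inittab.new, then review /tmp/inittab.new and copy it over /usr2/etc/inittab.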

  5. Edit /usr2/etc/fstab and get rid of all lines that are NOT:

    Also, change the root filesystem device to be /dev/scd0 (or /dev/hdc, or whatever your server's CD-ROM device is). For example:

    The "0 0" there is for fsck's benefit, but as I describe below we need to disable fsck anyway since the filesystem is guaranteed to be read-only.
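
    As an illustration, a minimal rescue fstab could be generated like this. It is a sketch under assumptions: rescue_fstab is a hypothetical helper, and keeping only the root and /proc entries is a guess at a sensible minimum, not a record of which lines were actually kept. The root entry uses the iso9660 type and the ro option because the CD is a read-only ISO filesystem.

```shell
# rescue_fstab CDROM_DEVICE  (hypothetical helper)
# Print a minimal fstab: the CD as a read-only root, plus /proc.
rescue_fstab() (
    cdrom=${1:-/dev/scd0}
    # printf reuses the format for each group of five arguments.
    printf '%-12s %-8s %-8s %-10s %s\n' \
        "$cdrom" /     iso9660 ro       "0 0" \
        none     /proc proc    defaults "0 0"
)
```

    For example: rescue_fstab /dev/hdc > /usr2/etc/fstab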

  6. The /etc/mtab file will not be writable on the CD; here is a work-around:
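
    One common work-around, offered here as an assumption rather than as the exact commands used originally, is to make mtab a symbolic link to /proc/mounts. The kernel maintains that file itself, so nothing ever has to write to the CD. link_mtab is a hypothetical helper name.

```shell
# link_mtab ROOT  (hypothetical helper)
# Replace ROOT/etc/mtab with a symlink to the kernel's own mount table.
link_mtab() (
    root=$1
    rm -f "$root/etc/mtab"
    # A relative target resolves correctly once ROOT becomes "/".
    ln -s ../proc/mounts "$root/etc/mtab"
)
```

    For example: link_mtab /usr2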

  7. There is a fair amount of editing of the /usr2/etc/rc.d/rc.sysinit file required now. Unfortunately this will vary depending on the Linux distribution you are using, and possibly even the version of Red Hat. You want to get rid of everything that involves an automatic fsck, everything that could trigger an automatic reboot based on some fsck or other error, and everything that writes to a file, such as lines that say nothing but "> /etc/mtab" or that send the output of a command to something other than >/dev/null .

    You also don't need the depmod stuff since the modules.dep is on a read-only filesystem.

    There is a point in the rc.sysinit where it changes the mounting of the root filesystem to read-write. Obviously you don't want to do that since the CD-ROM is read-only.

    Make sure that all the stuff that could cause an "automatic reboot" or reboot of any sort in the rc.sysinit is removed.
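
    Since the script differs between distributions, it can help to list the suspicious lines first and then edit by hand. A rough sketch: audit_sysinit is a hypothetical helper and the pattern list is only a starting point drawn from the points above. Harmless redirections to /dev/null will show up too and can be ignored.

```shell
# audit_sysinit FILE  (hypothetical helper)
# Print, with line numbers, the lines of FILE that deserve a manual look:
# fsck calls, reboots, depmod, remounts, and output redirections.
audit_sysinit() {
    grep -n -E 'fsck|reboot|shutdown|depmod|remount|>' "$1"
}
```

    For example: audit_sysinit /usr2/etc/rc.d/rc.sysinit | less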

    I can provide you with the rc.sysinit script that I used for my system; just e-mail me at ockers@us.pason.com if you need it. I have also added it to this page, along with all the .ISO images, a tarball, and the rc.sysinit scripts I used.

  8. Make a boot disk for your running system. Red Hat provides an mkbootdisk utility; on my Dell I might type:
    mkbootdisk 2.2.14-6.1.1smp

    The important thing about the boot disk is that it should use LILO to load the kernel (so you can pass command-line parameters if needed) and it should contain any needed modules that your system normally loads from the initrd, such as SCSI or RAID modules and ethernet modules. Obviously it should also contain the kernel. The Red Hat mkbootdisk command will do all of that stuff for you.

  9. Mount the diskette so you can edit it:

  10. Make a device block-special file on the floppy disk for the CD-ROM device you are going to use as your root filesystem:

    You may need to look up the major & minor numbers of the device using "ls -al /dev/scd0" for your device (scd0 is 11,0)
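
    The lookup can also be scripted. A small sketch, assuming GNU stat (its %t and %T formats print the major and minor numbers in hexadecimal); dev_numbers is a hypothetical helper name.

```shell
# dev_numbers DEVICE  (hypothetical helper, assumes GNU stat)
# Print the major and minor numbers of DEVICE in decimal.
dev_numbers() (
    nums=$(stat -c '%t %T' "$1") || exit 1
    # Split "MAJOR MINOR" into $1 and $2, then convert from hex.
    set -- $nums
    printf '%d %d\n' "0x$1" "0x$2"
)
```

    The two numbers are then passed to mknod, for example: mknod /mnt/floppy/dev/scd0 b 11 0 (the dev directory on the floppy is an assumed location; use wherever your boot disk keeps its device files).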

  11. Edit the /mnt/floppy/boot/message to reflect the rescue nature of this boot CD. Say something like:
    Pason Systems USA Corp. == Dell PowerEdge 4400 Rescue CD.
    
    Press <return> (or wait 10 seconds) to boot your Red Hat Linux system from
    the bootable CD-ROM.  You may override the default linux kernel parameters
    by typing "linux <params>", followed by <return> if you like.
    
  12. Edit the /mnt/floppy/etc/lilo.conf to change the root device that the kernel will try to load by default. Obviously you will want to use the CD-ROM device. Find the kernel section of the lilo.conf where it says "root = /dev/sda3" or something like that and change it to /dev/scd0 (or whatever your CD-ROM device is).
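
    The same change can be made with sed instead of a text editor. A minimal sketch: set_lilo_root is a hypothetical helper that prints the edited file rather than modifying it in place, and it rewrites every "root = ..." line it finds.

```shell
# set_lilo_root FILE DEVICE  (hypothetical helper)
# Print FILE with every "root = ..." line pointing at DEVICE.
set_lilo_root() {
    sed "s|^\([[:space:]]*root[[:space:]]*=[[:space:]]*\).*|\1$2|" "$1"
}
```

    For example: set_lilo_root /mnt/floppy/etc/lilo.conf /dev/scd0 > /tmp/lilo.conf.new, then copy the result back onto the floppy.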

  13. Re-write the boot sector of the floppy disk using the message and lilo.conf you just edited, then unmount the diskette:
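
    A sketch of step 13, assuming the floppy is mounted on /mnt/floppy: lilo's -r option makes it chroot to the given directory, so it picks up the lilo.conf and message file that live on the floppy. rewrite_floppy_boot and the DRY_RUN switch are hypothetical names introduced for illustration; with DRY_RUN set, the commands are printed instead of executed.

```shell
# rewrite_floppy_boot [MOUNTPOINT]  (hypothetical helper)
# With DRY_RUN set, only print the commands instead of running them.
rewrite_floppy_boot() (
    mnt=${1:-/mnt/floppy}
    run() { if [ -n "${DRY_RUN:-}" ]; then echo "$@"; else "$@"; fi; }
    # lilo -r chroots to the floppy and reads its etc/lilo.conf.
    run lilo -r "$mnt" &&
    run umount "$mnt"
)
```

    For example: DRY_RUN=1 rewrite_floppy_boot /mnt/floppy shows what would be run; drop DRY_RUN (and run as root) to do it for real.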

  14. Get the floppy disk image using dd:
    dd if=/dev/fd0 of=/usr2/boot.img

  15. Before we finish things up you may want to delete the root password from /usr2/etc/shadow (everything between the :colons: that is the encrypted password) just to make life easier for the rescuer. Use a text editor of course.
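
    If you prefer a command to a text editor, awk can empty the password field. A minimal sketch, assuming the usual colon-separated shadow format; blank_root_password is a hypothetical helper that prints the result instead of changing the file.

```shell
# blank_root_password FILE  (hypothetical helper)
# Print FILE with the second (password) field of root's entry emptied.
blank_root_password() {
    awk 'BEGIN { FS = OFS = ":" } $1 == "root" { $2 = "" } { print }' "$1"
}
```

    For example: blank_root_password /usr2/etc/shadow > /tmp/shadow.new, then copy the result back over /usr2/etc/shadow, keeping the original file's permissions.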

  16. Make the ISO filesystem disk image of the /usr2 system. I have a /usr2/mkisofs.sh script that does this for me, and writes the CD image to the /home filesystem:
    #!/bin/sh
    cd /usr2
    rm -f /home/bootcd.img
    mkisofs -b boot.img -c boot.catalog -d -l -L -o /home/bootcd.img \
             -R -r -U .
    
    [NOTE: SOME VERSIONS OF MKISOFS BARF ON THE -U OPTION THERE.  IF YOURS
    DOES, THEN GET RID OF -U AND USE -N INSTEAD.]
    
  17. Burn the bootcd.img image to a CD and try it out! Also, keep the bootable diskette around that you finished creating in step 14.

  18. It would be a good idea to print out your /etc/fstab from the running system and keep it with your rescue CD, in case you have lots of partitions and a bad memory. It helps to know the filesystem layout of your system if you are doing a disaster recovery, so you can manually mount the filesystems in the appropriate places, do the fscks properly, and so forth.

    I also do "fdisk -l /dev/sda" and so forth for each disk so I know what each partition type, size, and other characteristics are; naturally this should be printed out and kept with the fstab printed output from the previous paragraph.

  19. I have an IBM NetFinity 5500 server in which the BIOS incorrectly maps the CD-ROM for bootable CDs to the floppy disk drive. So, LILO is trying to load from /dev/fd0 and on the NetFinity the boot floppy disk MUST be in the floppy disk drive, otherwise it just prints L 0x18 0x18 0x18 0x18 0x18 ad nauseam. On most BIOSes the BIOS remaps the A: drive (/dev/fd0) to be the bootable area of the CD; on the NetFinity the remapping is either NOT done or is done INCORRECTLY. Thus, the boot diskette is necessary to use in conjunction with the boot CD. This is why I suggest, in step 17, that you keep the bootable diskette around.

If you have any feedback or suggestions for me I would love to hear them. Send them to ockers@us.pason.com.