Executive summary: it is fairly easy to recover from certain situations where your machine won’t boot the way you want (or at all).
Various scenarios that could cause this situation are discussed in section 2, but for now let’s concentrate on getting out of the situation.
Here is a terse summary. Every step in this section is covered in more detail in section 4.
If you happen to know, based on experience, where the target system lives, define $partition and $drive accordingly, and skip to step 8. If you need to figure it out, proceed as follows:
With any luck, you will see something like this:
    drwxr-xr-x 2 root root 200 Mar 26 07:59 .
    drwxr-xr-x 6 root root 120 Mar 26 08:00 ..
    lrwxrwxrwx 1 root root  10 Mar 26 08:00 emu -> ../../sda9
    lrwxrwxrwx 1 root root  10 Mar 26 08:00 linux-root -> ../../sda5
    lrwxrwxrwx 1 root root  11 Mar 26 08:00 lnx -> ../../sda11
    lrwxrwxrwx 1 root root  11 Mar 26 08:00 more -> ../../sda10
    lrwxrwxrwx 1 root root  10 Mar 26 08:00 SERVICEV003 -> ../../sda1
    lrwxrwxrwx 1 root root  10 Mar 26 08:00 SW_Preload -> ../../sda2
    lrwxrwxrwx 1 root root  10 Mar 26 08:00 usrsrc -> ../../sda7
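The listing has the shape of a directory of symbolic links from filesystem labels to partitions. Here is a minimal sketch for turning such a directory into a plain "label device" table. It assumes the directory is /dev/disk/by-label (the command that produced the listing is not shown here) and that `readlink` is available. The helper name `list_labels` is made up for this illustration.

```shell
# list_labels [DIR]: print "label partition" for every symlink in DIR.
# DIR defaults to /dev/disk/by-label (an assumption, see the text above).
list_labels() {
    dir=${1:-/dev/disk/by-label}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        target=$(readlink "$link")          # e.g. ../../sda5
        printf '%s %s\n' "${link##*/}" "/dev/${target##*/}"
    done
}

# Usage: list_labels
```

A label such as linux-root is a strong hint, but it is only a hint; the partition still has to be checked for a kernel as described next.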
    mount /dev/sda11 /x
    ls -al /x/boot/vm*
    ls: cannot access /x/boot/vm*: No such file or directory
That tells us that sda11 is definitely not the right answer. So let’s try again:
    umount /x               # unmount previous hypothesis
    mount /dev/sda5 /x
    ls -al /x/boot/vm*
    -rw-r--r-- 1 root root 4275712 May 30 2013 /boot/vmlinuz-2.6.39.4
    -rw-r--r-- 1 root root 4678720 Aug 25 2013 /boot/vmlinuz-3.10.9
    -rw-r--r-- 1 root root 4599824 Mar 23 18:45 /boot/vmlinuz-3.11.3+
    -rw-r--r-- 1 root root 4599824 Mar 23 18:43 /boot/vmlinuz-3.11.3+.old
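The trial-and-error above can be wrapped in a pair of small shell functions. This is only a sketch: the names `has_kernel` and `probe_partition` are made up, the check looks for vmlinuz* rather than vm*, and the partition is mounted read-only so that a wrong guess cannot change anything.

```shell
# has_kernel DIR: succeed if DIR/boot holds at least one vmlinuz* file.
has_kernel() {
    for f in "$1"/boot/vmlinuz*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# probe_partition DEV [MNT]: mount DEV read-only on MNT (default /x),
# report whether it has kernel images, and unmount again. Needs root.
probe_partition() {
    dev=$1
    mnt=${2:-/x}
    mount -o ro "$dev" "$mnt" || return 2
    if has_kernel "$mnt"; then
        echo "$dev: kernel images found"
        status=0
    else
        echo "$dev: no kernel images"
        status=1
    fi
    umount "$mnt"
    return "$status"
}

# Usage (as root): probe_partition /dev/sda11; probe_partition /dev/sda5
```

Note that `probe_partition` unmounts the partition when it is done, so the partition you settle on has to be mounted again (read-write) before the later steps.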
Beware: You want to install grub on the drive (e.g. /dev/sda). If you install it on the partition (e.g. /dev/sda6), the grub-install program won’t complain, but the results won’t be what you wanted.
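Since grub-install won't catch this mistake for you, it may help to derive the drive name from the partition name mechanically. The following is a sketch that works purely on the text of the name. It assumes the usual naming conventions (sdXN, plus nvmeXnYpN and mmcblkXpN, which are not discussed in this document), and the name `drive_of` is made up.

```shell
# drive_of PARTITION: print the whole-drive device for a partition name.
# Purely textual; assumes the usual sdXN / nvmeXnYpN / mmcblkXpN naming.
drive_of() {
    case "$1" in
        /dev/nvme*p[0-9]*|/dev/mmcblk*p[0-9]*)
            printf '%s\n' "${1%p[0-9]*}" ;;             # strip the pN suffix
        /dev/nvme*|/dev/mmcblk*)
            printf '%s\n' "$1" ;;                       # already a whole drive
        *)
            printf '%s\n' "$1" | sed 's/[0-9]*$//' ;;   # sda6 -> sda
    esac
}

# Usage: drive=$(drive_of /dev/sda6)    # gives /dev/sda
```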
That’s probably enough to get you going. Shut down the Live Demo system, eject the DVD if any, and reboot in the normal way from your favorite drive (/dev/sda in the example).
Let’s assume you have already carried out the steps in section 1.1. Various possible next steps include:
Note that we are assuming that grub version 2 is being used. It has been several years since anybody used grub version 1, so I’m not going to worry about it.
Fire up the text editor: vi grub-hackconfig.
Find the line that says GRUB_DEVICE=... and change the last thing on the line, so that it probes /x rather than /.
Similarly, find the line that says GRUB_DEVICE_BOOT=... and change the last thing on the line, so that it probes /x/boot rather than /boot. Here is the diff:
    --- grub-mkconfig       2014-03-26 13:51:30.649215458 -0700
    +++ grub-hackconfig     2014-03-26 13:47:31.895892874 -0700
    @@ -136,11 +136,11 @@
     fi

     # Device containing our userland. Typically used for root= parameter.
    -GRUB_DEVICE="`${grub_probe} --target=device /`"
    +GRUB_DEVICE="`${grub_probe} --target=device /x`"
     GRUB_DEVICE_UUID="`${grub_probe} --device ${GRUB_DEVICE} --target=fs_uuid 2> /dev/null`" || true

     # Device containing our /boot partition. Usually the same as GRUB_DEVICE.
    -GRUB_DEVICE_BOOT="`${grub_probe} --target=device /boot`"
    +GRUB_DEVICE_BOOT="`${grub_probe} --target=device /x/boot`"
     GRUB_DEVICE_BOOT_UUID="`${grub_probe} --device ${GRUB_DEVICE_BOOT} --target=fs_uuid 2> /dev/null`" || true

     # Filesystem for the device containing our userland. Used for stuff like
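If you would rather not make the two changes by hand in vi, the same edit can probably be done with sed. This is a sketch, not a tested replacement for the manual procedure: the function name `make_hackconfig` is made up, and it assumes the two assignments look exactly like the ones in the diff, each starting at the beginning of its line.

```shell
# make_hackconfig SRC DEST [MNT]: copy the grub-mkconfig script SRC to DEST,
# changing the two probe targets from / and /boot to MNT and MNT/boot.
# MNT defaults to /x and must not contain the character "|".
make_hackconfig() {
    src=$1
    dest=$2
    mnt=${3:-/x}
    sed -e "s|^\(GRUB_DEVICE=.*--target=device \)/|\1$mnt|" \
        -e "s|^\(GRUB_DEVICE_BOOT=.*--target=device \)/boot|\1$mnt/boot|" \
        "$src" > "$dest" && chmod +x "$dest"
}

# Usage: make_hackconfig grub-mkconfig grub-hackconfig
```

Afterward, diff -u grub-mkconfig grub-hackconfig should show only the two changed lines; if it shows anything else, or nothing at all, fall back to editing by hand.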
Now you really should be ready to shut down the Live Demo system, remove the DVD if any, and reboot in the normal way.
The procedures in section 1.1 were meant to get the system functioning again as quickly as possible. Now that the system is up and running, so that the time pressure is off, we can do some housekeeping:
In ideal situations, the work described in this section doesn’t accomplish much, because it duplicates the work done in section 1.1 and section 1.3. However, consider the situation where the Live Demo system you used to restore the MBR is using a different version of grub than your installed system. Maybe one of the two is out of date, or maybe one of them simply exercised the option to use a different version. This is your chance to install the grub version that your system thinks should be installed. If you don’t do this, you risk having some ugly problems later.
There are several scenarios that can lead to an MBR being overwritten or otherwise rendered unsatisfactory. Examples include:
Suppose you have a dual boot system, i.e. one that sometimes boots Linux and sometimes boots Windows. Every time you install (or reinstall) Windows, it installs its own boot loader into the MBR. This is a problem, because the MS boot loader will not load anything except the MS operating system ... in contrast to grub, which will happily allow you to boot almost anything: Linux, memtest86, various MS products, et cetera.
Some folks recommend installing MS before installing Linux, so that the Linux installation process will set up the MBR for you. This is fine as far as it goes, but it is not always possible. For instance, sometimes it is necessary to reinstall or upgrade the MS stuff, days or months or years after Linux was installed.
The grub-reinstall procedure described in this document takes only a few minutes, so feel free to install MS after Linux if you find it necessary or convenient to do so. MS will trash the MBR, but you can restore it using the techniques described here.
It never hurts to make backups of sector 0.
dd if=/dev/sda of=host1-sda.mbr count=1
Keep in mind that sector zero contains both the stage-0 boot code and the primary partition table. Therefore, before restoring the boot sector, you have to make a decision:
dd if=host1-sda.mbr of=/dev/sda count=1
Keep a copy, just to be safe: dd if=/dev/sda of=damaged.mbr count=1
Grab the good boot code from backup: dd if=host1-sda.mbr bs=1 count=444 > new.mbr
Tack on the current partition table: dd if=/dev/sda bs=1 skip=444 count=68 >> new.mbr
Write to disk: dd if=new.mbr of=/dev/sda count=1
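The splice can be rehearsed on ordinary files before anything is written to the disk. The sketch below uses the same byte counts as the steps above (444 bytes from the backup, then 68 bytes from the current sector) and adds a size check. The function name `splice_mbr` is made up, and the function never touches a device by itself.

```shell
# splice_mbr BACKUP CURRENT OUT: build a 512-byte sector image in OUT from
# the first 444 bytes of BACKUP (boot code) and bytes 444..511 of CURRENT
# (the partition table and the rest of the sector).
splice_mbr() {
    backup=$1
    current=$2
    out=$3
    dd if="$backup" bs=1 count=444 > "$out" 2>/dev/null || return 1
    dd if="$current" bs=1 skip=444 count=68 >> "$out" 2>/dev/null || return 1
    # Refuse to hand back anything that is not exactly one sector long.
    [ "$(wc -c < "$out")" -eq 512 ]
}

# Usage: splice_mbr host1-sda.mbr damaged.mbr new.mbr
```

Here damaged.mbr is the safety copy of the current sector 0 taken in the first step. Only if the function succeeds would you go on to write new.mbr to the drive with dd as shown above.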
Some discussion of the MBR and the basic boot process can be found in reference 1.
We now discuss the step sudo su
For good reasons, when you fire up a typical live CD, you are logged in as an ordinary user, not the superuser.
You can exert superuser privileges on a command-by-command basis by prefixing each command with "sudo" ... but since every command we are about to do requires superuser privileges, it is easier to just become superuser once and for all by saying sudo su
We now discuss the step mkdir /x
This creates a new empty directory named x. The name is arbitrary, made up just for this purpose. You could use any other name if you wanted, so long as you used the name consistently in all steps in the grub-reinstall procedure ... but x is as good as any. It’s just some empty directory. It serves the following purpose: In a moment we are going to want to mount a filesystem. Linux mounts things by mounting them onto a directory. The newly mounted filesystem has to attach to the rest of the filesystem somewhere, and Linux uses a pre-existing directory as a point of attachment.
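If you are worried that /x might already exist and contain something, a small guard can check that before you mount anything on it. This is a sketch with a made-up function name; the only thing it adds to plain mkdir is the emptiness check, on the grounds that mounting over a non-empty directory hides its contents until you unmount.

```shell
# ensure_mount_point [DIR]: create DIR (default /x) if needed, and fail if
# it is not an empty directory, since mounting over it would hide whatever
# is inside.
ensure_mount_point() {
    dir=${1:-/x}
    mkdir -p "$dir" || return 1
    if [ -n "$(ls -A "$dir")" ]; then
        echo "$dir is not empty" >&2
        return 1
    fi
}

# Usage (as root): ensure_mount_point /x
```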
We now discuss the step mount /dev/sda6 /x
Not much to say, really. If you want the operating system to treat your partition as a collection of files and directories (as opposed to a bucket of bits) you need to mount it.
We now discuss the step grub-install --root-directory=/x /dev/sda
The --root-directory=/x option tells grub where to look for the grub directory during the installation process. The grub directory is /x/boot/grub on typical distributions such as Ubuntu and Debian, but may be /x/grub on some *bsd setups.
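To see which of the two layouts you have, you can simply look. The sketch below tries the two locations named above, in that order; the function name `find_grub_dir` is made up, and other layouts are not considered.

```shell
# find_grub_dir [ROOT]: print the grub directory under ROOT (default /x),
# trying ROOT/boot/grub and then ROOT/grub, or fail if neither exists.
find_grub_dir() {
    root=${1:-/x}
    for candidate in "$root/boot/grub" "$root/grub"; do
        if [ -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage: find_grub_dir /x
```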
The grub-install program uses the grub directory in several ways during the installation process. Among other things, it goes there to read the device.map file. It also goes there to write the core.img file. A new core.img file gets written each time you run grub-install.
Keep in mind that the Unix file system is essentially a graph (in the sense of graph theory) with edges and nodes. The edges are the paths, i.e. directory names and file names. The nodes do not have names. The nodes are where the data is stored. So: the inode of interest will be reached by the path "/x" during the installation process. Grub assumes this inode will be reached by the simple path "/" later, when the system on /dev/sda6 is actually booting and running.
The idea that the same inode could be reached by one path now and a different path later makes perfect sense if you think about it the right way. The grub-install program understands the distinction between the two, which is what makes it possible to reinstall grub using the easy procedure described in this document.
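The two-names-for-one-file idea can be made concrete with a few lines of shell. This sketch only manipulates path strings; it illustrates the relationship and makes no claim about how grub-install does it internally. The function name `boot_time_path` is made up.

```shell
# boot_time_path ROOT PATH: translate a path seen during installation
# (under the mount point ROOT) into the path the booted system will use.
boot_time_path() {
    root=${1%/}                  # drop one trailing slash, if any
    case "$2" in
        "$root")   printf '/\n' ;;
        "$root"/*) printf '%s\n' "${2#"$root"}" ;;
        *)         return 1 ;;   # PATH is not under ROOT
    esac
}

# Usage: boot_time_path /x /x/boot/grub/core.img
```

With /x as the mount point, /x/boot/grub/core.img during installation and /boot/grub/core.img after booting are the same file.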
This distinction is, alas, not well documented. You could read the grub manpage all day and not learn anything about this distinction. The grub-install --help message says
--root-directory=DIR install GRUB images under the directory DIR instead of the root directory
which seems somewhere between incomprehensible and self-contradictory. Is DIR the root directory (as suggested by the equation root-directory=DIR) ... or is DIR used "instead of the root directory" (as stated in the explanatory message)? Gaaack.
I hope you never need to know this. Usually the procedures described in section 1.1 make this unnecessary.
Imagine a scenario where grub is installed in the MBR correctly, but the grub configuration files are messed up, so all you get is the grub> prompt (rather than a menu of kernels that can be booted). Further imagine that you can’t fix it using the methods described in section 1.1.
You may be able to recover using the following procedure:
This will give you a listing of all the partitions on the hd0 drive, along with their UUID, filesystem type, and modification date.
If hd0 turns out to be not the drive you want, try hd1 and so on.
This will give you a listing of all the filenames in the boot directory that start with “vml”. (If your kernel isn’t named vmlinuz-something, adapt these instructions accordingly.)
Note that you generally have to add the root=... option to the linux command line.
Beware that the way grub numbers disk drives {hd0, hd1, hd2, etc.} may be different from the way Linux does it {sda, sdb, sdc, etc.} ... and the difference is not systematic. I have one system where hd0 corresponds to /dev/hde. This is commonly an annoyance on systems that have a mixture of SATA and PATA drives.
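The numbering of partitions may also be different, but here the difference is systematic, and it depends on the grub version. Grub version 1 numbers partitions starting from 0, while Linux numbers them starting from 1, so grub-1 partition (...,2) corresponds to Linux partition /dev/...3 and so on. Grub version 2 numbers partitions starting from 1, the same as Linux, so (...,2) corresponds to /dev/...2; only the drive numbers still start from 0.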
This will give you a listing of all the initrd files. Pick the one that corresponds to your kernel, and issue the complete command: initrd /boot/initrd.img-2.6.35.10 or whatever.
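At the grub> prompt you have to do the matching by eye. If you happen to have the partition mounted under a running Linux (for example on /x from the Live Demo system), the same matching can be sketched in shell. This assumes the vmlinuz-VERSION and initrd.img-VERSION naming used in the examples here; the function name `initrd_for` is made up.

```shell
# initrd_for BOOTDIR KERNEL: print the initrd file in BOOTDIR whose version
# suffix matches KERNEL, assuming vmlinuz-VERSION / initrd.img-VERSION naming.
initrd_for() {
    bootdir=$1
    version=${2##*/}               # strip any leading directories
    version=${version#vmlinuz-}    # vmlinuz-3.10.9 -> 3.10.9
    candidate=$bootdir/initrd.img-$version
    [ -f "$candidate" ] && printf '%s\n' "$candidate"
}

# Usage: initrd_for /x/boot vmlinuz-3.10.9
```

If the function prints nothing and fails, there is no initrd with a matching version, which is itself worth knowing before you try to boot that kernel.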
The only way to handle this case is to refer to the disk by its UUID, using a construction of the form root=UUID=4240ce68-802b-4a41-8345-543fad0ec20f
That is an obnoxious amount of typing, but with any luck you only have to do it once.
Grub will tell you the UUID; see the first item in this list.
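To cut down on typing mistakes, the line can be composed ahead of time on any working machine and then copied by hand. The sketch below just glues the pieces together; the function name `grub_linux_line` is made up, and the kernel path and UUID in the usage line are the ones used as examples in this document.

```shell
# grub_linux_line KERNEL UUID: print the "linux" command to type at the
# grub> prompt, referring to the root filesystem by its UUID.
grub_linux_line() {
    printf 'linux %s root=UUID=%s\n' "$1" "$2"
}

# Usage:
#   grub_linux_line /boot/vmlinuz-3.10.9 4240ce68-802b-4a41-8345-543fad0ec20f
# From a running Linux, "blkid -s UUID -o value /dev/sda5" should print the
# UUID of a partition, as an alternative to reading it off the grub listing.
```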