I have spent the last few weeks creating an automated build process for my ESX 3.01 servers.
Half of my servers are in an unmanned remote site - so I wanted to minimise the need to actually visit the site.
I started to look at implementing a PXE boot solution after seeing Shridhar Deuskar's talk at TSX in April.
http://www.vmware-tsx.com/download.php?asset_id=38
Shortly after this Carl Thijssen released UDA 1.4 - a VM appliance that allows you to PXE build various Operating Systems
Mike Laverick worked with Carl to provide an ESX specific version.
http://www.ultimatedeployment.org/uda/index.html
So I then switched to using UDA 1.4 and have been very impressed with it.
My ESX servers are all HP hardware and hence use specific references to cciss disks in the kickstart file - as opposed to generic sda, sdb etc disks:
e.g. part /boot --fstype ext3 --size 250 --ondisk=cciss/c0d0
I suspect that this means that there is little likelihood of me overwriting a SAN disk at build time - but not everyone was convinced...
I certainly didn't want to have to pull the SAN cables each time I needed to rebuild a server - but I also wanted to minimise the possibility of accidentally formatting a SAN LUN.
I found a couple of suggestions in the vmtn forums about disabling the HBA drivers in the initial PXE boot file - initrd.img - but no specific instructions.
While I was on holiday last week "smikeyp" got this to work - he realised that it was also necessary to remove the hba drivers from the netstg2.img file.
http://www.vmware.com/community/thread.jspa?messageID=673351 (then scroll down a little...)
I also used info from GavinJ et al to rebuild my ESX 3.01 ISO
http://www.xtravirt.com/index.php?option=com_content&task=view&id=46&Itemid=67
The notes provided by smikeyp were quite brief, at least for non-linux experts (like myself) - and I encountered quite a few problems trying to do the same thing myself - so I thought I'd provide a more detailed description of the process.
Hopefully it will be useful for some other ESX admins.
Perhaps VMware might even release future versions of ESX with these image files pre-doctored - this would presumably be worthwhile for them?
It would certainly make life easier for their customers.
I'm happy to provide copies of my two amended .img files if anyone wants them - especially if anyone with a website wants to host them?
They would of course be provided with the usual caveats:
Provided "as is" with no guarantees. Test them out and use them at your own risk...etc....
My notes are specific to editing the files on the UDA box - (which I believe is Fedora 5) - but the same concepts will apply no matter where the initrd.img and/or netstg2.img are run from - even for standard CD based installations.
UDA specifically refers to the initrd in /var/public/tftproot/initrd.esx301 - this is just a copy of the standard initrd.img released by VMware on the ESX 3.01 ISO
The referral to the initrd.esx301 is made in the file:
/var/public/tftproot/pxelinux.cfg/default
(This is one of the various places that tftp trawls through, looking for an appropriate boot file to use)
(The boot options in this file are made up by the UDA app concatenating the list of templates created elsewhere in the UDA app)
(Note: the esx301 iso is mounted at boot to /var/public/www/esx/esx301 - but the pxe initrd.img in here is not explicitly used by the UDA app.
However the /var/public/www/esx/esx301/VMware/base/netstg2.img is then explicitly used by Anaconda for the second step in the installation process)
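To confirm which initrd the PXE menu actually points at before changing anything, something like the following could be used. It is only a sketch: list_initrds is a made-up helper name, and it assumes the boot entries use the usual pxelinux "initrd=" form on their append lines.

```shell
# list_initrds is a hypothetical helper (not part of UDA).
# It prints each distinct initrd= value found in a pxelinux config file.
list_initrds() {
    grep -o 'initrd=[^[:space:]]*' "$1" | sed 's/^initrd=//' | sort -u
}

# Example (on the UDA box):
#   list_initrds /var/public/tftproot/pxelinux.cfg/default
```

If the file name printed is not initrd.esx301 then the paths used in the rest of these notes would need adjusting to match.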
This is the actual process to follow to update the initrd.esx301 and the netstg2.img so that neither of them load the HBA drivers:
First backup the /var/public/tftproot/initrd.esx301 - just copy it to /backup
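A minimal sketch of that backup step, with a check that the copy really is identical to the original. backup_file is a made-up helper name and /backup is just the directory used in these notes.

```shell
# backup_file is a hypothetical helper (not part of UDA).
# It copies a file into a backup directory, preserving timestamps,
# then succeeds only if the copy is byte-for-byte identical.
backup_file() {
    src=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    cp -p "$src" "$dest_dir/" || return 1
    cmp -s "$src" "$dest_dir/$(basename "$src")"
}

# Example (as root on the UDA box):
#   backup_file /var/public/tftproot/initrd.esx301 /backup && echo "backup verified"
```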
Then:
create a new directory to work in
\# mkdir /initrd
\# cd /initrd
copy initrd.esx301 file in - renaming it with .gz extension as we need to uncompress it shortly....
\# cp /var/public/tftproot/initrd.esx301 /initrd/initrd.esx301.gz
Uncompress it
\# gunzip initrd.esx301.gz
make an /initrd/extracted dir (we'll mount the initrd on here shortly)
\# mkdir extracted
mount the initrd on /initrd/extracted
\# mount initrd.esx301 extracted -o loop
cd to extracted and check that the mount has worked
\# cd extracted
\# ls
should now see various subdirectories - including a modules dir - where all the drivers are situated.
cd to modules directory and should see five files.... modules.cgz etc....
\# cd modules
\# ls
Four of them are just standard text files and can be edited as is - but modules.cgz is compressed - and needs to be uncompressed before we can amend it
First backup all the initial modules
copy the modules directory to /backup
Now create a new directory to work on modules.cgz in
\# mkdir /initrd/modules
\# cd /initrd/modules
Now uncompress modules.cgz directly to this new directory
\# zcat /initrd/extracted/modules/modules.cgz | cpio -idvm
This will create a directory structure beginning with "2.4.21-37.0.2.ELBOOT/i386" then containing the various driver files
The Emulex ones start with "lpf" and the qlogic ones start with "qla"
You could remove just the drivers for the type of HBAs you have - or to be totally future proof - remove all the HBA drivers....
\# cd 2.4.21-37.0.2.ELBOOT/i386
\# ls
You now see all the driver files (suffixed with .o) - there are 37 files in total
Delete all the Emulex drivers
\# rm -f lpf*
Delete all the Qlogic drivers
\# rm -f qla*
\# ls
To check that the correct drivers have been removed. There should now be only 29 files.
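Rather than counting files by eye, the check could be scripted along these lines. This is a sketch only: list_hba_drivers is a made-up helper, and it assumes (as above) that the HBA drivers are exactly the files whose names start with lpf or qla.

```shell
# list_hba_drivers is a hypothetical helper.
# It prints every file under the given directory whose name starts
# with lpf (Emulex) or qla (Qlogic).
list_hba_drivers() {
    find "$1" -type f \( -name 'lpf*' -o -name 'qla*' \) | sort
}

# Example:
#   cd /initrd/modules/2.4.21-37.0.2.ELBOOT/i386
#   list_hba_drivers .     # before the rm: shows what will be deleted
#   list_hba_drivers .     # after the rm: should print nothing
```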
Now need to repack the files into a new modules.cgz....
\# cd /initrd/modules
\# find . -type f | cpio -o -H crc | gzip -n9 > /initrd/modules.cgz
compare the size of the initial and the new modules.cgz - hopefully the new one will be quite a lot smaller...
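The size comparison can be done with ls -l, or scripted roughly as below. file_size and is_smaller are made-up helper names.

```shell
# file_size and is_smaller are hypothetical helpers.
# file_size prints the size of a file in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# is_smaller succeeds when the first file is smaller than the second.
is_smaller() {
    [ "$(file_size "$1")" -lt "$(file_size "$2")" ]
}

# Example:
#   is_smaller /initrd/modules.cgz /initrd/extracted/modules/modules.cgz \
#       && echo "new modules.cgz is smaller"
```

A smaller file only shows that something was removed. Listing the new archive with zcat /initrd/modules.cgz | cpio -it shows exactly which drivers are left in it.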
Overwrite the initial file with the new file...
\# cp /initrd/modules.cgz /initrd/extracted/modules/modules.cgz
"y" to overwrite when prompted
now need to edit the other four files in the /initrd/extracted/modules directory
\# cd /initrd/extracted/modules
edit modules.info to remove all the references to the emulex and/or qlogic drivers (I used vi - don't use a windows editor such as notepad....)
edit modules.dep in a similar fashion
edit modules.pcimap in a similar fashion
edit pcitable in a similar fashion
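For anyone who would rather not hand-edit in vi, the edits could be roughly scripted as below. Treat it as a sketch under assumptions about the file layouts that should be checked against your own files: that modules.dep, modules.pcimap and pcitable keep each driver reference on a single line, and that modules.info lists each module name on an unindented line followed by indented detail lines. strip_hba_lines and strip_hba_stanzas are made-up helper names.

```shell
# Hypothetical helpers - the file layouts are assumptions, see above.

# Print the file without any line in which a word starts with lpf or qla.
# Intended for modules.dep, modules.pcimap and pcitable.
strip_hba_lines() {
    grep -Ev '(^|[[:space:]":,])(lpf|qla)' "$1"
}

# Print modules.info without the stanzas for lpf*/qla* modules.
# An unindented line starts a stanza; indented lines belong to it.
strip_hba_stanzas() {
    awk '
        /^[^ \t]/ { skip = ($0 ~ /^(lpf|qla)/) }
        !skip
    ' "$1"
}

# Example (write to a new file and review the diff before replacing anything):
#   strip_hba_lines pcitable > /tmp/pcitable.new
#   diff pcitable /tmp/pcitable.new
#   strip_hba_stanzas modules.info > /tmp/modules.info.new
#   diff modules.info /tmp/modules.info.new
```

Both helpers only print to stdout, so nothing is changed until you copy the reviewed output over the original. Note that strip_hba_lines drops the whole line, so a non-HBA module that lists an HBA module as a dependency would lose its entry - the diff is worth reading carefully.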
copy the five new module files to /backup/newmodules just in case we need to amend them in future?
now need to repack the initrd:
create a brand new initrd.esx301 - put it in /tmp for the moment...
\# dd if=/dev/zero of=/tmp/initrd.esx301 bs=1k count=4096
get following output:
4096+0 records in
4096+0 records out
4194304 bytes (4.2 MB) copied, 0.064656 seconds, 64.9 MB/s
\# mke2fs -i 1024 -b 1024 -m 5 -F -v /tmp/initrd.esx301
get following output:
mke2fs 1.38 (30-Jun-2005)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
4096 inodes, 4096 blocks
204 blocks (4.98%) reserved for the super user
First data block=1
Maximum filesystem blocks=4194304
1 block group
8192 blocks per group, 8192 fragments per group
4096 inodes per group
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 20 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
create a new directory to mount this initrd.esx301 image file on
\# mkdir /initrd/extractednew
mount this new (empty) filesystem on the new directory we just created:
\# mount /tmp/initrd.esx301 /initrd/extractednew -t ext2 -o loop
copy in the files.....
\# cp -av /initrd/extracted/* /initrd/extractednew
Found out that doing a cp -av /???/* does not copy any files of the form .filename
Unfortunately this meant that the .buildstamp file has not been copied
This means that Anaconda will complain that the directory trees do not match when the netstg2 is finally loaded
To get round it manually copy the .buildstamp too
(could alternatively use tar -C to ensure that all similar files copied too....)
\# cp -av /initrd/extracted/.buildstamp /initrd/extractednew/
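Another way round the dot-file problem is to copy "directory/." instead of "directory/*", since the shell never gets the chance to skip the hidden names. A small sketch, with copy_tree as a made-up helper name:

```shell
# copy_tree is a hypothetical helper.
# The trailing "/." makes cp copy the contents of the source directory,
# including hidden files such as .buildstamp, into an existing target.
copy_tree() {
    cp -a "$1/." "$2/"
}

# Example:
#   copy_tree /initrd/extracted /initrd/extractednew
```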
unmount it to write all cached files... (if you are in the dir currently you need to cd .. first so it's not busy)
\# umount /initrd/extractednew
\# gzip --best /tmp/initrd.esx301
\# cd /tmp
get rid of the .gz suffix
\# mv initrd.esx301.gz initrd.esx301
copy the new one to /backup/initrd.esx301.newnosan - just for a backup...
\# cp /tmp/initrd.esx301 /backup/initrd.esx301.newnosan
Also copy it back to the place that it needs to be for UDA to actually use it - /var/public/tftproot
\# cp /tmp/initrd.esx301 /var/public/tftproot/
y to overwrite
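Since the finished initrd is a gzip file without a .gz suffix, a quick sanity check before the first PXE boot could look like this. is_valid_gzip is a made-up helper name, and it only proves that the compression is intact, not that the contents are right.

```shell
# is_valid_gzip is a hypothetical helper.
# It reads the file on stdin so the missing .gz suffix does not matter,
# and succeeds only if gzip can decompress the whole stream.
is_valid_gzip() {
    gzip -t < "$1" 2>/dev/null
}

# Example:
#   is_valid_gzip /var/public/tftproot/initrd.esx301 && echo "initrd looks OK"
```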
Well we're half way there now... we just need to do a similar thing to the netstg2.img now....
The esx301 iso is automatically mounted by the UDA app on /var/public/www/esx/esx301
The particular file that we are interested in is the one that Anaconda loads directly after the initrd.esx301
/var/public/www/esx/esx301/VMware/base/netstg2.img
Copy the initial one to /backup/ - just in case we ever need it again...
create a new directory to work in
\# mkdir /netstg2
\# cd /netstg2
copy the netstg2.img file in ....
\# cp /var/public/www/esx/esx301/VMware/base/netstg2.img /netstg2/netstg2.img
make an /netstg2/extracted dir (we'll mount the netstg2.img on here shortly)
\# mkdir extracted
mount the netstg2.img on /netstg2/extracted
\# mount netstg2.img extracted -o loop
cd to extracted and check the mount has worked
\# cd extracted
\# ls
you should now see four subdirectories - including /modules - where all the drivers are situated.
I was hoping that all five module files would be identical to the ones in the initrd - but they appear to be slightly different
First backup all the initial modules...
copy the contents of the modules directory to /backup/netmodules
Now create a new directory to work on modules.cgz in
\# mkdir /netstg2/modules
\# cd /netstg2/modules
Now uncompress modules.cgz directly to this new directory
\# zcat /netstg2/extracted/modules/modules.cgz | cpio -idvm
This will create a directory structure beginning with "2.4.21-37.0.2.ELBOOT/i386" then containing the various driver files
The Emulex ones start with "lpf" and the qlogic ones start with "qla"
You could remove just the ones you have - or to be totally future proof - remove all the HBA drivers....
\# cd 2.4.21-37.0.2.ELBOOT/i386
\# ls
You now see all the driver files (suffixed with .o) - there are 38 files in total
Delete all the Emulex drivers
\# rm -f lpf*
Delete all the Qlogic drivers
\# rm -f qla*
\# ls
To check that the correct drivers have been removed. There should now be only 30 files.
Now need to repack the files into a new modules.cgz....
\# cd /netstg2/modules
\# find . -type f | cpio -o -H crc | gzip -n9 > /netstg2/modules.cgz
compare the size of the initial and the new modules.cgz - hopefully the new one will be quite a lot smaller...
I tried to overwrite the initial file with the new file...
via cp /netstg2/modules.cgz /netstg2/extracted/modules/modules.cgz
Unfortunately this fails as the filesystem seems to be mounted readonly - do the following workaround
\# cd /netstg2
\# mkdir extracted_new
use the following command to copy all the mounted data to a new directory
(It's better than copy as it explicitly does all types of files)
\# tar -C /netstg2/extracted/ -cmf - . | tar -C /netstg2/extracted_new/ -xmf -
can now copy the newly edited modules.cgz into this new directory structure:
\# cp /netstg2/modules.cgz /netstg2/extracted_new/modules/modules.cgz
now need to edit the other four files in the /netstg2/extracted_new/modules directory
\# cd /netstg2/extracted_new/modules
edit modules.info to remove all the references to the emulex and/or qlogic drivers (I used vi - don't use a windows editor such as notepad....)
edit modules.dep in a similar fashion
edit modules.pcimap in a similar fashion
edit pcitable in a similar fashion
copy the five new module files to /backup/newnetmodules just in case we need to amend them in future?
Now need to repack the netstg2.img...
\# cd /netstg2
Following command works on an ESX box - but will not work on Fedora (which is what the UDA box is)
mkcramfs /netstg2/extracted_new/ /netstg2/netstg2.img.new
Hence we need to use a slightly different syntax:
\# mkfs.cramfs /netstg2/extracted_new/ /netstg2/netstg2.img.nosan
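If the same notes are to be followed on both kinds of box, the tool name could be picked automatically. A sketch, with first_available as a made-up helper name:

```shell
# first_available is a hypothetical helper.
# It prints the first of the given command names that exists on this
# machine, and fails if none of them do.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# Example:
#   tool=$(first_available mkcramfs mkfs.cramfs) &&
#       "$tool" /netstg2/extracted_new/ /netstg2/netstg2.img.nosan
```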
compare the sizes of the original and the new one - the new one should be slightly smaller...
copy the new one to /backup/netstg2.img.nosan - just for a backup...
Unfortunately we cannot just copy this file back to the place that it needs to be for UDA to actually use it - /var/public/www/esx/esx301/VMware/base/ - as this is a readonly mounted file system (mounted at UDA boot time directly from the esx 301 iso).
So we need to actually add it to the ESX 3.01 iso - which can then be remounted in the correct place....
copy the directory structure from under /var/public/www/esx/esx301/ to somewhere with enough free space.
In my case to /var/public/smbmount/DISK2/esx301nosan/ (I've added a second disk to my UDA box with plenty of space on it)
Then copy netstg2.img.nosan over the existing /var/public/smbmount/DISK2/esx301nosan/VMware/base/netstg2.img
Then need to use mkisofs to recreate the ISO.
Unfortunately the mkisofs utility is not installed on the UDA box by default...
Downloaded it from:
http://rpmfind.net/linux/rpm2html/search.php?query=%2Fusr%2Fbin%2Fmkisofs
Selected the Fedora Core 5 for i386 option
Copied it to the UDA server - (via veeam fast scp or winscp etc....)
Installed it on the UDA box via:
cd to directory where it is then:
\# rpm -i mkisofs-2.01.01.0.a03-3.i386.rpm
(got a warning - but seems to work OK)
Can now create the ISO
cd to the appropriate directory as the mkisofs parameters are relative...
\# cd /var/public/smbmount/DISK2/esx301nosan/
\# mkisofs -l -J -R -r -T -o /var/public/smbmount/DISK2/esx-3.0.1-32039new.iso -b isolinux/isolinux.bin -c isolinux/boot.cat -no-emul-boot -boot-load-size 4 -boot-info-table /var/public/smbmount/DISK2/esx301nosan
Chugs along for a while creating the new ISO.....
Now unmount the existing old esx 301 ISO from /var/public/www/esx/esx301
\# umount /var/public/www/esx/esx301
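now rename the old ISO (cd to the directory that holds the ISOs first - the new one was written to /var/public/smbmount/DISK2/)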
\# mv esx-3.0.1-32039.iso esx-3.0.1-32039old.iso
rename the new ISO
\# mv esx-3.0.1-32039new.iso esx-3.0.1-32039.iso
All that is left to do now is to remount the new ISO
Can do that in at least three ways - via the UDA app, via a mount command, or by rebooting the UDA server
I used the UDA app
check it is mounted with
\# mount
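Instead of reading through the whole mount output, the check could be narrowed down to the one mount point. A sketch only: is_mounted is a made-up helper, and it assumes the mount table is in the usual /proc/mounts layout with the mount point in the second column.

```shell
# is_mounted is a hypothetical helper.
# It succeeds if the given mount point appears in the second column of
# the mount table (defaults to /proc/mounts, another file can be passed
# as a second argument).
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example:
#   is_mounted /var/public/www/esx/esx301 && echo "ISO is mounted"
```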
Now we're ready to test out the changes....
As I hoped the UDA PXE boot menu appeared - I selected an appropriate kickstart option for my environment
The build process then started, the lpfcdd_732 emulex hba driver did not load and my build completed as normal....
Hurray - it finally works...
Hopefully the notes above are correct - but it's quite possible that I've made the odd typo, missed out a directory or omitted a "cd" to a particular directory somewhere etc....
I wrote the instructions as I did it, but I haven't gone through the process a second time to check them...
If you find any problems - please let me know.
Hope it's useful.
Dinny