Tag Archives: Vmware
Cleaning up snapshots (inc. failed VCBs) on VC hosted on 2008 R2

Hi,

I recently upgraded my VirtualCenter server to 4.0.1 running on 2008 R2. I’m very happy with it, except that the old way of running scheduled PowerShell tasks on 2003 no longer works. This is my script to delete all snapshots older than 7 days:

Add-PSSnapin VMware.VimAutomation.Core
Connect-VIServer VCservername -User UserName -Password Password

Get-VM | Get-Snapshot | Where { $_.Created -lt (Get-Date).AddDays(-7)} | remove-snapshot -confirm:$false

The way to run this script (assuming it’s called Delete-Snapshots.ps1 and sits in C:\Windows\Scripts) is to go into the Actions tab of Task Scheduler and change the program settings to:

  • Program/script: %SystemRoot%\system32\WindowsPowerShell\v1.0\powershell.exe
  • Add arguments (optional): -noninteractive -nologo c:\windows\scripts\delete-snapshots.ps1

Alternatively, here is the XML file dump you can import into Task Scheduler (delete-snapshots.xml). Obviously you’ll need to change the username the task runs as.

Ta,

Leo

vCenter 4.1 upgrade causing you SSL certificate headaches?

Recently, a customer of mine decided that it was time to upgrade to vCenter 4.1. He took a backup of the database and blew away the 32-bit Windows server. He then installed Windows 2008 R2 and SQL Server Express 2008 R2 with Advanced Services, imported the DB back in, created a 64-bit DSN, and started installing vCenter. At that point you choose to upgrade the database to 4.1…

… only to be told that vCenter won’t install without your old certificates in the All Users profile which can be found in

  • Windows 2000/2003/XP: c:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\SSL\
  • Windows 2008/2008 R2/Vista/7: c:\programdata\VMware\VMware VirtualCenter\SSL

VMware’s vCenter installer says you can’t do it unless you have a certificate backup or have already imported the certificates into the new server at the default locations (above).

That’s a lie. You can do it. You just have to accept that it’s going to be a little bit more manual and you’ll have to reconnect all your ESX/i hosts afterwards.

Here’s how to do it:

  • Blow away your restored database (but keep the backup)
  • Create a new database with a SQL db_owner user identical to the username of the SQL db_owner user on the old server.
  • Create a new SQL 64-bit DSN
  • Install vCenter as a brand new install.
  • Stop the VMware VirtualCenter service
  • Stop the VMware WebServices service
  • Open SQL Management Studio and restore the backed up DB over the top of the new DB
  • If you are using SQL authentication, open a new query window and execute the following statement:
    EXEC sp_change_users_login 'Auto_Fix', 'vmware'
    This marries the orphaned SQL db_owner user in the database (assumed here to be called vmware) to the newly created login from the second step above.
  • Open the command prompt and in the installation directory of VirtualCenter, run the following:
    vpxd -p
    You will be asked to input a password – use the same password as that of the SQL db_owner user for your database. This re-initializes/re-encrypts the database with the newly installed certificates and removes references to the old certificates which got blown away during the OS reinstall.

At this point you can fire up your vCenter and reconnect all your hosts – this is a manual task but once finished, everything will be fine.

What I haven’t tested is the effect of doing this on plugins such as Update Manager that share column space in the vCenter DB (yes, you can separate the DBs, but not everyone does). Thankfully, a quick Google search turned up this document in VMware’s KB which might help. Note that it deals with the password defaults for certificates, not the password defaults for the DB, so I might be leading you up the wrong alley.

Nonetheless, this neatly avoids the issue of restores, or having to re-implement all the customisations in your VC (including dvSwitches).
Cheers,

Leo

Performing SAN fabric maintenance on ESX

Hi,

We have close to 50 LUNs presented to 40 hosts at a client site. We also need to perform maintenance which will cause downtime on our core FC switch. This means there will be a massive failover and potential path thrashing due to lack of path balance on all VM hosts at once.

Here’s what we have:

ESX Server with 2 HBAs -> 4 Paths -> Fixed path policy -> Active/Active Storage Array

Each HBA has 2 paths to the Active/Active array. This means I should be able to move the preferred path of every LUN onto the two paths of a single HBA, which frees up the other HBA for the maintenance.

So here’s how we do it:

#!/bin/bash
COUNTER=1
for LUN in $(esxcfg-mpath -l | grep "has 4 paths" | awk '{print $2}')
   do
      esxcfg-mpath --lun=${LUN} --path=$(esxcfg-mpath -q --lun=${LUN} | grep FC | awk '{print $4}' | awk '{print NR "S\t " $0}' | grep ${COUNTER}S | awk '{print $2}') --preferred
      COUNT=`expr ${COUNTER} + 1`
      COUNTER=${COUNT}
      if [[ ${COUNTER} -gt 2 ]]
      then
         COUNTER="1"
      fi
   done
for HBA in `esxcfg-info -w | grep vmhba | awk '{print $3}' | grep -e 'vmhba\+[1-9]' -o`
   do
      esxcfg-rescan $HBA
   done
/usr/bin/vmware-vim-cmd hostsvc/storage/refresh

In the above case, what you’re seeing is a load balance across paths 1 and 2 for each LUN, which belong to the lowest-numbered HBA (esxcfg-mpath sorts HBA-path configs from lowest HBA number to highest HBA number).

Then there is an esxcfg-rescan operation on all HBAs of the host and a storage refresh. At this point, all your paths are on the two paths of the first HBA.

If you want to take down the first HBA and move all paths to the second HBA, it’s simply a slight script modification: start the COUNTER variable at 3 and wrap it back to 3 once it passes 4:

#!/bin/bash
COUNTER=3
for LUN in $(esxcfg-mpath -l | grep "has 4 paths" | awk '{print $2}')
   do
      esxcfg-mpath --lun=${LUN} --path=$(esxcfg-mpath -q --lun=${LUN} | grep FC | awk '{print $4}' | awk '{print NR "S\t " $0}' | grep ${COUNTER}S | awk '{print $2}') --preferred
      COUNT=`expr ${COUNTER} + 1`
      COUNTER=${COUNT}
      if [[ ${COUNTER} -gt 4 ]]
      then
         COUNTER="3"
      fi
   done
for HBA in `esxcfg-info -w | grep vmhba | awk '{print $3}' | grep -e 'vmhba\+[1-9]' -o`
   do
      esxcfg-rescan $HBA
   done
/usr/bin/vmware-vim-cmd hostsvc/storage/refresh

In this case, the paths will vary between paths 3 and 4 which represent the two paths of the second HBA listed by esxcfg-mpath.
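
The two scripts differ only in the range the counter cycles through, so that range could be made a parameter. Here is a minimal sketch of just the counter logic. The names next_index and assign_paths and the LUN names are hypothetical and not part of the original scripts, and nothing here calls esxcfg-mpath, so it is safe to run anywhere as a dry run:

```shell
#!/bin/bash
# Hypothetical helper: advance a path index, wrapping from LAST back to FIRST.
next_index() {
   local current=$1 first=$2 last=$3
   current=$((current + 1))
   if [ "$current" -gt "$last" ]; then
      current=$first
   fi
   echo "$current"
}

# Dry run: show which path index each LUN would be given.
# first=1 last=2 matches the first script, first=3 last=4 the second.
assign_paths() {
   local first=$1 last=$2
   shift 2
   local index=$first lun
   for lun in "$@"; do
      echo "$lun -> path $index"
      index=$(next_index "$index" "$first" "$last")
   done
}

# Example LUN names are placeholders, not real esxcfg-mpath output.
assign_paths 3 4 lunA lunB lunC
```

In a real run, the echo inside the loop would be replaced by the esxcfg-mpath call from the scripts above.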

Thanks to Duncan Epping of yellow-bricks.com for some of the code.

Cheers,

Leo

VCDX progress update

Just to let you know, my VCDX application has been accepted – I’m defending my mammoth design in Melbourne on the 6th of July…

Wish me luck…

A new VMFS driver for Linux systems for recovery purposes.

As an update to this post, the vmfs-tools project has been operating for a while and, unlike the previous driver, these guys have made it possible to read VMFS extents.

Cheers,

Leo

Quick heads up re. Dell R610 and Dell R710 servers for vSphere

Hi,

Just a quick piece of advice regarding Dell R610s and Dell R710s in vSphere: when setting up the BIOS of the server, make sure to leave Node Interleaving disabled.

Enabling it messes with ESX’s own NUMA settings (Configuration -> Advanced Settings -> Numa) and significantly reduces the speed of the ESX server.

Reference

Cheers,

Leo

Update: Linux VM template best practices

I had meant to post this sooner than now but unfortunately work is keeping me occupied 28×7 (that’s not a typo!)

This is a partial update to this post.

Attila seems to be a scripting whiz: he’s changed my dd script so that it no longer relies on bc, and added better tunable parameters (ddsleep, ddbs and ddsize). Then he emailed it to me to share with you! He has tested the default values on some systems and they seemed good: progress was acceptable and it didn’t generate a high load, so the system remained very responsive (in fact the impact was barely noticeable):

#!/bin/bash

# by default we skip /
doroot=0

# we want to keep 5 percent free
freepercent=5

# we want to keep at least 100 megabytes free
freespace=100m

# low priority
ddnice=19

# the file to create
file=.vtools-zerovmdk.bin

# the amount of time to sleep between two dd
ddsleep=0.4

# the block size to use with dd
ddbs=64k

# the size by which to increment the file in each turn
ddsize=10m

# if the file is already there, we continue to grow it
resume=0

fss=

while [ $# -gt 0 ]; do

	case $1 in
		--do-root)
			doroot=1
			shift 1
			;;

		--resume)
			resume=1
			shift 1
			;;

		--free-percent)
			freepercent=$2
			shift 2
			;;

		--free-space)
			freespace=$2
			shift 2
			;;

		--dd-nice)
			ddnice=$2
			shift 2
			;;

		--file)
			file=$2
			shift 2
			;;

		--dd-sleep)
			ddsleep=$2
			shift 2
			;;

		--dd-bs)
			ddbs=$2
			shift 2
			;;

		--dd-size)
			ddsize=$2
			shift 2
			;;

		--fss)
			fss=$2
			shift 2
			;;

		*)
			echo "Invalid parameter: $1"
			exit 1
			;;

	esac

done

function resolve_size
{
	local size=$1
	local m=0

	# accept only digits with an optional k, m or g suffix
	if ! (echo "$size" | grep -Eq '^[0-9]+[kmg]?$'); then
		echo "Invalid size: $size" >&2
		exit 1
	fi

	case $size in
		*g)
			m=3
			;;
		*m)
			m=2
			;;
		*k)
			m=1
			;;
	esac

	if [ $m -gt 0 ]; then
		local l=${#size}
		l=$(( $l - 1 ))
		size=${size:0:$l}
		local i=0
		while [ $i -lt $m ]; do
			size=$(( $size * 1024 ))
			i=$(( $i + 1 ))
		done
	fi
	echo $size
}

function calc_percent
{
	# integer percentage: ($1 * $2) / 100
	echo $(( $1 * $2 / 100 ))
}

function calc_fsfree
{
	# -P keeps each filesystem on one line even when the device name is long
	df -P -B 1 "$1" | tail -n 1 | awk '{print $4}'
}

function calc_fsmax
{
	df -P -B 1 "$1" | tail -n 1 | awk '{print $2}'
}

freespace=`resolve_size $freespace`
ddbs=`resolve_size $ddbs`
ddsize=`resolve_size $ddsize`

test -z "$ddnice" && ddnice=`nice`

if [ ! -z "$fss" ]; then
	doroot=1
else
	fss=`mount | egrep "type (ext3|ext2|xfs)" | awk -F" " '{ print $3 }'`
fi

for fs in $fss; do

	if [ ! -d "$fs" ]; then
		echo "No such directory: $fs" >&2
		exit 1
	fi

	if [ "$fs" = "/" ]; then
		test "$doroot" = "0" && continue
	fi

	if [ $resume -eq 0 -a -f "$fs/$file" ]; then
		echo "File exists: $fs/$file" >&2
		exit 1
	fi

	fsmax=`calc_fsmax $fs`
	fskeep=`calc_percent $fsmax $freepercent`
	if [ $freespace -gt $fskeep ]; then
		fskeep=$freespace
	fi

	fsfree=`calc_fsfree $fs`
	fszero=$(($fsfree - $fskeep))

	#
	# If the last chunk does not reach
	# the amount of space we need to fill,
	# we could enter an endless loop, so flag it
	last=0

	while [ $fszero -ge 0 ]; do

		test $last -eq 1 && break

		ddcnt=$ddsize

		if [ $fszero -lt $ddsize ]; then
			ddcnt=$fszero
			last=1
		fi

		# this is the step where we may lose precision (see 'last' variable)
		ddcnt=$(( $ddcnt / $ddbs ))

		nice -n $ddnice dd count=$ddcnt bs=$ddbs if=/dev/zero of="$fs/$file" oflag=append \
conv=notrunc status=noxfer 2>&1 | sed '/^[0-9]\++[0-9]\+.*\(records\|rekord\)/d' >&2

		test ! -z "$ddsleep" && sleep $ddsleep

		fsfree=`calc_fsfree $fs`
		fszero=$(($fsfree - $fskeep))
	done

	nice -n $ddnice rm -f "$fs/$file"
done

The script is also downloadable from here.
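
To make the sizing logic easier to follow, here is a small sketch of the arithmetic the script performs for each filesystem: keep the larger of the percentage reserve and the absolute reserve, then zero whatever free space is left above that. The function name bytes_to_zero and the sample figures are assumptions for illustration only. Unlike the script, which simply skips its loop when the result is negative, this sketch clamps the result at zero.

```shell
#!/bin/bash
# Hypothetical helper mirroring the script's arithmetic (all values in bytes).
# $1 = filesystem size, $2 = free space now,
# $3 = percent to keep free, $4 = minimum bytes to keep free
bytes_to_zero() {
	local fsmax=$1 fsfree=$2 percent=$3 minfree=$4
	local keep=$(( fsmax * percent / 100 ))
	# the absolute reserve wins if it is larger than the percentage reserve
	if [ "$minfree" -gt "$keep" ]; then
		keep=$minfree
	fi
	local zero=$(( fsfree - keep ))
	# nothing to do if free space is already below the reserve
	if [ "$zero" -lt 0 ]; then
		zero=0
	fi
	echo "$zero"
}

# Sample figures: 10 GiB filesystem, 2 GiB free, keep 5 percent or 100 MiB
bytes_to_zero 10737418240 2147483648 5 104857600
```

With those sample figures the 5 percent reserve (512 MiB) is larger than the 100 MiB minimum, so the percentage reserve is the one that applies.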

Attila, many thanks.

Cheers,

Leo

Blowing my own horn

Just to let committed blog readers know, I’ve just passed the first stage of the VCDX exam.

I’ve also passed my VCP on vSphere. Two things I took away:

  • Do not go into the VCDX unprepared – your arse will be handed back to you on a platter. It is not an easy exam.
  • Anyone who is a VCP on VI3.5 only needs to read up on the vSphere Configuration Maximums to pass with their hands tied behind their back. There are some questions on VMDirectPath I/O and vShield/VMsafe, but not too many.

Cheers,

Leo

Some heads up for vCenter 4

First of all, apologies for not posting for a very long time. I’ve been busy setting up my Cloud Computing service. I’ll write about that some other time.

For now, I have some recent experiences to share with vCenter:

  • vCenter 4.0 works on Windows 2008 R2, but the vSphere client does not unless a workaround is used.
  • If you plan on using the Cisco Nexus 1000V as part of your vSphere Enterprise Plus package, vCenter must be able to run on port 80 – this means, no IIS on the vCenter box.
  • If upgrading from VirtualCenter 2.5.x and moving servers, make sure to keep a backup of the SSL certs in C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\SSL – if they are not copied over the ones newly installed by the vCenter installer, the vCenter Hardware Status and the vCenter Service Status will fail to enable.
  • If installing on an x64 Windows box, you still need a 32-bit ODBC connection to your SQL DB. Go to Start, Run, type %SystemRoot%\SysWOW64\odbcad32.exe (the 32-bit ODBC administrator; plain odbcad32.exe opens the 64-bit one) and enter your details.

Cheers,

Leo

Upgrading Dell Firmware as part of your ks.cfg

If you own Dell Servers and run your ESX farms on them, you’ll know what a pain in the arse it is to update the firmware.

What if you could do it as part of your install script, or even deploy it in the future as a script to be run on all boxes?

I’ve got the solution for that, because frankly, the Dell TechCenter one didn’t suit my purposes (mounting/lwp-downloading a 1.9GB ISO to each server to upgrade 30MB worth of updates is a serious waste of time and bandwidth across sites).

The only requirement is that Dell’s OMSA is installed and running on the ESX server.

So here’s my script:

#!/bin/sh

# Allowing outbound access through the ESX firewall
esxcfg-firewall --allowOutgoing

# Downloading Dell firmware/BIOS updates
lwp-download http://149.171.186.10/Dell/Toolkit/Systems/PE6950/DRAC.BIN /tmp/DRAC.BIN
lwp-download http://149.171.186.10/Dell/Toolkit/Systems/PE6950/BIOS.BIN /tmp/BIOS.BIN
lwp-download http://149.171.186.10/Dell/Toolkit/Systems/PE6950/RAID.BIN /tmp/RAID.BIN

# Disabling outbound access through the ESX firewall
esxcfg-firewall --blockOutgoing

# Make sure firmware/BIOS scripts are executable
chmod a+x /tmp/*.BIN

# Attach virtual media for DRAC update
racadm config -g cfgRacVirtual -o cfgVirMediaAttached 1
export rawdevice=`dmesg | tail -n30 | grep "32768 512-byte hdwr" | awk -F" " '{print $3}' | sed "s/://g"`
raw /dev/raw/raw1 /dev/$rawdevice

# Run the updates
for i in /tmp/*.BIN ; do $i -q ; done

# Delete the updates
rm -f /tmp/*.BIN

# Detach the virtual media for DRAC update
racadm config -g cfgRacVirtual -o cfgVirMediaAttached 0

The file is likewise attached.

Obviously, in the lwp-download section, put in your own URL for the .BIN files that comprise the Dell updates.
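
One way to make that easier is to keep the server address and model in variables, so only two lines change per site. A minimal sketch follows. The variable names BASE_URL and MODEL, the function firmware_url and the hostname repo.example.com are hypothetical placeholders, and the loop only prints the lwp-download commands rather than running them:

```shell
#!/bin/sh
# Placeholder values: substitute your own repository and server model.
BASE_URL="http://repo.example.com/Dell/Toolkit/Systems"
MODEL="PE6950"

# Build the download URL for one update package.
firmware_url() {
   echo "${BASE_URL}/${MODEL}/$1"
}

# Dry run: print the commands the install script would execute.
for pkg in DRAC.BIN BIOS.BIN RAID.BIN; do
   echo "lwp-download $(firmware_url "$pkg") /tmp/$pkg"
done
```

Once the printed commands look right, dropping the echo inside the loop would turn the dry run into the real downloads.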

Cheers,
Leo
