This must be read in conjunction with this post.
Fast I/O fencing needs to be enabled in DB2 pureScale. It is a GPFS feature that depends on your storage firmware or its driver.
In a nutshell, the following procedure must be followed:
1. Generate the prcapdevices file with the tsprinquiry command.
2. Copy the prcapdevices file to the /var/mmfs/etc directory on all nodes.
3. Run mmchconfig usePersistentReserve=yes. This requires the DB2 pureScale cluster (RSCT and GPFS) to be stopped.
The following script is an attempt to automate the process of enabling usePersistentReserve feature of GPFS, which of course depends upon the underlying storage.
Once fast I/O fencing is enabled, you can verify it with the command mmlsconfig | grep Persistent, which shows whether usePersistentReserve is set to yes or no.
Through this script, you can either set or unset this parameter. The script requires two inputs. The first is the instance GPFS name, which you can determine in either of the following ways.
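The verification step can be wrapped in a small helper. This is only a minimal sketch: it assumes mmlsconfig prints one attribute per line as a name followed by its value, and pr_status is a hypothetical name, not a GPFS command. It prints "unset" when the attribute does not appear in the output at all.

```shell
#!/bin/bash
# pr_status (hypothetical helper): read mmlsconfig-style output on stdin and
# print the value of usePersistentReserve, or "unset" if it is not listed.
# Assumption: each attribute is printed as "<name> <value>" on its own line.
pr_status() {
    awk 'tolower($1) == "usepersistentreserve" { print $2; found = 1 }
         END { if (!found) print "unset" }'
}

# Usage on a cluster node (not run here):
#   /usr/lpp/mmfs/bin/mmlsconfig | pr_status
```

Matching on the full attribute name is a little stricter than grep Persistent, which would also match any other line that happens to contain that word.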
1. Look at /etc/fstab and look for the instance GPFS name. The instance GPFS mount point will hold sqllib_shared directory. Normally the default name is db2fs1 but this could be different.
2. db2cluster -cfs -list -filesystem
The script could have figured this out but I thought to ask for it.
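For reference, here is one way the name could be worked out from /etc/fstab. It is a sketch under assumptions: GPFS entries are taken to have gpfs in the type field and a device of the form /dev/<name>, and list_gpfs_fs is a hypothetical helper name. It lists every GPFS file system, so you still have to pick the one whose mount point holds sqllib_shared.

```shell
#!/bin/bash
# list_gpfs_fs [FSTAB] (hypothetical helper): print "<gpfs name> <mount point>"
# for each GPFS entry in an fstab-format file (default: /etc/fstab).
# Assumption: fields are device, mount point, type, options, ... and the
# GPFS device is written as /dev/<name>.
list_gpfs_fs() {
    awk '$1 !~ /^#/ && $3 == "gpfs" {
             name = $1
             sub(/^.*\//, "", name)   # /dev/db2fs1 becomes db2fs1
             print name, $2
         }' "${1:-/etc/fstab}"
}

# Usage (not run here):
#   list_gpfs_fs               # reads /etc/fstab
#   list_gpfs_fs /tmp/fstab    # reads a copy
```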
The second argument is either yes or no which will be used to set or unset the usePersistentReserve parameter.
Example:
# ./enablescsipr <db2_gpfs_name> <yes|no>
# ./enablescsipr db2fs1 yes
Download Script enablescpr
#!/bin/bash
# Vikram Khatri (vikram.khatri@us.ibm.com)
# enable SCSI-3 PR at GPFS level
#
# ./enablescsipr <db2_gpfs_name> <yes|no>
#
# ./enablescsipr db2fs1 yes

MMBIN=/usr/lpp/mmfs/bin
RSCTBIN=/usr/sbin/rsct/bin

if [ "$#" != "2" ]; then
   echo "Usage: $0 <db2_gpfs_name> <yes|no>" 1>&2
   echo "Usage: $0 db2fs1 yes" 1>&2
   echo "       Instance GPFS name should be specified" 1>&2
   echo "       yes - mmchconfig usePersistentReserve=yes will be run" 1>&2
   echo "       no  - mmchconfig usePersistentReserve= will be run" 1>&2
   exit 1
fi

INSTGPFS=$1
SCSIPR=$2

if [ "$SCSIPR" != "yes" -a "$SCSIPR" != "no" ] ; then
   echo "Valid value for second argument is yes or no" 1>&2
   exit 1
fi

## Find db2ls and if not found then exit
which db2ls > /dev/null 2>&1
if [ $? -ne 0 ] ; then
   echo "db2ls not found. Exiting ...." 1>&2
   exit 1
fi

DB2INSTALLDIR=`db2ls -c | awk 'BEGIN {FS=":"} END {print $1}'`
echo "DB2 install dir = $DB2INSTALLDIR"
INSTANCE=`$DB2INSTALLDIR/bin/db2ilist`
echo "DB2 Instance Name = $INSTANCE"
INSTHOME=`cat /etc/passwd | grep $INSTANCE | awk 'BEGIN{FS=":"} {print $6}'`
echo "DB2 Instance Home = $INSTHOME"

## List of unique host names in the cluster
HOSTLIST=`cat $INSTHOME/sqllib/db2nodes.cfg | awk '{print $2}' | sort | uniq`
if [ "$HOSTLIST" == "" ] ; then
   echo "Unable to read db2nodes.cfg file. Looks like GPFS is down" 1>&2
   exit 1
fi

GPFSDISK=`$DB2INSTALLDIR/bin/db2cluster -cfs -list -filesystem $INSTGPFS -disk | grep "(*)" | sed -e 's/.*\///'`
if [ "$GPFSDISK" == "" ] ; then
   echo "Unable to determine GPFS disk name" 1>&2
   echo "Command used to determine the disk name was" 1>&2
   echo "db2cluster -cfs -list -filesystem $INSTGPFS -disk" 1>&2
   exit 1
fi
echo "GPFS Disk Name = $GPFSDISK"

## Generate prcapdevices and copy it to all hosts
TSPRINQ=`$MMBIN/tsprinquiry $GPFSDISK | tr -d ' '`
echo "Writing $TSPRINQ to prcapdevices"
echo $TSPRINQ > prcapdevices

for host in $HOSTLIST
do
   echo "Copying prcapdevices to $host" 1>&2
   scp prcapdevices $host:/var/mmfs/etc/
done

## Stop DB2 members/CFs, then the instance on each host
echo "Stop db2 instance" 1>&2
NODELIST=`cat $INSTHOME/sqllib/db2nodes.cfg | awk '{print $1}'`
for node in $NODELIST
do
   echo "Executing db2stop $node force" 1>&2
   su -l $INSTANCE -c "source ~/.bashrc;db2stop $node force"
done

echo "Sleeping for 10 seconds"
sleep 10

for host in $HOSTLIST
do
   echo "Stop db2 instance on $host" 1>&2
   su -l $INSTANCE -c "source ~/.bashrc;db2stop instance on $host"
done

echo "Sleeping for 5 seconds"
sleep 5

export CT_MANAGEMENT_SCOPE=2

echo "Enter CM into maintenance mode" 1>&2
$DB2INSTALLDIR/bin/db2cluster -cm -enter -maintenance -all
echo "Enter CFS into maintenance mode" 1>&2
$DB2INSTALLDIR/bin/db2cluster -cfs -enter -maintenance -all

echo "Sleeping for 5 seconds"
sleep 5

## Set or unset usePersistentReserve
if [ "$SCSIPR" == "yes" ] ; then
   echo "Running mmchconfig usePersistentReserve=yes" 1>&2
   $MMBIN/mmchconfig usePersistentReserve=yes
else
   echo "Running mmchconfig usePersistentReserve=" 1>&2
   $MMBIN/mmchconfig usePersistentReserve=
fi

echo "Running mmlsnsd -X" 1>&2
$MMBIN/mmlsnsd -X

echo "Exit CM out of maintenance mode" 1>&2
$DB2INSTALLDIR/bin/db2cluster -cm -exit -maintenance
echo "lsrpdomain" 1>&2
$RSCTBIN/lsrpdomain
echo "lsrpnode" 1>&2
$RSCTBIN/lsrpnode -B -Q -P

echo "Sleeping for 10 seconds"
sleep 10

echo "Exit CFS out of maintenance mode" 1>&2
$DB2INSTALLDIR/bin/db2cluster -cfs -exit -maintenance -all
echo "Check Status of CFS hosts" 1>&2
$DB2INSTALLDIR/bin/db2cluster -cfs -list -host -state

## Wait for the instance file system to mount, then restart DB2
DB2MOUNT=`cat /etc/fstab | grep $INSTGPFS | awk '{print $2}'`
for i in {1..200}
do
   echo "$i Waiting for sqllib_shared to become available" >&2
   if [ -e $DB2MOUNT/$INSTANCE/sqllib_shared/db2profile ] ; then
      echo "sqllib_shared has become available now."
      for host in $HOSTLIST
      do
         echo "Start db2 instance on $host" 1>&2
         su -l $INSTANCE -c "source ~/.bashrc;db2start instance on $host"
      done
      echo "db2start 129"
      su -l $INSTANCE -c "source ~/.bashrc;db2start 129"
      echo "db2start 128"
      su -l $INSTANCE -c "source ~/.bashrc;db2start 128"
      echo "db2start"
      su -l $INSTANCE -c "source ~/.bashrc;db2start"
      break;
   fi
   sleep 10
done
Note: If GPFS mount points do not come up after setting usePersistentReserve=yes, unset the entry and bring back your cluster. This is an indication that GPFS was not able to use this with your storage. Your option is to open a PMR and let IBM support handle this for you.
If the GPFS mount points do not come up, run these commands to unset the parameter.
# db2cluster -cm -enter -maintenance -all
# db2cluster -cfs -enter -maintenance -all
# mmchconfig usePersistentReserve=
# db2cluster -cm -exit -maintenance
# db2cluster -cfs -exit -maintenance -all
$ db2start instance on <hostname>
$ db2start
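The script's wait loop ends without any message if sqllib_shared never appears, so it is easy to miss that the mount failed. Here is a minimal sketch of a wait that returns a failure status instead. wait_for_path is a hypothetical helper, and the defaults of 200 tries and 10 seconds simply mirror the loop in the script.

```shell
#!/bin/bash
# wait_for_path PATH [TRIES] [INTERVAL_SECONDS] (hypothetical helper)
# Returns 0 as soon as PATH exists, or 1 after TRIES unsuccessful checks.
wait_for_path() {
    wfp_path=$1
    wfp_tries=${2:-200}
    wfp_interval=${3:-10}
    wfp_i=0
    while [ "$wfp_i" -lt "$wfp_tries" ]; do
        [ -e "$wfp_path" ] && return 0
        sleep "$wfp_interval"
        wfp_i=$((wfp_i + 1))
    done
    return 1
}

# Usage inside the script (not run here):
#   wait_for_path "$DB2MOUNT/$INSTANCE/sqllib_shared/db2profile" 200 10 ||
#       echo "GPFS mount did not come up; consider unsetting usePersistentReserve" 1>&2
```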
Disclaimer: Run this script at your own risk and test it first on a test cluster. This is a one-time job, and you might as well do it step by step rather than just running this script. Reading through the script shows you the required steps.