DB2 pureScale Install Problem Determination
RSCT License Issue
$ db2start
128 SQL1677N DB2START or DB2STOP processing failed due to a DB2 cluster services error

DATA #9 : SQLHA Remote Command Output, PD_TYPE_SQLHA_COMMAND_RESPONSE, 3508 bytes
commandResponse->callRC: 0x00000000
commandResponse->output:
Error: Product license is invalid and needs to be upgraded.

2016-06-30-09.40.31.819767-240 I6015E554 LEVEL: Error
PID : 18164 TID : 140258826409760 PROC : db2start
INSTANCE: db2psc NODE : 000
HOSTNAME: purescale.zinox.com
FUNCTION: DB2 UDB, high avail services, sqlhaVerifyHostLicenses, probe:18163
MESSAGE : The cluster manager license for the host is not ok:
DATA #1 : String, 42 bytes
purescale.zinox.com
DATA #2 : SQLHA_LICENSE_STATUS, PD_TYPE_SQLHA_LICENSE_STATUS, 4 bytes
SQLHA_LICENSE_STATUS_EVALUATION_PERIOD_EXPIRED
You applied the RSCT license using samlicm -i <sam32.lic or sam41.lic>, but you still see the above message in db2diag.log. Even though samlicm -i <license file> did not report any error, the license may still be invalid. This can happen for a variety of reasons that are not known to me. It is therefore always a good idea to check whether the applied license is valid.
# samlicm -t
# echo $?
The first command tests whether the license is valid. The second prints its return code: 0 means the license is valid, and ‘1’ means it is invalid. If it is invalid, download the license file again from the IBM Passport Advantage site and apply it again.
For example:
# samlicm -t
# echo $?
1
# samlicm -i sam32.lic
# samlicm -t
# echo $?
0
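The same check can be wrapped in a small script so that the result is printed in words. This is only a sketch: the meaning of return codes 0 and 1 comes from the text above, while the "unknown" branch and the LICENSE_TEST_CMD override (there only so the function can be tried on a machine without samlicm) are my own assumptions.

```shell
#!/bin/sh
# Map the exit status of `samlicm -t` to a readable result.
# 0 = valid and 1 = invalid come from the text above; any other
# value is reported as "unknown" (an assumption of this sketch).
license_status() {
    case "$1" in
        0) echo "valid" ;;
        1) echo "invalid" ;;
        *) echo "unknown" ;;
    esac
}

# Run the license test and print the interpreted result.
# LICENSE_TEST_CMD is a hypothetical override used only for dry runs;
# by default the real `samlicm -t` is executed.
check_license() {
    ${LICENSE_TEST_CMD:-samlicm -t} >/dev/null 2>&1
    license_status "$?"
}
```

Run check_license as root after samlicm -i. Anything other than "valid" suggests the license file should be downloaded and applied again.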
Reload License
Applying a license does not mean that the running processes know about it. Either reboot the machine so that the license is picked up, or kill the IBM.ConfigRMd process with a plain kill (not kill -9) so that it restarts. Killing the process may or may not work: the critical resource protection method may be invoked, in which case RSCT reboots the server.
# ps -ef | grep -i config
root 1704 6398 0 09:52 pts/0 00:00:00 grep -i config
root 2106 992 0 09:36 ? 00:00:00 /usr/sbin/rsct/bin/IBM.ConfigRMd
# kill 2106
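Picking the PID out of the ps listing by eye is easy to get wrong, since the grep process shows up as well. A minimal sketch of a helper that does it is below; the function name configrm_pids is made up, and it assumes the usual ps -ef layout with the PID in column 2 and the command in column 8.

```shell
#!/bin/sh
# Read `ps -ef` output on stdin and print the PID (column 2) of every
# process whose command (column 8) is the IBM.ConfigRMd binary.
# Matching on the command column skips the `grep` process itself.
configrm_pids() {
    awk '$8 ~ /\/IBM\.ConfigRMd$/ { print $2 }'
}
```

It can then be used as: for pid in $(ps -ef | configrm_pids); do kill "$pid"; done. That sends the default signal, not -9, and the warning above about a possible reboot still applies.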
netmon.cf
If netmon.cf contains entries for a layer 2 network that point to an outside IP address, make sure that each IP address can be pinged through the interface listed with it. For example:
# cd /var/ct/cfg
# cat netmon.cf
!IBQPORTONLY !ALL
!REQD eth0 10.10.120.11
!REQD eth1 192.168.120.11
Ping each IP address through its interface. If ping returns no output, either the interface name or the IP address is wrong, or something has changed since the last good configuration. For example, a NIC may have been replaced so that the interface name changed while the IP address stayed the same.
$ ping -I eth0 10.10.120.11
$ ping -I eth1 192.168.120.11
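With more than a couple of entries, the ping test can be driven straight from netmon.cf. The following is a rough sketch, not a tested tool: the function names and the PING_CMD override are hypothetical, it only looks at !REQD lines of the form shown above, and the ping flags assume the Linux iputils version.

```shell
#!/bin/sh
# Print "interface address" for every !REQD line read on stdin.
reqd_entries() {
    awk '$1 == "!REQD" && NF >= 3 { print $2, $3 }'
}

# Ping each address once through its interface and report the result.
# PING_CMD is a hypothetical override for dry runs; the default flags
# (-c count, -W timeout, -I interface) assume Linux iputils ping.
check_netmon() {
    reqd_entries < "${1:-/var/ct/cfg/netmon.cf}" |
    while read -r iface ip; do
        if ${PING_CMD:-ping} -c 1 -W 2 -I "$iface" "$ip" >/dev/null 2>&1; then
            echo "OK $iface $ip"
        else
            echo "FAIL $iface $ip"
        fi
    done
}
```

Calling check_netmon with no argument reads /var/ct/cfg/netmon.cf. Any FAIL line points at an interface name or IP address worth rechecking.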
If, even after correcting the problem above, you still see the message “Error: A reachable IP address could not be automatically determined” asking you to fix netmon.cf, chances are that a duplicate adapter name is assigned. For example, /etc/sysconfig/network-scripts may contain a redundant ifcfg file that is not mapped to any adapter but carries the same name as one that is. The simple fix is to remove the unwanted ifcfg file.
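One way to spot such a leftover file is to look for the same device name in more than one ifcfg file. The sketch below assumes that "same name" means the same DEVICE= value, which may not cover every case; duplicate_devices is a made-up helper name.

```shell
#!/bin/sh
# Print every DEVICE= value that appears in more than one ifcfg-* file.
# The directory defaults to the path named in the text above.
# Assumption: the adapter name is the value of the DEVICE= line.
duplicate_devices() {
    dir="${1:-/etc/sysconfig/network-scripts}"
    grep -h '^DEVICE=' "$dir"/ifcfg-* 2>/dev/null |
        sed 's/^DEVICE=//' |
        tr -d "\"'" |
        sort | uniq -d
}
```

If it prints a name such as eth0, grep -l 'DEVICE=.*eth0' on the same ifcfg files lists which files carry that name, so the unwanted one can be identified before removing it.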
SSH Key has changed
When a machine is rebuilt and restored from backup, its SSH host key may change. The affected node then stops working, or you see messages like the one below in your db2diag.log file. Fix the SSH keys on all hosts and make sure that ssh works using localhost, the IP address, the FQDN, and the short name.
2016-06-30-08.33.53.397828-240 E2122E2289 LEVEL: Severe
PID : 32035 TID : 140342334629664 PROC : db2cluster
INSTANCE: db2psc NODE : 000
HOSTNAME: purescale.zinox.com
FUNCTION: DB2 UDB, high avail services, sqlhaExecuteCommandLocal, probe:1264
DATA #1 : String, 25 bytes
/var/db2/db2ssh/db2locssh
DATA #2 : String, 21 bytes
root@vpdb202 hostname
DATA #3 : signed integer, 8 bytes
6
DATA #4 : unsigned integer, 4 bytes
32047
DATA #5 : Boolean, 1 bytes
true
DATA #6 : unsigned integer, 8 bytes
853
DATA #7 : SQLHA Remote Command Output, PD_TYPE_SQLHA_COMMAND_RESPONSE, 3508 bytes
commandResponse->callRC: 0x00000000
commandResponse->output:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
c9:96:96:d1:3e:f5:e1:96:0f:b9:9b:64:43:89:0e:63.
Please contact your system administrator.
Add correct host key in /home/db2psc/.ssh/known_hosts to get rid of this message.
Offending key in /home/db2psc/.ssh/known_hosts:8
RSA host key for purescale.zinox.com has changed and you have requested strict checking.
Host key verification failed.
failure - examine the system log on the remotehost for additional information
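Checking ssh against all four names by hand on every host gets tedious, so here is a minimal sketch of how it might be scripted. The function names and the SSH_CMD override are hypothetical, and the short name is simply assumed to be the FQDN up to its first dot.

```shell
#!/bin/sh
# Print the four names the text says must work: localhost, IP address,
# FQDN and short name (assumed to be the FQDN up to its first dot).
ssh_targets() {
    fqdn="$1"
    ip="$2"
    printf '%s\n' localhost "$ip" "$fqdn" "${fqdn%%.*}"
}

# Try a non-interactive ssh to each target and report the result.
# SSH_CMD is a hypothetical override for dry runs. BatchMode=yes makes
# ssh fail instead of prompting; -n keeps it from reading the loop's stdin.
check_ssh() {
    ssh_targets "$1" "$2" | while read -r target; do
        if ${SSH_CMD:-ssh} -n -o BatchMode=yes -o ConnectTimeout=5 \
            "$target" true >/dev/null 2>&1; then
            echo "OK $target"
        else
            echo "FAIL $target"
        fi
    done
}
```

For example, check_ssh purescale.zinox.com <IP address of the host>, run as the instance owner and as root on each host. For a stale entry like the one reported in the log, ssh-keygen -R purescale.zinox.com removes the old key from the current user's known_hosts, after which the new host key can be accepted on the next connection. This only covers the standard OpenSSH known_hosts file named in the log.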