While creating a DB2 pureScale instance on RHEL 7.2, the node can become unresponsive. If you reboot the node and look at
/var/log/messages, you may notice several messages like this:
kernel:BUG: soft lockup - CPU#1 stuck for 23s! [mmfsd:3280]
mmfsd is the GPFS (aka IBM Spectrum Scale) file system daemon, and it appears to be the cause of this CPU soft lockup.
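To see how often the lockup was reported and which process was on the CPU, the log can be summarized with a small helper. This is a minimal sketch: the function name count_soft_lockups is made up for illustration, and it assumes each message ends with the [name:pid] tag shown above.

```shell
#!/bin/sh
# Hypothetical helper: count soft-lockup reports per process name.
# Assumes each matching line ends in "[name:pid]", as in the message above.
count_soft_lockups() {
    log="${1:-/var/log/messages}"
    # Keep only soft-lockup lines, extract the process name from the
    # trailing [name:pid] tag, then count occurrences per name.
    grep 'soft lockup' "$log" \
        | sed -n 's/.*\[\([^]:]*\):[0-9]*\]$/\1/p' \
        | sort | uniq -c
}
```

Calling count_soft_lockups with no argument reads /var/log/messages (root is usually needed); pass another path to inspect a rotated log.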
The other symptom of a soft CPU lockup is a long run queue in the
vmstat output. Look at the first column, ‘r’, under procs. The value of ‘r’ would be very high in this case.
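A quick way to pull the largest ‘r’ value out of a few vmstat samples is sketched below. The function name max_run_queue is an assumption made for this example; it relies only on vmstat printing two header lines followed by rows whose first column is ‘r’.

```shell
#!/bin/sh
# Hypothetical helper: print the largest run-queue value ('r' column)
# found in vmstat output read from standard input.
max_run_queue() {
    # Skip the two header lines; header rows repeated later evaluate to 0.
    awk 'NR > 2 && $1 + 0 > max { max = $1 + 0 } END { print max + 0 }'
}
```

For example, vmstat 1 5 | max_run_queue samples once per second for five seconds. As a rough rule of thumb, a value that stays well above the CPU count reported by nproc means runnable threads are waiting for a CPU.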
It looks like the Supervisor Mode Access Prevention (SMAP) feature of the Intel Xeon v4 processor (Broadwell), combined with Linux kernel 3.7 or later, prevents mmfsd from accessing some memory. SMAP, present in Intel Broadwell and later CPUs (including the Intel Core i7-6820HQ), disallows access from kernel-mode code to user-space memory, a feature aimed at making it harder to exploit software bugs. GPFS runs partly at kernel level, and this feature blocks its kernel-side access to user-space memory, with the result that a soft CPU lockup occurs and the system appears to hang.
This makes the node appear to hang, but it is actually a soft CPU lockup, as seen in the vmstat output above. The soft CPU lockup also causes the long run queue, with the result that the system becomes unresponsive.
The RHEL 7.2 kernel supports the SMAP feature by default. To check whether your CPU has this feature, look at the output from
cat /proc/cpuinfo | grep smap
If you see smap in the flags section, the Supervisor Mode Access Prevention (SMAP) feature is enabled.
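The same check can be wrapped in a function that returns a yes/no exit status, which is handy in a pre-install script. This is only a sketch: cpu_has_smap is a made-up name, and the word match (-w) is there so that a similar flag such as smep is not counted.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the CPU flags list contains "smap".
cpu_has_smap() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Look only at the "flags" lines and match smap as a whole word.
    grep -E '^flags[[:space:]]*:' "$cpuinfo" | grep -qw smap
}
```

Usage: cpu_has_smap && echo "SMAP is active on this node".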
GPFS has fixed this issue in v22.214.171.124, but the version that comes with DB2 11.1 FP 1 is v126.96.36.199. If you are using a later version of DB2, you can find out the version of GPFS that will be installed by looking at the file
spec in the folder
You can disable the
smap feature in the Linux kernel by adding the kernel parameter
nosmap as shown below (for RHEL 7.2).
grub.cfg -> It can be in different places depending on legacy or EFI boot. In my case, this was in
grub.cfg and find the line associated with the system image, such as the line containing
vmlinuz and add the “nosmap” parameter at the end.
Alternatively, you can edit
/etc/default/grub and add
nosmap parameter at the end of line containing
GRUB_CMDLINE_LINUX and then run
grub2-mkconfig -o /boot/grub2/grub.cfg and reboot the system.
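The edit to /etc/default/grub can also be scripted. The sketch below is an illustration, not a tested procedure: add_nosmap is a made-up function name, and it assumes GRUB_CMDLINE_LINUX is a single double-quoted line, which is the usual RHEL 7 layout. It keeps a .bak copy and does nothing if nosmap is already present.

```shell
#!/bin/sh
# Hypothetical helper: append nosmap to GRUB_CMDLINE_LINUX in a grub
# defaults file. Assumes the value is double-quoted on a single line.
add_nosmap() {
    file="${1:-/etc/default/grub}"
    # Do nothing if the parameter is already there (safe to re-run).
    if grep -q '^GRUB_CMDLINE_LINUX=.*nosmap' "$file"; then
        return 0
    fi
    # Keep a backup, then rewrite the file with nosmap inserted just
    # before the closing quote of the GRUB_CMDLINE_LINUX value.
    cp -p "$file" "$file.bak" || return 1
    sed 's/^\(GRUB_CMDLINE_LINUX="[^"]*\)"/\1 nosmap"/' "$file.bak" > "$file"
}
```

After add_nosmap has run as root, regenerate the configuration with grub2-mkconfig as described above. On EFI systems the generated file normally lives under /boot/efi/EFI/redhat/ rather than /boot/grub2/, so check which one your node boots from.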
You can also run the grubby command to add this kernel parameter.
# grubby --update-kernel=ALL --args=nosmap
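After the reboot, it is worth confirming that the running kernel actually received the parameter. A minimal sketch follows, with cmdline_has_nosmap as a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel command line contains nosmap.
cmdline_has_nosmap() {
    cmdline="${1:-/proc/cmdline}"
    # Match nosmap as a whole word on the booted kernel's command line.
    grep -qw nosmap "$cmdline"
}
```

Usage: cmdline_has_nosmap && echo "nosmap is in effect". Before rebooting, grubby --info=ALL lists the args stored for each boot entry, so you can check that the change was recorded.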