About me

As always, thanks for visiting.
Vikram at Lake Louise, Alberta, Canada

This Is My Story:

I happen to travel in North America for my work. I took time off from work and visited Lake Louise, which is around 200 km from Calgary, Alberta, Canada. It is a beautiful lake. The water is so crystal clear that I wanted to take a bath in it, but it was freezing cold. The glacier at the end of the lake is so beautiful that a picture does not convey its serenity. One has to be there to actually feel it.

  • Life spent so far 70%
  • Retirement destination 22%
  • Satisfaction in Life 90%

Super Computer of Karma

I have not yet understood the laws of Karma; I am still trying to gain an understanding of them. “Karmo Ki Gati Ko Koi Nahi Jaan Sakta” (no one can know the course of Karma).

Karmas that bind us

We all know about bad Karmas, but we seldom think that good Karmas are also binding. They bind just as much as bad Karmas do. When I first heard this, I was not able to grasp it.

Naiskarma

Naiskarma Siddhi delivers one from this world. I am trying to learn how to practice Naiskarma – It is extremely difficult.

When we get praise

When someone praises me, I feel happy. That is a clear indication that Naiskarma Siddhi is still far away from me. Have you heard of PTSD? Soldiers would not suffer from PTSD if the Karmas they performed were for the nation and not for themselves.

Recent Blog Posts

Learn what is happening in DB2, plus tips and techniques

DB2 11.1 BLU with DPF Monitoring scripts

I just wrapped up a very large PoC for BLU with DPF with a very high ingestion rate (110K messages per second) as well as a very high volume of point queries (1,500 queries per second). I used the following queries to measure the various parameters and find the optimum tuning.

STMM is off by default for DPF with BLU, but we should still take advantage of some automation.

Rule – 1: If you are using more than one MLN (Multiple Logical Node) per physical host (or LPAR), divide the total memory of the host by the number of MLNs and then do the calculation. For example, if each host has 256 GB RAM and you decide to use 2 MLNs per host, the available memory per MLN is 128 GB. (Even this is a bit of a misnomer: the memory actually available in Linux is usually less than the installed memory, so go with what the free command reports as available rather than what is installed. For example, the installed memory may be 256 GB but Linux reports only 251 GB available, so I would divide 251 GB by the number of MLNs, say 2.)

Rule – 2: Keep INSTANCE_MEMORY at 90-95% of the memory available to the MLN. Keep INSTANCE_MEMORY NON-automatic.

Rule – 3: Keep DATABASE_MEMORY at 90% of INSTANCE_MEMORY. Keep DATABASE_MEMORY AUTOMATIC. (Please note this.)

Rule – 4: Keep SHEAPTHRES_SHR at 40% of DATABASE_MEMORY and SORTHEAP at 1/20 of SHEAPTHRES_SHR. For my workload, I had to keep SHEAPTHRES_SHR at 60% and the ratio at 1/5 because my system did not have sufficient memory to process a very large data set.

Rule – 5: If continuous LOAD is performed, keep UTIL_HEAP_SZ at 5000 pages and AUTOMATIC and let DB2 adjust the memory. There is no need to bump this up if UTIL_HEAP_SZ and DATABASE_MEMORY are AUTOMATIC (even though STMM is turned off).

Rule – 6: Keep the buffer pool at 40% of DATABASE_MEMORY and keep it NON-AUTOMATIC. Adjust this size based on query performance requirements.
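
To make the rules concrete, here is a minimal sketch of the corresponding configuration commands, following the example above (251 GB usable per host and 2 MLNs, so roughly 125 GB per MLN). The script name tune_mln.sh, the database name BLUDB and the buffer pool name BP32K are hypothetical, and every number is illustrative, so recompute them for your own hosts before using anything like this.

 $ cat tune_mln.sh
#!/bin/bash
# Illustrative only: ~125 GB usable per MLN (251 GB host / 2 MLNs); values are in 4 KB pages.

# Rules 1 and 2: INSTANCE_MEMORY at ~90% of the memory per MLN, non-automatic
db2 update dbm cfg using INSTANCE_MEMORY 29491200

# Rule 3: DATABASE_MEMORY at ~90% of INSTANCE_MEMORY, with AUTOMATIC
db2 update db cfg for BLUDB using DATABASE_MEMORY 26542080 AUTOMATIC

# Rule 4: SHEAPTHRES_SHR at 40% of DATABASE_MEMORY, SORTHEAP at 1/20 of SHEAPTHRES_SHR
db2 update db cfg for BLUDB using SHEAPTHRES_SHR 10616832 SORTHEAP 530841

# Rule 5: small UTIL_HEAP_SZ, AUTOMATIC, for continuous LOAD
db2 update db cfg for BLUDB using UTIL_HEAP_SZ 5000 AUTOMATIC

# Rule 6: buffer pool at ~40% of DATABASE_MEMORY, non-automatic
# (SIZE is in the buffer pool's own page size; 1327104 assumes a 32 KB page buffer pool)
db2 connect to BLUDB
db2 "ALTER BUFFERPOOL BP32K SIZE 1327104"
db2 connect reset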

Monitoring queries for BLU with DPF.

 $ cat mondpf
#!/bin/bash

# Total monitoring window in seconds (25200 = 7 hours).
# Use our own variable name rather than the bash built-in SECONDS.
DURATION=25200

db2 connect to <database>

endTime=$(($(date +%s) + DURATION))
while [ $(date +%s) -lt $endTime ]
do
 echo Skew report;
 db2 -tf skew.sql
 echo Sort info;
 db2 -tf sort.sql
 echo logfull info;
 db2 -tf logfull.sql
 echo BP hit ratio;
 db2 -tf bphit.sql
 echo Members info;
 db2 -tf members.sql
 sleep 60
 #db2 "call monreport.dbsummary(60)"
done
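
I usually leave this loop running unattended for the whole test window and capture everything to a file; a minimal way to do that (the output file name is just an example):

 $ chmod +x mondpf
 $ nohup ./mondpf > mondpf_$(date +%Y%m%d_%H%M).out 2>&1 &
 $ tail -f mondpf_*.out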

This skew script is very important in a multi-member setup (either pureScale or DPF) to determine whether any member is slow in writing to or reading from disk. No matter how much monitoring you do of FC card performance, you cannot make out anything by looking at the values in isolation if a particular member is slow in writing. This is one of the most difficult problems to solve. It took me two weeks to figure it out, and this SQL was the saver through which we found the members that were slow in writing and reading. This led us to change the FC cards, and when that was done, the performance gain was 4 times. Through this script, by comparing the direct_write_time of every member, we determined which members' FC cards were slow in writing. It was a big help.

 $ cat skew.sql
select member,
 bp_cur_buffsz,
 direct_writes,
 pool_data_writes,
 direct_write_time,
 pool_write_time,
 pool_col_writes,
 pool_async_col_writes
from TABLE(MON_GET_BUFFERPOOL('',-2))
where bp_name in ('MINBP','MAXBP')
order by member
;
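
To compare the members with a single number, I find it easier to look at the average elapsed time per direct write request rather than at the raw counters. A sketch along those lines; the file name skewavg.sql is mine, it assumes the same MINBP/MAXBP buffer pools as above, and direct_write_time is reported in milliseconds:

 $ cat skewavg.sql
-- Average direct write latency per member; a member whose average sits far
-- above its peers points at a slow FC card or path on that host.
select member,
 direct_write_reqs,
 direct_write_time,
 dec(float(direct_write_time) / nullif(direct_write_reqs, 0), 12, 2) as avg_ms_per_direct_write
from TABLE(MON_GET_BUFFERPOOL('',-2))
where bp_name in ('MINBP','MAXBP')
order by avg_ms_per_direct_write desc
;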

In BLU with DPF, the maximum performance gain comes from a properly sized SHEAPTHRES_SHR and the right ratio between SHEAPTHRES_SHR and SORTHEAP. Proper tuning and adjustment of these values helps, but use this script to know whether sort overflows are occurring, and minimize them by increasing SHEAPTHRES_SHR.

 $ cat sort.sql
echo Monitor Sort Memory Usage - Obtain current and maximum sort usage for the database;

SELECT MEMBER, SORT_SHRHEAP_ALLOCATED,
 SORT_SHRHEAP_TOP
FROM TABLE(MON_GET_DATABASE(-2))
ORDER BY MEMBER
;

echo Extract percentage of sort operations that have spilled and high watermark sort usage;

WITH ops AS (
SELECT
 member, (total_sorts + total_hash_joins + total_hash_grpbys)as sort_ops,
 (sort_overflows + hash_join_overflows + hash_grpby_overflows) as overflows,
 sort_shrheap_top as sort_heap_top
FROM TABLE (mon_get_database(-2))
) SELECT member, sort_ops,
 overflows,
 (overflows * 100) / nullif(sort_ops,0) as pctoverflow,
 sort_heap_top
from ops
order by member;
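
If pctoverflow from the query above stays high under load, the first lever (per Rule 4 above) is a larger SHEAPTHRES_SHR with a matching SORTHEAP. A hedged sketch of the adjustment; the database name BLUDB and the page counts are placeholders:

 $ db2 get db cfg for BLUDB | grep -iE 'SHEAPTHRES_SHR|SORTHEAP'
 $ db2 update db cfg for BLUDB using SHEAPTHRES_SHR 12000000 SORTHEAP 600000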

The log full script is one that each DBA must keep in their pocket at all times. I have seen hundreds of DBAs who simply increase LOGPRIMARY and LOGSECOND whenever they run into a LOG FULL condition, without figuring out what is causing it. Find the offending application that is holding a log file hostage and thereby leading to the LOG FULL condition. I have covered this in detail elsewhere as well.

 $ cat logfull.sql
SELECT MEMBER,
 TOTAL_LOG_AVAILABLE / 1048576 AS LOG_AVAILABLE_MB,
 TOTAL_LOG_USED / 1048576 AS LOG_USED_MB,
 CAST (((CASE WHEN (TOTAL_LOG_AVAILABLE + TOTAL_LOG_USED) = 0
 OR (TOTAL_LOG_AVAILABLE + TOTAL_LOG_USED) IS NULL
 OR TOTAL_LOG_AVAILABLE = -1 THEN NULL
 ELSE ((CAST ((TOTAL_LOG_USED) AS DOUBLE) / CAST (
 (TOTAL_LOG_AVAILABLE + TOTAL_LOG_USED) AS DOUBLE))) * 100
 END)) AS DECIMAL (5,2)) AS USED_PCT,
 APPLID_HOLDING_OLDEST_XACT
FROM TABLE (MON_GET_TRANSACTION_LOG(-2))
ORDER BY member, USED_PCT DESC;
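
The APPLID_HOLDING_OLDEST_XACT column in the output above reports the application handle of the connection holding the oldest open transaction. A small sketch of how I would chase it down (the handle 12345 is just a placeholder); forcing the application is a last resort, only after talking to its owner:

 $ db2 "select application_handle,
         varchar(application_name,20) as appl_name,
         varchar(session_auth_id,20) as auth_id,
         varchar(client_hostname,30) as client_host
        from table(MON_GET_CONNECTION(cast(12345 as bigint), -2))"

 $ db2 "force application (12345)"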

The buffer pool hit ratio is almost meaningless to me. It will look good 99.9% of the time, so why do we measure it? The key is to see whether the pool is large enough or not, and I am still searching for the silver bullet there. This is the least important thing to monitor in my view, but I am still including the script.

 $ cat bphit.sql
WITH BPMETRICS AS (
 SELECT bp_name,
 pool_data_l_reads + pool_temp_data_l_reads +
 pool_index_l_reads + pool_temp_index_l_reads +
 pool_xda_l_reads + pool_temp_xda_l_reads as logical_reads,
 pool_data_p_reads + pool_temp_data_p_reads +
 pool_index_p_reads + pool_temp_index_p_reads +
 pool_xda_p_reads + pool_temp_xda_p_reads as physical_reads,
 member
 FROM TABLE(MON_GET_BUFFERPOOL('',-2)) AS METRICS)
 SELECT
 VARCHAR(bp_name,20) AS bp_name,
 logical_reads,
 physical_reads,
 CASE WHEN logical_reads > 0
 THEN DEC((1 - (FLOAT(physical_reads) / FLOAT(logical_reads))) * 100,5,2)
 ELSE NULL
 END AS HIT_RATIO,
 member
 FROM BPMETRICS;

This script requires that you create a workload for monitoring purposes. This is the best way to find out the performance of a client application at the member level. I have covered elsewhere how to do this in detail; a minimal sketch also follows the script below.

 $ cat members.sql
select substr(host_name,1,20) host, 
       t.member, act_completed_total + act_aborted_total AS TOTAL_TRANS,
       cast(total_act_time/1000.0 as decimal(16,2)) ACT_TIME_SECS 
from table(mon_get_workload('Your Work Load Name',-2)) as t,
     table (env_get_system_resources()) m 
where m.member = t.member 
order by host, t.member;
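
For completeness, a minimal sketch of defining such a workload; the names my_client_wl, app_monitor_sc and the APPLNAME value are hypothetical, and the detailed write-up referenced above covers the full set of options. Once created, mon_get_workload('MY_CLIENT_WL',-2) in the script above would pick it up.

 $ db2 connect to <database>
 $ db2 "CREATE SERVICE CLASS app_monitor_sc"
 $ db2 "CREATE WORKLOAD my_client_wl APPLNAME('myclientapp') SERVICE CLASS app_monitor_sc"
 $ db2 "GRANT USAGE ON WORKLOAD my_client_wl TO PUBLIC"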

I like to run as few scripts as possible for monitoring and let the database do its job, rather than monitor every aspect of it, which is not a good idea once you have tuned the system. Monitoring creates drag in the same way that 20 managers asking a high-performing employee for a status update every hour does. I have seen this in all the major corporations I have visited in the last 14 years. We have more status-seeking managers than high-performing employees these days. Does that sound right?

db2haicu disable / enable and scripting

It is recommended that you always update RSCT / TSA from the DB2 software media and not from IBM Fix Central. The reason: DB2 is tested and certified with the version that ships with the DB2 software.

If you go to server_t/db2/linuxamd64/tsamp and run the db2cktsa command with -v install (what is installed) or -v media (what is on the media), you can decide whether to update the software.

[root@node03 tsamp]# pwd
/root/download/server_t/db2/linuxamd64/tsamp
[root@node03 tsamp]# ./db2cktsa -v install
4.1.0.3

[root@node03 tsamp]# ./db2cktsa -v media
4.1.0.3

How to find the installed RSCT version?

# /usr/sbin/rsct/install/bin/ctversion -b

How to find the installed TSA version?

# /usr/sbin/rsct/bin/samversion

How to use db2haicu -enable or db2haicu -disable for scripting?

A. For disable, create a file called db2haicu.disable as shown:

$ cat << EOF > db2haicu.disable
> 1
> EOF

The above command just adds one line containing 1. You can also create the same file using vi or any other editor.

Then, you can run db2haicu from a script to disable it:

$ db2haicu -disable < db2haicu.disable

B. For enable, create a file called db2haicu.enable for scripting as:

$ cat << EOF > db2haicu.enable
> 1
> 1
> EOF

Then, you can run db2haicu from a script to enable it:

$ db2haicu < db2haicu.enable
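
If you prefer not to keep answer files lying around, piping the same responses directly works as well; this simply feeds the prompts the same answers as the files above:

$ printf '1\n' | db2haicu -disable
$ printf '1\n1\n' | db2haicu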

DB2 pureScale – No Product License found

In the db2diag.log, you may see this message: No product license found.

Sample output:

 2017-05-19-13.00.58.511626-240 E2575E1007 LEVEL: Severe
PID : 35363 TID : 70366589481392 PROC : db2start
INSTANCE: purf01 NODE : 000
HOSTNAME: va33dlvudb001
FUNCTION: DB2 UDB, high avail services, sqlhaExecuteCommandSQO, probe:1062
DATA #1 : String, 13 bytes
va33dlvudb003
DATA #2 : String, 26 bytes
/usr/sbin/rsct/bin/samlicm
DATA #3 : String, 2 bytes
-s
DATA #4 : String, 4 bytes
root
DATA #5 : String, 6 bytes
purf01
DATA #6 : String, 6 bytes
purf01
DATA #7 : unsigned integer, 4 bytes
1055
DATA #8 : Boolean, 1 bytes
false
DATA #9 : SQLHA Remote Command Output, PD_TYPE_SQLHA_COMMAND_RESPONSE, 3508 bytes
commandResponse->callRC: 0x00000000
commandResponse->output: No product license found.
 
 
DATA #10: Hex integer, 4 bytes
0x00000000
DATA #11: Hex integer, 4 bytes
0x00000001
CALLSTCK: (Static functions may not be resolved correctly, 
as they are resolved to the nearest symbol)
 [0] 0x0000000000000000 ?unknown + 0x0
 [1] 0x0000000000000000 ?unknown + 0x0
 
2017-05-19-13.00.58.513621-240 I3583E741 LEVEL: Error
PID : 35363 TID : 70366735337216 PROC : db2start
INSTANCE: purf01 NODE : 000
HOSTNAME: va33dlvudb001
FUNCTION: DB2 UDB, high avail services, 
sqlhaGetLicenseStatusFromEachHost, probe:24843
MESSAGE : Problem running command on the host.
DATA #1 : SQLHA Remote Command Set, PD_TYPE_SQLHA_COMMAND_SET, 292120 bytes
commandSet->numCommands: 4
commandSet->options: NONE
commandSet->previousDb2RshCmd:
DATA #2 : SQLHA Remote Command Output, PD_TYPE_SQLHA_COMMAND_RESPONSE, 8 bytes
commandResponse->callRC: 0x0D9C36A0
commandResponse->output: NOT_POPULATED
DATA #3 : unsigned integer, 8 bytes
3
DATA #4 : String, 27 bytes
node03.zinox.com

From the above output, it appears that node03.zinox.com has the product license issue.

Fix Problem

Go to /var/opt/sam/lic

# cd /var/opt/sam/lic
# ls -l
-rw-r----- 1 root root 86 May  8 14:02 nodelock

If the permissions show 640, change them to 644:

# chmod 644 nodelock

Somehow, in the process of locking or unlocking this file, RSCT forgets to change the permissions back. Most probably, this issue is the result of a time difference between the hosts, but I am not sure. I will ask RSCT development about this.

DB2 pureScale instance creation hangs after GPFS

Hang Issue due to SMAP

While creating a DB2 pureScale instance, it may appear that the node becomes unresponsive under RHEL 7.2. If you reboot the node and look at /var/log/messages, you may notice several messages like this:

kernel:BUG: soft lockup - CPU#1 stuck for 23s! [mmfsd:3280]

The mmfsd daemon is the GPFS (aka IBM Spectrum Scale) file system daemon, and it appears to be what triggers this CPU soft lockup.

The other symptom of a soft CPU lockup is a high run queue in the vmstat output. Look at the first column, 'r', under procs; its value will be very high in this case.

It looks like the Supervisor Mode Access Prevention (SMAP) feature of the Intel Xeon v4 (Broadwell) processors, combined with Linux kernel 3.7 or later, causes mmfsd to lose access to some memory space. The SMAP feature in the Intel Broadwell family of CPUs (including the Intel Core i7-6820HQ) disallows access from kernel-mode code to user-space memory, a protection aimed at making it harder to exploit software bugs. GPFS does part of its work at the kernel level, and this feature disallows that access, with the result that a soft CPU lockup occurs and the system appears to hang.

This causes the node to appear to hang, but it is actually the soft CPU lockup seen in the messages above. The soft CPU lockup also causes the high run queue, with the result that the system becomes unresponsive.

The RHEL 7.2 kernel supports the SMAP feature by default. You can check whether your CPU has this feature with cat /proc/cpuinfo | grep smap; if you see smap in the flags section, the CPU supports Supervisor Mode Access Prevention (SMAP).

GPFS fixed this issue in v4.2.1.1, but the version that ships with DB2 11.1 FP1 is v4.1.1.9. If you are using a later version of DB2, you can find out the version of GPFS that will be installed by looking at the files in the <db2softwaredir>/server_t/db2 directory.

You can disable the SMAP feature in the Linux kernel by adding the kernel parameter nosmap, as shown below (for RHEL 7.2).

  1. Locate grub.cfg: it can be in different places depending on legacy or EFI boot. In my case, it was in /boot/grub2/grub.cfg.
  2. Edit grub.cfg, find the line associated with the system image (the line containing vmlinuz), and add the "nosmap" parameter at the end of it.

Alternatively, you can edit /etc/default/grub, add the nosmap parameter at the end of the line containing GRUB_CMDLINE_LINUX, then run grub2-mkconfig -o /boot/grub2/grub.cfg and reboot the system.
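
For illustration, the edited line in /etc/default/grub would look something like this (the other parameters shown are just examples from a typical RHEL install; only the trailing nosmap is the addition):

# grep GRUB_CMDLINE_LINUX /etc/default/grub
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rhel/root rhgb quiet nosmap"
# grub2-mkconfig -o /boot/grub2/grub.cfg
# reboot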

You can also run the grubby command to add this kernel parameter.

# grubby --update-kernel=ALL --args=nosmap

Hang Issue due to blk_cloned_rq_check_limits

There is another issue related to GPFS, and the technote is: http://www-01.ibm.com/support/docview.wss?uid=ssg1S1009622

One of our colleagues, Kajen, ran into this during instance creation (db2icrt) on RHEL 7.3: the disk specified for the instance shared device paths fails and db2icrt hangs forever. You will see messages about the failure in the /var/log/messages file, matching the message in the technote: kernel: blk_cloned_rq_check_limits: over max size limit. You can work around the problem by adding udev rules as per the link given above.

The workaround for the above problem, as per the IBM technote, is to write a udev rule.

If the GPFS file system block size is 4 MB and the storage is from IBM, you need to create a new rule in /etc/udev/rules.d/54-custom.rules with the following content:

ACTION=="add|change", SUBSYSTEM=="block", ATTRS{vendor}=="IBM*", RUN+="/bin/sh -c '/bin/echo 4096 > /sys/block/%k/queue/max_sectors_kb'"

You need to change the block size in the 'echo' command to match the file system block size. You also need to provide the correct storage vendor in the ATTRS{vendor} match.
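
The rule takes effect on the next add or change event for the block devices; to apply it without a reboot, the standard udev reload sequence can be used, and the value is easy to verify afterwards (sdX is a placeholder for one of the shared disks):

# udevadm control --reload-rules
# udevadm trigger --subsystem-match=block
# cat /sys/block/sdX/queue/max_sectors_kb
4096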


Don't Be Shy. Get In Touch.

Anything related to DB2 or Karmas, send me a note.

Contact Me