The Work Load Management (WLM) feature of DB2 is mostly viewed as one of the best but least understood feature. The WLM has two main themes.
1. Use system defined resources such as CPU or I/O usage to do a task faster or slower
2. Use DB2 defined control such as thresholds or concurrency controls to build a most effective WLM.
The first approach is the one used by almost all vendors in some capacity or another. The second approach is unique to DB2 and at present no database vendors beat this functionality. This WLM feature has its root in mainframe DB2 and if you do this right, no workload can bring down your DB2 database.
However, I am going to discuss only the first approach of WLM. Sometime in a system, there is a query running and takling lots of resources and having an impact on the system. A DBA wants to demote this query so that the system resources are freed up.
In other databases, a DBA has to do the baby-sitting to identify and demote such queries but in DB2, the concept is "Set it and forget it."
My mentor used to say "Doing the baby sitting may be ok at times but when you need to change the diapers, then you have to think – Am I doing too much?" This is what you end up doing on other databases – you also need to change the diapers.
The DBAs have to be smart and there is nothing wrong in being lazy if you know how to achieve, "Set it and forget it". This is how it is done in DB2.
I am showing Optim Performance Manager (OPM) tool to do the WLM configuration. The use of OPM takes away the drudgery of the WLM learning curve.
1. A Workload P_WL is mapped to P_HIGH service class. The workload P_WL is associated with a client user id as 'priorityage'.
2. The Service class P_HIGH has CPU limits and is remapped to P_MED service class when those thresholds are violated.
3. The Service class P_MED has CPU limits and is remapped to P_LOW service class when those thresholds are violated..
The Workload Management works when such remapping is done.
If this is actually working or not working can be demonstrated by the output from a script, which works in this fashion.
1. We create a WLM event monitor as per the following definition.
CREATE EVENT MONITOR DB2THRESHOLDVIOLATIONS FOR THRESHOLD VIOLATIONS WRITE TO TABLE THRESHOLDVIOLATIONS (TABLE THRESHOLDVIOLATIONS_DB2THRESHOLDVIOLATIONS IN USERSPACE1 PCTDEACTIVATE 100), CONTROL (TABLE CONTROL_DB2THRESHOLDVIOLATIONS IN USERSPACE1 PCTDEACTIVATE 100) AUTOSTART;
2. Turn this event monitor on by using SET EVENT MONITOR DB2THRESHOLDVIOLATIONS STATE 1.
3. Run a workload – example given below:
bcuaix@bddn28:~/vikram> cat priorityage.sql call WLM_SET_CLIENT_INFO('priorityage', '', '', NULL, NULL); set current schema tpcds; select i_item_id, ca_country, ca_state, ca_county, avg( cast(cs_quantity as numeric(12,2))) agg1, avg( cast(cs_list_price as numeric(12,2))) agg2, avg( cast(cs_coupon_amt as numeric(12,2))) agg3, avg( cast(cs_sales_price as numeric(12,2))) agg4, avg( cast(cs_net_profit as numeric(12,2))) agg5, avg( cast(c_birth_year as numeric(12,2))) agg6, avg( cast(cd1.cd_dep_count as numeric(12,2))) agg7 from catalog_sales, customer_demographics cd1, customer_demographics cd2, customer, customer_address, date_dim, item where cs_sold_date_sk = d_date_sk and cs_item_sk = i_item_sk and cs_bill_cdemo_sk = cd1.cd_demo_sk and cs_bill_customer_sk = c_customer_sk and cd1.cd_gender = 'F' and cd1.cd_education_status = 'Secondary' and c_current_cdemo_sk = cd2.cd_demo_sk and c_current_addr_sk = ca_address_sk and c_birth_month in (9,5,6,6,9,6) and d_year = 2001 and ca_state in ('NM','IN','VA' ,'GA','IL','NC','MI') group by rollup (i_item_id, ca_country, ca_state, ca_county) order by ca_country, ca_state, ca_county, i_item_id fetch first 100 rows only;
4. When the workload is finished, run the following SQL scripts to capture thresholds violations and how remapping activity was done. Here is the SQL script.
SELECT PARTITION_NUMBER, SUBSTR(T2.THRESHOLDNAME,1,35) AS THRESH_NAME, SUBSTR(S1.SERVICECLASSNAME,1,30) AS SOURCE_SERVICE_CLASS, SUBSTR(S2.SERVICECLASSNAME,1,30) AS TARGET_SERVICE_CLASS, T1.THRESHOLD_ACTION AS THRESH_ACTION, SUBSTR(T1.THRESHOLD_PREDICATE,1,20) AS PREDICATE, T1.THRESHOLD_MAXVALUE, T1.TIME_OF_VIOLATION FROM (THRESHOLDVIOLATIONS_DB2THRESHOLDVIOLATIONS T1 LEFT JOIN SYSCAT.THRESHOLDS T2 ON T1.THRESHOLDID = T2.THRESHOLDID) LEFT JOIN SYSCAT.SERVICECLASSES S1 ON (T1.SOURCE_SERVICE_CLASS_ID = S1.SERVICECLASSID) LEFT JOIN SYSCAT.SERVICECLASSES S2 ON (T1.DESTINATION_SERVICE_CLASS_ID = S2.SERVICECLASSID) WHERE THRESHOLD_ACTION = 'Remap';
The output of this SQL as shown demonstrates that the priority aging actually worked with each thresholds when it was violated and how it was remapped to other one.
1 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH P_MED Remap CPUTimeInSC 1 2012-10-08 14:04:41.0 1 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH P_MED Remap CPUTimeInSC 1 2012-10-08 14:20:26.0 1 DS_QUERYSC01_PMED_1349461764601_CPU P_MED P_LOW Remap CPUTimeInSC 1 2012-10-08 14:20:28.0 2 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH P_MED Remap CPUTimeInSC 1 2012-10-08 14:04:41.0 2 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH P_MED Remap CPUTimeInSC 1 2012-10-08 14:20:26.0 2 DS_QUERYSC01_PMED_1349461764601_CPU P_MED P_LOW Remap CPUTimeInSC 1 2012-10-08 14:20:28.0 11 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH P_MED Remap CPUTimeInSC 1 2012-10-08 14:04:44.0 11 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH P_MED Remap CPUTimeInSC 1 2012-10-08 14:20:30.0 11 DS_QUERYSC01_PMED_1349461764601_CPU P_MED P_LOW Remap CPUTimeInSC 1 2012-10-08 14:20:31.0 38 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH P_MED Remap CPUTimeInSC 1 2012-10-08 14:04:45.0 38 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH P_MED Remap CPUTimeInSC 1 2012-10-08 14:20:31.0 38 DS_QUERYSC01_PMED_1349461764601_CPU P_MED P_LOW Remap CPUTimeInSC 1 2012-10-08 14:20:33.0 33 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH P_MED Remap CPUTimeInSC 1 2012-10-08 14:04:45.0 33 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH P_MED Remap CPUTimeInSC 1 2012-10-08 14:20:31.0 33 DS_QUERYSC01_PMED_1349461764601_CPU P_MED P_LOW Remap CPUTimeInSC 1 2012-10-08 14:20:32.0 16 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH P_MED Remap CPUTimeInSC 1 2012-10-08 14:04:44.0 16 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH P_MED Remap CPUTimeInSC 1 2012-10-08 14:20:29.0 16 DS_QUERYSC01_PMED_1349461764601_CPU P_MED P_LOW Remap CPUTimeInSC 1 2012-10-08 14:20:31.0