The Work Load Management (WLM) feature of DB2 is mostly viewed as one of the best but least understood feature. The WLM has two main themes.

1. Use system defined resources such as CPU or I/O usage to do a task faster or slower

2. Use DB2 defined control such as thresholds or concurrency controls to build a most effective WLM.

The first approach is the one used by almost all vendors in some capacity or another. The second approach is unique to DB2 and at present no database vendors beat this functionality. This WLM feature has its root in mainframe DB2 and if you do this right, no workload can bring down your DB2 database.

However, I am going to discuss only the first approach of WLM. Sometime in a system, there is a query running and takling lots of resources and having an impact on the system. A DBA wants to demote this query so that the system resources are freed up.

In other databases, a DBA has to do the baby-sitting to identify and demote such queries but in DB2, the concept is "Set it and forget it."

My mentor used to say "Doing the baby sitting may be ok at times but when you need to change the diapers, then you have to think – Am I doing too much?" This is what you end up doing on other databases – you also need to change the diapers.

The DBAs have to be smart and there is nothing wrong in being lazy if you know how to achieve, "Set it and forget it". This is how it is done in DB2.

I am showing Optim Performance Manager (OPM) tool to do the WLM configuration. The use of OPM takes away the drudgery of the WLM learning curve.

1. A Workload P_WL is mapped to P_HIGH service class. The workload P_WL is associated with a client user id as 'priorityage'.

2. The Service class P_HIGH has CPU limits and is remapped to P_MED service class when those thresholds are violated.

3. The Service class P_MED has CPU limits and is remapped to P_LOW service class when those thresholds are violated..

 

The Workload Management works when such remapping is done.

If this is actually working or not working can be demonstrated by the output from a script, which works in this fashion.

1. We create a WLM event monitor as per the following definition.

CREATE EVENT MONITOR DB2THRESHOLDVIOLATIONS
    FOR THRESHOLD VIOLATIONS
    WRITE TO TABLE
    THRESHOLDVIOLATIONS (TABLE THRESHOLDVIOLATIONS_DB2THRESHOLDVIOLATIONS
                         IN USERSPACE1
                         PCTDEACTIVATE 100),
    CONTROL (TABLE CONTROL_DB2THRESHOLDVIOLATIONS
             IN USERSPACE1
             PCTDEACTIVATE 100)
    AUTOSTART;

2. Turn this event monitor on by using SET EVENT MONITOR DB2THRESHOLDVIOLATIONS STATE 1.

3. Run a workload – example given below:

bcuaix@bddn28:~/vikram> cat priorityage.sql
call WLM_SET_CLIENT_INFO('priorityage', '', '', NULL, NULL);
set current schema tpcds;
select  i_item_id,
        ca_country,
        ca_state,
        ca_county,
        avg( cast(cs_quantity as numeric(12,2))) agg1,
        avg( cast(cs_list_price as numeric(12,2))) agg2,
        avg( cast(cs_coupon_amt as numeric(12,2))) agg3,
        avg( cast(cs_sales_price as numeric(12,2))) agg4,
        avg( cast(cs_net_profit as numeric(12,2))) agg5,
        avg( cast(c_birth_year as numeric(12,2))) agg6,
        avg( cast(cd1.cd_dep_count as numeric(12,2))) agg7
 from catalog_sales, customer_demographics cd1,
      customer_demographics cd2, customer, customer_address, date_dim, item
 where cs_sold_date_sk = d_date_sk and
       cs_item_sk = i_item_sk and
       cs_bill_cdemo_sk = cd1.cd_demo_sk and
       cs_bill_customer_sk = c_customer_sk and
       cd1.cd_gender = 'F' and
       cd1.cd_education_status = 'Secondary' and
       c_current_cdemo_sk = cd2.cd_demo_sk and
       c_current_addr_sk = ca_address_sk and
       c_birth_month in (9,5,6,6,9,6) and
       d_year = 2001 and
       ca_state in ('NM','IN','VA'
                   ,'GA','IL','NC','MI')
 group by rollup (i_item_id, ca_country, ca_state, ca_county)
 order by ca_country,
        ca_state,
        ca_county,
        i_item_id
  fetch first 100 rows only;

4. When the workload is finished, run the following SQL scripts to capture thresholds violations and how remapping activity was done. Here is the SQL script.

SELECT 	PARTITION_NUMBER, SUBSTR(T2.THRESHOLDNAME,1,35) AS THRESH_NAME, 
	SUBSTR(S1.SERVICECLASSNAME,1,30) AS SOURCE_SERVICE_CLASS, 
	SUBSTR(S2.SERVICECLASSNAME,1,30) AS TARGET_SERVICE_CLASS, 
	T1.THRESHOLD_ACTION AS THRESH_ACTION, 
	SUBSTR(T1.THRESHOLD_PREDICATE,1,20) AS PREDICATE, 
	T1.THRESHOLD_MAXVALUE,
	T1.TIME_OF_VIOLATION
FROM 	(THRESHOLDVIOLATIONS_DB2THRESHOLDVIOLATIONS T1
	LEFT JOIN SYSCAT.THRESHOLDS T2
	     ON T1.THRESHOLDID = T2.THRESHOLDID)
	LEFT JOIN SYSCAT.SERVICECLASSES S1
	     ON (T1.SOURCE_SERVICE_CLASS_ID = S1.SERVICECLASSID)
	LEFT JOIN SYSCAT.SERVICECLASSES S2
	     ON (T1.DESTINATION_SERVICE_CLASS_ID = S2.SERVICECLASSID)
WHERE THRESHOLD_ACTION = 'Remap';

The output of this SQL as shown demonstrates that the priority aging actually worked with each thresholds when it was violated and how it was remapped to other one.

        1 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH         P_MED     Remap         CPUTimeInSC       1 2012-10-08 14:04:41.0
        1 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH         P_MED     Remap         CPUTimeInSC       1 2012-10-08 14:20:26.0
        1 DS_QUERYSC01_PMED_1349461764601_CPU P_MED          P_LOW     Remap         CPUTimeInSC       1 2012-10-08 14:20:28.0
        2 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH         P_MED     Remap         CPUTimeInSC       1 2012-10-08 14:04:41.0
        2 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH         P_MED     Remap         CPUTimeInSC       1 2012-10-08 14:20:26.0
        2 DS_QUERYSC01_PMED_1349461764601_CPU P_MED          P_LOW     Remap         CPUTimeInSC       1 2012-10-08 14:20:28.0
       11 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH         P_MED     Remap         CPUTimeInSC       1 2012-10-08 14:04:44.0
       11 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH         P_MED     Remap         CPUTimeInSC       1 2012-10-08 14:20:30.0
       11 DS_QUERYSC01_PMED_1349461764601_CPU P_MED          P_LOW     Remap         CPUTimeInSC       1 2012-10-08 14:20:31.0
       38 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH         P_MED     Remap         CPUTimeInSC       1 2012-10-08 14:04:45.0
       38 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH         P_MED     Remap         CPUTimeInSC       1 2012-10-08 14:20:31.0
       38 DS_QUERYSC01_PMED_1349461764601_CPU P_MED          P_LOW     Remap         CPUTimeInSC       1 2012-10-08 14:20:33.0
       33 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH         P_MED     Remap         CPUTimeInSC       1 2012-10-08 14:04:45.0
       33 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH         P_MED     Remap         CPUTimeInSC       1 2012-10-08 14:20:31.0
       33 DS_QUERYSC01_PMED_1349461764601_CPU P_MED          P_LOW     Remap         CPUTimeInSC       1 2012-10-08 14:20:32.0
       16 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH         P_MED     Remap         CPUTimeInSC       1 2012-10-08 14:04:44.0
       16 DS_QUERYSC01_PHIGH_1349461764600_CP P_HIGH         P_MED     Remap         CPUTimeInSC       1 2012-10-08 14:20:29.0
       16 DS_QUERYSC01_PMED_1349461764601_CPU P_MED          P_LOW     Remap         CPUTimeInSC       1 2012-10-08 14:20:31.0