How to concatenate rows together?

Say, you want to concatenate all rows of a column and show it as a single row in the same fashion aggregate function works.

$ db2 "create table t1(line# int not null, word varchar(20))"
$ db2 "insert into t1 values(1,'The'),(1,'horse'),(1,'is'),(1,'white')"
$ db2 "insert into t1 values(2,'The'),(2,'quick'),(2,'brown'),
       (2,'fox'),(2,'jumps'),(2,'over'),(2,'the'),(2,'dog')"
$ db2 "select * from t1"
LINE#       WORD                
----------- --------------------
          1 The                 
          1 horse               
          1 is                  
          1 white               
          2 The                 
          2 quick               
          2 brown               
          2 fox                 
          2 jumps               
          2 over                
          2 the                 
          2 dog

But, you want this to be in a single row. For example:

1 The horse is white
2 The quick brown fox jumps over the dog

SQL Server or Sybase allows you to concatenate string together. For example, you could use the following construct to get string concatenation.

declare @phrase varchar(1024)
set @phrase=''
select @phrase= word + @phrase from t1

People can debate this forever that why are you doing this processing at the server level against the wishes of RDBMS engine. Normally, the front end should have logic to do this processing. It is like destroying 1st normal form with a concatenated list.

Again, we do always compare things without rationalization. Read the book Predictably Irrational if you want to know that how humans beings are irrational by nature.

Yes, we can do same in DB2 in several different fashions. For example:

Use XMLAGG Function
Use a C UDF
Use recursive query
Use a SQL UDF by opening a cursor

Method – 1 : use XMLAGG function

SELECT line#, replace(replace(xml2clob(xmlagg(xmlelement(NAME a, word))),'<A>',''),'</A>',' ') 
FROM t1 
GROUP BY line#

Here, we use XML capability of DB2 to turn a data type to XML using XMLELEMENT. Use XMLAGG to aggregate the values and use XML2CLOB to view the data. We use REPLACE to trim XML tag. The output is as follows:

1 The horse is white
2 The quick brown fox jumps over the dog

Method – 2 : C UDF to process all rows (joinstr.c)

#include <stdio.h>
#include <string.h>
#include <stdarg.h>
#include <sqludf.h> 

#define STR_LEN 1024

struct SCRATCHDATA
{
    int changed;
    char str[STR_LEN];
};


void SQL_API_FN joinstr(SQLUDF_VARCHAR *inStr, SQLUDF_INTEGER *changed, SQLUDF_VARCHAR *outStr,
SQLUDF_SMALLINT *inStrIND, SQLUDF_INTEGER *changedIND, SQLUDF_SMALLINT *outStrIND, SQLUDF_TRAIL_ARGS_ALL)
{
  struct SCRATCHDATA *sp;
  sp = (struct SCRATCHDATA *) SQLUDF_SCRAT->data;

  switch (SQLUDF_CALLT)
  {
    case SQLUDF_FIRST_CALL:
      memset(sp,0,sizeof(struct SCRATCHDATA));
      if(*inStrIND >= 0)
      {
         strcpy(sp->str, inStr);
         if (*changedIND >= 0)
           sp->changed = *changed;
      }
      break;
    case SQLUDF_NORMAL_CALL:
      if(*inStrIND >= 0)
      {
         if (*changedIND >= 0)
         {
             if (*changed != sp->changed)
                memset(sp,0,sizeof(struct SCRATCHDATA));
             else
                strcat(sp->str, " ");
             strcat(sp->str, inStr);
             sp->changed = *changed;
         }
      }
      break;
    case SQLUDF_FINAL_CALL:
      break;
  }
  strcpy(outStr, sp->str);
  *outStrIND = 0;
}

joinstr.exp

joinstr

makefile

DB2PATH=${HOME}/sqllib

BLDRTN=./bldrtn

ALIAS=sample
UID=
PWD=

COPY=cp
ERASE=rm -f

all : \
        srv

srv : \
        rtn

rtn : \
        joinstr

clean : 
        $(ERASE) *.o

#****************************************************************************
#                  2h - make cleanall
#****************************************************************************

cleanall : \
        clean
        $(ERASE) joinstr
        $(ERASE) $(DB2PATH)/function/joinstr*

joinstr : joinstr.c
        $(BLDRTN) joinstr $(ALIAS)
        ./udfcat $(ALIAS) > udfcat.log

udfcat – Catalog the UDF

#! /bin/ksh
export ${FENCED:="NOT FENCED THREADSAFE"}
export ${SHLIBNAME:="joinstr"}
TOK=$(date +"%y%m%d%H%M%S")
export SHLIBNAME=joinstr$TOK
rm -f ~/sqllib/function/joinstr*
cp -f joinstr ~/sqllib/function/$SHLIBNAME

db2 -tv <<!EOF

CONNECT TO $1;

DROP FUNCTION JOIN_STR; 
CREATE FUNCTION JOIN_STR
(
  IN_STR       VARCHAR(1024),
  LINE_NO      INTEGER
)
RETURNS VARCHAR (1024)
EXTERNAL NAME '${SHLIBNAME}!joinstr'
SPECIFIC JOINSTR
LANGUAGE C
PARAMETER STYLE DB2SQL
${FENCED}
RETURNS NULL ON NULL INPUT
NOT DETERMINISTIC
NO SQL
NO EXTERNAL ACTION
SCRATCHPAD 2000
FINAL CALL
DISALLOW PARALLEL;

How to compile?

Unzip joinstr.tar.gz all files into a directory. Copy bldrtn from ~/sqllib/samples/c to your current directory and run make.

How to run?

db2 "select line#, max(join_str(word,line#))  from t1 group by line#"

We use MAX function to do the aggregation but the actual work is done by the C UDF to join words together.

Method – 3: SQL UDF to join words for a given line

CREATE FUNCTION get_str(v_line# INTEGER)
LANGUAGE SQL
RETURNS VARCHAR(1024)
BEGIN ATOMIC
  DECLARE str VARCHAR(1024);
  SET str = '';
  loop1: FOR row AS (SELECT word FROM t1 where line# = v_line#)
  DO
    IF row.word IS NOT NULL THEN
       SET str = str || row.word || ' ';
    END IF;
  END FOR loop1;
  RETURN str;
END
@

db2 "SELECT line#, get_str(line#) FROM t1 GROUP BY line#"

Method – 4 : Recursive SQL to join words for a line

WITH temp1(line#, word, all_words, cnt) as
(SELECT min(line#), min(word), VARCHAR(min(word),50),SMALLINT(1) FROM t1 a
group by a.line#
UNION ALL
SELECT a.line#, a.word, b.all_words||' '||a.word,SMALLINT(b.cnt+1)
FROM t1 a, temp1 b
WHERE a.line# = b.line#
AND a.word > b.word
AND a.word = (select min(c.word) from t1 c where c.line# = b.line# and c.word > b.word) 
)
SELECT d.line#, all_words FROM temp1 d
where cnt = (SELECT max(cnt) FROM temp1 e where e.line# = d.line#);

1 The horse is white                                
2 The brown dog fox jumps over quick the

The question comes – which one of the above should I use?

If you are averse to C and hate to take burden of compiling C UDF as shown, use Method – 1 XMLAGG function. C UDF is light weight and uses scratchpad as you are going through the rows. It will give you the best performance. The recursive SQL is a good example but it could be slow.

How to concatenate rows together?

Method – 1 : use XMLAGG function

Method – 2 : C UDF to process all rows (joinstr.c)

joinstr.exp

makefile

udfcat – Catalog the UDF

How to compile?

How to run?

Method – 3: SQL UDF to join words for a given line

Method – 4 : Recursive SQL to join words for a line

Follow Me on Linked In

Archives

Categories

How to concatenate rows together?

Method – 1 : use XMLAGG function

Method – 2 : C UDF to process all rows (joinstr.c)

joinstr.exp

makefile

udfcat – Catalog the UDF

How to compile?

How to run?

Method – 3: SQL UDF to join words for a given line

Method – 4 : Recursive SQL to join words for a line

Follow Me on Linked In

What did you read most?

Archives

Categories