Decode 함수1
[출처] SQL튜닝의 시작(박성호,오수영 지음)
※ Decode 함수는 SQL내에서 if ~ then ~ else if ~ end 사용할 수 있도록 Oracle에서 제공하는 함수이다.
※ 본 posting에서는 Decode를 이용해서 Row를 Column으로 바꾸어 표현하고자 경우에 대해서만 Posting 된 글이다.
※ Oracle 11g 버전부터는 pivot 기능이 동일한 기능을 한다. 아래 링크 참조
2018/04/01 - [개발/DataBase] - [Oracle] Pivot 함수
1. Test를 위한 Table 생성 및 샘플 데이터 입력.
CREATE TABLE SALES_TEST (
SALE_DT CHAR(8), /* 판매일자 */
EMPNO NUMBER, /* 사원번호 */
DEPTNO NUMBER, /* 부서번호 */
TARGET NUMBER NOT NULL, /* 목표 판매량 */
SALECNT NUMBER NOT NULL, /* 실제 판매량 */
SALE_DESC VARCHAR2(200) /* 목표와 실제 판매량에 대한 상세내용 */
);
INSERT INTO SALES_TEST VALUE
SELECT '20180301'
, LEVEL
, MOD(LEVEL, 10)
, MOD(LEVEL, 710)
, MOD(LEVEL, 400)
, 'SALE '||LEVEL||' : '||MOD(LEVEL,710)||' ==> ' || MOD(LEVEL, 400)
FROM DUAL
CONNECT BY LEVEL <= 100000
;
INSERT INTO SALES_TEST VALUE
SELECT '20180302'
, LEVEL
, MOD(LEVEL, 10)
, MOD(LEVEL, 300)
, MOD(LEVEL, 100)
, 'SALE '||LEVEL||' : '||MOD(LEVEL,310)||' ==> ' || MOD(LEVEL, 100)
FROM DUAL
CONNECT BY LEVEL+1128 <= 100000
;
INSERT INTO SALES_TEST VALUE
SELECT '20180303'
, LEVEL
, MOD(LEVEL, 10)
, MOD(LEVEL, 128)
, MOD(LEVEL, 98)
, 'SALE '||LEVEL||' : '||MOD(LEVEL,128)||' ==> ' || MOD(LEVEL, 98)
FROM DUAL
CONNECT BY LEVEL+528 <= 100000
;
INSERT INTO SALES_TEST VALUE
SELECT '20180304'
, LEVEL
, MOD(LEVEL, 10)
, MOD(LEVEL, 238)
, MOD(LEVEL, 900)
, 'SALE '||LEVEL||' : '||MOD(LEVEL,238)||' ==> ' || MOD(LEVEL, 900)
FROM DUAL
CONNECT BY LEVEL+278 <= 100000
;
INSERT INTO SALES_TEST VALUE
SELECT '20180305'
, LEVEL
, MOD(LEVEL, 10)
, MOD(LEVEL, 897)
, MOD(LEVEL, 1258)
, 'SALE '||LEVEL||' : '||MOD(LEVEL,897)||' ==> ' || MOD(LEVEL, 1258)
FROM DUAL
CONNECT BY LEVEL+278 <= 100000
;
INSERT INTO SALES_TEST VALUE
SELECT '20180306'
, LEVEL
, MOD(LEVEL, 10)
, MOD(LEVEL, 70)
, MOD(LEVEL, 40)
, 'SALE '||LEVEL||' : '||MOD(LEVEL,70)||' ==> ' || MOD(LEVEL, 40)
FROM DUAL
CONNECT BY LEVEL+134 <= 100000
;
INSERT INTO SALES_TEST VALUE
SELECT '20180307'
, LEVEL
, MOD(LEVEL, 10)
, MOD(LEVEL, 8548)
, MOD(LEVEL, 500)
, 'SALE '||LEVEL||' : '||MOD(LEVEL,8548)||' ==> ' || MOD(LEVEL, 500)
FROM DUAL
CONNECT BY LEVEL+38119 <= 100000
;
COMMIT;
CREATE INDEX IDX_SALES_TEST_01 ON SALES_TEST(SALE_DT);
CREATE INDEX IDX_SALES_TEST_02 ON SALES_TEST(EMPNO);
EXEC dbms_stats.gather_table_stats('SCOTT', 'SALES_TEST');
2. 각 일자별 판매목표와 판매량 데이터를 colunm형태로 조회하는데 스칼라서브 쿼리를 이용해서 조회하는 경우.
SELECT '2018/03/01' AS SALEDT_0301
, (SELECT SUM(TARGET) FROM SALES_TEST WHERE SALE_DT = '20180301') AS TARGET_0301
, (SELECT SUM(SALECNT) FROM SALES_TEST WHERE SALE_DT = '20180301') AS SALE_0301
, (SELECT SUM(TARGET) FROM SALES_TEST WHERE SALE_DT = '20180302') AS TARGET_0302
, (SELECT SUM(SALECNT) FROM SALES_TEST WHERE SALE_DT = '20180302') AS SALE_0302
, (SELECT SUM(TARGET) FROM SALES_TEST WHERE SALE_DT = '20180303') AS TARGET_0303
, (SELECT SUM(SALECNT) FROM SALES_TEST WHERE SALE_DT = '20180303') AS SALE_0303
, (SELECT SUM(TARGET) FROM SALES_TEST WHERE SALE_DT = '20180304') AS TARGET_0304
, (SELECT SUM(SALECNT) FROM SALES_TEST WHERE SALE_DT = '20180304') AS SALE_0304
, (SELECT SUM(TARGET) FROM SALES_TEST WHERE SALE_DT = '20180305') AS TARGET_0305
, (SELECT SUM(SALECNT) FROM SALES_TEST WHERE SALE_DT = '20180305') AS SALE_0305
, (SELECT SUM(TARGET) FROM SALES_TEST WHERE SALE_DT = '20180306') AS TARGET_0306
, (SELECT SUM(SALECNT) FROM SALES_TEST WHERE SALE_DT = '20180306') AS SALE_0306
, (SELECT SUM(TARGET) FROM SALES_TEST WHERE SALE_DT = '20180307') AS TARGET_0307
, (SELECT SUM(SALECNT) FROM SALES_TEST WHERE SALE_DT = '20180307') AS SALE_0307
FROM DUAL
※ 아래는 위의 sql문 실행시 trace파일의 일부이다.
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.40 0.44 0 13334 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.42 0.46 0 13334 0 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 84
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=1025 pr=0 pw=0 time=0 us)
100000 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=1025 pr=0 pw=0 time=114193 us cost=954 size=1224847 card=94219)
100000 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=281 pr=0 pw=0 time=33887 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=1025 pr=0 pw=0 time=0 us)
100000 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=1025 pr=0 pw=0 time=58695 us cost=954 size=1224847 card=94219)
100000 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=281 pr=0 pw=0 time=17263 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=988 pr=0 pw=0 time=0 us)
98872 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=988 pr=0 pw=0 time=57429 us cost=954 size=1224847 card=94219)
98872 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=278 pr=0 pw=0 time=14069 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=988 pr=0 pw=0 time=0 us)
98872 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=988 pr=0 pw=0 time=54871 us cost=954 size=1224847 card=94219)
98872 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=278 pr=0 pw=0 time=13685 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=981 pr=0 pw=0 time=0 us)
99472 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=981 pr=0 pw=0 time=55488 us cost=954 size=1224847 card=94219)
99472 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=281 pr=0 pw=0 time=14959 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=981 pr=0 pw=0 time=0 us)
99472 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=981 pr=0 pw=0 time=64438 us cost=954 size=1224847 card=94219)
99472 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=281 pr=0 pw=0 time=19945 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=1018 pr=0 pw=0 time=0 us)
99722 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=1018 pr=0 pw=0 time=75046 us cost=954 size=1224847 card=94219)
99722 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=280 pr=0 pw=0 time=15725 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=1018 pr=0 pw=0 time=0 us)
99722 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=1018 pr=0 pw=0 time=55357 us cost=954 size=1224847 card=94219)
99722 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=280 pr=0 pw=0 time=12273 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=1032 pr=0 pw=0 time=0 us)
99722 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=1032 pr=0 pw=0 time=60983 us cost=954 size=1224847 card=94219)
99722 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=281 pr=0 pw=0 time=14702 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=1032 pr=0 pw=0 time=0 us)
99722 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=1032 pr=0 pw=0 time=55869 us cost=954 size=1224847 card=94219)
99722 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=281 pr=0 pw=0 time=13807 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=976 pr=0 pw=0 time=0 us)
99866 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=976 pr=0 pw=0 time=58307 us cost=954 size=1224847 card=94219)
99866 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=281 pr=0 pw=0 time=13554 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=976 pr=0 pw=0 time=0 us)
99866 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=976 pr=0 pw=0 time=55878 us cost=954 size=1224847 card=94219)
99866 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=281 pr=0 pw=0 time=14193 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=647 pr=0 pw=0 time=0 us)
61881 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=647 pr=0 pw=0 time=34903 us cost=954 size=1224847 card=94219)
61881 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=176 pr=0 pw=0 time=8693 us cost=266 size=0 card=94219)(object id 74707)
1 SORT AGGREGATE (cr=647 pr=0 pw=0 time=0 us)
61881 TABLE ACCESS BY INDEX ROWID SALES_TEST (cr=647 pr=0 pw=0 time=34008 us cost=954 size=1224847 card=94219)
61881 INDEX RANGE SCAN IDX_SALES_TEST_01 (cr=176 pr=0 pw=0 time=9716 us cost=266 size=0 card=94219)(object id 74707)
1 FAST DUAL (cr=0 pr=0 pw=0 time=0 us cost=2 size=0 card=1)
※ Trace 결과를 보면, SALES_TEST Table을 스칼라서버쿼리의 수 만큼 반복해서 Access한다.
위 예제에서 다른 통계정보까지 구해야 한다면 SALE_TEST Table Access는 더 많이 발생하게 되어 반복탐색에 대한 비효율이 증가한다.
3. 각 일자별 판매목표와 판매량 데이터를 colunm형태로 조회하는데 Decode함수를 이용해서 조회하는 경우.
SELECT '2018/03' AS SALEDT_03
, SUM(DECODE(SALE_DT, '20180301', TARGET, 0)) AS TARGER_0301
, SUM(DECODE(SALE_DT, '20180301', SALECNT, 0)) AS SALE_0301
, SUM(DECODE(SALE_DT, '20180302', TARGET, 0)) AS TARGER_0302
, SUM(DECODE(SALE_DT, '20180302', SALECNT, 0)) AS SALE_0302
, SUM(DECODE(SALE_DT, '20180303', TARGET, 0)) AS TARGER_0303
, SUM(DECODE(SALE_DT, '20180303', SALECNT, 0)) AS SALE_0303
, SUM(DECODE(SALE_DT, '20180304', TARGET, 0)) AS TARGER_0304
, SUM(DECODE(SALE_DT, '20180304', SALECNT, 0)) AS SALE_0304
, SUM(DECODE(SALE_DT, '20180305', TARGET, 0)) AS TARGER_0305
, SUM(DECODE(SALE_DT, '20180305', SALECNT, 0)) AS SALE_0305
, SUM(DECODE(SALE_DT, '20180306', TARGET, 0)) AS TARGER_0306
, SUM(DECODE(SALE_DT, '20180306', SALECNT, 0)) AS SALE_0306
, SUM(DECODE(SALE_DT, '20180307', TARGET, 0)) AS TARGER_0307
, SUM(DECODE(SALE_DT, '20180307', SALECNT, 0)) AS SALE_0307
FROM SALES_TEST
※ 아래는 위의 sql문 실행시 trace파일의 일부이다.
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.48 0.52 0 4856 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.49 0.53 0 4856 0 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 84
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=4856 pr=0 pw=0 time=0 us)
659535 TABLE ACCESS FULL SALES_TEST (cr=4856 pr=0 pw=0 time=125558 us cost=1346 size=11212095 card=659535)
※ Decode 함수를 사용한 경우에는 SALES_TEST Table을 한번만 Access한다.
반복탐색에 의한 비효율을 제거.
- Trace에서 확인해 보면 스칼라서브쿼리를 사용한 경우와 비교해봤을때 Table을 한번만 읽음으로써 I/O는 많이 개선되었으나 Elapsed Time 및 Cpu Time은 증가하였는데 결과만 보면 Decode함수를 사용하는것이 성능상 불리해 보이지만 이 경우 Table에 적재된 데이터의 순서와 인덱스의 정렬된 순서가 거의 일치함으로써 Index Clustering Factor가 좋아서 나타나는 결과이다.
만일 데이터입력을 Target 순서대로 입력을 하였다면 Decode가 좋은 성능을 보일것이다.
4. 각 일자별 판매목표와 판매량 데이터를 colunm형태로 조회하는데 Decode함수를 이용하는데 불필요한 Default 제거.
SELECT '2018/03' AS SALEDT_03
, SUM(DECODE(SALE_DT, '20180301', TARGET)) AS TARGER_0301
, SUM(DECODE(SALE_DT, '20180301', SALECNT)) AS SALE_0301
, SUM(DECODE(SALE_DT, '20180302', TARGET)) AS TARGER_0302
, SUM(DECODE(SALE_DT, '20180302', SALECNT)) AS SALE_0302
, SUM(DECODE(SALE_DT, '20180303', TARGET)) AS TARGER_0303
, SUM(DECODE(SALE_DT, '20180303', SALECNT)) AS SALE_0303
, SUM(DECODE(SALE_DT, '20180304', TARGET)) AS TARGER_0304
, SUM(DECODE(SALE_DT, '20180304', SALECNT)) AS SALE_0304
, SUM(DECODE(SALE_DT, '20180305', TARGET)) AS TARGER_0305
, SUM(DECODE(SALE_DT, '20180305', SALECNT)) AS SALE_0305
, SUM(DECODE(SALE_DT, '20180306', TARGET)) AS TARGER_0306
, SUM(DECODE(SALE_DT, '20180306', SALECNT)) AS SALE_0306
, SUM(DECODE(SALE_DT, '20180307', TARGET)) AS TARGER_0307
, SUM(DECODE(SALE_DT, '20180307', SALECNT)) AS SALE_0307
FROM SALES_TEST
※ 아래는 위의 sql문 실행시 trace파일의 일부이다.
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.36 0.39 0 4856 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.37 0.40 0 4856 0 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 84
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=4856 pr=0 pw=0 time=0 us)
659535 TABLE ACCESS FULL SALES_TEST (cr=4856 pr=0 pw=0 time=102904 us cost=1336 size=11212095 card=659535)
※ Decode함수에서 Default Value를 지정하고 Sum함수를 수행하는데
Sum함수의 특성상 Row를 Sum할때는 Null을 제외하고 Sum을 하므로 Decode함수에서 Default Value값을 제거함으로써 성능의 향상을 기대할 수 있다.
5. Decode함수의 Default 값 제거에 따른 오류데이터 발생 가능성을 NVL함수로 제거.
SELECT '2018/03' AS SALEDT_03
, NVL(SUM(DECODE(SALE_DT, '20180301', TARGET)),0) AS TARGER_0301
, NVL(SUM(DECODE(SALE_DT, '20180301', SALECNT)),0) AS SALE_0301
, NVL(SUM(DECODE(SALE_DT, '20180302', TARGET)),0) AS TARGER_0302
, NVL(SUM(DECODE(SALE_DT, '20180302', SALECNT)),0) AS SALE_0302
, NVL(SUM(DECODE(SALE_DT, '20180303', TARGET)),0) AS TARGER_0303
, NVL(SUM(DECODE(SALE_DT, '20180303', SALECNT)),0) AS SALE_0303
, NVL(SUM(DECODE(SALE_DT, '20180304', TARGET)),0) AS TARGER_0304
, NVL(SUM(DECODE(SALE_DT, '20180304', SALECNT)),0) AS SALE_0304
, NVL(SUM(DECODE(SALE_DT, '20180305', TARGET)),0) AS TARGER_0305
, NVL(SUM(DECODE(SALE_DT, '20180305', SALECNT)),0) AS SALE_0305
, NVL(SUM(DECODE(SALE_DT, '20180306', TARGET)),0) AS TARGER_0306
, NVL(SUM(DECODE(SALE_DT, '20180306', SALECNT)),0) AS SALE_0306
, NVL(SUM(DECODE(SALE_DT, '20180307', TARGET)),0) AS TARGER_0307
, NVL(SUM(DECODE(SALE_DT, '20180307', SALECNT)),0) AS SALE_0307
FROM SALES_TEST
※ 아래는 위의 sql문 실행시 trace파일의 일부이다.
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 2 0.37 0.41 0 4856 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 5 0.38 0.41 0 4856 0 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 84
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=4856 pr=0 pw=0 time=0 us)
659535 TABLE ACCESS FULL SALES_TEST (cr=4856 pr=0 pw=0 time=99064 us cost=1336 size=11212095 card=659535)
※ 해당 날짜에 데이터가 없는 경우에는 Null이 반환되므로 위의 쿼리에서 NVL함수로 데이터를 보정해야 한다.
또한 NVL함수를 추가할때는 함수의 수행횟수가 가장 적은 위치에 추가해야 한다.
- NVL(SUM(DECODE(SALES_DT,'20180301',TARGET)),0) ⇒ 효율이 좋다.
- SUM(DECODE(SALES_DT,'20180301',NVL(TARGET,0))) ⇒ 효율이 좋지 않다.