Oracle table() function disabling optimizations, part 2
2025-12-10
Earlier
I wrote about how Oracle Database disables some
optimizations like join predicate pushdown whenever a
table() function is present anywhere in the parsed
query structure. There are ways to rewrite queries to not use
table() function, but sometimes it gets difficult.
However, I had already noticed some performance issues when
parsing queries having large UNION ALL blocks.
After upgrading from Oracle 19.24 to 19.26 our system more or
less collapsed. Some of our queries with large
UNION ALL blocks became extremely slow, and some
weren't even able to finish parsing in several minutes!
Convincing Oracle to fix a performance regression would probably
be a long road, so we had to find a workaround.
It turns out there's another way to express the same trick
but without a manually written large UNION ALL.
Where were we
In the previous post, we created some structures and test
data. We finished with a query that performs well since we used
a trick to avoid needing the table() function:
SELECT /*+ GATHER_PLAN_STATISTICS USE_NL(d e) */ (
SELECT count(*)
FROM (SELECT a, regexp_count(a, '[^_]+') b FROM (SELECT /*+ NO_MERGE */ repeat_value_str(1, LEAST(emps, 5)) a FROM DUAL)) foo,
LATERAL (SELECT regexp_substr(a, '[^_]+', 1, 1) c FROM DUAL UNION ALL
SELECT regexp_substr(a, '[^_]+', 1, 2) c FROM DUAL UNION ALL
SELECT regexp_substr(a, '[^_]+', 1, 3) c FROM DUAL UNION ALL
SELECT regexp_substr(a, '[^_]+', 1, 4) c FROM DUAL UNION ALL
SELECT regexp_substr(a, '[^_]+', 1, 5) c FROM DUAL
WHERE 1 = CASE WHEN b <= 5 THEN 1 ELSE to_number('a') END)
WHERE c IS NOT NULL
)
FROM (SELECT depid, depname FROM department GROUP BY depid, depname) d
JOIN (SELECT depid, count(*) emps FROM employee GROUP BY depid) e ON e.depid = d.depid
WHERE depname = 'department-1';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'allstats last'));which produced the following plan:
-----------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 7 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 25 | | | |
| 2 | NESTED LOOPS | | 1 | 5 | 5 |00:00:00.01 | 25 | | | |
| 3 | VIEW | | 1 | 1 | 1 |00:00:00.01 | 25 | | | |
| 4 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | | | |
| 5 | VIEW | VW_LAT_33CB5887 | 1 | 5 | 5 |00:00:00.01 | 0 | | | |
| 6 | UNION-ALL | | 1 | | 5 |00:00:00.01 | 0 | | | |
|* 7 | FILTER | | 1 | | 1 |00:00:00.01 | 0 | | | |
| 8 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | | | |
|* 9 | FILTER | | 1 | | 1 |00:00:00.01 | 0 | | | |
| 10 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | | | |
|* 11 | FILTER | | 1 | | 1 |00:00:00.01 | 0 | | | |
| 12 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | | | |
|* 13 | FILTER | | 1 | | 1 |00:00:00.01 | 0 | | | |
| 14 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | | | |
|* 15 | FILTER | | 1 | | 1 |00:00:00.01 | 0 | | | |
| 16 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | | | |
| 17 | NESTED LOOPS | | 1 | 1 | 1 |00:00:00.01 | 7 | | | |
| 18 | VIEW | | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
| 19 | HASH GROUP BY | | 1 | 1 | 1 |00:00:00.01 | 3 | 1422K| 1422K| 492K (0)|
| 20 | TABLE ACCESS BY INDEX ROWID BATCHED| DEPARTMENT | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
|* 21 | INDEX RANGE SCAN | DEPNAME_IDX | 1 | 1 | 1 |00:00:00.01 | 2 | | | |
| 22 | VIEW PUSHED PREDICATE | | 1 | 1 | 1 |00:00:00.01 | 4 | | | |
|* 23 | FILTER | | 1 | | 1 |00:00:00.01 | 4 | | | |
| 24 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 4 | | | |
|* 25 | INDEX RANGE SCAN | DEP_IDX | 1 | 1000 | 1000 |00:00:00.01 | 4 | | | |
-----------------------------------------------------------------------------------------------------------------------------------------------Unfortunately, if we increase the number of
UNION ALLs to several hundred, the parsing time
explodes when the query is part of a significantly larger query,
especially when there is more than one large
UNION ALL block.
Alternative trick
Instead of producing multiple rows, we can express the same
result as a single row that we then UNPIVOT into
multiple rows:
SELECT /*+ GATHER_PLAN_STATISTICS USE_NL(d e) */ count(*)
FROM (SELECT depid, depname FROM department GROUP BY depid, depname) d
JOIN (SELECT depid, count(*) emps FROM employee GROUP BY depid) e ON e.depid = d.depid,
LATERAL (SELECT repeat_value_str(1, LEAST(emps, 5)) a FROM DUAL),
LATERAL (SELECT column_value
FROM (SELECT regexp_substr(a,'[^_]+',1,1) c1,
regexp_substr(a,'[^_]+',1,2) c2,
regexp_substr(a,'[^_]+',1,3) c3,
regexp_substr(a,'[^_]+',1,4) c4,
regexp_substr(a,'[^_]+',1,5) c5
FROM DUAL
WHERE 1 = CASE WHEN regexp_count(a, '[^_]+') <= 5 THEN 1 ELSE to_number('a') END)
UNPIVOT (column_value FOR foo IN (c1,c2,c3,c4,c5)))
WHERE depname = 'department-1';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'allstats last'));Here the first LATERAL produces the concatenated
single-row string, and the next LATERAL provides
the magic that splits it apart. The WHERE clause is
again used to alert us if there are more results than our
UNPIVOT can handle.
Checking the execution plan, we can see that it still performs well:
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 7 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 7 | | | |
| 2 | MERGE JOIN CARTESIAN | | 1 | 5 | 5 |00:00:00.01 | 7 | | | |
| 3 | NESTED LOOPS | | 1 | 1 | 1 |00:00:00.01 | 7 | | | |
| 4 | NESTED LOOPS | | 1 | 1 | 1 |00:00:00.01 | 7 | | | |
| 5 | VIEW | | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
| 6 | HASH GROUP BY | | 1 | 1 | 1 |00:00:00.01 | 3 | 1422K| 1422K| 494K (0)|
| 7 | TABLE ACCESS BY INDEX ROWID BATCHED| DEPARTMENT | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
|* 8 | INDEX RANGE SCAN | DEPNAME_IDX | 1 | 1 | 1 |00:00:00.01 | 2 | | | |
| 9 | VIEW PUSHED PREDICATE | | 1 | 1 | 1 |00:00:00.01 | 4 | | | |
|* 10 | FILTER | | 1 | | 1 |00:00:00.01 | 4 | | | |
| 11 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 4 | | | |
|* 12 | INDEX RANGE SCAN | DEP_IDX | 1 | 1000 | 1000 |00:00:00.07 | 4 | | | |
| 13 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | | | |
| 14 | BUFFER SORT | | 1 | 5 | 5 |00:00:00.01 | 0 | 2048 | 2048 | 2048 (0)|
|* 15 | VIEW | | 1 | 5 | 5 |00:00:00.01 | 0 | | | |
| 16 | UNPIVOT | | 1 | | 5 |00:00:00.01 | 0 | | | |
| 17 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | | | |
----------------------------------------------------------------------------------------------------------------------------------------------Interestingly, if we use Oracle's
DBMS_UTILITY.expand_sql_text to see how it
internally represents the query:
DECLARE
l_clob CLOB;
BEGIN
DBMS_UTILITY.expand_sql_text(input_sql_text => q'[
SELECT /*+ GATHER_PLAN_STATISTICS USE_NL(d e) */ count(*)
FROM (SELECT depid, depname FROM department GROUP BY depid, depname) d
JOIN (SELECT depid, count(*) emps FROM employee GROUP BY depid) e ON e.depid = d.depid,
LATERAL (SELECT repeat_value_str(1, LEAST(emps, 5)) a FROM DUAL),
LATERAL (SELECT column_value
FROM (SELECT regexp_substr(a,'[^_]+',1,1) c1,
regexp_substr(a,'[^_]+',1,2) c2,
regexp_substr(a,'[^_]+',1,3) c3,
regexp_substr(a,'[^_]+',1,4) c4,
regexp_substr(a,'[^_]+',1,5) c5
FROM DUAL
WHERE 1 = CASE WHEN regexp_count(a, '[^_]+') <= 5 THEN 1 ELSE to_number('a') END)
UNPIVOT (column_value FOR foo IN (c1,c2,c3,c4,c5)))
WHERE depname = 'department-1'
]', output_sql_text => l_clob);
DBMS_OUTPUT.put_line(l_clob);
END;
/We can see that Oracle has internally rewritten our
UNPIVOT into a large UNION ALL
structure:
SELECT COUNT(*) "COUNT(*)"
FROM (SELECT "A10"."DEPID" "QCSJ_C000000000500000",
"A10"."DEPNAME" "DEPNAME_1",
"A9"."DEPID" "QCSJ_C000000000500001",
"A9"."EMPS" "EMPS_3"
FROM (SELECT "A11"."DEPID" "DEPID",
"A11"."DEPNAME" "DEPNAME"
FROM "INFRA"."DEPARTMENT" "A11"
GROUP BY "A11"."DEPID", "A11"."DEPNAME"
) "A10",
(SELECT "A12"."DEPID" "DEPID", COUNT(*) "EMPS"
FROM "INFRA"."EMPLOYEE" "A12"
GROUP BY "A12"."DEPID"
) "A9"
WHERE "A9"."DEPID" = "A10"."DEPID"
) "A3",
LATERAL (
(SELECT "INFRA"."REPEAT_VALUE_STR"(1, least("A3"."EMPS_3", 5)) "A_0"
FROM "SYS"."DUAL" "A8"
)
) "A2",
LATERAL (
(SELECT "A4"."COLUMN_VALUE" "COLUMN_VALUE"
FROM (SELECT "A5"."FOO" "FOO",
"A5"."COLUMN_VALUE" "COLUMN_VALUE"
FROM (SELECT 'C1' "FOO",
"A13"."C1" "COLUMN_VALUE"
FROM (SELECT regexp_substr("A2"."A_0", '[^_]+', 1, 1) "C1",
regexp_substr("A2"."A_0", '[^_]+', 1, 2) "C2",
regexp_substr("A2"."A_0", '[^_]+', 1, 3) "C3",
regexp_substr("A2"."A_0", '[^_]+', 1, 4) "C4",
regexp_substr("A2"."A_0", '[^_]+', 1, 5) "C5"
FROM "SYS"."DUAL" "A7"
WHERE 1 = CASE WHEN ( regexp_count("A2"."A_0", '[^_]+') <= 5 ) THEN 1 ELSE TO_NUMBER('a') END
) "A13"
UNION ALL
SELECT 'C2' "FOO",
"A13"."C2" "COLUMN_VALUE"
FROM (SELECT regexp_substr("A2"."A_0", '[^_]+', 1, 1) "C1",
regexp_substr("A2"."A_0", '[^_]+', 1, 2) "C2",
regexp_substr("A2"."A_0", '[^_]+', 1, 3) "C3",
regexp_substr("A2"."A_0", '[^_]+', 1, 4) "C4",
regexp_substr("A2"."A_0", '[^_]+', 1, 5) "C5"
FROM "SYS"."DUAL" "A7"
WHERE 1 = CASE WHEN ( regexp_count("A2"."A_0", '[^_]+') <= 5 ) THEN 1 ELSE TO_NUMBER('a') END
) "A13"
UNION ALL
SELECT 'C3' "FOO",
"A13"."C3" "COLUMN_VALUE"
FROM (SELECT regexp_substr("A2"."A_0", '[^_]+', 1, 1) "C1",
regexp_substr("A2"."A_0", '[^_]+', 1, 2) "C2",
regexp_substr("A2"."A_0", '[^_]+', 1, 3) "C3",
regexp_substr("A2"."A_0", '[^_]+', 1, 4) "C4",
regexp_substr("A2"."A_0", '[^_]+', 1, 5) "C5"
FROM "SYS"."DUAL" "A7"
WHERE 1 = CASE WHEN ( regexp_count("A2"."A_0", '[^_]+') <= 5 ) THEN 1 ELSE TO_NUMBER('a') END
) "A13"
UNION ALL
SELECT 'C4' "FOO",
"A13"."C4" "COLUMN_VALUE"
FROM (SELECT regexp_substr("A2"."A_0", '[^_]+', 1, 1) "C1",
regexp_substr("A2"."A_0", '[^_]+', 1, 2) "C2",
regexp_substr("A2"."A_0", '[^_]+', 1, 3) "C3",
regexp_substr("A2"."A_0", '[^_]+', 1, 4) "C4",
regexp_substr("A2"."A_0", '[^_]+', 1, 5) "C5"
FROM "SYS"."DUAL" "A7"
WHERE 1 = CASE WHEN ( regexp_count("A2"."A_0", '[^_]+') <= 5 ) THEN 1 ELSE TO_NUMBER('a') END
) "A13"
UNION ALL
SELECT 'C5' "FOO",
"A13"."C5" "COLUMN_VALUE"
FROM (SELECT regexp_substr("A2"."A_0", '[^_]+', 1, 1) "C1",
regexp_substr("A2"."A_0", '[^_]+', 1, 2) "C2",
regexp_substr("A2"."A_0", '[^_]+', 1, 3) "C3",
regexp_substr("A2"."A_0", '[^_]+', 1, 4) "C4",
regexp_substr("A2"."A_0", '[^_]+', 1, 5) "C5"
FROM "SYS"."DUAL" "A7"
WHERE 1 = CASE WHEN ( regexp_count("A2"."A_0", '[^_]+') <= 5 ) THEN 1 ELSE TO_NUMBER('a') END
) "A13"
) "A5"
WHERE "A5"."COLUMN_VALUE" IS NOT NULL
) "A4"
)
) "A1"
WHERE "A3"."DEPNAME_1" = 'department-1'However, with this format we haven't observed any parsing
performance issues. I can only speculate that perhaps
UNPIVOT acts as a barrier for the optimizer,
preventing it from delving inside to try various kinds of
merging and other optimizations, which would otherwise blow up
the parsing time.
Where to go from here
Of course, no one in their right mind would want to write
hundreds of columns for an UNPIVOT clause.
Fortunately, Oracle nowadays has support for macros!
CREATE OR REPLACE PACKAGE table_util AUTHID DEFINER AS
FUNCTION assert_matches(p_max_values NUMBER, p_matches NUMBER, p_msg VARCHAR2) RETURN NUMBER;
FUNCTION assert_nonnull(p_value VARCHAR2, p_msg VARCHAR2) RETURN VARCHAR2;
-- Splits '_'-separated _values parameter string to rows without table-function.
-- Can produce at most p_max_values rows. Throws an error if there are more.
FUNCTION split(p_max_values NUMBER, p_values DBMS_TF.COLUMNS_T, p_msg VARCHAR2 DEFAULT null) RETURN CLOB SQL_MACRO;
END;
/
CREATE OR REPLACE PACKAGE BODY table_util AS
FUNCTION assert_matches(p_max_values NUMBER, p_matches NUMBER, p_msg VARCHAR2) RETURN NUMBER IS
PRAGMA UDF;
BEGIN
IF p_matches > p_max_values THEN
raise_application_error(-20610, 'Got more rows than the maximum expected: '||p_matches||' > '||p_max_values||'. Increase in "'||p_msg||'" the first argument of table_util.split!');
END IF;
RETURN 1;
END;
FUNCTION assert_nonnull(p_value VARCHAR2, p_msg VARCHAR2) RETURN VARCHAR2 IS
PRAGMA UDF;
BEGIN
IF p_value IS NULL THEN
raise_application_error(-20611, 'Macro got a null value in "'||p_msg||'". Check that the value is always nonnull, otherwise this is an Oracle bug!');
END IF;
RETURN p_value;
END;
FUNCTION split(p_max_values NUMBER, p_values DBMS_TF.COLUMNS_T, p_msg VARCHAR2) RETURN CLOB IS
substrs CLOB;
cols VARCHAR2(32000) := '';
args VARCHAR2(1000) := '';
BEGIN
FOR i IN 1..p_values.COUNT LOOP
args := args||p_values(i)||CASE WHEN i < p_values.COUNT THEN ',' ELSE '' END;
END LOOP;
substrs := TO_CLOB('regexp_substr(table_util.assert_nonnull('||args||',p_msg),''[^_]+'',1) c1');
FOR i IN 2..p_max_values LOOP
substrs := substrs||',regexp_substr('||args||',''[^_]+'',NULLIF(INSTR('||args||',''_'',1,'||(i-1)||'),0)+1) c'||to_clob(i);
END LOOP;
FOR i IN 1..p_max_values LOOP
cols := cols||'c'||i||CASE WHEN i < p_max_values THEN ',' ELSE '' END;
END LOOP;
RETURN TO_CLOB('SELECT column_value
FROM (SELECT '||substrs||'
FROM DUAL
WHERE 1 = table_util.assert_matches(p_max_values, regexp_count('||args||', ''[^_]+''), p_msg))
UNPIVOT (column_value FOR foo IN ('||cols||'))');
END;
END;
/Using this macro, we can write the actual query in a clean and even intuitive way:
SELECT /*+ GATHER_PLAN_STATISTICS USE_NL(d e) */ count(*)
FROM (SELECT depid, depname FROM department GROUP BY depid, depname) d
JOIN (SELECT depid, count(*) emps FROM employee GROUP BY depid) e ON e.depid = d.depid,
LATERAL (SELECT (SELECT /*+ NO_MERGE */ repeat_value_str(1, LEAST(emps, 5)) FROM DUAL) a FROM DUAL),
LATERAL (SELECT * FROM table_util.split(5, COLUMNS(a), 'my test query for the blog'))
WHERE depname = 'department-1';We can provide descriptive text as the third parameter so that when there are too many values, the error message tells us which query has the problem. The solution is simply to increase the first parameter.
Unfortunately, Oracle is still buggy, and using
LATERAL views sometimes still produces
null values here and there unless the
NO_MERGE hints shown here are used.
The execution plan is good, and we can see
VIEW PUSHED PREDICATE has been applied:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'allstats last'));
--------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 7 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 7 | | | |
| 2 | NESTED LOOPS | | 1 | 5 | 5 |00:00:00.01 | 7 | | | |
| 3 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | | | |
| 4 | NESTED LOOPS | | 1 | 1 | 1 |00:00:00.01 | 7 | | | |
| 5 | NESTED LOOPS | | 1 | 1 | 1 |00:00:00.01 | 7 | | | |
| 6 | VIEW | | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
| 7 | HASH GROUP BY | | 1 | 1 | 1 |00:00:00.01 | 3 | 1422K| 1422K| 491K (0)|
| 8 | TABLE ACCESS BY INDEX ROWID BATCHED| DEPARTMENT | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
|* 9 | INDEX RANGE SCAN | DEPNAME_IDX | 1 | 1 | 1 |00:00:00.01 | 2 | | | |
| 10 | VIEW PUSHED PREDICATE | | 1 | 1 | 1 |00:00:00.01 | 4 | | | |
|* 11 | FILTER | | 1 | | 1 |00:00:00.01 | 4 | | | |
| 12 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 4 | | | |
|* 13 | INDEX RANGE SCAN | DEP_IDX | 1 | 1000 | 1000 |00:00:00.01 | 4 | | | |
| 14 | VIEW | VW_LAT_2D0B8FC8 | 1 | 1 | 1 |00:00:00.01 | 0 | | | |
| 15 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | | | |
| 16 | VIEW | VW_LAT_2D0B8FC8 | 1 | 5 | 5 |00:00:00.01 | 0 | | | |
|* 17 | VIEW | | 1 | 5 | 5 |00:00:00.01 | 0 | | | |
| 18 | UNPIVOT | | 1 | | 5 |00:00:00.01 | 0 | | | |
|* 19 | FILTER | | 1 | | 1 |00:00:00.01 | 0 | | | |
| 20 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | | | |
--------------------------------------------------------------------------------------------------------------------------------------------------Conclusion
If you want to return multiple values from a function without
sacrificing optimizations, avoid using table()
function by rewriting them using UNPIVOT instead.
You can create a macro to make the syntax cleaner and easier to
use. This approach can help avoid parsing performance problems
while still achieving the desired functionality.