Interesting things about index cardinality and cost calculation
For the following statement, 9i, 10g and 11g output the different cardinality and cost.
create table t1 nologging as select trunc(dbms_random.value(0,25)) n1, rpad('x',40) ind_pad, trunc(dbms_random.value(0,20)) n2, lpad(rownum,10,'0') small_vc, rpad('x',200) padding from all_objects where rownum <= 10000 ; update t1 set n1 = n2 where n2 in (1,3,5,7,11,13,15,17,19); create index t1_i1 on t1(n1, ind_pad, n2) nologging pctfree 91 ; -- gather statistics TABLE_NAME BLOCKS NUM_ROWS -------------------- ---------- ---------- T1 371 10000 1 row selected. NUM_ROWS DISTINCT_KEYS BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR AVG_LEAF_BLOCKS_PER_KEY AVG_DATA_BLOCKS_PER_KEY ---------- ------------- ---------- ----------- ----------------- ----------------------- ----------------------- 10000 284 2 1111 7830 3 27 1 row selected. COLUMN_NAME NUM_NULLS NUM_DISTINCT DENSITY LOW_VALUE HIGH_VALUE -------------------- ---------- ------------ ---------- ------------------------ ------------------------ IND_PAD 0 1 1 782020202020202020202020 782020202020202020202020 202020202020202020202020 202020202020202020202020 2020202020202020 2020202020202020 N1 0 25 .04 80 C119 N2 0 20 .05 80 C114 set autot trace explain select small_vc from t1 where n1 = 2 and ind_pad = rpad('x',40) and n2 = 4 ;
9i:
Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=21 Card=20 Bytes=1160) 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=21 Card=20 Bytes=1160) 2 1 INDEX (RANGE SCAN) OF 'T1_I1' (NON-UNIQUE) (Cost=5 Card=20)
10gR2:
--------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | --------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20 | 1160 | 34 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 20 | 1160 | 34 | |* 2 | INDEX RANGE SCAN | T1_I1 | 35 | | 6 | --------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("N1"=2 AND "IND_PAD"='x ' AND "N2"=4)
11g:
--------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | --------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 35 | 2030 | 34 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 35 | 2030 | 34 | |* 2 | INDEX RANGE SCAN | T1_I1 | 35 | | 6 | ---------------------------------------------------------------------
For 9i:
cost: 5 = blevel + LEAF_BLOCKS* (1/20)*(1/25)
21 - 5 = clustering_factor * (1/20)*(1/25)
Cardinality: 20 = 10000 * (1/20)*(1/25)
For 10gR2:
cost: 6 = blevel + LEAF_BLOCKS * (1/DISTINCT_KEYS)
34 - 6 = clustering_factor *(1/DISTINCT_KEYS)
Cardinality: 35 = 10000 * (1/DISTINCT_KEYS)
20 = 10000 * (1/20)*(1/25)
For 11g:
cost: 6 = blevel + LEAF_BLOCKS * (1/DISTINCT_KEYS)
34 - 6 = clustering_factor *(1/DISTINCT_KEYS)
Cardinality: 35 = 10000 * (1/DISTINCT_KEYS)
So, in 9i, the distinct_keys of user_indexes view is not used when calculate the cardinality or cost for index access. In 10gR2, the cost and cardinality for index line is calculate using distinct_keys, so is the cost for table access cost. In 11G, the cardinality for table access is also calculated with distinct_keys.
Actually, the cost and cardinality are becoming more and more accurate, when the index columns have some dependence with each other.
