数据库高效优化:架构、规范与SQL技巧
上QQ阅读APP看书,第一时间看更新

7.1 表

“表”是大家最为熟知的一个对象。它也是保存数据的实体。Oracle数据库支持多种表的类型。大家最为常见的就是堆表,它也是适用范围最广的一种表。此外,还支持索引组织表、簇表等。除了按照表的结构分类外,还可以根据表的组织形式、用途等进行分类,比如常见的分区表、临时表等。

下面我们将从最常见的堆表开始介绍。

1.堆表

堆表是Oracle默认的表类型,也是最常用的表类型。除非有特殊原因要使用其他表类型,否则都使用堆表类型。对于堆表来说,最常见的影响性能的因素就是表的规模。这一点很容易理解,规模越大,扫描的块数越多,当然成本也就越高。前面两章提到,如果对表进行全表扫描,会扫描高水位线以下的所有块。这也解释了为什么删除数据后,扫描表仍然很慢。下面通过一个示例说明。


SQL> create table t1 as select * from dba_objects;
//表已创建

SQL> insert into t1 select * from t1;
//已创建 18865 行

SQL> insert into t1 select * from t1;
//已创建 37730 行

SQL> insert into t1 select * from t1;
//已创建 75460 行

SQL> insert into t1 select * from t1;
//已创建 150920 行

SQL> commit;
//提交完成
//这里我们构造一张大表,并插入了几十万条记录

SQL> set serveroutput on
SQL> exec show_space('t1','auto');
Total Blocks............................4096
Total Bytes.............................33554432
Unused Blocks...........................0
Unused Bytes............................0
Last Used Ext FileId....................4
Last Used Ext BlockId...................6144
Last Used Block.........................128
PL/SQL 过程已成功完成。
/*
这里我们调用了一个存储过程(附录中会详细说明)。通过这个存储过程,我们可以观察到表的高水位线信息。对于上面这个示例,高水位线的位置在Total Blocks – Unused Blocks = 4096
*/

SQL> set autotracetraceonly
SQL> select count(*) from t1;
-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |  1115   (2)| 00:00:14 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T1   |  4559K|  1115   (2)| 00:00:14 |
-------------------------------------------------------------------

统计信息
----------------------------------------------------------
5903  consistent gets
4059  physical reads
//通过上面执行的SQL语句可见,这个查询语句大约要执行5000多次逻辑读操作

SQL> delete from t1;
//已删除301840行

SQL> commit;
//提交完成

SQL> exec show_space('t1','auto');
Total Blocks............................4096
Total Bytes.............................33554432
Unused Blocks...........................0
Unused Bytes............................0
Last Used Ext FileId....................4
Last Used Ext BlockId...................6144
Last Used Block.........................128
//PL/SQL 过程已成功完成
//删除操作后,我们通过观察发现高水位线没有变化

SQL> select count(*) from t1;
-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |  1096   (1)| 00:00:14 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T1   |     1 |  1096   (1)| 00:00:14 |
-------------------------------------------------------------------
统计信息
----------------------------------------------------------
0  recursive calls
0  db block gets
5166  consistent gets
4026  physical reads
//删除之后执行查询,仍然需要5000多次的逻辑读操作

SQL> truncate table t1;
//表被截断

SQL> set autotrace off
SQL> exec show_space('t1','auto');
Total Blocks............................8
Total Bytes.............................65536
Unused Blocks...........................5
Unused Bytes............................40960
Last Used Ext FileId....................4
Last Used Ext BlockId...................2096
Last Used Block.........................3
//PL/SQL 过程已成功完成
//截断表后,高水位线明显降低了

SQL> select count(*) from t1;
-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T1   |     1 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------
统计信息
----------------------------------------------------------
1  recursive calls
0  db block gets
4  consistent gets
1  physical reads
/*
高水位线降低后,再次执行查询语句,可见其逻辑读非常小。这也说明了降低高水位线对全表扫描的影响
*/

2.索引组织表

索引组织表,顾名思义,就是存储在一个索引结构中的表,也就是以B+树结构存储。换句话说,在索引组织表中,索引就是数据,数据就是索引,两者合二为一。索引组织表的好处并不在于解决磁盘空间的占用,而是可以减少I/O,进而减少访问缓冲区缓存。

下面我们通过一个示例说明普通堆表与索引组织表的访问对比。


SQL> create table t_normal( aint,bint,c varchar2(100));
//表已创建

SQL> create index idx_normal_a on t_normal(a);
//索引已创建

SQL> insert into t_normal select rownum,object_id,object_name from dba_objects;
//已创建 18867 行

SQL> commit;
//提交完成

SQL> create table t_iot(a int,bint,c varchar2(100),primary key(a)) organization index;
//表已创建

SQL> insert into t_iot select rownum,object_id,object_name from dba_objects;
//已创建 18869 行

SQL> commit;
//提交完成

//上面分别创建了普通表和索引组织表,并插入了相同数据
SQL> set autotracetraceonly
SQL> select * from t_normal where a=1000;
--------------------------------------------------------------------------------
| Id  | Operation             | Name         | Rows  | Bytes | Cost (%CPU)| Time
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |              |     1 |    78 |     1   (0)| 00:00
|   1 |  TABLE ACCESS BY INDEX ROWID
                              | T_NORMAL     |     1 |    78 |     1   (0)| 00:00
|*  2 |   INDEX RANGE SCAN    | IDX_NORMAL_A |    74 |       |     1   (0)| 00:00
--------------------------------------------------------------------------------
统计信息
----------------------------------------------------------
0  recursive calls
0  db block gets
4  consistent gets
0  physical reads
//从上面输出可见,在堆表中访问这条记录需要4个逻辑读操作。从执行计划可见,需要一个回表查询

SQL> select * from t_iot where a=1000;
--------------------------------------------------------------------------------
| Id  | Operation   | Name              | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT
                    |                   |    1 |    78 |     1   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN
                    | SYS_IOT_TOP_21676 |    1 |    78 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------
统计信息
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          2  consistent gets
0  physical reads
/*
从上面输出可见,在索引组织表中访问这条记录需要2个逻辑读,相较堆表访问大大减少了。从执行计划可见,可以直接访问,不需要回表;或者说,就是通过索引直接访问数据
*/

3.分区表

分区表是Oracle数据库中应对大规模数据量的一种很好的解决方案。其基本原理很简单,就是将大的对象分解为若干个小对象。当访问表时,根据分区策略,可以精确地定位到小对象(单个分区),进而提高访问效率。需要说明的是,分区表中的每个分区也是一个独立的堆表,只是逻辑上看起来是一张完整的表。

下面通过一个示例说明普通表与分区表的访问区别。


[hf@testdb] SQL> create table t_part
2  (
  3      owner varchar2(30),
  4      object_namevarchar2(128),
  5      object_id number,
  6      created date
7  )
8  partition by range (created)
9  (
10  partition part_201305 values less than(to_date('2013-06-01','yyyy-mm-dd')),
11  partition part_201306 values less than(to_date('2013-07-01','yyyy-mm-dd')),
12  partition part_201307 values less than(to_date('2013-08-01','yyyy-mm-dd')),
13  partition part_201308 values less than(to_date('2013-09-01','yyyy-mm-dd')),
14  partition part_201309 values less than(to_date('2013-10-01','yyyy-mm-dd')),
15  partition part_201310 values less than(to_date('2013-11-01','yyyy-mm-dd')),
16  partition part_201311 values less than(to_date('2013-12-01','yyyy-mm-dd')),
17  partition part_201312 values less than(to_date('2014-01-01','yyyy-mm-dd')),
18  partition part_201401 values less than(to_date('2014-02-01','yyyy-mm-dd')),
19  partition part_201402 values less than(to_date('2014-03-01','yyyy-mm-dd')),
20  partition part_201403 values less than(to_date('2014-04-01','yyyy-mm-dd')),
21  partition part_201404 values less than(to_date('2014-05-01','yyyy-mm-dd')),
22  partition part_201405 values less than(to_date('2014-06-01','yyyy-mm-dd')),
23  partitionpart_max values less than(maxvalue)
24  );
Table created.

[hf@testdb] SQL> create table t_normal
2  (
  3      owner varchar2(30),
  4      object_namevarchar2(128),
  5      object_id number,
  6      created date
7  );
Table created.

[hf@testdb] SQL> insert into t_normal select owner,object_name,object_id,created from sys.dba_objects;
86299 rows created.

[hf@testdb] SQL> commit;
Commit complete.

[hf@testdb] SQL> insert into t_part select owner,object_name,object_id,created from sys.dba_objects;
86299 rows created.

[hf@testdb] SQL> commit;
Commit complete.
//上面创建了两张表,并插入了数万条记录

[hf@testdb] SQL> exec dbms_stats.gather_table_stats('hf', 't_normal');
PL/SQL procedure successfully completed.

[hf@testdb] SQL> exec dbms_stats.gather_table_stats('hf', 't_part');
PL/SQL procedure successfully completed.

[hf@testdb] SQL> select * from t_part where created between to_date('2013-11-01', 'yyyy-mm-dd') and to_date('2013-11-30','yyyy-mm-dd');
--------------------------------------------------------------------------------
| Id  | Operation             | Name  |Rows|Bytes| Cost(%CPU)| Time    |Pstart|Pstop
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |  1 | 105 |     2 (0)| 00:00:01|     |       
|   1 |  PARTITION RANGE SINGLE |      |  1 | 105 |     2 (0)| 00:00:01|    7| 7
|*  2 |   TABLE ACCESS FULL   |T_PART|  1 | 105 |     2 (0)| 00:00:01|    7| 7  
--------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
0  consistent gets
0  physical reads
//从分区表的访问可见,没有消耗逻辑读。原因是该分区内没有数据,所以无须读取
[hf@testdb] SQL> select * from t_normal where created between to_date('2013-11-01', 'yyyy-mm-dd') and to_date('2013-11-30','yyyy-mm-dd');
-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |    75 |  3300 |   171   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL | T_NORMAL |    75 |  3300 |   171   (1)| 00:00:03 |
-------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
12  recursive calls
0  db block gets
643  consistent gets
0  physical reads
/*
如果直接对表进行访问,则需要600多次逻辑读。由此可见,通过分区访问可以更精确地定位数据,减少访问规模
*/