ZHSGBK16到AL32UTF8的一点测试

关于字符集的一点测试,由于某客户的一套rac需要更改字符集,所以我这里再次进行测试。
#### 原库

SQL> select userenv('language') from dual;

USERENV('LANGUAGE')
----------------------------------------------------
SIMPLIFIED CHINESE_CHINA.ZHS16GBK

SQL> show parameter nls

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
nls_calendar                         string
nls_comp                             string
nls_currency                         string
nls_date_format                      string
nls_date_language                    string
nls_dual_currency                    string
nls_iso_currency                     string
nls_language                         string      AMERICAN
nls_length_semantics                 string      BYTE
nls_nchar_conv_excp                  string      FALSE
nls_numeric_characters               string
nls_sort                             string
nls_territory                        string      AMERICA
nls_time_format                      string
nls_time_tz_format                   string
nls_timestamp_format                 string
nls_timestamp_tz_format              string

SQL> conn roger/roger
已连接。

SQL> create table zifuji(name varchar2(6));

表已创建。

SQL> insert into zifuji values('海天');

已创建 1 行。

SQL> /

已创建 1 行。

SQL> /

已创建 1 行。

SQL> /

已创建 1 行。

SQL> commit;

提交完成。

SQL> select dump('海天') from dual;

DUMP('海天')
-----------------------------
Typ=96 Len=4: 186,163,204,236

++++ 我们可以看到在GBK下,一个汉字占2个字节 ++++
++++ 下面进行数据的导出 ++++

OWNER      DIRECTORY_NAME                 DIRECTORY_PATH
---------- ------------------------------ --------------------------------------------------
SYS        WORK_DIR                       C:\ADE\aime_080314\oracle/work
SYS        ADMIN_DIR                      C:\ADE\aime_080314\oracle/md/admin
SYS        ORACLE_OCM_CONFIG_DIR          G:\oracle\product.2.0\db_3\ccr\state
SYS        DATA_PUMP_DIR                  G:\oracle\product.2.0\admin\alex\dpdump\
SYS        BDUMPDIR                       G:\ORACLE\PRODUCT.2.0\ADMIN\ALEX\BDUMP

C:\Documents and Settings\Administrator> set nls_lang=AMERICAN_AMERICA.ZHS16GBK

C:\Documents and Settings\Administrator> expdp system/oracle directory=data_pump_dir dumpfile=zifuji.dmp tables=roger.zifuji

Export: Release 10.2.0.4.0 - Production on Wednesday, 21 September, 2011 9:42:34

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SYSTEM"."SYS_EXPORT_TABLE_01":  system/******** directory=data_pump_dir dumpfile=zifuji.dmp tables=roger.zifuji
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 64 KB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "ROGER"."ZIFUJI"                            4.945 KB       4 rows
Master table "SYSTEM"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYSTEM.SYS_EXPORT_TABLE_01 is:
  G:\ORACLE\PRODUCT.2.0\ADMIN\ALEX\DPDUMP\ZIFUJI.DMP
Job "SYSTEM"."SYS_EXPORT_TABLE_01" successfully completed at 09:43:00
#### 目标数据库

SQL> select instance_name from v$Instance;

INSTANCE_NAME
--------------------------------
test

SQL> select userenv('language') from dual;

USERENV('LANGUAGE')
--------------------------------------------------------------------------------
SIMPLIFIED CHINESE_CHINA.AL32UTF8

SQL> create user roger identified by roger;

用户已创建。

SQL> grant connect,resource to roger;

授权成功。

SQL> alter user  system identified by oracle;

用户已更改。

SQL> col directory_path for a50
SQL> set lines 140
SQL> select * from dba_directories;

OWNER      DIRECTORY_NAME                                               DIRECTORY_PATH
---------- ------------------------------------------------------------ --------------------------------------------------
SYS        DATA_PUMP_DIR                                                G:\oracle\product.2.0\admin\test\dpdump\
SYS        ORACLE_OCM_CONFIG_DIR                                        G:\oracle\product.2.0\db_3\ccr\state
SYS        ADMIN_DIR                                                    C:\ADE\aime_080314\oracle/md/admin
SYS        WORK_DIR                                                     C:\ADE\aime_080314\oracle/work
C:\Documents and Settings\Administrator> set nls_lang=AMERICAN_AMERICA.AL32UTF8    -- 注意这里

C:\Documents and Settings\Administrator> impdp system/oracle directory=data_pump_dir dumpfile=zifuji.dmp remap_schema=roger:roger remap_tablespace=roger:users

Import: Release 10.2.0.4.0 - Production on Wednesday, 21 September, 2011 9:50:49

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_IMPORT_FULL_01":  system/******** directory=data_pump_dir dumpfile=zifuji.dmp remap_schema=roger:roger remap_tablespa
ce=roger:users
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "ROGER"."ZIFUJI"                            4.945 KB       4 rows
Job "SYSTEM"."SYS_IMPORT_FULL_01" successfully completed at 09:50:55
C:\Documents and Settings\Administrator> sqlplus "/as sysdba"

SQL*Plus: Release 10.2.0.4.0 - Production on Wed Sep 21 09:51:20 2011

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select count(*) from  roger.zifuji;

  COUNT(*)
----------
         4

SQL> select * from roger.zifuji;

NAME
------
娴峰ぉ
娴峰ぉ
娴峰ぉ
娴峰ぉ
发现想 impdp 客户端字符集设置为跟目标库一致后,导入会是乱码,是因为字符集编码的差异。
我们知道在10g以前,也就是使用exp,imp的方式,是可以直接手工修改dmp文件的,具体是修改
dump文件的第2,3个字节,但是如果使用expdp,那么情况就不是这样了。

exp dmp文件:

ZHS16GBK dmp 文件 第2,3字节为 0354
AL32UTF8 dmp 文件 第2,3字节为 0369

如下图所示:

exp情况下的dmp:


expdp情况下的dmp:


下面我们试试直接修改 expdp dmp 文件:


使用UE修改为如下情况:


经过测试发现对于expdp来说,使用传统修改dmp文件的方式不行了,如下:
C:\Documents and Settings\Administrator> impdp system/oracle directory=data_pump_dir dumpfile=zifuji.dmp remap_schema=roger:roger remap_tablespace=roger:users

Import: Release 10.2.0.4.0 - Production on Wednesday, 21 September, 2011 10:09:31

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORA-39001: invalid argument value
ORA-39000: bad dump file specification
ORA-31619: invalid dump file "G:\oracle\product.2.0\admin\test\dpdump\zifuji.dmp"
对于 expdp 的dmp文件,直接修改dmp文件会导致文件损坏。

其实这里我需要将 impdp 客户端字符集设置为跟原库一样即可,因为UTF8包含了所有的GBK字符,不过只是编码不一样而已。

这样在导入的过程中,oracle会自己去做字符集编码的转换。
C:\Documents and Settings\Administrator> set nls_lang=AMERICAN_AMERICA.ZHS16GBK

C:\Documents and Settings\Administrator> impdp system/oracle directory=data_pump_dir dumpfile=zifuji.dmp remap_schema=roger:roger remap_tablespace=roger:users

Import: Release 10.2.0.4.0 - Production on Wednesday, 21 September, 2011 10:12:03

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_IMPORT_FULL_01":  system/******** directory=data_pump_dir dumpfile=zifuji.dmp remap_schema=roger:roger remap_tablespa
ce=roger:users
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "ROGER"."ZIFUJI"                            4.945 KB       4 rows
Job "SYSTEM"."SYS_IMPORT_FULL_01" successfully completed at 10:12:08
C:\Documents and Settings\Administrator>sqlplus "/as sysdba"

SQL*Plus: Release 10.2.0.4.0 - Production on Wed Sep 21 10:12:47 2011

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select instance_name from v$Instance;

INSTANCE_NAME
--------------------------------
test

SQL> select * from roger.zifuji;

NAME
------------
海天
海天
海天
海天

SQL> select dump('海天') from dual;

DUMP('海天')
-----------------------------------------------
Typ=96 Len=6: 230,181,183,229,164,169

SQL> conn roger/roger
Connected.

SQL> drop table zifuji;

Table dropped.

我们看到,在UTF8下面这2个汉字共占据了6个字节,即每个汉字3个字节。
由于原表结构为 zifuji(name varchar2(6)); 将其修改为4然后再试试。

#### 原库

SQL> conn roger/roger
Connected.

SQL> desc zifuji

 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 NAME                                               VARCHAR2(6)

SQL> alter table zifuji modify(name varchar2(4));

Table altered.

SQL> desc zifuji

 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 NAME                                               VARCHAR2(4)

SQL> select * from zifuji;

NAME
----
海天
海天
海天
海天
#### 导出该表

C:\Documents and Settings\Administrator> set nls_lang=AMERICAN_AMERICA.ZHS16GBK

C:\Documents and Settings\Administrator> expdp system/oracle directory=data_pump_dir dumpfile=roger.dmp tables=roger.zifuji

Export: Release 10.2.0.4.0 - Production on Wednesday, 21 September, 2011 10:21:59

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SYSTEM"."SYS_EXPORT_TABLE_01":  system/******** directory=data_pump_dir dumpfile=roger.dmp tables=roger.zifuji
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 64 KB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "ROGER"."ZIFUJI"                            4.945 KB       4 rows
Master table "SYSTEM"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYSTEM.SYS_EXPORT_TABLE_01 is:
  G:\ORACLE\PRODUCT.2.0\ADMIN\ALEX\DPDUMP\ROGER.DMP
Job "SYSTEM"."SYS_EXPORT_TABLE_01" successfully completed at 10:22:21
#### 导入到目标库

C:\Documents and Settings\Administrator> set nls_lang=AMERICAN_AMERICA.ZHS16GBK

C:\Documents and Settings\Administrator> impdp system/oracle directory=data_pump_dir dumpfile=roger.dmp remap_schema=roger:roger remap_tablespace=roger:users

Import: Release 10.2.0.4.0 - Production on Wednesday, 21 September, 2011 10:24:35

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_IMPORT_FULL_01":  system/******** directory=data_pump_dir dumpfile=roger.dmp remap_schema=roger:roger remap_tab
e=roger:users
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
ORA-02374: conversion error loading table "ROGER"."ZIFUJI"
ORA-12899: value too large for column NAME (actual: 6, maximum: 4)

ORA-02372: data for row: NAME : 0X'BAA3CCEC'

ORA-02374: conversion error loading table "ROGER"."ZIFUJI"
ORA-12899: value too large for column NAME (actual: 6, maximum: 4)

ORA-02372: data for row: NAME : 0X'BAA3CCEC'

ORA-02374: conversion error loading table "ROGER"."ZIFUJI"
ORA-12899: value too large for column NAME (actual: 6, maximum: 4)

ORA-02372: data for row: NAME : 0X'BAA3CCEC'

ORA-02374: conversion error loading table "ROGER"."ZIFUJI"
ORA-12899: value too large for column NAME (actual: 6, maximum: 4)

ORA-02372: data for row: NAME : 0X'BAA3CCEC'

. . imported "ROGER"."ZIFUJI"                            4.945 KB       0 out of 4 rows
Job "SYSTEM"."SYS_IMPORT_FULL_01" successfully completed at 10:24:38
报错是因为在UTF8下,一个汉字是3个字节了,而原来的表结构 zifuji(name varchar2(6)); 字段长度为4。

最后需要说明一点的是,AL32UTF8包含了所有的ZHS16GBK汉字,只是其编码方式不同而已。

评论

  1. 学习!

  2. 请问做这种类型的字符集修改时有没有什么简单的方式?

    我是先导出,导入的时候先导表结构,然后把所有varchar2 char都改大,然后再导入数据。

    不知有没有更好的做法。

  3. 还是你行 。翻遍了网,只有你说清楚了。

发表回复

您的电子邮箱地址不会被公开。 必填项已用 * 标注