How to change both charset (NLS_CHARACTERSET) and National Charset (NLS_NCHAR_CHARACTERSET) to UTF8 on a RDS Oracle database? How to change both charset (NLS_CHARACTERSET) and National Charset (NLS_NCHAR_CHARACTERSET) to UTF8 on a RDS Oracle database? oracle oracle

How to change both charset (NLS_CHARACTERSET) and National Charset (NLS_NCHAR_CHARACTERSET) to UTF8 on a RDS Oracle database?


The parameter can be set by specifying --character-set-name during instance creation using AWS CLI. So far I have not found anyway to change that for an existing instance.

In my testing, I set it using --character-set-name KO16MSWIN949, which will support Korean based on AWS doc:

http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.OracleCharacterSets.html


AWS RDS has no way to set the Oracle Database National Charset (NLS_NCHAR_CHARACTERSET) to UTF8. NLS_NCHAR_CHARACTERSET will always be AL16UTF16. Data types NVARCHAR2, NCHAR, and NCLOB are affected. For the sake of discussion, I will refer to these data types as NCHAR.

Use of AL16UTF16 has space consequences in a migration. As the name implies, all characters are stored as 16 bits (2 bytes). For example, the Western letter 'A' will be stored zero-padded as '\0','A'.

Because of this, the space requirement at the migration target could be higher than at the source. How much higher depends on the prevalence of NCHAR columns. 25% higher is an actual example from experience. An 8 TB schema on conventional hardware required 10 TB on AWS RDS.

If your NLS_CHARACTERSET is AL32UTF8, then one way to prevent migration to the space-wasting AL16UTF16 character set is to migrate your NCHAR columns to CHAR. Example:

from:

CREATE TABLE ...( "BUSINESS_UNIT" NVARCHAR2(5) NOT NULL ENABLE, 

to:

alter session set NLS_LENGTH_SEMANTICS = 'CHAR';CREATE TABLE ...( "BUSINESS_UNIT" VARCHAR2(5) NOT NULL ENABLE,etc.


Using Cloudformation you can set it too in the CharacterSetName attribute of AWS::RDS::DBInstance.