Migrating from sql server to Oracle varchar length issues
What is your database character set
SELECT parameter, value FROM v$nls_parameters WHERE parameter LIKE '%CHARACTERSET'
Assuming that your database character set is AL32UTF8, each character could require up to 4 bytes of storage (though almost every useful character can be represented with at most 3 bytes of storage). So you could declare your column as VARCHAR2(1020)
to ensure that you have enough space.
You could also simply use character length semantics. If you declare your column VARCHAR2(255 CHAR)
, you'll allocate space for 255 characters regardless of the amount of space that requires. If you change the NLS_LENGTH_SEMANTICS
initialization parameter from the default BYTE to CHAR, you'll change the default so that VARCHAR2(255)
is interpreted as VARCHAR2(255 CHAR)
rather than VARCHAR2(255 BYTE)
. Note that the 4000-byte limit on a VARCHAR2
remains even if you are using character length semantics.
If your data contains line breaks, do you need the TRAILING NULLCOLS
parameter? That implies that sometimes columns may be omitted from the end of a logical row. If you combine columns that may be omitted with columns that contain line breaks and data that is not enclosed by at least an optional enclosure character, it's not obvious to me how you would begin to identify where a logical row ended and where it began. If you don't actually need the TRAILING NULLCOLS
parameter, you should be able to use the CONTINUEIF parameter to combine multiple physical rows into a single logical row. If you can change the data file format, I'd strongly suggest adding an optional enclosure character.
The bytes used by an NVARCHAR
field is equal to two times the number of characters plus two (see http://msdn.microsoft.com/en-us/library/ms186939.aspx), so if you make your VARCHAR
field 512 you may be OK. There's also some indication that some character sets use 4 bytes per character, but I've found no indication that Hebrew is one of these character sets.