Home Home    Forum    Blog    Feed your aggregator (RSS 2.0)

The Johnnynine Weblog - Reading and Writing Unicode to/from a Unicode Oracle Database
A weblog by Johnny Hughes
 
 Thursday, November 20, 2008

Reading and Writing unicode strings to and from a Unicode Oracle database can be seem tricky and there isn't much information available on the subject.  The solution is actually fairly simple, just make sure you are using the proper connection string.

ADO.NET: This is simple, using the OracleClient ensure that the connection string has: Unicode=true

Example: Data Source=ds;User ID=username;Password=pass1;Unicode=true;

ADO (VB6): In short the only 100% compatible solution is to use the ADO OraOleDB.Oracle data provider.

Example: Provider=ORAOLEDB.ORACLE;Data Source=ds;User Id=username;Password=password;

It also seems that the East Asian language files must be installed when writing those characters (at least with ADO).  This is done from the Regional and Language Option control panel, Languages tab.

When using parameters (at least with ADO) if not using a DSN connection (which you can't really if using unicode), the appropriate parameter properties must be set manually (such as size), otherwise the following error occurs: Run-time error '-2147217839 (80040e51)  Provider cannot derive parameter information and SetParameterInfo has not been called.

 

The grid below shows my test results from a VB6 application using ADO.

Actual 2 Japanese chars written: "とら" which has unicode values of 12392 and 12425.

Oracle 10g AL32UTF8 database.

write method  vb shows in      read     read char 1    readchar 2        actual value stored
              debug window     length   unicode value  unicode value     in the database

MSDAORA: Provider=msdaora;Data Source=ds;User Id=username;Password=password;
FROM SQL:     ‚Ƃ砠           2        12392          12425             ‚Æ‚ç  (4 characters)
FROM PARARMS: ‚Ƃ砠           2        12392          12425             ‚Æ‚ç  (4 characters)
PRESAVED:     ¿¿               2        -129           -129              とら
Conclusion: Reads and writes each encoded byte as separate characters, so the database ends up storing 4 characters in this case instead of 2.  Reading is consistent with writing so the resulting read value is actually correct in memory, but the stored database value is incorrectly stored.  Reading a properly stored unicode string from the database results in a corrupt read value.

DSN (Force SQL_WCHAR Support checked): dsn=dsnname;uid=username;pwd=password
FROM SQL:     ??               2        191            191               ¿¿
FROM PARARMS: ‚Ƃ砠           2        12392          12425             とら
PRESAVED:     ‚Ƃ砠           2        12392          12425             とら
Conclusion: When writing unicode strings using normal sql the stored value is corrupt, however using parameters when writing results in a properly stored value. Reading a properly stored unicode string from the database is successfull.

DSN (no workarounds checked): dsn=dsnname;uid=username;pwd=password
FROM SQL:     ¿¿               2       -129            -129              ¿¿
FROM PARARMS: ¿¿               2       -129            -129              とら
PRESAVED:     ¿¿               2       -129            -129              とら
Conclusion: When writing unicode strings using normal sql the stored value is corrupt, however using parameters when writing results in a properly stored value. Reading a properly stored unicode string from the database is NOT successful.

ORAOLEDB.ORACLE: Provider=ORAOLEDB.ORACLE;Data Source=ds;User Id=username;Password=password;
FROM SQL:     ‚Ƃ砠           2        12392          12425             とら
FROM PARARMS: ‚Ƃ砠           2        12392          12425             とら
PRESAVED:     ‚Ƃ砠           2        12392          12425             とら
Conclusion: This works in all situations.

The PRESAVED lines above indicate a record was previously saved to the database with a valid unicode value and was then read using the indicated data provider.

Thursday, November 20, 2008 3:49:10 PM (US Mountain Standard Time, UTC-07:00)  #    Comments [1]    | 
Copyright © 2010 Johnny Hughes. All rights reserved.
DasBlog 'Portal' theme by Johnny Hughes.