To run a multilanguage JCR on an Oracle backend, the database must use
a Unicode character set. The other Oracle globalization parameters have
no impact on JCR. The only property to modify is NLS_CHARACTERSET.
We have tested NLS_CHARACTERSET = AL32UTF8, and it works well for many
European and Asian languages.
Example of database configuration (used for JCR testing):
NLS_LANGUAGE              AMERICAN
NLS_TERRITORY             AMERICA
NLS_CURRENCY              $
NLS_ISO_CURRENCY          AMERICA
NLS_NUMERIC_CHARACTERS    .,
NLS_CHARACTERSET          AL32UTF8
NLS_CALENDAR              GREGORIAN
NLS_DATE_FORMAT           DD-MON-RR
NLS_DATE_LANGUAGE         AMERICAN
NLS_SORT                  BINARY
NLS_TIME_FORMAT           HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT      DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT        HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT   DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY         $
NLS_COMP                  BINARY
NLS_LENGTH_SEMANTICS      BYTE
NLS_NCHAR_CONV_EXCP       FALSE
NLS_NCHAR_CHARACTERSET    AL16UTF16
JCR 1.12.x doesn't use NVARCHAR columns, so the value of the NLS_NCHAR_CHARACTERSET parameter does not matter for JCR.
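Before pointing JCR at an existing database, it can help to confirm which character set the database actually uses. The following is a minimal sketch, not part of eXo JCR: the class and method names (NlsCharsetCheck, isTestedCharset, readDatabaseCharset) are hypothetical, and it assumes a JDBC connection to the Oracle instance with read access to the NLS_DATABASE_PARAMETERS dictionary view.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper for illustration; it is not an eXo JCR class.
final class NlsCharsetCheck {

    // NLS_DATABASE_PARAMETERS is the Oracle dictionary view that lists
    // the database-level NLS settings as PARAMETER / VALUE pairs.
    static final String QUERY =
        "SELECT value FROM nls_database_parameters "
            + "WHERE parameter = 'NLS_CHARACTERSET'";

    // The only character set this section reports as tested with JCR.
    static final String TESTED_CHARSET = "AL32UTF8";

    // Returns true when the given value names the tested character set,
    // ignoring case and surrounding whitespace.
    static boolean isTestedCharset(String charset) {
        return charset != null
            && TESTED_CHARSET.equalsIgnoreCase(charset.trim());
    }

    // Reads NLS_CHARACTERSET from the database behind the connection.
    // Returns null if the view has no such row.
    static String readDatabaseCharset(Connection connection)
            throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(QUERY);
             ResultSet rows = statement.executeQuery()) {
            return rows.next() ? rows.getString(1) : null;
        }
    }
}
```

A caller would obtain a Connection from the data source configured for JCR, pass it to readDatabaseCharset, and check the result with isTestedCharset. A value other than AL32UTF8 does not necessarily mean JCR will fail; it only means the configuration differs from the one tested here.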
Create the database with Unicode encoding and use the Oracle dialect for the Workspace Container:
<workspace name="collaboration">
  <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
    <properties>
      <property name="source-name" value="jdbcjcr" />
      <property name="dialect" value="oracle" />
      <property name="multi-db" value="false" />
      <property name="max-buffer-size" value="200k" />
      <property name="swap-directory" value="target/temp/swap/ws" />
    </properties>
    .....