Adding a new text encoding to OpenOffice.org
Adding a new text encoding involves a number of modifications, mostly in the sal/textenc source directory (see below). The main benefit of adding a text encoding is typically that it can then be offered in the drop-down list boxes where you choose a character encoding (e.g., for the "Text Encoded" filter for importing/exporting plain text files, or in the "Tools - Options... - Load/Save - HTML Compatibility" dialog).
The necessary steps include:
- Add a new RTL_TEXTENCODING_ constant to sal/inc/rtl/textenc.h, and update rtl_isOctetEncoding in sal/textenc/tencinfo.c.
- In sal/textenc, create a static ImplTextEncodingData instance for the new encoding. If it is a single-byte encoding based on ASCII, you can easily reuse the ImplTextConverter framework that is already in place. You also have to specify the corresponding Windows/Unix/MIME character sets, if such exist. If your text encoding is single-byte, your ImplTextEncodingData instance would probably best fit into one of the already existing tcvtXXX1.tab source files (e.g., if your encoding is for Tibetan, it would fit into tcvteas1.tab). Do not forget to add a pointer to your ImplTextEncodingData structure to the aData array in Impl_getTextEncodingData() in textenc.cxx. Note that for older versions (OOo1.x SRX645 branch and up to SRC680_m44) this file was textenc.c.
- You should add tests of your new encoding to sal/test/texttestenc.cxx.
- To make the new encoding appear in the drop-down list boxes mentioned above, you have to:
  - open an issue for Liz@openoffice.org, write the path of where the string is going to go (for example, the "Tools - Options... - Load/Save - HTML Compatibility" dialog), suggest the new string you want added (in English), and wait for approval of the string as well as the German equivalent from Liz. (If you can suggest the string in German as well, that would be helpful.)
  - add both the English and German strings to the resources in the svx/source/dialog/txenctab.src file manually; that means directly into the source code.
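The first three steps can be illustrated with a small, self-contained C++ sketch. It is not the sal/textenc code: every name in it (ExampleEncodingId, SingleByteEncoding, getEncodingData, and so on) is hypothetical, and the byte-to-Unicode assignments are made up. It only assumes the general shape described above: an encoding constant guarded by a range check, one static data instance per encoding, a lookup array holding pointers to those instances, and a round-trip test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical illustration only: none of these names exist in sal/textenc.

// Step 1: a new constant, plus the upper bound that range checks rely on.
enum ExampleEncodingId : std::uint16_t {
    EXAMPLE_ENCODING_DONTKNOW = 0,
    EXAMPLE_ENCODING_ASCII = 1,
    EXAMPLE_ENCODING_NEW = 2,                    // the newly added encoding
    EXAMPLE_ENCODING_LAST = EXAMPLE_ENCODING_NEW // must grow with the enum
};

// Step 1 (continued): a range check that has to be updated for the new constant.
inline bool isKnownOctetEncoding(std::uint16_t id) {
    return id > EXAMPLE_ENCODING_DONTKNOW && id <= EXAMPLE_ENCODING_LAST;
}

// Step 2: per-encoding data. Bytes 0x00-0x7F are ASCII; bytes 0x80-0xFF are
// looked up in `high`, where 0 means "no mapping" in this sketch.
struct SingleByteEncoding {
    const char* mimeName; // corresponding MIME charset, if one exists
    char16_t high[128];   // Unicode values for bytes 0x80..0xFF
};

static const SingleByteEncoding aAsciiData = { "US-ASCII", {} };
// Made-up assignments: 0x80 -> U+20AC, 0x81 -> U+00E9, everything else unmapped.
static const SingleByteEncoding aNewData = { "x-example", { 0x20AC, 0x00E9 } };

// Step 2 (continued): the lookup array. Forgetting to add the pointer for the
// new encoding here is the mistake the text warns about.
inline const SingleByteEncoding* getEncodingData(std::uint16_t id) {
    static const SingleByteEncoding* const aData[] = {
        nullptr,     // EXAMPLE_ENCODING_DONTKNOW
        &aAsciiData, // EXAMPLE_ENCODING_ASCII
        &aNewData    // EXAMPLE_ENCODING_NEW
    };
    return isKnownOctetEncoding(id) ? aData[id] : nullptr;
}

// Convert one byte to Unicode; returns false if the byte is unmapped.
inline bool toUnicode(const SingleByteEncoding& e, std::uint8_t b, char16_t& out) {
    if (b < 0x80) { out = static_cast<char16_t>(b); return true; }
    char16_t c = e.high[b - 0x80];
    if (c == 0) return false;
    out = c;
    return true;
}

// Convert one Unicode value to a byte; returns false if it is not representable.
inline bool toByte(const SingleByteEncoding& e, char16_t c, std::uint8_t& out) {
    if (c < 0x80) { out = static_cast<std::uint8_t>(c); return true; }
    for (std::size_t i = 0; i < 128; ++i) {
        if (e.high[i] == c) { out = static_cast<std::uint8_t>(0x80 + i); return true; }
    }
    return false;
}

// Step 3: every mapped byte must survive byte -> Unicode -> byte unchanged.
inline bool roundTrips(const SingleByteEncoding& e) {
    for (int b = 0; b < 256; ++b) {
        char16_t u = 0;
        if (!toUnicode(e, static_cast<std::uint8_t>(b), u)) continue;
        std::uint8_t back = 0;
        if (!toByte(e, u, back) || back != b) return false;
    }
    return true;
}

// Step 3 (continued): the kind of check a test for the new encoding might make.
inline void checkNewEncoding() {
    const SingleByteEncoding* e = getEncodingData(EXAMPLE_ENCODING_NEW);
    assert(e != nullptr);
    assert(roundTrips(*e));
}
```

Calling checkNewEncoding() from a test driver exercises both the lookup and the round trip. The real ImplTextEncodingData structure carries more information than this sketch (for example the Windows/Unix/MIME character set names mentioned above), so treat the layout here as a teaching aid, not as a template to copy.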
That said, adding any new text encoding also adds complexity to OpenOffice.org (mostly source and code size in this case), so it should be done only if really necessary (e.g., if there is demand to import/export plain text files in a special encoding). Adding a new text encoding is generally not necessary just to make sure that OpenOffice.org can be localized to some language, or that texts in that language can be written in OpenOffice.org.
If you have any further questions, just ask on our list.