Encodings in GeneXus

The encoding names accepted by the standard functions that require an encoding are unified across all generators. To this end, GeneXus ships with a domain called Encoding, of type Character(255).
This makes the code more portable and helps you choose valid values.

For example, you can write: &xmlreader.SetDocEncoding(Encoding.UTF-8)

The following functions and methods receive the encoding name as a string parameter:

  • ByteCount(CharacterExpression: Att|Var|Cons, Encoding: Att|Var|Cons): Numeric
  • DFWOpen(FileName: Character, FieldDelimiter: Character, StringDelimiter: Character, Append: Numeric, Encoding: Character): Numeric
  • DFROpen(FileName: Character, RegLength: Numeric, FieldDelimiter: Character, StringDelimiter: Character, Encoding: Character): Numeric
  • &xmlreader.SetDocEncoding(Encoding: Character)
  • &xmlreader.SetNodeEncoding(Encoding: Character)
  • &xmlwriter.WriteStartDocument(Encoding: Character, Standalone: Boolean)
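
To illustrate why these functions take an encoding, note that the same text occupies a different number of bytes depending on the encoding chosen. The following Python sketch mimics what a byte-counting function such as ByteCount conceptually returns. The helper name byte_count is hypothetical, and this is not the GeneXus implementation.

```python
def byte_count(text: str, encoding: str) -> int:
    """Return how many bytes `text` occupies when encoded with `encoding`.

    Hypothetical helper for illustration only; not the GeneXus ByteCount code.
    """
    return len(text.encode(encoding))


# "a", "n with tilde" (U+00F1), "o": three characters, one of them non-ASCII.
sample = "a\u00f1o"

for name in ("ISO-8859-1", "UTF-8", "UTF-16LE"):
    print(name, byte_count(sample, name))
```

The non-ASCII character takes one byte in ISO-8859-1, two in UTF-8, and every character takes two bytes in UTF-16LE, so the three counts differ even though the text is identical.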

Encoding Domain (Unified encoding list)

Encoding        .NET   Java
ASCII           Yes    Yes
Big5            Yes    Yes
Big5-HKSCS      No     Yes
EUC-JP          Yes    Yes
EUC-KR          Yes    Yes
GB18030         Yes    Yes
GB2312          Yes    Yes
GBK             Yes    Yes
IBM850          Yes    Yes
ISO-2022-JP     Yes    Yes
ISO-8859-1      Yes    Yes
ISO-8859-10     No     No
ISO-8859-13     Yes    Yes
ISO-8859-15     Yes    Yes
ISO-8859-16     No     No
ISO-8859-2      Yes    Yes
ISO-8859-3      Yes    Yes
ISO-8859-4      Yes    Yes
ISO-8859-5      Yes    Yes
ISO-8859-6      Yes    Yes
ISO-8859-7      Yes    Yes
ISO-8859-8      Yes    Yes
ISO-8859-9      Yes    Yes
KOI8-R          Yes    Yes
KOI8-U          Yes    No
KSC_5601        Yes    Yes
Shift_JIS       Yes    Yes
TIS-620         Yes    Yes
US-ASCII        Yes    Yes
UTF-16BE BOM    Yes    Yes
UTF-16LE BOM    Yes    Yes
UTF-32          Yes    No
UTF-32 BOM      Yes    No
UTF-32BE BOM    Yes    No
UTF-32LE BOM    Yes    No
UTF-8           Yes    Yes
UTF-8 BOM       Yes    No
Windows-1250    Yes    Yes
Windows-1251    Yes    Yes
Windows-1252    Yes    Yes
Windows-1253    Yes    Yes
Windows-1254    Yes    Yes
Windows-1255    Yes    Yes
Windows-1256    Yes    Yes
Windows-1257    Yes    Yes
Windows-1258    Yes    Yes
Windows-31J     No     Yes
Windows-874     Yes    Yes

BOM (Byte Order Mark)

The UTF* BOM encodings indicate that a byte order mark (BOM) is written at the beginning of the file or stream to identify the Unicode encoding of the text that follows.
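
As a rough illustration of the difference between a plain encoding and its BOM variant, the Python sketch below encodes the same text both ways. Python's "utf-8-sig" codec is used here as a stand-in for the "UTF-8 BOM" domain value; that correspondence is an assumption made for the example, not something GeneXus documents.

```python
text = "hello"

# Plain UTF-8: the output contains only the encoded characters.
without_bom = text.encode("utf-8")

# "utf-8-sig" (assumed analogue of "UTF-8 BOM"): the mark is prepended.
with_bom = text.encode("utf-8-sig")

print(without_bom.hex(" "))
print(with_bom.hex(" "))

# A BOM-aware decoder strips the mark; a plain UTF-8 decoder keeps it
# as a leading U+FEFF character in the decoded text.
stripped = with_bom.decode("utf-8-sig")
kept = with_bom.decode("utf-8")
print(stripped == text, kept == "\ufeff" + text)
```

The second line printed is the first one with three extra leading bytes, which is why a reader that is not BOM-aware may show a stray character at the start of the content.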

For UTF-8, the BOM is represented by the sequence 0xEF, 0xBB, 0xBF.

For UTF-16BE, the sequence is 0xFE, 0xFF.

For UTF-16LE, the sequence is 0xFF, 0xFE.

For UTF-32BE, the sequence is 0x00, 0x00, 0xFE, 0xFF.

For UTF-32LE, the sequence is 0xFF, 0xFE, 0x00, 0x00.
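
The byte sequences above can be used to guess the encoding of a file from its first bytes. The following Python sketch shows one way to do so; the function name detect_bom and the returned labels are hypothetical, and the code does not reflect how the GeneXus generators handle the BOM internally.

```python
import codecs
from typing import Optional

# Check the 4-byte marks before the 2-byte ones: the UTF-32LE BOM
# (FF FE 00 00) begins with the same two bytes as the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]


def detect_bom(data: bytes) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


print(detect_bom(b"\xef\xbb\xbfabc"))
print(detect_bom(b"\xff\xfea\x00"))
print(detect_bom(b"abc"))
```

One limitation of this approach: UTF-16LE text whose first character after the BOM is U+0000 is indistinguishable from a UTF-32LE BOM, so the result should be treated as a hint rather than a guarantee.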

See Also

Encoding Management