A code page is a table that maps numeric values to the characters of a particular language or script. The following table lists the code pages supported by International Components for Unicode (ICU).
| Code page | Description |
|---|---|
| ASCII | 7-bit ASCII |
| LATIN1 | ISO 8859-1 Western European |
| ISO8859_2 | ISO 8859-2 Eastern European |
| ISO8859_3 | ISO 8859-3 Southeast European |
| ISO8859_4 | ISO 8859-4 Baltic |
| ISO8859_5 | ISO 8859-5 Cyrillic |
| ISO8859_6 | ISO 8859-6 Arabic |
| ISO8859_7 | ISO 8859-7 Greek |
| ISO8859_8 | ISO 8859-8 Hebrew |
| ISO8859_9 | ISO 8859-9 Latin 5 (Turkish) |
| ISO8859_10 | ISO 8859-10 Latin 6 (Nordic) |
| ISO8859_11 | ISO 8859-11 Thai |
| ISO8859_13 | ISO 8859-13 Latin 7 (Baltic Rim) |
| ISO8859_14 | ISO 8859-14 Latin 8 (Celtic) |
| ISO8859_15 | ISO 8859-15 Latin 9 (Western Europe) |
| UTF_8 | UTF-8 encoding of Unicode |
| EUC_CN | Simplified Chinese Combined (367 + 1382) |
| EUC_KR | Korean EUC Combined (367 + 971) |
| EUC_JP | Japanese Combined (895 + 952 + 896 + 953) |
| EUC_TW | Taiwan Extended UNIX® Code (CNS 11643-1986), Combined (367 + 960 + 961) |
| UCS2 | UCS-2 (really UTF-16 BE) |
| CP037 | IBM® EBCDIC US English |
| CP037_S390 | IBM EBCDIC US English LF & NL reversed |
| CP256 | IBM EBCDIC Netherlands |
| CP259 | IBM EBCDIC Symbols Set 7 |
| CP273 | IBM EBCDIC German |
| CP274 | IBM EBCDIC Belgium |
| CP275 | IBM EBCDIC Brazil |
| CP276 | IBM EBCDIC French-Canada |
| CP277 | IBM EBCDIC Danish |
| CP278 | IBM EBCDIC Swedish |
| CP280 | IBM EBCDIC Italian |
| CP282 | IBM EBCDIC Portugal |
| CP284 | IBM EBCDIC Latin American Spanish |
| CP285 | IBM EBCDIC UK English |
| CP290 | IBM EBCDIC Japanese Katakana |
| CP297 | IBM EBCDIC French |
| CP420 | IBM EBCDIC Arabic |
| CP421 | IBM EBCDIC Maghreb/French |
| CP423 | IBM EBCDIC Greek |
| CP424 | IBM EBCDIC Latin/Hebrew |
| CP437 | MS-DOS US English |
| CP500 | IBM EBCDIC 500V1 |
| CP708 | Arabic (ASMO 708) |
| CP709 | Arabic (ASMO 449+, BCON V4) |
| CP710 | Arabic (Transparent Arabic) |
| CP720 | Arabic (Transparent ASMO) |
| CP737 | Greek (formerly 437G) |
| CP770 | Lithuanian Standard RST 1095-89 |
| CP771 | KBL (Lithuanian and Russian characters) |
| CP772 | Lithuanian Standard LST 1284:1993 |
| CP773 | Lithuanian (Mix of 771 and 775) |
| CP774 | Lithuanian Standard 1283:1993 |
| CP775 | Baltic |
| CP776 | Lithuanian 770 extended |
| CP777 | Lithuanian 771 extended |
| CP778 | Lithuanian 775 extended |
| CP790 | Mazovia (Polish + codepage 437 extended characters) |
| CP803 | IBM EBCDIC Hebrew (old) |
| CP813 | ISO 8859-7 Greek/Latin |
| CP819 | ISO 8859-1 Latin Alphabet No. 1 |
| CP833 | IBM EBCDIC Korean SBCS |
| CP834 | IBM EBCDIC Korean DBCS |
| CP835 | IBM EBCDIC Traditional Chinese DBCS |
| CP837 | IBM EBCDIC Simplified Chinese DBCS |
| CP838 | IBM EBCDIC Thai |
| CP850 | MS-DOS Latin 1 |
| CP851 | MS-DOS Greek |
| CP852 | MS-DOS Slavic (Latin 2) |
| CP853 | MS-DOS Turkey Latin 3 (replaced by Latin 5) |
| CP855 | IBM Cyrillic (primarily Russian) |
| CP856 | PC Hebrew |
| CP857 | IBM Turkish (Latin 5) |
| CP860 | MS-DOS Portuguese |
| CP861 | MS-DOS Icelandic |
| CP862 | Hebrew (Migration) |
| CP863 | MS-DOS Canadian-French |
| CP864 | PC Arabic |
| CP865 | MS-DOS Nordic |
| CP866 | MS-DOS Russian |
| CP868 | MS-DOS Urdu |
| CP869 | IBM Modern Greek |
| CP870 | IBM EBCDIC Multilingual Latin 2 |
| CP871 | IBM EBCDIC Icelandic |
| CP872 | PC Cyrillic with Euro update |
| CP874 | MS-DOS Thai, superset of TIS 620 |
| CP875 | IBM EBCDIC Greek |
| CP878 | KOI-R (Cyrillic) |
| CP880 | Cyrillic Multilingual |
| CP899 | PC Symbols |
| CP905 | IBM EBCDIC Turkey Latin 3 (replaced by Latin 5) |
| CP912 | ISO 8859-2; ROECE Latin-2 Multilingual |
| CP913 | ISO 8859-3 Southeast European |
| CP914 | ISO 8859-4 Baltic |
| CP915 | ISO 8859-5; Cyrillic; 8-bit ISO |
| CP916 | ISO 8859-8; Hebrew |
| CP918 | IBM EBCDIC Urdu |
| CP920 | ISO 8859-9; Latin 5 |
| CP921 | ISO Baltic (8-bit) |
| CP922 | ISO Estonia (8-bit) |
| CP929 | Thai PC double byte |
| CP930 | IBM EBCDIC Japanese Katakana Extended, Combined (290 + 300) |
| CP931 | IBM EBCDIC Japanese Latin-Kanji, Combined (037 + 300) |
| CP932 | MS Windows® Japanese, superset of Shift-JIS, Combined (897 + 301) |
| CP933 | IBM EBCDIC Korean Combined (833 + 834) |
| CP934 | Korean PC Combined (891 + 926) |
| CP935 | IBM EBCDIC Simplified Chinese, Combined (836 + 837) |
| CP936 | MS Windows Simplified Chinese, Combined (903 + 928) |
| CP937 | IBM EBCDIC Traditional Chinese, Combined (037 + 835) |
| CP938 | Traditional Chinese Combined (904 + 927) |
| CP939 | IBM EBCDIC Japanese Latin Extended, Combined (1027 + 300) |
| CP942 | MS-DOS Japanese Kana Combined (1041 + 301) |
| CP943 | MS-DOS Japanese Combined (1041 + 941) |
| CP944 | Korean PC Combined (1040 + 926) |
| CP946 | Simplified Chinese PC Combined (1042 + 928) |
| CP948 | MS-DOS Traditional Chinese, Combined (1043 + 927) |
| CP949 | MS Windows Korean, superset of KS C 5601-1992, Combined (1088 + 951) |
| CP950 | MS Windows Traditional Chinese, superset of Big 5, Combined (1114 + 947) |
| CP1004 | PC-data Latin-1 extended desktop publishing |
| CP1006 | Urdu, 8-bit |
| CP1008 | Arabic, 8-bit ISO/ASCII |
| CP1025 | IBM EBCDIC Cyrillic |
| CP1026 | IBM EBCDIC Turkish |
| CP1027 | IBM EBCDIC Japanese Extended Single Byte |
| CP1040 | Korean PC extended Single Byte |
| CP1041 | Japanese PC extended Single Byte |
| CP1043 | Traditional Chinese extended Single Byte |
| CP1046 | Arabic |
| CP1047 | Latin 1 / Open Systems (US 3270) |
| CP1047_S390 | Latin 1 / Open Systems (US 3270) LF & NL reversed |
| CP1051 | HP-UX Latin1 |
| CP1097 | IBM EBCDIC Farsi |
| CP1098 | MS-DOS Farsi |
| CP1112 | IBM EBCDIC Baltic Multilingual |
| CP1114 | Traditional Chinese Single Byte (IBM Big 5) |
| CP1115 | Simplified Chinese Single Byte (IBM GB) |
| CP1122 | IBM EBCDIC Estonia |
| CP1123 | IBM EBCDIC Cyrillic Ukraine |
| CP1124 | Cyrillic Ukraine 8-bit |
| CP1130 | IBM EBCDIC Vietnamese |
| CP1137 | IBM EBCDIC India |
| CP1140 | IBM EBCDIC US (with Euro) |
| CP1141 | IBM EBCDIC Germany, Austria (with Euro) |
| CP1142 | IBM EBCDIC Denmark (with Euro) |
| CP1143 | IBM EBCDIC Sweden (with Euro) |
| CP1144 | IBM EBCDIC Italy (with Euro) |
| CP1145 | IBM EBCDIC Spain (with Euro) |
| CP1146 | IBM EBCDIC UK Ireland (with Euro) |
| CP1147 | IBM EBCDIC France (with Euro) |
| CP1148 | IBM EBCDIC International Latin1 (with Euro) |
| CP1149 | IBM EBCDIC Iceland (with Euro) |
| CP1153 | IBM EBCDIC Latin2 (with Euro) |
| CP1154 | IBM EBCDIC Cyrillic (with Euro) |
| CP1155 | IBM EBCDIC Turkish (with Euro) |
| CP1156 | IBM EBCDIC Baltic Multilingual (with Euro) |
| CP1157 | IBM EBCDIC Estonia (with Euro) |
| CP1158 | IBM EBCDIC Cyrillic Ukraine (with Euro) |
| CP1159 | SBCS Traditional Chinese Host (with Euro) |
| CP1160 | IBM EBCDIC Thailand (with Euro) |
| CP1164 | IBM EBCDIC Vietnamese (with Euro) |
| CP1250 | MS Windows Latin 2 (Central Europe) |
| CP1251 | MS Windows Cyrillic (Slavic) |
| CP1252 | MS Windows Latin 1 (ANSI), superset of Latin1 |
| CP1253 | MS Windows Greek |
| CP1254 | MS Windows Latin 5 (Turkish), superset of ISO 8859-9 |
| CP1255 | MS Windows Hebrew |
| CP1256 | MS Windows Arabic |
| CP1257 | MS Windows Baltic Rim |
| CP1258 | MS Windows Vietnamese |
| CP1279 | Hitachi Japanese Katakana Host |
| CP1361 | MS Windows Korean (Johab) |
| CP1381 | MS-DOS Simplified Chinese Combined (1115 + 1380) |
| CP1383 | China EUC |
| CP1386 | GBK Chinese |
| CP1392 | Simplified Chinese GB18030 |
| CP5026 | IBM EBCDIC Japan Katakana-Kanji Combined (290 + 300) |
| CP5028 | Japan Mixed Combined (897 + 301) |
| CP5031 | IBM EBCDIC Simplified Chinese Combined (836 + 837) |
| CP5033 | IBM EBCDIC Traditional Chinese Combined (037 + 835) |
| CP5035 | IBM EBCDIC Japan Latin Combined (1027 + 300) |
| CP5038 | Japan Mixed Combined (1041 + 301) |
| CP5045 | Korean PC Combined (1088 + 951) |
| CP5050 | Japanese EUC Combined (895 + 952 + 896 + 953) |
| CP5488 | Simplified Chinese GB18030 |
| CP9125 | IBM EBCDIC Korean Combined (833 + 834) |
| EuroShift_JIS | Test code page, Shift-JIS with European characters |
| SBCS | Single Byte Code Set |
| DBCS | Double Byte Code Set |
| MBCS | MultiByte Code Set |
| UTF16BE | UTF-16 big endian |
| UTF16LE | UTF-16 little endian |
| UTF32BE | UTF-32 big endian |
| UTF32LE | UTF-32 little endian |
| HZ | HZ code set |
| SCSU | SCSU code set |
| ISCII | ISCII code set |
| UTF7 | UTF-7 |
| BOCU1 | BOCU-1 |
| UTF16 | UTF-16 code set |
| UTF32 | UTF-32 code set |
| CESU8 | CESU-8 code set |
| GB18030 | GB18030 code set |
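
To make the idea of a code page concrete, the following minimal sketch shows the same text producing different byte values under different code pages, and a conversion from one code page to another. It uses Python's built-in codecs rather than ICU itself; the codec names `cp037`, `cp437`, and `cp1252` are Python's names and are assumed here to correspond to the CP037, CP437, and CP1252 entries in the table. The `transcode` helper is hypothetical and exists only for illustration.

```python
# Illustrative sketch using Python's built-in codecs, not the ICU API.
# Assumption: Python's "cp037", "cp437", and "cp1252" codecs correspond to
# the CP037, CP437, and CP1252 rows of the table.


def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes under one code page and re-encode them under another."""
    return data.decode(source).encode(target)


text = "Hello"
ascii_bytes = text.encode("ascii")   # 7-bit ASCII
ebcdic_bytes = text.encode("cp037")  # IBM EBCDIC US English

# The same characters map to different byte values in each code page.
print(ascii_bytes.hex(" "))
print(ebcdic_bytes.hex(" "))

# One accented character (U+00E9), encoded under three code pages.
for name in ("cp437", "cp1252", "utf-8"):
    print(name, "\u00e9".encode(name).hex(" "))

# Converting between code pages goes through the decoded text.
print(transcode(ebcdic_bytes, "cp037", "ascii"))
```

Because the byte values alone do not identify their code page, data exchanged between systems has to carry or agree on the code page name, which is why a table of supported names is needed.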