There are many different character encoding standards, for both Latin and Cyrillic scripts. When I use the word encoding, imagine the set of characters in a font: which characters are included, and the position each one is mapped to within that font, determine the encoding. For example, a standard used in the USA is called ASCII, or American Standard Code for Information Interchange; its most common form is a 7-bit (128-character) encoding whose only letters are those of the basic Latin alphabet.
Besides KOI8, there are several more methods of encoding Cyrillic text, and while surfing the Internet, you might see the names Codepage 1251 (MS-Windows ANSI) and Codepage 866 (Alternative PC). Those encodings are more commonly used on Windows and DOS computers, respectively.
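To make the idea of "the same characters at different positions" concrete, here is a minimal Python sketch. It assumes Python 3.8 or later and uses the standard library codecs named koi8_r, cp1251 and cp866; the sample word is chosen only for illustration and does not come from the original text.

```python
# Illustrative sketch: one Cyrillic word under three legacy encodings.
word = "мир"  # sample text, an assumption made for this example

encoded = {name: word.encode(name) for name in ("koi8_r", "cp1251", "cp866")}
for name, raw in encoded.items():
    # Each encoding stores the same three letters as different byte values.
    print(f"{name:8} {raw.hex(' ')}")

# Reading KOI8-R bytes with the CP1251 table yields the wrong letters.
garbled = encoded["koi8_r"].decode("cp1251")
print(garbled)
```

Each encoding produces three bytes, but the byte values differ, which is why text read with the wrong table shows up as the wrong Cyrillic letters rather than as an error.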
KOI8
KOI stands for "Kod Obmena Informatsii" or Code of Information Exchange.
It is an 8-bit encoding (hence the name KOI8) which includes both Latin
and Cyrillic alphabets and is used in Russia predominantly for
communication purposes, such as e-mail, USENET, Internet publishing via
WWW, Gopher, etc.
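A short Python sketch can illustrate what "8-bit, with both Latin and Cyrillic" means in practice. It assumes Python's built-in koi8_r codec (the RFC 1489 variant); the sample string is hypothetical.

```python
# Illustrative sketch: KOI8-R stores every character in a single byte.
text = "e-mail: почта"  # hypothetical mixed Latin/Cyrillic sample
raw = text.encode("koi8_r")

# Latin letters and punctuation keep their ASCII values (below 0x80);
# Cyrillic letters occupy the upper half of the byte range.
latin = [b for b in raw if b < 0x80]
cyrillic = [b for b in raw if b >= 0x80]

print(len(text), len(raw))
print(bytes(latin))
print([hex(b) for b in cyrillic])
```

Because the lower half matches ASCII, plain English text is unchanged when it is labelled as KOI8-R.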
The difference between OV and AV and the difference between koi8-r (RFC 1489) and koi8 ukrainian:
koi8-r | koi8 ukrainian |
Updated 5/26/98: The encoding of the Ukrainian and Belarusian characters given above is not quite correct. Please compare with ISO-IR-111, KOI8-R, KOI8-uni, and KOI8-U. Submitted by Andreas Prilop.
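One way to see where two KOI8 variants disagree is to compare their tables byte by byte. The sketch below assumes Python's built-in koi8_r and koi8_u codecs; note that koi8_u is the later KOI8-U variant, which is not necessarily identical to the "koi8 ukrainian" table referred to on this page.

```python
# Illustrative sketch: list the byte values that KOI8-R and KOI8-U
# (as implemented by Python's codecs) map to different characters.
differing = []
for value in range(0x80, 0x100):
    single = bytes([value])
    in_r = single.decode("koi8_r")
    in_u = single.decode("koi8_u")
    if in_r != in_u:
        differing.append(value)
        print(f"0x{value:02X}  koi8_r={in_r!r}  koi8_u={in_u!r}")
```

The positions printed are those where KOI8-U places Ukrainian letters in slots that KOI8-R uses for other symbols; everything else is shared.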
More information on the "Cyrillic alphabet soup" is available.
Apple Standard Cyrillic
Other proprietary encodings (fonts)
CP866 and CP1251
Code Page 866 |
Code Page 1251 |
ISO 8859
Character Sets
ISO 8859 is a standardized series of 8-bit
character sets for writing in alphabetic languages. It was
designed by the European Computer Manufacturers Association (ECMA).
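As a rough illustration of what a "series" of 8-bit sets means, the Python sketch below decodes one byte value under three parts of ISO 8859. The codec names iso8859_1 (Latin-1), iso8859_5 (Cyrillic) and iso8859_7 (Greek) are Python's names for those parts; the byte value 0xE4 is picked arbitrarily.

```python
# Illustrative sketch: the same upper-half byte means a different
# character in each part of ISO 8859, while ASCII bytes are shared.
parts = ("iso8859_1", "iso8859_5", "iso8859_7")

upper = {part: bytes([0xE4]).decode(part) for part in parts}
lower = {part: b"A".decode(part) for part in parts}

for part in parts:
    print(part, upper[part], lower[part])
```

The lower 128 positions agree across the parts, so only the upper half has to be chosen per language group.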