Unicode

Unicode series
Unicode
UCS
UTF-7
UTF-8
UTF-16
UTF-32
SCSU
Punycode

Unicode is the character encoding standard developed by the Unicode Consortium. It has been adopted by many large companies in an effort to standardize character encoding. Character encoding had always been problematic because of the existence of different standards (EBCDIC, among others) and the incompatibilities between them. As a result, text exchanged between different languages was often garbled, since the same data was interpreted differently, for example in the case of special and accented characters (ç, Ç, ã, Ã, õ, Õ, ö, Ö, etc.).

Unicode assigns a number to each character, independent of program, platform, or language.
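As a small illustration of this mapping, the sketch below uses Python's built-in `ord` and `chr`, which convert between a character and its Unicode number (its code point). The sample characters are some of the accented letters mentioned earlier; the helper name `code_point_label` is made up for this example.

```python
def code_point_label(ch: str) -> str:
    """Return the conventional U+XXXX label for a single character."""
    return f"U+{ord(ch):04X}"


# Escape sequences are used so the example does not depend on how
# this file itself is encoded: A, c-cedilla, A-tilde, o-diaeresis.
samples = ["A", "\u00e7", "\u00c3", "\u00f6"]

for ch in samples:
    print(ch, ord(ch), code_point_label(ch))

# chr() is the inverse of ord(): it maps a code point back to its character.
assert all(chr(ord(ch)) == ch for ch in samples)
```

The same number is obtained on any platform, which is the point of the standard: the character is identified by its code point, not by the bytes a particular system happens to use.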

Unicode covers almost all writing systems currently in use, as well as extinct historical scripts and symbols, in particular mathematical and musical symbols.

The first version, based on a 16-bit encoding, could represent 65,536 characters. The most recent version (Unicode 4.1) provides a code space of just over a million code points (1,114,112).
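The sizes involved follow from simple arithmetic, sketched below. The 16-bit figure comes from the text; the layout of the later code space as 17 planes of 65,536 code points each (U+0000 through U+10FFFF) is background knowledge added here, and the variable names are illustrative only.

```python
import sys

# A 16-bit encoding can distinguish 2**16 different values.
first_version_capacity = 2 ** 16
print(first_version_capacity)

# Assumption stated in the lead-in: the later code space is 17 planes
# of 65,536 code points each, covering U+0000 through U+10FFFF.
planes = 17
code_space = planes * first_version_capacity
print(code_space)

# In Python 3.3 and later, sys.maxunicode is the highest code point
# (0x10FFFF), so adding 1 gives the same total.
print(sys.maxunicode + 1)
```

Note that the code space counts positions available for characters, not characters actually assigned; the number of assigned characters is considerably smaller.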

The Unicode character set has several encoding forms, such as UTF-8, UTF-16, and UTF-32.
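To make the difference between these forms concrete, the following sketch encodes a few characters with Python's standard `str.encode` and counts the resulting bytes. The function name `encoded_sizes` and the choice of sample characters are assumptions made for this example; the little-endian codec variants are used only so that no byte order mark is added to the count.

```python
def encoded_sizes(text: str) -> dict:
    """Return the number of bytes `text` occupies in each encoding form."""
    # The "-le" codecs do not prepend a byte order mark, so the
    # lengths reflect the character data alone.
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Latin letter A, c-cedilla, euro sign, musical symbol G clef.
for ch in ["A", "\u00e7", "\u20ac", "\U0001d11e"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

In this sketch UTF-32 always uses four bytes per character, UTF-16 uses two or four, and UTF-8 uses between one and four, which is why the same text can have different sizes depending on the form chosen.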

See also


 

  > Portuguese to English > pt.wikipedia.org (Machine translated into English)