Like many newbies, I had an urge to learn once my interest was piqued by an introduction to Unicode. Unicode isn’t hard to understand, but it does cover some low-level CS concepts, like byte order. Reading about Unicode is a nice lesson in design tradeoffs and backwards compatibility. Read these notes alone, or as a follow-up to Joel’s Unicode article above. If you’re like me, you’ll get an itch to read about the details in the Unicode specs or in Wikipedia.

The concept of “A” is something different than marks on paper, the sound “aaay” or the number 65 stored inside a computer. An encoding is just a method to transform an idea (like the letter “A”) into raw data (bits and bytes). The idea of “A” can be encoded many different ways. Encodings differ in efficiency and compatibility. When reading data, you must know the encoding used in order to interpret it properly.

If you see the number 65 in binary, what does it really mean? “A” in ASCII? Your age? Your IQ? Unless there is some context, you’d never know. Imagine if someone came up to you and said “65”. You’d have no idea what they were talking about. Now imagine they came up and said “The following number is an ASCII character: 65”. Weird, yes, but see how much clearer it is? Embrace the philosophy that a concept and the data that stores it are different.

You’ve probably heard of the ASCII/ANSI character sets. They map the numeric values 0-127 to various Western characters and control codes (newline, tab, etc.). Note that values 0-127 fit in the lower 7 bits in an 8-bit byte. ASCII does not explicitly define what values 128-255 map to.

Now, ASCII encoding works great for English text (using Western characters), but the world is a big place. To solve this, computer makers defined “code pages” that used the undefined space from 128-255 in ASCII, mapping it to various characters they needed. Unfortunately, 128 additional characters aren’t enough for the entire world: code pages varied by country (Russian code page, Hebrew code page, etc.).
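To make the "data needs context" idea concrete, here is a minimal Python sketch. It shows the number 65 turning into “A” only once an encoding is named, and a single byte from the 128-255 range reading as a different character under different code pages. The byte value 0xE0 and the three Windows code pages (`cp1252`, `cp1251`, `cp1255`) are my own picks for illustration; the text above only mentions Russian and Hebrew code pages in general.

```python
# The number 65 is just data; naming the encoding gives it meaning.
raw = bytes([65])
as_ascii = raw.decode("ascii")  # interpreted as an ASCII character

# ASCII defines only 0-127, so every ASCII value fits in 7 bits.
sample = "Hello\n\t".encode("ascii")
fits_in_7_bits = all(value < 2**7 for value in sample)

# A byte in the 128-255 range has no ASCII meaning at all...
high_byte = bytes([0xE0])  # illustrative choice, not from the article
try:
    high_byte.decode("ascii")
    ascii_accepts_high_byte = True
except UnicodeDecodeError:
    ascii_accepts_high_byte = False

# ...and each code page assigns it a different character.
interpretations = {
    "cp1252": high_byte.decode("cp1252"),  # Western European code page
    "cp1251": high_byte.decode("cp1251"),  # Cyrillic code page
    "cp1255": high_byte.decode("cp1255"),  # Hebrew code page
}

if __name__ == "__main__":
    print(as_ascii, fits_in_7_bits, ascii_accepts_high_byte)
    for code_page, character in interpretations.items():
        print(code_page, hex(ord(character)), character)
```

The same byte comes back as three unrelated letters, which is why a file written under one code page looks like gibberish when read under another.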