New essay!
November 18th, 2003
New essay! In which I invoke the unholy gods of ASCII art in the name of programmer education:
1100 0011 | 1010 1001   <= this is how it looks in the
                           page content.
110x xxxx | 10xx xxxx   <= this is the UTF-8 template for
                           "character between 0x80 and
                           0x7FF".
---0 0011 | --10 1001   To reconstruct the Unicode for
   | || \_    || |  |   which character that is, take
    \|| \_\   || |  |   all the x's and mush them together
     \ \      || |  |   at the end of a 16-bit field.
0000 0000 | 1110 1001   <= Lo, it is Unicode 0x00E9, commonly
                           written "U+00E9", which is "é".
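For those who would rather read code than ASCII art, here is a minimal Python sketch of the same bit-mushing. The function name decode_two_byte_utf8 is made up for this illustration, and it only handles the two-byte case from the diagram (characters between 0x80 and 0x7FF), not UTF-8 in general.

```python
def decode_two_byte_utf8(lead: int, cont: int) -> int:
    """Return the code point encoded by a two-byte UTF-8 sequence.

    Hypothetical helper for illustration; covers only the
    110x xxxx | 10xx xxxx template shown in the diagram.
    """
    # The first byte must match the template 110x xxxx.
    if (lead & 0b1110_0000) != 0b1100_0000:
        raise ValueError("first byte does not match 110x xxxx")
    # The second byte must match the template 10xx xxxx.
    if (cont & 0b1100_0000) != 0b1000_0000:
        raise ValueError("second byte does not match 10xx xxxx")

    # Keep the five x's from the first byte and the six x's from the
    # second byte, then push them together into one number.
    code_point = ((lead & 0b0001_1111) << 6) | (cont & 0b0011_1111)

    # Anything below 0x80 fits in one byte, so a two-byte form of it
    # is an overlong encoding and is not valid UTF-8.
    if code_point < 0x80:
        raise ValueError("overlong encoding")
    return code_point


# The two bytes from the diagram: 1100 0011 and 1010 1001.
code_point = decode_two_byte_utf8(0xC3, 0xA9)
print(f"U+{code_point:04X}", chr(code_point))  # U+00E9 é
```

In real code you would simply call bytes([0xC3, 0xA9]).decode("utf-8") and let Python do this for you; the sketch is only there to show the masks and the shift.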
The full essay is here: “Why can't Amar read (Unicode)?”