Seek Memory: Code points are an abstraction

Thursday, September 15, 2016

Code points are an abstraction

In The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!), Joel Spolsky discusses Unicode code points and gives a few examples, such as U+0639 representing the Arabic letter Ain and U+0041 representing the English letter A.

Spolsky doesn't come right out and use the word abstraction, but the concept of an abstraction is exactly what he's talking about when he writes:

OK, so say we have a string:

Hello

which, in Unicode, corresponds to these five code points:

U+0048 U+0065 U+006C U+006C U+006F.
Just a bunch of code points. Numbers, really. We haven't yet said anything about how to store this in memory or represent it in an email message.
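As a small illustration of the quote, here is a Python sketch that lists the code points of a string in the U+XXXX notation Spolsky uses. The helper name `code_points` is made up for this example; `ord` is the built-in that returns a character's code point as an integer.

```python
def code_points(text):
    """Return each character's code point in U+XXXX notation."""
    # ord() gives the code point as a plain integer; nothing here says
    # how that integer would be stored in memory or sent in an email.
    return [f"U+{ord(ch):04X}" for ch in text]


print(" ".join(code_points("Hello")))
# U+0048 U+0065 U+006C U+006C U+006F
```

Notice that nothing in the sketch mentions bytes. It works entirely at the level of numbers and the characters they stand for.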

The Wikipedia article Abstraction (software engineering) contains the following quote attributed to John V. Guttag: “The essence of abstractions is preserving information that is relevant in a given context and forgetting information that is irrelevant in that context.”

In the context of Unicode code points, the relevant information is a number, conventionally written in hexadecimal (like U+0048), and the character that number represents (H). Information we might want to forget (at least temporarily), because it is irrelevant to a general discussion of Unicode, is how that number is encoded: how many bytes are used and which specific bits they contain.
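To make the "forgotten" detail concrete, here is a rough Python sketch showing that one code point can be stored as different bytes depending on the encoding. The helper name `encoded_bytes` is hypothetical, and the three encodings are just common examples picked for illustration; `str.encode` is the standard method that does the actual work.

```python
def encoded_bytes(text, encoding):
    """Return the bytes of text in the given encoding as spaced hex pairs."""
    return " ".join(f"{b:02x}" for b in text.encode(encoding))


# The same code point, U+0048 (H), in three encodings.
for encoding in ("utf-8", "utf-16-be", "utf-32-be"):
    print(encoding, encoded_bytes("H", encoding))
# utf-8 48
# utf-16-be 00 48
# utf-32-be 00 00 00 48

# U+0639 (Arabic letter Ain) needs two bytes in UTF-8.
print(encoded_bytes("\u0639", "utf-8"))
# d8 b9
```

The code point stays the same in every case. Only the byte-level representation changes, which is exactly the detail the abstraction lets us set aside.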

Beware The Bard

“If the subject of a Poem is obscure, or not generally known, or not interesting, and if it abounds with allusions, and facts of this improper, and uninteresting character, the writer who chuses the subject, and introduces those improper, and unaffecting allusions, and facts, betrays a great want of poetical judgment, and taste.”

—Percival Stockdale
(Radcliffe, "The Bard. A Pindaric Ode.")

The Programmer

“The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures.... Yet the program construct, unlike the poet's words, is real in the sense that it moves and works, producing visible outputs separate from the construct itself.... The magic of myth and legend has come true in our time. One types the correct incantation on a keyboard, and a display screen comes to life, showing things that never were nor could be.”

—Fred Brooks
(The Mythical Man-Month, p 7)
