Hungarian data for read-write apps

read-write web turns the web from a world wide "book" into a world
wide "software program". this means we can now interact with it not
only to consume information, but also to push buttons that record
(editing the web), as well as effectuate (APIs on the internet of
things), changes in the real-world things described by the web's
documents.

i've been thinking about how to model this in a way that reduces the
number of potential bugs ('word jokes'). if you're subscribed to the
public-tag cg, you may have seen my thread there about distinguishing
various types of URLs in our software. this train of thought has led
me, this sunny Saturday morning, to come up with an Apps Hungarian
notation for local variables in javascript applications that
manipulate the read-write web, as well as for attribute names in
machine-readable documents on the web. Even if you don't jump to adopt
it, it might be an interesting straw man proposal for how we should
name things in linked data in general, and on the read-write web
especially.

The only thing i love more than computer science is metaphysics, so
let's start with some of that:
The way we will model the real world today is as a collection of
objects (our variables; not necessarily physical objects, but any
'thing') that have certain relations to each other (the relations are
what makes for our actual data). We can describe the real world with
an array of statements, each of which says that thing X has relation R
to thing Y. Since objects are only variables, to represent them we
need to talk about how they relate to other objects. That's why in
computer programs, objects are represented as lists of relations to
other objects. so if X is the 'parent' of Y (without specifying
further what that means), then we can do:

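var Y = {};      // some thing; empty because we say nothing else about it
var X = {
  parentOf: Y    // the statement "X has relation 'parentOf' to Y"
};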

That's the basis of our metaphysics here. Now we can make it a bit
richer, by saying that an object can be zero or one of the following:
document (string), statement (boolean), scalar (number), array.

If an object is a document, then it has contents (which may be the
empty string). If a thing is a statement, then it has a truth value
(which may be undecided or unknown in the current world view). If a
thing is a scalar then it has a numeric value. If a thing is an array,
then it has zero or more element objects in it. Simple, right?
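
As a rough sketch, the four kinds could look like this in javascript.
The variable names and the kindOf helper are made up for illustration,
and using null for an undecided truth value is an assumption, not part
of the model itself:

```javascript
// Hypothetical examples of the four kinds of object.
var contents = '';       // document: a string, possibly empty
var truthValue = null;   // statement: true, false, or null for "undecided" (assumption)
var value = 42;          // scalar: a numeric value
var elements = [];       // array: zero or more element objects

// Hypothetical helper that reports which kind a javascript value represents.
function kindOf(v) {
  if (Array.isArray(v)) return 'array';
  if (typeof v === 'string') return 'document';
  if (typeof v === 'number') return 'scalar';
  if (typeof v === 'boolean' || v === null) return 'statement';
  return 'plain object';
}
```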

Let's add one more useful data structure that computer scientists like
to use when describing the world: the tree. A tree has a root, nodes,
and leaves. The root points to nodes or leaves, nodes point to nodes
or leaves, and the leaves are objects. Actually an array is a special
kind of tree, but let's treat them separately because of how arrays
are used in software.

To talk more about representation, we already saw that an object can
be represented by a variable, but it can also be represented by a
document, using an encoding. Example encodings are JSON, XML, and
human-readable language. Normally a document is either read-only or
read-writable for the current user. This affects what we can do with
this document in a 'read-write app' context. We also allow documents
to be append-only, like printer queues. In that case we call them
virtual documents.

Now we can introduce our Hungarian notation to deal with these
documents, statements, scalars and arrays, as well as with plain
objects.

attributes hash of the object:
_    If a variable holds a hash of attributes of object 'thing', then
we give this variable the name 'thing' (camel case).

document describing another object:
j    If a variable holds the contents of a document that describes
object 'Thing' using JSON, then we prepend a 'j', and use camel-case,
so: jThing
x    same for xml
h    same for human-readable language
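
A minimal sketch of the 'j' prefix next to the unprefixed attributes
hash. The book data is invented for the example; JSON.parse and
JSON.stringify are the standard javascript functions:

```javascript
// jBook holds the contents of a JSON document describing a book (hypothetical data).
var jBook = '{"title": "Example Title", "pages": 120}';

// Parsing the document gives the attributes hash, which gets no prefix.
var book = JSON.parse(jBook);

// Serializing goes the other way: from attributes hash back to document contents.
var jBookCopy = JSON.stringify(book);
```

The prefix makes it visible at a glance that jBook.title would be a
mistake, while book.title is fine.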

virtual documents forming a way to contact another object:
c    for instance, a mailbox or a printer queue on the internet of
things is an appendable document (this is a bit abstract, but it helps
simplify our model of the web and the APIs that live there if we think
of message queues as just documents). if you have a variable in a
program where you accumulate messages for Person, but that does not,
as a variable, represent that person, then you would call this
variable cPerson.
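
One way this might look in a program, assuming we model the
appendable document as a plain javascript array (that choice, the
appendMessage helper and the example data are all assumptions made for
this sketch):

```javascript
// person is the attributes hash; cPerson is the queue of messages for that person.
var person = { name: 'Example Person' };
var cPerson = [];

// Appending is the only operation we perform on the virtual document.
// Each message is a human-readable document, hence the 'h' prefix.
function appendMessage(cThing, hMessage) {
  cThing.push(hMessage);
  return cThing.length;
}

appendMessage(cPerson, 'hello');
appendMessage(cPerson, 'are you there?');
```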

boolean describing the truth value of a statement:
b    if a variable holds a boolean describing whether 'statement' is
true in the current world view, then we call this variable bStatement

number describing the value of a scalar:
n    if a variable holds a numeric value representing the value of
'scalar' in the current world view, then we call this variable nScalar
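
A small sketch of 'b' and 'n' side by side. The statement, the scalar
and the canAppend helper are hypothetical examples, not part of the
proposal:

```javascript
// Hypothetical statement: "the document is writable by the current user".
var bDocumentIsWritable = false;

// Hypothetical scalar: "the number of messages waiting in the queue".
var nQueueLength = 3;

// With the prefixes, the parameter names say which kind of value is expected.
function canAppend(bWritable, nLength, nMaxLength) {
  return bWritable && nLength < nMaxLength;
}
```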

data structure containing objects:
a    if a variable holds an array of objects, then we can name that
variable aObject. for example a variable holding an array of 'author'
objects would be called aAuthor.
t    if a variable holds a tree (arrays, nested recursively), then we
can name it after its leaves, so if the leaves are authors, then we
call the tree variable tAuthor.
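
To illustrate 'a' and 't' together, here is a sketch that turns a
tAuthor into an aAuthor. It assumes that the leaves themselves are
never arrays; the flatten helper and the author data are made up for
the example:

```javascript
// tAuthor: a tree of author objects, written as nested arrays (hypothetical data).
var tAuthor = [
  { name: 'alice' },
  [ { name: 'bob' }, [ { name: 'carol' } ] ]
];

// Flatten a tree into an array of its leaves, depth first.
function flatten(tThing) {
  var aThing = [];
  tThing.forEach(function (child) {
    if (Array.isArray(child)) {
      aThing = aThing.concat(flatten(child));  // inner array: a node, so recurse
    } else {
      aThing.push(child);                      // anything else: a leaf
    }
  });
  return aThing;
}

var aAuthor = flatten(tAuthor);
```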

web presence:
p    if a variable holds the URL of document 'document' then it would
be confusing to call that variable 'document' (especially because both
documents and URLs are strings). so then it's better to call that
variable pDocument.
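
A sketch of why the distinction helps. To keep it self-contained, the
'web' here is just an in-memory lookup table; that table, the read
helper and the example.com URL are assumptions for illustration only:

```javascript
// pBook is where the document lives (hypothetical URL).
var pBook = 'https://example.com/books/1.json';

// A stand-in for the web: maps a web presence to document contents.
var web = {};
web[pBook] = '{"title": "Example Title"}';

// Look up the contents of the document found at pDocument.
function read(pDocument) {
  return web[pDocument];
}

var jBook = read(pBook);       // contents of the document
var book = JSON.parse(jBook);  // attributes hash of the thing described
```

All of pBook, jBook and book are 'about' the same book, and two of
them are strings, but the prefixes keep them apart.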

So that's the Hungarian notation that i'm proposing for the read-write web:
- no prefix for attribute hashes,
- j,x,h for strings describing objects
- c for 'virtual documents' that are contact interfaces of objects
- b for truth value of a statement
- n for numeric value of a scalar
- a,t for composite objects
- p for strings (addresses) describing where to obtain/edit documents
or where to interact with virtual documents

Obviously, we can combine these prefixes to get more powerful naming schemes.

For instance, applying Hungarian notation to both variable names and
attribute names, how would you Hungarianize this:

var book = {
  authorMailboxUrl: 'mailto:user@host'
};

The variable name 'book' is OK because the variable is an attribute
hash of a book. But obviously, the string 'mailto:user@host' is not
an attribute hash of the author. It's an address at which we can
interact with the mailbox of the author. That's why this attribute
name is so long: 'authorMailboxUrl' is a really cumbersome attribute
name. So what a lot of (IMO sloppy) programmers would do is just call
the attribute 'author' and then say 'well, programmers that come after
me will know what i mean'. But luckily, we no longer have to be so
sloppy, and can now abbreviate it, first to 'pAuthorMailbox' and then
even to 'pcAuthor'.
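
Put together, the Hungarianized version of the example might look like
the sketch below. The splitPrefix helper is hypothetical and quite
naive (it assumes every prefixed name is lowercase prefix letters
followed by a capitalized name); it is only there to show that the
prefixes can be read back mechanically:

```javascript
var book = {
  // p: an address; c: of a contact interface; Author: of the author
  pcAuthor: 'mailto:user@host'
};

// Split a name into its Hungarian prefixes and the thing it is about.
function splitPrefix(name) {
  var match = /^([jxhcbnatp]+)([A-Z].*)$/.exec(name);
  if (!match) return { prefixes: '', name: name };  // no prefix: attributes hash
  return { prefixes: match[1], name: match[2] };
}

var parts = splitPrefix('pcAuthor');  // prefixes 'pc', name 'Author'
```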

That's my proposal for using Hungarian notation in linked data. i
think it's absolutely necessary to introduce this. i'm going to start
using it myself. It will probably change as i start using it; this is
just a first draft. Let me know what you think!


Cheers,
Michiel

Received on Saturday, 12 May 2012 09:40:56 UTC