my laziness with literals

Just a comment and a bit of general advice:

I've noted that, when storing data, I'm either very lazy or averse to 
verboseness (the latter not on this list, of course ;) ). I seem to want 
to stick everything into a plain literal. I was converting some of my 
old data to a new format today and it wasn't working. Then I realized 
that my integers were stored as plain literals when I could have used 
xsd:integer. My booleans were stored as plain literals when I could have 
used xsd:boolean. My code was balking at a bunch of strings when my API 
wanted numbers and booleans.
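
To make the failure concrete, here is a minimal sketch in Python of what a consumer can do with a literal. The function name to_native and its two-argument shape are my own invention for illustration, not any particular library's API, and it only handles the two datatypes mentioned above.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

def to_native(lexical, datatype=None):
    """Convert an RDF literal to a Python value based on its datatype URI.

    Hypothetical helper: a plain literal (datatype is None) carries no type
    information, so the only safe result is the original string.
    """
    if datatype == XSD + "integer":
        return int(lexical)
    if datatype == XSD + "boolean":
        # xsd:boolean allows the lexical forms "true", "false", "1" and "0".
        if lexical in ("true", "1"):
            return True
        if lexical in ("false", "0"):
            return False
        raise ValueError(f"not an xsd:boolean: {lexical!r}")
    # Plain literal or a datatype this sketch does not know: keep the string.
    return lexical
```

With a plain literal, to_native("42") hands back the string "42", which is exactly what my API choked on; with the datatype attached, to_native("42", XSD + "integer") gives a real integer.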

And I'm not the only one. The way RDF has evolved from plain literals to 
typed literals, along with the verbose RDF/XML syntax for typed 
literals, has helped bring out the laziness in all of us. Want a 
language? Stick it in the plain literal "en-US". Want a URI? Stick it in 
a plain literal. Want a date? Stick it in a plain literal. Want an 
Internet media type? Stick it in a plain literal.

But if we're going to produce semantically rich data that can be 
machine-processed, we need to store things as they are, with appropriate 
indication of type.

So my plea to all data-architects:

* If you're going to store a number, use a typed literal with 
xsd:integer or similar.
* If you're going to store a boolean, use a typed literal with 
xsd:boolean or similar.
* If you're going to store a URI, use a typed literal with xsd:anyURI.
* If you're going to store a language, use something like info:lang/en/US.
* If you're going to store a Java class, use something like 
info:lang/com/example/package#Class.
* If you're going to store an Internet media type, use something like 
info:media/text/plain.
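
For the first two items, the producing side can be sketched just as briefly. This assumes N-Triples-style output, where a typed literal is written as "lexical"^^<datatype>; the helper name typed_literal is hypothetical, and it covers only integers and booleans, with everything else falling back to a plain literal.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

def typed_literal(value):
    """Render a Python value as an N-Triples-style literal (hypothetical helper)."""
    # bool must be tested before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return f'"{str(value).lower()}"^^<{XSD}boolean>'
    if isinstance(value, int):
        return f'"{value}"^^<{XSD}integer>'
    # Fallback: a plain literal, escaping backslashes and double quotes.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'
```

So typed_literal(42) produces "42"^^<http://www.w3.org/2001/XMLSchema#integer> rather than a bare "42", and the type survives the trip to whoever reads the data next.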

I know it's easier just to stick these things in plain literals, but 
when someone else tries to machine-process your data, it has to take 
what's there. I'm going to suppress my laziness and stop producing 
specifications and data that rely on plain literals as a crutch. I 
encourage everyone to do the same.

Best,

Garret

Received on Friday, 12 October 2007 13:05:31 UTC