- From: Dan Brickley <danbri@danbri.org>
- Date: Fri, 12 Oct 2007 15:41:49 +0200
- To: Garret Wilson <garret@globalmentor.com>
- CC: Semantic Web <semantic-web@w3.org>
Garret Wilson wrote:
>
> Just a comment and a bit of general advice:
>
> I've noted that, when storing data, I'm either very lazy or averse to
> verboseness (the latter not on this list, of course ;) ). I seem to want
> to stick everything into a plain literal. I was converting some of my
> old data to a new format today and it wasn't working. Then I realized
> that my integers were stored as plain literals when I could have used
> xsd:integer. My booleans were stored as plain literals when I could have
> used xsd:boolean. My code was balking at a bunch of strings when my API
> wanted numbers and booleans.
>
> And I'm not the only one. The way RDF has evolved from plain literals to
> typed literals, along with the verbose RDF/XML syntax for typed
> literals, has helped bring out the laziness in all of us. Want a
> language? Stick it in the plain literal "en-US". Want a URI? Stick it in
> a plain literal. Want a date? Stick it in a plain literal. Want an
> Internet media type? Stick it in a plain literal.
>
> But if we're going to produce semantic rich data that can be
> machine-processed, we need to store things as they are, with appropriate
> indication of type.
>
> So my plea to all data-architects:

I'm not convinced of this. RDF/XML's syntax for datatyping is pretty heavyweight, and there are many RDF vocabularies that pre-date RDFCore (ie. created between 1997-2003). It would be good to have a notation in RDFS/OWL (maybe OWL1.1 could do it) to indicate that some plain-literal-valued property takes string values that can be cast to some specified datatype.

>
> * If you're going to store a number, use a typed literal with
> xsd:integer or similar.
> * If you're going to store a boolean, use a typed literal with
> xsd:boolean or similar.
> * If you're going to store a URI, use a typed literal with xsd:anyURI.

RDF has special handling for URIs. Almost always people are interested in the thing the URI is identifying, not in the URI string itself.

> * If you're going to store a language, use something like info:lang/en/US.
> * If you're going to store a Java class, use something like
> info:lang/com/example/package#Class.

There is a java: URI scheme. This is used for example in ARQ for dynamic code loading. I don't see a case for using info: instead.

> * If you're going to store an Internet media type, use something like
> info:media/text/plain.

Or dc:format?

It's good to agree on ways of doing these things, but your choices seem a little arbitrary, and not yet widely used.

> I know it's easier just to stick these things in plain literals, but
> when someone else tries to machine-process your data, it has to take
> what's there. I'm going to suppress my laziness and stop producing
> specifications and data that rely on plain literals as a crutch. I
> encourage everyone to do the same.

Can we take "Be liberal in what you accept, and conservative in what you send." (see http://www.postel.org/postel.html ) as a shared goal here? Of course defining "conservative" here is the slippery part :)

cheers,

Dan

> Best,
>
> Garret
>
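
To make the "liberal in what you accept" side of the discussion above concrete, here is a minimal sketch of a consumer that casts plain-literal strings to typed values based on a per-property datatype declaration. Everything in it is an assumption for illustration: the example.org property URIs, the PROPERTY_DATATYPES table (standing in for the RDFS/OWL notation wished for above, which does not exist in this form), and the accept_literal helper. The casts only approximate the XSD lexical rules, and it uses only the Python standard library.

```python
from datetime import date

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical declarations: which datatype the plain-literal values of a
# property may be cast to. Stand-in for a schema-level annotation.
PROPERTY_DATATYPES = {
    "http://example.org/vocab#pageCount": XSD + "integer",
    "http://example.org/vocab#published": XSD + "boolean",
    "http://example.org/vocab#issued": XSD + "date",
}


def parse_boolean(lexical):
    """Parse the four xsd:boolean lexical forms: true, false, 1, 0."""
    if lexical in ("true", "1"):
        return True
    if lexical in ("false", "0"):
        return False
    raise ValueError("not an xsd:boolean lexical form: %r" % lexical)


# Approximate casts; int() and date.fromisoformat() do not match the XSD
# lexical spaces exactly (for example, xsd:date timezones are not handled).
CASTS = {
    XSD + "integer": int,
    XSD + "boolean": parse_boolean,
    XSD + "date": date.fromisoformat,
}


def accept_literal(prop, lexical, datatype=None):
    """Return a Python value for a literal.

    An explicit datatype on the literal wins; otherwise fall back to the
    datatype declared for the property; otherwise keep the plain string.
    """
    datatype = datatype or PROPERTY_DATATYPES.get(prop)
    cast = CASTS.get(datatype)
    if cast is None:
        return lexical
    return cast(lexical.strip())


if __name__ == "__main__":
    print(accept_literal("http://example.org/vocab#pageCount", "42"))
    print(accept_literal("http://example.org/vocab#published", "true"))
    print(accept_literal("http://example.org/vocab#title", "42"))
```

A producer following the "conservative in what you send" half would still emit the typed literal; a table like this only helps consumers cope with older vocabularies whose data is already out there as plain literals.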
Received on Friday, 12 October 2007 13:42:27 UTC