Re: my laziness with literals


Dan Brickley wrote:
> Garret Wilson wrote:
>> But if we're going to produce semantic rich data that can be 
>> machine-processed, we need to store things as they are, with 
>> appropriate indication of type.
> I'm not convinced of this. RDF/XML's syntax for datatyping is pretty 
> heavyweight, and there are many RDF vocabularies that pre-date RDFCore 
> (ie. created between 1997-2003).

I was making a normative assertion---describing the way things *should* 
be done going forward.

I agree completely with your comments regarding RDF/XML typed literal 
syntax---but that's a problem with RDF/XML. If RDF/XML made typed 
literals as easy to use as plain literals, would you agree with me when 
I say that we *should* use appropriate types in the future rather than 
making plain literals our first choice?

> It would be good to have a notation in RDFS/OWL (maybe OWL1.1 could do 
> it) to indicate that some plain-literal-valued property takes string 
> values that can be cast to some specified datatype.

OMG. Of course it would be useful, but it's ludicrous because of what it 
says about how hard it is to use RDF/XML datatypes. I'm wondering 
whether to laugh or to cry (not because what you say is laughable---but 
because of the conditions that make your suggestion useful).

> RDF has special handling for URIs. Almost always people are interested 
> in the thing the URI is identifying, not in the URI string itself.

I'm not sure what you're saying. Are you saying that any time a 
processor sees a plain literal starting with "", it 
should assume that the type is URI because people never want to identify 
the URI string itself? If we have to rely on the context, aren't we back 
in plain XML land?

If people want a string of "", they should use a 
string. If people want a URI of <>, they should use a 
typed literal of xsd:anyURI type. If they want a resource identified by 
the URI <>, they should use a resource with that URI. 
Isn't that the perfect world scenario? That's what I was pushing for---a 
perfect world. :)
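
To make the three cases concrete, here is a minimal Python sketch. It is
my own illustration, not the API of any RDF library: the class names,
the describe() helper, and the example.com URI are made-up placeholders.
Only the xsd:anyURI datatype URI is real.

```python
from dataclasses import dataclass

# Real XML Schema datatype URI for xsd:anyURI.
XSD_ANY_URI = "http://www.w3.org/2001/XMLSchema#anyURI"


@dataclass(frozen=True)
class PlainLiteral:
    """Just characters; no claim about what they denote."""
    lexical: str


@dataclass(frozen=True)
class TypedLiteral:
    """Characters plus an explicit datatype URI."""
    lexical: str
    datatype: str


@dataclass(frozen=True)
class Resource:
    """The thing identified by a URI, not the URI text."""
    uri: str


def describe(node):
    """Say what a node denotes, using only its declared kind."""
    if isinstance(node, Resource):
        return "the thing identified by the URI"
    if isinstance(node, TypedLiteral) and node.datatype == XSD_ANY_URI:
        return "the URI itself"
    if isinstance(node, TypedLiteral):
        return "a value of type " + node.datatype
    if isinstance(node, PlainLiteral):
        return "a string of characters"
    raise TypeError("unknown node kind: %r" % (node,))


# Hypothetical placeholder URI, used three different ways.
PLACEHOLDER = "http://example.com/"
as_string = PlainLiteral(PLACEHOLDER)
as_uri = TypedLiteral(PLACEHOLDER, XSD_ANY_URI)
as_thing = Resource(PLACEHOLDER)

for node in (as_string, as_uri, as_thing):
    print(type(node).__name__, "->", describe(node))
```

The same characters appear in all three nodes, yet the nodes never
compare equal, and a consumer never has to inspect the characters to
decide which one was meant.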

>> * If you're going to store a language, use something like 
>> info:lang/en/US.
>> * If you're going to store a Java class, use something like 
>> info:java/com/example/package#Class.
> There is a java: URI scheme. This is used for example in ARQ for 
> dynamic code loading. I don't see a case for using info: instead.

There might have been *plans* for a Java URI scheme back when you 
suggested it over eight years ago 
(<>), but I 
don't think it was ever standardized, and the link you cited 
(<>) no longer references such 
a scheme. If such a scheme has been standardized, by all means let me 
know. Otherwise, I'm going with info:java/ .

>> * If you're going to store an Internet media type, use something like 
>> info:media/text/plain.
> Or dc:format?

dc:format is a property. I'm talking about resource types. The whole 
point of RDF is that we can tell the types of resources without knowing 
what predicate is being used.
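
A rough Python sketch of what I mean by telling the type without
looking at the predicate. The info: namespaces are just the ones I
proposed in this thread (I am not claiming they are registered), and
the ex: names, the triples, and the kind_of() helper are hypothetical.

```python
# Prefixes proposed in this thread, mapped to the kind of thing
# they identify. This mapping is an assumption for illustration.
PROPOSED_KINDS = {
    "info:lang/": "language",
    "info:java/": "Java class",
    "info:media/": "Internet media type",
}


def kind_of(uri):
    """Return the kind of resource a URI identifies, or None."""
    for prefix, kind in PROPOSED_KINDS.items():
        if uri.startswith(prefix):
            return kind
    return None


# Hypothetical triples; the predicates are deliberately meaningless.
triples = [
    ("ex:doc", "ex:anyPredicate", "info:media/text/plain"),
    ("ex:doc", "ex:someOtherPredicate", "info:lang/en/US"),
    ("ex:doc", "ex:yetAnother", "info:java/com/example/package#Class"),
]

# The predicate is ignored entirely; the object carries its own type.
kinds = [kind_of(obj) for (_subj, _pred, obj) in triples]
print(kinds)
```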

> It's good to agree on ways of doing these things, but your choices 
> seem a little arbitrary,

As far as I can tell there is no standardized "java:" URI scheme, so 
there is no alternative to info:java/. dc:format is a property, not a 
resource type, so offering it as an alternative is comparing apples to 
oranges. So I don't know of any alternatives to my choices---if there 
were alternatives, I would have used them. By all means, I'm interested 
in hearing about other options.

> and not yet widely used.

...because it's easier to stick things in plain literals.

> Can we take "Be liberal in what you accept, and conservative in what 
> you send." (see ) as a shared goal 
> here?

In a semantic context?! No, no, no!

"Be liberal in what you accept, and conservative in what you send" is 
useful in certain circumstances when interpreting syntax and protocols. 
But in a semantic context, it's horrible---I don't want to send you a 
string "123" and have you use it as the integer 123 just because you 
noticed I used digits in the string! Similarly, I don't want to send you 
the string "www.something" and have you try to look up a web page just 
because you noticed there was a "www" in there somewhere. Will the 
strength of our semantic exchange rely on how good our heuristic 
algorithms are? The whole point of a semantic framework is that we 
identify the types of things we're using! Otherwise, we could just stick 
everything in XML and tell people to guess about types based upon context.
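
Here is the difference as a small Python sketch. Both functions are
hypothetical illustrations of the two attitudes, not anyone's actual
processor. guess() is the "liberal" consumer I'm objecting to, and
interpret() only trusts a declared datatype.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"


def guess(lexical):
    """Liberal consumer: infer a type from the characters alone."""
    if re.fullmatch(r"[+-]?\d+", lexical):
        return int(lexical)            # digits, so "must be" an integer
    if "www" in lexical:
        return ("web-page", lexical)   # contains "www", so "must be" a page
    return lexical


def interpret(lexical, datatype=None):
    """Strict consumer: use only the type the sender declared."""
    if datatype is None:
        return lexical                 # a plain literal stays a string
    if datatype == XSD + "integer":
        return int(lexical)
    if datatype == XSD + "anyURI":
        return ("uri", lexical)
    raise ValueError("unsupported datatype: " + datatype)


# I sent strings; the liberal consumer decides otherwise.
print(repr(guess("123")))
print(repr(guess("www.something")))

# The strict consumer keeps them as strings unless told otherwise.
print(repr(interpret("123")))
print(repr(interpret("www.something")))
print(repr(interpret("123", XSD + "integer")))
```

With guess(), what the receiver ends up with depends on how clever the
heuristic is. With interpret(), it depends only on what was declared.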

Anyway, this was just a statement from experience, trying to encapsulate 
best practice---I didn't realize it would be controversial.



Received on Friday, 12 October 2007 14:26:54 UTC