Re: my laziness with literals from Garret Wilson on 2007-10-12 (semantic-web@w3.org from October 2007)

From: Garret Wilson <garret@globalmentor.com>
Date: Fri, 12 Oct 2007 15:17:15 -0300
To: Frank Manola <fmanola@acm.org>
CC: Dan Brickley <danbri@danbri.org>, Semantic Web <semantic-web@w3.org>
Message-ID: <470FBA2B.5030605@globalmentor.com>

Frank Manola wrote:
>
> I wonder if you guys could clarify what you have in mind as the 
> problems with RDF/XML datatypes?  The reason I ask is that what many 
> people find troublesome is the requirement to explicitly specify a 
> type with every literal.  If that's what you're referring to, that's 
> not an artifact of RDF/XML, it's part of the way RDF itself defines 
> datatyped literals (and there's a daunting amount of email in the RDF 
> Core archives concerning the tradeoffs involved in that requirement).

I don't think the problem here is that RDF/XML uses datatypes for typed 
literals. I think the problem is that if you want to specify that 123 is 
an integer, you have to go out of your way---way out of your way. In N3 
(I believe) and in JSON, you can just write 123 and it gets an integer 
datatype. It only becomes a string if you put it in quotes: "123".
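
To make the difference in overhead concrete, here is a minimal Python sketch that renders the same value both ways. The helper names (rdfxml_property, n3_object) and the ex:age property are hypothetical, chosen only for illustration; the rdf:datatype attribute and the bare-number shorthand are the parts taken from RDF/XML and N3 themselves.

```python
from typing import Optional
from xml.sax.saxutils import escape

XSD = "http://www.w3.org/2001/XMLSchema#"

def rdfxml_property(qname: str, lexical: str, datatype: Optional[str] = None) -> str:
    """Render one property element as it might appear inside an rdf:Description.

    Hypothetical helper: without a datatype the result is a plain literal.
    """
    attr = f' rdf:datatype="{datatype}"' if datatype else ""
    return f"<{qname}{attr}>{escape(lexical)}</{qname}>"

def n3_object(value) -> str:
    """Render a Python value as an N3 object term (hypothetical helper)."""
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value)  # a bare number is read as an xsd:integer
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # quoted, so it stays a string

plain_xml = rdfxml_property("ex:age", "123")
typed_xml = rdfxml_property("ex:age", "123", XSD + "integer")
print(plain_xml)
print(typed_xml)
print(n3_object(123), n3_object("123"))
```

In RDF/XML the typed form needs the full datatype URI on every literal, while in N3 the integer costs nothing extra to write.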

So like I said in my original post, this huge overhead of RDF/XML has 
made people get lazy and just use plain literals. As Henry pointed out, 
this isn't a problem in N3, and if we had always had N3 then this 
wouldn't be a problem. But as it stands, RDF/XML is the high-profile 
serialization of RDF, and that's one reason why it's easier just to 
stick an integer or a URI or a date or a language or a media type or 
whatever in a plain literal.

I'm advocating using typed literals over plain literals, whatever syntax 
you use. If somebody wants to abandon RDF/XML to get this accomplished, 
I won't try to stop them. :)

> So if I'm publishing (integer) ages using literals obtained from a 
> Java program, and you are publishing (integer) ages using literals 
> obtained from an SQL database, we might want to use separate datatypes 
> (even if we're talking about values of the same RDF property), in 
> order to identify exactly the datatype the literal is associated with 
> at its source.

Fine! That's great---at least I can see what it is, so that I can know 
if I can convert it to/from Java/SQL. Just don't stick the integer in a 
plain literal, or I have nothing to go on!
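
As a rough illustration of "nothing to go on", here is a small Python sketch of what a consumer might do with a literal. The to_native function and the CONVERTERS table are hypothetical, not part of any RDF library; the sketch only assumes the consumer receives a lexical form plus an optional datatype URI.

```python
from datetime import date
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical table of datatypes this consumer knows how to convert.
CONVERTERS = {
    XSD + "integer": int,
    # Simplification: handles only the YYYY-MM-DD form, no time zone suffix.
    XSD + "date": date.fromisoformat,
}

def to_native(lexical: str, datatype: Optional[str] = None):
    """Convert a literal to a native value when its datatype is known."""
    if datatype is None:
        # Plain literal: could be an integer, a date, a URI; we cannot tell.
        return lexical
    converter = CONVERTERS.get(datatype)
    if converter is None:
        # A datatype we don't handle: at least we can see what it is.
        return lexical
    return converter(lexical)

print(to_native("123", XSD + "integer"))
print(to_native("123"))
print(to_native("2007-10-12", XSD + "date"))
```

With the datatype present the consumer gets an int or a date; with a plain literal all it can safely do is hand back the string.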

> Anyway, in this context I think at least a form of "Be liberal in what 
> you accept, and conservative in what you send" continues to be good 
> advice, if we can understand "conservative" as meaning "provide the 
> receiver with as much relevant metadata (in this case, the datatype) 
> about what you're sending as you can".

As an interpretation, that's a stretch ;) , but I'll gladly accept the 
conclusion! Producers should provide as much semantics as possible, 
which is my sole point here. (Whether consumers should be liberal about 
conversion is a separate topic---one that doesn't interest me as much at 
this point in time.)


> Garret, what you seem to be suggesting is more of an agreed set of 
> types that everyone would agree to use, and where the mediation with 
> local types would already have taken place.

Actually, I wasn't even talking about local types at all. I'm just 
talking about types we all agree on, and asking everyone to use 
consistent non-plain-literal representations. We all agree that an 
xsd:integer is an xsd:integer, but so many ontologies and data sets just 
stick xsd:integers into plain literals---how am I supposed to know it 
is an integer? Similarly, we all agree what an Internet media type (or 
MIME type) is---we just don't have a way to represent them, which is why 
I proposed one URI representation. (But see the separate email I'll be 
sending soon.)

Converting different datatypes between systems (Java to/from SQL or 
whatever) is beyond the scope of my point.

Thanks for the response.

Garret
Received on Friday, 12 October 2007 18:18:11 UTC