
Re: in...of syntax Re: Turtle Last Call: Request for Review

From: Steve Harris <steve.harris@garlik.com>
Date: Tue, 31 Jul 2012 18:06:43 +0100
Cc: Jan Wielemaker <j.wielemaker@vu.nl>, Sandro Hawke <sandro@w3.org>, public-rdf-wg@w3.org
Message-Id: <FBE6087A-CDC1-4BC7-A75F-A262442279A1@garlik.com>
To: nathan@webr3.org

On 2012-07-31, at 14:38, Nathan wrote:

> Steve Harris wrote:
>> On 2012-07-31, at 13:58, Jan Wielemaker wrote:
>>> On 07/31/2012 02:42 PM, Steve Harris wrote:
>>>> On 2012-07-31, at 13:28, Nathan wrote:
>>>> 
>>>>> Steve Harris wrote:
>>>>>> On 2012-07-31, at 02:36, Sandro Hawke wrote:
>>>>>>> On 07/30/2012 06:37 PM, Andy Seaborne wrote:
>>>>>>>>> But surely IF this is a good idea and worth having, which
>>>>>>>>> I'm assuming it is, then the longer we wait, the more
>>>>>>>>> problems there will be with deployed systems out there
>>>>>>>>> which don't support it. Kicking the can down the road is
>>>>>>>>> not a good way to handle problems of legacy inertia.
>>>>>>>>> 
>>>>>>>> Your argument would apply to literals-as-subjects as well;
>>>>>>>> it's largely a syntax restriction.  If that's going to
>>>>>>>> happen, it isn't in this WG (by charter), so why not make the
>>>>>>>> changes in one step, not in multiple steps?
>>>>>>> If literals-as-subject were primarily a matter of syntax, or
>>>>>>> were seen as inevitable, I don't think they'd have been ruled
>>>>>>> out by the charter.    I understand the reasons were mostly
>>>>>>> about data structures and implementation techniques, but I
>>>>>>> wasn't paying close attention to the technical content, so
>>>>>>> perhaps I misunderstood.
>>>>>> I think that the reason users don't try it is because of the
>>>>>> syntax restriction, the reason engines don't (on the whole)
>>>>>> support it is more due to the legacy of getting on for 15 years
>>>>>> worth of software, research and publications. Knowing that the
>>>>>> subject can only be a URI or bNode is a useful optimisation for
>>>>>> many SPARQL engines.
>>>>> wild idea and probably way off course - but what if there was some
>>>>> kind of "EXTENDED MODE" keyword for sparql queries that let the
>>>>> engine know to expect literals as subjects and other such things -
>>>>> would an approach like that allow the engines to keep their
>>>>> optimizations most of the time, and skip them when demanded?
>>>> It would have to be at DB creation time (at least in 4/5store) as the
>>>> optimisation goes right down into the storage engine.
>>> I think that the more fundamental question is at the level of modelling.
>>> My fear would be that we get triples that use proper names of entities
>>> as subject instead of inventing a URI for this particular appearance of
>>> the string.  If this happens we introduce massive ambiguity (or conflicts, depending on your viewpoint).  Next is
>>> 
>>> "Beer" "sub type of" "Animal".
>>> 
>>> URIs have a role: they avoid ambiguity and they can be used to fetch
>>> data as Open Linked Data.  They are not without problems, but they do
>>> fix some problems associated with natural language words.
>>> 
>>> In any case, the order must be to make a decision at the datamodel first
>>> and next at the syntax level.  Please don't 
>> That's a very good point.
>> People coming from a relational database background will definitely use integer "keys", until they (hopefully) realise why that's a bad idea on the web. e.g.
>> 9876435 a :Product ;
>>        :name "5cm Trunion (Foo Inc.)" ;
>>        :cost 12.99 .
> 
> people will often use integer "keys", and often do even in linked data, by using (inverse) functional properties, or by prefixing part of a URI before the key.

Right, but neither of those things necessarily causes such a big issue.

>> To a Linked Data person that's very clearly a bad idea, but it's an obvious, and rational translation from a database table.
> 
> I have to disagree (not to be argumentative or negative though!) - it seems like a rational idea to me, to use integer keys, even better to namespace them within a URI (http://example.org/products/9876435), and to have optimizations of integer-only keys within a storage engine for a specific domain.

I would argue that's a perfectly reasonable thing to do. It's very different to using a bare integer though!
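
A minimal sketch of that difference, in Python. The helper name `mint_uri` and the second publisher's base (`http://example.com/products`) are assumptions made for illustration; only the example.org URI comes from the message above. The integer key stays available to the storage engine, but the name that goes on the wire is scoped to whoever published it:

```python
def mint_uri(base: str, key: int) -> str:
    """Build a subject URI by placing a local integer key under the publisher's namespace.

    Hypothetical helper for illustration; not part of any RDF library.
    """
    return f"{base.rstrip('/')}/{key}"


# The same database key, published by two different (assumed) parties.
ours = mint_uri("http://example.org/products", 9876435)
theirs = mint_uri("http://example.com/products/", 9876435)

print(ours)            # http://example.org/products/9876435
print(ours == theirs)  # False
```

A bare 9876435 carries no such scope, so two publishers who pick the same key end up using the same name.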

> That's not the point though, the point is that the above "looks like RDF", "is N3", but isn't "linked data" - as it's not linked. It's not a bad idea, it's just not linked data.

It is a bad idea in the presence of data from other people, who had the same idea.

E.g.

<http://foo.com/products.ttl>

   9876435 a :Product ;
          :name "5cm Trunion (Foo Inc.)" ;
          :cost 12.99 .


<http://bar.com/products.ttl>

   9876435 a :Product ;
          :name "12cm Frob (Bar Inc.)" ;
          :cost 1.99 .


That would be perfectly legal in a literals-as-subjects world, but now we've created some very strange assertions.
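
To make the problem concrete, here is a rough sketch in plain Python rather than in any real RDF library. Triples are modelled as tuples and a merge as set union; the `describe` and `namespaced` helpers, the `/products` base paths, and the assumption that both files bind `:` to the same vocabulary are all made up for illustration.

```python
from collections import defaultdict

# Triples as (subject, predicate, object) tuples. A bare integer subject is
# allowed here only to mimic a literals-as-subjects world.
foo = {
    (9876435, "rdf:type", ":Product"),
    (9876435, ":name", "5cm Trunion (Foo Inc.)"),
    (9876435, ":cost", 12.99),
}
bar = {
    (9876435, "rdf:type", ":Product"),
    (9876435, ":name", "12cm Frob (Bar Inc.)"),
    (9876435, ":cost", 1.99),
}


def describe(graph):
    """Group a graph's triples by subject, then by predicate."""
    out = defaultdict(lambda: defaultdict(set))
    for s, p, o in graph:
        out[s][p].add(o)
    return out


def namespaced(graph, base):
    """Rewrite each integer subject as a URI under the publisher's base (assumed layout)."""
    return {(f"{base}/{s}", p, o) for s, p, o in graph}


# Merging two graphs without blank nodes is just set union.
merged = describe(foo | bar)
print(len(merged))                   # 1
print(len(merged[9876435][":name"])) # 2
print(len(merged[9876435][":cost"])) # 2

separate = describe(
    namespaced(foo, "http://foo.com/products")
    | namespaced(bar, "http://bar.com/products")
)
print(len(separate))                 # 2
```

With bare integers the merge yields a single :Product that has two names and two costs, and nothing in the data says which name goes with which cost. With publisher-scoped URIs the two products stay distinct.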

> Use URIs as names is the first principle of linked data, simple education gets that across to people.

I don't believe that's sufficient. Currently they have no choice if they want to use RDF, but I'm sure people would do things like the above if they could. Right now many people using RDF have drunk the SemWeb Kool-Aid, but if it ever takes off that will no longer be the case; it will mostly be developers who've been told by their boss to "make a linked data version of the site", or whatever.

> The above looks like a set of three useful triples to me, that I could use in multiple scenarios. RDF doesn't support it over the wire, and of course it isn't linked - doesn't make it unwanted, bad, or useless though.

I don't think I agree - or, at least, I'd take a lot of persuading that that example isn't destructive.

> I'm aware it's way too late to see literals as subjects in RDF 1.1, but will defend their usage and the idea of them, as it's certainly not useless, stupid, or bad; nor something to create fear, uncertainty, and doubt over. They have some value, some cost, some people would like them, some people would not.

There are certainly use cases for LAS that aren't bad; I was just pulling out one that was.

- Steve

-- 
Steve Harris, CTO
Garlik, a part of Experian
+44 7854 417 874  http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ
Received on Tuesday, 31 July 2012 17:07:14 GMT
