RE: My RDF Manifesto from Grant Robertson on 2012-05-14 (www-rdf-comments@w3.org from April to June 2012)

From: Grant Robertson <grantsr@gmail.com>
Date: Sun, 13 May 2012 20:19:03 -0700
To: "'Danny Ayers'" <danny.ayers@gmail.com>
Cc: <www-rdf-comments@w3.org>
Message-ID: <34563AD196E642529869F1EDA05EFB68@grantdesk>
Danny, sorry to have taken so long to respond. Other life stuff sometimes
interferes... 

> -----Original Message-----
> From: Danny Ayers [mailto:danny.ayers@gmail.com] 
> 
> 1) There is no meta-metadata.
> 
> Named graphs appear to be the best solution here. Whatever 
> they ultimately wind up looking like, there is always the 
> HTTP model to fall back on: put an RDF-friendly format 
> representation of a resource online (set of triples), talk about that.

I disagree. I have been watching the discussion of the possible uses for
that last member of a Quad and it seems there is disagreement as to whether
that member should be A) An IRI pointing to additional metadata (as you seem
to suggest here), or B) merely a label to indicate which "layer" a triple
should be on or which "space" said triple should be considered to be in. The
latter limits each triple to being in only one layer. Or, if two quads exist
that are identical except for the last member, how is one to know whether
they are truly intended to be the same statement spanning two different
layers or are supposed to be independent? Quads add more data but also simply
move the ambiguity a step or two to the right while actually multiplying
said ambiguity. 
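
A minimal sketch of that ambiguity, in Python, assuming quads are modeled
as plain tuples. The example.org IRIs and the helper name are made up for
illustration; this is not the API of any RDF library.

```python
from collections import defaultdict

# Quads as (subject, predicate, object, fourth_member) tuples.
# The example.org IRIs are hypothetical.
quads = [
    ("http://example.org/jim", "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/sally", "http://example.org/graphs/a"),
    ("http://example.org/jim", "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/sally", "http://example.org/graphs/b"),
]

def fourth_members_by_triple(quads):
    """Group the fourth member of each quad under its (s, p, o) triple."""
    grouped = defaultdict(set)
    for s, p, o, g in quads:
        grouped[(s, p, o)].add(g)
    return dict(grouped)

grouped = fourth_members_by_triple(quads)
for triple, members in grouped.items():
    # The data alone cannot say whether this is one statement spanning two
    # layers or two independent statements that happen to look alike.
    print(triple, "appears with", sorted(members))
```

Both quads collapse to a single triple with two fourth members, and
nothing in the data records which interpretation was intended.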

Relying on the "HTTP model" - by which I suppose you mean that the IRI of the
fourth member of the quad is actually a URL pointing to metadata about the
triple - utterly depends on either A) all those other web pages being
available each and every time the data needs to be processed or B) the data
on those web pages being cached within some extension of an RDF data store.
It also requires either a separate web page for each triple posted on the
internet or a set of different fragments within one large web page
describing a set of RDF data. 

I have to say here: Seriously? Does the RDF community really think it is
simpler to: A) Require processors and data stores to handle a possible
additional element (that fourth element that makes a triple into a quad)? B)
Require the creation and subsequent repeated dereferencing of perhaps
thousands of additional separate web pages which hold that metadata? C)
Cause ambiguity as to whether that IRI should be dereferenced for metadata
or merely used as a label for a "layer" or "space"? ... All as opposed to
simply allowing use of the query string in existing IRIs? A change that will
be immensely easier for developers of processors to program around - if they
choose to ignore it - because all they have to do is ignore the query string
if they want to maintain the status quo (remember, no one uses the query
string because it is currently of no useful significance). A change that
will keep the meta-metadata with the triple itself. And a change that will
allow meta-metadata to be included with the triple while ALSO allowing
labels to be applied to the triple to indicate which "layer" they should be
on. Yes, it is possible to turn left by making three rights - and it may be
necessary every once in a while - but would you really want to get around
town that way all the time?
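
A rough sketch of what "ignore the query string" could look like for a
processor, using Python's standard urllib.parse. The parameter names
"certainty" and "layer" are hypothetical, chosen only for illustration;
no RDF specification defines them.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs

def split_annotated_iri(iri):
    """Return (plain_iri, annotations) for an IRI that may carry a query string."""
    parts = urlsplit(iri)
    # Rebuild the IRI without its query string: the status quo view.
    plain = urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))
    # Keep the first value of each query parameter as an annotation.
    annotations = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return plain, annotations

# Hypothetical annotated predicate IRI.
predicate = "http://xmlns.com/foaf/0.1/knows?certainty=0.5&layer=draft"
plain, annotations = split_annotated_iri(predicate)
print(plain)
print(annotations)
```

A processor that wants the status quo keeps only the first return value;
one that understands the annotations reads the second.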


> 
> 2) RDF is entirely Boolean.
> 
> Not quite, "true" vs. "unknown" is a slightly different 
> species. But I had exactly the same kind of issue with RDF 
> when I first encountered it, and others have been there 
> before. I believe Aaron Swartz came up with a property 
> :kindaLike. But the fact that you can put numeric values in 
> literals mean that it is possible to connect to 
> number-crunching systems. I'd suggest that putting

Adding a numeric value in a literal only connects that value to a subject.
It does not associate it with a particular triple that contains said
subject. If I want to say that Jim foaf:knows(100%) Sally but that Jim
foaf:knows(50%) Bill, your solution provides no means of doing that other
than putting the metadata in an entirely separate file. Please explain to me
how that is simpler to process and easier for web authors to write into
their web pages or RDF files.
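
To illustrate the point about literals, here is a small Python sketch
with triples as plain tuples. The "ex:" names and the strength property
are hypothetical, invented for this example.

```python
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"
STRENGTH = "http://example.org/vocab#strength"  # hypothetical property

triples = {
    ("ex:jim", FOAF_KNOWS, "ex:sally"),
    ("ex:jim", FOAF_KNOWS, "ex:bill"),
    ("ex:jim", STRENGTH, 1.0),   # meant for Sally
    ("ex:jim", STRENGTH, 0.5),   # meant for Bill
}

def strengths_for(triples, subject, obj):
    """Try to look up the strength of `subject foaf:knows obj`."""
    # The literals hang off the subject alone, so `obj` cannot narrow
    # the search: every strength attached to the subject comes back.
    return sorted(o for s, p, o in triples if s == subject and p == STRENGTH)

print(strengths_for(triples, "ex:jim", "ex:sally"))
print(strengths_for(triples, "ex:jim", "ex:bill"))
```

Both lookups return the same pair of values, because nothing ties either
number to a particular foaf:knows triple.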



> 3) RDF is fragile and impermanent.
> 
> It's more or less as fragile and impermanent as Platonic 
> solids (though YMMV as far as specs and mindshare are 
> concerned :) But I agree very much that dependency on HTTP's 
> dependency on DNS is troublesome.

I fail to see how the stability of the internet can be compared to a
geometrical construct.


> 4) Blank Nodes are too ambiguous yet not fuzzy enough
> 
> Then don't use them :)

That is not a solution. Blank nodes are often absolutely required in order
to create the more complicated data structures that TBL promised we would be
able to create using the much simpler and easier to program around "triple."
Yet, once one has created a blank node - either implicitly through chaining
or explicitly through the use of _:name - it is impossible to tell for
certain whether a blank node in one data source (be it a web page or a
triple store) is really, REALLY the same as another blank node in another
data source... Especially after long periods of time. 
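
The following Python sketch shows why. Blank node labels are local to the
source they appear in, so a merge has to rename them apart, and after
that nothing records whether two of them were ever meant to be the same
thing. The merge function and the sample data are assumptions made for
illustration, not any library's API.

```python
import itertools

def merge(*sources):
    """Merge triple lists, giving each source's blank nodes fresh labels."""
    counter = itertools.count()
    merged = set()
    for source in sources:
        renamed = {}  # per-source mapping: old label -> fresh label

        def fresh(term):
            # Sketch only: any string starting with "_:" is treated as a
            # blank node label.
            if isinstance(term, str) and term.startswith("_:"):
                if term not in renamed:
                    renamed[term] = f"_:b{next(counter)}"
                return renamed[term]
            return term

        for s, p, o in source:
            merged.add((fresh(s), p, fresh(o)))
    return merged

# Two sources that both use the label _:x.
page_a = [("_:x", "foaf:name", "Jim")]
page_b = [("_:x", "foaf:name", "Jim")]

merged = merge(page_a, page_b)
print(len(merged))
```

The two _:x nodes come out as two distinct nodes, even though both
sources may have been describing the same Jim.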


Please remember, I am not thinking in terms of simply adding a bit of
citation information to a quote on my web page. Nor am I interested in
telling the world who my friends are (let alone through the least
user-friendly means possible). I am considering the implications of using RDF and
RDFa to store scientific and research data that can be used by as many
people as possible, over the course of potentially hundreds of years. I am
thinking in terms of what will meet the needs of scientists and real people.
Whether this system matches up with what someone's Discrete Math teacher
taught them in graduate school is not even on my priority list. Nor do I think it
is on the priority list of the people who will really use RDF, if it ever
finally meets their needs. I don't think even scientists are ever going to
say, "Well this system doesn't really meet my needs and it is exceedingly
difficult to store and use my data in this format... but hell, it is a
mathematically based model, so I guess I should use it."  I know I am not a
gray-beard in this community, nor do I have all kinds of math or CS degrees,
but this whole thing really seems like a huge case of a whole group of
people having gone down a path that looked interesting at first, but now
refusing to backtrack regardless of how treacherous that path has become or
how far it has taken them from their original goal.
Received on Monday, 14 May 2012 03:19:34 UTC