Re: Ongoing objection to RDFa Profiles format (as XHTML+RDFa) from Nathan on 2010-10-08 (public-rdfa-wg@w3.org from October 2010)

From: Nathan <nathan@webr3.org>
Date: Fri, 08 Oct 2010 04:15:44 +0100
To: Manu Sporny <msporny@digitalbazaar.com>
CC: RDFa WG <public-rdfa-wg@w3.org>, Mark Birbeck <mark.birbeck@webbackplane.com>
Message-ID: <4CAE8CE0.4060009@webr3.org>

Manu, Mark, All,

I hate to say it, but I also don't support RDFa Profiles (not just the
format; I don't support RDFa Profiles at all). That said, I can't see the
approach I feel is the correct one (which I'll outline in a moment)
being supported in tooling for quite some time, sadly (although some
tools already support it, and I could support it myself easily enough).

My Reasons against:

- increased network load, several resources may be required to process a 
single resource, decreasing network efficiency, adding in additional 
interactions and reducing user-perceived performance (this is quite major..)

- constrains an RDFa document to be protocol bound: without profiles one
could retrieve an RDFa document via FTP, P2P, SCP or any other means and
still extract the graph serialized; with profiles it's entirely likely
that an HTTP (or other) agent would also be required to process the
document.

- significantly increases the overhead involved in batch processing RDFa
documents; consider recursively wget'ing a website and then batch
extracting the RDF into a triple store.

- an RDFa document does not contain the graph serialized within it when
processed offline, or when one or more RDFa Profiles cannot be resolved
/ successfully retrieved (partial messages can be a real problem,
especially when the missing information is critical: signatures, trust
metrics, assertions of falsehood, update streams etc.; see also [1])

- temporal issues: if I serialize an RDF graph in an RDFa document
today, I'd quite like to be able to get that graph back out in 10 years'
time without having to pray the profiles are still online.

[1] Manu, this one may resonate with you - let's move down the line a
couple of years and say a large chunk of the ecommerce world is using
Good Relations and the RDFa profile for it; consider what happens if
http://www.heppnetz.de/grprofile/ goes down for a day.. or perhaps it's
hacked and those prefix mappings are changed.. could be quite a big problem?
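
As a rough illustration of the offline and availability concerns above,
here is a minimal Python sketch. The function and exception names are
hypothetical and not taken from any RDFa library, and the empty map simply
stands in for a profile that could not be fetched. It only shows that a
CURIE cannot be expanded when its prefix mapping is missing.

```python
# Hypothetical sketch: names below are illustrative, not from any RDFa library.


class UnresolvedPrefix(Exception):
    """Raised when a CURIE uses a prefix that has no known mapping."""


def expand_curie(curie, prefixes):
    """Expand 'prefix:local' to a full IRI using the given prefix map."""
    prefix, _, local = curie.partition(":")
    if prefix not in prefixes:
        raise UnresolvedPrefix(prefix)
    return prefixes[prefix] + local


# Case 1: the mapping is declared in the document itself, so the graph
# can be extracted however the document was obtained (FTP, SCP, disk...).
inline_prefixes = {"gr": "http://purl.org/goodrelations/v1#"}
print(expand_curie("gr:Offering", inline_prefixes))

# Case 2: the mapping lives only in a remote profile. Offline, or when the
# profile cannot be retrieved, there is nothing to expand the CURIE with.
profile_prefixes = {}  # assumption: stands in for a failed profile fetch
try:
    expand_curie("gr:Offering", profile_prefixes)
except UnresolvedPrefix as missing:
    print("cannot expand, unknown prefix:", missing)
```

A real processor has more options than raising an error, but the triple
in question is lost either way.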

I do fully understand why RDFa Profiles were introduced; I just feel
it's the wrong approach, as per the above. The approach I feel is the
correct one is as follows:


Alternative Solution:

Leverage OWL to create proxy ontologies using owl:equivalentClass and 
owl:equivalentProperty

If Martin Hepp's Good Relations ontology requires the use of a couple of 
geo: properties, some from dcterms and so forth, he could simply define 
aliases to them in the good relations ontology and assert their equivalence.

Likewise, if I were to create a blogger-type platform I could simply
assert that x:title owl:equivalentProperty dcterms:title, rdfs:label .
and so forth.

The point being that there's no reason a full RDF(a) document could not
use a single schema which aliases / proxies multiple different schemas.
I'm sure you all follow without needing to go into too much detail :)
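
For anyone who would like it spelled out, here is a minimal Python sketch
of what a consumer could do with such a proxy schema. Everything in it is
an assumption for illustration: the x: namespace is made up, the
equivalence table stands in for owl:equivalentProperty statements read
from the schema, and only the alias-to-target direction is applied even
though owl:equivalentProperty is symmetric.

```python
# Hypothetical names throughout; the x: namespace is invented for illustration.
DCTERMS = "http://purl.org/dc/terms/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
X = "http://example.org/blog#"

# Stand-in for the owl:equivalentProperty statements in the proxy schema.
EQUIVALENT_PROPERTY = {
    X + "title": [DCTERMS + "title", RDFS + "label"],
}


def expand_equivalents(triples, equivalents):
    """Return the input triples plus one triple per asserted equivalent property."""
    expanded = list(triples)
    for subject, predicate, obj in triples:
        for alias in equivalents.get(predicate, []):
            expanded.append((subject, alias, obj))
    return expanded


post = [("http://example.org/posts/1", X + "title", "Hello world")]
for triple in expand_equivalents(post, EQUIVALENT_PROPERTY):
    print(triple)
```

The document itself only ever mentions the one schema; the mapping to
dcterms and rdfs is applied by whoever consumes the graph.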

Manu's comments on OWL lead me back to why I fully agreed with Jim
Hendler's proposal on RDFS 3.0 [2] at the next steps workshop, and
indeed why I'm surprised it wasn't considered more. There's a general
sentiment of being ontology/schema shy around the linked data camps,
preferring to use out-of-band knowledge about classes and properties
rather than having a basic schema awareness within tooling. This is
an issue ingrained deeper within the community, and one which is a
fundamental issue behind many of the more discussed RDFa WG issues at
present - for instance when I mentioned pulling the range of properties
to work out whether a string URI in an RDFa document is a resource or
not. It would be nice if it was under our remit to encourage and promote
good form in this respect (arguable?!).
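
The range lookup mentioned above could be sketched roughly as follows in
Python. This is a simplification under stated assumptions: the range
table is hand-written stand-in data rather than something fetched from a
schema, the function name is hypothetical, and any range other than
rdfs:Literal is treated as a resource, which ignores datatype ranges.

```python
RDFS_LITERAL = "http://www.w3.org/2000/01/rdf-schema#Literal"
FOAF = "http://xmlns.com/foaf/0.1/"

# Stand-in for rdfs:range statements read from a schema.
PROPERTY_RANGE = {
    FOAF + "name": RDFS_LITERAL,
    FOAF + "knows": FOAF + "Person",
}


def object_kind(predicate, ranges):
    """Guess whether a property's value is a literal or a resource."""
    if predicate not in ranges:
        return "unknown"  # no schema knowledge, so no basis for a guess
    if ranges[predicate] == RDFS_LITERAL:
        return "literal"
    return "resource"  # simplification: datatype ranges are not handled


for prop in (FOAF + "name", FOAF + "knows", FOAF + "nick"):
    print(prop, object_kind(prop, PROPERTY_RANGE))
```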

On this note, I also feel there's an 80/20 rule for general usage of RDF
on the web, but it's more like 95/5: if you defined RDFS 3.0, then
took the common properties used in say 10k personal profiles and made a
proxy ontology, then did the same for 10k blog posts/micro
blogs/comments/forum posts/articles, and 10k ecommerce sites, then you'd
have covered most of the common usage on the web in just 3-4 schemas,
leaving the rest down to people who are more familiar with RDF and don't
really make the mistakes we're trying to cater for (copy and paste, fear
of the prefix etc).
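
One way to picture picking the "common properties" is the Python sketch
below. It is purely illustrative: the sample documents are invented, the
function name is hypothetical, and the 95% figure is just the rough
number used above, not a measured result.

```python
from collections import Counter


def properties_for_coverage(documents, target=0.95):
    """Pick the most-used properties until they account for `target` of all uses."""
    counts = Counter(prop for doc in documents for prop in doc)
    total = sum(counts.values())
    chosen, covered = [], 0
    for prop, uses in counts.most_common():
        if total and covered / total >= target:
            break
        chosen.append(prop)
        covered += uses
    return chosen


# Invented sample: each inner list is the set of properties used by one profile.
sample_profiles = [
    ["foaf:name", "foaf:knows"],
    ["foaf:name"],
    ["foaf:name", "x:rare"],
]
print(properties_for_coverage(sample_profiles, target=0.5))
```

The properties selected this way would be the candidates to alias in a
proxy ontology for that kind of document.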

[2] http://www.w3.org/2009/12/rdf-ws/papers/ws31

To summarise: even in the all too likely case where what I've
suggested doesn't happen any time soon, I still think that adding RDFa
Profiles is leading to internet-scale problems which could last a
generation. It is far better to have those relatively short-lived
copy-paste and a-little-bit-harder-to-write problems, coupled with a
"let's teach them" approach, than to introduce something we later
(perhaps much later) regret.

Of course I could be wildly wrong; regardless, I'll pass on my regards
and hope that this mail finds you all well :)

Best,

Nathan

Manu Sporny wrote:
> On 09/08/2010 02:30 PM, Mark Birbeck wrote:
>> On Wed, Sep 8, 2010 at 3:08 PM, Ivan Herman <ivan@w3.org> wrote:
>>> [snip]
>>> I am sorry but these things have already been discussed, and the WG has
>>> decided to go along the lines it has now. I do not see any new information
>>> here, ie, no argument that has not been discussed before. Reopening a closed
>>> issue is really not a good way forward.
>> As you rightly say the issue was resolved by the WG some months ago.
>> However, I never supported the original resolution:
>>
>>   <http://www.w3.org/2010/02/rdfa/meetings/2010-04-15#resolution_3>
>>
>> and I'm afraid I can't support it now. I'm not really sure what people
>> expect me to do, since I didn't say I could live with this -- I said I
>> oppose it.
>>
>> For me this is particularly compounded by the fact that I've yet to
>> see a decent argument in favour of using RDF to express the prefix
>> mappings (as opposed to name/value pairs as is done in N3, SPARQL,
>> RDF/XML, and so on); you say that "these things have already been
>> discussed", but I don't feel the discussion really nailed this.
> 
> Hi Mark,
> 
> Sorry for taking so long to respond. I had promised you a follow-up to
> this e-mail at some point in the past month. I've seen the "why are we
> using RDF to express prefix mappings?" question raised by you several
> times with no in-depth answer from the list, so here's my attempt at
> summarizing the conversation over the past several months with the
> various parties involved.
> 
> RDFa Vocabulary/Profile Orthogonality
> -------------------------------------
> 
> I think the short answer is that we're using RDF to express the
> information because we expect that many of the RDFa Profile documents
> will be most useful to people as human-readable documents that just
> happen to contain machine-readable RDFa. Take the FOAF vocabulary for
> example - it's human-readable:
> 
> http://xmlns.com/foaf/spec/
> 
> but it is also machine-readable via XHTML+RDFa, here are the triples:
> 
> http://check.rdfa.info/check?url=http://xmlns.com/foaf/spec/&version=1.0
> 
> We are expecting the RDFa Profile documents to be marked up in the same
> way, for example, here is the Good Relations RDFa Profile document:
> 
> http://www.heppnetz.de/grprofile/
> 
> and here are the machine-readable triples from the document:
> 
> http://check.rdfa.info/check?url=http://www.heppnetz.de/grprofile/&version=1.1
> 
> Note that the same document is used to provide both the human-readable
> (HTML) and machine readable (RDFa) information for the FOAF Vocabulary.
> Also note that the identical XHTML+RDFa mechanism was used to generate
> the Good Relations RDFa Profile.
> 
> This orthogonality is very important, and is one of the main driving
> reasons to mark up RDFa Profiles in RDFa.
>
> There is no difference between how one goes about creating an ideal
> vocabulary document and an ideal profile document for use with RDFa. So,
> if someone understands how to write XHTML+RDFa, the likelihood that they
> will be able to write an RDFa Vocabulary document and an RDFa Profile
> document is higher if we don't switch the underlying format on them.
> 
> Now, let's take a few of the suggestions that you have made - flat text
> files with key-value pairs, JSON, and using a @prefix-based mechanism.
> 
> Each one of these approaches requires someone that already knows
> XHTML+RDFa to understand that RDFa Profiles operate differently than the
> rest of XHTML+RDFa. That is, one must write XHTML+RDFa Vocabulary
> documents in one way, and RDFa Profile documents in another way.
> 
> Human-readability of RDFa Profiles
> ----------------------------------
> 
> In the case of flat text files with key-value pairs, we don't have any
> human-readable aspect to the files - just machine readable data. The
> same problem exists with JSON (which may or may not be understood by the
> person writing HTML+RDFa). To understand why this is such a bad idea,
> one can look at the OWL vocabulary:
> 
> http://www.w3.org/2002/07/owl
> 
> Trying to understand how to use OWL by just looking at the
> machine-readable vocabulary is painful and error prone. Given the choice
> between the way FOAF describes how their vocabulary can be used, and the
> way that OWL does the same thing - the choice is pretty clear from a
> human-readability point of view.
> 
> So, advocating a mechanism to express RDFa Profiles in a way that is not
> human-readable is a non-starter as far as I'm concerned.
> 
> You had also mentioned that we could perhaps re-use @prefix in an RDFa
> Profile document to express all of the prefixes and terms for the RDFa
> Profile. This would allow us to express the document in a human and
> machine readable way. However, the major drawback to this approach is
> that all of the prefix/term settings would be shoved into one attribute.
> 
> Future Proofing RDFa Profiles
> -----------------------------
> 
> If we wanted to modify RDFa Processor behavior by decorating prefix/term
> mappings via another mechanism, we'd be blocked in doing so due to the
> nature of the fairly simple @prefix syntax.
> 
> For example, if we wanted to define terms in the future that generated 4
> triples every time a term was found, we couldn't do so via the @prefix
> mechanism. That is, if we wanted to generate dc:title, foo:title,
> bar:title and zurg:title when the term "title" was used like so:
> 
> <span property="title">Zorgon The Emphatic</span>
> 
> we'd have to invent a new backwards-compatible syntax for @prefix as
> used in RDFa Profile documents.
> 
> Alternatively, the RDFa Profiles mechanism that used RDFa markup to
> express the profile terms and prefixes would just add another triple
> that states the other triples that must be generated when "title" is
> found in the markup. In other words, we're also using RDF to express the
> RDFa Profile documents because it is extensible.
> 
> Concerns
> --------
> 
> That is not to say that I don't agree with your notion that we're mixing
> the layers a bit here, but at the end of the day, authors rarely care
> about that. Language designers care about that kind of thing and I can't
> think of how it may bite us later down the line at the moment. What
> usability issues do you see with this approach? What technical issues do
> you see with this approach? I think I understand the design issues
> you're raising, but even I have to admit that they are a bit purist.
> 
> What is the worst-case here? If an SVG+RDFa processor must implement an
> XML-compatible+RDFa processor (which it has to do anyway) to read RDFa
> Profiles, what is the down-side to that? We have between 18-22 RDFa
> processors at present, do we think that we'll only have a handful of
> RDFa 1.1 processors due to this design decision?
> 
> To put it another way, your reaction to this seems to be fairly strong,
> Mark. Perhaps all of us that don't see it as a big issue are missing
> something, but I can't understand what that something must be.
> 
> I think the best way forward at this point is for you to submit a solid
> alternative proposal. You've mentioned several ways forward, could you
> pick the winner as far as you see it and we can discuss that in order to
> make the conversation a bit more concrete?
> 
> -- manu
>
Received on Friday, 8 October 2010 03:16:56 UTC