Re: Comments on PR-rdfa-syntax-20080904

Hi Danny,

> I too am now positive about the material (after quite strong initial
> doubts), I just think it could be presented better. Did you get
> any comments from the HTML5 folks? Atom folks? Microsoft? (*grin*)

If you mean did we *solicit* responses from particular groups, then I
don't believe we did. Obviously anyone is free to comment, and groups
that have had a particular interest in RDFa have provided us with the
largest number of comments.


>> So do you think you could be more specific than you have been?
>
> Probably (soon!). I've not been following the list more than
> superficially, but have got the impression that most (if not all) of
> the technical issues have been dealt with. However I don't see this
> document as reflecting that work.

Ok...but hopefully you'll be able to be specific at some point,
otherwise this thread is going to be a little bit repetitive. :)


>>> IMHO it could use compressing, making more formal, and significant chunks
>>> moving to other docs - some of the informative bits to the primer...
>>
>> The primer is aimed mainly at authors, whilst this document is aimed
>> primarily at implementers. The general feeling is that the RDFa spec
>> should be self-contained, and provide everything that an implementer
>> needs to produce an XHTML+RDFa processor.
>
> Reasonable aim.

:)


>> (Also, removing all of the informative material would probably make
>> the document more difficult, not less, since it would be very terse.)
>
> Not sure there, but I may well be wrong.

I think it would definitely be the case.

If we could have two documents working together, and be sure that
people would first read the primer, and then the syntax document
(which is pretty much what you are getting at), then of course moving
informative text out of the normative document would have no effect.

But if we can't be sure that people will consult the primer, and if we
don't want to have two documents moving through the specification
process, then we can only be sure that people will read the syntax
document. And in that case, it seems that removing anything that helps
make things clearer (the informative text) can only lead to a more
terse document.


>>> ...the CURIE def to another spec.
>>
>> You are exactly right, and there is a separate CURIE spec. However,
>> the timing of the specifications means that we can't refer to it
>> normatively from the current RDFa spec.
>
> Ok, nice to hear. I dread the w3c process on that...

You are not alone...


>>> Pragmatically, as it stands I suspect most
>>> publishers/consumers & parser authors will simply get confused.
>>
>> I don't want to sound overly sensitive here, since of course any spec
>> can be improved upon. :)
>
> For sure. All I'm saying is I think you've got all you need, it just
> needs a couple more cycles to make a workable spec.

And all I'm saying... ;)

I don't doubt that you might be right...I'm certainly not going to
argue with you about it. But without specific pointers it's difficult
to know where to go on this.


>> But the number of implementations that *already* exist [1], and the
>> positive comments we have received from reviewers and implementers do
>> make me think that the spec is not that confusing.
>
> c.f. http://cyber.law.harvard.edu/rss/rss.html

I'm not sure what you are getting at, here. Do we need more pictures
of flowers in the spec?

(Only teasing... ;))

But seriously, that document is a high-level quick introduction to
RSS, but it fails as a model document on so many levels I don't know
where to start.

Just to pick an example, we haven't even got out of the introduction
and we have:

  Subordinate to the <rss> element is a single <channel> element, which contains
  information about the channel (metadata) and its contents.

Are you really saying this is an example of simplicity? Putting aside
the awkward wording, we don't ever find out what a 'channel' is!


>> However, if you have very specific proposals, they would be most welcome.
>
> Hmm. Did my best this time.

With respect, you said you had the 'impression' that things 'seemed'
incomplete, or 'messy'. But with the best will in the world, it's
difficult to do anything with the impressions, unless they are
transformed into something more concrete.


>>> while Relax NG might not be as widely adopted as DTDs, for the purposes of a
>>> specification like this, such a description would be a lot more helpful than
>>> the DTDs
>>
>> This module has been designed specifically to work with XHTML
>> Modularisation, which currently uses DTDs. You are exactly right about
>> Relax NG though, and Shane has been working on a new version of M12N
>> that uses it. But that is a little way off, and for now DTDs are the
>> only formally supported mechanism (i.e., in the sense of the
>> specifications).
>
> Fair enough.

:)


>>> the distinction rendered data vs. structured data doesn't seem clear
>>
>> We aren't really concerned with the rendered data. What are you
>> thinking in particular that we should clarify?
>
> Hard to express without waving my arms around - how about microformats
> with @profile compared to microformats without @profile..?

What I meant was, since you could glean RDF from an XHTML document
that contains RDFa, independent of whether that document ends up in a
browser or not, the rendering side is irrelevant. You'll see that most
of the current implementations, for example, are server-side, and so
obviously have no rendering capability.

Perhaps that clarifies why we don't mention rendering -- but of course
if there is some reason that we *should* then please raise it.
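
As a rough illustration of that point, here is a minimal sketch of
pulling triples out of an XHTML document with no browser or rendering
step involved. It is not the processing model from the spec: it only
looks at @about, @rel/@href, @property and @content, leaves CURIEs
unexpanded, and ignores @resource, @rev, @typeof, @datatype and
chaining. The function name and the example.com URIs are made up for
the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document; the URIs are placeholders.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <body>
    <div about="http://example.com/doc">
      <span property="dc:title">An example</span>
      <a rel="dc:creator" href="http://example.com/people/mark">Mark</a>
    </div>
  </body>
</html>"""


def extract_triples(xhtml_text):
    """Return (subject, predicate, object) tuples from a small RDFa subset."""
    root = ET.fromstring(xhtml_text)
    triples = []

    def walk(element, subject):
        # @about sets the subject for this element and its descendants.
        subject = element.get("about", subject)
        # @rel plus @href gives a triple whose object is a resource.
        if element.get("rel") and element.get("href"):
            triples.append((subject, element.get("rel"), element.get("href")))
        # @property gives a literal, taken from @content or the text content.
        if element.get("property"):
            value = element.get("content")
            if value is None:
                value = "".join(element.itertext())
            triples.append((subject, element.get("property"), value))
        for child in element:
            walk(child, subject)

    walk(root, "")
    return triples


for triple in extract_triples(DOC):
    print(triple)
```

Nothing here depends on how (or whether) the document is displayed,
which is why a server-side tool can do the same job.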


>>> how does a parser distinguish between intentional RDFa and HTML tag soup?
>>
>> It doesn't. In general the use of existing XHTML attributes like @rel
>> and @href will generate triples that one could reasonably infer from a
>> standard HTML document. And since @about, @datatype and other such
>> attributes don't exist already, then we don't think we're going to get
>> any 'false positives'.
>
> I'm happy to let this one pass, despite utopian (well, URI-based
> extensibility) aspirations :-)

:)


>> Can I just ask for clarification, whether you are asking this
>> question, merely because you think the spec should be clearer? Or
>> whether you are asking because you have a strong opinion on this? If
>> the latter, then it would be worth raising your doubts in a separate
>> thread, since there has been a great deal of discussion about this
>> issue in the past.
>
> As above.

Similarly utopian aspirations, you mean? :)

As it happens, I'm hoping that you'll find that issues like
extensibility have been covered pretty well.


>>> While I appreciate the intent, I believe this statement to be wrong -
>>> accurate communication (of data) requires both the producer and consumer to
>>> understand the language. Suggest rewording to something along the lines of:
>>> "...authors don't need complete understanding of RDF to use it"
>>
>> I don't think it is true that an author needs to understand RDFa.
>> There are many scenarios where they might need to do little more than
>> cut-and-paste. The full sentence that you have quoted from, says this:
>>
>>  Although RDFa is designed to be easy to author—and authors don't need to
>>  understand RDF to use it—anyone writing applications that consume RDFa
>>  will need to understand RDF.
>
> I stand by what I said. Cut & paste is a cheesy near-term UI thing.
> Communication from producer to consumer goes deeper. Could I trust my
> bank account on this?

Right...but what I was trying to emphasise was that the paragraph you
refer to says that authors don't *need* to understand RDF, but
implementers probably will.

I also don't think that cut-and-paste will be the normal mode of
operation for RDFa users. But I do believe that with the primer they
could make good use of RDFa without understanding RDF, and in fact I
believe the interest in, and uptake of, Microformats over the years,
has illustrated this pretty well.


>>> * 3.7. Graphs
>>>
>>> appears unfinished...
>>
>> How so? A graph is a collection of triples.
>
> Perhaps just ref the relevant bit in the RDF specs.

Yes. That also relates to your more general comment about linking to
the definitions.


>>> * 4.1. Document Conformance *
>>>
>>> This seems a bit messy...
>>
>> You seem to use a lot of this kind of wording in your comments.
>
> Yup. Plain English.

Not really. :)

If you think it's messy, then you should say "It's messy", and point
to where it is messy. That would be "plain speaking". No-one would cry
about it, because as you imply, plain speaking is good. :)

But you didn't say "it's messy"; you said "it *seems* to be messy",
and you didn't point to anything in particular.

That's not plain speaking, but an 'impression', and whilst you're
perfectly entitled to voice impressions, I'm not sure how they help
move the spec forward.

I hope you don't think I'm being over-sensitive here. But hopefully
you can see my point, because after saying "it seems messy", the only
example you gave was that there was no reference to @version.

...and there was. :)


>>> ... both substantively & editorially. For starters, where
>>> is @version in the http://www.w3.org/1999/xhtml namespace?
>>
>> I don't know what you mean by that. In section A.3 [3] you'll find
>> references to it in the DTD. Is that what you mean, or do you mean
>> something else?
>
> It's another thing I now feel ok to let pass, but I can't deny a
> little discomfort at post hoc additions to an established vocabulary
> (the XHTML namespace).

That's an interesting comment. The whole point of this module is to
add new items to the vocabulary, using the extension mechanism of
XHTML Modularisation -- so post hoc additions are exactly what we're
doing.

So if you missed that point, we most definitely need to make it
clearer up front.


>>> * 4.3. RDFa Processor Conformance *
>>>
>>> "A conforming RDFa Processor MAY make available additional triples that have
>>> been generated using rules not described here, but these triples MUST NOT be
>>> made available in the [default graph]. (Whether these additional triples are
>>> made available in one or more additional [RDF graph]s is
>>> implementation-specific, and therefore not defined here.)"
>>>
>>> This seems over-constrained. If I have a doc which contains RDFa plus GRDDL
>>> plus [something not yet defined] RDF-in-HTML data, I would expect them to at
>>> least be able to be interpreted as a single graph. i.e. the graph scope
>>> should be the document, not the RDFa processor's interpretation. (That's
>>> assuming "default graph" is meant to mean what I think - it's not defined
>>> here as far as I can see). I don't see how Appendix C. Deployment Advice
>>> fits in here either.
>>
>> The problem with the approach you suggest is that it is then difficult
>> to establish what is and is not a conforming processor. If I were to
>> add a rule to my processor that added triples based on the value of
>> @class or @alt, would that be conforming or not? If I had a processor
>> that ignored certain triples, would that be conforming?
>
> Well not really, you just say the subgraph provided by RDFa, and leave
> the rest open.

But how do we find that subgraph when it comes to verifying implementations?


>> So the approach we took was to ensure that there would be *at least*
>> one graph that represented all triples in the XHTML document that
>> could be obtained with the core rules. This graph is required
>> *exactly*, i.e., it should not contain more or fewer triples than the
>> rules state should be obtained.
>
> That's fine - but I think "default graph" is the wrong terminology.

Again, that's interesting.

A few people have expressed dissatisfaction with this terminology, but
no-one has suggested an alternative.

It's a bit tricky to respond to this, especially when no-one has
pursued this argument very forcefully.

So...putting you on the spot here.... :) do you have a suggestion as
to how this should be changed? (Bearing in mind that putting all
triples into one graph means that you can't find a particular set of
triples again.)


>> Now, if some processor wants to then interpret Microformats like
>> @class="hcard", or to run a GRDDL transform, or whatever, they can do,
>> but those triples should not be part of the default graph. This allows
>> people to experiment with new features in RDFa without users having to
>> worry that the processor they are using is 'non-conformant'.
>
> I don't see any problem with X triples generated by the RDFa processor
> and Y triples generated by the GRDDL processor being analyzed
> separately for conformance purposes, yet both being part of the graph
> expressed by the document. If this doesn't convince you, I'll do my
> best to make the case better.

I understand what you are getting at. But the scenario we were more
concerned about was that someone comes along, and adds to their
processor support for "DC.creator", or something like that. In the
'put it all in together' model, there is no way to say that this
processor is, or is not, conforming.

Now, let's say in the future that we want to enhance RDFa and/or
CURIEs in such a way that "DC.creator" is processed in the core rules.
There are many ways to address this; one might be to use "DC." as a
prefix, whilst another would be to simply map "DC.creator" to a full
URI (in the same way that we do for other CURIEs).

And of course, there may be other solutions.

The point is, no-one has done the work yet to decide which is better,
and we don't want to be forced to go down one route in the future,
because implementers have added a certain technique to their
processors.

So by dealing in separate graphs, we can say to implementers, by all
means process 'extra' data that you find useful in your application,
and use whatever rules you like to do that processing. But please
place the triples you extract in a separate graph, because in the
future we may come up with some new rules for processing that 'extra'
data, and those triples could go into the 'default graph'.
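
The separation described above can be sketched as follows. This is
only an illustration under assumptions of my own: the class and
function names, the graph name "urn:example:dc-meta" and the
example.com URI are all hypothetical, and the spec deliberately does
not define how extra graphs are exposed.

```python
class TripleStore:
    """One default graph for the core rules, plus named extension graphs."""

    def __init__(self):
        self.default_graph = set()
        self.extension_graphs = {}

    def add_core(self, triple):
        # Triples produced by the core processing rules.
        self.default_graph.add(triple)

    def add_extension(self, graph_name, triple):
        # Triples produced by any other rule go into a separate graph.
        self.extension_graphs.setdefault(graph_name, set()).add(triple)


def is_conforming(store, expected_core_triples):
    """The default graph must match the expected triples exactly."""
    return store.default_graph == set(expected_core_triples)


DOC_URI = "http://example.com/doc"
CORE = (DOC_URI, "http://purl.org/dc/elements/1.1/title", "An example")
EXTRA = (DOC_URI, "http://purl.org/dc/elements/1.1/creator", "Mark")

store = TripleStore()
store.add_core(CORE)
# An experimental "DC.creator" rule keeps its output out of the default graph.
store.add_extension("urn:example:dc-meta", EXTRA)
print(is_conforming(store, [CORE]))
```

Because the check only ever looks at the default graph, a processor
can add as many experimental rules as it likes without changing the
answer to "is it conforming?".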


>>> * 5.2. Evaluation Context *
>>> This section seems loosely defined for normative material. I don't think
>>> it'd take much effort to tie it to the XML DOM, in a similar fashion to 5.5.
>>> Sequence (SAX).
>>
>> Well, that's two separate statements. I assume you are not saying
>> "it's loosely defined because it doesn't use DOM terminology"?
>>
>> I believe you are saying that (a) it's loosely defined, and (b) you
>> think it would be better defined in terms of the DOM.
>
> It's pretty much the same thing repeated, except the 5.2 section is
> quasi-DOM. (I very much mean the XML DOM, btw)

Yes, but as Toby pointed out, one of the core ideas of RDFa (in
general, as opposed to the specific language "XHTML+RDFa"), is that it
can be applied to hierarchical markup languages like HTML, as well as
XHTML.

Now, of course you are right that, since this document is very
specifically about *XHTML* we could use XML DOM terminology. But that
would then mean that when we create the HTML+RDFa document, we'd have
to change the terminology from being XML-specific to just being about
a DOM.


>> I have to disagree that it's loosely defined, but would welcome any
>> precise comments you have on where it could be improved. I think a
>> general statement that it could be improved by adopting DOM
>> terminology is not really good enough, since even after making the
>> change to please you, someone else could come along and say that they
>> think it is loosely defined because it doesn't use the post-schema
>> validation infoset terminology...and so on.
>>
>> So if you think it's loosely defined, I think you need to say why, in
>> terms of the spec as it is now.
>
> Time permitting ;-)

:)


>>> Use of [] on the CURIE attributes seems inconsistent.
>>
>> Again, forgive me if I ask you for something more precise than "seems".
>>
>> Where exactly is it inconsistent? There are places where we need to
>> 'escape' CURIEs so that they don't get mixed up with URLs, but are you
>> saying that even after taking that into account there are
>> inconsistencies?
>
> Might just be my myopic reading.

Maybe. :) But the thing is that you are right that in some places we
use CURIEs, and in some places we have to use SafeCURIEs. So my
question is more about whether there are places where you think we
haven't explained that well enough.
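
To make the distinction concrete, here is a small sketch of how an
attribute that accepts either a URI or a SafeCURIE (as @about and
@resource do) might be resolved. It is a simplification: the function
names and the prefix map are hypothetical, and it ignores relative URI
resolution, the empty prefix and blank nodes ("_:").

```python
# Assumed prefix mapping, as if collected from xmlns:dc on an ancestor.
PREFIXES = {"dc": "http://purl.org/dc/elements/1.1/"}


def expand_curie(curie, prefixes):
    """Expand 'prefix:reference' using the in-scope prefix mappings."""
    prefix, separator, reference = curie.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError("cannot expand CURIE: %r" % curie)
    return prefixes[prefix] + reference


def resolve_uri_or_safe_curie(value, prefixes):
    """Treat '[...]' as a CURIE to expand; anything else is already a URI."""
    if value.startswith("[") and value.endswith("]"):
        return expand_curie(value[1:-1], prefixes)
    return value


print(resolve_uri_or_safe_curie("[dc:title]", PREFIXES))
print(resolve_uri_or_safe_curie("http://example.com/doc", PREFIXES))
```

Without the square brackets there would be no way to tell whether a
value such as "dc:title" was meant as a CURIE or as a URI with the
scheme "dc", which is why the escaping is needed on those attributes
and not on ones like @rel or @property that only take CURIEs.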


>>> unnecessary repetition of HTML defs - a single example would do (i.e.
>>> rel="cite")
>>
>> I think the def's example is a reasonable one, since it's something
>> that people do a lot.
>
> The example is good, but listing the other values shouldn't be
> necessary (since they're derived from elsewhere).

Do you mean the other values for @rel?


>>> various places:
>>> mark-up => markup
>>
>> Interesting. :) I always use the former, but I notice that the W3C
>> 'house style' is to use the latter. Speaking for myself, I agree that
>> this should be changed.
>
> Given your forename, that's understandable ;-)

You'll notice that I've even started using it myself...like in this email. :)


> [snip]
>
>>> consider merging 3.1 Statements with 3.2 Triples
>>
>> I think the current structure distinguishes between the quasi-human
>> readable 'statement', and the machine-readable 'triple', which is
>> useful for those new to RDF.
>
> Ok, I can accept that.

Thanks.


>>> *** Nitpicking ***
>>>
>>> * Abstract *
>>>
>>> "The modern Web is made up of an enormous number of documents that have been
>>> created using HTML."
>>> =>
>>> "The current Web is primarily made up of an enormous number of documents
>>> that have been created using HTML."
>>
>> I like "primarily", although I don't mind either way on "modern" and
>> "current". :)
>
> Good-o. It seems a shame to discount all that RDF/XML, RSS, Atom, PDFs and p0rn.

:)


>>> "RDFa is a specification for attributes to express structured data in any
>>> markup language."
>>> Is RDFa specified for any language other than XHTML? Where?
>>
>> Not by us, but Yahoo! have incorporated RDFa into DataRSS [4] which
>> can in turn be carried in Atom and RSS.
>
> Ok, but *any* seems a bit broad. How do you do it in OpenID Attribute
> Exchange..? The email markup spec? (whose RFC numbers I should
> remember, but can't).

That will be for another spec to define. But essentially, as long as
there is hierarchy in your documents, you can add RDFa.

In fact, it's better than that. The DTD module for XHTML+RDFa that
Shane has done -- the one that you would prefer to be in RelaxNG ;) --
could be included into your language, and then you could validate
documents that contain your language plus RDFa.


>> The general point is that as we refined the spec, we always ensured
>> that nothing we did would jeopardise the consistent use of the same
>> rules in other markup languages.
>
> XML markup, even HTML5...but httpd log files?

Do you mean that we should be more specific that we are dealing with
SGML and XML? I guess the use of a DTD to extend your language implies
that, but perhaps it could be clearer.

(Although I take Toby's point that we could apply RDFa to JSON
objects. But that can be left to a discussion about RDFa 'in
general'.)

Having said all of this, at the end of the day, this spec is about
XHTML+RDFa, so any comments about the broader applicability of RDFa
are merely intended to be part of 'setting the scene', and perhaps to
give clues to anyone that wants to add RDFa to their own language.


>>> * Motivation *
>>> "RDF/XML [RDF-SYNTAX] provides sufficient flexibility to represent all of
>>> the abstract concepts in RDF [RDF-CONCEPTS]."
>>> - with certain limitations, e.g. properties which can't be expressed as
>>> qnames can't be serialized as RDF/XML, e.g. http://example.com/1234
>>
>> Good point.
>>
>> How about "most of"? :)
>
> Works for me.

Thanks.


>>> 'hard-wired' - not sure everyone will understand, anyone got a synonym?
>>
>> Really? Speaking only for myself, I don't mind if it's changed, but I
>> don't think it's an unusual expression.
>
> I'm just assuming not all modern programmers burnt their fingers on a
> soldering iron :-)

He he.

However, 'hard-wired' is also a common expression nowadays, in
relation to behaviour, psychology, etc. So whilst I'm happy to see it
changed, I still maintain that it is not an unusual expression.


>> Thanks for your comments, and once again, just to point out that this
>> isn't a formal reply, but more an attempt to get some clarification on
>> some of your issues.
>
> Thank you.

Thanks again.

Mark

-- 
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)

Received on Thursday, 11 September 2008 10:07:09 UTC