Re: ISSUE-147 (preserve markup by default): RDFa Processors should preserve markup by default [RDFa 1.1 in HTML5]

On Sat, Dec 29, 2012 at 8:25 AM, Ivan Herman <ivan@w3.org> wrote:
> I agree with you and Gregg. The issue on XML Literal has been discussed a
> lot. It wasn't an obvious issue, but the decision has been made.
>
> Procedurally, it is correct to say that this WG has the right to define the
> behaviour of HTML5+RDFa differently for XML Literals and/or for HTML
> literals. However, the discussions for RDFa Core, as well as for
> XHTML1+RDFa, obviously took into account the most important prospective
> deployment of RDFa, i.e., HTML5, too. I also do not see any new evidence in
> this thread that would justify essentially reopening this issue, and
> introducing a major incompatibility between XHTML1, SVG, etc., and HTML5. I
> am sorry, Andreas, but I am definitely not in favour of this change.
>
> Ivan
>
> ---
> Ivan Herman
> Tel:+31 641044153
> http://www.ivan-herman.net
>
> (Written on mobile, sorry for brevity and misspellings...)


Ivan,

 Thank you for your response. And thank you for clarifying that the WG
can adapt the RDFa 1.1 Core document to the specific needs of an
(X)HTML5 context, though I'm sure we all agree that that is inherent
in the process of describing how RDFa works in a host language.

 And again thank you for indicating that there is a requirement for
new evidence. I am not sure why that is the case since this is a new
issue opened on "RDFa 1.1 in HTML", but if that requirement does
exist, I believe this situation meets it. Of course, that keeps us on
procedural issues, which are less interesting than the basic point of
making sure "RDFa 1.1 in HTML" is a well-constructed and robust spec
that can meet the needs of many users. But since the objections to
considering ISSUE 147 seem to be procedural, I'll address those.

New Evidence:

* The existence of RDFa Lite
    When the decision to discard child elements when distilling RDF
triples from RDFa+XML was made (ISSUE 19) [1], RDFa Lite did not exist
[2]. This is relevant because it seems that a reason for discarding
child elements is how "semantic data is consumed in the marketplace"
[3].

 I believe I'm on fairly firm ground in saying that one inspiration
for RDFa Lite was to ease the process by which semantic data is
firstly created and secondly consumed in an SEO oriented
"marketplace". Although I do not use RDFa Lite, I fully recognize its
need and utility. But to the extent that one downstream use of
semantic data is determining default behaviors, I suggest we can now
encapsulate those behaviors in RDFa Lite. That is, RDFa Lite can
retain the default production of Plain Literals when processing HTML5.
In fact, I think that is a perfect use of RDFa Lite, one that was not
possible when ISSUE 19 was decided. This should not lead to ISSUE 19
being re-opened, but rather the current ISSUE 147 being considered on
its merits.


* Use cases in which child elements in (X)HTML5 are not a mistake

I think the fact that such uses can be considered new evidence has already
been recognized by Manu's inclusion of point 2 in his list of items that
the WG should re-examine [4]. So perhaps the following is also my
contribution to that re-examination.

 Here's my use case and some of its history by way of explaining why I
raised this issue now and in the context of "RDFa 1.1 in HTML5":

 As I said, I edit an online scholarly journal "ISAW Papers" [5]. The
end result of the publication process will be the deposition of XHTML
files in the New York University Faculty Digital Archive for permanent
preservation. It is my hope to also use RDFa to encode the semantic
data in these articles. I have begun to do so and an example of the
current state of work can be seen in the publicly accessible version
of ISAW Papers 2 as delivered by the NYU library [6]. If you look in
the source of that page, you will see many @property values which have
child elements, in particular, dcterms:bibliographicCitation. It is
not a mistake that those are there; they are important, and it is
important to me that they be preserved in workflows that process this
data. That is the source of my request that ("non-lite") RDFa
distillers preserve that markup.
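
 To make the difference concrete, here is a minimal Python sketch. It is
not an RDFa processor and does not produce a canonical rdf:XMLLiteral; it
only contrasts "text nodes only" with "content including child markup" for
a single element. The citation text, the lang value, and the function
names are invented for illustration and do not come from ISAW Papers.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment; the citation itself is invented.
FRAGMENT = (
    '<p property="dcterms:bibliographicCitation">'
    'A. Author, <i>An Invented Title</i>, <span lang="la">Opus</span> 1.'
    '</p>'
)


def plain_literal(el):
    """Concatenate the text nodes only, discarding all child markup."""
    return "".join(el.itertext())


def markup_literal(el):
    """Keep the element's content with its child markup intact.

    This is a plain serialization, not the canonicalization that a
    real rdf:XMLLiteral would require.
    """
    parts = [el.text or ""]
    for child in el:
        # tostring() serializes the child together with its tail text.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


el = ET.fromstring(FRAGMENT)
plain = plain_literal(el)
preserved = markup_literal(el)

print(plain)
print(preserved)
```

 In the first result the italicized title and the language annotation are
gone; in the second they survive for any downstream consumer that wants
them.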

 But in terms of procedure and the requirement for new evidence, here
is where I am. For some time now I have been developing ISAW Papers
with XHTML+RDFa 1.0, which did preserve child elements.  The RDFa Core
Rec came out in June. The community's attention turned more firmly to
RDFa in HTML5, as did mine. In the fall I began converting my content
to HTML5 and RDFa 1.1. (If you poke around ISAW Papers you'll see
evidence of an ongoing process...) In making that conversion I saw
that child elements were discarded.  This is a concern to me because
I am hoping to encode my semantic data using RDFa 1.1, then use
standardized tools that extract that semantic data, and then process
that semantic data for both automatic-agent and human consumption. In
all these cases, preservation of markup is extremely important.

It did take me some time to confirm that the discarding of markup was
not caused by error on my part. Knowing now that it is a feature of
the REC, I have now reported my concern in the context of the "RDFa
1.1 in HTML5", and I have made this report while the spec is a Working
Draft. I don't think it is unusual that I have found this issue only
after taking considerable time to work with both specs, and I hope the
realities of my timing aren't determinative in addressing ISSUE 147.
To put that another way, I think I have done the "right thing" in
pursuing the conversion now and in offering my feedback to the WG.

 I do recognize that the development process for RDFa 1.1 Core looked
forward to its deployment in HTML5. Ivan, you noted this above. But I
do think it's important that RDFa Core describes RDF in the context of
generic XML documents and so did not bear the burden of fully
addressing the role of RDFa in HTML5. This seems clear from the
well-understood need for the "RDFa 1.1 in HTML5" product that we are
working on now.

* Accessibility
 I am not an expert in this topic so I raise it with some hesitancy.
But I would like my XHTML to be accessible as defined by the W3C's Web
Accessibility Initiative [7]. I see there that it suggests the use of
HTML markup to achieve its goals, see the suggested use of the dfn
element [8]. It is likely that such elements will end up in RDFa
marked content such as dcterms:abstract. The current "RDFa 1.1 in HTML5"
spec discards that accessibility markup. But let me be clear, I defer to
Shane's greater expertise or any other official feedback from the WAI
on this issue.


 Ivan, on the basis of the above, I wonder if you would be willing to
reconsider your determination that insufficient new evidence has been
introduced in order to consider ISSUE 147 within its stated "RDFa 1.1
in HTML5" context.

 Broadening this discussion slightly, in earlier messages I have made
what I think are substantive points as to why either rdf:XMLLiteral or
rdf:HTML should be the default production when parsing elements that
have child elements. I won't repeat those here, other than to note
that I highlighted issues of language, which I think are especially
relevant as RDFa is deployed in HTML5. It looks like there was some
consideration of language in the teleconference that resolved ISSUE
19, with the suggestion that this is where people might want markup
preserved [9]. I agree and think "RDFa 1.1 in HTML5" is the right
place to pursue that topic.


To sum up with reference to previous messages:

 1) The consideration of ISSUE 147 within the context of the Working
Draft of "RDFa 1.1 in HTML5" is timely.

 2) There is new evidence in the form of the existence of RDFa Lite,
the introduction of a use case in which child elements are not a
mistake, full consideration of multi-lingual issues as they appear in
HTML5 as used in the real world, and the possibility of WAI impact.

 3) There is a substantive case for the default production of
rdf:XMLLiteral and rdf:HTML in the context of HTML5 and its variants.
See "2)" immediately above.

 4) Some solutions have been offered in the form of: restricting the
default discarding of markup to RDFa Lite in HTML5, writing separate
specs for RDFa in XHTML5 and HTML5, reinforcing that elements
without child markup should produce a Plain Literal.

 I do hope the above discussion allows us to move beyond procedural
issues to full consideration of the merits of ISSUE 147.

 My final point is that I hope it's clear that this issue is of great
importance to me. I want to use XHTML+RDFa but this default
behavior is a real impediment. One that I have only recently
discovered. So I've tried to be clear in my
language (though it's hard to be concise!). Please don't take
that as abruptness or rudeness.

 Thank you,

 Sebastian.

[1] http://www.w3.org/2010/02/rdfa/track/issues/19
[2] http://www.w3.org/standards/history/rdfa-lite
[3] http://lists.w3.org/Archives/Public/public-rdfa-wg/2012Dec/0077.html
[4] http://www.w3.org/2010/02/rdfa/track/issues/147
[5] http://isaw.nyu.edu/publications/isaw-papers
[6] http://dlib.nyu.edu/awdl/isaw/isaw-papers/2/
[7] http://www.w3.org/WAI/
[8] http://www.w3.org/TR/2012/NOTE-WCAG20-TECHS-20120103/H54
[9] http://www.w3.org/2010/02/rdfa/meetings/2010-05-13#ISSUE__2d_19__3a__Default_generation_of_XMLLiterals

Received on Saturday, 29 December 2012 19:48:06 UTC