ISSUE-67 triage from Toby Inkster on 2011-01-25 (public-rdfa-wg@w3.org from January 2011)

From: Toby Inkster <tai@g5n.co.uk>
Date: Tue, 25 Jan 2011 23:16:43 +0000
To: RDFa WG <public-rdfa-wg@w3.org>
Message-ID: <1295997403.4229.10.camel@ophelia2.g5n.co.uk>

Here are Henri Sivonen's comments (HS) and my replies (TI). In
paragraphs where I suggest an editorial change, I've included the marker
"[EDIT]". In paragraphs where I suggest opening a new issue, I've
included the marker "[ISSUE]".

I've divided Henri's comments into three broad categories - "Editorial
Fixes", "Won't Fix" and "Needs Addressing By WG" - though there's some
overlap. (For example, there are issues where we won't make any technical
changes to RDFa, but could make editorial changes to clarify the spec if
the WG thinks this is needed.)

----

Editorial Fixes
===============

HS: It seems questionable that formsplayer.com (site of a product that
one of the Editors has a commercial interest in) is used in an example.

TI: Please replace with something like example.com. [EDIT]

HS: The Creative Commons license example in section 2.2 uses the
anti-pattern of saying "a Creative Commons license" (instead of saying
which one of the numerous licenses) in the human-readable prose.

TI: Please include the name of the license in this example - possibly
mark it up in RDFa as the license's dc:title or rdfs:label. [EDIT]

HS: The processing model for the case where the optional xmlns:prefix
feature is supported isn't specified.

TI: Step 4 in the processing sequence could be expanded upon. Need to
indicate that @xmlns:* is processed first, if permitted by host
language; then @prefix which can override the previous mappings from
@xmlns:*. [EDIT]
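
A minimal sketch of the ordering described above, in Python. It is an
illustration only, not text from RDFa Core: the function name, its
parameters and the host_allows_xmlns flag are all hypothetical, and the
mappings are assumed to have been extracted from the attributes already.

```python
def merge_prefix_mappings(inherited, xmlns_mappings, prefix_mappings,
                          host_allows_xmlns=True):
    """Return the prefix mappings in scope for one element.

    All names here are hypothetical. Each argument is a dict of
    prefix -> URI: `inherited` comes from the parent element,
    `xmlns_mappings` from @xmlns:* and `prefix_mappings` from @prefix.
    """
    mappings = dict(inherited)           # copy, so the parent's scope is untouched
    if host_allows_xmlns:
        mappings.update(xmlns_mappings)  # @xmlns:* is applied first...
    mappings.update(prefix_mappings)     # ...then @prefix, which can override it
    return mappings


in_scope = merge_prefix_mappings(
    inherited={"foaf": "http://xmlns.com/foaf/0.1/"},
    xmlns_mappings={"dc": "http://purl.org/dc/elements/1.1/"},
    prefix_mappings={"dc": "http://purl.org/dc/terms/"},
)
```

With these inputs the 'dc' mapping from @prefix replaces the one from
@xmlns:dc, while the inherited 'foaf' mapping is left alone.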

HS: It's weird that the prefix attribute requires a single space between
the colon following the prefix and the URI but allows multiple spaces
between the URI and the next prefix.

TI: This seems to be a mistake. We should allow one or more whitespace
characters in both cases. We need to make sure the regular expression or
algorithm for extracting mappings from @prefix can cope with that (see
next question). [EDIT]

HS: If the spec contains rules for how to extract a set of prefix to URI
mappings from the prefix attribute, the rules are hard to locate.

TI: We used to have a regular expression for parsing the prefix mapping.
(Perhaps on the rdfa.info wiki?) Could this be added to RDFa Core, or as
an alternative a detailed algorithm for parsing @prefix? [EDIT]
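
As a starting point for discussion, here is a rough sketch of such a
regular expression in Python. It is not the expression from the wiki or
from any RDFa draft: the pattern and the name parse_prefix_attr are
assumptions, and a prefix is simplified to "one or more characters that
are neither whitespace nor a colon" rather than a proper NCName. It
accepts one or more whitespace characters both after the colon and
between mappings.

```python
import re

# Hypothetical pattern: a prefix at the start of the value or after
# whitespace, a colon, one or more whitespace characters, then a URI
# (taken here to be any run of non-whitespace characters).
PREFIX_MAPPING = re.compile(r"(?:^|\s)([^\s:]+):\s+(\S+)")


def parse_prefix_attr(value):
    """Extract prefix -> URI mappings from the value of @prefix.

    If a prefix is repeated, the later mapping replaces the earlier one;
    that choice is an assumption of this sketch.
    """
    return dict(PREFIX_MAPPING.findall(value))


mappings = parse_prefix_attr(
    "dc:   http://purl.org/dc/terms/\n\tfoaf:\thttp://xmlns.com/foaf/0.1/"
)
```

The colon inside "http://" is not mistaken for a prefix separator,
because the pattern requires whitespace immediately after the colon.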

Won't Fix
=========

HS: To be Compatible with Existing Content, RDFa 1.1 doesn't need to be
backwards compatible in the sense of parsing the same triples out of any
valid RDFa 1.0 input as RDFa 1.0. Instead, it needs only to produce the
right triples for the content that's already out there. Thus,
Compatibility with Existing Content could be mostly achieved by
hard-coding the meanings of the common prefixes used in deployed
content that purports to use RDFa.

TI: There exist at least three commonly-used prefixes that I can think
of that this would not work for: 'dc', 'v' and 'og'. 'dc' is variously
used to abbreviate <http://purl.org/dc/terms/> and
<http://purl.org/dc/elements/1.1/>; 'v',
<http://rdf.data-vocabulary.org/#> and
<http://www.w3.org/2006/vcard/ns#>; 'og',
<http://www.opengraphprotocol.org/schema/> and <http://ogp.me/ns#>.
While in two of these cases the different URIs merely indicate different
versions of the same or similar sets of terms, switching them - as
inevitably would happen if any of these prefixes were to be hard-coded
ignoring the document's own mappings - would be unacceptable for any
format that purports to be a serialisation of RDF.
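
To make the 'dc' case concrete, here is a small Python sketch. The
expand_curie helper and the HARD_CODED table are hypothetical, and the
choice of which URI a processor might hard-code for 'dc' is an
assumption made purely for illustration.

```python
def expand_curie(curie, mappings):
    """Expand 'prefix:reference' using the given prefix -> URI mappings."""
    prefix, _, reference = curie.partition(":")
    return mappings[prefix] + reference


# Hypothetical built-in table of the kind suggested in the comment.
HARD_CODED = {"dc": "http://purl.org/dc/terms/"}

# Mapping declared by a document that uses the older Dublin Core namespace.
document_mappings = {"dc": "http://purl.org/dc/elements/1.1/"}

intended = expand_curie("dc:title", document_mappings)
forced = expand_curie("dc:title", HARD_CODED)
```

Here 'intended' and 'forced' are different URIs, so a processor that
ignored the document's own mapping would emit a triple with a predicate
the author never wrote.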

TI (cont'd): Further, from a more theoretical standpoint, blessing a
finite set of vocabularies and forcing others to use a more long-winded
notation discourages truly decentralised development of vocabularies.
Decentralised development is an important use case which RDFa and RDF
more generally were developed to cater for.

HS: I reiterate my previous comment that prefix-based indirection
confuses authors and complicates implementation. Please use absolute
URLs only instead of CURIEs. I'm not going to elaborate on this point,
because I realize that the WG isn't going to change this.

TI: Prefix-based abbreviation of URIs is considered a feature of RDFa.
Dropping support for it would, in our opinion, harm the usability of the
language. It would also severely break backwards compatibility, as RDFa
1.0 in many places *only* supports CURIEs, and disallows full URIs. This
working group is chartered to maintain backwards compatibility except in
one particular area.

TI (cont'd): That having been said, there are some editorial changes we
should consider, including shifting the examples where full URIs are
used earlier in the RDFa Core, before CURIEs are introduced. This would
make it clearer that CURIEs are an optional abbreviation for URIs,
rather than URIs being an unusual way of expressing CURIEs. [EDIT]

TI (cont'd): Further, in examples where we are not specifically trying
to illustrate CURIEs, we could take the default position of using
absolute URIs. This includes all the examples in section 7.3 of RDFa
Core. [EDIT]

HS: Loading prefix definitions from an external file seems to make RDFa
brittle in case the external file can't be loaded. Also, blocking RDFa
processing in order to do IO to fetch the prefix definitions complicates
implementation.

TI: The working group has attempted to limit, as much as possible, the
harm done when external profiles are unavailable.

TI (cont'd): In a hypothetical world where Youtube and Flickr both
implemented RDFa using profiles, then the brittleness of their links to
profile documents seems secondary to the brittleness of their links to
video and image files respectively. If the latter is not seen as
problematic, then it's not clear why the former should be.

TI (cont'd): Profiles are an optional feature of RDFa, and an author who
is particularly concerned with the longevity of their data is perhaps
best advised to avoid them. We may wish to explicitly state this
somewhere in RDFa Core. [EDIT]

HS: (This is a general RDF problem but...) It seems author-hostile to
require authors to specify the datatype of e.g. date literals instead of
making the datatype of a property a characteristic of the property in
the vocabulary/ontology.

TI: This is something perhaps better addressed further up the RDF stack.
We note that an RDF working group has recently been established by the
W3C.

HS: It seems unfortunate to use XML Schema Datatype as an example
considering how much weird variability XML Schema Datatypes allow.

TI: With the exception of rdf:XMLLiteral, XML Schema datatypes are the
most commonly used datatypes in RDF data in the wild. It makes sense to
use these datatypes rather than perhaps theoretically "better" but less
realistic examples.

HS: xmlns:prefix is marked as an optional feature. Please remove the
feature altogether, because xmlns:prefix parses differently in text/html
and application/xhtml+xml which are the media types most likely to be
used to transfer RDFa.

TI: We cannot remove this feature, as it would break compatibility with
RDFa 1.0. We are chartered to retain compatibility.

TI (cont'd): Further, we are not aware of any RDFa implementors who have
had problems caused by this difference. A number of RDFa implementors
have built processors that accept DOMs generated by both HTML5 and XML
parsers, with no reported difficulties.

HS: It's a bad idea to use xs:anyURI as part of a definition, because
the meaning of xs:anyURI has shifted over time and, IIRC, now any string
is an xs:anyURI.

TI: RDF's use of XML Schema datatypes differs from XML's use of them -
they are not merely syntactic checks, but impart meaning to literals.
Thus, in RDF terms, there is a difference between stating a property's
range is xsd:string versus xsd:anyURI, even if they allow syntactically
the same literal values.
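
A small Python sketch of that distinction. The Literal class below is a
hypothetical stand-in for a typed RDF literal, not the API of any RDF
library; it relies only on the fact that two RDF literals are the same
term only if both their lexical forms and their datatype URIs match.

```python
from typing import NamedTuple

XSD = "http://www.w3.org/2001/XMLSchema#"


class Literal(NamedTuple):
    """Hypothetical typed literal: a lexical form plus a datatype URI."""
    lexical: str
    datatype: str


as_string = Literal("http://example.com/", XSD + "string")
as_uri = Literal("http://example.com/", XSD + "anyURI")

# Identical lexical forms, but different datatypes, so different terms.
same_lexical = as_string.lexical == as_uri.lexical
same_term = as_string == as_uri
```

Here same_lexical is True while same_term is False: the datatype is part
of the literal, not just a check applied to its characters.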


Needs Addressing By WG
======================

HS: The concept of "processor graph" seems to be an open-ended loophole
of non-interoperability.

TI: We should open a new issue for this. [ISSUE]

HS: Under 4.1 the statement about whitespace seems to say that authors
should assume non-conforming processors.

TI: Would anyone object to removing the second and third sentences from
the last paragraph of section 4.1? [EDIT]


-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>
Received on Tuesday, 25 January 2011 23:17:25 UTC