- From: Eduard Pascual <herenvardo@gmail.com>
- Date: Fri, 22 May 2009 12:26:23 +0200
On Thu, May 21, 2009 at 5:19 PM, Toby Inkster <mail at tobyinkster.co.uk> wrote:
> On Thu, 2009-05-21 at 13:26 +0200, Eduard Pascual wrote:
> [... lots ...]

I won't go point by point through your reply either, but there are
some points worth answering.

> CSS was invented as a way to separate out content from styling. Or to
> put it another way, to separate out data and presentation, which allows
> the same data to be re-presented (or indeed represented) in many
> different ways. The unobtrusive scripting "movement" (for want of a
> better word) aims to separate out behaviour from data, which I think is
> also a worthy ideal. But I consider the information which RDFa carries
> to be very strongly part of the document's *data*, so not especially
> suitable for separating out.

The way you describe CSS really makes it look too different from CRDF
and similar approaches. I see it somewhat differently: just as CSS
describes how content should be conveyed to humans, CRDF describes how
it should be conveyed to machines. Described that way, they suddenly
look quite parallel; so I'll stay on neutral ground and take these as
just different points of view.

It's important to state that CRDF is *not* intended to take *all* the
semantics *out* of the document. Even in the most extreme cases, it is
intended to move *some* *descriptions* of those semantics somewhere
more centralized (an external file if they are to be shared by several
documents, the document header if they are to be widely used across
the document, etc).

> (This consideration very much effected the design of RDF-EASE. You'll
> note that the -rdf-about and -rdf-content properties which it defines do
> not allow the author to hard code data into the RDF-EASE file -- they
> only allow the author to specify an attribute from the (X)HTML file
> where the data can be found.)

This makes a lot of sense.
Actually, RDF-EASE is meant to always be placed in an external file,
so it's reasonable to disallow things that just shouldn't go in an
external file. CRDF, on the other hand, is designed to work as an
external file, as an embedded piece of code (a.k.a. a <script>, in
HTMLish terms), or inline within the document; and, most prominently,
to combine these forms as appropriate for each case. It also tries to
have a syntax and content model that is consistent across all three
usages. As a result, features that are mostly intended for inline
usage are also allowed when CRDF is used as an external file; but this
doesn't mean that such usage is either intended or advisable.

To give a clearer example, should CSS forbid constructs like
"h1:not(h1)"? (Hint: they are allowed.) Some things just make no
sense, but are allowed because explicitly forbidding them would add
unneeded complexity to the format. My plan was to follow CSS's good
example, adding informative notes on things that are implicitly
allowed but make no sense or are inadvisable, rather than going for
explicit prohibitions.

Keep in mind that, in external files or scripts, the kind of usage to
expect would be something like this:

    .person { @|subject: blank() }
    .person time.dob { foo|birthdate: foo|date(attr(datetime)) }
    /* foo|date(...) is the explicit datatype notation */

Rules of the form "prefix|property: literalvalue" are only intended
for inline usage. Actually, trying to use them externally would be
quite hard, unless an author can be sure that all the elements matched
by a selector actually share the value (and if they do, what would be
wrong with stating it just once?).

> [... some stuff about how will English change in a thousand years ...]
>
> A great help in clarifying your usage of terms is the inclusion of a
> glossary. For example, I could write:
>
> <dl>
>  <dt>name</dt>
>  <dd>
>    A name is a label for a noun, (human or animal,
>    thing, place, product [as in a brand name] and even an
>    idea or concept), normally used to distinguish one from
>    another.
>    (<a href="http://en.wikipedia.org/wiki/Name">source</a>)
>  </dd>
> </dl>
>
> With RDFa, the idea of a glossary can be used to reduce our reliance on
> external vocabularies:
>
>  <dl xmlns:foaf="http://xmlns.com/foaf/0.1/"
>    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
>  <dt about="[foaf:name]" property="rdfs:label">name</dt>
>  <dd about="[foaf:name]" property="rdfs:comment" datatype="">
>    A name is a label for a noun, (human or animal,
>    thing, place, product [as in a brand name] and even an
>    idea or concept), normally used to distinguish one from
>    another.
>    (<a rel="rdfs:seeAlso"
>    href="http://en.wikipedia.org/wiki/Name">source</a>)
>  </dd>
> </dl>
>
> This doesn't completely eliminate the risk, but goes a long way to
> mitigating it.

Agreed. But CRDF would also allow that kind of glossary. What is your
point here?

Again, let me insist that CRDF in an external file is only one of its
possible usages. Actually, it only makes sense when the file holds
rules that apply to multiple documents (otherwise, <script> or inline
uses would work better). If an author already cares about keeping
several documents live, then keeping one extra .crdf file live as well
shouldn't be too difficult.

Please don't be misled by Tab's "favoritism" towards external .crdf
files. While they are a useful tool for some of the cases, they do not
cover all the cases. <script> and inline uses are equally important
and, IMO, one of the strongest points of CRDF is that it provides a
unified syntax for all three usages, rather than having to rely on
different formats for each thing (for example, using RDFa for inline
stuff and EASE for external stuff would be, in the best case, messy).

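As a rough aid to reading the quoted RDFa glossary, here is a minimal
Python sketch of how its CURIEs expand against the two xmlns:* prefix
mappings given in the markup. The function names (expand_curie,
expanded_triples) and the GLOSSARY list are illustrative assumptions:
the triples were written out by hand from the quoted markup, not
produced by any RDFa parser, and the rdfs:comment text is abbreviated.

```python
# Prefix mappings copied from the xmlns:* attributes in the quoted markup.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand 'prefix:reference' (optionally wrapped in [ ]) to a full IRI."""
    curie = curie.strip()
    # The about="[foaf:name]" form wraps the CURIE in square brackets.
    if curie.startswith("[") and curie.endswith("]"):
        curie = curie[1:-1]
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + reference


# Hand-written reading of the glossary as (subject, predicate, object).
# This is an assumption for illustration, not the output of a parser.
GLOSSARY = [
    ("[foaf:name]", "rdfs:label", "name"),
    ("[foaf:name]", "rdfs:comment", "A name is a label for a noun, ..."),
    ("[foaf:name]", "rdfs:seeAlso", "http://en.wikipedia.org/wiki/Name"),
]


def expanded_triples(rows=GLOSSARY):
    """Return the triples with subject and predicate CURIEs expanded."""
    return [(expand_curie(s), expand_curie(p), o) for s, p, o in rows]


if __name__ == "__main__":
    for triple in expanded_triples():
        print(triple)
```

The sketch only covers prefix expansion; it says nothing about how a
CRDF sheet would select the same elements.
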
>> The reduced number of attributes in CRDF is not aimed to deal with
>> complexity; but with a separate issue: it is easier for a host
>> language to add a rel value for <link>s and an extra attribute with no
>> predefined name, than the bunch of attributes RDFa defines.
>
> Not just an extra rel value for <link>, but in some languages it would
> involve introducing the <link> element to begin with. The cost of
> introducing a new element is significantly higher than new attributes,
> given that in most implementations of XML-like languages, unknown
> attributes are generally ignored.

Please review "3.1. Linking to CRDF sheets" about this. <link> is used
in X/HTML because: 1) X/HTML already defines it; and 2) it's made
exactly for the kind of job we are doing here. For generic XML, a
processing instruction like <?xml-metadata ...?> is suggested. Besides
these case-specific recommendations, the basic requirement is stated
as "The host language must include a mechanism for linking to external
CRDF sheets." <link> and PIs, where available, are both good
mechanisms for meeting this requirement, but a language can define any
other mechanism it finds appropriate.

Section "3.2. Embedding CRDF sheets", which deals with <script>,
describes this as highly desirable rather than a requirement: <script>
is reused in X/HTML because it's available and ready for the job; for
other languages, three cases are possible:
1) The language has something as flexible as <script>, and thus it's
   re-used for CRDF.
2) The language defines an element just to deal with this feature.
3) This feature is not available at all from that language.
This is a per-language choice, and all three options would be
perfectly compliant with CRDF's requirements.

In summary, the requirements for a CRDF host language would be: "a
mechanism for linking to external CRDF sheets" and "an attribute whose
content model is 'a CRDF inline definition'
(other wordings are acceptable, of course, as long as they mean the
same)" (the document also describes what "a CRDF inline definition"
is).

>> Actually,
>> there have been some complains [1] about why should HTML5 restraint
>> itself from using quite useful attribute names such as "content" or
>> "resource", just because RDFa decided to use them, without giving
>> non-X HTML a thought.
>
> Attribute names are not a scarce commodity. Just using the 26 letters of
> the English alphabet (I avoid calling it the "Latin alphabet" given that
> three of the letters are post-Roman inventions) you can create about 10
> million different 5-letter attribute names. Certainly most of them are
> nonsensical, but there are an awful lot of attribute names to choose
> from, so it doesn't make sense to introduce potentially harmful clashes
> where they could be avoided.
>
> You beg the question of whether the RDFa task force invented attributes
> without giving HTML a thought. Certainly RDFa's XHTML 2.0 heritage is
> clear, but the language employed by the RDFa syntax document appears
> very carefully chosen to accommodate HTML.

Really? It already has some conflicts with HTML4 (@content is already
used in that format; more on this later). The point is that, among the
10 million or more available names, the RDFa group took names that are
highly generic: "content" or "resource", for example, could be used
for lots of things in a web markup language, but the RDFa guys
decided, without asking, that HTML should abstain from using them for
anything. Not very polite, IMO.

> The only aspect of RDFa which doesn't sit especially well in HTML is
> CURIE prefix mappings, which use xmlns:* attributes. In practice, it
> doesn't seem to have proved a difficulty to those of us who have
> implemented support for RDFa in HTML, but there are theoretical and
> aesthetic arguments against it.
> But this is a small issue which is not
> especially difficult to fix, and there's no reason to throw the baby out
> with the bathwater. Various solutions to it are being discussed both
> here and on the public-rdf-in-xhtml-tf at w3.org list.

Are you calling the DOM Consistency Principle a "theoretical" or
"aesthetic" argument? That principle is the only thing that allows
migrating documents from XHTML to tag soup or vice versa without
having to redo every script, or having scripts work properly with
seamless frames where XHTML and tag-soup sources are mixed together.
Sure, this is not an issue for script-less documents, but script-based
web applications are a reality, and are growing in both number and
complexity at quite a fast pace. One of the reasons HTML5 exists at
all is that the W3C was quite unwilling to deal with this reality.

The only reason that RDFa in HTML has worked until now is the same
reason <font> worked until browsers were ready for CSS: authors will
normally stick to what works, so they won't be messing with the DOM if
they are putting "xmlns:" stuff into an HTML document. The point is
that we need specs that deal with authors' and users' needs, rather
than authors who work around spec flaws.

>> In other words: currently, RDFa parsers should have enough to ignore
>> non-X HTML content (or, more specifically, documents with no default
>> xmlns in <body>, so they can also cope with the XHTML1.1+RDFa served
>> as text/html aberration, which is wrong no matter how you look at it).
>
> Personally I think it was a mistake to register a new content-type for
> XHTML to begin with - it introduced an unnecessary schism between HTML
> and XHTML which should have just been a natural progression.

Personally, I think that XHTML (or, more exactly, trying to bring
draconian error handling to the web) was itself a mistake.
XHTML can't be a natural progression for HTML, for a quite simple
reason: most existing HTML content would be rendered as an XML parsing
error notice if it were processed the way XHTML requires a page to be
processed.

> Any XHTML-family language which doesn't use elements from non-XHTML
> namespaces and follows a few simple rules for backwards-compatibility in
> practise seems to work fine served as text/html.

Any document that can work properly served as text/html could be
authored in plain HTML, and gains no benefit from XHTML. What's the
point of switching to XHTML if you aren't going to take advantage of
it, and you still have to deal with the compatibility rules?

>> If RDFa was taken into HTML5, then parsers should also care about
>> non-X documents, which binds HTML to not use these attribute names for
>> any future extension (actually, as pointed on Ian's mail referenced
>> above, @content is already used on <meta> since HTML4, so this can't
>> even be fulfilled).
>
> RDFa's use of @content is compatible with its use in HTML4. No, they are
> not identical uses, but they are not inconsistent either. Much like
> saying that "I am a human", and "I am a mammal" are not identical
> statements, but are consistent.
>
> In HTML4 @content is used on <meta> to indicate a string that parsers
> interested in a particular piece metadata should use. In RDFa it is used
> in the same way, but allowed globally instead of just on <meta>.

At any given moment, the HTML group could have decided to extend the
use of @content to other elements. It would especially make sense if
it was a use comparable to the one on <meta>. RDFa took away this
possibility without even asking the HTML folks whether any extension
of this attribute was expected. In the same way, @typeof could have
lots of uses in future versions of web forms; but RDFa shut that door
for HTML.
Again, @resource could also have several potential uses (for example,
to refer to cache or local storage resources by web applications), but
RDFa shut that door for HTML too. RDFa could have taken a less
disruptive approach, for example prefixing "rdfa-" or even just "r" to
attribute names to avoid shutting doors to HTML, but they didn't. Now,
don't be surprised that the HTML guys are so unwilling to open the
doors to RDFa.

Regards,
Eduard Pascual
Received on Friday, 22 May 2009 03:26:23 UTC