- From: Calogero Alex Baldacchino <alex.baldacchino@email.it>
- Date: Fri, 09 Jan 2009 21:07:47 +0100
Julian Reschke wrote:
> Calogero Alex Baldacchino wrote:
>> ...
>> This is why I was thinking about somewhat "data-rdfa-about",
>> "data-rdfa-property", "data-rdfa-content" and so on, so that, for the
>> purposes of an RDFa processor working on top of HTML5 UAs (perhaps in
>> a test phase, if needed at all, of course), an element dataset would
>> give access to "rdfa-about", instead of just "about", that is using
>> the prefix "rdfa-" as acting as a namespace prefix in xml (hence, as
>> if there were "rdfa:about" instead of "data-rdfa-about" in the markup).
>> ...
>
> That clashed with the documented purpose of data-*.

Hmm, I'm not sure there is a clash, since I was suggesting a *custom* and essentially *private* mechanism to experiment with RDFa in the HTML serialization, for the *small-scale* needs of some organizations willing to embed RDFa metadata in text/html documents and to exchange them with each other, using a convention likely to avoid name clashes with other private metadata. I think it's unlikely to find data-rdfa-* used with different semantics in the very same page, and in a small-scale scenario involving a few *selected* sources of RDFa-modelled information, it should be possible to know in advance that the other parties are using the same conventions. A document modelled this way might be used in conjunction with an external RDFa processor, thus avoiding any direct support in a browser.

However, such a convention might be "clash-free" enough to work on a wider scale. It might then become widespread and provide evidence that the web /needs/, or at least /has chosen/, to use RDFa as (one of) the most common ways to embed metadata in a document. That might be enough to justify adding native support for the whole range of "RDFa" attributes, possibly along with support for earlier experimental ones (such as "data-rdfa-*" and "rdfa:*", for backward compatibility).
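
As a rough illustration of the private convention described above, here is a minimal Python sketch of an external script that reads "data-rdfa-*" attributes from text/html markup and reports them under their plain RDFa names. It is only a sketch under stated assumptions: the prefix "data-rdfa-" is the hypothetical convention from this thread, the names `DataRdfaCollector` and `collect_rdfa` are made up for the example, and no real RDFa processing (subject resolution, CURIE expansion, triple generation) is attempted.

```python
from html.parser import HTMLParser

# Hypothetical private convention discussed in this thread; not part of any spec.
PREFIX = "data-rdfa-"


class DataRdfaCollector(HTMLParser):
    """Collects data-rdfa-* attributes and strips the prefix from their names."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (tag name, {rdfa attribute name: value})

    def handle_starttag(self, tag, attrs):
        # html.parser already lowercases attribute names.
        rdfa = {
            name[len(PREFIX):]: value
            for name, value in attrs
            if name.startswith(PREFIX)
        }
        if rdfa:
            self.found.append((tag, rdfa))


def collect_rdfa(markup):
    """Return every element carrying at least one data-rdfa-* attribute."""
    collector = DataRdfaCollector()
    collector.feed(markup)
    collector.close()
    return collector.found


if __name__ == "__main__":
    sample = (
        '<p>I\'m holding <span data-rdfa-property="cal:summary" class="x">'
        "one last summer Barbecue</span>.</p>"
    )
    for tag, attrs in collect_rdfa(sample):
        print(tag, attrs)
```

The point of the design is that nothing here needs browser support: unrelated attributes (such as `class` in the sample) are ignored, and the collected names could be handed to an RDFa processor as if they had been written as "property", "about" and so on.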
And actually I can't see much of a problem if a privately born feature became the basis of a widespread and widely accepted convention. (I'm not saying the spec should name data-rdfa-* as a means to implement RDFa. Rather, I think that, if no general agreement can be found on whether and how RDFa must be spec'ed out and implemented, such an experiment might be proposed to the semantic web industry while waiting for the results, given that a lack of support might prevent any interested party from using RDFa and HTML5 together.)

> *If* we want to support RDFa, why not add the attributes the way they
> are already named???

For instance, to find out whether it is worth changing the "if we want" into "we do want", without requiring an early implementation and specification, and without relying on whether and what a certain browser vendor might want to experiment with differently from the others. Such a convention would only require support for HTML5 datasets and a script or a plugin capable of handling them as representing RDFa metadata.

The point here is that, after the introduction of data-* attributes as a means to support custom attributes, any browser vendor might decide to drop support for other kinds of custom attributes in the html serialization (that is, for attributes that are neither part of the language nor data-* ones). Therefore, if they (or any of them) decided not to support RDFa attributes until they were introduced in a specification, there might be no means to experiment with them (in general, that is cross-browser) without resorting either to data-* or to "rdfa:*" (the latter in xhtml).

Anyway, /in general/, what should a browser do with RDFa metadata on a *wide scale*, other than classifying a portion of the open web (e.g. in its local history), possibly allowing users to select trusted sources? Actually, I don't think that would bring enough benefits to *average* users, compared to the risk of getting a lot of spam metadata from /heterogeneous/ sources.
I really don't expect average users to understand how to filter sites based on metadata reliability (and just for the purpose of using a metadata-based query interface, because a site with wrong metadata might still contain useful information). Instead, they might just try to use a query interface the same way they use a default search bar, get wrong results (once spam metadata became widespread) and decide the mechanism doesn't work well (possibly complaining about that). Some kind of antispam filter might help, but I think that understanding whether metadata are reliable, that is whether they really correspond to a web page's content, is a hard problem for a bot to solve without a good degree of Artificial Intelligence (filtering emails by looking for suspicious patterns is far easier than implementing a filter capable of /understanding/ metadata, /understanding/ natural language and comparing /semantics/).

Likewise, I don't expect the great majority of web pages to contain "valid" metadata: most people would not care about them, and a potentially growing number might copy&paste code containing metadata from other sites as a kind of template, then edit the content and ignore the metadata, thus breaking reliability. I do think wide-scale use of metadata coming from heterogeneous sources can be more harmful than useful.

*If* we do agree that small-scale needs are the main context where RDFa can bring benefits, perhaps a custom mechanism and external plugins are all we need; otherwise, it should be proved that /misused/ and /abused/ metadata can be filtered out *easily* and *automatically*, without requiring average users to understand the problem and without affecting the overall efficiency. IMHO.

>> ...
>> However, AIUI, actual xml serialization (xhtml5) allows the use of
>> namespaces and prefixed attributes, thus couldn't a proper namespace
>> be introduced for RDFa attributes, so they can be used, if needed, in
>> xhtml5 documents?
>> I think such might be a valuable choice, because it
>> seems to me RDFa attributes can be used to address such cases where
>> metadata must stay as close as possible to correspondent data, but a
>> mistake in a piece of markup may trigger the adoption agency or
>> foster parenting algorithms, eventually causing a separation between
>> metadata and content, thus possibly breaking reliability of gathered
>> informations. From this perspective, a parser stopping on the very
>> first error might give a quicker feedback than one rearranging
>> misnested elements as far as it is reasonably possible (not
>> affecting, and instead improving, content presentation and users'
>> "direct" experience, but possibly causing side-effects with metadata).
>> ...
>
> That would make RDFa as used in XHTML 1.* and RDFa used in HTML 5
> incompatible. What for?
>
> > ...
>
> BR, Julian

Because I'm not sure RDFa can work well with the HTML serialization. To clarify that, let me take and modify an example from the W3C Recommendation (without pretending it makes a good worst-case scenario, but just to give an idea):

    [...]
    <p>
      I'm holding
      <span property="cal:summary">
        one last summer Barbecue
      </span>,
      to meet friends and have a party before the end of holidays on
      <span property="cal:dtstart" content="2007-09-16T16:00:00-05:00"
            datatype="xsd:dateTime">
        September 16th at 4pm
      </span>.
    </p>
    [...]

Now let's consider it written as:

    [...]
    <p>
      I'm holding
      <span property="cal:summary">
        one last summer Barbecue
      <!-- now the </span> close tag is missing here -->,
      to meet friends and have a party before the end of holidays on
      <span property="cal:dtstart" content="2007-09-16T16:00:00-05:00"
            datatype="xsd:dateTime">
        September 16th at 4pm
      </span>.
    </p>
    [...]

Parsed as an xml-serialized document, the above would result in a parse error, since the document isn't well-formed.
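
The difference between the two serializations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two fragments are shortened versions of the ones above, `xml.etree.ElementTree` stands in for an xml parser, and the `PropertyText` class is a deliberately simplified stand-in for error-tolerant html parsing (a plain stack of open elements, not the HTML5 tree-construction algorithm). The helper names are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

GOOD = (
    '<p>I\'m holding <span property="cal:summary">one last summer Barbecue</span>, '
    'to meet friends on <span property="cal:dtstart" '
    'content="2007-09-16T16:00:00-05:00" datatype="xsd:dateTime">'
    "September 16th at 4pm</span>.</p>"
)
# Same fragment with the first </span> close tag missing.
BROKEN = GOOD.replace("Barbecue</span>", "Barbecue", 1)


def is_well_formed_xml(fragment):
    """True if an xml parser accepts the fragment, False on a parse error."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


class PropertyText(HTMLParser):
    """Lenient walk: attributes text to every still-open element with a property."""

    def __init__(self):
        super().__init__()
        self.open_elems = []  # stack of (tag, property value or None)
        self.prop_text = {}   # property value -> text found inside that element

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        self.open_elems.append((tag, prop))
        if prop is not None:
            self.prop_text.setdefault(prop, "")

    def handle_endtag(self, tag):
        # Close everything up to the nearest matching open element, if any.
        if any(name == tag for name, _ in self.open_elems):
            while self.open_elems.pop()[0] != tag:
                pass

    def handle_data(self, data):
        for _, prop in self.open_elems:
            if prop is not None:
                self.prop_text[prop] += data


def property_text(fragment):
    parser = PropertyText()
    parser.feed(fragment)
    parser.close()
    return parser.prop_text


if __name__ == "__main__":
    for label, fragment in (("good", GOOD), ("broken", BROKEN)):
        print(label, is_well_formed_xml(fragment), property_text(fragment))
```

With the broken fragment, the xml check fails immediately, while the lenient walk keeps going and lets the unclosed "cal:summary" span absorb the rest of the paragraph, including the date text, which is the kind of silent mis-binding discussed here.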
Instead, as part of an html-serialized document, the above fragment would be processed anyway, improving the users' experience (compared to a page that stops rendering at a missing close tag), but potentially causing metadata to be bound to the wrong data, thus potentially harming automated data extraction (for some purposes). Therefore, using such metadata only inside xml-serialized pages might give quick feedback on such a problem as soon as the author checked the page's appearance (which I think would be the very first check, just as I think almost no one would check the _whole_ range of possible queries people might make over a document to look for errors).

*If* this is meaningful, supporting RDFa attributes as "rdfa:*" might ensure that the xml serialization is preferred by people who really need this kind of metadata (while leaving a chance to experiment with RDFa in the html serialization, because no one can be prohibited from using data-<prefix>-* for this purpose alongside a proper script or plugin), whereas introducing "about", "property", "content", "datatype" and so on directly in the html namespace, as attributes shared by all elements, would make the choice between one serialization and the other indifferent, thus leading to every possible side effect the html serialization may cause.

As a side note, it seems that people at the W3C are considering resorting to extensibility to introduce RDFa attributes into xml-serialized html documents, and they also have some doubts about whether to allow the use of RDFa attributes within the html serialization:

"The HTML WG is encouraged to provide a mechanism to permit independently developed vocabularies such as Internationalization Tag Set (ITS), Ruby, and RDFa to be mixed into HTML documents. /Whether this occurs through the extensibility mechanism of XML, *whether it is also allowed in the classic HTML serialization*, and whether it uses the DTD and Schema modularization techniques/, is for the HTML WG to determine."
(from <http://www.w3.org/2007/03/HTML-WG-charter#deliverables>)

WBR, Alex
Received on Friday, 9 January 2009 12:07:47 UTC