- From: Ben Adida <ben@adida.net>
- Date: Fri, 25 May 2007 19:38:15 -0700
- To: Keith Alexander <keithalexander@keithalexander.co.uk>
- CC: public-rdf-in-xhtml-tf@w3.org, public-grddl-comments@w3.org
Keith,

I'm not opposed to having a profile for RDFa, but it's important to point out that we already have a proposed way of flagging RDFa content: change your HTML version to XHTML1.1+RDFa. (And then your doc even validates, which is important to a number of folks.) Even GRDDL works here since we can specify the transform in the XHTML1.1+RDFa schema doc (when we eventually specify this using XML schema.)

But let's talk practice here, since that's part of what you're worried about (specifically, efficiency of checking for presence of RDFa.) If you build software that assumes some RDFa header flag is always there when RDFa is present in the document, then you're going to lose big time.

The main argument is simple: we now live in a world of mashups and widgets. There are now third-party applications that run inside Facebook's very own HTML page. Chances are, some widgets will include RDFa, even if the containing page does not flag the presence of RDFa. If you want to find the structured data in the page, you're going to have to try the RDFa parser and see what comes out. I can't imagine that you'll get anything useful out of the structured-data web if you don't do this.

This isn't an RDFa issue. It's just the way the web is: pages aren't atomic chunks anymore, they're bags of disparate chunks of HTML, each one of which might have been authored by a different party.

The good news is that, unlike microformats, there's only one RDFa parser, and it's not going to change regularly over time as we use more vocabularies. That's a key difference.

On to some details....

> HTML (I'd argue) isn't really suited for being a candidate for treating
> data as a first class citizen, because its primary use is for presenting
> documents (not units of data) to humans.

We have a notable disagreement here :) What other format would you use for providing units of data to humans? XML+XSLT (ouch)?
When units of data are presented to a human, they need to be rendered, yet you also need to close the loop so that I can point my mouse to the rendered stuff and get back to the structured unit of data. That's why, in my mind, HTML is actually a *very good* place to put some amount of structured data. Not all structured data, but certainly data that's meant to be interpreted by human eyes to some degree.

[...]

> you still have to provide the information twice

RDFa has as one specific goal to try to *not* repeat the information. I think we've seen that this is actually quite doable for a whole bunch of use cases, so I'd have to disagree with that point.

> I think the heart of my disagreement with this attitude towards
> @profile, is that you obviously want *RDFa* to be in First Class, and
> all *other* methods of embedding data in html to lump it in Third -
> which is pretty much the same impression I get with regards to HTML 5's
> attitude to microformats.

This is not quite a correct comparison. Microformats don't have a consistent syntax. You can't parse microformats without knowing *which* vocabulary you're looking for. So there's no way that kind of loose syntax can ever be first class. eRDF still requires declaring page-specific stuff (like namespaces) in the HEAD of the document, so it can't be mashed up.

What I mean is this: this isn't an *attitude* that RDFa should be First Class and other methods should be Third. It's a realization that the web needs *some* kind of generic syntax that is mashup-compatible, and neither microformats nor eRDF (nor any other syntax that we know of) fits the bill.

> And that's disappointing because there isn't one syntax that's going to
> be best for everyone all of the time, and there doesn't have to be a
> 'winner'

Your argument is deceptive here. Of course most people agree that "one syntax isn't going to be best for everyone all the time."
But you're missing some of the context: we're already talking about embedding something inside the *HTML syntax*. We already agree that, if you're going to combine sources of HTML, you probably want the components you're combining to be fairly consistent and use more or less the same version of HTML, and not, say, XAML. <P> better mean a paragraph break.

If you accept the assumption that a web page is no longer an atomic chunk of data, then it seems to me a worthy goal to aim for one fairly generic syntax for structured data *in HTML*. There's a clear benefit there. Of course that syntax won't be the best all the time for all users, but if you want your HTML remixed, mashed up, taken apart and put back together, then what else can you do but aim for one reasonable syntax that has the important advantage of being consistent and produce self-contained chunks?

I certainly see how in some cases, using GRDDL on some highly domain-specific data would be better. But then you can't break up that page and recombine it with something else. You give up generic syntax in order to gain efficiency.

RDFa aims to be a generic enough syntax that you can do mashup-able structured data in HTML. It won't always be the most compact, especially in certain information-dense domains, but at least it will be generic, and that's something we actually need.

> I apologise for making my first post to this list one of dissent, and
> I'm sorry if I'm irritating you all about an issue that has already been
> laid to bed, I just think that offering at least the *option* of a
> @profile to authors is important. It doesn't stop anyone from using or
> parsing RDFa without it, it just acknowledges that not every web page
> contains RDFa, and that RDFa isn't the only syntax for expressing RDF in
> HTML.

Happy to hear this dissent, it helps crystallize ideas and shakes us from any potential tunnel vision. I hope that my dissent from your ideas can similarly cause some good discussion here :)

-Ben
Received on Saturday, 26 May 2007 02:38:33 UTC