Re: microdata use cases and Getting data out of poorly written Web pages from Shelley Powers on 2009-05-09 (public-html@w3.org from May 2009)

From: Shelley Powers <shelleyp@burningbird.net>
Date: Sat, 09 May 2009 17:32:19 -0500
To: public-html@w3.org
Message-ID: <4A060473.6020607@burningbird.net>

Philip Taylor wrote:
> [Removed from public-html since I don't think I'm saying anything 
> interesting enough to bother everyone with,]
>
> Shelley Powers wrote:
>> I noticed, though, that folks were allowed to mention SVG in relation 
>> to the use cases when it comes to the vector graphics discussion. 
>> We've not been able to say the 'R' word in the discussions related to 
>> metadata.
>>
>> Come Monday, I may get naughty, litter the 'R' word all about, like 
>> elephant droppings that you ignore at your own peril.
>
> It's probably also worth noticing that the problem statements and 
> requirements/priorities in 
> http://wiki.whatwg.org/wiki/New_Vocabularies don't mention SVG at all; 
> only the proposed solutions do. The mailing list discussions mentioned 
> SVG a lot, but the use cases were extracted from the discussions in a 
> way that doesn't completely presuppose the solution, and those use 
> cases were adequate justification for adding SVG to the spec as the 
> best solution for some.
>
> The use cases were still written with SVG kept in mind. It provided a 
> way to group similar use cases (so people could suggest use cases 
> which could perhaps be solved using SVG), supporting some coherence in 
> the analysis. It also provided a way to see that certain use cases are 
> sensible to consider (e.g. "I want to put scriptable 2D vector 
> graphics in my page") because some approximate solutions already exist 
> and so it's demonstrably possible, versus use cases that are not worth 
> seriously proposing (e.g. "I want to put photorealistic 3D animations 
> with AI-controlled characters and physics simulation into my page").
>
> But at some point it's necessary to step back from SVG, to allow other 
> solutions to be examined, rather than immediately diving into the 
> details of SVG and potentially missing other ideas. In the current 
> idealised HTML5 process, that stepping-back point is just before you 
> write the problem statements and requirements in the use cases.
>
> So problem statements like "I want to use SVG/RDFa in text/html to do 
> X" are likely to get the response "But why do you want to do that?", 
> and you would need to step back and say "I want to do X" (and accept 
> that X might be solved without involving SVG/RDFa at all). So the 
> problem statements should be more like "I want to mix scriptable 
> vector graphics with my HTML content without migrating to XHTML" or "I 
> want to mark up a complex data structure representing various types of 
> product in my online store, so somebody can easily write software to 
> process the data and compare products between other 
> similarly-marked-up stores" (though that doesn't carry much weight if 
> it's purely hypothetical rather than coming from someone who's really 
> trying to do that). Otherwise the discussion is unlikely to be 
> productive.
>
> Anyway, I'm sure I'm not saying anything new; I just want to attempt 
> to explain why making the use cases dependent on RDF(a) is probably 
> not going to go down well, but it can still be very relevant in other 
> parts of the process.
>
You see, that is where I don't agree.

First of all, HTML is a W3C specification, so it would make sense to 
incorporate support for W3C technologies. And HTML5 does, by 
incorporating support for CSS. To not do so would be, frankly, foolish.

It also makes sense to incorporate broadly used technologies, which 
HTML5 does by supporting JavaScript/ECMAScript. To not do so would also 
be frankly foolish.

What were the options for vector graphics? VML? Not even Microsoft 
supports VML going forward. And all the other options mentioned were 
either not W3C technologies or, for the most part, potentially 
proprietary.

There are exactly two options for metadata going forward, unless the 
group is so foolish as to think it can either completely ignore complex 
metadata, or manage to shove it all into rel, class, and id. The options 
are microformats and RDFa.

Microformats are fine... if you're doing addresses or something that fits 
in one of the few defined vocabularies. But they fail, completely and 
totally, when it comes to supporting _any_ vocabulary, new ones as well 
as established ones. I don't think I have to mention, either, the 
concerns people have had about the lack of precision for parsing in the 
microformat documentation.

Frankly, all of the use cases I've seen to this point were either 
submitted in emails or list postings by people interested in RDFa, or 
were established by the RDFa group, at the RDFa wiki. Included with 
those use cases were examples of implementations and existing effort, in 
addition to the more generic requirements. Not to include this 
information is the same as saying, "I have six blind men, each one 
touching, in turn, a wall, a spear, a snake, a tree, a fan, and a rope. 
Now, invent something viable that uses all six components."

What has happened instead, from what I can see of Ian's use cases and 
the raw material that went into them, is that the use cases were so 
diluted, and so simplified, that they, no offense to Ian, fail as either 
use cases or requirements. Why? Because this group has decided, after 
drinking from whatever fountain of wisdom, that one must look at 
requirements purely in a platonic manner, so as not to "taint" the markup.

Case in point was Ian's "search" use case. I've already started my own 
take on this, at 
http://realtech.burningbird.net/web/standards/searchcase, pointing out 
how something such as "Site owners want a way to provide enhanced search 
results to the engines, so that an entry in the search results page is 
more than just a bare link and snippet of text, and provides additional 
resources for users straight on the search page without them having to 
click into the page and discover those resources themselves" doesn't 
even touch on how complex something like this can be.

But the complexity was stripped out because of the RDFaness of the 
documentation. Frankly, because of the RDFaness of the capability.

So perhaps my own take on the use cases will also fall short, tainted as 
they will be by re-incorporating that which was stripped out. But I am 
not going to pretend that there are literally dozens of metadata 
options, just waiting around to be picked up and plopped into HTML5.

Shelley
Received on Saturday, 9 May 2009 22:33:03 UTC