Re: microdata use cases and Getting data out of poorly written Web pages

From: Shelley Powers <shelleyp@burningbird.net>
Date: Sat, 09 May 2009 08:59:33 -0500
Message-ID: <4A058C45.209@burningbird.net>
To: Sam Ruby <rubys@intertwingly.net>
CC: Ian Hickson <ian@hixie.ch>, public-html@w3.org

Sam Ruby wrote:
> Ian Hickson wrote:
>>>> Whether further discussion would be a waste of time depends on what 
>>>> is discussed, obviously. In general I am always open to changing my 
>>>> mind when faced with new information. In the context of the W3C 
>>>> HTML WG, what I write into the HTML5 spec is but a first draft 
>>>> proposal, our process requires that the working group have 
>>>> consensus on a topic before it can be considered "closed".
>>> How is the item proposed for group discussion? How is consensus 
>>> recorded?
>> I will let the chairs respond to these process questions.
> At the ASF, we have two ways of operating (and, yes, I'll get to the 
> point quickly enough, I just ask that you both indulge me for a 
> moment).  The first is called Review Then Commit (RTC) where proposals 
> are discussed, consensus is reached, and then committed.  In my 
> experience, that's the way most standards organizations aspire to 
> operate.
> The other way is referred to as Commit Then Review (CTR).  While it 
> does mean that at times what is in Subversion does not reflect 
> consensus, one can not conclude that that means that what is 
> ultimately released does not enjoy consensus.  To the contrary, 
> releases at the ASF routinely enjoy consensus, and even those that do 
> not represent absolute consensus do enjoy substantial support.
> For better or worse, the HTML WG is operating under a CTR process.  As 
> far as I'm concerned, no attempt has been made to assess consensus on 
> any part of the current draft, at least not to my satisfaction.  That 
> does not mean that such assessment of consensus can't be obtained 
> rather quickly in many areas, in fact, I'd suggest that it can.
> I joined this working group as co-chair with a number of personal 
> goals.  The first two I'll characterize as "no excuse not to", and the 
> subgoals were to significantly reduce the hostile working environment 
> that existed in public-html at the time and to provide anybody and 
> everybody who wished to an opportunity to pursue alternative proposals.
> Shelley once referred to this as "put up or shut up", and I will admit 
> that there is an element of truth to this.  I will, however, point out 
> that as we move from spring to summer to fall, if it continues to be 
> the state that absolutely nobody is willing to step forward, then that 
> is something that I will take into consideration.
> My third and final personal goal was to assess consensus.  As to the 
> order in which we assess consensus, I have every intention of being 
> opportunistic -- taking the low hanging fruit when it is offered, and 
> taking on topics when the topic of conversation is naturally occurring 
> anyway.  For a number of reasons (including the discussions about 
> merging with XHTML2), the topic du jour is RDFa.  In April, I was 
> assured that this would be done in April.  Earlier this week, I was 
> assured that it would be done by the end of the week.  Both estimates 
> appear to be optimistic.  But as long as there is reasonable 
> expectation of progress, I'm still OK with that -- for the moment.
> It appears that Ian is on the cusp of making a proposal.  It may turn 
> out to be something that people can live with, and if so, I'll be glad 
> to declare consensus (with Chris's concurrence, of course) and move 
> on.  If not, people will have ample opportunity to discuss it with Ian 
> and/or develop alternative proposals.  If all of the above fail, I 
> will simply ask that Ian take his proposal out of the draft.  Note: 
> that won't be a suggestion to replace it with RDFa or anything else, 
> it will be to simply take it out.  Because in the final analysis, my 
> personal bar is a fairly low one: have the working group produce a 
> spec which is widely recognized as being better than HTML4.
> - Sam Ruby
I appreciate the comment, Sam. Forgive me, in turn, for what will be a 
long response. I am a writer. I can't help it.

I'm not necessarily comfortable participating in these groups, as you 
know. And contrary to what some folks think, I don't throw wrenches into 
these processes only for a giggle, or because I was born under a full 
moon and hence am inherently a troll.

To be blunt, I'm only interested in this specification because I hope to 
see it advance two other specifications that have been hindered by their 
XHTML-only state: SVG and RDFa. And though I'm not into math, I'll also 
throw in MathML, though there seem to be few problems with including it 
in HTML5.

To me, the biggest limitation with HTML 4 is still the limitation that 
is being carried forward into HTML5, and that is a lack of 
extensibility. OK, fine, extensibility in HTML will cause black holes to 
spontaneously appear. Barring that extensibility, though, this means 
that I have to do what I can, in my own way, to see that SVG, MathML, 
and yes, RDFa are included. Not just those, but support for 
accessibility and yes, even microformats, too. 

I am not an expert in SVG, or in RDFa. All I know is their potential, 
and their benefits. I leave the details of how both can be integrated 
into HTML5 to people much more capable than me. From work I've seen in 
these lists, I know for a fact that these specifications can be 
integrated into HTML5, without any lasting harm. Not as gracefully as 
with XHTML 1.1 or 2.0, but they can be integrated.

SVG seems to have made the cut, at least for now, so I'm going to focus 
on RDFa. My concern with the current approach to "microdata" is that 
there's a contrariness being applied to the process, primarily because 
the clock is ticking (2022 will be here before we know it), and also 
somewhat because I see a real bias in Ian's writings about RDFa, and 
therefore I don't think whatever proposal he puts forward will be 
necessarily the best for either HTML5, or the community that will be 
stuck with HTML5 in the future.

However, I could be wrong about my read of Ian's philosophy of 
microdata, microformats, and RDFa. It could be all a matter of 
interpretation. So, rather than wait for him to continue with his 
non-proposals in the WhatWG group, I'm going to do what I told Ian I 
will do, which is review the raw data for the use cases he's put forward, 
perhaps re-capture the context for each, refine, and re-present on 
Monday. Then I plan on commenting on his non-proposals, and use whatever 
opportunity presents itself to show him where perhaps his interpretation 
is not the only interpretation, or even the best.

I'll also be including, on Monday, two other use cases, or perhaps 
requirements would be the better term, that Ian did not put forward: 
that HTML5 support microformats, and that HTML5 support RDFa. It was 
inappropriate of Ian to filter these out, just because he sees them as 
"implementation" details. These are both much more than just 
implementation details. Both encompass mature processes and specifications, 
are in widespread use (as wide as canvas, SVG, and MathML, and much 
wider than any new data storage implementation and use), and have 
support from established communities.  Communities, I might add, willing 
to work with the HTML5 group to ensure that inclusion will not adversely 
affect HTML5, add a burden to user agents implementing HTML5, or 
overwhelm the HTML5 spec with unnecessary complexity.

When it came time to look at support for bitmap graphics, it made sense 
to focus on the canvas object, since support for it already exists. When 
it came time to look for vector graphics, it made sense to focus on SVG, 
since support for it already exists. It would have been foolish to break 
down what requirements there are for vector graphics, then create use 
cases and proposals for every aspect (must support gradients, support 
for transforms, etc.), and then come up with some kludged together 
vector graphic support that isn't currently supported by any company, 
only because SVG is an "implementation detail".

The same acceptance of maturity of process and spec, as well as 
widespread support, should be applied when it comes to RDFa and 
microformats. Yes, both, since they are--for the most 
part--complementary, not contradictory. Right now, I can use both, yes 
with HTML5, and though the effort may not validate, they do _work_. 

Starting from scratch to define what is needed to support microdata in 
HTML5 was an incredibly contrary move. Why not start from scratch to 
define a scripting language, too? Or a new database model for the data 
storage effort? Or a new query language for the same?

However, I'm not a member of WhatWG, nor of the W3C or this HTML WG. I 
appreciate you telling me how this effort is being managed, and I 
appreciate your concerns. My own concern, though, is that in the 
interest of expediency, and given the temperaments of the involved 
people, not only will support for RDFa not be included, but 
modifications to HTML5 could be made in such a way that our current use 
of technically invalid, but workable, RDFa will break in the future. 
(See this mailing list thread 

This would not be acceptable.

However, I can respect your process, and if my entries into these 
mailing lists become an encumbrance, I will take my discussions 
elsewhere. Note, also, that I am not a member of the SVG working group, 
or the RDFa in XHTML task group. I am speaking only for myself, and 
whatever I propose or inject into the discussions represents my views only.

Received on Saturday, 9 May 2009 14:00:20 UTC