- From: Manu Sporny <msporny@digitalbazaar.com>
- Date: Mon, 03 Dec 2012 15:59:30 -0500
- To: HTML WG <public-html@w3.org>
This is the final Rationale Statement for the Microdata Candidate
Recommendation Objection. Please link to this e-mail from the preference
poll to provide participants some background on the arguments presented
for this objection.
An easier-on-the-eyes version of this e-mail is available here:
http://manu.sporny.org/2012/microdata-cr/
-------------------------------------------------------------------
Objection to Microdata Candidate Recommendation
Full disclosure: I'm the current chair of the standards group at
the World Wide Web Consortium that created the newest version of
RDFa, editor of the HTML5+RDFa 1.1 and RDFa Lite 1.1
specifications, and I'm also a member of the HTML Working Group.
The HTML Working Group at the W3C is currently trying to decide if
they should transition the Microdata specification to the next
stage in the standardization process. There has been a [1]call for
consensus to transition the spec to the Candidate Recommendation
stage. The problem is that we already have a set of specifications
that are official W3C recommendations that do what Microdata does
and more. RDFa 1.1 became an official W3C Recommendation last
summer. From a standards perspective, publishing Microdata as a
Recommendation as well is a mistake and sends a confused signal to
Web developers. Officially supporting two specifications that do
almost exactly the same thing in almost exactly the same way is,
ultimately, a failure to standardize.
The fact that RDFa already does what Microdata does has been
elaborated upon before:
[2]Mythical Differences: RDFa Lite vs. Microdata
[3]An Uber-comparison of RDFa, Microdata, and Microformats
Here's the problem in a nutshell: The W3C is thinking of ratifying
two completely different specifications that [4]accomplish the
same thing in basically the same way. The functionality of RDFa,
which is already a W3C Recommendation, overlaps Microdata by a
large margin. In fact, RDFa Lite 1.1 was developed as a plug-in
replacement for Microdata. The full version of RDFa can also do a
number of things that Microdata cannot, such as datatyping,
associating more than one type per object, embed-ability in
languages other than HTML, ability to easily publish and mix
vocabularies, etc.
Microdata would have easily been dead in the water had it not been
for two simple facts: 1) The editor of the specification works at
Google, and 2) Google pushed Microdata as the markup language for
schema.org before also accepting RDFa markup. The first enabled
Google and the editor to work on schema.org without signalling to
the public that it was creating a competitor to Facebook's Open
Graph Protocol. The second gave Microdata enough of a jump start
to establish a foothold for schema.org markup. There have been a
number of studies that [5]show that Microdata's sole use case (99%
of Microdata markup) is for the markup of schema.org terms.
Microdata is not widely used outside of that context. We now have
data to back up what we had predicted would happen when schema.org
made their initial announcement for Microdata-only support. Note
that schema.org now supports both RDFa and Microdata.
It is typically a bad idea to have two formats published by the
same organization that do the same thing. It leads to Web
developer confusion surrounding which format to use. One of the
goals of Web standards is to reduce, or preferably eliminate, the
confusion surrounding the correct technology decision to make. The
HTML Working Group and the W3C are failing miserably on this front.
There is more confusion today about picking Microdata or RDFa
because they accomplish the same thing in effectively the same
way. Both exist only for political reasons.
If we step back and look at the technical arguments, there is no
compelling reason that Microdata should be a W3C Recommendation.
There is no compelling reason to have two specifications that do
the same thing in basically the same way. Therefore, as a member
of the HTML Working Group (not as a chair or editor of RDFa) I
object to the publication of Microdata as a Candidate
Recommendation.
Note that this is not a W3C formal objection. This is an informal
objection to publishing Microdata along the Recommendation track.
This objection will not become an official W3C formal objection if
the HTML Working Group holds a poll to gather consensus around
whether Microdata should proceed along the Recommendation
publication track. I believe publishing Microdata as a W3C Note
will continue to allow Google to support Microdata in schema.org, but
will hopefully correct the confused message that the W3C has been
sending to Web developers regarding RDFa and Microdata. We don't
need two specifications that do almost exactly the same thing.
The message sent by the W3C needs to be very clear: There is one
recommendation for doing structured data markup in HTML. That
recommendation is RDFa. It addresses all of the use cases that
have been put forth by the general Web community, and it's ready
for broad adoption and implementation today.
Summary of Facts and Arguments
Below is a summary of arguments presented as a basis for
publishing Microdata along the W3C Note track:
1. RDFa 1.1 is already a [7]ratified Web standard as of June 7th,
2012 and absorbed almost every Microdata feature before it
became official. If the majority of the differences between
RDFa and Microdata boil down to different attribute names
(property vs. itemprop), then the two solutions have
effectively converged on syntax and W3C should not ratify two
solutions that do effectively the same thing in almost exactly
the same way.
2. RDFa is [8]supported by all of the major search crawlers,
including Google (and schema.org), Microsoft, Yahoo!, Yandex,
and Facebook. Microdata is not supported by Facebook.
3. RDFa Lite 1.1 is [9]feature-equivalent to Microdata. Over 99%
of Microdata markup can be expressed easily in RDFa Lite 1.1.
Converting from Microdata to RDFa Lite is as simple as a
search and replace of the Microdata attributes with RDFa Lite
attributes. Conversely, Microdata does not support a number of
the more advanced RDFa features, like being able to tell the
difference between feet and meters.
4. You can [10]mix vocabularies with RDFa Lite 1.1, supporting
both schema.org and Facebook's Open Graph Protocol (OGP) using
a single markup language. You don't have to learn Microdata
for schema.org and RDFa for Facebook - just use RDFa for both.
5. The [11]creator of the Microdata specification doesn't like
Microdata. When people are not passionate about the solutions
that they create, the desire to work on those solutions and
continue to improve upon them is muted. The RDFa community is
passionate about the technology that it has created together
and has strived to make it better since the standardization of
RDFa 1.0 back in 2008.
6. RDFa Lite 1.1 is [12]fully upward-compatible with RDFa 1.1,
allowing you to seamlessly migrate to a more feature-rich
language as your Linked Data needs grow. Microdata does not
support any of the more advanced features provided by RDFa
1.1.
7. RDFa [13]deployment is broader than Microdata. RDFa deployment
continues to grow at a rapid pace.
8. The economic damage generated by publishing both RDFa and
Microdata along the Recommendation track should not be
underestimated. W3C should try to provide clear direction in
an attempt to reduce the economic waste that a "let the market
sort it out among two nearly identical solutions" strategy
will generate. At some point, the market will figure out that
both solutions are nearly identical, but only after publishing
and building massive amounts of content and tooling for both.
9. The W3C Technical Architecture Group (TAG), which is
responsible for ensuring that the core architecture of the Web
is sound, has [14]raised their concern about the publication
of both Microdata and RDFa as recommendations. After the W3C
TAG raised their concerns, the RDFa Working Group created RDFa
Lite 1.1 to be a near feature-equivalent replacement for
Microdata that was also backwards-compatible with RDFa 1.0.
10. Publishing a standard that does almost exactly the same thing
as an existing standard in almost exactly the same way is a
[15]failure to standardize.
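As a rough illustration of argument 4 (mixing vocabularies), the
sketch below shows how a property value in RDFa could resolve to a
full IRI from either a default vocabulary or a prefix. It is a
simplified sketch, not the RDFa Core processing algorithm: the
function name expand_term is hypothetical, and the two prefix
mappings are assumed to match the entries for "og" and "schema" in
the RDFa 1.1 initial context.

```python
# Assumed prefix mappings (believed to match the RDFa 1.1 initial context).
PREFIXES = {
    "og": "http://ogp.me/ns#",
    "schema": "http://schema.org/",
}


def expand_term(value, vocab=None, prefixes=PREFIXES):
    """Resolve a property value to a full IRI (simplified sketch)."""
    if ":" in value:
        prefix, _, reference = value.partition(":")
        if prefix in prefixes:
            # Prefixed name such as "og:title".
            return prefixes[prefix] + reference
        # Unknown prefix: treat the value as an absolute IRI already.
        return value
    if vocab is not None:
        # Bare term such as "name": resolve against the default vocabulary.
        return vocab + value
    # A bare term with no vocabulary in scope yields nothing.
    return None


# One element tree can carry both schema.org and OGP properties.
schema_name = expand_term("name", vocab="http://schema.org/")
ogp_title = expand_term("og:title", vocab="http://schema.org/")
print(schema_name)  # http://schema.org/name
print(ogp_title)    # http://ogp.me/ns#title
```

The point of the sketch is only that a single attribute (property)
can carry terms from both vocabularies, so authors do not need a
second markup language for the second vocabulary.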
Counter-arguments and Rebuttals
[This is a] [16]classic case of monopolistic anti-competitive
protectionism.
No, this is an objection to publishing two specifications that do
almost exactly the same thing in almost exactly the same way along
the W3C Recommendation publication track. Protectionism would have
asked that all work on Microdata be stopped and the work scuttled.
The proposed resolution does not block anybody from using
Microdata, nor does it try to stop or block the Microdata work
from happening in the HTML WG. The objection asks that the W3C
decide what the best path forward for Web developers is based on a
fairly complicated set of predicted outcomes. This is not an easy
decision. The objection is intended to ensure that the HTML
Working Group has this discussion before we proceed to Candidate
Recommendation with Microdata.
<manu1> I'd like the W3C to work as well, and I think publishing
two specs that accomplish basically the same thing in
basically the same way shows breakage.
<annevk> Bit late for that. XDM vs DOM, XPath vs Selectors,
XSL-FO vs CSS, XSLT vs XQuery, XQuery vs XQueryX,
RDF/XML vs Turtle, XForms vs Web Forms 2.0,
XHTML 1.0 vs HTML 4.01,
XML 1.0 4th Edition vs XML 1.0 5th Edition,
XML 1.0 vs XML 1.1, etc.
[17]link to full conversation
While W3C does have a history of publishing competing
specifications, there have been features in each competing
specification that were compelling enough to warrant the
publication of both standards. For example, XHTML 1.0 provided a
standard set of rules for validating documents that was aligned
with XML and a decentralized extension mechanism that HTML4.01 did
not. Those two major features were viewed as compelling enough to
publish both specifications as Recommendations via W3C.
For authors, the differences between RDFa and Microdata are so
small that, for 99% of documents in the wild, you can convert a
Microdata document to an RDFa Lite 1.1 document with a simple
search and replace of attribute names. That demonstrates that the
syntaxes for both languages differ only in the names of the HTML
attributes, and that does not seem like a very compelling reason
to publish both specifications as Recommendations.
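The search and replace described above can be sketched in a few
lines of Python. This is a minimal sketch under stated assumptions,
not a complete converter: the function name microdata_to_rdfa_lite
is hypothetical, the attribute mapping (itemtype to vocab plus
typeof, itemprop to property, itemid to resource, itemscope
dropped) is assumed from the comparison cited as [2], and the
regular expressions only handle double-quoted attributes. Markup
that uses itemref, or itemscope without itemtype, would need more
than a textual replace.

```python
import re


def microdata_to_rdfa_lite(html: str) -> str:
    """Rewrite Microdata attributes as RDFa Lite 1.1 attributes (sketch)."""
    # itemtype="http://schema.org/Person"
    #   -> vocab="http://schema.org/" typeof="Person"
    html = re.sub(
        r'itemtype="([^"]*/)([^"/]+)"',
        r'vocab="\1" typeof="\2"',
        html,
    )
    # itemprop -> property, itemid -> resource
    html = re.sub(r'\bitemprop=', 'property=', html)
    html = re.sub(r'\bitemid=', 'resource=', html)
    # itemscope has no counterpart here; typeof already starts a new item.
    html = re.sub(r'\s+itemscope(?:="[^"]*")?(?=[\s>])', '', html)
    return html


sample = (
    '<div itemscope itemtype="http://schema.org/Person">'
    '<span itemprop="name">Manu Sporny</span>'
    '</div>'
)
converted = microdata_to_rdfa_lite(sample)
print(converted)
```

For the sample above, the output keeps the same element structure
and text and changes only the attribute names: the div carries
vocab="http://schema.org/" and typeof="Person", and the span
carries property="name".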
[18]Microdata's processing algorithm is vastly simpler, which
makes the data extracted more reliable and, when something does go
wrong, makes it easier for 1) users to debug their own data, and
2) easier for me to debug it if they can't figure it out on their
own.
Microdata's processing algorithm is simpler for two major reasons:
* [19]Microdata does not support as many features and use cases
as RDFa does.
* RDFa 1.1 is backwards-compatible with RDFa 1.0, which
complicates the processing rules. The same is true for HTML5.
The complexity of implementing a processor has little bearing on
how easy it is for developers to author documents. For example,
XHTML 1.0 had a simpler processing model, which made the data that
was extracted more reliable and, when something went wrong, easier
to debug. However, HTML5 supports more use cases and recovers from
errors where it can, which has made it more popular with Web
developers in the long run.
Additionally, authors of Microdata and RDFa [20]should be using
tools like RDFa Play to debug their markup. This is true for any
Web technology. We debug our HTML, JavaScript, and CSS by loading
it into a browser and bringing up the debugging tools. This is no
different for Microdata and RDFa. If you want to make sure your
markup does what you want, make sure to verify it by using a tool
and not by trying to memorize the processing rules and running
them through your head.
For what it is worth, I personally think [21]RDFa is generally
a technically better solution. But as Marcos says, "so what"?
Our job at W3C is to make standards for the technology the
market decides to use.
If we think one of these technologies is a technically better
solution than the other one, we should signal that realization at
some level. The most basic thing we could do is to make one an
official Recommendation, and the other a Note. I also agree that
our job at W3C is to make standards for the technology the market
decides to use, but clearly this particular case isn't that
cut-and-dried. In the beginning, the only markup option that
schema.org offered was Microdata, and since authors didn't want to
risk not showing up in the search engines, they used Microdata.
This forced the market to go in one direction.
This discussion would be in a different place had Google kept the
playing field level. That is not to say that Google didn't have
good reasons for making the decisions that they did at the time,
but those reasons influenced the development of RDFa, and RDFa
Lite 1.1 was the result. The differences between Microdata and
RDFa have been removed and a new question is in front of us: given
two almost identical technologies, should the W3C publish two
specifications that do almost exactly the same thing in almost
exactly the same way?
... the [HTML] Working Group explicitly [22]decided not to pick
a winner between HTML Microdata and HTML+RDFa
The question before the HTML WG at the time was whether or not to
split Microdata out of the HTML5 specification. The HTML Working
Group did not discuss whether the publishing track for the
Microdata document should be the W3C Note track or the W3C
Recommendation track. At the time the decision was made, RDFa Lite
1.1 did not exist, much less as a W3C Recommendation, nor did the
RDFa and Microdata functionality overlap as greatly as it does now.
Additionally, the HTML WG decision at that time states the
following under the "Revisiting the issue" section:
"If Microdata and RDFa converge in syntax..."
Microdata and RDFa have effectively converged in syntax. Since
Microdata can be interpreted as RDFa based on a simple
search-and-replace of attributes, the languages now differ only in
their attribute names.
The proposal is not to have work on Microdata stopped. Let work on
Microdata proceed in this group, but let it proceed on the W3C
Note publication track.
Closing Statements
I felt uneasy raising this issue because it's a touchy and painful
subject for everyone involved. Even if the discussion is painful,
it is a healthy one for a standardization body to have from time
to time. What I wanted was for the HTML Working Group to have this
discussion. If the upcoming poll finds that the consensus of the
HTML Working Group is to continue with the Microdata specification
along the Recommendation track, I will not pursue a W3C Formal
Objection. I will respect whatever decision the HTML Working Group
makes as I trust the Chairs of that group, the process that
they've put in place, and the aggregate opinion of the members in
that group. After all, that is how the standardization process is
supposed to work and I'm thankful to be a part of it.
References
1. http://lists.w3.org/Archives/Public/public-html/2012Nov/0128.html
2. http://manu.sporny.org/2012/mythical-differences/
3. http://manu.sporny.org/2011/uber-comparison-rdfa-md-uf/
4. http://xkcd.com/927/
5. http://webdatacommons.org/vocabulary-usage-analysis/index.html
6. mailto:public-html-comments@w3.org
7. http://www.w3.org/TR/rdfa-core/
8. http://blog.schema.org/2012/06/semtech-rdfa-microdata-and-more.html
9. file://localhost/tmp/mdobjection.html
10. http://www.w3.org/TR/rdfa-primer/#using-multiple-vocabularies
11. http://krijnhoetmer.nl/irc-logs/whatwg/20121128#l-1122
12. http://www.w3.org/TR/rdfa-lite/#the-attributes
13. http://events.linkeddata.org/ldow2012/papers/ldow2012-inv-paper-1.pdf
14. http://lists.w3.org/Archives/Public/public-html-comments/2011Jun/0038.html
15. http://lists.w3.org/Archives/Public/public-html/2012Nov/0180.html
16. http://lists.w3.org/Archives/Public/public-html/2012Nov/0178.html
17. http://krijnhoetmer.nl/irc-logs/whatwg/20121128#l-789
18. http://lists.w3.org/Archives/Public/public-html/2012Nov/0243.html
19. http://manu.sporny.org/2011/uber-comparison-rdfa-md-uf/
20. http://rdfa.info/play/
21. http://lists.w3.org/Archives/Public/public-html/2012Nov/0179.html
22. http://lists.w3.org/Archives/Public/public-html/2012Nov/0186.html
-- manu
--
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: HTML5 and RDFa 1.1
http://manu.sporny.org/2012/html5-and-rdfa/
Received on Monday, 3 December 2012 21:00:03 UTC