- From: Manu Sporny <msporny@digitalbazaar.com>
- Date: Sat, 22 Oct 2011 21:30:49 -0400
- To: public-vocabs@w3.org
On 10/22/2011 05:36 PM, Guha wrote:
> We will look into sharing what we can. We have on a number of
> occasions shared aggregate data. It is not clear we are in a position
> to share detailed information about other people's websites. You are
> of course welcome to do the analysis yourself.

We have been doing analysis, but not on the scale that Google has been doing. When new sites launch with RDFa support, we tend to look at them and see if there are markup errors or places where we should change the spec to match author expectations. Most of our work on RDFa 1.1 has focused on making the authoring experience easier.

I would also add that we do not have the resources to do the type of crawl that you are asking us to do - only the search companies do and if you're unable to share that data, it will be very difficult to understand if the changes we might make would actually address the problem.

So, we're left in a catch-22 situation. If we don't change the spec in the way that Google wants us to change it, which is counter to our experience to date, you're saying that Google won't support it. However, if we do change it in the way that Google wants us to change it, but it breaks backwards compatibility /and/ doesn't correct the authoring mistakes, we have failed to deliver on a technology that works for authors /and/ we've broken all of the pre-existing RDFa markup on the Web.

So, as you can see - that data is incredibly important to all of us when trying to get the authoring experience as smooth as it can be for RDFa.

>> I will also note that this particular data was never brought to the
>> attention of the RDFa Working Group. When did you know about these
>> errors? Why did you not share the data when you came across it? I
>> ask because it would've impacted the design of RDFa 1.1 if you had
>> shared this data with us at the time.
>
> Manu, I think you are missing something here.
> We have communicated this information, many times, in one-one
> meetings with Ben Adida and others as we were working on developing
> microdata.

Let me clarify further, as I was intentionally very precise in the wording that I used in my initial question. I was talking about "data", not the more informal "information". The values that you quoted from your data - "3 times" as many errors, "40% caused by" @property/@rel, etc. That is the first time that I am seeing those numbers. A number of people have claimed this to be an issue, people say that "there were studies", but each time that we have asked for the data, no data was provided. Now, we may have missed something and if we have - please point us to the public discussion where these numbers were quoted.

Since these claims ran counter to the RDFa WG's experiences (which included RDFa community implementation experiences), and since we did not see any large mis-implementation pattern for @property/@rel in the sites that we were analyzing, it was impossible for us to validate what to change to address these claims. That is, we cannot be scientific about this if there is no data.

Additionally, we need some ammunition if we're going to go to W3C and say that we intend to break backwards compatibility for @property and @rel in a big way. /How/ we break backwards compatibility is important and to know exactly how to break it requires us to analyze the data that Google has in its possession.

> At the end of the day, it was negligence on the part of the folks
> designing RDFa 1.1 to not actively seek input from some biggest
> consumers of RDFa.

I am going to quote from e-mails that I have sent in the past to Othar Hansson, Kavi Goel and you:

2009-10-28 (to: Othar, Kavi, Guha)
"""
We are currently seeking feedback, support and participation in the Working Group from companies that have implemented RDFa as a part of their deployed infrastructure.
It would be an understatement to say that we were overjoyed when we heard about RDFa support at Google via Rich Snippets. We'd like to further extend our desire to support Google's use of RDFa and ask that Google take part in the RDFa Working Group.
"""

2009-11-03 (to: Othar, Kavi, Guha)
"""
We'd really love it if Google was more directly involved. ... We would also prefer that Google participate in some capacity (even if it's not ideal) than not participate at all.
"""

2009-12-03 (to: Othar, Kavi, Guha)
"""
... if one of you, or somebody else from Google, could join us on a couple of telecons and provide your input to the RDFa WG ...
"""

2010-05-20 (to: Othar)
"""
... we'd love to have you, or someone from your Rich Snippets team in the RDFa Working Group. We are several months from our Last Call deadline for RDFa 1.1, so there is still time to affect the specification and ensure that we're taking Google's needs into account.
"""

All that said, I will take your feedback back to the RDFa WG and raise it there as we take all input like this /very seriously/. We will publicly ask the community if there was any negligence on our part and see if there are more organizations or individuals that feel as if we did not consider their input.

>> A list of URLs would be great along with a technical analysis of
>> all of those URLs. Specifically, the following data would be very
>> helpful:
>
> Google DOES NOT provide lists of URLs to anyone. You are welcome to
> go crawl the web.

It's going to be very difficult to compare data if we don't know which URLs Google was analyzing.

>> * How frequent was the use of @rel vs. the use of @property?
>>
>> * When @rel was used, was it used in chaining or was it used to
>> simply refer to an external resource?
>
> We don't recommend chaining. Almost no one producing markup with rich
> snippets uses external resources.

Let's define "external resource" as "something identified by an IRI that doesn't point to something on the current page". Are you saying that people don't link to images outside of a page? What about "schema:image" or "schema:url"?

>> * In the Microformats and Creative Commons cases (rel="license",
>> rel="tag", etc.) did people get @rel wrong?
>
> You should ask them.

I just happen to have been a very active member in the Microformats community and not once do I remember people mis-using/abusing @rel on a large scale. I'll ask CC, but I don't think that people have been stuffing up rel="license" either - at least, if they have been, they didn't say anything about it to us.

>> * How frequently does @rel and @property exist on the same
>> element?
>>
> In the vocabulary we specified, never.

Do you mean for the original Rich Snippets "v" vocabulary or the "schema" vocabulary?

>> * How frequently is @property used when @rel should have been used
>> instead?
>>
> Don't have the numbers, but it was pretty random. You have to
> understand that at anything more than a few percent error rate, the
> data becomes largely unusable in scale.

Numbers would be good, but raw data would be better.

>> * How frequently is @rel used when @property should have been used
>> instead?
>>
> I will look into doing this analysis, but am not sure when we will be
> able to get around to this.

Take a step back and look at what you're asking us to do:

The RDFa WG has addressed every last major technical issue with the RDFa 1.1 specs as of last week. We are ready to take the document to our 3rd Last Call. We scuttled our 2nd Last Call because of the schema.org announcement - to buy more time so that we could find out why Google decided to not support RDFa when it was already supporting RDFa in Rich Snippets. These issues are just now being raised on this mailing list, which go against our implementation experience, but with no public data to back up the claims.
We are open to making changes if we can see exactly how to address the issue, but without data, we cannot make a sensible decision. You are now telling us that you will look into doing the analysis (which I thought was already done) but that there is no timeframe on completing the work. This is effectively asking the RDFa WG to wait indefinitely until your group publishes the analysis.

So, if you were in our position, with the information above - what would you do?

>> Who is "we" in this case? The RDFa WG does not want to get into a
>> theoretical debate either. We care about authors easily generating
>> good, valid data.
>
> We = Google, Schema.org. Us = Google, Schema.org

I'm still confused. I was under the impression that schema.org was a joint project between Microsoft, Google and Yahoo? That is, when you say "We" is Google and then you say "We" is also schema.org - does that mean that you are speaking for Google, Microsoft and Yahoo (but only for schema.org)?

-- manu

--
Manu Sporny (skype: msporny, twitter: manusporny)
Founder/CEO - Digital Bazaar, Inc.
blog: Standardizing Payment Links - Why Online Tipping has Failed
http://manu.sporny.org/2011/payment-links/
Received on Sunday, 23 October 2011 01:31:18 UTC