Re: RDF 1.1 Lite Issue # 2: property vs rel from Manu Sporny on 2011-10-23 (public-vocabs@w3.org from October 2011)

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Sat, 22 Oct 2011 21:30:49 -0400
To: public-vocabs@w3.org
Message-ID: <4EA36E49.4050309@digitalbazaar.com>

On 10/22/2011 05:36 PM, Guha wrote:
> We will look into sharing what we can. We have on a number of
> occasions shared aggregate data. It is not clear we are in a position
> to share detailed information about other people's websites. You are
> of course welcome to do the analysis yourself.

We have been doing analysis, but not on the scale that Google has been
doing. When new sites launch with RDFa support, we tend to look at them
and see if there are markup errors or places where we should change the
spec to match author expectations. Most of our work on RDFa 1.1 has
focused on making the authoring experience easier.

I would also add that we do not have the resources to do the type of
crawl that you are asking us to do - only the search companies do. If
you're unable to share that data, it will be very difficult to
understand whether the changes we might make would actually address the
problem.

So, we're left in a catch-22 situation. If we don't change the spec in
the way that Google wants us to change it, which is counter to our
experience to date, you're saying that Google won't support it. However,
if we do change it in the way that Google wants us to change it, but it
breaks backwards compatibility /and/ doesn't correct the authoring
mistakes, we have failed to deliver on a technology that works for
authors /and/ we've broken all of the pre-existing RDFa markup on the Web.

So, as you can see - that data is incredibly important to all of us when
trying to get the authoring experience as smooth as it can be for RDFa.

>> I will also note that this particular data was never brought to the
>> attention of the RDFa Working Group. When did you know about these
>> errors? Why did you not share the data when you came across it? I
>> ask because it would've impacted the design of RDFa 1.1 if you had
>> shared this data with us at the time.
>
> Manu, I think you are missing something here. We have communicated
> this information, many times, in one-one meetings with Ben Adida and
>  others as we were working on developing microdata.

Let me clarify further, as I was intentionally very precise in the
wording that I used in my initial question. I was talking about "data",
not the more informal "information". The values that you quoted from
your data - "3 times" as many errors, "40% caused by" @property/@rel,
etc. That is the first time that I am seeing those numbers. A number of
people have claimed this to be an issue, people say that "there were
studies", but each time that we have asked for the data, no data was
provided. Now, we may have missed something and if we have - please
point us to the public discussion where these numbers were quoted.

Since these claims ran counter to the RDFa WG's experiences (which
included RDFa community implementation experiences), and since we did
not see any large mis-implementation pattern for @property/@rel in the
sites that we were analyzing, it was impossible for us to validate what
to change to address these claims.

That is, we cannot be scientific about this if there is no data.
Additionally, we need some ammunition if we're going to go to W3C and
say that we intend to break backwards compatibility for @property and
@rel in a big way. /How/ we break backwards compatibility is important,
and knowing exactly how to break it requires us to analyze the data that
Google has in its possession.

> At the end of the day, it was negligence on the part of the folks
> designing RDFa 1.1 to not actively seek input from some biggest
> consumers of RDFa.

I am going to quote from e-mails that I have sent in the past to Othar
Hansson, Kavi Goel and you:

2009-10-28 (to: Othar, Kavi, Guha)
"""
We are currently seeking feedback, support and participation in the
Working Group from companies that have implemented RDFa as a part of
their deployed infrastructure.

It would be an understatement to say that we were overjoyed when we
heard about RDFa support at Google via Rich Snippets. We'd like to
further extend our desire to support Google's use of RDFa and ask that
Google take part in the RDFa Working Group.
"""

2009-11-03 (to: Othar, Kavi, Guha)
"""
We'd really love it if Google was more directly involved.
...
We would also prefer that Google participate in some capacity (even if
it's not ideal) than not participate at all.
"""

2009-12-03 (to: Othar, Kavi, Guha)
"""
... if one of you, or somebody else from Google, could join us on a
couple of telecons and provide your input to the RDFa WG ...
"""

2010-05-20 (to: Othar)
"""
... we'd love to have you, or someone from your Rich Snippets team in
the RDFa Working Group. We are several months from our Last Call
deadline for RDFa 1.1, so there is still time to affect the
specification and ensure that we're taking Google's needs into account.
"""

All that said, I will take your feedback back to the RDFa WG and raise
it there as we take all input like this /very seriously/. We will
publicly ask the community if there was any negligence on our part and
see if there are more organizations or individuals that feel as if we
did not consider their input.

>> A list of URLs would be great along with a technical analysis of
>> all of those URLs. Specifically, the following data would be very
>> helpful:
>
> Google DOES NOT provide lists of URLs to anyone. You are welcome to
> go crawl the web.

It's going to be very difficult to compare data if we don't know which
URLs Google was analyzing.

>> * How frequent was the use of @rel vs. the use of @property?
>>
>> * When @rel was used, was it used in chaining or was it used to
>> simply refer to an external resource?
>
> We don't recommend chaining. Almost no one producing markup with rich
> snippets uses external resources.

Let's define "external resource" as "something identified by an IRI that
doesn't point to something on the current page".

Are you saying that people don't link to images outside of a page? What
about "schema:image" or "schema:url"?

>> * In the Microformats and Creative Commons cases (rel="license",
>> rel="tag", etc.) did people get @rel wrong?
>
> You should ask them.

I just happen to have been a very active member in the Microformats
community and not once do I remember people mis-using/abusing @rel on a
large scale. I'll ask CC, but I don't think that people have been
stuffing up rel="license" either - at least, if they have been, they
didn't say anything about it to us.

>> * How frequently does @rel and @property exist on the same
>> element?
>>
> In the vocabulary we specified, never.

Do you mean for the original Rich Snippets "v" vocabulary or the
"schema" vocabulary?

>> * How frequently is @property used when @rel should have been used
>>  instead?
>>
> Don't have the numbers, but it was pretty random. You have to
> understand that at anything more than a few percent error rate, the
> data becomes largely unusable in scale.

Numbers would be good, but raw data would be better.

>> * How frequently is @rel used when @property should have been used
>>  instead?
>>
> I will look into doing this analysis, but am not sure when we will be
> able to get around to this.

Take a step back and look at what you're asking us to do:

The RDFa WG has addressed every last major technical issue with the RDFa
1.1 specs as of last week. We are ready to take the document to our 3rd
Last Call. We scuttled our 2nd Last Call because of the schema.org
announcement - to buy more time so that we could find out why Google
decided to not support RDFa when it was already supporting RDFa in Rich
Snippets.

These issues, which go against our implementation experience, are just
now being raised on this mailing list, with no public data to back up
the claims.

We are open to making changes if we can see exactly how to address the
issue, but without data, we cannot make a sensible decision. You are now
telling us that you will look into doing the analysis (which I thought
was already done) but that there is no timeframe for completing the work.
This is effectively asking the RDFa WG to wait indefinitely until your
group publishes the analysis.

So, if you were in our position, with the information above - what would
you do?

>> Who is "we" in this case? The RDFa WG does not want to get into a
>> theoretical debate either. We care about authors easily generating
>> good, valid data.
>
> We = Google, Schema.org. Us = Google, Schema.org

I'm still confused. I was under the impression that schema.org was a
joint project between Microsoft, Google and Yahoo? That is, when you say
"We" is Google and then you say "We" is also schema.org - does that mean
that you are speaking for Google, Microsoft and Yahoo (but only for
schema.org)?

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
Founder/CEO - Digital Bazaar, Inc.
blog: Standardizing Payment Links - Why Online Tipping has Failed
http://manu.sporny.org/2011/payment-links/
Received on Sunday, 23 October 2011 01:31:18 UTC