Re: Meaningless: towards a real-world web semantics observatory

From: Noah Mendelsohn <nrm@arcanedomain.com>
Date: Wed, 27 Mar 2013 10:47:41 -0400
Message-ID: <5153068D.7000803@arcanedomain.com>
To: Alex Russell <slightlyoff@google.com>
CC: "www-tag@w3.org List" <www-tag@w3.org>, Tantek Çelik <tcelik@mozilla.com>
(Leaving off most of the cc: list to avoid cross-posted discussion. Nothing 
sensitive here -- feel free to forward if useful.)

This looks very cool. Would it be 
easy/reasonable/in-the-spirit-of-the-thing to extend it to start gathering 
statistics on JSON, XML, various forms of RDF, RDFa, etc.? For that matter, 
it would also be >really< interesting to watch things like content that 
will be interpreted differently by the HTML5 sniffing rules than by 
following authoritative metadata.

In general, you seem to be on a very nice slippery slope of building a 
dashboard for the Web's data/content encoding. Are you interested in 
heading further down the slope?

Noah

On 3/27/2013 9:58 AM, Alex Russell wrote:
> Hi all,
>
> These lists host many debates about the semantics (or lack thereof) of
> HTML. Good data that bears on these questions is often hard to come by.
> This isn't anyone's fault per se, but it sure would be nice if we had
> better data to use as the baseline for discussions about what should (and
> shouldn't) be in HTML.next.
>
> In the interest of building such a corpus, I've created a small extension
> to help gather information on the real-world semantics that users encounter
> on the web: both semantic HTML and extensions to it like Microformats,
> schema.org <http://schema.org> markup, and ARIA roles and states. Crawlers
> miss a lot as they (generally) aren't running scripts and interacting
> deeply with sites, so this anonymizing system attempts to fill that gap by
> observing the semantic content of pages both when they load and as they
> change over time.
>
> Why cross-post this so broadly? Because I need your help! If you think
> evolving the web based on data is better than trying to do it without and
> you happen to use Chrome as your browser, please install the extension:
>
> https://chrome.google.com/webstore/detail/meaningless/gmmhpelpfhlofjjolcegdddjadkmincn/details
>
> If you're a developer and use another browser, I'd love your help in
> porting the extension to other platforms (FF, Safari, etc.):
>
> https://github.com/slightlyoff/meaningless
>
> If you're interested in the data, a sparse reporting front-end is currently
> in place:
>
> http://meaningless-stats.appspot.com/global
>
> Help is needed to analyze the data in more meaningful ways, visualize it,
> etc. Filing tickets and submitting pull requests are the easiest ways to
> help: https://github.com/slightlyoff/meaningless/issues
>
> Thanks for your help and attention.
Received on Wednesday, 27 March 2013 14:48:10 UTC
