Re: Index of Types -> Domains

Speaking not to where the domain usage counts come from (Dan
Brickley might be able to address that), but this specifically:

> "... examples of domains implementing a particular Type or format in
> their pages"

While the last structured data extraction was generated from a crawl now
more than a year old, the Web Data Commons project
does make statistics available on "RDFa, Microdata, Embedded JSON-LD, and
Microformats" found in each crawl, as well as the full corpus.  You can
find the most recent data here:

Web Data Commons Extraction Report - October 2016 Corpus
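
For what it's worth, here is a rough sketch of how a Type -> Domains index
could be built from that kind of extracted data. It assumes the extraction
files are N-Quads, with the fourth element of each quad being the URL of
the page the triple was found on. The function name and the line-matching
pattern are my own, purely for illustration, and the pattern is a
simplification rather than a full N-Quads parser.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

# Simplified quad pattern: subject, predicate, object, <graph IRI>, then " ."
# Assumption: the graph IRI is the URL of the page the triple came from.
QUAD = re.compile(r'^(\S+)\s+(\S+)\s+(.+?)\s+<([^<>\s]+)>\s*\.\s*$')


def index_types_to_domains(lines):
    """Map each schema.org type IRI to the set of hosts that use it."""
    index = defaultdict(set)
    for line in lines:
        match = QUAD.match(line.strip())
        if not match:
            continue  # skip blank or malformed lines
        _subject, predicate, obj, page_url = match.groups()
        if predicate != RDF_TYPE:
            continue  # only rdf:type statements say which Type is used
        if not (obj.startswith("<") and obj.endswith(">")):
            continue  # the object of rdf:type should be an IRI
        type_iri = obj[1:-1]
        if "schema.org/" not in type_iri:
            continue
        # netloc is the host name (e.g. www.example.com), not the
        # registered domain; collapsing hosts would need a suffix list.
        host = urlparse(page_url).netloc.lower()
        if host:
            index[type_iri].add(host)
    return index
```

Since the statistics are broken out by syntax (RDFa, Microdata, embedded
JSON-LD, Microformats), running something like this separately over the
data for each syntax would presumably also tell you which implementation
a given site uses.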

If you have the desire to do so, you can also access and analyze Common
Crawl data yourself (data sets are generated monthly); learn more here:

On Fri, Nov 3, 2017 at 11:14 AM, David Pierce <>

> I've seen this come up a bit in the SEO community where webmasters and
> researchers are trying to look for examples of domains implementing a
> particular Type or format in their pages.
> Moreover, when I look at the documentation for any given type--Product
> <>, for example--I see a note about its usage: *Usage:
> Over 1,000,000 domains*
> How is this calculated, where is it stored, and how might I make use of it?
> Has an index of Types to Domains already been built?
> If not, has anybody explored building one? At the outset, I imagine
> something complementing the aforementioned Type documentation. For each
> type, I want to see what sites include that in their markup,
> and what implementation they use (microdata vs. json+ld).
> If this has already been built, I'd love to make use of it for my own
> learning. If it hasn't...well, I suppose I might start working on building
> it.

Received on Friday, 3 November 2017 19:13:59 UTC