Re: Google search and Datasets

From: Dan Brickley <danbri@google.com>
Date: Wed, 5 Sep 2018 22:29:19 +0100
Message-ID: <CAK-qy=6209BKLo+Jh=ToUdTpc5eTj0=+-39jaPhru8q2dC7PBw@mail.gmail.com>
To: Rob Atkinson <rob@metalinkage.com.au>
Cc: Annette Greiner <amgreiner@lbl.gov>, Dataset Exchange Working Group <public-dxwg-wg@w3.org>, Natasha Noy <noy@google.com>
It may be that there are various notions of "profile" in play here. I'll
check in with Ed! If there are interesting quantities of data out there
expressed in DCAT-based patterns (potentially captured via shex/shacl
shapes) and if they're written in a form we extract (json-ld etc) then
there's certainly potential.  Can you give examples of any pages (rather
than the underlying specs) with the kind of dataset-describing profile you
have in mind? Re fora, I'm happy having a mail thread here until the WG
chairs nudge us to move along elsewhere :)

Dan

On Wed, 5 Sep 2018 at 22:18, Rob Atkinson <rob@metalinkage.com.au> wrote:

>
> Hi Dan, et al
>
> I spoke to Ed Parsons about this, and he advised that it was unlikely that
> any specific DCAT profiles would be supported, but my thinking is that if
> you support DCAT + some way of handling, say, statistical datasets using
> datacube - that support would actually constitute a DCAT profile logically,
> and could be described as such.
>
> Happy to work with you therefore to describe what you do support AS
> profiles, rather than push a profile at you :-)  It would make sense to
> formalise governance of geospatial data profiles via OGC - as a sub-profile
> of GeoDCAT for example, if you support GeoDCAT (????)
>
> I'm trying to track this issue across a number of statistical data fora -
> but struggling to identify a center of gravity for the discussion - do you
> have any suggestions?
>
> Rob Atkinson
>
>
> On Thu, 6 Sep 2018 at 06:33 Dan Brickley <danbri@google.com> wrote:
>
>> You beat me to it :)
>>
>> (cc:'ing Natasha Noy who led this work at Google, and who might not be
>> able to post to this list directly but I can relay any bounced posts)
>>
>> I am really happy to see this work launch and am happy to answer any
>> questions, here or offlist as folk prefer.
>>
>> Schema.org's dataset vocab is based on the core pattern from the early
>> DCAT drafts a few years ago (and so shares its strengths and weaknesses).
>> The Google implementation is based on JSON-LD, RDFa and Microdata embedded
>> in the main per-dataset pages. While we focussed more on Schema.org there
>> is some understanding of DCAT too and our support for both will hopefully
>> evolve with the ecosystem (and updated W3C specs) over time. Other
>> questions of course loom, e.g. how this relates to markup for fact
>> checking, or for describing funders and projects, specialist domains (e.g.
>> bioschemas, ...), or other W3C efforts like Data Cube and CSVW....
>>
>> Dan
>>
>> On Wed, 5 Sep 2018, 19:38 Annette Greiner, <amgreiner@lbl.gov> wrote:
>>
>>> I noticed their developer guide says "We can understand structured data
>>> in Web pages about datasets, using either schema.org Dataset markup
>>> <http://schema.org/Dataset>, or equivalent structures represented in W3C
>>> <http://www.w3.org/>'s Data Catalog Vocabulary (DCAT) format
>>> <https://www.w3.org/TR/vocab-dcat/>." :)
>>>
>>> -Annette
>>>
>>> On 9/5/18 11:16 AM, Karen Coyle wrote:
>>>
>>> "Making it easier to find datasets" at the Google Blog:
>>> https://www.blog.google/products/search/making-it-easier-discover-datasets/
>>>
>>> You may already be aware of their developer guide for datasets:
>>> https://developers.google.com/search/docs/data-types/dataset
>>>
>>> which advises the use of schema.org.
>>>
>>> Apologies if this is old news to some of you.
>>>
>>>
>>> --
>>> Annette Greiner
>>> NERSC Data and Analytics Services
>>> Lawrence Berkeley National Laboratory
>>>
>>>
>>>
Received on Wednesday, 5 September 2018 21:29:56 UTC
