Re: ACTION-94 a few thoughts

From: Phil Archer <phila@w3.org>
Date: Wed, 9 Dec 2015 15:50:02 +0000
To: Frans Knibbe <frans.knibbe@geodan.nl>, Ed Parsons <eparsons@google.com>, Bill Roberts <bill@swirrl.com>
Cc: "public-sdw-wg@w3.org" <public-sdw-wg@w3.org>
Message-ID: <56684DAA.7010805@w3.org>


On 09/12/2015 15:20, Frans Knibbe wrote:
> Could it be a problem that Google does not crawl/index Turtle? Are there
> cases conceivable where content negotiation with HTML as one of the supported
> formats is hard to achieve?
>
> I assume that somewhere in the Best Practices it will say: make your data
> available in several formats,

It says:
http://w3c.github.io/dwbp/bp.html#MultipleFormats

Which may or may not be enough here. It doesn't explicitly say that you
should provide HTML or JSON (it actually gives CSV, RDF and XML as the
examples), but the document just pointed to will be published as the next
working draft, subject to the WG so resolving on Friday afternoon.
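
For illustration only, here is a minimal sketch of how a server might
choose between several formats via content negotiation. The function
names (parse_accept, negotiate) and the SUPPORTED list are assumptions
made for this example, not anything taken from the Best Practices
document, and the matching is deliberately simpler than full HTTP
content negotiation (it ignores media type parameters and specificity
rules).

```python
# Hypothetical list of formats a data publisher might offer.
SUPPORTED = ["text/html", "application/json", "text/turtle"]


def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's original order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept_header, supported=SUPPORTED, default="text/html"):
    """Return the supported media type that best matches the Accept header.

    Returns None when nothing matches (a server might then answer 406).
    """
    if not accept_header:
        return default
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return default
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # "text/*" becomes "text/"
            for candidate in supported:
                if candidate.startswith(prefix):
                    return candidate
        elif media_type in supported:
            return media_type
    return None


if __name__ == "__main__":
    print(negotiate("text/turtle"))
    print(negotiate("text/html,application/xml;q=0.9,*/*;q=0.8"))
```

With HTML as the default, a client (or crawler) that sends no Accept
header, or only a wildcard, would receive the HTML representation.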



> at least JSON (because that is what web
> applications like) and HTML (because that is what humans and search engines
> like).
>
> Regards,
> Frans
>
> 2015-12-09 14:46 GMT+01:00 Ed Parsons <eparsons@google.com>:
>
>> Not yet..
>>
>> Ed
>>
>> On Wed, 9 Dec 2015 13:21 Bill Roberts <bill@swirrl.com> wrote:
>>
>>> (sorry, by 'Turtle file' I really meant an HTTP response with
>>> text/turtle content-type)
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: Bill Roberts <bill@swirrl.com>
>>> Date: 9 December 2015 at 13:19
>>> Subject: Re: ACTION-94 a few thoughts
>>> To: Ed Parsons <eparsons@google.com>, "public-sdw-wg@w3.org" <
>>> public-sdw-wg@w3.org>
>>>
>>>
>>> Ed - do Google crawlers look at the contents of, e.g., a Turtle file at all?
>>>
>>> On 9 December 2015 at 12:48, Ed Parsons <eparsons@google.com> wrote:
>>>
>>>> +1, especially "HTML representation should be optimised for indexing -
>>>> it should embed the metadata themselves, as RDFa, Microformats, etc."
>>>>
>>>> ed
>>>>
>>>> On Wed, 9 Dec 2015 at 12:38 Andrea Perego <
>>>> andrea.perego@jrc.ec.europa.eu> wrote:
>>>>
>>>>> On 02/12/2015 17:49, Jeremy Tandy wrote:
>>>>>
>>>>>> [snip]
>>>>>>
>>>>>> FWIW, note that the catalogue discovery mode (search for the record,
>>>>>> read the record to find the access point, query the access point) is
>>>>>> covered by the DWBP. Furthermore, I'd be bold enough to say that
>>>>>> data that's accessed only from an opaque service endpoint is not
>>>>>> really on the web. I think to be "on the web" the data needs to be
>>>>>> visible to (and crawlable by) search engines.
>>>>>
>>>>> I tend to share Jeremy's concern.
>>>>>
>>>>> I see three main requirements / recommendations here:
>>>>>
>>>>> 1. HTML should be supported, via HTTP conneg, as an alternative format
>>>>> for CSW output (metadata records and, possibly, also service
>>>>> capabilities).
>>>>>
>>>>> 2. This HTML representation should be optimised for indexing - it should
>>>>> embed the metadata themselves, as RDFa, Microformats, etc.
>>>>>
>>>>> 3. Metadata records should use HTTP URIs to enable link crawling.
>>>>>
>>>>>
>>>>> About (1) & (2), this is actually related to UCR #4.43:
>>>>>
>>>>> http://www.w3.org/TR/sdw-ucr/#ImprovingDiscoveryOfSpatialDataOnTheWeb
>>>>>
>>>>> And this is what has been done, e.g., in the GeoDCAT-AP API, which is
>>>>> able to return CSW records in different RDF serialisations, including
>>>>> HTML+RDFa - see, e.g.:
>>>>>
>>>>>
>>>>> http://geodcat-ap.semic.eu:8890/api/?outputSchema=extended&src=http%3A%2F%2Fsdi.eea.europa.eu%2Fcatalogue%2Fsrv%2Feng%2Fcsw%3Frequest%3DGetRecords%26service%3DCSW%26version%3D2.0.2%26namespace%3Dxmlns%2528csw%3Dhttp%3A%2F%2Fwww.opengis.net%2Fcat%2Fcsw%2529%26resultType%3Dresults%26outputSchema%3Dhttp%3A%2F%2Fwww.isotc211.org%2F2005%2Fgmd%26outputFormat%3Dapplication%2Fxml%26typeNames%3Dcsw%3ARecord%26elementSetName%3Dfull%26constraintLanguage%3DCQL_TEXT%26constraint_language_version%3D1.1.0%26maxRecords%3D20&outputFormat=text%2Fhtml
>>>>>
>>>>> About (3), this can be partially addressed by mapping, e.g., ISO code
>>>>> list values to URIs, but it eventually requires HTTP URIs to be used in
>>>>> the original records.
>>>>>
>>>>> Andrea
>>>>>
>>>>>
>>>>> --
>>>>
>>>> *Ed Parsons*
>>>> Geospatial Technologist, Google
>>>>
>>>> Google Voice +44 (0)20 7881 4501
>>>> www.edparsons.com @edparsons
>>>>
>>>
>>>
>>> --
>>
>> *Ed Parsons*
>> Geospatial Technologist, Google
>>
>> Google Voice +44 (0)20 7881 4501
>> www.edparsons.com @edparsons
>>
>

-- 


Phil Archer
W3C Data Activity Lead
http://www.w3.org/2013/data/

http://philarcher.org
+44 (0)7887 767755
@philarcher1
Received on Wednesday, 9 December 2015 15:50:14 UTC
