Re: ACTION-94 a few thoughts

Could it be a problem that Google does not crawl/index Turtle? Are there
conceivable cases where content negotiation with HTML as one of the supported
formats is hard to achieve?

I assume that somewhere in the Best Practices it will say: make your data
available in several formats, at least JSON (because that is what web
applications like) and HTML (because that is what humans and search engines
like).
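
To make the conneg idea concrete, here is a rough sketch in Python of how a
server might pick one of those formats from the Accept header. The function
names, the list of supported types and the fallback to HTML are my own
assumptions, not anything taken from the Best Practices, and the parsing is
deliberately simplified (only */* as a wildcard, no specificity rules).

```python
# Hypothetical list of representations a data server might offer.
SUPPORTED = ["text/html", "application/json", "text/turtle"]


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, in header order."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q value as "not acceptable"
        entries.append((media_type, q))
    return entries


def choose_format(accept_header, supported=SUPPORTED, default="text/html"):
    """Pick the supported media type with the highest q value.

    Falls back to `default` when the header is empty or when */* wins.
    Returns None if the client accepts nothing we can serve.
    """
    if not accept_header or not accept_header.strip():
        return default
    best, best_q = None, 0.0
    for media_type, q in parse_accept(accept_header):
        if media_type == "*/*":
            candidate = default
        elif media_type in supported:
            candidate = media_type
        else:
            continue
        # Strict comparison: on a tie the earlier entry in the header wins.
        if q > best_q:
            best, best_q = candidate, q
    return best
```

With this sketch a client asking only for text/turtle gets Turtle, while a
request that lists text/html first, or sends no Accept header at all, gets
HTML, which is the representation a search engine could index.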

Regards,
Frans

2015-12-09 14:46 GMT+01:00 Ed Parsons <eparsons@google.com>:

> Not yet..
>
> Ed
>
> On Wed, 9 Dec 2015 13:21 Bill Roberts <bill@swirrl.com> wrote:
>
>> (sorry, by 'Turtle file' I really meant an HTTP response with
>> text/turtle content-type)
>>
>>
>> ---------- Forwarded message ----------
>> From: Bill Roberts <bill@swirrl.com>
>> Date: 9 December 2015 at 13:19
>> Subject: Re: ACTION-94 a few thoughts
>> To: Ed Parsons <eparsons@google.com>, "public-sdw-wg@w3.org" <
>> public-sdw-wg@w3.org>
>>
>>
>> Ed - do Google crawlers look at all at contents of eg a Turtle file?
>>
>> On 9 December 2015 at 12:48, Ed Parsons <eparsons@google.com> wrote:
>>
>>> +1 especially "HTML representation should be optimised for indexing -
>>> it should embed the metadata themselves, as RDFa, Microformats, etc."
>>>
>>> ed
>>>
>>> On Wed, 9 Dec 2015 at 12:38 Andrea Perego <
>>> andrea.perego@jrc.ec.europa.eu> wrote:
>>>
>>>> On 02/12/2015 17:49, Jeremy Tandy wrote:
>>>>
>>>> > [snip]
>>>> >
>>>> > FWIW, note that the catalogue discovery mode (search for the record,
>>>> >  read the record to find the access point. query the access point) is
>>>> >  covered by the DWBP. Furthermore, I'd be bold enough to say that
>>>> > data that's accessed only from an opaque service endpoint is not
>>>> > really on the web. I think to be "on the web" the data needs to be
>>>> > visible to (and crawlable by) search engines.
>>>>
>>>> I tend to share Jeremy's concern.
>>>>
>>>> I see three main requirements / recommendations here:
>>>>
>>>> 1. HTML should be supported, via HTTP conneg, as an alternative format
>>>> for CSW output (metadata records and, possibly, also service
>>>> capabilities).
>>>>
>>>> 2. This HTML representation should be optimised for indexing - it should
>>>> embed the metadata themselves, as RDFa, Microformats, etc.
>>>>
>>>> 3. Metadata records should use HTTP URIs to enable link crawling.
>>>>
>>>>
>>>> About (1) & (2), this is actually related to UCR #4.43:
>>>>
>>>> http://www.w3.org/TR/sdw-ucr/#ImprovingDiscoveryOfSpatialDataOnTheWeb
>>>>
>>>> And this is what has been done, e.g., in the GeoDCAT-AP API, which is
>>>> able to return CSW records in different RDF serialisations, including
>>>> HTML+RDFa - see, e.g.:
>>>>
>>>>
>>>> http://geodcat-ap.semic.eu:8890/api/?outputSchema=extended&src=http%3A%2F%2Fsdi.eea.europa.eu%2Fcatalogue%2Fsrv%2Feng%2Fcsw%3Frequest%3DGetRecords%26service%3DCSW%26version%3D2.0.2%26namespace%3Dxmlns%2528csw%3Dhttp%3A%2F%2Fwww.opengis.net%2Fcat%2Fcsw%2529%26resultType%3Dresults%26outputSchema%3Dhttp%3A%2F%2Fwww.isotc211.org%2F2005%2Fgmd%26outputFormat%3Dapplication%2Fxml%26typeNames%3Dcsw%3ARecord%26elementSetName%3Dfull%26constraintLanguage%3DCQL_TEXT%26constraint_language_version%3D1.1.0%26maxRecords%3D20&outputFormat=text%2Fhtml
>>>>
>>>> About (3), this can be partially addressed by mapping, e.g., ISO code
>>>> list values to URIs, but it eventually requires HTTP URIs to be used in
>>>> the original records.
>>>>
>>>> Andrea
>>>>
>>>>
>>>> --
>>>
>>> *Ed Parsons*
>>> Geospatial Technologist, Google
>>>
>>> Google Voice +44 (0)20 7881 4501
>>> www.edparsons.com @edparsons
>>>
>>
>>

Received on Wednesday, 9 December 2015 15:20:58 UTC