Re: BP comments

Hi Jeremy,

many thanks for your prompt and diligent processing of my comments!

Just two remarks:

>> I also note that in the world of the web developers the term "Structured Data" seems to
>> be used quite frequently, but it is not used in this document except for crowd-sourced data (?).
> We use the term a few times elsewhere ... Regarding the BP17 for crowd-sourced data,
> this needs quite a lot of work at the moment, as captured in ISSUE #220<https://github.com/w3c/sdw/issues/220>

I only found "structured data" in BP17, hence the comment. "Structured geometry", "structured markup", "structured metadata", etc. are also used in other places, but the relationship between the terms was not clear to me. But this was a minor comment, mostly in the context of the "Linked Data approach" discussion, and I think we can consider it closed.

>> "increasing usefullness and cost" - I think "decreasing" is meant?
> It seems correct to me as currently written; bulk download is less useful and cheaper to
> implement than a bespoke Web service API tailored to a given task. I wrote the text; I
> might be missing something though ...

You are of course correct! Sorry for this - the only excuse I can think of is that it must have been too late when I got that far in the document...

Thanks again,
Clemens


On 14 Jan 2016, at 18:42, Jeremy Tandy <jeremy.tandy@gmail.com> wrote:

Hi Clemens. Your comments are now processed. Specific points below ...

> Terminology
> I think that has been raised by others, too, but right now I see at least "spatial data", "spatial information",  "geospatial data", "geographic information" and "geospatial information". Probably we need only two of those, although from the document it is also unclear that there is a distinction between "spatial" and "geospatial" and what that distinction is.

I've added your comment to ISSUE #208<https://github.com/w3c/sdw/issues/208>

> References
> Why do some references go to the glossary (eg WFS) and some to the references (eg SPARQL)? Maybe WFS etc should be added to the references, too, and the text should include a link both to the glossary and the references?

I've created a new ISSUE (#222<https://github.com/w3c/sdw/issues/222>) so that we don't forget to fix this.

> Section 1.1
> "Analysis of the requirements derived from scenarios that describe how spatial data is commonly published and used on the Web (as documented in [UCR]) indicates that, in contrast to the workings of a typical SDI, the Linked Data approach is most appropriate for publishing and using spatial data on the Web. Linked Data provides a foundation to many of the best practices in this document."
>
> Where is that analysis documented and the practical evidence underpinning? Or is it more of an opinion?

I've added this comment to ISSUE #218<https://github.com/w3c/sdw/issues/218>.

> I also note that in the world of the web developers the term "Structured Data" seems to be used quite frequently, but it is not used in this document except for crowd-sourced data (?).

We use the term a few times elsewhere ... Regarding the BP17 for crowd-sourced data, this needs quite a lot of work at the moment, as captured in ISSUE #220<https://github.com/w3c/sdw/issues/220>

> Another comment related to the references to "typical SDIs" is that to me the wording in the document sometimes gives the impression that by design SDIs do not fit what is the Web today. While that is true (to me BP25 is a key point here among other aspects), I think this is only part of the story. SDIs offer the capability for linking data and for expressing models and semantics, but quite often these capabilities are not used and data is often published like GIS data more or less before we had the Web. One could in principle implement many of the BPs in current SDIs, too. Or publish spatial data using RDF technologies in ways that wouldn’t be very Webby either. To me, the potential value of the BP document is in helping to establish an understanding of what aspects are key with respect to putting and using spatial data *on* the Web. Specific technologies also play a role here and we need to cover them as well, in particular in the examples, but it should be the principles that matter most. There will always be a mix in technologies that will be used, often for good reasons.

Another excellent point. I've captured this in a new ISSUE (#223<https://github.com/w3c/sdw/issues/223>): "Need to illustrate role of SDIs in publishing spatial data on the Web"

> BP1
> "much like how Twitter's hashtags are created dynamically"
>
> This is not a good example in this BP as hashtags are obviously not "unambiguously identifying" a Thing. #ogc in tweets references the French football club OGC Lille, the Open Geospatial Consortium and other things.

I've removed the reference.

> BP2
> The text seems to largely discuss something else than the title suggests. The text mostly seems to say that one should link resources to well-known resources, not to reuse other URIs as the subject URI of my own statements. For example, the URI in the example (http://www.gemeentegeschiedenis.nl/gemeentenaam/Amsterdam) *is* a new URI.

You're right ... I didn't spot that. I've updated the "Why" section to more accurately explain the intent.

> BP3, BP23, etc.
> These BPs could be read to recommend to create and maintain explicit links to all other (related) datasets in the Web? That clearly does not scale, so I think we should be more clear on what we are considering good practice.

BP3 has been reworded and now has a change of emphasis. Your comment is still of relevance to BP23, so I have created an ISSUE (#224<https://github.com/w3c/sdw/issues/224>) in the document: "Maintaining links to *all* related resources doesn't scale. Redraft required."

> BP3
> This BP seems to recommend to establish links between entity-level resources across datasets, but the title of the BP is "Working with data that lacks globally unique identifiers for entity-level resources"?

I think that I've fixed this already in response to other comments ...

> 6.2
> The LD-BP reference makes this very RDF dependent. Is there a need / justification for this? Are we saying that RDF is the only recommended way to publish data and models on the Web? I know some will say "yes", but if we target web developers I would guess that many do not care about RDF vocabularies and maybe they prefer a Swagger document (just to pick an example)?

Another excellent point. ISSUE #225<https://github.com/w3c/sdw/issues/225> created: "We must avoid being overly focused on RDF"

> BP27
> "increasing usefullness and cost" - I think "decreasing" is meant?

It seems correct to me as currently written; bulk download is less useful and cheaper to implement than a bespoke Web service API tailored to a given task. I wrote the text; I might be missing something though ...

> BP28, BP29, BP30
> What is the spatial aspect of these BPs? These are more general Data on the Web recommendations. I see that we may want to discuss examples in the spatial context, but maybe we should shorten the general discussion and simply reference the related Data on the Web best practices and focus in our document on the spatial aspects?

There is potential scope creep here; I've already captured this in ISSUE #187<https://github.com/w3c/sdw/issues/187>; I've added your comment to the discussion.

Thanks very much for your insight ...

BR, Jeremy

On Wed, 13 Jan 2016 at 12:40 Clemens Portele <portele@interactive-instruments.de> wrote:
Dear Editors, all,

well done! The document looks like a good first step and I think it is worth capturing the current status as a public working draft.

I have some (late) comments:

Terminology
I think that has been raised by others, too, but right now I see at least "spatial data", "spatial information",  "geospatial data", "geographic information" and "geospatial information". Probably we need only two of those, although from the document it is also unclear that there is a distinction between "spatial" and "geospatial" and what that distinction is.

References
Why do some references go to the glossary (eg WFS) and some to the references (eg SPARQL)? Maybe WFS etc should be added to the references, too, and the text should include a link both to the glossary and the references?

Section 1.1
"Analysis of the requirements derived from scenarios that describe how spatial data is commonly published and used on the Web (as documented in [UCR]) indicates that, in contrast to the workings of a typical SDI, the Linked Data approach is most appropriate for publishing and using spatial data on the Web. Linked Data provides a foundation to many of the best practices in this document."

Where is that analysis documented and the practical evidence underpinning? Or is it more of an opinion?

I also note that in the world of the web developers the term "Structured Data" seems to be used quite frequently, but it is not used in this document except for crowd-sourced data (?).

Another comment related to the references to "typical SDIs" is that to me the wording in the document sometimes gives the impression that by design SDIs do not fit what is the Web today. While that is true (to me BP25 is a key point here among other aspects), I think this is only part of the story. SDIs offer the capability for linking data and for expressing models and semantics, but quite often these capabilities are not used and data is often published like GIS data more or less before we had the Web. One could in principle implement many of the BPs in current SDIs, too. Or publish spatial data using RDF technologies in ways that wouldn’t be very Webby either. To me, the potential value of the BP document is in helping to establish an understanding of what aspects are key with respect to putting and using spatial data *on* the Web. Specific technologies also play a role here and we need to cover them as well, in particular in the examples, but it should be the principles that matter most. There will always be a mix in technologies that will be used, often for good reasons.

BP1
"much like how Twitter's hashtags are created dynamically"

This is not a good example in this BP as hashtags are obviously not "unambiguously identifying" a Thing. #ogc in tweets references the French football club OGC Lille, the Open Geospatial Consortium and other things.

BP2
The text seems to largely discuss something else than the title suggests. The text mostly seems to say that one should link resources to well-known resources, not to reuse other URIs as the subject URI of my own statements. For example, the URI in the example (http://www.gemeentegeschiedenis.nl/gemeentenaam/Amsterdam) *is* a new URI.

BP3, BP23, etc.
These BPs could be read to recommend to create and maintain explicit links to all other (related) datasets in the Web? That clearly does not scale, so I think we should be more clear on what we are considering good practice.

BP3
This BP seems to recommend to establish links between entity-level resources across datasets, but the title of the BP is "Working with data that lacks globally unique identifiers for entity-level resources"?

BP4
Should this recommend to include information about when the statements about the Thing were valid? Maybe reference BP11?

6.2
The LD-BP reference makes this very RDF dependent. Is there a need / justification for this? Are we saying that RDF is the only recommended way to publish data and models on the Web? I know some will say "yes", but if we target web developers I would guess that many do not care about RDF vocabularies and maybe they prefer a Swagger document (just to pick an example)?

BP27
"increasing usefullness and cost" - I think "decreasing" is meant?

BP28, BP29, BP30
What is the spatial aspect of these BPs? These are more general Data on the Web recommendations. I see that we may want to discuss examples in the spatial context, but maybe we should shorten the general discussion and simply reference the related Data on the Web best practices and focus in our document on the spatial aspects?

Thanks for the good work!
Clemens

Received on Thursday, 14 January 2016 18:03:12 UTC