- From: <Simon.Cox@csiro.au>
- Date: Wed, 13 Jan 2016 23:13:52 +0000
- To: <frans.knibbe@geodan.nl>, <andrea.perego@jrc.ec.europa.eu>
- CC: <bill@swirrl.com>, <janowicz@ucsb.edu>, <jeremy.tandy@gmail.com>, <public-sdw-wg@w3.org>
- Message-ID: <2A7346E8D9F62D4CA8D78387173A054A60345EBF@exmbx04-cdc.nexus.csiro.au>
… and spaghetti! Un spaghetto (also linguino, trofia, orecchietta …) is an interesting concept ☺

From: Frans Knibbe [mailto:frans.knibbe@geodan.nl]
Sent: Wednesday, 13 January 2016 8:49 PM
To: Andrea Perego <andrea.perego@jrc.ec.europa.eu>
Cc: Bill Roberts <bill@swirrl.com>; Krzysztof Janowicz <janowicz@ucsb.edu>; Jeremy Tandy <jeremy.tandy@gmail.com>; SDW WG Public List <public-sdw-wg@w3.org>
Subject: Re: My BP comments

Whether 'data' is used as a plural or singular noun probably does not have much to do with British English versus US English. The problem exists in the Dutch language too, and I can imagine in some others as well. I think it has to do with awareness of the word being a plural form. When someone recognizes that 'data' is the plural form of 'datum', she or he will probably be more likely to treat it as a plural form. A similar word is 'media'. I think it is used as a singular when the word is not recognized as the plural form of 'medium'. It happens with Italian words too - I often hear or read words like 'graffiti' or 'panini' being used as singular nouns.

Greetings,
Frans

2016-01-12 19:11 GMT+01:00 Andrea Perego <andrea.perego@jrc.ec.europa.eu>:

The Wiktionary may help here:

https://en.wiktionary.org/wiki/data#English

Quoting:

[[
Usage notes

This word is more often used as an uncountable noun with a singular verb than as a plural noun with singular datum.
]]

Andrea

On 12/01/2016 18:50, Bill Roberts wrote:

Not perhaps our most important issue, but my opinion is that 'data' reads most naturally as a singular word - probably because it's often thought of as a non-countable noun, like water - you can have 'some data', but few people would say 'I have 100 data'. Some people like to be more faithful to its Latin roots and have plural 'data' and singular 'datum' - but use of 'datum' is very rare in English (UK English anyway). 'Data point' is probably a more common way to refer to a datum.
So probably either approach is acceptable if we are self-consistent, but I would vote for singular 'data'.

Bill

On 12 January 2016 at 16:54, Krzysztof Janowicz <janowicz@ucsb.edu> wrote:

> 2. I notice the word 'data' is taken as singular. That looks funny to me, but I know there are differences of opinion in that respect. Do W3C or OGC have a recommendation on whether to treat 'data' as a singular or plural noun?

As a native English speaker (OK, that doesn't mean much) "data" looks and sounds correct. @phila ... any comment from W3C perspective; I know I'm supposed to write in US English :-)

To the best of my knowledge data is plural, datum is the singular form.

Krzysztof

On 01/12/2016 08:44 AM, Jeremy Tandy wrote:

Hi Frans. Thanks for your commentary ... responses below.

@lvdbrink ... can you comment on number #4? Also, can you consider a redraft of Section 2 (see points #7 and #8 below) and the opening of section 6.1 (see point #11).

> 1. (already discussed in the teleconference) The introduction or scope section could do with an explanation of how the document relates to the description of the Best Practices deliverable in the charter, especially the first and last bullet points.

See PR 203 <https://github.com/w3c/sdw/pull/203> (already merged) ... hopefully this does the trick.

> 2. I notice the word 'data' is taken as singular. That looks funny to me, but I know there are differences of opinion in that respect. Do W3C or OGC have a recommendation on whether to treat 'data' as a singular or plural noun?

As a native English speaker (OK, that doesn't mean much) "data" looks and sounds correct. @phila ... any comment from W3C perspective; I know I'm supposed to write in US English :-)

> 3. In paragraph 1.1 discoverability and accessibility are listed as the key problems.
I think interoperability (between different publications of spatial data and between spatial data and other types of data) could be listed as a third main problem; many requirements have to do with interoperability.

Created new issue for discussion: ISSUE 205 <https://github.com/w3c/sdw/issues/205>

> 4. section 1.1: problems that are experienced by different groups (commercial operators, geospatial experts, web developers, public sector) are described. I get the impression that those problems are the only or main problems that are experienced by a certain group, but I don't think that is the case. Perhaps the listed problems could be marked as examples? Or the list of problems per group could be expanded?

Indeed, the list of problems is not exhaustive, only illustrative. As an introduction I felt that this reads OK. @lvdbrink - wdyt?

> 5. section 1.1: “we've adopted a Linked Data approach as the underlying principle of the best practices”: Such a statement might drive away people that for some reason resist the idea of Linked Data, or in general don't like to have to adopt a new unknown paradigm. It also looks like the WG was biased in identifying best practices (Linked Data or bust). How about stating that, upon inspection of requirements and current problems and solutions, concepts from the Linked Data paradigm transpired to be most applicable? Or perhaps Linked Data does not need to be mentioned at all.... Requirements like linkability, discoverability and interoperability automatically lead to recommending using HTTP(S) URIs and common semantics.

The WG has agreed on several occasions (including F2F at Nottingham) that we would "adopt the linked data approach" because we feel this is the best way to surface spatial data on the web. Rereading the BP text, I can see how a bias might be taken. I've reworded as follows ...

"Analysis of the requirements derived from scenarios that describe how spatial data is commonly published and used on the Web (as documented in [[UCR]]) indicates that, in contrast to the workings of a typical SDI, the <a href="http://www.w3.org/standards/semanticweb/data">Linked Data</a> approach is most appropriate for publishing and using spatial data on the Web. Linked Data provides a foundation to many of the best practices in this document."

Hope that works for you.

> 6. I think an explanation of the term 'spatial data' should be somewhere very high up in the document (abstract and/or introduction), especially that spatial <> geographic (geographical data is a subset of spatial data)

Agreed. New issue added to the document at beginning of Intro. ISSUE 206 <https://github.com/w3c/sdw/issues/206>

> 7. Section 2: There seems to be overlap with description of user groups in the introduction (1.1). This leads (or could lead) to duplicate information. Why not just mention in the introduction that there are multiple audiences and that they are described in section 2?

Agreed. New issue added. ISSUE 207 <https://github.com/w3c/sdw/issues/207>

> 8. Section 2: I wonder if the three groups that are described cover all audience types. Some more I can think of are [...]

Good point. Added to ISSUE 207 <https://github.com/w3c/sdw/issues/207> as additional copy for a potential redraft of section 2.

> 9. Section 3: “SDW focuses on exposing the individual; the entities, the SpatialThings, within a spatial dataset”. That seems to exclude spatial metadata, which is an important subject in SDW.

Agreed. Now, referencing the deliverables from the charter, the Scope states: "The use of metadata to complement spatial data".

> 10. “Can be tested by machines and/or data consumers”: I consider data consumers to be humans or machines. In fact, it could be used as a useful way of avoiding having to write 'humans or machines' each time.
Most best practices should benefit both humans and machines. Only in some cases the distinction is meaningful.

Reworded to: "Compliance with each best practice in this document can be tested programmatically and/or by human inspection."

> 11. 6.1: Is the discussion about features, information resources and real world things really necessary? I find it slightly confusing and I can imagine others will too. Why not just say that if you want spatial data to be referenceable on the web you need to use URIs? Just that makes a lot of sense and could be less confusing.

@lvdbrink has attempted to capture the discussion that occurred during the Sapporo F2F; this discussion certainly had value at the time. I'm wary of reducing the context to the single statement you suggest but agree that it's not currently straightforward. We may also want to talk about the difference between Features (information resources) and Spatial Things (the resources described by the information) and the fact that in the end, the distinction is often not helpful. I've added a new issue to capture this point. ISSUE 208 <https://github.com/w3c/sdw/issues/208>

> 12. Best practice 3: I notice best practices 1 and 2 are phrased as solutions or recommendations. I think it is a good idea to try to do that for all best practices. So instead of “Working with data that lacks globally unique identifiers for entity-level resources” one could write “make spatial relationships explicit”.

See ISSUE 193 <https://github.com/w3c/sdw/issues/193> that echoes your sentiment for BP style. That said, your suggested text misses the intended point. There's more content needed for BP3 (and perhaps a major redraft?) as stated in ISSUE 102 <https://github.com/w3c/sdw/issues/102> ... the concern is not so much making spatial relationships explicit, but what to do if your data doesn't use URIs. How do you convert from locally scoped identifier to URI?

> 13. I appreciate seeing references to BP requirements from the UCR document.
But they are placed in the 'Evidence' section of the BP template now. Is it appropriate to count requirements derived from use cases as evidence of a best practice? I would expect references to use cases and requirements to occur in the 'Why' section of the template. Or in a template section that is especially reserved for requirements, e.g. 'Relevant requirements'.

We're following the pro-forma set out by DWBP (for example, see http://w3c.github.io/dwbp/bp.html#identifiersWithinDatasets). I'll admit to not thinking too hard about this so far. I have raised an issue in the WG tracker (ISSUE 36 <https://www.w3.org/2015/spatial/track/issues/36>) so that we come back to this discussion post release of FPWD.

> 14. Best practice 8: Is this based on the CRS wiki page <https://www.w3.org/2015/spatial/wiki/Coordinate_Reference_Systems>? It seems that WGS84 is recommended. But that is debatable and could be considered American-centric. European guidelines recommend ETRS89. Also, high-precision is not defined. Also, no mention is made of the need to add temporal data if a CRS with an increasing error with time (like WGS84) is needed. Also no mention is made of how to reconcile local CRSs (as in a building plan) with global CRSs. I think CRSs are one of the areas that do require some extra standardisation efforts outside of this document, but which could be instigated by our working group.

I've added your comment to ISSUE 128 <https://github.com/w3c/sdw/issues/128> which is associated with BP 8. We can improve the content post FPWD release.

> 15. BP 10: I would at least recommend to be aware of significant digits.

Added your comment to ISSUE 125 <https://github.com/w3c/sdw/issues/125>

> 16. Appendix C: Why are all UC requirements listed? Why not only the BP requirements? That would make a more compact table.

There were many requirements that were not specifically marked for the BP, but turned out to be related ...
so we captured those. Also, while we are working on the BP, it's good to have this full list. Perhaps when we're complete, it would make sense to truncate.

Thanks for all your efforts.

Jeremy

On Thu, 7 Jan 2016 at 12:30 Frans Knibbe <frans.knibbe@geodan.nl> wrote:

Hello,

Following are my comments, after reading the BP draft from top to bottom:

1. (already discussed in the teleconference) The introduction or scope section could do with an explanation of how the document relates to the description of the Best Practices deliverable in the charter, especially the first and last bullet points.

2. I notice the word 'data' is taken as singular. That looks funny to me, but I know there are differences of opinion in that respect. Do W3C or OGC have a recommendation on whether to treat 'data' as a singular or plural noun?

3. In paragraph 1.1 discoverability and accessibility are listed as the key problems. I think interoperability (between different publications of spatial data and between spatial data and other types of data) could be listed as a third main problem; many requirements have to do with interoperability.

4. section 1.1: problems that are experienced by different groups (commercial operators, geospatial experts, web developers, public sector) are described. I get the impression that those problems are the only or main problems that are experienced by a certain group, but I don't think that is the case. Perhaps the listed problems could be marked as examples? Or the list of problems per group could be expanded?

5. section 1.1: “we've adopted a Linked Data approach as the underlying principle of the best practices”: Such a statement might drive away people that for some reason resist the idea of Linked Data, or in general don't like to have to adopt a new unknown paradigm.
It also looks like the WG was biased in identifying best practices (Linked Data or bust). How about stating that, upon inspection of requirements and current problems and solutions, concepts from the Linked Data paradigm transpired to be most applicable? Or perhaps Linked Data does not need to be mentioned at all.... Requirements like linkability, discoverability and interoperability automatically lead to recommending using HTTP(S) URIs and common semantics.

6. I think an explanation of the term 'spatial data' should be somewhere very high up in the document (abstract and/or introduction), especially that spatial <> geographic (geographical data is a subset of spatial data)

7. Section 2: There seems to be overlap with description of user groups in the introduction (1.1). This leads (or could lead) to duplicate information. Why not just mention in the introduction that there are multiple audiences and that they are described in section 2?

8. Section 2: I wonder if the three groups that are described cover all audience types. Some more I can think of are
A) People working with spatial data that is not geographical (e.g. SVG, CAD, BIM).
B) People involved in development of standards that have something to do with spatial data on the web.
C) People involved in development of software that can work with spatial data.

9. Section 3: “SDW focuses on exposing the individual; the entities, the SpatialThings, within a spatial dataset”. That seems to exclude spatial metadata, which is an important subject in SDW.

10. “Can be tested by machines and/or data consumers”: I consider data consumers to be humans or machines. In fact, it could be used as a useful way of avoiding having to write 'humans or machines' each time. Most best practices should benefit both humans and machines. Only in some cases the distinction is meaningful.

11. 6.1: Is the discussion about features, information resources and real world things really necessary?
I find it slightly confusing and I can imagine others will too. Why not just say that if you want spatial data to be referenceable on the web you need to use URIs? Just that makes a lot of sense and could be less confusing.

12. Best practice 3: I notice best practices 1 and 2 are phrased as solutions or recommendations. I think it is a good idea to try to do that for all best practices. So instead of “Working with data that lacks globally unique identifiers for entity-level resources” one could write “make spatial relationships explicit”.

13. I appreciate seeing references to BP requirements from the UCR document. But they are placed in the 'Evidence' section of the BP template now. Is it appropriate to count requirements derived from use cases as evidence of a best practice? I would expect references to use cases and requirements to occur in the 'Why' section of the template. Or in a template section that is especially reserved for requirements, e.g. 'Relevant requirements'.

14. Best practice 8: Is this based on the CRS wiki page <https://www.w3.org/2015/spatial/wiki/Coordinate_Reference_Systems>? It seems that WGS84 is recommended. But that is debatable and could be considered American-centric. European guidelines recommend ETRS89. Also, high-precision is not defined. Also, no mention is made of the need to add temporal data if a CRS with an increasing error with time (like WGS84) is needed. Also no mention is made of how to reconcile local CRSs (as in a building plan) with global CRSs. I think CRSs are one of the areas that do require some extra standardisation efforts outside of this document, but which could be instigated by our working group.

15. BP 10: I would at least recommend to be aware of significant digits.

16. Appendix C: Why are all UC requirements listed? Why not only the BP requirements? That would make a more compact table.

Greetings, and keep up the good work!
Frans

--
Krzysztof Janowicz

Geography Department, University of California, Santa Barbara
4830 Ellison Hall, Santa Barbara, CA 93106-4060

Email: jano@geog.ucsb.edu
Webpage: http://geog.ucsb.edu/~jano/
Semantic Web Journal: http://www.semantic-web-journal.net

--
Andrea Perego, Ph.D.
Scientific / Technical Project Officer
European Commission DG JRC
Institute for Environment & Sustainability
Unit H06 - Digital Earth & Reference Data
Via E. Fermi, 2749 - TP 262
21027 Ispra VA, Italy

https://ec.europa.eu/jrc/
Received on Wednesday, 13 January 2016 23:14:35 UTC