- From: Frans Knibbe <frans.knibbe@geodan.nl>
- Date: Wed, 13 Jan 2016 10:49:23 +0100
- To: Andrea Perego <andrea.perego@jrc.ec.europa.eu>
- Cc: Bill Roberts <bill@swirrl.com>, Krzysztof Janowicz <janowicz@ucsb.edu>, Jeremy Tandy <jeremy.tandy@gmail.com>, SDW WG Public List <public-sdw-wg@w3.org>
- Message-ID: <CAFVDz41Szm2i0VVNePy63YV+6FqxyaOa33kLq+jh6LkA39DNcg@mail.gmail.com>
Whether 'data' is used as a plural or singular noun probably does not have much to do with British English versus US English. The problem exists in Dutch too, and I can imagine in some other languages as well. I think it has to do with awareness of the word being a plural form. When someone recognizes that 'data' is the plural form of 'datum', she or he will probably be more likely to treat it as a plural form.

A similar word is 'media'. I think it is used as a singular when the word is not recognized as the plural form of 'medium'. It happens with Italian words too - I often hear or read words like 'graffiti' or 'panini' being used as singular nouns.

Greetings,
Frans

2016-01-12 19:11 GMT+01:00 Andrea Perego <andrea.perego@jrc.ec.europa.eu>:

> The Wiktionary may help here:
>
> https://en.wiktionary.org/wiki/data#English
>
> Quoting:
>
> [[
> Usage notes
>
> This word is more often used as an uncountable noun with a singular verb
> than as a plural noun with singular datum.
> ]]
>
>
> Andrea
>
> On 12/01/2016 18:50, Bill Roberts wrote:
>
>> not perhaps our most important issue, but my opinion is that 'data'
>> reads most naturally as a singular word - probably because it's often
>> thought of as a non-countable noun, like water - you can have 'some
>> data', but few people would say 'I have 100 data'.
>>
>> Some people like to be more faithful to its Latin roots and have plural
>> 'data' and singular 'datum' - but use of 'datum' is very rare in English
>> (UK English anyway). 'Data point' is probably a more common way to
>> refer to a datum.
>>
>> So probably either approach is acceptable if we are self-consistent, but
>> I would vote for singular 'data'.
>>
>> Bill
>>
>> On 12 January 2016 at 16:54, Krzysztof Janowicz <janowicz@ucsb.edu> wrote:
>>
>> > 2. I notice the word 'data' is taken as singular. That looks
>>> funny to me, but I know there are differences of opinion in that
>>> respect.
>>> Do W3C or OGC have a recommendation on whether to treat
>>> 'data' as a singular or plural noun?
>>>
>>> As a native English speaker (OK, that doesn't mean much) "data"
>>> looks and sounds correct.
>>>
>>> @phila ... any comment from W3C perspective; I know I'm supposed
>>> to write in US-english :-)
>>>
>>
>> To the best of my knowledge data is plural, datum is the singular
>> form.
>>
>> Krzysztof
>>
>> On 01/12/2016 08:44 AM, Jeremy Tandy wrote:
>>
>>> Hi Frans. Thanks for your commentary ... responses below.
>>>
>>> @lvdbrink ... can you comment on number #4? Also, can you consider
>>> a redraft of Section 2 (see points #7 and #8 below) and the
>>> opening of section 6.1 (see point #11).
>>>
>>> > 1. (already discussed in the teleconference) The introduction or
>>> scope section could do with an explanation of how the document
>>> relates to the description of the Best Practices deliverable in
>>> the charter, especially the first and last bullet points.
>>>
>>> See PR 203 <https://github.com/w3c/sdw/pull/203> (already merged)
>>> ... hopefully this does the trick.
>>>
>>> > 2. I notice the word 'data' is taken as singular. That looks
>>> funny to me, but I know there are differences of opinion in that
>>> respect. Do W3C or OGC have a recommendation on whether to treat
>>> 'data' as a singular or plural noun?
>>>
>>> As a native English speaker (OK, that doesn't mean much) "data"
>>> looks and sounds correct.
>>>
>>> @phila ... any comment from W3C perspective; I know I'm supposed
>>> to write in US-english :-)
>>>
>>> > 3. In paragraph 1.1 discoverability and accessibility are listed as
>>> the key problems. I think interoperability (between different
>>> publications of spatial data and between spatial data and other
>>> types of data) could be listed as a third main problem; many
>>> requirements have to do with interoperability.
>>>
>>> Created new issue for discussion: ISSUE 205
>>> <https://github.com/w3c/sdw/issues/205>
>>>
>>> > 4. section 1.1: problems that are experienced by different
>>> groups (commercial operators, geospatial experts, web developers,
>>> public sector) are described. I get the impression that those
>>> problems are the only or main problems that are experienced by a
>>> certain group, but I don't think that is the case. Perhaps the
>>> listed problems could be marked as examples? Or the list of
>>> problems per group could be expanded?
>>>
>>> Indeed - the list of problems is not exhaustive, only illustrative.
>>> As an introduction I felt that this reads OK. @lvdbrink - wdyt?
>>>
>>> > 5. section 1.1 “we've adopted a Linked Data approach as the
>>> underlying principle of the best practices”: Such a statement might
>>> drive away people that for some reason resist the idea of Linked Data,
>>> or in general don't like to have to adopt a new unknown paradigm.
>>> It also looks like the WG was biased in identifying best practices
>>> (Linked Data or bust). How about stating that upon inspection of
>>> requirements and current problems and solutions, concepts from the
>>> Linked Data paradigm transpired to be most applicable? Or perhaps
>>> Linked Data does not need to be mentioned at all.... Requirements
>>> like linkability, discoverability and interoperability
>>> automatically lead to recommending using HTTP(S) URIs and common
>>> semantics.
>>>
>>> The WG has agreed on several occasions (including F2F at
>>> Nottingham) that we would "adopt the linked data approach" because
>>> we feel this is the best way to surface spatial data on the web.
>>> Rereading the BP text, I can see how a bias might be taken. I've
>>> reworded as follows ...
>>>
>>> "Analysis of the requirements derived from scenarios that describe
>>> how spatial data is commonly published and used on the Web (as
>>> documented in [[UCR]]) indicates that, in contrast to the workings
>>> of a typical SDI, the <a
>>> href="http://www.w3.org/standards/semanticweb/data">Linked
>>> Data</a> approach is most appropriate for publishing and using
>>> spatial data on the Web. Linked Data provides a foundation to many
>>> of the best practices in this document."
>>>
>>> Hope that works for you.
>>>
>>> > 6. I think an explanation of the term 'spatial data' should be
>>> somewhere very high up in the document (abstract and/or
>>> introduction), especially that spatial <> geographic (geographical
>>> data is a subset of spatial data)
>>>
>>> Agreed. New issue added to the document at beginning of Intro.
>>> ISSUE 206 <https://github.com/w3c/sdw/issues/206>
>>>
>>> > 7. Section 2: There seems to be overlap with description of user
>>> groups in the introduction (1.1). This leads (or could lead) to
>>> duplicate information. Why not just mention in the introduction
>>> that there are multiple audiences and that they are described in
>>> section 2?
>>>
>>> Agreed. New issue added. ISSUE 207
>>> <https://github.com/w3c/sdw/issues/207>
>>>
>>> > 8. Section 2: I wonder if the three groups that are described
>>> cover all audience types. Some more I can think of are [...]
>>>
>>> Good point. Added to ISSUE 207
>>> <https://github.com/w3c/sdw/issues/207> as additional copy for a
>>> potential redraft of section 2.
>>>
>>> > 9. Section 3: “SDW focuses on exposing the individual; the
>>> entities, the SpatialThings, within a spatial dataset”. That
>>> seems to exclude spatial metadata, which is an important subject
>>> in SDW.
>>>
>>> Agreed. Now, referencing the deliverables from the charter, the
>>> Scope states: "The use of metadata to complement spatial data".
>>>
>>> > 10. “Can be tested by machines and/or data consumers”: I consider
>>> data consumers to be humans or machines. In fact, it could be used
>>> as a useful way of avoiding having to write 'humans or machines'
>>> each time. Most best practices should benefit both humans and
>>> machines. Only in some cases the distinction is meaningful.
>>>
>>> Reworded to: "Compliance with each best practice in this document
>>> can be tested programmatically and/or by human inspection."
>>>
>>> > 11. 6.1: Is the discussion about features, information resources and
>>> real world things really necessary? I find it slightly confusing
>>> and I can imagine others will too. Why not just say that if you
>>> want spatial data to be referenceable on the web you need to use
>>> URIs? Just that makes a lot of sense and could be less confusing.
>>>
>>> @lvdbrink has attempted to capture the discussion that occurred
>>> during the Sapporo F2F; this discussion certainly had value at the
>>> time. I'm wary of reducing the context to the single statement you
>>> suggest but agree that it's not currently straightforward. We may
>>> also want to talk about the difference between Features
>>> (information resources) and Spatial Things (the resources
>>> described by the information) and the fact that in the end, the
>>> distinction is often not helpful.
>>>
>>> I've added a new issue to capture this point. ISSUE 208
>>> <https://github.com/w3c/sdw/issues/208>
>>>
>>> > 12. Best practice 3: I notice best practices 1 and 2 are phrased
>>> as solutions or recommendations. I think it is a good idea to try
>>> to do that for all best practices. So instead of “Working with
>>> data that lacks globally unique identifiers for entity-level
>>> resources” one could write “make spatial relationships explicit”
>>>
>>> See ISSUE 193 <https://github.com/w3c/sdw/issues/193> that echoes
>>> your sentiment for BP style. That said, your suggested text misses
>>> the intended point.
>>> There's more content needed for BP3 (and
>>> perhaps a major redraft?) as stated in ISSUE 102
>>> <https://github.com/w3c/sdw/issues/102> ... the concern is not so
>>> much making spatial relationships explicit, but what to do if your
>>> data doesn't use URIs. How do you convert from locally scoped
>>> identifier to URI?
>>>
>>> > 13. I appreciate seeing references to BP requirements from the UCR
>>> document. But they are placed in the 'Evidence' section of the BP
>>> template now. Is it appropriate to count requirements derived from
>>> use cases as evidence of a best practice? I would expect
>>> references to use cases and requirements to occur in the 'Why'
>>> section of the template. Or in a template section that is
>>> especially reserved for requirements, e.g. 'Relevant requirements'.
>>>
>>> We're following the pro-forma set out by DWBP (for example, see
>>> http://w3c.github.io/dwbp/bp.html#identifiersWithinDatasets).
>>> I'll admit to not thinking too hard about this so far. I have
>>> raised an issue in the WG tracker (ISSUE 36
>>> <https://www.w3.org/2015/spatial/track/issues/36>) so that we come
>>> back to this discussion post release of FPWD.
>>>
>>> > 14. Best practice 8: Is this based on the CRS wiki page
>>> <https://www.w3.org/2015/spatial/wiki/Coordinate_Reference_Systems>?
>>> It seems that WGS84 is recommended. But that is debatable and
>>> could be considered American-centric. European guidelines
>>> recommend ETRS89. Also, high-precision is not defined. Also, no
>>> mention is made of the need to add temporal data if a CRS with an
>>> increasing error with time (like WGS84) is needed. Also no mention
>>> is made of how to reconcile local CRSs (as in a building plan)
>>> with global CRSs. I think CRSs are one of the areas that do
>>> require some extra standardisation efforts outside of this
>>> document, but which could be instigated by our working group.
>>>
>>> I've added your comment to ISSUE 128
>>> <https://github.com/w3c/sdw/issues/128> which is associated with
>>> BP 8. We can improve the content post FPWD release.
>>>
>>> > 15. BP 10: I would at least recommend to be aware of significant
>>> digits.
>>>
>>> Added your comment to ISSUE 125
>>> <https://github.com/w3c/sdw/issues/125>
>>>
>>> > 16. Appendix C: Why are all UC requirements listed? Why not only
>>> the BP requirements? That would make a more compact table.
>>>
>>> There were many requirements that were not specifically marked for
>>> the BP - but turned out to be related ... so we captured those.
>>> Also, while we are working on the BP, it's good to have this full
>>> list. Perhaps when we're complete, it would make sense to truncate.
>>>
>>> Thanks for all your efforts. Jeremy
>>>
>>> On Thu, 7 Jan 2016 at 12:30 Frans Knibbe
>>> <frans.knibbe@geodan.nl> wrote:
>>>
>>> Hello,
>>>
>>> Following are my comments, after reading the BP draft from top
>>> to bottom:
>>>
>>> 1. (already discussed in the teleconference) The introduction
>>>    or scope section could do with an explanation of how the
>>>    document relates to the description of the Best Practices
>>>    deliverable in the charter, especially the first and last
>>>    bullet points.
>>> 2. I notice the word 'data' is taken as singular. That looks
>>>    funny to me, but I know there are differences of opinion
>>>    in that respect. Do W3C or OGC have a recommendation on
>>>    whether to treat 'data' as a singular or plural noun?
>>> 3. In paragraph 1.1 discoverability and accessibility are
>>>    listed as the key problems. I think interoperability
>>>    (between different publications of spatial data and
>>>    between spatial data and other types of data) could be
>>>    listed as a third main problem; many requirements have to
>>>    do with interoperability.
>>> 4. section 1.1: problems that are experienced by different
>>>    groups (commercial operators, geospatial experts, web
>>>    developers, public sector) are described. I get the
>>>    impression that those problems are the only or main
>>>    problems that are experienced by a certain group, but I
>>>    don't think that is the case. Perhaps the listed problems
>>>    could be marked as examples? Or the list of problems per
>>>    group could be expanded?
>>> 5. section 1.1 “we've adopted a Linked Data approach as the
>>>    underlying principle of the best practices”: Such a
>>>    statement might drive away people that for some reason
>>>    resist the idea of Linked Data, or in general don't like
>>>    to have to adopt a new unknown paradigm. It also looks
>>>    like the WG was biased in identifying best practices
>>>    (Linked Data or bust). How about stating that upon
>>>    inspection of requirements and current problems and
>>>    solutions, concepts from the Linked Data paradigm
>>>    transpired to be most applicable? Or perhaps Linked Data
>>>    does not need to be mentioned at all.... Requirements like
>>>    linkability, discoverability and interoperability
>>>    automatically lead to recommending using HTTP(S) URIs and
>>>    common semantics.
>>> 6. I think an explanation of the term 'spatial data' should
>>>    be somewhere very high up in the document (abstract and/or
>>>    introduction), especially that spatial <> geographic
>>>    (geographical data is a subset of spatial data)
>>> 7. Section 2: There seems to be overlap with description of
>>>    user groups in the introduction (1.1). This leads (or
>>>    could lead) to duplicate information. Why not just mention
>>>    in the introduction that there are multiple audiences and
>>>    that they are described in section 2?
>>> 8. Section 2: I wonder if the three groups that are described
>>>    cover all audience types. Some more I can think of are
>>>    A) People working with spatial data that is not
>>>    geographical (e.g. SVG, CAD, BIM).
>>>    B) People involved in development of standards that have
>>>    something to do with spatial data on the web.
>>>    C) People involved in development of software that can
>>>    work with spatial data.
>>> 9. Section 3: “SDW focuses on exposing the individual; the
>>>    entities, the SpatialThings, within a spatial dataset”.
>>>    That seems to exclude spatial metadata, which is an
>>>    important subject in SDW.
>>> 10. “Can be tested by machines and/or data consumers”: I
>>>    consider data consumers to be humans or machines. In fact,
>>>    it could be used as a useful way of avoiding having to
>>>    write 'humans or machines' each time. Most best practices
>>>    should benefit both humans and machines. Only in some
>>>    cases the distinction is meaningful.
>>> 11. 6.1: Is the discussion about features, information
>>>    resources and real world things really necessary? I find
>>>    it slightly confusing and I can imagine others will too.
>>>    Why not just say that if you want spatial data to be
>>>    referenceable on the web you need to use URIs? Just that
>>>    makes a lot of sense and could be less confusing.
>>> 12. Best practice 3: I notice best practices 1 and 2 are
>>>    phrased as solutions or recommendations. I think it is a
>>>    good idea to try to do that for all best practices. So
>>>    instead of “Working with data that lacks globally unique
>>>    identifiers for entity-level resources” one could write
>>>    “make spatial relationships explicit”
>>> 13. I appreciate seeing references to BP requirements from the
>>>    UCR document. But they are placed in the 'Evidence'
>>>    section of the BP template now. Is it appropriate to count
>>>    requirements derived from use cases as evidence of a best
>>>    practice? I would expect references to use cases and
>>>    requirements to occur in the 'Why' section of the
>>>    template. Or in a template section that is especially
>>>    reserved for requirements, e.g. 'Relevant requirements'.
>>> 14. Best practice 8: Is this based on the CRS wiki page
>>>    <https://www.w3.org/2015/spatial/wiki/Coordinate_Reference_Systems>?
>>>    It seems that WGS84 is recommended. But that is debatable
>>>    and could be considered American-centric. European
>>>    guidelines recommend ETRS89. Also, high-precision is not
>>>    defined. Also, no mention is made of the need to add
>>>    temporal data if a CRS with an increasing error with time
>>>    (like WGS84) is needed. Also no mention is made of how to
>>>    reconcile local CRSs (as in a building plan) with global
>>>    CRSs. I think CRSs are one of the areas that do require
>>>    some extra standardisation efforts outside of this
>>>    document, but which could be instigated by our working group.
>>> 15. BP 10: I would at least recommend to be aware of
>>>    significant digits.
>>> 16. Appendix C: Why are all UC requirements listed? Why not
>>>    only the BP requirements? That would make a more compact
>>>    table.
>>>
>>>
>>> Greetings, and keep up the good work!
>>>
>>> Frans
>>>
>>
>> --
>> Krzysztof Janowicz
>>
>> Geography Department, University of California, Santa Barbara
>> 4830 Ellison Hall, Santa Barbara, CA 93106-4060
>>
>> Email: jano@geog.ucsb.edu
>> Webpage: http://geog.ucsb.edu/~jano/
>> Semantic Web Journal: http://www.semantic-web-journal.net
>>
> --
> Andrea Perego, Ph.D.
> Scientific / Technical Project Officer
> European Commission DG JRC
> Institute for Environment & Sustainability
> Unit H06 - Digital Earth & Reference Data
> Via E. Fermi, 2749 - TP 262
> 21027 Ispra VA, Italy
>
> https://ec.europa.eu/jrc/
Received on Wednesday, 13 January 2016 09:49:57 UTC