- From: Biplav Srivastava <sbiplav@in.ibm.com>
- Date: Wed, 20 Nov 2013 13:23:27 +0530
- To: "GLD Chairs" <team-gld-chairs@w3.org>
- Cc: "Government Linked Data Working Group" <public-gld-wg@w3.org>
Here are some comments on the best practices draft document [1].

1. In the summary of best practices, we should use a crisp set of keywords describing what to do or not do, in a consistent format. Currently, there is a mix of styles (nouns, e.g., NAME or PII; verbs, e.g., IDENTIFY; verb-noun, e.g., SPECIFY_LICENSE). This can be misleading and also hampers readability. Example: PII. Should we have them or not? (Of course not, as the rest of the sentence says.)

New Format Suggested

"The following best practices are discussed in this document and listed here for convenience.

IDENTIFY_DATASETS
Identify data sets that other people may wish to re-use.

MODEL_APPLICATION_INDEPENDENT
Model the data in an application-independent, objective way in terms of representation. Denormalize the data as necessary.

PROVIDE_METADATA
Provide basic metadata, including MIME type, publishing organization and/or agency, creation date, modification date, version, frequency of updates, contact email for the data steward(s).

REMOVE_PII
Do not publish Personally Identifiable Information as Open Data on the Web. Data on the public Web can be potentially misused. Examples of personally identifiable data include: individual names, national identification number, phone number, credit card number and driver license number.

HAVE_URI_NAME
Use HTTP URIs as names for your objects. Give careful consideration to the URI naming strategy. Consider how the data will change over time and name as necessary.

USE_STANDARD_VOCABULARIES
Describe objects with standard vocabularies whenever possible.

USE_VOCABULARY
Use vocabularies as loosely coupled modular components.

HAVE_LD_REPRESENTATION
Convert the source data into a Linked Data representation, also called an RDF serialization, including Turtle, Notation-3 (N3), N-Triples, XHTML with embedded RDFa, and RDF/XML.

BE_HUMAN_READABLE
Provide human readable descriptions with your Linked Data.

BE_MACHINE_ACCESSIBLE
Provide access to the data representation via RESTful API, SPARQL endpoint(s) and RDF download.

SPECIFY_LICENSE
Specify an appropriate license.

HOST_AUTHORITATIVE_DOMAIN
Deliver open government data on an authoritative domain to increase perceived trust.

ANNOUNCE_NEWS
Announce open government data, have a feedback mechanism and be prepared to be responsive to feedback.

DELIVER_ON_SOCIAL_CONTRACT
Maintenance is critical. Without a permanent identifier scheme, if you move or remove data that is published to the Web, you may break third party applications or mashups, which is clearly undesirable. URI strategy and implementation are critical."

2. Add reference and link to
a) Hyland et al. [BHYLAND]. It is missing.
b) Hausenblas et al. [HAUSENBLAS]. It is missing.

3. There was some discussion in the group on high-level guidance on best practices. It may not be directly used by a government decision maker, but it can be useful for technical people reading the GLD best practices document to refer to when talking with government decision makers [2]. Are there plans to include it in the current draft?

[1] https://dvcs.w3.org/hg/gld/raw-file/default/bp/index.html
[2] http://lists.w3.org/Archives/Public/public-gld-wg/2013Aug/0006.html

Regards,
--Biplav
Received on Wednesday, 20 November 2013 07:52:41 UTC