Comments on best practices draft

Here are some comments on the best practices draft document [1].


1. In the summary of best practices, we should use a crisp set of keywords
describing what to do or not do, in a consistent format.
Currently, there is a mix of styles (nouns, e.g., NAME or PII; verbs,
e.g., IDENTIFY; verb-noun, e.g., SPECIFY_LICENSE). This can be misleading
and also hampers readability. Example: PII. Should we have it or not?
(Of course not, as the rest of the sentence says.)

New Format Suggested

"The following best practices are discussed in this document and listed
here for convenience.


IDENTIFY_DATASETS Identify data sets that other people may wish to re-use.


MODEL_APPLICATION_INDEPENDENT Model the data in an application-independent,
objective way in terms of representation. Denormalize the data as
necessary.


PROVIDE_METADATA Provide basic metadata, including MIME type, publishing
organization and/or agency, creation date, modification date, version,
frequency of updates, contact email for the data steward(s).


REMOVE_PII Do not publish Personally Identifiable Information as Open Data
on the Web. Data on the public Web can potentially be misused. Examples of
personally identifiable data include: individual names, national
identification number, phone number, credit card number and driver license
number.


HAVE_URI_NAME Use HTTP URIs as names for your objects. Give careful
consideration to the URI naming strategy. Consider how the data will change
over time and name as necessary.


USE_STANDARD_VOCABULARIES Describe objects with standard vocabularies
whenever possible.


USE_VOCABULARY Use vocabularies as loosely coupled modular components.


HAVE_LD_REPRESENTATION Convert the source data into a Linked Data
representation, that is, an RDF serialization such as Turtle,
Notation-3 (N3), N-Triples, XHTML with embedded RDFa, or RDF/XML.


BE_HUMAN_READABLE Provide human-readable descriptions with your Linked
Data.


BE_MACHINE_ACCESSIBLE Provide access to the data representation via a
RESTful API, SPARQL endpoint(s) and RDF download.


SPECIFY_LICENSE Specify an appropriate license.


HOST_AUTHORITATIVE_DOMAIN Deliver open government data on an authoritative
domain to increase perceived trust.


ANNOUNCE_NEWS Announce open government data, have a feedback mechanism and
be prepared to be responsive to feedback.


DELIVER_ON_SOCIAL_CONTRACT Maintenance is critical. Without a permanent
identifier scheme, if you move or remove data that is published to the Web,
you may break third party applications or mashups, which is clearly
undesirable. URI strategy and implementation are critical. "
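
As a rough illustration of what a "consistent format" check could look
like, here is a minimal Python sketch. It assumes the convention is two or
more uppercase words joined by underscores (as in SPECIFY_LICENSE). The
pattern and the function name are my own assumptions, not something the
draft defines, and the check is purely syntactic: it does not verify that
the first word is a verb.

```python
import re

# Assumed convention (not from the draft): two or more uppercase words
# joined by single underscores, e.g. SPECIFY_LICENSE.
KEYWORD_PATTERN = re.compile(r"[A-Z]+(?:_[A-Z]+)+")


def inconsistent_keywords(keywords):
    """Return the keywords that do not match the assumed pattern."""
    return [k for k in keywords if not KEYWORD_PATTERN.fullmatch(k)]


if __name__ == "__main__":
    # A few keywords in the styles mentioned in comment 1.
    candidates = ["NAME", "PII", "IDENTIFY", "SPECIFY_LICENSE", "REMOVE_PII"]
    print(inconsistent_keywords(candidates))
```

Single-word keywords such as NAME or PII are flagged, while verb-noun
keywords such as SPECIFY_LICENSE pass.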


2. Add reference and link to
  a) Hyland et al. [BHYLAND]. It is missing.
  b) Hausenblas et al. [HAUSENBLAS]. It is missing.


3. There was some discussion in the group on high-level guidance on best
practices. It may not be directly used by a govt decision maker but can be
useful for technical people reading the GLD best practices document to
refer to when discussing with govt decision makers [2]. Are there plans to
include them in the current draft?


[1] https://dvcs.w3.org/hg/gld/raw-file/default/bp/index.html
[2] http://lists.w3.org/Archives/Public/public-gld-wg/2013Aug/0006.html


Regards,
--Biplav

Received on Wednesday, 20 November 2013 07:52:41 UTC