Re: Additional review and telecon (Tuesday) for Best Practices

Hi Hadley,

On Dec 17, 2013, at 10:04 AM, Hadley Beeman <hadley@linkedgov.org> wrote:

> A few thoughts:
> 
> # 6 Personal Identifiable Data
> 
> I'm very uncomfortable with including this section.  The topic varies too broadly across cultures and jurisdictions, both in terms of what fits a definition of PII and what is "appropriate" (or even legal) to publish.
> 
> Also, the potential consequences are diverse:  Negatives may include: possible breaches of privacy and human rights, combining datasets for de-anonymisation (or "mosaic-ing") to make other data personally identifiable, taking data across international boundaries…  Positives may include: reduction in corruption, accountability for public officials, increases in corporate transparency...  It's just too complex an area for us to do justice in a document of this level.
> 
> I would rather strike this section and put something about identifying what data is appropriate or useful to publish (in its own legal framework).

Done. Section removed at the request of multiple people.

> 
> # 2 Preparing to publish Linked Open Data
> I think the three lifecycle models are confusing.  I'm not sure we add anything by including all three.

Sorry, this was discussed over a year ago & the WG decided to keep all three because different cycles appealed to different people.  The feeling was that, in an informative doc, having three rather than one cannot hurt.

> 
> # 14 Vocabulary checklist
> This begins "It is best practice…." which leads me to ask, why?  I think it would help if we could back this up.

I agree that if I were writing a peer-reviewed academic paper, I'd include some citations.  However, a best practices document is a different form of writing: it is informative guidance, written & reviewed by people who are experts in this area.

> 
> 
> Overall, I have to say that I'm concerned we are still asking fundamental questions about the structure of this document.  I'm looking forward to today's telecon, when we can hopefully talk through some of them.

I'm sorry that some people are asking fundamental questions about the structure.   The feedback the editors have received has been more along the lines of lack of flow, duplicated content, inconsistent voice, some obvious errors, and some concerns about PII.

We're all volunteers doing our best to produce something that is a useful guide to gov't stakeholders.  

I'll personally take the blame for not having met my deadlines a quarter ago, but we did a lot of heavy lifting in 2012 in the hopes of getting the engagement that largely came during Nov-Dec '13.  I'm not complaining: it came in due course, and the doc improved considerably because of the feedback.

Cheers,
Bernadette

> 
> Looking forward to speaking to you all soon.
> 
> Cheers,
>    Hadley
> 
> Hadley Beeman
> Co-chair
> W3C Government Linked Data Working Group
> 
> On 15 Dec 2013, at 22:04, Dave Reynolds wrote:
> 
>> I don't have time to do another full review and am very uncomfortable with the volume of last minute changes.
>> 
>> Is there a diff between the version we reviewed before and this version?
>> 
>> Here are a few things that I noticed in doing a quick look, no claim that these are exhaustive. Some of these may have been present before.
>> 
>> # Overall
>> 
>> It reads like there are two documents clashing in here. A document outlining a standard template for a Linked Data publishing project with activities like "prepare" and "announce". Plus a document containing best practice advice for government linked data practitioners.
>> 
>> I guess that was true before but the restructuring seems to have brought the mismatch to the fore.
>> 
>> Don't have a specific suggestion for how to address that in the available time so presumably just have to live with it.
>> 
>> # Abstract
>> 
>> Not sure that the sentences on why web of data is wonderful are really an abstract of the document.
>> 
>> Rephrase "The following recommendations are offered to creators, maintainers and operators of Web sites." This is not aimed purely at such people; this is about data, not "Web sites".
>> 
>> # Audience and Scope
>> 
>> These sections seem confused.  Both of them are about audience and prerequisites - neither of them is about scope. The Audience section says you should know about HTML, URIs, HTTP. The Scope section says you should know about RDF. Put these two lists of prerequisites together.
>> 
>> Why is there a list of Linked Data syntaxes in the scope section?
>> 
>> HTTP URIs are *not* a syntax for Linked Data. [Repeated in section 8].
>> 
>> # 1 Summary of Best Practices
>> 
>> "The following best practices are discussed in this document and listed here for convenience."
>> 
>> Only a subset of the document is linked and listed here. What's the status of the other sections?
>> 
>> # 4 Data Modelling
>> 
>> Not sure what value this section adds, it says too little. Should either say more or say that modelling advice is out of scope.
>> 
>> [I understand what you mean by "application independent modelling" but it makes me nervous. There's no such thing as a completely neutral ontology; you always have to make choices about how and how deeply to model based on the envisioned range of use of the data, which is why competency questions are such an important part of the process. I guess it's a matter of degree: you try to reduce application dependence while accepting that complete independence is not an achievable goal.]
>> 
>> # 5 Basic metadata
>> 
>> The second sentence starting "In the following section ..." is now incorrect.
>> 
>> # 6 PII
>> 
>> Doesn't quite seem to match our discussion on Thursday. I thought we proposed saying WTTEO "Don't accidentally publish PII." Sometimes the purpose of the publication may include PII e.g. for officials. The current second sentence talks about "required by law" which is too strong. For example, when the UK published names and salaries of senior government staff it was a policy decision but while it was *permitted* by law I don't think it was *required* by law.
>> 
>> # 7 Specify an appropriate license
>> 
>> The Note is highly US specific. It was probably there before and I didn't pick it out then so I guess it stays but seems odd.
>> 
>> # 8 Convert to Linked Data
>> 
>> Not sure what "consensus that the object and relationships correctly reflect the dataset(s)" means.
>> 
>> "The next step involves mapping the source data into a set of RDF statements via a script" - there are lots of ways to convert data and scripts are only one - there are declarative mapping languages, languages that do query translation rather than data translation (e.g. R2RML), non-script programs etc.
>> 
>> Again "HTTP URIs" are not an RDF serialization.
>> 
>> # 8 & 9
>> 
>> Why are these two different sections?
>> 
>> # 12. Internationalized Resource Identifiers
>> 
>> Not sure how to read the last sentence:
>> "There is now a growing need to enable use of characters from any language in URIs."
>> Reads as if you are saying that something other than IRIs is needed, which I assume is not the case.
>> 
>> # 13 Standard vocabularies
>> 
>> The sentence starting "CSARVENón-Capadisli propose in [CSARVEN] the RDF Data Cube Vocabulary ..." is broken.
>> 
>> # 18 Publishing Data for Access and Reuse
>> 
>> The 5 star scale in this section is phrased for vocabularies but the rest of the text is talking about general data.
>> 
>> # 21. Announce to the public
>> 
>> The check list in the note repeats material from elsewhere in the document in a different form. It is not clear why this particular subset of best practice is listed again in this section.
>> 
>> 
>> I guess of these only the two mentions of HTTP URIs as being an RDF serialization and the yet-another-5-star scale in #18 are show stoppers.
>> 
>> Dave
>> 
>> 
>> On 13/12/13 20:02, Bernadette Hyland wrote:
>>> Hi,
>>> The Best Practices document has feedback incorporated & is available for
>>> review.[1]  Please send comments to the mailing list and the editors
>>> will continue responding.[2]
>>> 
>>> Thank you.
>>> 
>>> On behalf of the GLD best practice editors,
>>> 
>>> Bernadette, Boris & Ghislain
>>> W3C Government Linked Data Working Group
>>> 
>>> [1] https://dvcs.w3.org/hg/gld/raw-file/default/bp/index.html
>>> [2] public-gld-comments@w3.org
>>> 
>>> On Dec 12, 2013, at 12:27 PM, Hadley Beeman <hadley@linkedgov.org> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> As we agreed in today's call: [1]
>>>> 
>>>> 1.  The Best Practices editors will incorporate the existing feedback
>>>> and send out their finalized document to the working group tomorrow
>>>> (Friday) by 12:00 EST / 18:00 CET.
>>>> 2.  The working group will then review it, send comments to the
>>>> mailing list, and the editors will continue responding.
>>>> 3.  The editors, chairs and any interested working group participants
>>>> will hold an informal call on
>>>> 
>>>>      Telecon:  Tuesday  10:00 am EST / 15:00 GMT / 16:00 CET
>>>> 
>>>> to discuss issues, resolve any conflicts and work out what changes
>>>> need to be made in the document.
>>>> 4.  The editors will then make final changes, implement PubRules and
>>>> return the document to the working group for final review/approval on
>>>> Wednesday by 12:00 EST / 18:00 CET.
>>>> 5.  The working group will have 24 hours for a final read to make sure
>>>> they're ready to vote.
>>>> 6.  We will meet again
>>>> 
>>>>      Telecon:  Thursday 10:00 am EST / 15:00 GMT / 16:00 CET (our
>>>> normal time)
>>>> 
>>>> to vote on publishing the document as a working group note.
>>>> 
>>>> Lots to power through here, in a short amount of time… Mark your
>>>> calendars accordingly!
>>>> 
>>>> Thanks again to everyone for their input and to the editors for
>>>> continuing to persevere.  Speak to you all on Tuesday!  (And Thursday.)
>>>> 
>>>> Cheers,
>>>> 
>>>>    Hadley
>>>> 
>>>> 
>>>> Hadley Beeman
>>>> Co-chair
>>>> W3C Government Linked Data Working Group
>>>> 
>>>> [1] http://www.w3.org/2013/meeting/gld/2013-12-12
>>> 
>> 
>> 
> 

Received on Tuesday, 17 December 2013 23:31:39 UTC