RE: Actions for all of us — from today's call

 

A. Review the Use Case draft and be ready to vote on transitioning it to First Public Working Draft (FPWD) on next week's call.

 

My comment:

 

There is quite a difference in the level of detail across the use cases, and also in the nature of the technical challenges. As far as I can see, there are two broad categories of use cases. Most are based on actual practical experience, with specific challenges derived from problems that were encountered (e.g. UC6, UC10 through UC25). Others are based on more general considerations, such as a general need to document release schedules or to provide feedback mechanisms (e.g. UC1, 2, 5, 7, 8, 9), and these lead to more open questions or general requirements.

Would it be useful to distinguish the two categories in separate sections? 

 

Another issue I see is that many of the ‘challenges’ are not ‘technical’ in the sense that a technical solution (software, a standard, etc.) could overcome them; some are organisational or even legal (e.g. “Data is not available for further reuse by other parties”, UC16). Could the challenges be grouped into categories by type of solution, or could we otherwise simply call them ‘Challenges’?

 

On the requirements section in general: I am not sure how we can formulate a requirement if there is no explicit motivation for it.

 

I am not so sure about R-MetadataOpen. I don’t think we can say that metadata should always be open. In relation to R-SensitivePrivacy and R-SensitiveSecurity, there could be cases where even access to the metadata needs to be restricted, because the metadata would reveal the existence of data that needs to be kept secret.

 

In R-GranularityMax, what is the particular relation between granularity and ‘privacy rights’? In general, any publication of data should respect the applicable laws of the jurisdiction in which it is published, but we don’t need to say that here, or maybe just once in the introduction?

 

R-FormatMachineRead seems more specific than what the two use cases listed as its motivation actually require. Those cases seem to point to the problem of differing formats, which is already covered by R-FormatStandardised. I am actually not so sure about R-FormatMachineRead in principle. After all, all formats of data on the Web (which is what we are concerned with) are machine-readable: data can only be on the Web if it is a file on a computer. Some formats may be easier to process for certain purposes, but they are all machine-readable. For example, for a visually impaired person with a PDF-to-speech reader, PDF is an ideal machine-readable format. Maybe the requirement is rather that data should be published in formats that are appropriate for its intended or potential use?

 

For R-FormatLocalise, maybe the requirement could be expressed more neutrally as “Information about locale parameters (date and number formats, language) should be made available”, which is more specific than “It should be possible to localise data on the Web”.
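
To illustrate why explicit locale parameters matter, here is a minimal sketch in Python. The metadata keys (`decimal_separator`, `thousands_separator`, `date_format`, `language`) are purely hypothetical names that I made up for the example; they are not taken from the draft or from any existing vocabulary. The point is only that values such as “1.234,56” or “03.04.2014” cannot be interpreted reliably unless the publisher states the conventions used.

```python
from datetime import datetime

# Hypothetical locale metadata a publisher might attach to a dataset.
# The key names are illustrative assumptions, not an existing standard.
locale_info = {
    "language": "de",
    "decimal_separator": ",",
    "thousands_separator": ".",
    "date_format": "%d.%m.%Y",
}


def parse_number(text, info):
    """Parse a number string using the separators declared in the metadata."""
    cleaned = text.replace(info["thousands_separator"], "")
    cleaned = cleaned.replace(info["decimal_separator"], ".")
    return float(cleaned)


def parse_date(text, info):
    """Parse a date string using the date format declared in the metadata."""
    return datetime.strptime(text, info["date_format"]).date()


# Without the metadata, "03.04.2014" could be 3 April or 4 March.
amount = parse_number("1.234,56", locale_info)
day = parse_date("03.04.2014", locale_info)
print(amount, day.isoformat())
```

With the declared parameters a consumer can read the values unambiguously; without them, the consumer has to guess.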

 

R-LicenseMachineRead and R-ProvMachineRead have the same description.

 

I could not find the motivation for R-SelectHighValue in the use case mentioned (UC11). Moreover, it is not at all clear what “high value” means; this depends very much on your perspective, the type of data and the potential benefit.

 

I don’t understand how R-Citable is a requirement for Data Usage. It’s much more general than that. And don’t URIs for data solve that requirement already? Or do you mean that something like DataCite (https://www.datacite.org/) should be considered?

 

Thanks again for the good work!

 

Makx.

 

Received on Friday, 30 May 2014 09:55:16 UTC