Re: Proposals for Annette's comments to be considered before publishing the last working draft

Hi Annette,

Thanks for your message and your efforts too :) I have just a few comments.

cheers,
Berna

23 (Introduction):
>
> Phil did the native-speaker review. "Phenomenon" was removed. We propose to
> keep the examples [1].
>
> We need to use examples that are examples of the thing we are talking
> about, which is the expansion of the Web as a medium for the exchange of
> data. These examples don't represent use of the web per se, though they are
> things that could drive more usage of the web, if people decided to do
> that. The worst offender in this regard is "the provision of important
> cultural heritage collections". Important cultural heritage collections
> have been around for millennia. That only works as an example if it refers
> to putting those collections on the web.
>

--> Would it be ok if we say "...the Web as a medium for data sharing"
rather than "the Web as a medium for the exchange of data"?



27 (Context): Eric helped us to rewrite the diagram description:

The following is a composite diagram illustrating the anatomy of a
published and accessible Web dataset. Data values correspond to the data
itself and may be available in one or more distributions, which should be
defined by the publisher considering data consumers' expectations. The
Metadata component corresponds to the additional information that describes
the dataset and dataset distributions, helping consumers manipulate and
reuse the data. In order to allow easy access to the dataset and its
corresponding distributions, multiple dataset access mechanisms can be
available. Finally, to promote interoperability among datasets it is
important to adopt data vocabularies and standards.

Eric's description is very helpful in understanding the right side of the
figure, and I think the right-hand side is helpful, but the left-hand side
is still not working for me.  The colored rectangles are very abstract
concepts, and representing them in this way doesn't make them less
abstract. Also, if you inserted the details of the distributions into the
dataset, you would have metadata represented at two different levels. It's
not clear to me why that choice was made, but it seems to suggest that
there is metadata for the dataset that isn't to be included in the
distributions. It also appears that the concept of a dataset only exists
before it is distributed. Is the left side about storage of the data? If
so, then the colored rectangles make little sense being there. I think the
goal of the diagram was to explain the relationship between datasets,
distributions, data, and metadata. If it concentrated on those elements, it
would be more useful.

--> ok! I'm gonna try to redraw the diagram.

Machine-readable: A format in a standard computer language (not natural
language text) that can be read automatically by a computer system.
Traditional word processing documents and portable document format (PDF)
files are easily read by humans but typically are difficult for machines to
interpret. Formats such as XML, JSON, NetCDF, RDF or spreadsheets with
header columns that can be exported as CSV are machine readable formats.

This definition of machine-readable was proposed by Phil and it is from [2].

I disagree with the word "language" here, as a computer language usually
refers to a programming language, like C++ or Java.

How about
"Machine-readable data: Data in a standard format that can be read and
processed automatically by a computing system. Traditional word processing
documents and portable document format (PDF) files are easily read by
humans but typically are difficult for machines to interpret and
manipulate. Formats such as XML, JSON, HDF5, RDF and CSV are
machine-readable data formats."

--> I understand your point, but I'm not sure if we should change the
definition and still make a reference to it.
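
As a small illustration of the distinction both definitions draw, here is a
minimal sketch in Python. The sample data and the function names are
invented for this example and do not come from the draft or from [2]. It
shows that the same records, published as CSV and as JSON, can be read and
processed automatically without any human interpretation:

```python
import csv
import io
import json

# Hypothetical sample data, invented for illustration only.
CSV_TEXT = "station,temperature_c\nA,21.5\nB,19.0\n"
JSON_TEXT = (
    '[{"station": "A", "temperature_c": 21.5},'
    ' {"station": "B", "temperature_c": 19.0}]'
)


def read_csv_records(text):
    """Parse CSV text into a list of dicts, converting temperatures to float."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"station": row["station"], "temperature_c": float(row["temperature_c"])}
        for row in reader
    ]


def read_json_records(text):
    """Parse JSON text into a list of dicts."""
    return json.loads(text)


csv_records = read_csv_records(CSV_TEXT)
json_records = read_json_records(JSON_TEXT)

# Once parsed, the data can be manipulated directly, e.g. averaged.
mean_temperature = sum(r["temperature_c"] for r in csv_records) / len(csv_records)
```

The same values locked inside a word processing document or a PDF would
first need layout-dependent extraction before a step like the average above
could run.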

69 (license):

Could you contact Renato Iannella? Do you have any updates about this
comment?

I think I understand what Renato is after. He is pointing out that for
ODRL, they pretty much avoided using the word "license" altogether. For the
verb, they use "grantUse" (though I don't think we have the option of
using that term in our text, since it's not standard English on either side
of the Atlantic), and for the noun they use "agreement". I'm sure there are
many (of the 66) places in our text where "agreement" would work. We could
read through and look for opportunities to substitute "agreement" for the
noun "license". We would still have to use "license" for the verb and for
the noun in places where "agreement" didn't provide enough context.

--> I think we should keep using license rather than changing to agreement.

The comment that I was referring to is the following:
"We say "Data license information can be provided as a link to a
human-readable license or as a link/embedded machine-readable license."
Since licensing info is part of metadata, and we tell people to provide
metadata for both humans and machines, we should also require licensing
info for both humans and machines"

We discussed this comment in one of our Skype meetings and the idea of
having "link/embedded machine-readable license" was not clear to you.

I have a proposal:

Data license information can be provided as a link to a human-readable
license or to a machine-readable license, as well as an embedded
machine-readable license.
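
To make the three options in this wording concrete, here is a rough sketch
in Python. Everything in it is an assumption made for illustration: the
example.org URLs, the metadata keys, and the shape of the embedded license
are hypothetical and are not taken from the draft or from ODRL.

```python
# All names, URLs, and keys below are hypothetical, chosen only for illustration.
HUMAN_LICENSE_URL = "https://example.org/licenses/open-data"  # page meant for people
MACHINE_LICENSE_URL = "https://example.org/licenses/open-data.jsonld"  # for software

dataset_metadata = {
    "title": "Example dataset",
    # Option 1: link to a human-readable license.
    "license_human_readable": HUMAN_LICENSE_URL,
    # Option 2: link to a machine-readable license.
    "license_machine_readable": MACHINE_LICENSE_URL,
    # Option 3: machine-readable license embedded in the metadata itself.
    "license_embedded": {
        "permissions": ["reproduce", "distribute"],
        "requirements": ["attribution"],
    },
}


def license_forms(metadata):
    """Return the set of license forms present in a metadata record."""
    labels = {
        "license_human_readable": "human-readable link",
        "license_machine_readable": "machine-readable link",
        "license_embedded": "embedded machine-readable",
    }
    return {label for key, label in labels.items() if metadata.get(key)}


forms = license_forms(dataset_metadata)
```

A publisher could provide any one of the three forms, or several at once,
and a consumer could check which ones are present in the same way.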
--------------------------

----------------------------------------------------------------------------
Bernadette Farias Lóscio
Centro de Informática
Universidade Federal de Pernambuco - UFPE, Brazil
----------------------------------------------------------------------------



-- 
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory






Received on Tuesday, 26 April 2016 21:49:28 UTC