Re: The document needs editing by a native English speaker

Thank you, Phil!

On 22/01/15 10:55, Phil Archer wrote:
> I'm working through it now and yes, native speaker adjustment is part 
> of that.
>
> On 22/01/2015 12:42, Caroline Burle wrote:
>> Hello!
>>
>> Annette raised the point that the document needs editing by a native
>> English speaker. I understand Phil will do it; is that correct?
>>
>> Kind regards,
>> Caroline
>>
>> On 21/01/15 18:31, Annette Greiner wrote:
>>> I’m concerned that we are maybe getting out of scope with all the
>>> detail about vocabularies. Creating new vocabularies is a different
>>> task from publishing data that uses them, unless you are talking about
>>> custom controlled vocabularies. Of course, publishers need to document
>>> any custom controlled vocabularies they are using, but the best
>>> practices we have seem to be written for people inventing large
>>> standardized ones. Creating large standardized vocabularies is not
>>> something we expect data publishers to do per se. In fact, the more we
>>> emphasize information about creating vocabularies, the more we seem to
>>> be suggesting that data publishers should be doing that regularly. I
>>> would rather we de-emphasized creating new vocabularies and instead
>>> emphasized re-using existing vocabularies.
>>>
>>> Thus, I think we should reconsider whether each of the BPs in 7.4 is
>>> in scope. BP12, BP13, and BP15 seem to me meant for people developing
>>> large standard vocabularies. BP14 is just the reverse way of saying
>>> the same thing as BP3. BP11 should only address custom controlled
>>> vocabularies. (Publishers should not produce new, and possibly
>>> conflicting, documentation for existing vocabularies; that task falls
>>> to the creators of the vocabulary.)
>>>
>>> I also think BP3 should only be a SHOULD. I wouldn’t want someone to
>>> avoid publishing because they felt they had to use standard
>>> vocabularies to do that. Many datasets in the sciences are
>>> overwhelmingly filled with data that has no standard vocabulary
>>> (because the domain is too new).
>>>
>>> The document needs editing by a native English speaker. Is someone
>>> already in line to do that?
>>> -Annette
>>>
>>> -- 
>>> Annette Greiner
>>> NERSC Data and Analytics Services
>>> Lawrence Berkeley National Laboratory
>>> 510-495-2935
>>>
>>> On Jan 21, 2015, at 10:16 AM, Bernadette Farias Lóscio
>>> <bfl@cin.ufpe.br> wrote:
>>>
>>>> Hello Carlos,
>>>>
>>>> Thanks for your comments!
>>>>
>>>> When I said that the Document Metadata BP was redundant with the
>>>> Document Vocabularies BP, I was considering the BP definition and not
>>>> the real intention of the BP.
>>>>
>>>> If we consider the meaning that "BP4 is about documenting what
>>>> metadata terms (being reused or ad-hoc) you are finally using", then
>>>> Document Metadata is not redundant with Document Vocabularies BP.
>>>>
>>>> In this case, it should be clearer what "documenting" really means.
>>>> If documenting means to "provide a document that describes the
>>>> metadata", then I think that the BP on human vs. machine readable
>>>> metadata covers this requirement. On the other hand, if documenting
>>>> metadata means maintaining documentation for the metadata, then maybe
>>>> we should have a different BP. In this case, there will be three BPs:
>>>>
>>>> 1. Document metadata BP: data publishers SHOULD maintain
>>>> documentation of the metadata that describes their data. This BP
>>>> concerns something that has to be done by the data publisher, but this
>>>> action doesn't have a direct impact on data consumers. There is
>>>> another BP (Provide metadata) to say that this documentation should be
>>>> provided to data consumers. This BP should be more general than the
>>>> Document Vocabularies BP. The metadata documentation should just name
>>>> the vocabularies that are used, instead of providing complete
>>>> documentation for the vocabularies.
>>>>
>>>> 2. Provide metadata for both humans and machines BP: data publishers
>>>> SHOULD document metadata in such a way that both humans and machines
>>>> can read it. This BP complements the previous one because it says how
>>>> metadata should be documented.
>>>>
>>>> 3. Provide metadata BP: data publishers SHOULD provide metadata
>>>> documentation to data consumers. When you have the documentation, give
>>>> it to the data consumers.
>>>>
>>>> Does this make sense to you?
>>>>
>>>> Cheers,
>>>> Bernadette
>>>>
>>>>
>>>> 2015-01-20 21:51 GMT-03:00 Laufer <laufer@globo.com>:
>>>>> Hi, Carlos,
>>>>>
>>>>>> BP4 is about documenting what metadata terms you are finally using
>>>>> Terms are parts of a vocabulary.
>>>>>
>>>>> And we will have a whole section about vocabularies.
>>>>>
>>>>> Metadata is documenting data. Then, metadata should be documented.
>>>>> These documents about metadata are metadata of metadata. We should
>>>>> beware of an infinite chain.
>>>>>
>>>>> If we talk about documents for machines, we are talking about
>>>>> vocabularies. And section 7.4 will take care of this.
>>>>>
>>>>> If we are talking about humans, metadata is the documentation. Having
>>>>> documentation about metadata is mandatory. If metadata does not have
>>>>> documentation, it does not have a meaning. For example, if one says
>>>>> that the dataset has a GNU license, how can this be understood by a
>>>>> human if GNU is not documented? The meaning is the documentation and
>>>>> must exist if someone decides to refer to it.
>>>>>
>>>>> With respect to code lists (maybe this is not the formal definition),
>>>>> I think they are a kind of type, or even a kind of vocabulary. Again,
>>>>> I think section 7.4 is a better candidate to talk about this.
>>>>>
>>>>> Best regards,
>>>>> Laufer
>>>>>
>>>>>
>>>>>
>>>>> On Tuesday, 20 January 2015, Carlos Iglesias
>>>>> <contact@carlosiglesias.es> wrote:
>>>>>
>>>>>> Hello everyone,
>>>>>>
>>>>>> Here goes my view on this:
>>>>>>
>>>>>> - I tend to disagree on (former) BP4 being derived from BP1+2+3
>>>>>>
>>>>>> BP1 is on metadata availability (provide metadata)
>>>>>> BP2 is on human vs. machine readable metadata (how to present
>>>>>> metadata)
>>>>>> BP3 is reusing generic standard metadata terms when possible
>>>>>> (e.g. dc, foaf and the like)
>>>>>> BP4 is about documenting what metadata terms (being reused or
>>>>>> ad-hoc) you are finally using
>>>>>>
>>>>>> I don't see overlap between any of the above.
>>>>>>
>>>>>> - WRT BP11 Document vocabularies
>>>>>>
>>>>>> I don't see any overlap with (former) BP4 either, as:
>>>>>>
>>>>>> BP4 is about documenting what metadata terms you are finally using
>>>>>> BP11 is about documenting your data (not metadata) models (or
>>>>>> "vocabularies") in the case you are developing new ones.
>>>>>>
>>>>>> - Finally, WRT Annette's comments, I think there is a missing point
>>>>>> here:
>>>>>> BPXX Document your data
>>>>>>
>>>>>> This is about the "data codebooks" that should accompany our data as
>>>>>> additional documentation but unfortunately are rarely available,
>>>>>> making working with 3rd party data a pain. These "codebooks" usually
>>>>>> document all the information that Annette is referring to in her
>>>>>> message and more.
>>>>>>
>>>>>> Best,
>>>>>> CI.
>>>>>>
>>>>>> On 20 January 2015 at 21:00, Annette Greiner <amgreiner@lbl.gov>
>>>>>> wrote:
>>>>>>> Here are a few things that come to mind as needing to be documented
>>>>>>> in metadata.
>>>>>>> Units, for any measure that is not unitless.
>>>>>>> For responses to a survey question, the question itself and how it
>>>>>>> was coded. (This is where code lists come in.)
>>>>>>> Meaning of nulls, zeroes, NA, etc.
>>>>>>> Language, locale (we have this one covered elsewhere, but probably
>>>>>>> it should be included under the more general BP.)
>>>>>>>
>>>>>>> I think the metadata information right now is a little bit redundant.
>>>>>>> Documenting metadata is really the same as providing metadata. When
>>>>>>> we have generalized the BP about documenting, it will be even more
>>>>>>> like the one about providing metadata. In both cases, we are talking
>>>>>>> about using good metadata to describe the data and making it
>>>>>>> available to data consumers.
>>>>>>> -Annette
>>>>>>> -- 
>>>>>>> Annette Greiner
>>>>>>> NERSC Data and Analytics Services
>>>>>>> Lawrence Berkeley National Laboratory
>>>>>>> 510-495-2935
>>>>>>>
>>>>>>> On Jan 20, 2015, at 5:16 AM, Bernadette Farias Lóscio
>>>>>>> <bfl@cin.ufpe.br>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> The Document metadata BP should be rewritten to become more
>>>>>>>> general, i.e., not just vocabularies should be documented. In this
>>>>>>>> case, what else should be documented when talking about metadata?
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> ---
>>>>>>
>>>>>> Carlos Iglesias.
>>>>>> Internet & Web Consultant.
>>>>>> +34 687 917 759
>>>>>> contact@carlosiglesias.es
>>>>>> @carlosiglesias
>>>>>> http://es.linkedin.com/in/carlosiglesiasmoro/en
>>>>>
>>>>>
>>>>> -- 
>>>>> .  .  .  .. .  .
>>>>> .        .   . ..
>>>>> .     ..       .
>>>>
>>>>
>>>> -- 
>>>> Bernadette Farias Lóscio
>>>> Centro de Informática
>>>> Universidade Federal de Pernambuco - UFPE, Brazil
>>>> ---------------------------------------------------------------------------- 
>>>>
>>>>
>>>
>>
>>
>>
>

Received on Thursday, 22 January 2015 13:01:21 UTC