Re: Best Practice 4 (Document Metadata) - I agree to suppress it

On 21 January 2015 at 01:51, Laufer <laufer@globo.com> wrote:

> Hi, Carlos,
>
> > BP4 is about documenting what metadata terms you are finally using
>
> Terms are parts of a vocabulary.
>

Not necessarily. See DC, for example: I don't see them calling their set of
metadata terms a vocabulary anywhere.
The same goes for schema.org.



> And we will have a whole section about vocabularies.
>
> Metadata documents data. Then, metadata should be documented. These
> documents about metadata are metadata of metadata. We should be careful
> about an infinite chain.
>

Yes, we could follow the catch-22 forever or just try to forget the more
philosophical aspects of metadata for a while and try to find a practical
solution here from the user's perspective.



> If we talk about documents for machines, we are talking about
> vocabularies. And section 7.4 will take care of this.
>

Not necessarily. We are talking just about machine-readable data, that's
all. I have never heard anybody use the word "vocabularies" to refer to
data models outside the semantic web/linked data world. Remember that we
should be technology agnostic in our discourse, with the exception of the
implementation sections, where specific technical guidelines are expected.
So, the first thing is that 7.4 should maybe be renamed to "Data models"
for the sake of a more agnostic and technologically neutral document, and
also to avoid unnecessary overlap with the LDBP document, where the LD-only
perspective has already been well captured.


> If we are talking about humans, metadata is the documentation. Having
> documentation about metadata is mandatory. If metadata does not have
> documentation, it does not have a meaning. For example, if one says that
> the dataset has a GNU license, how can this be understood by a human if GNU
> is not documented? The meaning is the documentation and must exist if
> someone decides to refer to it.
>

For a better understanding, let's say that metadata is the set of term-value
pairs, and documentation is what explains what those pairs mean: the units of
measure being used, the possible ranges and the like.
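
To make that distinction concrete, here is a minimal sketch in Python. It is
only an illustration under my own assumptions: the terms ("temperature",
"license"), the TermDoc structure and the describe() helper are all
hypothetical and do not come from any BP draft or standard.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TermDoc:
    """Documentation for one metadata term (hypothetical structure)."""
    meaning: str
    unit: Optional[str] = None
    valid_range: Optional[Tuple[float, float]] = None


# Metadata: plain term-value pairs (example values are made up).
metadata = {"temperature": 21.5, "license": "GNU GPL"}

# Documentation: what each term means, its unit and its possible range.
documentation = {
    "temperature": TermDoc(
        meaning="Air temperature at the sensor",
        unit="degrees Celsius",
        valid_range=(-90.0, 60.0),
    ),
    "license": TermDoc(meaning="License under which the dataset is published"),
}


def describe(term, value):
    """Explain a term-value pair using the documentation, if there is any."""
    doc = documentation.get(term)
    if doc is None:
        return f"{term} = {value} (undocumented: meaning unknown)"
    text = f"{term} = {value}"
    if doc.unit:
        text += f" {doc.unit}"
    text += f": {doc.meaning}"
    if doc.valid_range is not None:
        low, high = doc.valid_range
        if not (low <= value <= high):
            text += f" [outside documented range {low}..{high}]"
    return text


for term, value in metadata.items():
    print(describe(term, value))
```

Without the documentation dictionary, a consumer only sees "temperature =
21.5" and has to guess the unit and the plausible range, which is the gap
that (former) BP4 is meant to cover.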



> In respect to code lists, (maybe this is not the formal definition) I
> think they are a kind of type, or even a kind of vocabulary. Again, I think
> section 7.4 is a better candidate to talk about this.
>

Not everything in the world needs to be a vocabulary now, although
conceptually it could be. Codelists are codelists. The term is well known in
the industry and has been in use since long before anybody started to talk
about vocabularies. Please, let's all try to see this from a global
perspective and apply SW/LD-specific concepts only where applicable, for
specific implementation techniques.

Whether codelists could be considered just codelists or metadata or data
models or thesauri or any other thing is a different discussion.

All the best,
 CI.



> On Tuesday, 20 January 2015, Carlos Iglesias <
> contact@carlosiglesias.es> wrote:
>
>> Hello everyone,
>>
>> Here goes my view on this:
>>
>> - I tend to disagree on (former) BP4 being derived from BP1+2+3
>>
>> BP1 is on metadata availability (provide metadata)
>> BP2 is on human vs. machine readable metadata (how to present metadata)
>> BP3 is about reusing generic standard metadata terms when possible (e.g.
>> dc, foaf and the like)
>> BP4 is about documenting what metadata terms (being reused or ad-hoc) you
>> are finally using
>>
>> I don't see overlap between any of the above.
>>
>> - WRT BP11 Document vocabularies
>>
>> I don't see any overlap with (former) BP4 either as:
>>
>> BP4 is about documenting what metadata terms you are finally using
>> BP11 is about documenting your data (not metadata) models (or
>> "vocabularies") in the case you are developing new ones.
>>
>> - Finally WRT Annette's comments I think there is a missing point here:
>> BPXX Document your data
>>
>> This is about the "data codebooks" that should accompany our data as
>> additional documentation but unfortunately are rarely available, which
>> makes working with 3rd party data a pain. These "codebooks" usually document
>> all the information that Annette is referring to in her message and more.
>>
>> Best,
>>  CI.
>>
>> On 20 January 2015 at 21:00, Annette Greiner <amgreiner@lbl.gov> wrote:
>>
>>> Here are a few things that come to mind as needing to be documented in
>>> metadata.
>>> Units, for any measure that is not unitless.
>>> For responses to a survey question, the question itself and how it was
>>> coded. (This is where code lists come in.)
>>> Meaning of nulls, zeroes, NA, etc.
>>> language, locale (we have this one covered elsewhere, but probably it
>>> should be included under the more general BP.)
>>>
>>> I think the metadata information right now is a little bit redundant.
>>> Documenting metadata is really the same as providing metadata. When we have
>>> generalized the BP about documenting, it will be even more like the one
>>> about providing metadata. In both cases, we are talking about using good
>>> metadata to describe the data and making it available to data consumers.
>>> -Annette
>>> --
>>> Annette Greiner
>>> NERSC Data and Analytics Services
>>> Lawrence Berkeley National Laboratory
>>> 510-495-2935
>>>
>>> On Jan 20, 2015, at 5:16 AM, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
>>> wrote:
>>>
>>> >
>>> > The Document metadata BP should be rewritten to become more general,
>>> > i.e., not just vocabularies should be documented. In this case, what
>>> > else should be documented when talking about metadata?
>>> >
>>>
>>>
>>>
>>
>>
>> --
>> ---
>>
>> Carlos Iglesias.
>> Internet & Web Consultant.
>> +34 687 917 759
>> contact@carlosiglesias.es
>> @carlosiglesias
>> http://es.linkedin.com/in/carlosiglesiasmoro/en
>>
>
>
> --
> .  .  .  .. .  .
> .        .   . ..
> .     ..       .
>




Received on Wednesday, 21 January 2015 18:43:57 UTC