Re: How to express the nature of the content of a Dataset?

Dear Tim,

Thanks for your input. We have created a GitHub Issue "Discussion: how to
describe the nature of the content for Datasets?"
<https://github.com/BioSchemas/specifications/issues/629> to gather input,
and I think some people have already added comments there. If you are
interested in the topic, please have a look.

Regarding your comments: Bioschemas already provides a Dataset schema
specification (https://bioschemas.org/profiles/Dataset/), which includes a
property "description" for free-text descriptions of a Dataset.

The discussion at GitHub Issue "Discussion: how to describe the nature of
the content for Datasets?"
<https://github.com/BioSchemas/specifications/issues/629> is more about a
quick way to describe (i.e., express the metadata corresponding to) the
content of the dataset, e.g., a dataset of proteomics for butterflies.
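
As an illustration only, the three levels of description from Tim's message
(free text, controlled-vocabulary keywords, per-variable schema) could be
sketched as schema.org-style markup built in Python. This is not an agreed
Bioschemas approach, since there is no consensus yet. It uses the generic
schema.org properties "description", "keywords" and "variableMeasured"; the
dataset name, the variable and all example.org URLs are hypothetical
placeholders.

```python
import json


def build_dataset_metadata():
    """Return a schema.org Dataset description as a JSON-LD-style dict.

    All values are hypothetical; the example.org URLs stand in for
    terms from whichever controlled vocabulary a community chooses.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": "Butterfly proteomics (hypothetical example)",
        # Level 1: free-text description, similar to an abstract.
        "description": "Proteomics measurements collected for butterflies.",
        # Level 2: keywords drawn from controlled vocabularies.
        "keywords": [
            {
                "@type": "DefinedTerm",
                "name": "proteomics",
                "url": "https://example.org/vocab/proteomics",
            },
            {
                "@type": "DefinedTerm",
                "name": "butterflies",
                "url": "https://example.org/vocab/butterflies",
            },
        ],
        # Level 3: one entry per data element in the dataset.
        "variableMeasured": [
            {
                "@type": "PropertyValue",
                "name": "protein_abundance",
                "description": "Abundance of a protein in a sample.",
                "propertyID": "https://example.org/vocab/protein-abundance",
            },
        ],
    }


metadata = build_dataset_metadata()
print(json.dumps(metadata, indent=2))
```

The open question in the GitHub Issue is which property, if any, should
carry the "quick" statement of what the dataset is about; the sketch only
shows where the existing, more general properties would sit.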

Kind regards,

On Tue, Jan 24, 2023 at 9:38 PM Clark, Timothy (twc8q) <twc8q@virginia.edu>
wrote:

> My thought on this is that you have multiple kinds / levels of content
> description.
>
> (1) *Full text description*, similar to an “abstract” of the dataset.
> Explains to humans what the dataset is concretely, and can serve to aid in
> searches. This is contained in “Bibliographic” style metadata - as in
> Datacite - and obviously goes along with other relevant terminology such as
> “version” etc.
> (2) *Guide metadata*, i.e. keywords, using controlled vocabularies, which
> are specifically designed to aid in tuning searches - use of controlled
> vocabularies will enable synonyms etc. to ensure semantics included in
> search.
> (3) *Dataset schema specifications*.  There are various approaches to
> doing this, but essentially you serialize names, descriptions, datatypes,
> and controlled vocabulary term alignments for each data element in the
> dataset. These not only enable proper reuse of the datasets but also
> strengthen description specificity, as in “this dataset contains data with
> these attributes”.
>
> Cheers,
>
> Tim
>
>
> On Jan 24, 2023, at 1:06 PM, LJ.Garcia <lj.garcia.co@gmail.com> wrote:
>
> Hi Bioschemas community,
>
> Some of you have expressed interest in adding information about the nature
> of the content of your Dataset. For instance, if your Dataset is compiling
> information about a particular Taxon, you would want to add that
> information to the description (i.e., metadata) of your Dataset. @Daniel
> Arend <arendd@ipk-gatersleben.de> this would be your use case.
>
> There is, so far, no consensus at Bioschemas on how to do so. We have
> created a GitHub Issue "Discussion: how to describe the nature of the
> content for Datasets?"
> <https://github.com/BioSchemas/specifications/issues/629> to gather
> options and views on the subject. Please contribute to the discussion.
>
> Kind regards,
> Leyla Jael Castro
> Bioschemas Chair
>
>
>

Received on Sunday, 5 February 2023 10:12:27 UTC