Re: Intrinsic vs extrinsic metadata (my action #54) from Andrea Perego on 2014-07-11 (public-dwbp-wg@w3.org from July 2014)

From: Andrea Perego <andrea.perego@jrc.ec.europa.eu>
Date: Fri, 11 Jul 2014 11:48:22 +0200
To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>
Cc: Mark Harrison <mark.harrison@gs1.org>, public-dwbp-wg <public-dwbp-wg@w3.org>, Hadley Beeman <hadley@linkedgov.org>, Phil Archer <phila@w3.org>
Message-id: <CAHzfgWA=jvrsepxFOAA6DTMOXxqXuJxyZEw6N5H-FfNECYqxFg@mail.gmail.com>

Thanks for your reply, Bernadette.

Please find my comments inline.

> [snip]
>
> When I prepared the diagram I had the DCAT definitions in mind. However, I also see some intersection with the VoID definitions. In my opinion, the definitions of dataset and distribution should be "general", i.e. independent of, for example, the data model.

Thanks for the clarification, Bernadette.

>> 2. From the diagram it is not clear whether some metadata elements are specific to data, datasets or their distributions, or whether they can be used for all of them. E.g., are "access metadata" just for distributions, or also for data/sets?
>
> The initial idea was to identify metadata to describe datasets. I included access metadata as part of the classification, but I'm not sure if this type of metadata should be used to describe datasets or distributions. Moreover, it is not clear to me what types of metadata should be used to describe the distributions. For example, should we use the same ones that we use to describe datasets?

IMHO, this depends on how the different entities are defined. E.g.,
supposing that the notions of dataset and distribution correspond to
the ones defined in DCAT, access metadata concern distributions - a
dataset is an abstract entity, you can access just its representations
- or manifestations, using the FRBR terminology.
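
To make this concrete, here is a minimal sketch, assuming the DCAT
notions of dataset and distribution. The class and field names are
hypothetical, loosely modelled on dcat:Dataset, dcat:Distribution,
dcat:accessURL and dcat:mediaType; the URLs are placeholders. The
point is only that access metadata sit on the distribution, and the
dataset is reached through its distributions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Distribution:
    """A concrete, accessible form of a dataset (cf. dcat:Distribution)."""
    access_url: str   # access metadata live here (hypothetical field name)
    media_type: str


@dataclass
class Dataset:
    """An abstract entity (cf. dcat:Dataset); it has no access URL of its own."""
    title: str
    distributions: List[Distribution] = field(default_factory=list)

    def access_urls(self) -> List[str]:
        # A dataset can only be accessed through its distributions.
        return [d.access_url for d in self.distributions]


# Placeholder example data, not taken from any real catalogue.
ds = Dataset(
    title="Example dataset",
    distributions=[
        Distribution("http://example.org/data.csv", "text/csv"),
        Distribution("http://example.org/data.ttl", "text/turtle"),
    ],
)
```

If dataset and distribution were defined differently from DCAT, the
access fields might of course belong elsewhere.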

>> 3. I wonder whether structural metadata are meant to describe only the structure (database schema) or also the content (database instances). Actually, in VoID structural metadata do both.
>
> Structural metadata should describe the data itself. They should provide an interpretation for the dataset content (i.e. the data). It can be seen as the vocabulary (ontology) that describes the data. I think this idea is different from the structural metadata proposed by VoID. If you have an RDF distribution for a given dataset, maybe you can have a VoID description for this specific distribution.

I see the point. So, is the description only intensional (i.e., about
the characteristics of the entities in the dataset), or also
extensional (e.g., how many entities are in the dataset, and which
entities they are)?
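
A small sketch of the distinction I have in mind, under the
assumption that a dataset is just a list of records. The sample
records and both function names are hypothetical and serve only as
illustration.

```python
# Hypothetical sample records standing in for the dataset content.
records = [
    {"id": "a", "name": "Alpha", "population": 10},
    {"id": "b", "name": "Beta", "population": 20},
]


def intensional_description(rows):
    """Describe the characteristics of the entities: which properties they have."""
    return sorted({key for row in rows for key in row})


def extensional_description(rows):
    """Describe the instances: how many entities there are, and which ones."""
    return {
        "entity_count": len(rows),
        "entities": [row["id"] for row in rows],
    }
```

The first function depends only on the shape of the records, whereas
the second changes whenever an entity is added or removed.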

>> 4. The diagram does not model the fact that metadata are, in turn, data. As such, metadata records may be available in different formats (metadata distributions) and they can be described by other metadata (this scheme is, in theory, recursive). A real world example is given by INSPIRE [1], where we have "metadata on metadata", providing information concerning the provenance of a metadata record (responsible, language, creation/publication/modification dates).
>
> Yes, this is a good observation! I agree that metadata itself may have some properties (metadata). Maybe we can consider that these properties will be associated with the class metadata and will be inherited by the subclasses. Does that make sense to you?

I was thinking that, in order to model this, it might be enough to
make metadata a subclass of the entity "data". This would also model:
- the "recursive" nature of metadata (i.e., in theory, you may have
metadata on metadata, which in turn can be described by other metadata
and so on);
- the fact that metadata, being data, also have distributions.
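
As a rough sketch of this idea (all class, field and file names are
hypothetical, and this is not meant as the model the group should
adopt), making metadata a subclass of data gives both properties for
free: a metadata record can carry its own distributions and can
itself be described by further metadata.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Data:
    """Anything that is data: it has distributions and may be described by metadata."""
    label: str
    distributions: List[str] = field(default_factory=list)
    described_by: Optional["Metadata"] = None


@dataclass
class Metadata(Data):
    """Metadata are data too, so they inherit distributions and described_by."""


def description_chain(item: Data) -> List[str]:
    """Follow described_by links: metadata, metadata on metadata, and so on."""
    chain = []
    current = item.described_by
    while current is not None:
        chain.append(current.label)
        current = current.described_by
    return chain


# Hypothetical example in the spirit of "metadata on metadata".
record_provenance = Metadata(
    label="provenance of the metadata record",
    distributions=["record-provenance.xml"],
)
record = Metadata(
    label="metadata record",
    distributions=["record.xml", "record.rdf"],
    described_by=record_provenance,
)
dataset = Data(label="dataset", distributions=["data.csv"], described_by=record)
```

Here description_chain(dataset) walks from the dataset to its
metadata record and then to the metadata about that record; the chain
simply stops where no further description exists.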

Cheers,

Andrea

Received on Friday, 11 July 2014 09:49:04 UTC