- From: Laufer <laufer@globo.com>
- Date: Mon, 15 Dec 2014 15:36:02 -0200
- To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>
- Cc: Phil Archer <phila@w3.org>, Data on the Web Best Practices Working Group <public-dwbp-wg@w3.org>
- Message-ID: <CA+pXJih4bPznCFDgtLL=Qp2GZo5cXP-MH0YsqCRdLqq_=LW=RA@mail.gmail.com>
Hi, Bernadette,

Here are the 3 texts.

Best Regards,
Laufer

============================================================
First Text
============================================================

<section id="metadata">
<h4>Metadata</h4>

<p>Data on the web ecosystem has a subjacent architecture that involves actors with different roles such as data Publisher, data Consumer and data Broker. The Broker is the one that has information that can help the Consumer to find, to access and to process data published by the Publisher. Published data is a central entity in this ecosystem. A way of helping the Consumer to execute the tasks listed above is to provide data about data. Metadata is data about data. It provides additional information about data, to help consumers better understand the meaning of data, its structure, and to clarify other issues, such as license of use, the organization that generated the data, data quality, data access, the update schedule of datasets, etc.</p>

<p>Metadata can be used to help tasks such as dataset discovery and reuse. Data consumers could aggregate metadata about, for example, data usage, generating feedback to data providers, as a way of better meeting the needs of users and helping to improve data quality. Metadata can be assigned at different levels of granularity, from a single property of a resource to a whole dataset, or all datasets from a specific organization.</p>

<p>Metadata can be provided in two forms: human-readable and machine-readable. It is important to provide both forms of metadata in order to reach humans and applications. In the case of machine-readable metadata, the use of standard vocabularies should be encouraged as a way of enhancing common semantics.
For example, data provenance could be described using PROV-O, a W3C Recommendation that provides a set of classes, properties, and restrictions that can be used to represent and interchange provenance information generated in different systems and under different contexts.</p>

<p>Metadata can be of different types. These types can be classified in different taxonomies, with different grouping criteria. For example, a specific taxonomy could define three metadata types according to descriptive, structural and administrative features. Descriptive metadata serves to identify a dataset, structural metadata serves to understand the format in which the dataset is distributed, and administrative metadata serves to provide information about version, update schedule, etc. A different taxonomy could define metadata types with a scheme according to the tasks where metadata are used, for example, discovery and reuse.</p>

<p>It is out of the scope of this document to talk about metadata types related to dataset distribution formats, for example, CSV files, Linked Data, etc. Each format has its particular metadata scheme and different W3C groups are responsible for defining each of these standards. Taking the CSV example, the W3C CSV on the Web WG has the mission of providing technologies whereby data-dependent applications on the Web can provide higher interoperability when working with datasets using the CSV (Comma-Separated Values) or similar formats.</p>

<p>In this document we will talk about some types of metadata that are common to datasets, independently of the domain or the distribution format. A set of these types is described in the next sections.</p>

============================================================
Second Text (suppressed parts)
============================================================

<section id="metadata">
<h4>Metadata</h4>

<p>Metadata is data about data.
It provides additional information about data, to help consumers better understand the meaning of data, its structure, and to clarify other issues, such as license of use, the organization that generated the data, data quality, data access, the update schedule of datasets, etc.</p>

<p>Metadata can be used to help tasks such as dataset discovery and reuse, and can be assigned at different levels of granularity, from a single property of a resource to a whole dataset, or all datasets from a specific organization.</p>

<p>Metadata SHOULD be available in human-readable and machine-readable forms. It is important to provide both forms of metadata in order to reach humans and applications. In the case of machine-readable metadata, the use of standard vocabularies should be encouraged as a way of enhancing common semantics. For example, data provenance could be described using PROV-O, a W3C Recommendation that provides a set of classes, properties, and restrictions that can be used to represent and interchange provenance information generated in different systems and under different contexts.</p>

<p>Metadata can be of different types. These types can be classified in different taxonomies, with different grouping criteria. For example, a specific taxonomy could define three metadata types according to descriptive, structural and administrative features. Descriptive metadata serves to identify a dataset, structural metadata serves to understand the format in which the dataset is distributed, and administrative metadata serves to provide information about version, update schedule, etc. A different taxonomy could define metadata types with a scheme according to the tasks where metadata are used, for example, discovery and reuse.</p>

<p>It is out of the scope of this document to talk about metadata types related to dataset distribution formats, for example, CSV files, Linked Data, etc.
Each format has its particular metadata scheme and different W3C groups are responsible for defining each of these standards. Taking the CSV example, the W3C CSV on the Web WG has the mission of providing technologies whereby data-dependent applications on the Web can provide higher interoperability when working with datasets using the CSV (Comma-Separated Values) or similar formats. In this document we will talk about some types of metadata that are common to datasets, independently of the domain or the distribution format.</p>

============================================================
Phil's Text
============================================================

<section id="metadata">
<h4>Metadata</h4>

<p>The data on the Web ecosystem has an underlying architecture that involves actors with different roles. Primary among these are the roles of data <em>publisher</em> and data <em>consumer</em>, but this suggests a clear boundary between the two that may not exist or be helpful. For example, a data <em>broker</em> would consume data, process and/or enrich it in some way and then re-publish it, perhaps charging a fee for the service.</p>

<p>The data itself is a central entity in this ecosystem, but on its own it is likely to be hard to use if not completely useless. Helping the consumer to discover and understand data sufficiently to be able to use it in some way requires more data about the data, that is, metadata.</p>

<p>Metadata is a complex topic in its own right. It exists at different levels of granularity that go from a single property of a resource to a whole dataset, or all datasets from a specific organization. It supports multiple tasks including dataset discovery and dataset structure. Data consumers may aggregate metadata about, for example, data usage, generating feedback to data providers that might meet more needs of more users and help improve data quality.
And it's metadata that describes the license and terms of use, the organization that generated the data, the data quality, the update schedule, etc.</p>

<div class="issue">Should the following 2 paragraphs become best practices?</div>

<p>Metadata can be provided in two forms: human-readable and machine-readable. It is important to provide both forms of metadata in order to reach humans and applications. In the case of machine-readable metadata, the use of standard vocabularies should be encouraged as a way of enhancing common semantics. For example, data provenance could be described using PROV-O, a W3C Recommendation that provides a set of classes, properties, and restrictions that can be used to represent and interchange provenance information generated in different systems and under different contexts.</p>

<p>Metadata can be of different types. These types can be classified in different taxonomies, with different grouping criteria. For example, a specific taxonomy could define three metadata types according to descriptive, structural and administrative features. Descriptive metadata serves to identify a dataset, structural metadata serves to understand the format in which the dataset is distributed, and administrative metadata serves to provide information about version, update schedule, etc. A different taxonomy could define metadata types with a scheme according to the tasks where metadata are used, for example, discovery and reuse.</p>

<p>This document specifies the intended outcomes for each best practice and then gives some guidance on possible implementation methods. In terms of metadata, the particular implementation method will depend on the format of the dataset distribution; for example, metadata describing a CSV file should be provided in a different way than metadata for an RDF dataset.
However, the <em>intention</em> is the same irrespective of format.</p>

2014-12-15 14:10 GMT-02:00 Bernadette Farias Lóscio <bfl@cin.ufpe.br>:

> Hi Laufer,
>
> Could you please send to me the new version of your text, i.e., the one
> edited by Phil but also without the parts that you suppressed? I'm making
> some updates on the document and I can also update the metadata
> introduction.
>
> Thank you!
> Bernadette
>
> 2014-12-15 12:30 GMT-03:00 Laufer <laufer@globo.com>:
>
>> Hi, All,
>>
>> I wrote the metadata introduction text and, after the comments, I
>> suppressed some parts of the text. Meanwhile, Phil has edited the text (the
>> first one) as a native speaker (thank you Phil), and there was a conflict
>> in github. Now, what we have in the bp document is the first text edited by
>> Phil. I agree with the text but it has parts that I suppressed due to the
>> comments.
>>
>> Bernadette, Phil, I would like to know what is the procedure now.
>>
>> Thank you.
>>
>> Cheers,
>> Laufer
>>
>> --
>> . . . .. . .
>> . . . ..
>> . .. .
>
> --
> Bernadette Farias Lóscio
> Centro de Informática
> Universidade Federal de Pernambuco - UFPE, Brazil
>
> ----------------------------------------------------------------------------

--
. . . .. . .
. . . ..
. .. .
Received on Monday, 15 December 2014 17:36:31 UTC