Re: dwbp-ISSUE-134 (BernadetteLoscio): About Formats, schemas, vocabularies and data models [Best practices document(s)]

From: Eric Stephan <ericphb@gmail.com>
Date: Tue, 3 Feb 2015 08:29:20 -0800
Message-ID: <CAMFz4jg6opXmTXK8z4yPtTs+Tz3OLq7t-=2N=uo3F=3i2vMQ+Q@mail.gmail.com>
To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>
Cc: Data on the Web Best Practices Working Group <public-dwbp-wg@w3.org>
Bernadette,

I think that we as a group could waste a lot of time going back and forth
over what counts as a model, a schema, or a format. I am more concerned
that, whatever we choose, we:

   - clearly explain what we mean when we use a term and
   - are consistent using the same terminology throughout the document.

Voting on what you suggested:

- the structure of the data should be referred to as the data schema +1
- the collection of terms used in the schema to describe how to
interpret data values should be referred to as the vocabulary +1
- the abstract syntax to define schemas should be referred to as data
model -1 (This seemed a bit confusing to me.)
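
To make the three terms concrete, here is a minimal Python sketch of how I
read them against the Person(name, age, sex, id) example quoted below. It is
only my interpretation and may not match what was intended: the use of
SQLite, the column types, and the sample row are all assumptions added for
illustration and come from nowhere in the thread.

```python
import sqlite3

# Data model (assumed here): the relational model, i.e. the constructs
# "relation", "attribute" and "tuple" that any relational schema is built from.
conn = sqlite3.connect(":memory:")

# Data schema: the structure of one particular relation, Person(name, age, sex, id).
# The column types are illustrative assumptions; the thread does not give any.
conn.execute(
    "CREATE TABLE Person (name TEXT, age INTEGER, sex TEXT, id INTEGER PRIMARY KEY)"
)

# Vocabulary: the terms the schema uses to label data values.
# PRAGMA table_info returns one row per column; index 1 holds the column name.
vocabulary = [row[1] for row in conn.execute("PRAGMA table_info(Person)")]

# An instance is interpreted through the schema: each value is read via its term.
# The row below is made-up sample data.
conn.execute("INSERT INTO Person VALUES ('Ana', 30, 'F', 1)")
row = conn.execute("SELECT name, age, sex, id FROM Person").fetchone()
record = dict(zip(vocabulary, row))

print(vocabulary)
print(record)
```

In this reading the CREATE TABLE statement is the schema, the column names
pulled back out of it are the vocabulary, and the relational model that
SQLite implements plays the role of the data model.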

I hope this is helpful,

Eric S



On Tue, Feb 3, 2015 at 8:02 AM, Bernadette Farias Lóscio <bfl@cin.ufpe.br>
wrote:

> Hi all,
>
> I'd like to discuss with you the difference between vocabulary, data
> schema, data model and data format. João Paulo started this discussion
> earlier in this message:
> https://lists.w3.org/Archives/Public/public-dwbp-wg/2015Jan/0195.html
>
> It is worth reading the whole message to better understand the
> definitions. In the following, I show just parts of the message with
> some definitions:
> -------------------------
> - About data representation and data format
>
> "By "data representation" we mean any convention for the arrangement of
> symbols in such a way as to enable information to be encoded by a data
> producer and later decoded by data consumers.
>
> A particular convention for data representation is often referred to as a
> "data format"."
>
> ....
>
> - About schemas
>
> For example, an XML-based format can be
> specified with a "schema document" in the XML Schema Definition language,
> enabling XML documents to be checked for conformance to the format defined
> in the schema document [XML-SCHEMA].
>
> "schemas" are often used as a means to anchor natural language
> descriptions to guide humans in the interpretation of data produced using
> the format. Often, labels are used in these schemas to convey intuitive
> meaning and guide interpretation, in which case these labels serve the role
> of "terms" in communication. The collection of terms as used in the schema
> is then referred to as a "vocabulary".
>
> ------------------------------
>
> The notion of schema presented above is similar to the one of
> relational schema in the database world. A relational database schema
> describes the set of relation schemas of a given database. A relation
> schema is composed of the name of the relation together with its
> attributes. This specifies how to interpret instances of a given
> relation (or table). In the database world, a data model consists of a
> set of constructs to build databases. For example, in the relational
> model, databases are represented as a collection of relations (or
> tables).
>
> IMO vocabularies may be used to describe data schemas even when the
> RDF model is not being used. Vocabularies should be used to help tasks
> like data integration and to improve data interoperability.
>
> In this case, I suggest:
>
> - the structure of the data should be referred to as the data schema
> - the collection of terms used in the schema to describe how to
> interpret data values should be referred to as the vocabulary
> - the abstract syntax to define schemas should be referred to as data model
>
> Example  (relational schema defined according to the relational data
> model):
>
> Person(name, age, sex, id) --> this is the schema
> terms name, age, sex and id --> this is the vocabulary
>
> cheers,
> Bernadette
>
> 2015-01-22 13:46 GMT-03:00 Data on the Web Best Practices Working
> Group Issue Tracker <sysbot+tracker@w3.org>:
> > dwbp-ISSUE-134 (BernadetteLoscio): About Formats, schemas, vocabularies
> > and data models [Best practices document(s)]
> >
> > http://www.w3.org/2013/dwbp/track/issues/134
> >
> > Raised by: Joao Paulo Almeida
> > On product: Best practices document(s)
> >
> > The group needs to settle on some concepts (and ultimately terms) that
> > should help us to structure our discussions, give us a basis to
> > communicate and help our audience to understand us.
> >
> >
> >
>
>
>
> --
> Bernadette Farias Lóscio
> Centro de Informática
> Universidade Federal de Pernambuco - UFPE, Brazil
>
> ----------------------------------------------------------------------------
>
>
Received on Tuesday, 3 February 2015 16:29:47 UTC