RE: DCAT comments - dataset dependecy - http://www.w3.org/TR/2013/WD-vocab-dcat-20130801/

Thanks Fadi,

Your comment addresses my concern as it details the desired semantics!

I understand argument 2.

Argument 1 is more difficult to understand.
- Ontologically speaking, optional properties are not relevant.
- Still, if present, they carry specific semantics (as you explain: this distribution supports this language).
- For an agent exploiting a DCAT register, it seems relevant, even mandatory, to know these semantics in order to extract the dataset.
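
To illustrate the last point, here is a minimal sketch in Python of how such an agent might use dct:language on the distribution. It is only an assumption about one possible consumer, not anything DCAT defines: the dictionary layout, the function name pick_distribution and the example.org URLs are hypothetical stand-ins for your :ds1 / :dist1 / :dist2 example, and a real agent would parse the RDF instead.

```python
# Hypothetical in-memory view of the DCAT description; a real agent would parse RDF.
DATASET = {
    "dct:language": ["en", "de"],
    "dcat:distribution": [
        {"id": "dist1", "dct:language": "en",
         "dcat:accessURL": "http://example.org/url-en"},  # placeholder URL
        {"id": "dist2", "dct:language": "de",
         "dcat:accessURL": "http://example.org/url-de"},  # placeholder URL
    ],
}


def pick_distribution(dataset, wanted_language):
    """Return the first distribution tagged with wanted_language, or None."""
    for dist in dataset["dcat:distribution"]:
        # Without dct:language on the distribution there is nothing to match on.
        if dist.get("dct:language") == wanted_language:
            return dist
    return None


if __name__ == "__main__":
    chosen = pick_distribution(DATASET, "de")
    print(chosen["dcat:accessURL"] if chosen else "no matching distribution")
```

If the distributions carried no dct:language, the lookup could only return None, and the agent would have to guess which URL to fetch.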

The EU typically has multilingual support. Other nations (Canada, Switzerland, Belgium, ...) and international organizations (UN, World Bank, OECD, ...) will likely have similar semantic issues to solve if they adopt DCAT.
If you agree it is an improvement, would it be possible to add an example as well (as given below in the mail)?

Kind Regards,

Johan De Smedt 
> -----Original Message-----
> From: Fadi Maali [mailto:fadi.maali@deri.org]
> Sent: Wednesday, 30 October, 2013 10:51
> To: Johan De Smedt
> Cc: public-gld-comments@w3.org
> Subject: Re: DCAT comments - dataset dependecy - http://www.w3.org/TR/2013/WD-vocab-dcat-20130801/
> 
> Hello Johan,
> 
> Please see the text I added at:
> https://dvcs.w3.org/hg/gld/raw-file/default/dcat/index.html#Property:dataset_language
> 
> Notice that the text mentions that publishers can add dct:language to instances of dcat:Distribution
> but I didn't add it to the set of properties of dcat:Distribution for two reasons:
> 1. This is only needed when the dataset has multiple languages and is provided through
> different distributions based on the language. As DCAT has no notion of optional properties, adding
> dct:language to Distribution to address the particular case of multi-language datasets might, arguably,
> be confusing, as the language property would be defined on the three levels of Catalog, Dataset and
> Distribution.
> 2. Doing so now goes beyond editorial changes and requires going through another call for comments
> for DCAT.
> 
> Does that properly address your comment?
> 
> Best regards,
> Fadi Maali
> --------------------------------------------------
> Fadi Maali
> PhD student @ Insight Galway (formerly DERI)
> Irish Research Council Embark Scholarship holder
> http://www.deri.ie/users/fadi-maali
> 
> On 30 Oct 2013, at 18:32, Johan De Smedt <johan.de-smedt@tenforce.com> wrote:
> 
> > Hi Fadi,
> >
> > It makes a lot of sense to me to have language as an optional parameter on the distribution class, as per your example below.
> > However, I did not see this possibility in the model at http://www.w3.org/TR/vocab-dcat/
> >
> > I would support
> > - adding the clarification you make
> > - adding dct:language for that purpose as an optional property on dcat:Distribution.
> >
> > This would cover my main concerns.
> >
> > Kind Regards,
> >
> > Johan De Smedt
> >> -----Original Message-----
> >> From: Fadi Maali [mailto:fadi.maali@deri.org]
> >> Sent: Wednesday, 30 October, 2013 08:19
> >> To: Johan De Smedt
> >> Cc: public-gld-comments@w3.org
> >> Subject: Re: DCAT comments - dataset dependecy - http://www.w3.org/TR/2013/WD-vocab-dcat-20130801/
> >>
> >> Further comments inline….
> >>
> >>
> >> On 30 Oct 2013, at 17:46, Johan De Smedt <johan.de-smedt@tenforce.com> wrote:
> >>
> >>> Hi Fadi,
> >>>
> >>> In-line I deleted what is ok for me and answered some of your questions.
> >>>
> >>> Kind Regards,
> >>>
> >>> Johan De Smedt
> >>>
> >>>> -----Original Message-----
> >>>> From: Fadi Maali [mailto:fadi.maali@deri.org]
> >>>> Sent: Wednesday, 30 October, 2013 06:43
> >>>> To: Johan De Smedt
> >>>> Cc: public-gld-comments@w3.org
> >>>> Subject: Re: DCAT comments - dataset dependecy - http://www.w3.org/TR/2013/WD-vocab-dcat-20130801/
> >>>>
> >>>> Hello Johan,
> >>>> Thanks for following up.
> >>>>
> >>>> Some comments inline...
> >>>>
> >>>> On 29 Oct 2013, at 16:58, Johan De Smedt <johan.de-smedt@tenforce.com> wrote:
> >>>>
> >>>>> Hi Sandro, Fadi,
> >>>>>
> >>>>> 1) [JDS:>] [...cut...]
> >>>>>
> >>>>> 2) In case there is still room for amending some text, I would suggest:
> >>>>> a) [JDS:>] [...cut...].
> >>>>
> >>>>> b) To make the usage note on dcat:mediaType more explicit.
> >>>>>    Add to usage note: “Best practice for retrieving data using dcat:downloadURL is to set the HTTP header ‘Accept’ to a value of dcat:mediaType.”
> >>>>
> >>>> While this sounds right to be recommended, my personal opinion is that the vocabulary specification
> >>>> should not include this recommendation as it relates to the deployment… thoughts on this?
> >>>>
> >>>>> c) [JDS:>] [...cut...]
> >>>
> >>>>> d) It is not clear how a multilingual dataset that has different distributions per language can be registered:
> >>>>>    either -d.1- using a different dcat:downloadURL
> >>>>>         With the current model, this situation can be handled unambiguously by having multiple (further unrelated) data sets.
> >>>>>         If this is considered best practice, this could be clarified in a usage note on the dataset dct:language
> >>>>>    or -d.2- using the same downloadURL but with different values for the HTTP header Accept-Language
> >>>>>         With the current model, this could be handled by adding a usage note on the dataset dct:language and on the distribution dcat:downloadURL
> >>>>
> >>>> What about different distributions (each with its own downloadURL) for the same dataset?
> >>> [JDS:>] That is the case as detailed in -d.1- above - right?
> >>
> >> I was referring to different "distributions" while you mentioned "multiple further unrelated data sets".
> >> Based on the example you provided below, I gather you meant multiple distributions.
> >>
> >>
> >>> Let's take EU CELLAR, which actually provides examples for both d.1 and d.2.
> >>> The -d.1- case (multiple download URLs)
> >>> - There is only 1 dataset with multiple format and language combinations; each distribution may have a different URL per language.
> >>> GET http://publications.europa.eu/resource/oj/JOC_2006_331_R_0026_06.DEU
> >>> - with: Accept=application/xml; notice=branch
> >>> GET http://publications.europa.eu/resource/oj/JOC_2006_331_R_0026_06.ENG
> >>> - with: Accept=application/xml; notice=branch
> >>> For DCAT, different datasets are required, as the distribution in DCAT does not provide for detailing the language covered by that distribution.
> >>> Alternatively in DCAT,
> >>> - either 1 dataset is registered with 1 distribution, no downloadURL, and an accessURL,
> >>> requiring EU CELLAR to make additional landing pages to solve this ambiguity in DCAT,
> >>> - or 2 datasets are registered (one per language); this would bring it to 20+ datasets, as there are over 20 languages supported.
> >>> The -d.2- case (1 download URL)
> >>> GET http://publications.europa.eu/resource/oj/JOC_2006_331_R_0026_06
> >>> - with: Accept=application/xml; notice=branch
> >>> gives a different result with either of the following:
> >>> - Accept-Language=en
> >>> - Accept-Language=de
> >>> The suggested usage note would cover this case without any change to DCAT or the dataset publisher.
> >>>
> >>> On usage of content negotiation with HTTP header, see also:
> >>> - http://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html
> >>> - http://www.ietf.org/rfc/rfc2295.txt
> >>> Would DCAT be clearer if these were added as references, complying with the usage note I suggest adding?
> >>
> >> IMHO, the right way to model this is by separate distributions of a single dataset. As you mentioned,
> >> language can be described only on the dataset level of DCAT. I suggest having multiple values on the
> >> dataset level for the language and specifying the specific language of each distribution.
> >>
> >> Example:
> >> :ds1 dct:language lng:en, lng:de;
> >>      dcat:distribution :dist1, :dist2.
> >> :dist1 a dcat:Distribution;
> >>        dct:language lng:en;
> >>        dcat:accessURL <url-en> .
> >> :dist2 a dcat:Distribution;
> >>        dct:language lng:de;
> >>        dcat:accessURL <url-de> .
> >>
> >> This modelling is equivalent to the text "this dataset is available in English and German. It can be
> >> accessed via dist1, which is in English, or via dist2, which is in German".
> >>
> >> If that makes sense to you and to others, I can add the required clarification text to indicate how to
> >> handle multi-language datasets.
> >>
> >> Many thanks!
> >>
> >> - Fadi
> >>
> >>>>
> >>>>
> >>>> Regards,
> >>>> Fadi Maali
> >>>>
> >>>>
> >>>>> Sorry for these late results on an implementation exercise we made with the EU Publication Office CELLAR platform.
> >>>>>
> >>>>> Kind Regards,
> >>>>>
> >>>>> Johan De Smedt
> >>>>> Chief Technology Officer
> >>>>>
> >>>>> mail: johan.de-smedt@tenforce.com
> >>>>> mobile: +32 477 475934
> >>>
> >

Received on Wednesday, 30 October 2013 11:07:54 UTC