Re: AW: Requirements for profiles

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Thu, 11 Jan 2018 10:06:28 +0100
To: <public-dxwg-wg@w3.org>
Message-ID: <a3ff01ae-0331-c019-5e93-0aa8b22be678@few.vu.nl>
Very interesting discussion, Karen and Lars. It made me think a bit: do we want a requirement for making a complete list of features that profile languages (both for humans and machines) could/should handle? And for trying to align the profile languages? This would be very hard. We know that different languages will focus on different things, some quite irreconcilable (think of expressing constraints on the order of elements in RDF...).

Cheers,

Antoine

On 11/01/18 02:02, Karen Coyle wrote:
> 
> 
> On 1/10/18 10:46 AM, Svensson, Lars wrote:
>> Karen,
>>
>> On Monday, December 11, 2017 2:10 PM, Karen Coyle [mailto:kcoyle@kcoyle.net] wrote:
>>
>>> Lars, I've been looking at DSP and ShEx. Something that is missing from
>>> DSP is any ability to define relationships between elements. This is the
>>> bulk, however, of what SHACL and ShEx provide. So a simple ShEx statement:
>>>
>>> my:UserShape {
>>>    (
>>>       foaf:name LITERAL
>>>
>>>      |
>>>        foaf:givenName LITERAL+;
>>>        foaf:familyName LITERAL
>>>    );
>>>    foaf:mbox IRI
>>> }
>>>
>>> which includes: "either a foaf:name OR (a foaf:givenName AND a
>>> foaf:familyName)"
>>>
>>> is not something that either the DSP or the tables of CSVW can express.
>>> And that is one of the simpler cases that SHACL and ShEx are designed to
>>> handle.
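
The either/or pattern in the ShEx shape above can also be illustrated outside ShEx. The following is a minimal Python sketch, not a ShEx implementation. It assumes a node is given as a plain dict mapping property names to lists of values, the helper names `count` and `matches_user_shape` are hypothetical, and the choice between the two name forms is read as exclusive, which is an assumption of this sketch. Value types (LITERAL, IRI) are not checked.

```python
def count(node, prop):
    """Number of values the node has for a property (0 if absent)."""
    return len(node.get(prop, []))


def matches_user_shape(node):
    """Rough check of the my:UserShape pattern on a dict-based node.

    Assumption: `node` maps property names to lists of values.
    Value types (LITERAL, IRI) are deliberately not checked here.
    """
    names = count(node, "foaf:name")
    given = count(node, "foaf:givenName")
    family = count(node, "foaf:familyName")

    # Branch 1: exactly one foaf:name and no split name parts.
    single_name = names == 1 and given == 0 and family == 0
    # Branch 2: one or more foaf:givenName and exactly one foaf:familyName.
    split_name = names == 0 and given >= 1 and family == 1

    # Whichever branch applies, exactly one foaf:mbox is required.
    return (single_name or split_name) and count(node, "foaf:mbox") == 1


# Hypothetical example nodes.
alice = {"foaf:name": ["Alice Smith"], "foaf:mbox": ["mailto:alice@example.org"]}
bob = {
    "foaf:givenName": ["Bob", "Robert"],
    "foaf:familyName": ["Jones"],
    "foaf:mbox": ["mailto:bob@example.org"],
}
carol = {"foaf:givenName": ["Carol"], "foaf:mbox": ["mailto:carol@example.org"]}

print(matches_user_shape(alice))  # True
print(matches_user_shape(bob))    # True
print(matches_user_shape(carol))  # False: foaf:familyName is missing
```

Even this small case needs a conditional relationship between three properties, which is the kind of rule a flat list of properties with cardinalities cannot state.
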
>>
>> Do you think it could be possible to extend DSP to accommodate those kinds of patterns?
> 
> The sum of patterns is pretty complex. I did a brief "patterns" document
> [1] to summarize it. As I understand it, this also goes beyond OWL in
> terms of complexity, but these are exactly the cases that ShEx and SHACL
> deal with.
> 
>>
>>> One solution could be to use either SHACL or ShEx to express profiles.
>>
>> Focusing exclusively on SHACL and ShEx seems a bit too RDF-centric to me.
> 
> I think that if one were to focus on design patterns, the "RDF-ness"
> would be less imposing.
> 
>>
>>> The down side of that is the human-readability/creatability factor,
>>> since both of these are complex executable code that are good at the
>>> detail of a profile but are hard to read for the macro data structure
>>> (which DSP aims at). If we can at least bridge the gap that would turn
>>> either of those into documentation that people can be comfortable with,
>>> that would take us quite a ways.
>>
>> That said, I think that it should be possible to create a human-readable page from a well-documented SHACL or ShEx document just as you can create HTML from an RDF vocabulary document (by evaluating rdfs:label, rdfs:comment etc. in the class and property descriptions). I guess that best practices will emerge here.
> 
> Where I think SHACL and ShEx are limited is in their focus on individual
> graphs. I have yet to see a document that gives a good overview of a
> profile in SHACL or ShEx. So I would be interested in someone pursuing
> this as an option.
> 
> kc
> 
> [1] https://github.com/kcoyle/RDF-AP/blob/master/Patterns.md
> 
>>
>> Best,
>>
>> Lars
>>
>>> On 12/11/17 3:17 AM, Svensson, Lars wrote:
>>>> Hi Karen,
>>>>
>>>> On Wednesday, 22 November 2017 02:10, Karen Coyle [mailto:kcoyle@kcoyle.net] wrote:
>>>>
>>>> [...]
>>>>
>>>>> * I once again would like folks to look at the technology stack of the Singapore
>>> Framework [1] which may be compatible with
>>>>> the statement that a "profile defines a set of additional structural and constraints
>>> and/or semantic interpretations that can
>>>>> apply to a given document on top of that document's media type." If the
>>> Framework doesn't have the same sense as the quote,
>>>>> perhaps we can clarify the differences. And eventually I would like to talk about
>>> the concept of description sets [2] which is the
>>>>> DCMI view of profiles.
>>>>
>>>> Even if it's not the same, at least there is significant overlap with my view of
>>> profiles. The DSP document states that a "DSP is a way of describing structural
>>> constraints on a description set. It constrains the resources that may be described by
>>> descriptions in the description set, the properties that may be used, and the ways a
>>> value surrogate may be given" which is close enough (although it mainly speaks of
>>> properties and less of classes).
>>>>
>>>> Also the Singapore Framework goes in the right direction even if I think it's too RDFy
>>> in that it mandates that "all references to terms in a Dublin Core metadata description
>>> be made using URIs" (what about XML QNames?) and that it only talks about metadata
>>> records where our scope is any kind of data.
>>>>
>>>> As an aside, it's interesting to note that the DSP document itself defines a profile for
>>> a DSP (§6) that is formalized in an XML schema; the creation of a ShEx document
>>> shouldn't be difficult and is left as an exercise for the reader.
>>>>
>>>>> [1] http://dublincore.org/documents/singapore-framework/
>>>>> And this is a shortcut to the diagram, which may be the most useful
>>>>> part:
>>>>> http://dublincore.org/documents/2008/01/14/singapore-framework/singapore-framework.png
>>>>> [2] http://dublincore.org/documents/dc-dsp/ however some of the details pre-date
>>> general acceptance of RDF and need to change,
>>>>> so don't get hung up on how the lower levels of the model are defined
>>>>
>>>> Best,
>>>>
>>>> Lars
>>>>
>>>> On 11/21/17 4:26 PM, Rob Atkinson wrote:
>>>>>
>>>>> Profiles should IMHO reference type ontologies where necessary to
>>>>> further restrict the range of profiled properties (either base
>>>>> specification or a more general profile).
>>>>>
>>>>> e.g. a profile for "spatial area statistics standard X" may require
>>>>> the statistical dimension property  is related to (has a rdfs:range)
>>>>> a 'feature with a polygon geometry' ,
>>>>>
>>>>> the "US Census profile" may require this to have a FIPS code and the
>>>>> 2020 census may require it to be from the set of 2020 US  state
>>>>> boundaries, by reference to a specific implementation.
>>>>>
>>>>> I think "vocabulary" is a set of definitions in the general case, and
>>>>> is agnostic about how much information model goes along with that set
>>>>> - so we need to be pretty careful about assumptions as to what it means here.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, 22 Nov 2017 at 10:21 Karen Coyle <kcoyle@kcoyle.net> wrote:
>>>>>
>>>>>      Are you referring to value vocabularies? I was thinking about
>>>>>      properties, and in the profiles I've seen they tend to be lists of terms
>>>>>      representing properties and classes.
>>>>>
>>>>>      kc
>>>>>
>>>>>      On 11/21/17 2:18 PM, Rob Atkinson wrote:
>>>>>      >
>>>>>      > Profiles should reference controlled vocabularies - and practically
>>>>>      > these must be accessible via distributions such as REST API
>>>>>      endpoints -
>>>>>      >  - consider GBIF biota taxon vocabulary - millions of terms and changes
>>>>>      > every day. Can not embed this in a profile, or even in a static
>>>>>      resource.
>>>>>      >
>>>>>      > Rob
>>>>>      >
>>>>>      >
>>>>>      >
>>>>>      > On Wed, 22 Nov 2017 at 09:11 Karen Coyle <kcoyle@kcoyle.net> wrote:
>>>>>      >
>>>>>      >
>>>>>      >
>>>>>      >     On 11/21/17 12:25 PM, Antoine Isaac wrote:
>>>>>      >     > Hi Karen,
>>>>>      >     >
>>>>>      >     > I'm trying to work on it.
>>>>>      >     > But I have to say I'm a bit lost, what has happened to our
>>>>>      use case
>>>>>      >     > (5.37) and requirements. At some point everything was
>>>>>      included at
>>>>>      >     > https://w3c.github.io/dxwg/ucr/#ID37
>>>>>      >     > but the requirement list seems to have been really
>>>>>      simplified, now the
>>>>>      >     > only requirement derived from 5.37 is
>>>>>      >     > https://w3c.github.io/dxwg/ucr/#RID11
>>>>>      >     >
>>>>>      >     > When we contributed our use case we had listed these
>>>>>      requirements:
>>>>>      >     > - Each application profile needs to be documented, preferably by
>>>>>      >     > showing/reusing what is common across profiles
>>>>>      >
>>>>>      >     We'll make sure that these get in. I do have a very basic
>>>>>      question,
>>>>>      >     though, which is whether you have any assumptions about the
>>>>>      content of a
>>>>>      >     profile. This says that it is documented, that it is
>>>>>      machine-readable,
>>>>>      >     that it contains validation, and that profiles can contain
>>>>>      pieces of
>>>>>      >     data from other profiles. Is there some statement that can be
>>>>>      made about
>>>>>      >     the nature of this data? Are you assuming that profiles contain
>>>>>      >     vocabulary terms? This seems to be the missing background
>>>>>      information
>>>>>      >     from our requirements.
>>>>>      >
>>>>>      >     kc
>>>>>      >
>>>>>      >     > - Machine-readable specifications of application profiles
>>>>>      need to be
>>>>>      >     > easily publishable, and optimize re-use of existing
>>>>>      specification.
>>>>>      >     > - Application profiles need a rich expression for the
>>>>>      >     validation of
>>>>>      >     > metadata
>>>>>      >     > - publishers (data providers, intermediary aggregators,
>>>>>      Europeana and
>>>>>      >     > DPLA) need to be able to indicate the profile to which a
>>>>>      certain piece
>>>>>      >     > of data (record describing an individual cultural object, or
>>>>>      a whole
>>>>>      >     > dataset) belong.
>>>>>      >     > - Data publishers need to be able to serve different
>>>>>      profiles of the
>>>>>      >     > same data via the same data publication channel (Web API)
>>>>>      >     > - Data consumers (intermediary aggregators, Europeana and
>>>>>      DPLA, data
>>>>>      >     > consumers) need to be able to specify the profile they are
>>>>>      >     interested in
>>>>>      >     > - Europeana needs to be able to accept the data described
>>>>>      using EDM
>>>>>      >     > extensions that are compatible with its EDM-external profile
>>>>>      >     whether it
>>>>>      >     > doesn't ingest this data entirely (i.e. some elements will
>>>>>      be left out
>>>>>      >     > as they are useless for the main Europeana Collections
>>>>>      portal) or it
>>>>>      >     > does ingest it (e.g. for Thematic Collections portals or
>>>>>      >     domain-specific
>>>>>      >     > applications that Europeana or third parties would develop)
>>>>>      >     >
>>>>>      >     > I'm going to see how it aligns to your list. But I preferred to
>>>>>      >     send you
>>>>>      >     > our raw list now, so that you can have a brief look at. If just
>>>>>      >     because
>>>>>      >     > this list supports your point " Also, there are some obvious
>>>>>      >     > requirements, like being both machine and human-readable, having
>>>>>      >     > identifiers, etc., that we do not have use cases for".
>>>>>      Valentine and I
>>>>>      >     > wanted our use case to be a motivation for such requirements...
>>>>>      >     >
>>>>>      >     > Cheers,
>>>>>      >     >
>>>>>      >     > Antoine
>>>>>      >     >
>>>>>      >     > On 21/11/17 16:34, Karen Coyle wrote:
>>>>>      >     >> Because we need to move to FPWD, if we can agree on the
>>>>>      >     requirements for
>>>>>      >     >> profiles as written here, we can amend those for the next
>>>>>      >     publication of
>>>>>      >     >> the UCR. We can add a note that these are still in flux.
>>>>>      >     >>
>>>>>      >     >> kc
>>>>>      >     >>
>>>>>      >     >> On 11/20/17 1:57 PM, Antoine Isaac wrote:
>>>>>      >     >>> Hi Karen, all,
>>>>>      >     >>>
>>>>>      >     >>> Sorry I wanted to do this today but I probably won't
>>>>>      have time,
>>>>>      >     >>> also seeing that a considerable thread has appeared after your
>>>>>      >     initial
>>>>>      >     >>> email and will probably require reading...
>>>>>      >     >>> I'll try to do this week, though reorganization at
>>>>>      Europeana is
>>>>>      >     keeping
>>>>>      >     >>> me busy.
>>>>>      >     >>>
>>>>>      >     >>> Very likely regrets for tomorrow by the way :-/
>>>>>      >     >>>
>>>>>      >     >>> Antoine
>>>>>      >     >>>
>>>>>      >     >>> On 15/11/17 04:32, Karen Coyle wrote:
>>>>>      >     >>>> All, I'm not sure that this requirement list is complete
>>>>>      but it
>>>>>      >     is what
>>>>>      >     >>>> I could come up with in a short time so that we could have
>>>>>      >     something to
>>>>>      >     >>>> discuss. [Note to Antoine and Valentine: please see if I
>>>>>      correctly
>>>>>      >     >>>> captured the requirements from your use case.]
>>>>>      >     >>>>
>>>>>      >     >>>> I want to mention that I believe there may be more than one
>>>>>      >     definition
>>>>>      >     >>>> of "profile" being used in the use cases. In particular,
>>>>>      UC 5.3
>>>>>      >     >>>> (submitted by Ruben) didn't seem to me to be a function of
>>>>>      >     profiles but
>>>>>      >     >>>> of the connection service. There may be other such
>>>>>      differences
>>>>>      >     in the
>>>>>      >     >>>> use cases where I'm not sure if the reference is to the
>>>>>      profile
>>>>>      >     or to a
>>>>>      >     >>>> specific selection of instance data.
>>>>>      >     >>>>
>>>>>      >     >>>> Also, there are some obvious requirements, like being both
>>>>>      >     machine and
>>>>>      >     >>>> human-readable, having identifiers, etc., that we do not have
>>>>>      >     use cases
>>>>>      >     >>>> for. I did a talk at the recent Dublin Core conference that
>>>>>      >     included a
>>>>>      >     >>>> number of requirements of this nature that we may wish to
>>>>>      examine.
>>>>>      >     >>>>
>>>>>      >     >>>>
>>>>>      http://dcevents.dublincore.org/IntConf/dc-2017/paper/view/520/643
>>>>>      >     >>>>
>>>>>      >     >>>>
>>>>>      >     >>>> ****
>>>>>      >     >>>> profiles list valid vocabulary terms for a metadata usage
>>>>>      >     environment
>>>>>      >     >>>> (5.37)
>>>>>      >     >>>>
>>>>>      >     >>>> profile vocabulary lists may be defined as closed (no other
>>>>>      >     terms are
>>>>>      >     >>>> allowed) or open (other terms are allowed) (5.37)
>>>>>      >     >>>>
>>>>>      >     >>>> conceptually, profiles can extend other vocabularies or
>>>>>      >     profiles, or
>>>>>      >     >>>> can
>>>>>      >     >>>> be refinements of other vocabularies or profiles (5.37)
>>>>>      >     >>>>
>>>>>      >     >>>> profiles can be "cascading", inheriting from other
>>>>>      profiles or
>>>>>      >     profile
>>>>>      >     >>>> fragments (discussion at first f2f)
>>>>>      >     >>>>
>>>>>      >     >>>> profiles reuse vocabulary terms defined elsewhere (Dublin
>>>>>      Core
>>>>>      >     >>>> profiles;
>>>>>      >     >>>> no use case)
>>>>>      >     >>>>
>>>>>      >     >>>> profiles must be able to define finer-grained semantics for
>>>>>      >     vocabulary
>>>>>      >     >>>> terms that are used (visible in DCAT APs)
>>>>>      >     >>>>
>>>>>      >     >>>> profiles must be able to express rules that support data
>>>>>      validation
>>>>>      >     >>>> (cardinality, valid values) (5.41)
>>>>>      >     >>>>
>>>>>      >     >>>> profiles must be able to express cardinality rules of
>>>>>      >     vocabulary terms
>>>>>      >     >>>> (5.41)
>>>>>      >     >>>>
>>>>>      >     >>>> profiles can contain links to detailed validation rules or to
>>>>>      >     >>>> validation
>>>>>      >     >>>> applications that can process the profile (5.48)
>>>>>      >     >>>>
>>>>>      >     >>>> profiles must be able to support information that can
>>>>>      drive data
>>>>>      >     >>>> creation functions, including brief and detailed
>>>>>      documentation
>>>>>      >     (5.46)
>>>>>      >     >>>>
>>>>>      >     >>>> profiles must be able to express what standards
>>>>>      (including creation
>>>>>      >     >>>> rules) the data conforms to (5.43) (5.42)
>>>>>      >     >>>>
>>>>>      >     >>>> profiles must support discoverability via search engines
>>>>>      (5.40)
>>>>>      >     >>>>
>>>>>      >     >>>> profiles must have identifiers that can be used to link
>>>>>      the DCAT
>>>>>      >     >>>> description to the relevant profile (seems obvious; no
>>>>>      use case)
>>>>>      >     >>>>
>>>>>      >     >>>> *Not covered* (because I didn't know what the requirement
>>>>>      would
>>>>>      >     be):
>>>>>      >     >>>> 5.3
>>>>>      >     >>>> Responses can conform to multiple, modular profiles (by
>>>>>      Ruben)
>>>>>      >     >>>>
>>>>>      >     >>>> kc
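
Two of the requirements listed above, a vocabulary list that is either closed or open and cardinality rules per vocabulary term, can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the `Profile` and `TermRule` classes, their field names, the `validate` function and the example terms are all hypothetical and are not taken from the use cases or from DSP, SHACL or ShEx.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TermRule:
    """Cardinality rule for one vocabulary term (hypothetical structure)."""
    min_count: int = 0
    max_count: Optional[int] = None  # None means no upper bound


@dataclass
class Profile:
    """A toy profile: listed terms with cardinalities, closed or open."""
    rules: Dict[str, TermRule] = field(default_factory=dict)
    closed: bool = False  # closed: terms outside `rules` are not allowed


def validate(profile: Profile, record: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    # Check the cardinality of every term the profile lists.
    for term, rule in profile.rules.items():
        n = len(record.get(term, []))
        if n < rule.min_count:
            problems.append(f"{term}: expected at least {rule.min_count}, found {n}")
        if rule.max_count is not None and n > rule.max_count:
            problems.append(f"{term}: expected at most {rule.max_count}, found {n}")
    # In a closed profile, any term that is not listed is a problem.
    if profile.closed:
        for term in record:
            if term not in profile.rules:
                problems.append(f"{term}: not listed in closed profile")
    return problems


# Hypothetical profile and record, for illustration only.
profile = Profile(
    rules={
        "dct:title": TermRule(min_count=1, max_count=1),
        "dct:creator": TermRule(min_count=0),
    },
    closed=True,
)
record = {"dct:title": ["A", "B"], "ex:other": ["x"]}

for problem in validate(profile, record):
    print(problem)
```

Setting `closed=False` on the same profile would accept `ex:other` and report only the cardinality problem, which is the open/closed distinction the requirement describes.
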
>>>>>      >     >>>>
>>>>>      >     >>>
>>>>>      >     >>
>>>>>      >     >
>>>>>      >
>>>>>      >     --
>>>>>      >     Karen Coyle
>>>>>      >     kcoyle@kcoyle.net http://kcoyle.net
>>>>>      >     m: 1-510-435-8234 (Signal)
>>>>>      >     skype: kcoylenet/+1-510-984-3600
>>>>>      >
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
> 
Received on Thursday, 11 January 2018 09:06:55 UTC
