
RE: AW: Requirements for profiles

From: Svensson, Lars <L.Svensson@dnb.de>
Date: Wed, 10 Jan 2018 18:46:02 +0000
To: "kcoyle@kcoyle.net" <kcoyle@kcoyle.net>, "public-dxwg-wg@w3.org" <public-dxwg-wg@w3.org>
Message-ID: <f2a2f6b216eb45e8835c9ca7492bba1c@dnb.de>

Karen,

On Monday, December 11, 2017 2:10 PM, Karen Coyle [mailto:kcoyle@kcoyle.net] wrote:

> Lars, I've been looking at DSP and ShEx. Something that is missing from
> DSP is any ability to define relationships between elements. This is the
> bulk, however, of what SHACL and ShEx provide. So a simple ShEx statement:
> 
> my:UserShape {
>   (
>     foaf:name LITERAL
>     |
>     foaf:givenName LITERAL+;
>     foaf:familyName LITERAL
>   );
>   foaf:mbox IRI
> }
> 
> which includes: "either a foaf:name OR (a foaf:givenName AND a
> foaf:familyName)"
> 
> is not something that either DSP or the tables of CSVW can express.
> And that is one of the simpler cases that SHACL and ShEx are designed to
> handle.

Do you think it would be possible to extend DSP to accommodate those kinds of patterns?
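
To make the pattern concrete, here is a minimal Python sketch of what a validator would have to check for the my:UserShape example quoted above. It is an illustration only and does not implement ShEx: the dict-based node representation and the function name conforms_to_user_shape are made up for this example, and treating the two name branches as mutually exclusive is an assumption about how the choice should behave.

```python
from typing import Dict, List, Tuple

# A value is a (kind, text) pair, where kind is "literal" or "iri".
# This representation is hypothetical, chosen only for the sketch.
Value = Tuple[str, str]
# A node maps predicate names to the list of values it carries.
Node = Dict[str, List[Value]]


def _all_of_kind(values: List[Value], kind: str) -> bool:
    """Return True if every value in the list has the given kind."""
    return all(value[0] == kind for value in values)


def conforms_to_user_shape(node: Node) -> bool:
    """Check the either/or name pattern plus the mandatory mailbox."""
    name = node.get("foaf:name", [])
    given = node.get("foaf:givenName", [])
    family = node.get("foaf:familyName", [])
    mbox = node.get("foaf:mbox", [])

    # Branch 1: exactly one foaf:name literal, no split name parts.
    single_name = (
        len(name) == 1
        and _all_of_kind(name, "literal")
        and not given
        and not family
    )
    # Branch 2: one or more given names and exactly one family name,
    # with no foaf:name present.
    split_name = (
        len(given) >= 1
        and _all_of_kind(given, "literal")
        and len(family) == 1
        and _all_of_kind(family, "literal")
        and not name
    )
    # Outside the choice: exactly one foaf:mbox, which must be an IRI.
    has_mbox = len(mbox) == 1 and _all_of_kind(mbox, "iri")
    return (single_name or split_name) and has_mbox


alice = {
    "foaf:name": [("literal", "Alice")],
    "foaf:mbox": [("iri", "mailto:alice@example.org")],
}
bob = {
    "foaf:givenName": [("literal", "Bob"), ("literal", "Robert")],
    "foaf:familyName": [("literal", "Smith")],
    "foaf:mbox": [("iri", "mailto:bob@example.org")],
}
print(conforms_to_user_shape(alice), conforms_to_user_shape(bob))
```

The point of the sketch is that the constraint relates several properties to each other (a choice between groups), which is the kind of rule a flat, per-property table cannot state.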

> One solution could be to use either SHACL or ShEx to express profiles.

Focusing exclusively on SHACL and ShEx seems a bit too RDF-centric to me.

> The down side of that is the human-readability/creatability factor,
> since both of these are complex executable code that are good at the
> detail of a profile but are hard to read for the macro data structure
> (which DSP aims at). If we can at least bridge the gap that would turn
> either of those into documentation that people can be comfortable with,
> that would take us quite a ways.

That said, I think that it should be possible to create a human-readable page from a well-documented SHACL or ShEx document just as you can create HTML from an RDF vocabulary document (by evaluating rdfs:label, rdfs:comment etc. in the class and property descriptions). I guess that best practices will emerge here.
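
As a rough illustration of that idea, the following Python sketch renders a small HTML definition list from label and comment annotations. It assumes the annotations have already been extracted from the shapes document into plain dicts; the keys "id", "label" and "comment" and the function name render_profile_html are hypothetical, and a real tool would read them from the RDF graph instead.

```python
from html import escape
from typing import Dict, List


def render_profile_html(title: str, properties: List[Dict[str, str]]) -> str:
    """Build a simple HTML page fragment from annotated property entries."""
    rows = []
    for prop in properties:
        # Fall back to the identifier when no label is documented.
        label = escape(prop.get("label", prop["id"]))
        comment = escape(prop.get("comment", ""))
        ident = escape(prop["id"])
        rows.append(f"  <dt>{label} (<code>{ident}</code>)</dt>")
        rows.append(f"  <dd>{comment}</dd>")
    body = "\n".join(rows)
    return f"<h1>{escape(title)}</h1>\n<dl>\n{body}\n</dl>"


html_page = render_profile_html(
    "User profile",
    [
        {"id": "foaf:name", "label": "Name", "comment": "Full name of the user"},
        {"id": "foaf:mbox"},
    ],
)
print(html_page)
```

How readable the result is depends entirely on how well the shapes are annotated, which is where best practices would help.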

Best,

Lars

> On 12/11/17 3:17 AM, Svensson, Lars wrote:
> > Hi Karen,
> >
> > On Wednesday, 22 November 2017 02:10, Karen Coyle [mailto:kcoyle@kcoyle.net] wrote:
> >
> > [...]
> >
> >> * I once again would like folks to look at the technology stack of the Singapore
> >> Framework [1] which may be compatible with the statement that a "profile defines
> >> a set of additional structural and constraints and/or semantic interpretations
> >> that can apply to a given document on top of that document's media type." If the
> >> Framework doesn't have the same sense as the quote, perhaps we can clarify the
> >> differences. And eventually I would like to talk about the concept of description
> >> sets [2] which is the DCMI view of profiles.
> >
> > Even if it's not the same, at least there is significant overlap with my view of
> > profiles. The DSP document states that a "DSP is a way of describing structural
> > constraints on a description set. It constrains the resources that may be described by
> > descriptions in the description set, the properties that may be used, and the ways a
> > value surrogate may be given" which is close enough (although it mainly speaks of
> > properties and less of classes).
> >
> > Also the Singapore Framework goes in the right direction even if I think it's too RDFy
> > in that it mandates that "all references to terms in a Dublin Core metadata description
> > be made using URIs" (what about XML QNames?) and that it only talks about metadata
> > records where our scope is any kind of data.
> >
> > As an aside, it's interesting to note that the DSP document itself defines a profile for
> > a DSP (§6) that is formalized in an XML schema; the creation of a ShEx document
> > shouldn't be difficult and is left as an exercise for the reader.
> >
> >> [1] http://dublincore.org/documents/singapore-framework/
> >> And this is a shortcut to the diagram, which may be the most useful
> >> part:
> >> http://dublincore.org/documents/2008/01/14/singapore-framework/singapore-framework.png
> >> [2] http://dublincore.org/documents/dc-dsp/ however some of the details pre-date
> >> general acceptance of RDF and need to change,
> >> so don't get hung up on how the lower levels of the model are defined
> > Best,
> >
> > Lars
> >
> > On 11/21/17 4:26 PM, Rob Atkinson wrote:
> >>
> >> Profiles should IMHO reference type ontologies where necessary to
> >> further restrict the range of profiled properties (either base
> >> specification or a more general profile).
> >>
> >> e.g. a profile for "spatial area statistics standard X" may require
> >> that the statistical dimension property is related to (has an rdfs:range
> >> of) a 'feature with a polygon geometry',
> >>
> >> the "US Census profile" may require this to have a FIPS code and the
> >> 2020 census may require it to be from the set of 2020 US state
> >> boundaries, by reference to a specific implementation.
> >>
> >> I think "vocabulary" is a set of definitions in the general case, and
> >> is agnostic about how much information model goes along with that set
> >> - so we need to be pretty careful about assumptions as to what it means here.
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Wed, 22 Nov 2017 at 10:21 Karen Coyle <kcoyle@kcoyle.net> wrote:
> >>
> >>     Are you referring to value vocabularies? I was thinking about
> >>     properties, and in the profiles I've seen they tend to be lists of terms
> >>     representing properties and classes.
> >>
> >>     kc
> >>
> >>     On 11/21/17 2:18 PM, Rob Atkinson wrote:
> >>     >
> >>     > Profiles should reference controlled vocabularies - and practically
> >>     > these must be accessible via distributions such as REST API endpoints
> >>     > - consider the GBIF biota taxon vocabulary - millions of terms and changes
> >>     > every day. This cannot be embedded in a profile, or even in a static resource.
> >>     >
> >>     > Rob
> >>     >
> >>     >
> >>     >
> >>     > On Wed, 22 Nov 2017 at 09:11 Karen Coyle <kcoyle@kcoyle.net> wrote:
> >>     >
> >>     >
> >>     >
> >>     >     On 11/21/17 12:25 PM, Antoine Isaac wrote:
> >>     >     > Hi Karen,
> >>     >     >
> >>     >     > I'm trying to work on it.
> >>     >     > But I have to say I'm a bit lost as to what has happened to our use case
> >>     >     > (5.37) and requirements. At some point everything was included at
> >>     >     > https://w3c.github.io/dxwg/ucr/#ID37
> >>     >     > but the requirement list seems to have been really simplified; now the
> >>     >     > only requirement derived from 5.37 is
> >>     >     > https://w3c.github.io/dxwg/ucr/#RID11
> >>     >     >
> >>     >     > When we contributed our use case we had listed these requirements:
> >>     >     > - Each application profile needs to be documented, preferably by
> >>     >     > showing/reusing what is common across profiles
> >>     >
> >>     >     We'll make sure that these get in. I do have a very basic question,
> >>     >     though, which is whether you have any assumptions about the content of a
> >>     >     profile. This says that it is documented, that it is machine-readable,
> >>     >     that it contains validation, and that profiles can contain pieces of
> >>     >     data from other profiles. Is there some statement that can be made about
> >>     >     the nature of this data? Are you assuming that profiles contain
> >>     >     vocabulary terms? This seems to be the missing background information
> >>     >     from our requirements.
> >>     >
> >>     >     kc
> >>     >
> >>     >     > - Machine-readable specifications of application profiles need to be
> >>     >     > easily publishable, and optimize re-use of existing specification.
> >>     >     > - Application profiles need a rich expression for the validation of
> >>     >     > metadata
> >>     >     > - publishers (data providers, intermediary aggregators, Europeana and
> >>     >     > DPLA) need to be able to indicate the profile to which a certain piece
> >>     >     > of data (record describing an individual cultural object, or a whole
> >>     >     > dataset) belongs.
> >>     >     > - Data publishers need to be able to serve different profiles of the
> >>     >     > same data via the same data publication channel (Web API)
> >>     >     > - Data consumers (intermediary aggregators, Europeana and DPLA, data
> >>     >     > consumers) need to be able to specify the profile they are interested in
> >>     >     > - Europeana needs to be able to accept the data described using EDM
> >>     >     > extensions that are compatible with its EDM-external profile whether it
> >>     >     > doesn't ingest this data entirely (i.e. some elements will be left out
> >>     >     > as they are useless for the main Europeana Collections portal) or it
> >>     >     > does ingest it (e.g. for Thematic Collections portals or domain-specific
> >>     >     > applications that Europeana or third parties would develop)
> >>     >     >
> >>     >     > I'm going to see how it aligns to your list. But I preferred to send you
> >>     >     > our raw list now, so that you can have a brief look at it, if only because
> >>     >     > this list supports your point "Also, there are some obvious
> >>     >     > requirements, like being both machine and human-readable, having
> >>     >     > identifiers, etc., that we do not have use cases for". Valentine and I
> >>     >     > wanted our use case to be a motivation for such requirements...
> >>     >     >
> >>     >     > Cheers,
> >>     >     >
> >>     >     > Antoine
> >>     >     >
> >>     >     > On 21/11/17 16:34, Karen Coyle wrote:
> >>     >     >> Because we need to move to FPWD, if we can agree on the requirements for
> >>     >     >> profiles as written here, we can amend those for the next publication of
> >>     >     >> the UCR. We can add a note that these are still in flux.
> >>     >     >>
> >>     >     >> kc
> >>     >     >>
> >>     >     >> On 11/20/17 1:57 PM, Antoine Isaac wrote:
> >>     >     >>> Hi Karen, all,
> >>     >     >>>
> >>     >     >>> Sorry, I wanted to do this today but I probably won't have time,
> >>     >     >>> also seeing that a considerable thread has appeared after your initial
> >>     >     >>> email and will probably require reading...
> >>     >     >>> I'll try to do it this week, though reorganization at Europeana is keeping
> >>     >     >>> me busy.
> >>     >     >>>
> >>     >     >>> Very likely regrets for tomorrow by the way :-/
> >>     >     >>>
> >>     >     >>> Antoine
> >>     >     >>>
> >>     >     >>> On 15/11/17 04:32, Karen Coyle wrote:
> >>     >     >>>> All, I'm not sure that this requirement list is complete but it is what
> >>     >     >>>> I could come up with in a short time so that we could have something to
> >>     >     >>>> discuss. [Note to Antoine and Valentine: please see if I correctly
> >>     >     >>>> captured the requirements from your use case.]
> >>     >     >>>>
> >>     >     >>>> I want to mention that I believe there may be more than one definition
> >>     >     >>>> of "profile" being used in the use cases. In particular, UC 5.3
> >>     >     >>>> (submitted by Ruben) didn't seem to me to be a function of profiles but
> >>     >     >>>> of the connection service. There may be other such differences in the
> >>     >     >>>> use cases where I'm not sure if the reference is to the profile or to a
> >>     >     >>>> specific selection of instance data.
> >>     >     >>>>
> >>     >     >>>> Also, there are some obvious requirements, like being both machine and
> >>     >     >>>> human-readable, having identifiers, etc., that we do not have use cases
> >>     >     >>>> for. I did a talk at the recent Dublin Core conference that included a
> >>     >     >>>> number of requirements of this nature that we may wish to examine.
> >>     >     >>>>
> >>     >     >>>> http://dcevents.dublincore.org/IntConf/dc-2017/paper/view/520/643
> >>     >     >>>>
> >>     >     >>>>
> >>     >     >>>> ****
> >>     >     >>>> profiles list valid vocabulary terms for a metadata usage environment
> >>     >     >>>> (5.37)
> >>     >     >>>>
> >>     >     >>>> profile vocabulary lists may be defined as closed (no other terms are
> >>     >     >>>> allowed) or open (other terms are allowed) (5.37)
> >>     >     >>>>
> >>     >     >>>> conceptually, profiles can extend other vocabularies or profiles, or
> >>     >     >>>> can be refinements of other vocabularies or profiles (5.37)
> >>     >     >>>>
> >>     >     >>>> profiles can be "cascading", inheriting from other profiles or profile
> >>     >     >>>> fragments (discussion at first f2f)
> >>     >     >>>>
> >>     >     >>>> profiles reuse vocabulary terms defined elsewhere (Dublin Core
> >>     >     >>>> profiles; no use case)
> >>     >     >>>>
> >>     >     >>>> profiles must be able to define finer-grained semantics for vocabulary
> >>     >     >>>> terms that are used (visible in DCAT APs)
> >>     >     >>>>
> >>     >     >>>> profiles must be able to express rules that support data validation
> >>     >     >>>> (cardinality, valid values) (5.41)
> >>     >     >>>>
> >>     >     >>>> profiles must be able to express cardinality rules of vocabulary terms
> >>     >     >>>> (5.41)
> >>     >     >>>>
> >>     >     >>>> profiles can contain links to detailed validation rules or to
> >>     >     >>>> validation applications that can process the profile (5.48)
> >>     >     >>>>
> >>     >     >>>> profiles must be able to support information that can drive data
> >>     >     >>>> creation functions, including brief and detailed documentation (5.46)
> >>     >     >>>>
> >>     >     >>>> profiles must be able to express what standards (including creation
> >>     >     >>>> rules) the data conforms to (5.43) (5.42)
> >>     >     >>>>
> >>     >     >>>> profiles must support discoverability via search engines (5.40)
> >>     >     >>>>
> >>     >     >>>> profiles must have identifiers that can be used to link the DCAT
> >>     >     >>>> description to the relevant profile (seems obvious; no use case)
> >>     >     >>>>
> >>     >     >>>> *Not covered* (because I didn't know what the requirement would be):
> >>     >     >>>> 5.3
> >>     >     >>>> Responses can conform to multiple, modular profiles (by Ruben)
> >>     >     >>>>
> >>     >     >>>> kc
> >>     >     >>>>
> >>     >     >>>
> >>     >     >>
> >>     >     >
> >>     >
> >>     >     --
> >>     >     Karen Coyle
> >>     >     kcoyle@kcoyle.net http://kcoyle.net
> >>     >     m: 1-510-435-8234 (Signal)
> >>     >     skype: kcoylenet/+1-510-984-3600
> >>     >
> >>
> >>
> >
> >
> 

Received on Wednesday, 10 January 2018 18:46:34 UTC
