Re: AW: Requirements for profiles

On 1/10/18 10:46 AM, Svensson, Lars wrote:
> Karen,
> 
> On Monday, December 11, 2017 2:10 PM, Karen Coyle [mailto:kcoyle@kcoyle.net] wrote:
> 
>> Lars, I've been looking at DSP and ShEx. Something that is missing from
>> DSP is any ability to define relationships between elements. This is the
>> bulk, however, of what SHACL and ShEx provide. So a simple ShEx statement:
>>
>> my:UserShape {
>>   (
>>      foaf:name LITERAL
>>
>>     |
>>       foaf:givenName LITERAL+;
>>       foaf:familyName LITERAL
>>   );
>>   foaf:mbox IRI
>> }
>>
>> which includes: "either a foaf:name OR (a foaf:givenName AND a
>> foaf:familyName)"
>>
>> is not something that either the DSP or the tables of CSVW can express.
>> And that is one of the simpler cases that SHACL and ShEx are designed to
>> handle.
> 
> Do you think it could be possible to extend DSP to accommodate those kinds of patterns?

The sum of patterns is pretty complex. I did a brief "patterns" document
[1] to summarize it. As I understand it, this also goes beyond OWL in
terms of complexity, but these are exactly the cases that ShEx and SHACL
deal with.
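
To make the either/or pattern from the ShEx example concrete outside of
ShEx, here is a minimal Python sketch. It assumes a record is a plain dict
from property names to lists of values, treats the two name alternatives as
mutually exclusive, and ignores the LITERAL/IRI node-kind checks. The
function names are made up for illustration and are not part of ShEx, SHACL
or DSP.

```python
# Hypothetical sketch of the "one of" pattern: a record is a dict that maps
# property names to lists of values. Node kinds (LITERAL, IRI) are not checked.

def count(record, prop):
    """Return how many values the record holds for a property (0 if absent)."""
    return len(record.get(prop, []))


def matches_user_pattern(record):
    """Check: foaf:name once, OR (foaf:givenName 1+ AND foaf:familyName once),
    and in both cases exactly one foaf:mbox."""
    name = count(record, "foaf:name")
    given = count(record, "foaf:givenName")
    family = count(record, "foaf:familyName")
    mbox = count(record, "foaf:mbox")

    # Assumption: the two alternatives are exclusive, so mixing them fails.
    full_name_only = name == 1 and given == 0 and family == 0
    split_name_only = name == 0 and given >= 1 and family == 1
    return (full_name_only or split_name_only) and mbox == 1
```

Even this one pattern needs counts across several properties at once, which
is the kind of relationship between elements that a flat list of properties
cannot state.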

> 
>> One solution could be to use either SHACL or ShEx to express profiles.
> 
> Focusing exclusively on SHACL and ShEx seems a bit too RDF-centric to me.

I think that if one were to focus on design patterns, the "RDF-ness"
would be less imposing.

> 
>> The down side of that is the human-readability/creatability factor,
>> since both of these are complex executable code that is good at the
>> detail of a profile but hard to read for the macro data structure
>> (which DSP aims at). If we can at least bridge that gap, turning
>> either of those into documentation that people can be comfortable with,
>> that would take us quite a ways.
> 
> That said, I think that it should be possible to create a human-readable page from a well-documented SHACL or ShEx document just as you can create HTML from an RDF vocabulary document (by evaluating rdfs:label, rdfs:comment etc. in the class and property descriptions). I guess that best practices will emerge here.

Where I think SHACL and ShEx are limited is in their focus on individual
graphs. I have yet to see a document that gives a good overview of a
profile in SHACL or ShEx. So I would be interested in someone pursuing
this as an option.
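
As a rough sketch of what generating such an overview could look like, the
Python below renders a short plain-text summary from a simplified profile
description. The PROFILE structure, its keys and the function names are all
assumptions made up for illustration; this is not SHACL or ShEx syntax, only
the label, comment and cardinality information that such a document might
carry.

```python
# Hypothetical, simplified profile description (not SHACL or ShEx syntax).
PROFILE = {
    "label": "User",
    "properties": [
        {"term": "foaf:name", "label": "Name", "min": 0, "max": 1,
         "comment": "Full name."},
        {"term": "foaf:mbox", "label": "Mailbox", "min": 1, "max": 1,
         "comment": "Email address as an IRI."},
    ],
}


def cardinality(prop):
    """Format min/max as 'min..max'; a max of None means unbounded ('*')."""
    upper = "*" if prop["max"] is None else str(prop["max"])
    return f"{prop['min']}..{upper}"


def render_overview(profile):
    """Return a plain-text overview with one line per property."""
    lines = [f"Profile: {profile['label']}"]
    for prop in profile["properties"]:
        lines.append(
            f"  {prop['label']} ({prop['term']}) "
            f"[{cardinality(prop)}]: {prop['comment']}"
        )
    return "\n".join(lines)
```

Calling print(render_overview(PROFILE)) gives one line per property. The
open question is whether the same can be done from real SHACL or ShEx.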

kc

[1] https://github.com/kcoyle/RDF-AP/blob/master/Patterns.md

> 
> Best,
> 
> Lars
> 
>> On 12/11/17 3:17 AM, Svensson, Lars wrote:
>>> Hi Karen,
>>>
>>> On Wednesday, November 22, 2017 2:10 AM, Karen Coyle [mailto:kcoyle@kcoyle.net] wrote:
>>>
>>> [...]
>>>
>>>> * I once again would like folks to look at the technology stack of the
>>>> Singapore Framework [1], which may be compatible with the statement that a
>>>> "profile defines a set of additional structural constraints and/or semantic
>>>> interpretations that can apply to a given document on top of that
>>>> document's media type." If the Framework doesn't have the same sense as the
>>>> quote, perhaps we can clarify the differences. And eventually I would like
>>>> to talk about the concept of description sets [2], which is the DCMI view
>>>> of profiles.
>>>
>>> Even if it's not the same, at least there is significant overlap with my
>>> view of profiles. The DSP document states that a "DSP is a way of describing
>>> structural constraints on a description set. It constrains the resources
>>> that may be described by descriptions in the description set, the properties
>>> that may be used, and the ways a value surrogate may be given" which is
>>> close enough (although it mainly speaks of properties and less of classes).
>>>
>>> Also the Singapore Framework goes in the right direction, even if I think
>>> it's too RDFy in that it mandates that "all references to terms in a Dublin
>>> Core metadata description be made using URIs" (what about XML QNames?) and
>>> that it only talks about metadata records, whereas our scope is any kind of
>>> data.
>>>
>>> As an aside, it's interesting to note that the DSP document itself defines
>>> a profile for a DSP (§6) that is formalized in an XML schema; the creation
>>> of a ShEx document shouldn't be difficult and is left as an exercise for the
>>> reader.
>>>
>>>> [1] http://dublincore.org/documents/singapore-framework/
>>>> And this is a shortcut to the diagram, which may be the most useful
>>>> part:
>>>> http://dublincore.org/documents/2008/01/14/singapore-framework/singapore-framework.png
>>>> [2] http://dublincore.org/documents/dc-dsp/ however some of the details
>>>> pre-date general acceptance of RDF and need to change, so don't get hung up
>>>> on how the lower levels of the model are defined
>>>
>>> Best,
>>>
>>> Lars
>>>
>>> On 11/21/17 4:26 PM, Rob Atkinson wrote:
>>>>
>>>> Profiles should IMHO reference type ontologies where necessary to
>>>> further restrict the range of profiled properties (either base
>>>> specification or a more general profile).
>>>>
>>>> e.g. a profile for "spatial area statistics standard X" may require
>>>> that the statistical dimension property is related to (has an rdfs:range
>>>> of) a 'feature with a polygon geometry',
>>>>
>>>> the "US Census profile" may require this to have a FIPS code and the
>>>> 2020 census may require it to be from the set of 2020 US state
>>>> boundaries, by reference to a specific implementation.
>>>>
>>>> I think "vocabulary" is a set of definitions in the general case, and
>>>> is agnostic about how much information model goes along with that set
>>>> - so we need to be pretty careful about assumptions as to what it means here.
>>>>
>>>> On Wed, 22 Nov 2017 at 10:21 Karen Coyle <kcoyle@kcoyle.net> wrote:
>>>>
>>>>     Are you referring to value vocabularies? I was thinking about
>>>>     properties, and in the profiles I've seen they tend to be lists of terms
>>>>     representing properties and classes.
>>>>
>>>>     kc
>>>>
>>>>     On 11/21/17 2:18 PM, Rob Atkinson wrote:
>>>>     >
>>>>     > Profiles should reference controlled vocabularies - and practically
>>>>     > these must be accessible via distributions such as REST API
>>>>     > endpoints - consider the GBIF biota taxon vocabulary - millions of
>>>>     > terms and changes every day. Cannot embed this in a profile, or even
>>>>     > in a static resource.
>>>>     >
>>>>     > Rob
>>>>     >
>>>>     >
>>>>     >
>>>>     > On Wed, 22 Nov 2017 at 09:11 Karen Coyle <kcoyle@kcoyle.net> wrote:
>>>>     >
>>>>     >
>>>>     >
>>>>     >     On 11/21/17 12:25 PM, Antoine Isaac wrote:
>>>>     >     > Hi Karen,
>>>>     >     >
>>>>     >     > I'm trying to work on it.
>>>>     >     > But I have to say I'm a bit lost as to what has happened to our
>>>>     >     > use case (5.37) and requirements. At some point everything was
>>>>     >     > included at
>>>>     >     > https://w3c.github.io/dxwg/ucr/#ID37
>>>>     >     > but the requirement list seems to have been really simplified;
>>>>     >     > now the only requirement derived from 5.37 is
>>>>     >     > https://w3c.github.io/dxwg/ucr/#RID11
>>>>     >     >
>>>>     >     > When we contributed our use case we had listed these
>>>>     >     > requirements:
>>>>     >     > - Each application profile needs to be documented, preferably by
>>>>     >     > showing/reusing what is common across profiles
>>>>     >
>>>>     >     We'll make sure that these get in. I do have a very basic
>>>>     >     question, though, which is whether you have any assumptions about
>>>>     >     the content of a profile. This says that it is documented, that it
>>>>     >     is machine-readable, that it contains validation, and that
>>>>     >     profiles can contain pieces of data from other profiles. Is there
>>>>     >     some statement that can be made about the nature of this data? Are
>>>>     >     you assuming that profiles contain vocabulary terms? This seems to
>>>>     >     be the missing background information from our requirements.
>>>>     >
>>>>     >     kc
>>>>     >
>>>>     >     > - Machine-readable specifications of application profiles need
>>>>     >     > to be easily publishable, and optimize re-use of existing
>>>>     >     > specifications.
>>>>     >     > - Application profiles need a rich expression for the validation
>>>>     >     > of metadata
>>>>     >     > - Publishers (data providers, intermediary aggregators,
>>>>     >     > Europeana and DPLA) need to be able to indicate the profile to
>>>>     >     > which a certain piece of data (record describing an individual
>>>>     >     > cultural object, or a whole dataset) belongs.
>>>>     >     > - Data publishers need to be able to serve different profiles of
>>>>     >     > the same data via the same data publication channel (Web API)
>>>>     >     > - Data consumers (intermediary aggregators, Europeana and DPLA,
>>>>     >     > data consumers) need to be able to specify the profile they are
>>>>     >     > interested in
>>>>     >     > - Europeana needs to be able to accept the data described using
>>>>     >     > EDM extensions that are compatible with its EDM-external
>>>>     >     > profile, whether it doesn't ingest this data entirely (i.e. some
>>>>     >     > elements will be left out as they are useless for the main
>>>>     >     > Europeana Collections portal) or it does ingest it (e.g. for
>>>>     >     > Thematic Collections portals or domain-specific applications
>>>>     >     > that Europeana or third parties would develop)
>>>>     >     >
>>>>     >     > I'm going to see how it aligns with your list. But I preferred
>>>>     >     > to send you our raw list now, so that you can have a brief look
>>>>     >     > at it, if only because this list supports your point "Also,
>>>>     >     > there are some obvious requirements, like being both machine and
>>>>     >     > human-readable, having identifiers, etc., that we do not have
>>>>     >     > use cases for". Valentine and I wanted our use case to be a
>>>>     >     > motivation for such requirements...
>>>>     >     >
>>>>     >     > Cheers,
>>>>     >     >
>>>>     >     > Antoine
>>>>     >     >
>>>>     >     > On 21/11/17 16:34, Karen Coyle wrote:
>>>>     >     >> Because we need to move to FPWD, if we can agree on the
>>>>     >     >> requirements for profiles as written here, we can amend those
>>>>     >     >> for the next publication of the UCR. We can add a note that
>>>>     >     >> these are still in flux.
>>>>     >     >>
>>>>     >     >> kc
>>>>     >     >>
>>>>     >     >> On 11/20/17 1:57 PM, Antoine Isaac wrote:
>>>>     >     >>> Hi Karen, all,
>>>>     >     >>>
>>>>     >     >>> Sorry, I wanted to do this today but I probably won't have
>>>>     >     >>> time, also seeing that a considerable thread has appeared
>>>>     >     >>> after your initial email and will probably require reading...
>>>>     >     >>> I'll try to do it this week, though reorganization at
>>>>     >     >>> Europeana is keeping me busy.
>>>>     >     >>>
>>>>     >     >>> Very likely regrets for tomorrow by the way :-/
>>>>     >     >>>
>>>>     >     >>> Antoine
>>>>     >     >>>
>>>>     >     >>> On 15/11/17 04:32, Karen Coyle wrote:
>>>>     >     >>>> All, I'm not sure that this requirement list is complete but
>>>>     >     >>>> it is what I could come up with in a short time so that we
>>>>     >     >>>> could have something to discuss. [Note to Antoine and
>>>>     >     >>>> Valentine: please see if I correctly captured the
>>>>     >     >>>> requirements from your use case.]
>>>>     >     >>>>
>>>>     >     >>>> I want to mention that I believe there may be more than one
>>>>     >     >>>> definition of "profile" being used in the use cases. In
>>>>     >     >>>> particular, UC 5.3 (submitted by Ruben) didn't seem to me to
>>>>     >     >>>> be a function of profiles but of the connection service.
>>>>     >     >>>> There may be other such differences in the use cases where
>>>>     >     >>>> I'm not sure if the reference is to the profile or to a
>>>>     >     >>>> specific selection of instance data.
>>>>     >     >>>>
>>>>     >     >>>> Also, there are some obvious requirements, like being both
>>>>     >     >>>> machine and human-readable, having identifiers, etc., that we
>>>>     >     >>>> do not have use cases for. I did a talk at the recent Dublin
>>>>     >     >>>> Core conference that included a number of requirements of
>>>>     >     >>>> this nature that we may wish to examine.
>>>>     >     >>>>
>>>>     >     >>>> http://dcevents.dublincore.org/IntConf/dc-2017/paper/view/520/643
>>>>     >     >>>>
>>>>     >     >>>>
>>>>     >     >>>> ****
>>>>     >     >>>> profiles list valid vocabulary terms for a metadata usage
>>>>     >     >>>> environment (5.37)
>>>>     >     >>>>
>>>>     >     >>>> profile vocabulary lists may be defined as closed (no other
>>>>     >     >>>> terms are allowed) or open (other terms are allowed) (5.37)
>>>>     >     >>>>
>>>>     >     >>>> conceptually, profiles can extend other vocabularies or
>>>>     >     >>>> profiles, or can be refinements of other vocabularies or
>>>>     >     >>>> profiles (5.37)
>>>>     >     >>>>
>>>>     >     >>>> profiles can be "cascading", inheriting from other profiles
>>>>     >     >>>> or profile fragments (discussion at first f2f)
>>>>     >     >>>>
>>>>     >     >>>> profiles reuse vocabulary terms defined elsewhere (Dublin
>>>>     >     >>>> Core profiles; no use case)
>>>>     >     >>>>
>>>>     >     >>>> profiles must be able to define finer-grained semantics for
>>>>     >     >>>> vocabulary terms that are used (visible in DCAT APs)
>>>>     >     >>>>
>>>>     >     >>>> profiles must be able to express rules that support data
>>>>     >     >>>> validation (cardinality, valid values) (5.41)
>>>>     >     >>>>
>>>>     >     >>>> profiles must be able to express cardinality rules of
>>>>     >     >>>> vocabulary terms (5.41)
>>>>     >     >>>>
>>>>     >     >>>> profiles can contain links to detailed validation rules or to
>>>>     >     >>>> validation applications that can process the profile (5.48)
>>>>     >     >>>>
>>>>     >     >>>> profiles must be able to support information that can drive
>>>>     >     >>>> data creation functions, including brief and detailed
>>>>     >     >>>> documentation (5.46)
>>>>     >     >>>>
>>>>     >     >>>> profiles must be able to express what standards (including
>>>>     >     >>>> creation rules) the data conforms to (5.43) (5.42)
>>>>     >     >>>>
>>>>     >     >>>> profiles must support discoverability via search engines
>>>>     >     >>>> (5.40)
>>>>     >     >>>>
>>>>     >     >>>> profiles must have identifiers that can be used to link the
>>>>     >     >>>> DCAT description to the relevant profile (seems obvious; no
>>>>     >     >>>> use case)
>>>>     >     >>>>
>>>>     >     >>>> *Not covered* (because I didn't know what the requirement
>>>>     >     >>>> would be):
>>>>     >     >>>> 5.3 Responses can conform to multiple, modular profiles (by
>>>>     >     >>>> Ruben)
>>>>     >     >>>>
>>>>     >     >>>> kc
>>>>     >     >>>>
>>>>     >     >>>
>>>>     >     >>
>>>>     >     >
>>>>     >
>>>>     >     --
>>>>     >     Karen Coyle
>>>>     >     kcoyle@kcoyle.net http://kcoyle.net
>>>>     >     m: 1-510-435-8234 (Signal)
>>>>     >     skype: kcoylenet/+1-510-984-3600
>>>>     >

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234 (Signal)
skype: kcoylenet/+1-510-984-3600

Received on Thursday, 11 January 2018 01:03:55 UTC