Re: Some DC requirements

OK, with a bit more thought, I think I understand the "ephemeral"
comment: what is true at the time of this validation may not be true at
another moment in time.

We've talked a little about workflow, but I definitely see different 
requirements for different points on a workflow - different requirements 
for input vs. output, for receipt vs. stored data, for batch input vs. 
input forms, etc. The main validation case for cultural heritage objects 
(CHO) is data exchange or data receipt. Essentially, the old "batch 
input". (I suspect the CHO case is even less dynamic than the OSLC data 
flow point of view.) This also connects to the desire for graduated 
error handling - to be able to return information to the data provider 
where something is not fatally wrong but not quite right. So if someone
sends you info about a book and the link to the cover image is missing,
that's not fatal, but something is wrong that perhaps could be fixed.

It seems to me that the only thing that a validation standard needs for 
this case is a way to mark, in the validation rules, what should be 
checked, and what type of check is to be done. There may be a way to 
generalize this, which would be fine. The use case still remains. I also 
suspect that there may be needs for this outside of the batch data flow 
-- that's worth thinking about.
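
To make the "mark what should be checked, and what type of check" idea a bit more concrete, here is a rough sketch in Python. The names (Severity, Rule, validate, is_accepted) and the property names are invented for illustration only and do not come from any existing validation language. The only check type shown is "value is present".

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    WARNING = "warning"  # not fatal; report back to the data provider
    ERROR = "error"      # fatal; the record is rejected


@dataclass
class Rule:
    prop: str           # property to check (hypothetical names)
    severity: Severity  # how a missing value is treated


def validate(record: dict, rules: list) -> list:
    """Return one (severity, message) pair per rule the record fails."""
    problems = []
    for rule in rules:
        if not record.get(rule.prop):
            problems.append((rule.severity, f"missing {rule.prop}"))
    return problems


def is_accepted(problems: list) -> bool:
    """A record is accepted unless at least one failure is fatal."""
    return all(sev is not Severity.ERROR for sev, _ in problems)


rules = [Rule("title", Severity.ERROR), Rule("coverImage", Severity.WARNING)]
book = {"title": "Moby-Dick"}  # the cover image link is missing
problems = validate(book, rules)
```

Here the book record would still be accepted, but the provider gets a warning back about the missing cover image.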

kc

On 12/11/14 8:27 AM, Karen Coyle wrote:
>
>
> On 12/11/14 8:10 AM, Peter F. Patel-Schneider wrote:
>
>>> Mandatory & Repeatable
>>> by Karen Coyle
>>> Folks in our community are used to cardinality being expressed as
>>> "mandatory or optional" and "repeatable or not-repeatable". We don't
>>> have any use cases for a more open-ended min/maxCardinality, so we wish
>>> to include these in our core requirements, with their "min/max" being
>>> defined in a layer that the requirements user does not see.
>>
>> I'm guessing that these correspond to [0,1], [1,1], [1,*], and [0,*],
>> but it would be nice to get some confirmation of this.
>
>
> Yes, I'm assuming that as well. I don't know what else they could mean.
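
The correspondence assumed above could be written down as a small lookup table. This is only a sketch of that reading; the names CARDINALITY and cardinality_ok are made up here, and None is used to stand for the unbounded "*".

```python
# (mandatory, repeatable) -> (min, max); None stands for unbounded ("*")
CARDINALITY = {
    (False, False): (0, 1),    # optional, not repeatable
    (True, False): (1, 1),     # mandatory, not repeatable
    (True, True): (1, None),   # mandatory, repeatable
    (False, True): (0, None),  # optional, repeatable
}


def cardinality_ok(count: int, mandatory: bool, repeatable: bool) -> bool:
    """Check a value count against the hidden min/max layer."""
    low, high = CARDINALITY[(mandatory, repeatable)]
    return count >= low and (high is None or count <= high)
```

The person writing the requirements only ever sees the two yes/no flags; the min/max pair stays in the layer underneath.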
>
>
>>
>>> Checking the IRIs
>>> by Karen Coyle
>>> Europeana aggregates metadata about cultural heritage objects from
>>> hundreds of libraries, archives and museums. The incoming data needs to
>>> be thoroughly checked for accuracy. Among these checks are those on IRIs
>>> as values, which can vary depending on the property. Briefly, the checks
>>> are
>>> 1) the IRI must resolve, i.e. http status code = 2XX
>>> 2) the IRI value must return a media object of a given type (e.g. based
>>> on list of MIME types)
>>> 3) the IRI value must return an object which is of the rdf:type
>>> SKOS:Concept
>>
>> I am uncomfortable including this kind of checking, although I do see
>> that it has uses.  One issue here is that the results of the checks are
>> all ephemeral.
>
>
> Don't know what you mean by ephemeral here. Knowing that an IRI resolves
> seems like an obvious check, to me, when receiving data from a third
> party whose output isn't terribly trustworthy (which in our case it
> often isn't). The other two are still under discussion, but are actual
> validations from current applications.
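
As a sketch of checks 1 and 2, the function below looks only at what a fetch of the IRI returned (status code and Content-Type header), so it stays independent of any particular HTTP library. The function name and the allowed-types parameter are assumptions for illustration; check 3 would need the response parsed as RDF and is not covered.

```python
def check_iri_response(status: int, content_type: str, allowed_types: set) -> list:
    """Return a list of failure messages; an empty list means both checks passed."""
    failures = []
    if not 200 <= status <= 299:
        # Check 1: the IRI must resolve (HTTP status 2XX)
        failures.append(f"does not resolve (HTTP {status})")
    else:
        # Check 2: the media type must be one of the allowed MIME types;
        # parameters such as "; charset=utf-8" are ignored
        mime = content_type.split(";")[0].strip().lower()
        if mime not in allowed_types:
            failures.append(f"unexpected media type {mime!r}")
    return failures
```

The result describes the IRI only at the moment it was fetched; the same IRI could pass today and fail tomorrow.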
>
>
>>
>>> Comparing values
>>> by Karen Coyle
>>> There are cases where the values in two or more triples have a specific
>>> relationship. The obvious one is "birthDate/deathDate" or
>>> "startDate/endDate". The validation model must allow these to be
>>> defined. One assumption is that the validation takes place within the
>>> context of a graph or node. Another is that the comparison is between
>>> literal values or datatypes, not IRIs. The question of whether this
>>> could be used more generally for ordering of lists is still being
>>> discussed, but it may be best to treat lists as a special case.
>>
>> I believe that there already is a story about relationships between
>> literal values of properties.
>
> Yes, there may be. I assume stories will get de-duped as we move into
> requirements.
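
A minimal sketch of the birthDate/deathDate comparison, assuming the node is represented as a plain dict of property name to ISO 8601 date string. That representation and the function name dates_in_order are assumptions made here, not part of any proposal.

```python
from datetime import date


def dates_in_order(node: dict, earlier: str, later: str) -> bool:
    """True when `earlier` <= `later`, or when either value is absent."""
    a, b = node.get(earlier), node.get(later)
    if a is None or b is None:
        # Absence is a cardinality question, not a comparison one
        return True
    return date.fromisoformat(a) <= date.fromisoformat(b)


person = {"birthDate": "1819-08-01", "deathDate": "1891-09-28"}
ok = dates_in_order(person, "birthDate", "deathDate")
```

The comparison is between literal values within a single node, in line with the two assumptions stated in the story.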
>
>
>>
>>> Defining allowed values
>>> by Karen Coyle
>>> Developers need to have these ways of defining the allowed values for
>>> each property
>>> 1) must be an IRI
>>> 2) must be an IRI matching this pattern (e.g.
>>> http://id.loc.gov/authorities/names/)
>>> 3) must be an IRI matching one of these patterns
>>> 4) must be a (any) literal
>>> 5) must be one of these literals ("red" "blue" "green")
>>> 6) must be a typed literal of this type (e.g. XML dataType)
>>
>> I believe that all (or at least almost all) of these are already in some
>> story or other.
>
> I found some of these, but they seemed to me to be buried in more
> general stories. Again, de-duping will happen, but I want to be sure
> that all of the DC cases are made explicit.
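
To make the six DC cases explicit, each one can be sketched as a small predicate. The IRI and Literal classes and all function names below are invented for illustration, and "pattern" is read here as a simple prefix match, which is only one possible interpretation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str
    datatype: Optional[str] = None  # datatype IRI, if the literal is typed


def is_iri(v) -> bool:                      # case 1
    return isinstance(v, IRI)


def iri_matches(v, prefixes) -> bool:       # cases 2 and 3 (one or more patterns)
    return isinstance(v, IRI) and v.value.startswith(tuple(prefixes))


def is_literal(v) -> bool:                  # case 4
    return isinstance(v, Literal)


def literal_in(v, allowed) -> bool:         # case 5
    return isinstance(v, Literal) and v.value in allowed


def literal_of_type(v, datatype) -> bool:   # case 6
    return isinstance(v, Literal) and v.datatype == datatype
```

For example, literal_in(Literal("red"), {"red", "blue", "green"}) holds, while an IRI value fails the same check regardless of its string form.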
>
> kc
>
>>
>> peter
>>
>>
>

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600

Received on Thursday, 11 December 2014 17:31:27 UTC