- From: John Erickson <olyerickson@gmail.com>
- Date: Mon, 30 Jan 2012 07:45:30 -0500
- To: Stasinos Konstantopoulos <konstant@iit.demokritos.gr>
- Cc: Government Linked Data Working Group WG <public-gld-wg@w3.org>
RE Stasinos' comments, I'd like to remind those "listening" to this
date thread that this is about catalog metadata, NOT metadata within a
dataset. Thus we should NOT be considering examples including e.g.
birth dates in this discussion.

+1 to CLOSING this discussion by recommending xsd:date but permitting
text literals... pax vobiscum ;)

On Mon, Jan 30, 2012 at 7:30 AM, Stasinos Konstantopoulos
<konstant@iit.demokritos.gr> wrote:
> All, hi.
>
> just some random ranting on the possibility of more flexible
> underspecified dates than what has been proposed so far. I lean towards
> recommending Approach 3 myself, although Approach 1 has the merit of
> perfectly fitting the current practice of using the first day of the
> month to mean any time during the month.
>
> Best,
> Stasinos
>
>
>
> Let us say that we want to use the property gld:birthDate to assert
> that http://konstant.gr/#stasinos was born between Aug 15th and Sep
> 15th, 1973, but we are unable to provide a specific date.
>
>
> Approach 1
>
> We define one specificity property for each property that ranges over
> dates that can potentially be underspecified. This property ranges over
> xsd:duration and means that the value of the original property should
> be understood as an unknown xsd:date instance that lies within the
> interval starting on the date shown by the original property and
> lasting for the duration shown by the specificity property.
>
> In our example we define
>     gld:birthDateSpecificity rdfs:range xsd:duration
> and specify that, if present, the value of gld:birthDate should be
> understood as an unknown xsd:date instance that lies within the
> interval starting on the date shown by gld:birthDate and lasting for
> the duration shown by gld:birthDateSpecificity:
>
> <http://konstant.gr/#stasinos>
>     gld:birthDate "1973-08-15"^^xsd:date ;
>     gld:birthDateSpecificity "P1M"^^xsd:duration .
>
> Pros:
> - It is simple to explain and populate.
> - It is consistent with current practice of using midnight of the
>   first day of the month (year) to mean an unknown date during that
>   month (year). We can make explicit that a given "1973-01-01" value
>   is actually meant to mean "sometime during 1973" WITHOUT retracting
>   any statements, but by adding a specificity statement of
>   "P1Y"^^xsd:duration.
>
> Cons:
> - It requires a new property for each property that we want to treat.
> - It distributes a meaning over two properties that are not nested
>   within the same pattern, but are at the same level as other, related
>   properties of the same resource.
>
>
> Approach 2
>
> Both cons above can be treated by introducing blank nodes (shudders)
> or genids or whatever name is more palatable as values of
> gld:birthDate. Such nodes would have properties of their own,
> restricting the range of possible concrete values they can assume:
>
> <http://konstant.gr/#stasinos> gld:birthDate [
>     rdf:type gld:underSpecifiedDate ;
>     gld:startDate "1973-08-15"^^xsd:date ;
>     gld:specificity "P1M"^^xsd:duration
> ] .
>
> We can, if so inclined, give more formal rigour by making explicit
> that gld:underSpecifiedDate is an instance inside an interval and not
> the whole interval:
>
> <http://konstant.gr/#stasinos> gld:birthDate [
>     rdf:type gld:date ;
>     gld:within [
>         rdf:type gld:dateInterval ;
>         gld:startDate <http://dates.org/1973/08/15> ;
>         gld:specificity "P1M"^^xsd:duration
>     ]
> ] .
>
> Note that the range of gld:birthDate is now a resource (since it has
> properties of its own) so this breaks compatibility with using
> xsd:date values when the date is known exactly. Exact dates would have
> to either be date URIs or be blank nodes with a data property ranging
> over xsd:date:
>
> <http://konstant.gr/#stasinos>
>     gld:birthDate <http://dates.org/1973/09/02> .
>
> or
>
> <http://konstant.gr/#stasinos>
>     gld:birthDate [
>         rdf:type gld:date ;
>         gld:hasValue "1973-09-02"^^xsd:date ] .
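
Approaches 1 and 2 above both read a start date plus an xsd:duration
"specificity" as an interval. A minimal Python sketch of that reading
follows. The helper names (add_duration, interval) are hypothetical,
only the date-only PnYnMnD subset of xsd:duration is handled, and
clamping the day to the end of a shorter month is an assumption. The
thread does not settle whether the end bound is inclusive, so the
sketch only reports it.

```python
import calendar
import re
from datetime import date, timedelta

# Date-only subset of xsd:duration (an assumption of this sketch):
# optional years, months and days, e.g. "P1M", "P1Y", "P1Y2M10D".
_DURATION = re.compile(r"^P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?$")


def add_duration(start: date, duration: str) -> date:
    """Return start shifted by a PnYnMnD duration (hypothetical helper)."""
    m = _DURATION.match(duration)
    if m is None or not any(m.groups()):
        raise ValueError(f"unsupported duration: {duration!r}")
    years, months, days = (int(g) if g else 0 for g in m.groups())
    # Add years and months on the calendar first, then the days.
    month_index = start.month - 1 + months + 12 * years
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    # Assumption: clamp the day when the target month is shorter.
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day) + timedelta(days=days)


def interval(start_lexical: str, specificity: str) -> tuple[date, date]:
    """Interval implied by a start date and a specificity duration."""
    start = date.fromisoformat(start_lexical)
    return start, add_duration(start, specificity)


if __name__ == "__main__":
    print(interval("1973-08-15", "P1M"))  # the birth-date example
    print(interval("1973-01-01", "P1Y"))  # "sometime during 1973"
```
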
>
>
> Pros:
> - Relatively simple to explain
> - It defines a handful of types and properties that can be used for
>   any property that ranges over dates. gld:date does not need to be a
>   new type, but can be the type of any existing date URI schema.
> - It collects all the triples about the underspecified date under
>   a single resource
>
> Cons:
> - Harder to populate than Approach 1
> - It breaks compatibility with current practice, even for fully known
>   dates.
>
>
> Approach 3
>
> We define gld:birthDate as a datatype property that ranges over the
> union of xsd:date and gld:underspecifiedDate. gld:underspecifiedDate
> is a simple datatype, derived by restricting xsd:string to:
>     DD(SS)?
> where DD is the lexical space of xsd:date and SS is the lexical space
> of xsd:duration.
>
> Semantics is start date and specificity as above. SS is optional and,
> if missing, defaults to "P1D" (one day).
>
> Examples:
>
> <http://konstant.gr/#stasinos>
>     gld:birthDate "1973-08-15P1M"^^gld:underspecifiedDate .
>
> The following values are equal (although not identical, so functional
> properties can have only one):
>     "1973-09-02P1D"^^gld:underspecifiedDate .
>     "1973-09-02"^^gld:underspecifiedDate .
>     "1973-09-02"^^xsd:date .
>
> The following values are not equal, as per the definition of
> xsd:duration that states that no relationship exists between months
> and days:
>
>     gld:birthDate "1973-08-15P1M"^^gld:underspecifiedDate .
>     gld:birthDate "1973-08-15P31D"^^gld:underspecifiedDate .
>
> Pros:
> - Relatively simple to explain and populate
> - It maintains compatibility with xsd:date, although inconsistent with
>   the practice of using midnight of the first day of the month (year)
>   to mean an unknown date during that month (year), as all xsd:date
>   values are interpreted as exact dates.
>
> Cons:
> - Harder to index, as "1973-08-15P1M", "1973-08-15P15D", and
>   "1973-08-15" are all different values.
>   Searching for all documents
>   related to "1973-08-15" requires full-text search with globs; not
>   a hard requirement (e.g., Solr does prefix* globs), but less
>   efficient than searching for exact values.
>
>
>
> Approach 4
>
> One, rather cumbersome, solution using existing OWL 2 constructs is to
> not make a direct gld:birthDate assertion, but instead restrict the
> possible values of this property for this resource, if ever
> discovered:
>
> ClassAssertion(
>   DataAllValuesFrom(
>     gld:birthDate
>     DatatypeRestriction(
>       xsd:dateTime
>       xsd:minInclusive "1973-08-15T00:00:00Z"^^xsd:dateTime
>       xsd:maxExclusive "1973-09-16T00:00:00Z"^^xsd:dateTime ))
>   <http://konstant.gr/#stasinos> )
>
> and as RDF triples:
>
> <http://konstant.gr/#stasinos> rdf:type
>     [ rdf:type owl:Restriction ;
>       owl:onProperty gld:birthDate ;
>       owl:allValuesFrom
>         [ rdf:type rdfs:Datatype ;
>           owl:onDatatype xsd:dateTime ;
>           owl:withRestrictions (
>             [ xsd:minInclusive "1973-08-15T00:00:00Z"^^xsd:dateTime ]
>             [ xsd:maxExclusive "1973-09-16T00:00:00Z"^^xsd:dateTime ] )
>         ]
>     ] .
>
> The use of midnight values of xsd:dateTime instead of xsd:date is
> mandated by the fact that xsd:date does not permit the
> xsd:minInclusive/xsd:maxExclusive restriction facets.
>
> This is, obviously, not something any sane person would suggest that
> GLD recommends, but goes to show that it is very well possible to
> formalize a human-readable underspecified date format by transforming
> to equivalent OWL 2 data.
>

-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
<http://tw.rpi.edu> <olyerickson@gmail.com>
Twitter & Skype: olyerickson
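
For reference, the DD(SS)? lexical form of Approach 3 in the quoted
message can be sketched in Python roughly as below. This is only an
illustration under stated assumptions: the names UnderspecifiedDate and
parse_underspecified are hypothetical, DD is simplified to a plain
YYYY-MM-DD without timezone, and SS to the date-only PnYnMnD subset of
xsd:duration. The specificity is kept as a string, so "P1M" and "P31D"
stay distinct, as the quoted message requires.

```python
import re
from datetime import date
from typing import NamedTuple


class UnderspecifiedDate(NamedTuple):
    """Parsed value: a start date plus a specificity duration string."""
    start: date
    specificity: str


# Simplified lexical space (an assumption of this sketch):
#   DD = YYYY-MM-DD, SS = PnYnMnD with at least one component.
_LEXICAL = re.compile(
    r"^(\d{4}-\d{2}-\d{2})(P(?:\d+Y)?(?:\d+M)?(?:\d+D)?)?$"
)


def parse_underspecified(lexical: str) -> UnderspecifiedDate:
    """Parse 'DD(SS)?'; a missing SS defaults to 'P1D' (one day)."""
    m = _LEXICAL.match(lexical)
    if m is None:
        raise ValueError(f"not a DD(SS)? literal: {lexical!r}")
    specificity = m.group(2) or "P1D"
    if specificity == "P":
        # A bare "P" has no components and is not a valid duration.
        raise ValueError(f"empty duration in: {lexical!r}")
    return UnderspecifiedDate(date.fromisoformat(m.group(1)), specificity)


if __name__ == "__main__":
    print(parse_underspecified("1973-08-15P1M"))
    # An omitted specificity and an explicit P1D parse to the same value.
    print(parse_underspecified("1973-09-02") ==
          parse_underspecified("1973-09-02P1D"))
```

Keeping the duration as an uninterpreted string is a deliberate choice
here: it avoids asserting any relationship between months and days.
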
Received on Monday, 30 January 2012 12:46:08 UTC