- From: Jeremy Carroll <jeremy@topquadrant.com>
- Date: Fri, 14 Oct 2011 13:56:11 -0700
- To: public-rdf-wg@w3.org
- Message-ID: <4E98A1EB.20105@topquadrant.com>
I think Alex's example is an excellent case where HTTP caching is a
sufficient solution.

The copy of the assignments is a cache, and if you make an HTTP cache
without following the HTTP recommended caching mechanism, and it goes
wrong ... fix the damn code, not the model.

Jeremy

On 10/14/2011 8:13 AM, David Wood wrote:
> Hi Alex,
>
> On Oct 14, 2011, at 11:06, Alex Hall wrote:
>
>> On Thu, Oct 13, 2011 at 9:43 PM, David Wood <david@3roundstones.com
>> <mailto:david@3roundstones.com>> wrote:
>>
>>    Hi Dan,
>>
>>    Unfortunately, your scenario doesn't have an explicit requirement
>>    for any temporal data. Why not just update all the cubicle
>>    assignments when they change? Even Jeremy's sandwich delivery
>>    requirement could be satisfied by a trivial SPARQL query that
>>    lists current cube assignments.
>>
>>
>> I think there's a very clear requirement for temporal data here.
>> Sure, you could update all the cubicle assignments *in your own
>> database* when they change, but once you publish those assignments
>> you can't un-publish them. If somebody copies those assignments and
>> stores them somewhere else (e.g. corporate headquarters collects
>> cubicle assignments from all departments to create a company-wide
>> directory) then you now have the very real possibility of that copy
>> of the data becoming out of sync with the most current version.
>>
>> Recording that on Jan 1, Alice was assigned cube 1000 and on March 1,
>> Alice was assigned cube 1001 is much more resistant to becoming out
>> of sync like this, because the context is in the data so you don't
>> have to carry it around as metadata. The obvious downside is that it
>> makes queries to find the current state of things more complicated.
>
>
> Yes, good point. So you suggest a cubicleChangeEvent class and the
> recording of each event? I agree.
>
> Regards,
> Dave
>
>
>>
>> I've seen this sort of event-based modeling in real-world systems,
>> e.g. an HR system defines a person's salary as the value of the
>> salary property of the most recent SalaryEvent recorded for that
>> person.
>>
>> -Alex
>>
>>
>>    One could concoct a scenario in which Wally (Dilbert's
>>    notoriously vindictive and lazy co-worker) is collecting data
>>    with which to report the Pointy Haired Boss to the board for
>>    extraneous cubicle arranging, but that seems contrived (because
>>    it is).
>>
>>    I propose that the age requirement is a more appropriate scenario
>>    to start with. Each employee has their birthday recorded.
>>    Dogbert wishes to send de-motivational birthday greetings and
>>    constructs a SPARQL query to discover which employees should get
>>    a card on any given day. The query would use some SPARQL 1.1
>>    features to calculate who has a birthday. We could explain why a
>>    birthday is recorded and not employee age.
>>
>>    It would be beneficial to also do something that is hard or
>>    impossible with an RDBMS. Perhaps the scenario could be extended
>>    with another dataset, this one kept by the Pointy Haired Boss.
>>    Eventually, someone (probably Dogbert) would merge the two
>>    datasets to satisfy some new query that can only be answered
>>    across both datasets.
>>
>>    The Boss's dataset tracks which employees are behind in their
>>    work. This is where temporal data comes in, because he wants to
>>    keep a history and query who the worst employee is for all time,
>>    not just the current time. Like Dogbert's dataset, the Boss uses
>>    the same employee ids, making later merging easy.
>>
>>    The Boss might track projects, their start dates, their
>>    anticipated end dates and their actual end dates. This is like a
>>    real-world Gantt chart, but simplified. Each project involves
>>    one or more employees. To follow your lead, each project id
>>    starts with p- and an integer:
>>
>>    <http://example.com/p-120>
>>        hasStartDate '2011-03-12'^^xsd:date ;
>>        hasPlannedEndDate '2011-05-28'^^xsd:date ;
>>        hasActualEndDate '2011-08-31'^^xsd:date ;
>>        assignedEmployee <http://example.com/e-1> .
>>
>>    Additional employees could use duplicated assignedEmployee
>>    properties.
>>
>>    I think your same questions to Pat apply, but the scenario seems
>>    less contrived to me.
>>
>>    The merged data from the two datasets could be used to justify,
>>    e.g., the firing of older employees by Dogbert before they could
>>    claim a pension.
>>
>>    Just my two cents (pence) at the end of a long day. Please
>>    ignore if I am no longer making sense.
>>
>>    Regards,
>>    Dave
>>
>>
>>    On Oct 13, 2011, at 12:34, Dan Brickley <danbri@danbri.org
>>    <mailto:danbri@danbri.org>> wrote:
>>
>> > Pat, (well, everyone; but triggered by Pat's comments)
>> >
>> > You're suggesting, if I read you right, that RDF shouldn't be
>> > written in ways that make its truth context-dependent; e.g. that a
>> > 'date of birth' property is preferable by far to an 'age' property.
>> >
>> > Below is a sketch of a reasonably common descriptive scenario.
>> > Could you maybe suggest a modelling / descriptive idiom that avoids
>> > these problems? I hope it anchors some of the issues we've
>> > discussed in a small enough example that might be turned into
>> > concrete decision test cases or example documentation.
>> >
>> > Dan
>> >
>> > -----
>> >
>> > Theory and Practice
>> >
>> > Consider an RDF vocabulary for describing office assignments in
>> > the cartoon universe inhabited by Dilbert
>> > <http://en.wikipedia.org/wiki/Dilbert>. First I describe the
>> > universe, then some ways in which we might summarise what's going
>> > on using RDF graph descriptions. I would love to get a sense for
>> > any 'best practice' claims here. Personally I see no single best
>> > way to deal with this, only different and annoying tradeoffs.
>> >
>> > So --- this is a fictional highly simplified company in which
>> > workers each are assigned to occupy exactly one cubicle, and in
>> > which every cubicle has at most one assigned worker. Cubicles may
>> > also sometimes be empty.
>> >
>> > * Every 3 months, the Pointy-haired boss
>> > <http://en.wikipedia.org/wiki/List_of_Dilbert_characters#Pointy-haired_boss>
>> > has a strategic re-organization, and re-assigns workers to
>> > cubicles.
>> > * He does this in a memo dictated to Dogbert, who will take the
>> > boss's vague and forgetful instructions and compare them to an
>> > Excel spreadsheet. This, cleaned up, eventually becomes an emailed
>> > Word .doc sent to the all-staff@ mailing list. The Word document
>> > is basically a table of room moves; it is headed with a date and,
>> > in bold type, "EFFECTIVE IMMEDIATELY", usually mailed out
>> > mid-evening and read by staff the next morning.
>> > * In practice, employees move their stuff to the new cubicles over
>> > the course of a few days; longer if they're on holiday or off
>> > sick. Phone numbers are fixed later, hopefully. As are name badges
>> > etc.
>> > * But generally the move takes place the day after the Word file
>> > is circulated, and at any one point, a given cubicle can be fairly
>> > said to have at most one official occupant worker.
>> >
>> > So let's try to model this in RDF/RDFS/OWL.
>> >
>> > First, we can talk about the employees. Let's make a class,
>> > 'Employee'.
>> >
>> > In the company systems, each employee has an ID, which is 'e-'
>> > plus an integer. Once assigned, these are never re-assigned, even
>> > if the employee leaves or dies.
>> >
>> > We also need to talk about the office space units, the cubes or
>> > 'Cubicles'. Let's forget for now that the furniture is movable,
>> > and treat each Cubicle as if it lasts forever. Maybe they are even
>> > somehow symbolic cubicle names, and the furniture that embodies
>> > them can be moved around to different office locations. But we
>> > don't try modelling that for now.
>> >
>> > In the company systems, each cubicle has an ID, which is 'c-' plus
>> > an integer. Once assigned, these are never re-assigned, even if
>> > the cubicle becomes in any sense de-activated.
>> >
>> > Let's represent these as IRIs. Three employees, three cubicles.
>> >
>> > * http://example.com/e-1
>> > * http://example.com/e-2
>> > * http://example.com/e-3
>> > * http://example.com/c-1000
>> > * http://example.com/c-1001
>> > * http://example.com/c-1002
>> >
>> > We can describe the names of employees. Cubicles also have
>> > informal names. Let's say that neither change, ever.
>> >
>> > * e-1 name 'Alice'
>> > * e-2 name 'Bob'
>> > * e-3 name 'Charlie'
>> > * c-1000 name 'The Einstein Suite'
>> > * c-1001 name 'The doghouse'
>> > * c-1002 name 'Helpdesk'
>> >
>> > Describing these in RDF is pretty straightforward.
>> >
>> > Let's now describe room assignments.
>> >
>> > At the beginning of 2011 Alice (e-1) is in c-1000; Bob (e-2) is in
>> > c-1001; Charlie (e-3) is in c-1002. How can we represent this in
>> > RDF?
>> >
>> > We define an RDF/RDFS/OWL relationship type, aka property, called
>> > eg:hasCubicle.
>> >
>> > Let's say our corporate ontologist comes up with this schematic
>> > description of cubicle assignments:
>> >
>> > * eg:hasCubicle has a domain of eg:Employee, a range of
>> > eg:Cubicle.
>> > * it is an owl:FunctionalProperty, because any Employee has at
>> > most one Cubicle related via hasCubicle.
>> > * it is an owl:InverseFunctionalProperty, because any Cubicle is
>> > the value of hasCubicle for no more than one Employee.
>> >
>> > So... at the beginning of 2011 it would be truthy to assert these
>> > RDF claims:
>> >
>> > * <http://example.com/e-1> <http://example.com/hasCubicle>
>> > <http://example.com/c-1000> .
>> > * <http://example.com/e-2> <http://example.com/hasCubicle>
>> > <http://example.com/c-1001> .
>> > * <http://example.com/e-3> <http://example.com/hasCubicle>
>> > <http://example.com/c-1002> .
>> >
>> > Now, come March 10th, everyone at the company receives an
>> > all-staff email from Dogbert, with cubicle reassignments. Amongst
>> > other changes, Alice and Bob are swapping cubicles, and Charlie
>> > stays in c-1002.
>> >
>> > Within a week or so (let's say by March 20th to be sure) the
>> > cubicle moves are all made real, in terms of where people are
>> > supposed to be based, where they are, and where their stuff and
>> > phone line routings are.
>> >
>> > The fictional world by March 20th 2011 is now truthily described
>> > by the following claims:
>> >
>> > * <http://example.com/e-1> <http://example.com/hasCubicle>
>> > <http://example.com/c-1001> .
>> > * <http://example.com/e-2> <http://example.com/hasCubicle>
>> > <http://example.com/c-1000> .
>> > * <http://example.com/e-3> <http://example.com/hasCubicle>
>> > <http://example.com/c-1002> .
>> >
>> >
>> > Questions / view from Named Graphs.
>> >
>> > 1. Was it a mistake, bad modelling style etc, to describe things
>> > with 'hasCubicle'? Should we have instead described a date-stamped
>> > 'CubicleAssignmentEvent' that mentions for example the roles of
>> > Dogbert, Alice, and some Cubicle? Is there a 'better' way to
>> > describe things? Is this an acceptable way to describe things?
>> >
>> > 2. How should we express then the notion that each employee has at
>> > most one cubicle and vice versa? Is this appropriate material to
>> > try to capture in OWL?
>> >
>> > 3. How should a SPARQL store or TriG++ document capture the
>> > different graphs describing the evolving state of the company's
>> > office-space allocations?
>> >
>> > 4. Can we offer any practical but machine-readable metadata that
>> > helps indicate to consuming applications the potential problems
>> > that might come from merging different graphs that use this
>> > modelling style? For example, can we write any useful definition
>> > for a class of property "TimeVolatileProperty" that could help
>> > people understand the risk of merging different RDF graphs using
>> > 'hasCubicle'?
>> >
>> > 5. Can the 'snapshot of the world-as-it-now-is' view and the
>> > 'transaction / event log view' be equal citizens, stored in the
>> > same RDF store, and can metadata / manifest / table of contents
>> > info for that store be used to make the information usefully
>> > exploitable and reasonably truthy?
>> >
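
The event-log view raised in the thread (Alex's dated assignments, Dan's
'CubicleAssignmentEvent' in question 1, and the snapshot-versus-log
contrast in question 5) can be illustrated with a minimal Python sketch.
It uses plain data structures rather than an RDF store; the class name
CubicleAssignmentEvent comes from the thread, but its fields, the helper
names snapshot and is_one_to_one, and the exact event dates are
assumptions made only for illustration.

```python
from datetime import date
from typing import NamedTuple


class CubicleAssignmentEvent(NamedTuple):
    """One dated assignment; the field names here are hypothetical."""
    effective: date
    employee: str
    cubicle: str


# Event log for the thread's example. The exact dates are assumptions.
EVENTS = [
    CubicleAssignmentEvent(date(2011, 1, 1), "e-1", "c-1000"),
    CubicleAssignmentEvent(date(2011, 1, 1), "e-2", "c-1001"),
    CubicleAssignmentEvent(date(2011, 1, 1), "e-3", "c-1002"),
    CubicleAssignmentEvent(date(2011, 3, 10), "e-1", "c-1001"),
    CubicleAssignmentEvent(date(2011, 3, 10), "e-2", "c-1000"),
]


def snapshot(events, as_of):
    """Map each employee to the cubicle of their latest event on or before as_of."""
    current = {}
    # Process events oldest first so later events overwrite earlier ones.
    for event in sorted(events, key=lambda e: e.effective):
        if event.effective <= as_of:
            current[event.employee] = event.cubicle
    return current


def is_one_to_one(assignment):
    """Check that no cubicle is assigned to more than one employee."""
    return len(set(assignment.values())) == len(assignment)
```

Because each event carries its own date, merging a stale copy of the log
with a fresh one (for example, EVENTS + EVENTS) yields the same
snapshot, which is the resistance to going out of sync that Alex
describes. The cost he mentions is also visible: finding the current
state takes a pass over the log instead of a single lookup.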
Received on Friday, 14 October 2011 20:56:35 UTC