Re: simon:entity (or Identifiable) from Myers, Jim on 2011-07-19 (public-prov-wg@w3.org from July 2011)

From: Myers, Jim <MYERSJ4@rpi.edu>
Date: Mon, 18 Jul 2011 22:33:49 -0400
To: <reza.bfar@oracle.com>, <public-prov-wg@w3.org>
Message-ID: <B7376F3FB29F7E42A510EB5026D99EF205467F03@troy-be-ex2.win.rpi.edu>
Reza,

 

The spam tag is getting added automatically somewhere - I'll try to keep
removing it as well.

 

My point about a view 'limited to one system/one witness' was not meant
to imply that you were arguing based on a particular system, or that
content systems are simple.  I was just trying to make it clear that we
don't usually model the world more than one way within a given system,
so use cases based on content management, or workflow, or any other app
in isolation don't usually capture the general case problem one has if
the intent is to allow the integration of provenance across all the
applications and processes that have caused a dataset to be.

 

I think we have agreed that this case is of interest. Tracking the
publication process in terms of approvals and licensing while also
tracking the digital versions and physical file copies involves such
multiple perspectives, particularly if one wants to catch problems, e.g.
one of the copies was corrupted. If that's an incorrect assumption on my
part, then we need to revisit. 

 

I don't like framing this as the provenance of XYZ beginning before XYZ
exists - I think the question is whether we expect to track the
provenance of legal, business, conceptual, digital, physical and other
types of entities and need to link those provenance traces. (I.e. "do we
need to connect the provenance of the conceptual painting with that of
the physical painting", not "do we have to document the provenance of
the potential physical painting before it exists").

 

Assuming that we do have this use case, we then have to make IVPof, or
some other set of mechanisms clear. We are certainly struggling to do
that to everyone's satisfaction. I'll suggest some variations on the
definition and make some comments that might help move this forward.

 

A and B can be any two things/bobs that we can record provenance about -
whatever we call them, they are not a subset/subtype of provenance
thing.

 

For a definition, how about: 

 

An assertion that B is an IVPof A implies that, for the asserter, A and
B are entities in different models of the world (different ontological
types) and that, at the point of the assertion, B is a form, variation,
or alternate view of A. Examples of IVPof relationships include
versioning, entity-state relationships, logical-digital-physical
correspondences, the FRBR work/expression/manifestation/item hierarchy,
etc. 

 

This is about as well as the concept of version gets defined, so perhaps
that is enough. 
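
A minimal sketch of what recording such an assertion might look like, assuming things are identified by URI and tagged with the model they belong to. The class names, field names, and example URIs are hypothetical and not from any PIL draft:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    """Anything we can record provenance about (hypothetical class name)."""
    uri: str
    model: str  # the model of the world / ontological type it belongs to


@dataclass(frozen=True)
class IVPofAssertion:
    """Records that, for `asserter`, `b` is an IVPof `a` as of `asserted_at`."""
    asserter: str
    b: Thing
    a: Thing
    asserted_at: str  # ISO 8601 timestamp of the assertion

    def crosses_models(self) -> bool:
        # The suggested definition expects A and B to come from different models.
        return self.a.model != self.b.model


report = Thing("urn:example:doc/report", model="logical-document")
draft = Thing("urn:example:file/report-v1", model="digital-file")
claim = IVPofAssertion("jim", b=draft, a=report,
                       asserted_at="2011-07-18T22:33:49-04:00")
```

The assertion carries its asserter and time because the definition is scoped to both; nothing here tries to decide whether the claim is true.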

 

To think about invariance, while the case where B is 'more invariant'
than A is a common one, I think there are cases (in many previous
emails) where the only real constraint on invariance is really that of
the base derivation model - the input-process execution-output construct
requires the inputs and outputs to be invariant with respect to the
process execution.

 

We could potentially point out that it is always possible, given an A
and a process type that modifies A, to construct new things B and B'
that are IVPof A and that are invariant to the process, i.e. B
represents A-in-state-X and B' is A-in-state-Y. The consequence of this
is that B inherits A's immutable properties and has additional ones,
i.e. this is a consequence of this particular way of defining B and B',
not a general consequence of the IVPof relationship. (Whether these are
useful objects in a common sense sense is not clear - there may be
clearer/better choices, perhaps ones where A is also fixed in ways B is
not, but you can always create this type). This seems more like some
guidance on use than a definition though.
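
A rough sketch of this particular construction, assuming properties can be modelled as plain key/value pairs. The function name and the example properties are hypothetical:

```python
from types import MappingProxyType


def freeze_state(immutable_props, mutable_props, label):
    """Build a read-only snapshot B of A: it inherits A's immutable
    properties and pins the current values of A's mutable ones."""
    snapshot = dict(immutable_props)
    snapshot.update(mutable_props)
    snapshot["state"] = label  # an additional property that A itself lacks
    return MappingProxyType(snapshot)


# A: a document whose title is changed by an 'edit' process.
doc_immutable = {"id": "doc:report", "author": "jim"}

state_x = freeze_state(doc_immutable, {"title": "Draft"}, "X")  # B
state_y = freeze_state(doc_immutable, {"title": "Final"}, "Y")  # B'
```

B and B' can then serve as the input and output of the edit process execution, since neither changes during it.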

 

If it makes more sense to the group, I think one could also frame this
guidance in terms of dimensions - B would be an n-1 dimensional
projection of A when constructed this way. (Again - this is not
generally true from IVPof, just from this particular construction).

 

Does any of this help?

 

Jim

[Reza]
My "world" is not "limited" to one witness, one management system, etc.
If you look at source code management systems that manage billions of
lines of code modified by tens of thousands of developers at big
commercial organizations, I think you'll find the problem is actually
fairly complex.  Anyways, my personal concern is not with source code
management.  I was trying to use it as an example of a very large system
that needs provenance.  So, let me retract my solution and instead say
this:  IVPof is too complicated and generic, IMO, to implement as
documented.  I've read lots of threads on IVPof.
Let me list my concerns/questions right now about IVPof as is posted on
the F2F1 page -

Text Reads:
--------------
Let A and B be two entity states. An assertion "B is an IVP of A"
indicates that, for its asserter, A and B represent the same entity in
the world, and the entity states modelled by A and B are consistent. "B
is an IVP of A" is valid only if, for its asserter, the following holds:


*	the properties they share must have corresponding values 
*	some mutable properties of A correspond to some immutable
properties of B 

B has invariant properties that have no correspondent for A
----------------
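
One possible reading of the three conditions above as a mechanical check. This is only a sketch under strong assumptions: "corresponding" is taken to mean "same property name and equal value", each entity state is a dict with hypothetical "immutable" and "mutable" sub-dicts, and the last sentence of the quoted text is treated as a third required condition:

```python
def is_ivp_of(b, a):
    """Return True if b could be an IVP of a under the assumptions above."""
    a_all = {**a["immutable"], **a["mutable"]}
    b_all = {**b["immutable"], **b["mutable"]}

    # 1. the properties they share must have corresponding values
    shared = a_all.keys() & b_all.keys()
    shared_agree = all(a_all[k] == b_all[k] for k in shared)

    # 2. some mutable properties of A correspond to immutable properties of B
    pinned = a["mutable"].keys() & b["immutable"].keys()

    # 3. B has invariant properties that have no correspondent for A
    extra = b["immutable"].keys() - a_all.keys()

    return shared_agree and bool(pinned) and bool(extra)


document = {"immutable": {"id": "doc1"},
            "mutable": {"content": "v2 text"}}
version2 = {"immutable": {"id": "doc1", "content": "v2 text", "version": 2},
            "mutable": {}}
```

Under this reading `is_ivp_of(version2, document)` holds and the reverse does not; whether name-and-value equality is the intended meaning of "corresponding" is exactly what the text leaves open.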

Concerns:

1.	"Let A and B be two entity states".  States of the same entity?
What determines entity equivalence?  If two entities are the same, does
that mean they are mathematically the same or representationally (like
URI) the same?
2.	"An assertion "B is an IVP of A" indicates that, for its
asserter, A and B represent the same entity in the world, and the entity
states modelled by A and B are consistent."  What does "consistent"
mean?  Can someone provide an exact definition of "consistent", referenced
from the general CS field, that relates to this, for my own education? (As
far as I know, consistency has an exact definition that varies from field
to field.)
3.	""B is an IVP of A" is valid".  What is "valid"?  Is there a
test for validity that's being defined?

[Jim]
OK - what is that notion? Are you proposing an additional mechanism to
be added to the provenance language to cover this beyond your root/ DAG
model? It seems like we should be comparing the full set of things you
think is needed against IVPof and derivation graph.

[Reza]
Your argument on derivation is reasonable.  Again, since my intent is to
expand on Ryan's Identifier and Identifiable, all we would need is a
bridge between Identifiable and Derivation and my requirements would be
met.  Then I don't see the need for IVPof.

[Jim]
BTW: One point of the discussion that you and Ryan may have missed was
the idea of profiles from OPM. We haven't agreed that this should be in
PIL, but there one use of the profile idea was to define a standard way
to map external vocabularies to the provenance world. This was done with
Dublin Core, but I would think could be done with a versioning ontology
as well, or agent ones. The idea was to not reinvent the wheel and avoid
multiple mappings from systems that used those vocabularies and perhaps
even to promote use of those vocabularies with the provenance language
to avoid balkanization while also avoiding limiting the provenance
language to just the use cases and/or communities where those other
ontologies fit. If in this case you're really arguing that there's a big
set of use cases that could be dealt with via something that looks like
versioning that is less general, a profile approach might be what you
should argue for.

[Reza]
I'm personally not a big fan of this.  If you do this, you dilute the
value of the standard.  As you proliferate outside of the core standard
with extensions, profiles, etc., you explode out the amount of code that
has to be written and maintained to deal with it, and at some point it just
loses value.  IMO, a good standard is something like HTML where they
don't get it perfect generically, but they get enough right that they
get a ton of adoption.  I've mentioned the examples several times
already, <p>, <div>, etc.

[Jim]

I agree that it does not have to be a file and could be a painting, but
I don't think anything that you're saying argues against the need for a
mechanism to track the 'conceptual painting' - there are many activities
going on once the painting is conceptualized - it might be commissioned
(money flows) and as you say there are things being bought, there could
be sketches, test paintings (not of the whole scene, not things that
could be called the physical painting). My point is not to argue that
this is somehow the same as the physical painting, but that we need a
system that tracks the provenance of the conceptual thing as well as the
physical thing and gives us a way to connect them.

[Reza]
I think this is a core disagreement and we'll leave it at that.  I
disagree that all the stuff that comes before the physical manifestation
of the thing can be considered part of the thing.  Anything that is a
part of the "thought about the thing" is something completely different
and should be tracked separately.  Every real system I've seen built has
had to deal with this problem at some point (again, examples are content
management systems, source control systems, publishing systems -- you
can't ignore these, they are simply too large of use-cases).  At some
point, you have to define a hard-and-fast rule that says, "Ok, whatever
is before is not related, let's create another instance of an
abstraction".  Otherwise, you end up with a knowledge graph of the
entire universe and good luck trying to reason over that.

[Jim]
My subgraphs are not sub-versions - the content doesn't change in the
approval process. Branches in code control systems are still just one
DAG of things that differ in content. If I own the approval app, I may
write my provenance separately for each file sent to me - I'm going to
treat that file as the root of a DAG that has states of
unapproved/reviewed/approved and a branch for
rejected/appealed/approved, etc. In your content system, you're going to
tell me there's a document that is a root and this file is a leaf, I'll
tell you that each file is a root and has leaves representing different
state (probably not files since my approval app doesn't change content).
I don't think it will make sense to either application to treat that as
one graph - your content system won't want to diff two of my leaves,
etc.
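
To make the two-application picture concrete, here is a small sketch in which each application writes its own graph and a consumer traverses both. The node names, the state labels as graph nodes, and the adjacency-dict representation are all assumptions for illustration:

```python
# Content system: edges between versions that differ in content.
content_graph = {
    "file:v1": ["file:v2"],
    "file:v2": [],
}

# Approval app: the file it was sent is the root; content never changes.
approval_graph = {
    "file:v2": ["file:v2#unapproved"],
    "file:v2#unapproved": ["file:v2#reviewed", "file:v2#rejected"],
    "file:v2#reviewed": ["file:v2#approved"],
    "file:v2#rejected": ["file:v2#appealed"],
    "file:v2#appealed": ["file:v2#approved"],
    "file:v2#approved": [],
}


def reachable(graphs, start):
    """Collect every node reachable from start across all given graphs."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for graph in graphs:
            stack.extend(graph.get(node, []))
    return seen


everything = reachable([content_graph, approval_graph], "file:v1")
content_only = reachable([content_graph], "file:v1")
```

The content system alone never sees the approval states; they only become reachable when the two graphs are joined at the shared node "file:v2".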
 
So again - if we want provenance to allow traversal of a graph like this
(and I think we do given our use cases), your model will have to
include hierarchy of this sort, and then some way to address the
non-hierarchical third case I gave, before your proposal will cover
everything we need and be something we can compare with the IVPof
mechanism.

[Reza]
My proposal was actually just an amendment to Ryan's.  I still don't see
hierarchies as necessary, but, I'm not religious about use of DAG as I
used in my initial response.  If someone has a counter proposal as an
amendment to Ryan's, I'm all for it.  I just find that Identifiable and
Identifier are much better than what's there right now (I think it
successfully resolved the BOB thread), but that something else needs to
be added per comments from others on the beginning of this thread to
keep track of states if Identifiable represents manifestation of
physical object at time t=0.

Perhaps either you, someone else, or chairs can give me the answer to
this question -

Has it been decided that we need to track the stuff related to XXX
before XXX physically exists for this WG?  To me, that's a huge question
you've brought up in my mind.  At least in my small brain, it's kind of
similar to Open-World vs. Closed-World assumptions.

On 7/16/11 12:38 PM, Myers, Jim wrote: 

Reza,
I'm certainly aware of how versioning is done in content systems and, if
your world is limited to one witness, one management system, I think the
world is as simple as you say. (Even then there are some versioning
systems that give an ID to the document (root node) separately from that
of the first version and they may have an ID for 'current version' as
well because they recognize that the root node is not the same type of
thing as the versions.) Regardless, in that case I think the idea of an
IVPof and derivation graph does what you want (as in the last email-we'd
have a 'root' document and versions with IVPof links from doc to
versions and derivation links forming the graph of versions) and the
question is really whether your model can address the other cases.
 
[Jim]
I expect there will be cases where I would claim a document exists
before a version (which has content) exists. 'My project report' has a
due date and other aspects that processes may change before I have any
version.
[Reza]
I believe that the concept of the document existing before it exists
(the meta-data about its formation like your project due date, etc.) is
completely different than the document itself.  From a practical
perspective, in the cases that I know of (take for example, legal
documents, system configurations, etc.) this is captured as a separate
notion and I would hope we would learn from previously built systems.
 
[Jim again] 
OK - what is that notion? Are you proposing an additional mechanism to
be added to the provenance language to cover this beyond your root/ DAG
model? It seems like we should be comparing the full set of things you
think is needed against IVPof and derivation graph.
 
BTW: One point of the discussion that you and Ryan may have missed was
the idea of profiles from OPM. We haven't agreed that this should be in
PIL, but there one use of the profile idea was to define a standard way
to map external vocabularies to the provenance world. This was done with
Dublin Core, but I would think could be done with a versioning ontology
as well, or agent ones. The idea was to not reinvent the wheel and avoid
multiple mappings from systems that used those vocabularies and perhaps
even to promote use of those vocabularies with the provenance language
to avoid balkanization while also avoiding limiting the provenance
language to just the use cases and/or communities where those other
ontologies fit. If in this case you're really arguing that there's a big
set of use cases that could be dealt with via something that looks like
versioning that is less general, a profile approach might be what you
should argue for.
 
 
[Jim]
 
You could claim that the document is still just a 0 length version to
cover this, but I think when we say version, we're really trying to
point at files (for example - it could be paper instead). That suggests
to me that document is not really a root node as much as a different
type of entity that at some point is associated with another type -
files with 0 or more bytes in them representing the document content. I
might need to describe how the document was created and aspects of it
were changed or decided by processes (purpose, size, due date, scope,
audience) before, at some point, someone creates an empty file that is
now considered to be the document for editing purposes. I'm not sure how
your DAG formulation would handle this.
 
[Reza]
It's not necessary for it to be a file.  When Monet painted Garden Path,
there was an inception.  He may have thought of the painting before, had
a deadline for himself, bought paint, whatever.  But there was no
painting before there was a first stroke on a canvas, after which you
can call it a painting (even if it's incomplete).  Once there was a
painting, there was some evolutionary process to completion of the first
version, there were copies, there were modifications to the copies, etc.
Root node is the painting Garden Path.  The actual thing.  Garden Path
is Identifiable as Ryan calls it and can have an Identifier that points
to the 0 length version.  That's the original.  The file representing
that can then be an Identifier.  What you call "0 length version" is
actually the Identifier.  The root node is the Identifiable.
 
[Jim again] 
I agree that it does not have to be a file and could be a painting, but
I don't think anything that you're saying argues against the need for a
mechanism to track the 'conceptual painting' - there are many activities
going on once the painting is conceptualized - it might be commissioned
(money flows) and as you say there are things being bought, there could
be sketches, test paintings (not of the whole scene, not things that
could be called the physical painting). My point is not to argue that
this is somehow the same as the physical painting, but that we need a
system that tracks the provenance of the conceptual thing as well as the
physical thing and gives us a way to connect them.
 
[Jim]
 
I could also think of versions having their own subgraphs - I might
create a version which then undergoes some approval process after which
it becomes public (published perhaps). This example is meant to convey
that any 'state-like' thing - a version that looks like a state of a
document, may itself look like something that has more internal state
that we then have to reapply our mechanisms to. I think that's how your
DAG model would have to become hierarchical - doc/version is one level,
version/unapproved-approved-published version is another. I agree it's
more complex than hierarchy since you have the time/processing
chains/DAGs off of each thing as well, but I don't see how your model
avoids hierarchy if you allow multiple levels of statefulness and allow
each level to have independent provenance (i.e. the approval/publish
processes that version1 goes through are not necessarily part of the
history of version2 (which could have just come from the unapproved
version1)).
 
[Reza]
I don't see how this causes an issue.  You can take a DAG and create
subgraphs.  There is nothing that keeps you from creating subgraphs of
the supergraph.  Again, this is commonplace in source code control
systems.  Many people can modify the same file at the same time, create
new files, etc.  This is not central to the argument.
 
[Jim again]
 
My subgraphs are not sub-versions - the content doesn't change in the
approval process. Branches in code control systems are still just one
DAG of things that differ in content. If I own the approval app, I may
write my provenance separately for each file sent to me - I'm going to
treat that file as the root of a DAG that has states of
unapproved/reviewed/approved and a branch for
rejected/appealed/approved, etc. In your content system, you're going to
tell me there's a document that is a root and this file is a leaf, I'll
tell you that each file is a root and has leaves representing different
state (probably not files since my approval app doesn't change content).
I don't think it will make sense to either application to treat that as
one graph - your content system won't want to diff two of my leaves,
etc.
 
So again - if we want provenance to allow traversal of a graph like this
(and I think we do given our use cases), your model will have to
include hierarchy of this sort, and then some way to address the
non-hierarchical third case I gave, before your proposal will cover
everything we need and be something we can compare with the IVPof
mechanism.
 
 Jim
 
 
On 7/16/11 7:38 AM, Myers, Jim wrote:

	Reza - I think what you're talking about is a combination of the
IVPof and the core inputs-process execution - outputs model (the
OPM-like core that fits the immutable thing case). The latter is ~agreed
to and just hasn't been talked about lately. If I understand your DAG, I
think I would say that's a document that first appears through some
'creative' process with a first version that is an IVPof the document.
The first version then goes through a series of 'edit' process
executions to create a DAG of future versions. (We've talked about a
'derived from' link that would be a direct connection between the
versions versus always linking through a process execution, though I
think there's still some discussion as to whether 'derived from' can be
inferred - that may be the more direct analog of the link in your DAG).
In any case, each future version is also an IVPof the document - not
sure if those links are in your DAG formulation or not.
	 
	So one question is: are we talking about the same things yet?
	 
	If so, I think there are some use cases/issues that make it hard
to think of this as just a mutable thing and DAG of states (or first
state is the mutable thing also) rather than a more general
processing/derivation DAG along with a separate IVPof mechanism:
	 
	I expect there will be cases where I would claim a document
exists before a version (which has content) exists. 'My project report'
has a due date and other aspects that processes may change before I have
any version.
	 
	You could claim that the document is still just a 0 length
version to cover this, but I think when we say version, we're really
trying to point at files (for example - it could be paper instead). That
suggests to me that document is not really a root node as much as a
different type of entity that at some point is associated with another
type - files with 0 or more bytes in them representing the document
content. I might need to describe how the document was created and
aspects of it were changed or decided by processes (purpose, size, due
date, scope, audience) before, at some point, someone creates an empty
file that is now considered to be the document for editing purposes. I'm
not sure how your DAG formulation would handle this.
	 
	I could also think of one more edit to a file where the title is
changed and we would now consider it to represent a different document -
is that part of the same DAG? It seems cleaner to me to assert IVPof
relationships for the versions/files that correspond to one document and
to just assert that the next file in the processing chain is an IVPof a
different document when that's true. Again, I'm not sure how that would
look in the DAG formalization.
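
A small sketch of the alternative described here, with IVPof and derivation kept as two separate kinds of link. The triple layout, the predicate spellings, and the file/document names are hypothetical:

```python
IVP_OF = "IVPof"
DERIVED_FROM = "derivedFrom"

triples = [
    ("file:v1", IVP_OF, "doc:report"),
    ("file:v2", IVP_OF, "doc:report"),
    ("file:v3", IVP_OF, "doc:proposal"),   # title changed: a different document
    ("file:v2", DERIVED_FROM, "file:v1"),
    ("file:v3", DERIVED_FROM, "file:v2"),  # still one processing chain
]


def versions_of(doc, triples):
    """Files asserted to be an IVPof the given document."""
    return sorted(s for s, p, o in triples if p == IVP_OF and o == doc)


def lineage(version, triples):
    """Follow derivation links back to the first file.
    Assumes each file has at most one parent and the links are acyclic."""
    parents = {s: o for s, p, o in triples if p == DERIVED_FROM}
    chain = [version]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain
```

Here `lineage("file:v3", triples)` still walks back through the report's files, while `versions_of("doc:report", triples)` stops at file:v2, so the change of document shows up only in the IVPof links.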
	 
	I could also think of versions having their own subgraphs - I
might create a version which then undergoes some approval process after
which it becomes public (published perhaps). This example is meant to
convey that any 'state-like' thing - a version that looks like a state
of a document, may itself look like something that has more internal
state that we then have to reapply our mechanisms to. I think that's how
your DAG model would have to become hierarchical - doc/version is one
level, version/unapproved-approved-published version is another. I agree
it's more complex than hierarchy since you have the time/processing
chains/DAGs off of each thing as well, but I don't see how your model
avoids hierarchy if you allow multiple levels of statefulness and allow
each level to have independent provenance (i.e. the approval/publish
processes that version1 goes through are not necessarily part of the
history of version2 (which could have just come from the unapproved
version1)).
	 
	Does that make sense?
	 
	  Jim
	 
	 
	-----Original Message-----
	From: public-prov-wg-request@w3.org on behalf of Reza B'Far
	Sent: Sat 7/16/2011 2:37 AM
	To: public-prov-wg@w3.org
	Cc: public-prov-wg@w3.org
	Subject: Re: simon:entity (or Identifiable)
	 
	Jim -
	 
	I don't disagree with anything you're saying.  I think that I didn't state my
	point well.  Let me see if I can clarify and if you still think that this is
	something that has been deemed outside of the scope.  To align with your email,
	I'll use your statement:
	 
	We're debating:
	   how to define this relationship
	   whether the document and its versions are the same type/class in the model
	 
	 
	What I'm suggesting is to augment Ryan's model so that:
	 
	  1. The very first version of a document is defined by an Identifier and is
	     Identifiable.
	  2. The DAG that I mentioned is a graph of "state" relationships where each
	     state is a node.  It's directed because time is different than any other
	     dimension, you can only move forward -- well, for practical purposes.  And
	     it can't loop on itself -- once you modify something, it's modified, you
	     can't undo it with respect to time.  It's a graph because multiple versions
	     can be made of the same source without needing to merge, but merging is also
	     possible.
	  3. It's a DAG that has only one root because there is an inception point for
	     any Bob/simon:entity/whatever.  It's created at some point and that very
	     first version at the creation time is different than all the other future
	     versions.  The atoms that made that thing didn't have the semantic meaning
	     as a collection before it was made.
	 
	Based on this, I'm proposing that a document and its versions are the same AFTER
	the inception point.  But that there is a unique concept at the root node which
	is the thing at the inception point.  So, if you take Ryan's example,
	Identifiable and Identifier define the entity at the inception point.  But the
	graph itself is not a concept that I see anywhere, neither the nodes in the
	graph which are the states of the entity as delta changes to the previous state,
	linearly lined up in time.  I can't tell if IVPof represents the edges in the
	graph, but I think it does per everything I've read on the wiki so far... but am
	unsure.
	 
	There is no hierarchy in what I'm outlining above.  Only capturing temporal
	behavior and saying that temporal behavior is different from all the other
	dimensions since it gives rise to the notion of state and that it should be
	captured uniquely.
	 
	Example
	-----------
	 
	(Legal Contract [Identifiable] at Inception Time) --->  (Modification 1 [State]) ---->  (Modification 2 [State]) --->  ....
	|
	|-->  (Modification 3 [State]) ----->  (Modification 4 [State]) ---->.....
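
The example above can be sketched as a successor map with a check for the two structural claims (one root, no cycles). This is an illustration only: the node names are made up, the second branch is assumed to start at the inception node, and the check uses Python's standard graphlib module (Python 3.9+):

```python
from graphlib import CycleError, TopologicalSorter

# node -> states that directly follow it in time
states = {
    "contract@inception": ["mod1", "mod3"],
    "mod1": ["mod2"],
    "mod2": [],
    "mod3": ["mod4"],
    "mod4": [],
}


def is_single_root_dag(successors):
    """True if the graph is acyclic and exactly one node has no parent."""
    nodes = set(successors)
    with_parent = set()
    for following in successors.values():
        nodes.update(following)
        with_parent.update(following)
    try:
        # TopologicalSorter expects node -> predecessors; reversing the edge
        # direction does not affect whether a cycle exists.
        tuple(TopologicalSorter(successors).static_order())
    except CycleError:
        return False
    return len(nodes - with_parent) == 1


ok = is_single_root_dag(states)
```

A merge would simply be a node listed as a successor of two different states; the check still passes as long as the inception node remains the only parentless one.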
	 
	Regards.
	 
	On 7/15/11 4:50 PM, Myers, Jim wrote:

		This is going in the direction of a hierarchy of
'states' of an identifier? If so - I don't think we have a hierarchy. If
not, then I'm not sure what the DAG represents.
		 
		I remember Graham making a comment at one point about
trying to write a page that talked more about the purpose of the model
(as he wrote for access) - I wonder if that would help. Here's my
attempt to describe the requirements and where we agree/disagree in this
style. (My take could be wrong but perhaps we would make progress by
identifying if we disagree on requirements or where some are debating
something that others consider resolved. If so, perhaps trying to modify
the text below would help before we dive back to specific points).
		 
		In the following, I intend only the English meanings of
words unless otherwise noted.
		 
		There's a set of things we've agreed to/ignored for a
while related to the basic 'inputs - process execution -outputs' where
the purpose of the model is to describe the history in cases where
inputs and outputs are clear and the effects of a process execution are
captured by the set of inputs and outputs (i.e. the process execution
can't just change an input).
		 
		We also want to be able to model cases where the process
execution does change something versus just using input and generating
outputs. A document with versions is one example. In that case we're
making the choice to model both the document and its versions and are
adding a relationship (IVPof) between them to signify that the object we
consider to be changing could instead be thought of as distinct objects
(document with content1 and document with content2) that can be handled
by the base input-process execution-output model.
		 
		We're debating:
		   how to define this relationship
		   whether the document and its versions are the same type/class in the model
		 
		We also expect the model to cover a third case - where
we have two different things - e.g. a document and a file - that may
both have provenance, but at some point have a correspondence - the file
bytes represent the document's content. This case causes problems for
IVPof definitions that involve hierarchy since one can't really consider
either a document or a file to be more stateful versions of the other.
		 
		This again leads to debate about the definition of
IVPof. So far formulation of this concept has been attempted in terms of
properties and dimensions as well as in terms of 'perspective relative
to processes'. Some of the debate here has been when these definitions
start to include hierarchy (thus not fitting the third use case), but it
may be possible to formulate all three in ways that don't require
hierarchy.
		 
		This last use case also makes it harder to see a
difference in the types of thing like document and version. In
particular, if we can imagine more than one level for the second use
case (e.g. document-version-encodedVersion), or think about the third
case with no hierarchy, a two class system of thing and thing-state does
not appear workable.
		 
		Another issue that has arisen in the discussions is how
to refer to things outside the model. We have several reasons we want to
do this -
		    to allow discovery of things with provenance using
descriptive metadata/behavior/other context outside the model
		    to aid in the definition of IVPof, where multiple
hierarchies ala TBL and the third, non-hierarchical use case make it
hard not to talk about something 'real' that both things involved in an
IVPof relationship are describing/representing.
		 
		Throughout we have trouble with nomenclature
thing/entity/stuff/etc., describe/represent/view of/etc. which helps
obscure when we do/don't agree.
		 
		We(I anyway) may be confusing what the model contains
versus how the model will be implemented (in RDF or in other languages
we think in).
		 
		I don't know that this is complete, but perhaps I can
stop and ask whether this is already controversial or if it captures
some of the nature of our debates?
		 
		   Jim
		 
		-----Original Message-----
		From: public-prov-wg-request@w3.org on behalf of Reza B'Far
		Sent: Fri 7/15/2011 2:22 PM
		To: public-prov-wg@w3.org
		Subject: Re: simon:entity (or Identifiable)
		 
		Folks -
		 
		I realize that the "R" word has been banned and am fine with that.  Here is a
		suggestion for reconciliation of proposals/suggestions by Ryan, Jim(s), and Luc -
		 
		   1. That we specify that Identifier is some "base-line" temporally identified as
		      zero point (there exists no entity to be identified before this point).
		   2. That we have a new concept that encapsulates a single "state" (sorry, I know
		      that's another dangerous word) of identifier from that point on.  I don't
		      want to give it a name so I'll call it set S{}.
		   3. An Identifier can have a DAG (Directed Acyclic Graph) of S{} nodes where the
		      DAG has a single root node and that root node has equivalence with the
		      identifier itself.
		 
		Just trying to reconcile at this point.
		 
		 
		On 7/15/11 10:46 AM, Jim McCusker wrote:

			On Fri, Jul 15, 2011 at 12:06 PM, Myers, Jim <MYERSJ4@rpi.edu> wrote:

				Being able to describe what the entity "looks like" at the time the
				provenance was recorded.
				 
				My understanding was that a BOB was something like a named graph, graph
				literal (http://webr3.org/blog/semantic-web/rdf-named-graphs-vs-graph-literals/),
				or information artifact similar to iao:Dataset. The Bob would then have
				content that described, in some way, the entity in question.
				Hence the Bob being a description of an entity's state.

				Do you distinguish 'description of an entity' from 'description of an
				entity's state'? I get the sense that you are not using state in the
				same sense of 'a more stateful view of' that is driving the discussion
				of entity versus entity-state in the IVPof debates.

			Any description of an entity will occur with an entity in a particular
			state, and so the two are the same.
			 

				If it is possible to know, there should be assertions on the BOB itself that say
				which entity the BOB is describing. Ideally, this is a URI of something that's
				referenced within the BOB.

				I'm hoping someone will chime in on this - I agree we need to connect
				the idea of a bob with the entity, but I could see implementing that as
				a link (as you say) or by saying that my entity's class is a subtype of
				Bob (hence there's only one URL for the Bob and the entity).

			But that's clearly wrong, since Bobs only describe the state of an
			entity at one point/span of time and context. If the same entity is
			observed again, and a new Bob is created that describes the state
			differently, then there's nothing to tie it down. I'm guessing that by
			saying there is no referable entity outside of the Bob, then you can
			just make Bobs all the way down. But there would be no grounding to
			non-provenance resources in this case.
			 
			The Bob is the description of something based on its state, the Entity
			is that something. A description of a thing is not the thing itself.
			Within the context of information systems, one can say that
			http://tw.rpi.edu/instances/JamesMcCusker is me. If you were to
			download the RDF from that URL that would contain a description of me
			within the context of RPI. The graph literal behind
			http://tw.rpi.edu/instances/JamesMcCusker is one description (that can
			change over time), and can be given an identifier using a graph digest
			[1], guaranteeing that we always talk about the same graph. But that
			graph is not me, even though the URI that returns it stands in for me
			in the semantic web.
			 
			[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1.2187&rep=rep1&type=pdf
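
As a rough illustration of giving a graph a content-based identifier, here is a naive digest over sorted triples. It is not the algorithm from [1]: it assumes the graph has no blank nodes and that terms are already serialized consistently, and the example triples are made up for illustration:

```python
import hashlib


def naive_graph_digest(triples):
    """Hash a set of (subject, predicate, object) strings.
    Sorting makes the result independent of statement order."""
    lines = sorted(f"{s} {p} {o} ." for s, p, o in set(triples))
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


ME = "<http://tw.rpi.edu/instances/JamesMcCusker>"
NAME = "<http://xmlns.com/foaf/0.1/name>"
TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
PERSON = "<http://xmlns.com/foaf/0.1/Person>"

description = [(ME, NAME, '"Jim McCusker"'), (ME, TYPE, PERSON)]
digest = naive_graph_digest(description)
```

The digest identifies one particular description; the URI inside the triples still stands for the person, so the two identifiers refer to different things.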
			 
			Jim
			--
			Jim McCusker
			Programmer Analyst
			Krauthammer Lab, Pathology Informatics
			Yale School of Medicine
			james.mccusker@yale.edu | (203) 785-6330
			http://krauthammerlab.med.yale.edu
			 
			PhD Student
			Tetherless World Constellation
			Rensselaer Polytechnic Institute
			mccusj@cs.rpi.edu
			http://tw.rpi.edu
Received on Tuesday, 19 July 2011 02:49:33 UTC