Re: Arguments against digest URIs

Jonas Liljegren wrote:
> [...]
>
> Now. If you could only give an explicit URL for the statement, all
> would be great. I would like to be able to write something like this:
> 
>   <rdf:RDF>
>     <rdf:Description about="http://www.w3.org/Home/Lassila">
>       <s:Creator rdf:StatementID="48">Ora Lassila</s:Creator>
>     </rdf:Description>
>   </rdf:RDF>

Explicit statement IDs are indeed a convenient option for a
human-readable RDF serialization (the parser could for example map them
to digest-based IDs if necessary). On the other hand, an RDF editor
would already know the digest of the statement one wants to refer to and
thus explicit statement IDs would be unnecessary.

> > (1) We need to refer to anonymous resources used by other people (or the
> > things that they represent).
> 
> You could argue that it is up to the service (application) to
> explicitly state the URIs for all resources that others would have an
> interest in referring to.

However, the service might not anticipate the need correctly. Everything
should be identifiable.

> This means that you would still have to either copy the whole model or
> have extra metadata pointing to a location on the web.

Correct.

> And that would make the digest URI redundant.

Without them a reference to an anonymous resource could be long and
costly to evaluate at runtime.

> Would that be a digest of the first or the second RDF example above?

Using the current API [1] it would be
d248eeaa0a21f36a1a923a30fa9d8aa2302bda10 or
"uri:SHA-1-d248eeaa0a21f36a1a923a30fa9d8aa2302bda10" as a URI.

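To make the form of such a URI concrete, here is a minimal Python sketch.
It does not reproduce the digest algorithm of the API in [1]: the way the
triple is serialized before hashing (length-prefixed UTF-8 parts) and the
function names are assumptions of this sketch, so it will not yield the
digest quoted above.

```python
import hashlib


def statement_digest(subject: str, predicate: str, obj: str) -> str:
    """Return the hex SHA-1 digest of a (subject, predicate, object) triple.

    Assumption: each part is UTF-8 encoded and prefixed with its length,
    so that two different triples never serialize to the same bytes.
    """
    h = hashlib.sha1()
    for part in (subject, predicate, obj):
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()


def digest_uri(subject: str, predicate: str, obj: str) -> str:
    """Wrap the digest in the "uri:SHA-1-..." form used in this message."""
    return "uri:SHA-1-" + statement_digest(subject, predicate, obj)


uri = digest_uri(
    "http://some.org/Earth",
    "http://some.org/shape",
    "http://some.org/Flat",
)
print(uri)
```

The same triple always yields the same URI, regardless of which model or
document it was read from.
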
> > Legal issues are out of scope. For most other practical purposes,
> > 160-bit (or X-bit) hash seems to be a good approximation.
> 
> It still makes me feel unsatisfied. Why would you accept errors in
> some cases?
> ...
> In existing cases, digests are always used as a checksum. You already
> know that two documents are supposed to be equivalent, but want to
> make sure. You already know what user is trying to log in, but want to
> check with an extra password string.  In all cases, the digest is a
> complement to the unique identifier. It's not the identifier in
> itself.

Digests are also used for digital signatures. Signing a digest with a
private key is significantly less computationally expensive than signing
the whole content. However, I'm not aware of the legal status of
digital signatures.

> > >  The nature of the statement
> > >  ---------------------------
> > >
> > > In a reification of a statement, every reification should be handled
> > > separately, as separate events. They have properties like source,
> > > time, probability and context of statement. Even if the statement in
> > > itself would have a unique URI, there would have to be separate URIs
> > > for every stating event.
> >
> > I disagree with that. See also
> > http://lists.w3.org/Archives/Public/www-rdf-interest/1999Dec/0070.html
> 
> I think that this post confirms what I said. Let me clarify:
> 
> Let's say that there are two persons stating that the earth is
> flat. Let's say that the two statings are described in two different
> models. (This model is not the same as the example in the post referred
> to above):
> 
> No 1:
> 
>   <rdf:RDF
>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>     xmlns:a="http://description.org/schema/">
>     <rdf:Description>
>       <rdf:subject resource="http://some.org/Earth" />
>       <rdf:predicate resource="http://some.org/shape" />
>       <rdf:object resource="http://some.org/Flat" />
>       <rdf:type
> resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement" />
>       <a:statedBy>Fred</a:statedBy>
>       <a:statedOn>20000227T1507</a:statedOn>
>     </rdf:Description>
>   </rdf:RDF>
> 
> No 2:
> 
>   <rdf:RDF
>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>     xmlns:a="http://description.org/schema/">
>     <rdf:Description>
>       <rdf:subject resource="http://some.org/Earth" />
>       <rdf:predicate resource="http://some.org/shape" />
>       <rdf:object resource="http://some.org/Flat" />
>       <rdf:type
> resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement" />
>       <a:statedBy>Tom</a:statedBy>
>       <a:statedOn>19991230T1905</a:statedOn>
>     </rdf:Description>
>   </rdf:RDF>
> 
> This has been discussed before. The question is whether the two statements
> have the same URI or not. Is it the same reified statement?
> 
> Think about this. It WOULD get the same URI if you were to break out
> the a:statedBy and a:statedOn, and create a digest URI from it:
> 
> No 3:
> 
>   <rdf:RDF
>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>     xmlns:a="http://description.org/schema/">
>     <rdf:Description>
>       <rdf:subject resource="http://some.org/Earth" />
>       <rdf:predicate resource="http://some.org/shape" />
>       <rdf:object resource="http://some.org/Flat" />
>       <rdf:type
> resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement" />
>     </rdf:Description>
>     <rdf:Description about="calculatedDigestURI">
>       <a:statedBy>Tom</a:statedBy>
>       <a:statedOn>19991230T1905</a:statedOn>
>     </rdf:Description>
>   </rdf:RDF>
> 
> But if you have a unique URI for the reified statement, you would mix
> up the properties for it. It would be like this:
> 
> No 4:
> 
>     <rdf:Description about="calculatedDigestURI">
>       <a:statedBy>Fred</a:statedBy>
>       <a:statedOn>20000227T1507</a:statedOn>
>       <a:statedBy>Tom</a:statedBy>
>       <a:statedOn>19991230T1905</a:statedOn>
>     </rdf:Description>
> 
> Now: Who made what statement on what date? You can't tell anymore.
>
> There are (as pointed out in the previous posts on this topic) two
> solutions for this:
> 
> 1. Let the reified statements have individual URIs.
> 
> 2. Create a statement resource pointing to the global URI representing
>    the reified statement.

Another point for you. However, this is a general problem with
superimposing RDF models. There is nothing specific to digests here.
It's a modeling issue. For example, consider the following two
descriptions:

<rdf:Description about="offendingPicture.jpg">
  <a:rating>0</a:rating>
  <a:justification>Offending nudity</a:justification>
</rdf:Description>

and

<rdf:Description about="offendingPicture.jpg">
  <a:rating>100</a:rating>
  <a:justification>That sucks!</a:justification>
</rdf:Description>

If you throw both descriptions into one model, you cannot distinguish
which justification belongs to which rating. An integration application
that wants to preserve this information has to record the context (e.g.
origin) of each statement it stores. Thus, if the first description has
statements S1 and S2, and the second S3 and S4 the application would add
to the model e.g.:

S1 --origin--> URL1
S2 --origin--> URL1
S3 --origin--> URL2
S4 --origin--> URL2

This makes the statements distinguishable. The application does not
necessarily have to store these 4 additional statements however. It can
use flags, fields in the db etc. to optimize storage efficiency.
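
A rough Python sketch of this bookkeeping, using the rating example
above. The Model class, its method names and the origin labels "URL1"
and "URL2" are hypothetical, chosen only for illustration; a real store
could keep the origin in a flag or a db column instead.

```python
from collections import defaultdict


class Model:
    """A set of triples that remembers where each triple came from."""

    def __init__(self):
        # triple -> set of origins (hypothetical URLs of the source models)
        self.origins = defaultdict(set)

    def add(self, triple, origin):
        """Store a (subject, predicate, object) triple with its origin."""
        self.origins[triple].add(origin)

    def triples_from(self, origin):
        """Return all triples that were stated in the given origin."""
        return {t for t, srcs in self.origins.items() if origin in srcs}

    def justifications_for(self, picture, rating):
        """Return justifications stated in the same origin as the rating."""
        found = set()
        for origin in self.origins.get((picture, "a:rating", rating), ()):
            for s, p, o in self.triples_from(origin):
                if s == picture and p == "a:justification":
                    found.add(o)
        return found


PIC = "offendingPicture.jpg"
model = Model()
model.add((PIC, "a:rating", "0"), "URL1")                        # S1
model.add((PIC, "a:justification", "Offending nudity"), "URL1")  # S2
model.add((PIC, "a:rating", "100"), "URL2")                      # S3
model.add((PIC, "a:justification", "That sucks!"), "URL2")       # S4

print(model.justifications_for(PIC, "0"))
print(model.justifications_for(PIC, "100"))
```

Although all four statements share one subject in the merged model, the
recorded origin is enough to pair each rating with its justification.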

This is solution 3, the one I'd vote for.

> I would like this (no5) way to handle reified statements. The a:stating
> would represent the stating event. This would let the application use
> a globally unique URI for the reified statement.

You are right in that if you want to capture individual "stating events"
you have to use some higher-level RDF construct.
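
As a small illustration of the idea (in Python rather than RDF), each
stating event can be kept as its own record that points at the one
shared URI of the reified statement. The StatingEvent type, its field
names and the placeholder URI below are assumptions of this sketch, not
part of any RDF vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatingEvent:
    statement_uri: str  # shared URI of the reified statement
    stated_by: str
    stated_on: str


# Placeholder, not a real digest: stands for the URI of the
# Earth/shape/Flat statement.
STATEMENT = "calculatedDigestURI"

events = [
    StatingEvent(STATEMENT, "Fred", "20000227T1507"),
    StatingEvent(STATEMENT, "Tom", "19991230T1905"),
]


def who_stated_on(events, statement_uri, date):
    """Return who made the given statement on the given date."""
    return [
        e.stated_by
        for e in events
        if e.statement_uri == statement_uri and e.stated_on == date
    ]


print(who_stated_on(events, STATEMENT, "19991230T1905"))
```

Because the author and the date stay together in one event, the question
"who made what statement on what date" remains answerable even though
both events refer to the same statement URI.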

> I was thinking about its usefulness for the internal workings of my
> DB-based Perl modules...

This is how I originally came to the digests: I needed an internal
representation. And my guess is that many developers will face the same
task.

Thanks for your comments! I think that doing real implementation work
helps a lot in interpreting the standard, the model and in converging to
a common understanding.

Best,
Sergey


[1] http://www-db.stanford.edu/~melnik/rdf/api.html

Received on Monday, 28 February 2000 17:47:33 UTC