uri note comments from Peter Ansell on 2007-12-07 (public-semweb-lifesci@w3.org from December 2007)

From: Peter Ansell <ansell.peter@gmail.com>
Date: Fri, 7 Dec 2007 11:30:03 +1000
To: "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
Cc: p.roe@qut.edu.au, j.hogan@qut.edu.au
Message-ID: <a1be7e0e0712061730x6663eeedhc775b00f29e41eab@mail.gmail.com>

I am reviewing revision 40:

http://sw.neurocommons.org/2007/uri-note/40/

The current points of confusion essentially relate to inconsistencies
in the way URIs are created and, as the note states, the main factor in
keeping things alive and well is that servers stay up with a
consistent set of URIs. This is fine, I think.

The majority of the note, however, deals with an original system for
getting to the information resources described by URIs, which in the
context of HCLS will ultimately have a representable relationship to
real-world entities.

"If policy statements are absent, the usage spec should be assumed to
be stable (the meaning won't change) and durable (meaningful into the
indefinite future)." How exact does the usage spec have to be at this
stage? Can there be lastModified information in the usage-spec which
changes based on data which is not in the small usage-spec? How do you
direct someone to changes which are in other information? Do they then
have to resolve it using the same process (total 4 HTTP queries
minimum as opposed to 1 if it were all in one document located at the
same location as the original query)?
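
To make the request counting concrete, here is a minimal sketch of one
possible reading of the "4 versus 1" comparison above. The in-memory
web, every example.org URL, and the resolve() helper are hypothetical
and are not taken from the note; the point is only to count requests
when each lookup costs a 303 plus a 200.

```python
# Toy in-memory "web": every URL and response here is hypothetical.
FAKE_WEB = {
    "http://example.org/thing/1": (303, "http://example.org/usage-spec/1"),
    "http://example.org/usage-spec/1": (200, "usage spec; see http://example.org/meta/1"),
    "http://example.org/meta/1": (303, "http://example.org/meta-doc/1"),
    "http://example.org/meta-doc/1": (200, "full metadata, lastModified=2007-12-01"),
    "http://example.org/all-in-one/1": (200, "usage spec and metadata together"),
}


def resolve(url, web=FAKE_WEB, max_hops=5):
    """Follow 303 redirects; return (body, number_of_requests)."""
    requests = 0
    for _ in range(max_hops):
        status, payload = web[url]
        requests += 1
        if status == 303:
            url = payload  # payload plays the role of the Location header
            continue
        return payload, requests
    raise RuntimeError("too many redirects")


# Split layout: resolve the term, then resolve the pointer to further metadata.
_, first = resolve("http://example.org/thing/1")
_, second = resolve("http://example.org/meta/1")
split_total = first + second

# Single-document layout: one request returns everything.
_, single_total = resolve("http://example.org/all-in-one/1")

print(split_total, single_total)  # 4 1
```

Under these assumptions the split layout costs two requests per
lookup, so following a single pointer to "changes elsewhere" already
brings the total to four.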

According to RFC 2616, the 303 redirect was "intended" primarily as an
automatic redirect that lets the output of a POST-activated script
(such as a form handler) send the user agent on to another resource.
RFC 2616 also says that a 303 response must not be cached. Does this
mean you always have to dereference this type of resource, if only to
make sure that the real reference is exactly the same as it was last
time? Do you have to perform a GET on it with a cache-time to make
sure? Considering that a 307 may be cached when Cache-Control or
Expires headers allow it, and so is more appropriate for caching, why
was it not chosen?
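
As a rough illustration of the caching difference, the sketch below
encodes my reading of RFC 2616 sections 10.3.2 to 10.3.8 for redirect
responses. The function name is made up, and the header handling is a
deliberate simplification (it only looks for a few Cache-Control
directives), so treat it as an aid to the argument rather than a
conforming cache.

```python
def redirect_cacheable_rfc2616(status, headers):
    """Roughly: may a cache store this redirect response under RFC 2616?"""
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    cache_control = lowered.get("cache-control", "")
    if "no-store" in cache_control:
        return False
    if status == 303:
        # Section 10.3.4: the 303 response itself must not be cached.
        return False
    if status == 301:
        # Section 10.3.2: cacheable unless indicated otherwise.
        return True
    if status in (302, 307):
        # Sections 10.3.3 and 10.3.8: only cacheable if Cache-Control or
        # Expires says so. Simplified check for a few common directives.
        allows = any(d in cache_control for d in ("max-age", "s-maxage", "public"))
        return allows or "expires" in lowered
    raise ValueError("not a redirect status handled by this sketch")


one_hour = {"Cache-Control": "max-age=3600"}
print(redirect_cacheable_rfc2616(303, one_hour))  # False
print(redirect_cacheable_rfc2616(307, one_hour))  # True
print(redirect_cacheable_rfc2616(307, {}))        # False
```

Even with explicit freshness information the 303 stays uncacheable in
this reading, while the 307 can be stored for the stated lifetime.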

Returning a 303 redirect for the URI that is used in other documents
explicitly makes it necessary to have at least two URIs to describe
each object. Does the public need this level of assurance? Do
programmers wish to have to use double queries for each of their
objects? If the URI is going to go down, it is as likely to go down on
the first entity as on the second, and with no caching, if the first
goes down you have no chance of getting to the second through a cached
copy, because the restriction on 303 response caching means you can't
know its real resolvable address a priori. It is not reasonable, in my
opinion, to expect a system to cache the 303 redirect document in such
a way as to respond with a document when someone queries a 303 URL
which is currently inaccessible.
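
The failure mode can be sketched with a toy client cache. Everything
here is hypothetical (the URLs, the fetch() and lookup() helpers, and
the cache_redirects switch, which stands in for a cache that ignores
the 303 restriction); it only shows that a cached copy of the
description is unreachable once the 303 hop is gone.

```python
class Unreachable(Exception):
    """Raised when a server in the toy web is down."""


def fetch(url, web, down, cache, cache_redirects):
    if url in cache:
        return cache[url]
    if url in down:
        raise Unreachable(url)
    status, payload = web[url]
    # Store final documents always; store 303s only if told to break the rule.
    if status == 200 or (status == 303 and cache_redirects):
        cache[url] = (status, payload)
    return status, payload


def lookup(term, web, down, cache, cache_redirects=False):
    """Resolve a term URI to its description, following one 303."""
    status, payload = fetch(term, web, down, cache, cache_redirects)
    if status == 303:
        status, payload = fetch(payload, web, down, cache, cache_redirects)
    return payload


# Hypothetical URLs, for illustration only.
TERM = "http://example.org/term/1"
DOC = "http://example.org/doc/1"
WEB = {TERM: (303, DOC), DOC: (200, "description of term 1")}


def survives_outage(cache_redirects):
    cache = {}
    lookup(TERM, WEB, set(), cache, cache_redirects)  # warm the cache
    try:
        lookup(TERM, WEB, {TERM, DOC}, cache, cache_redirects)
        return True
    except Unreachable:
        return False


print(survives_outage(False))  # False: the description is cached but unfindable
print(survives_outage(True))   # True: only if the 303 itself was cached
```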

The usage-spec concept also has no special significance to me as is.
Why should the document author be forced to define a subset of their
knowledge and put it in one place and then put other information,
which may be viewed as just as important, somewhere else? Why is the
type attribute not an accurate usage-spec on its own? It can easily be
found after parsing the entire metadata document, and once you have
the whole document you can also validate it, to say once and for all
whether the type does in fact fit the document.

What is a use case in which a document may currently be changed so that
its type is no longer accurate enough to serve as a plain reference
identifying the URI uniquely as belonging to a certain set?

Overall I don't get the small RDF usage-spec document, or what makes
any of the designers think that people will go to the extra effort of
defining a type for the usage-spec and a type for the rest of the
document so that people can validate both reliably. Is there really
any difference between usage-spec and metadata that will affect people
attempting to use URIs to reference specific objects? And can someone
define a case study which provides evidence for the advantages
provided by the usage-spec and redirection resolution rules, in terms
that a researcher and their supporting IT specialists will agree with?

In regards to http://www.w3.org/2001/tag/issues.html#httpRange-14 , I
totally disagree with this usage from a physical sciences and medicine
point of view. It is too restrictive: people are using URIs to
represent metadata documents and referring to the object using the
same string, with success so far. Why do information resources need to
be such a limited domain for scientists? Can people put simple RDF
documents on the web and use them to reference both the document and
their desired entity at the same time? I would say they can do it
quite easily, and without confusion to either RDF parsers or current
HTTP systems. If a 303 is returned and you redirect your query instead
to the given resource and you then get a 200 response, why should you
immediately presume that the author was intending to make it easier
for you and not just making you go around the web twice?

When someone defines a term in a paper or in a dictionary, they do not
just intend for you to look at the definition and assume you are
talking about the letters you wrote once... Academics take it as a
given that you are going to understand that both can and will be used
identically. Computers are intelligent enough to realise this much!
The difference between an RDF literal and an RDF resource makes it
abundantly clear whether you are talking about the concept and
document or whether you are directing them to actually look at the
string and take it to mean an information resource that is not meant
to be resolved. It is irrelevant to talk about the design of URIs
outside of the RDF literal/resource range/domain paradigm, as people
have to understand both to use them for semantics anyway.
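
For readers less familiar with the distinction, here is a small sketch
using N-Triples term syntax, where a URI reference is written in angle
brackets and a literal in double quotes. The term_kind() helper and
the example.org URI are illustrative assumptions; a real application
would use an RDF library rather than string inspection.

```python
def term_kind(term):
    """Classify a single N-Triples term by its surface syntax."""
    term = term.strip()
    if term.startswith("<") and term.endswith(">"):
        return "resource"    # a URI reference that names something
    if term.startswith('"'):
        return "literal"     # just a string value, not a name
    if term.startswith("_:"):
        return "blank node"
    raise ValueError("not a recognised N-Triples term: " + term)


# The same characters, used as a name and as a plain string (hypothetical URI).
as_resource = "<http://example.org/protein/x>"
as_literal = '"http://example.org/protein/x"'

print(term_kind(as_resource))  # resource
print(term_kind(as_literal))   # literal
```

The same string is therefore unambiguous in RDF: the syntax itself
says whether it is being used as a name or quoted as text.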

Given that the document is only recommendations, users will be free to
use less restrictive or time-consuming methods, but it would be great
if the official recommendations were simple to follow to encourage
people to actually use them for more than a philosophical
mind-exercise, hopefully in order to make their work and research more
expressible and accessible.

Also, why is there no discussion about alternatives, such as RDFa,
which can be embedded in 200 documents returned by queries, and link
rel=alternate, which can also be embedded in 200 documents, where the
object's URI is likely to stay as the one resolving to the HTML
document for consistency and usability purposes? Using the current URI
note would require one to provide 3 URLs to describe one concept if
one wished to use the rel=alternate method to guide people to semantic
descriptions of objects which they discover using non-semantic web
pages.
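
As an illustration of the rel=alternate approach, the sketch below
pulls alternate links out of an HTML page with Python's standard
html.parser module. The page content, the example.org URLs, and the
AlternateLinkFinder class are hypothetical; the idea is only that one
200 response for the HTML URI can point a client at the RDF
description.

```python
from html.parser import HTMLParser


class AlternateLinkFinder(HTMLParser):
    """Collect (type, href) pairs from <link rel="alternate"> elements."""

    def __init__(self):
        super().__init__()
        self.alternates = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "alternate" in rels and attr.get("href"):
            self.alternates.append((attr.get("type"), attr["href"]))


# Hypothetical page, as it might be returned with a 200 for the object's URI.
PAGE = """<html><head><title>Protein X</title>
<link rel="alternate" type="application/rdf+xml"
      href="http://example.org/protein/x.rdf">
<link rel="stylesheet" href="/style.css">
</head><body><p>Human-readable description.</p></body></html>"""

finder = AlternateLinkFinder()
finder.feed(PAGE)
print(finder.alternates)
# [('application/rdf+xml', 'http://example.org/protein/x.rdf')]
```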

Peter
Received on Friday, 7 December 2007 01:30:16 UTC