- From: Ioachim Drugus <sw@semanticsoft.net>
- Date: Wed, 27 Jun 2007 11:24:24 -0700
- To: John Black <JohnBlack@kashori.com>
- CC: Tim Berners-Lee <timbl@w3.org>, Richard Cyganiak <richard@cyganiak.de>, Jacek Kopecky <jacek.kopecky@deri.org>, Bernard Vatant <bernard.vatant@mondeca.com>, semantic-web@w3.org
I am new to this list, but have been working on these notions, including
as architect at www.semanicsoft.net, and I hope my thoughts will be useful.
1. To distinguish information from data, I follow the principle:
Information = Data + Interpretation.
Without a content-type I cannot interpret the data - therefore, what
comes without a content-type is not information. I believe that, by
introducing the content type, the Web Architecture draws a clean
distinction between data and information.
2. When I call non-interpreted data "information", I am referring to
the *potentiality* for data to be interpreted, or the "intention" of an
agent for the data to be information. But we cannot regularly call
something by the name of what it can *potentially* be, or based on the
"intention" of an agent - a better name would come from what *it is*.
So, non-interpreted data is just this - data.
3. Whether a piece of data is information is relative to an agent -
software or human. If you, as an agent, can interpret a piece of data,
then you have a content-type (which might be written in your own
format). Another agent, like a program, without an appropriate
content-type will not be able to interpret the data. I might find the
data format coinciding with a system of music notation and play a
melody, which somebody will treat as a cacophony and others as a new
style in music. All this sums up to the statement that a piece of data
can serve as different pieces of information for different agents due to
them using different content-types to interpret the data.
4. A resource must necessarily have a URI. Resources and their URIs are
in the relationship of "intentionality" as understood in philosophy and
informally treated as "aboutness"
(http://en.wikipedia.org/wiki/Intentionality). I believe the Semantic
Web architects were aware of this when they used the term "about" to
make the connection between a resource and its URI.
Now, according to 4, a URI is *not* an information resource. Moreover, a
URI is *not* a resource. To become a resource, the URI should have its
own URI ("URI of URI"). To become an information resource, the "inner
URI" should also come with one or several content types. If my
understanding of 4 is of interest, I can share it in more detail.
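To make points 1 and 3 concrete, here is a minimal Python sketch of
"Information = Data + Interpretation". It is only an illustration under
my own assumptions: the names `interpret` and `_TextExtractor` are
hypothetical, and only two content types are handled. The same bytes
yield different information depending on the content type the agent
applies, and an agent with no suitable content type gets no information
at all.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects the character data of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def interpret(data: bytes, content_type: str) -> str:
    """Return what an agent obtains from `data` under `content_type`.

    Assumption for this sketch: the bytes are UTF-8 and only text/plain
    and text/html are known to the agent.
    """
    text = data.decode("utf-8")
    if content_type == "text/plain":
        # The agent reads the characters as they are (the source markup).
        return text
    if content_type == "text/html":
        # The agent reads the document that the markup describes.
        parser = _TextExtractor()
        parser.feed(text)
        parser.close()
        return "".join(parser.chunks)
    # Without an applicable content type the data stays just data.
    raise ValueError(f"no interpretation available for {content_type!r}")


data = b"<p>Hello, <b>web</b>!</p>"
as_plain = interpret(data, "text/plain")
as_html = interpret(data, "text/html")
```

Here `as_plain` is the markup itself, while `as_html` is the text a
browser would show; one piece of data, two pieces of information.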
Joe
Ioachim Drugus, Ph.D.
Architect
Semantic Soft, Inc.
John Black wrote:
> Tim,
> Ok. Now I am officially freaked out. I thought I was illustrating
> another difficulty with eliminating ambiguity. But after your response
> below, wherein you say a text string, in a text file, on my server,
> representing a URI, is NOT a representation of an "information
> resource", I am thrown back again to just trying to understand. If
> your response is accurate then the idea of an "information resource"
> has become incomprehensible to me.
> On 2007-06-26, at 19:25, Tim Berners-Lee wrote:
>
> On 2007-06-25, at 11:00, John Black wrote:
>> [...] But surely a URI is an information resource in the same way
>> that a blog post is and so it can be represented by a web page
>> the same way a blog post is represented by the web page you get
>> through HTTP.
>>
>> Now my FOAF URI is this
>> http://kashori.com/JohnBlack/foaf.rdf#jpb. As a URI, it is an
>> information resource, namely a string of characters conforming to
>> rfc3986.
> Well, that is not how Information Resource is used in the web
> Architecture. An Information Resource conveys information, and in
> the web architecture it can have several representations, but any one of
> them must have a content-type (and possibly other metadata) as
> well as a string of bits.
>
> I am going by something like this: """We do not limit the scope of
> what might be a resource. The term "resource" is used in a general
> sense for whatever might be identified by a URI. It is conventional on
> the hypertext Web to describe Web pages, images, product catalogs,
> etc. as “resources”. The distinguishing characteristic of these
> resources is that all of their essential characteristics can be
> conveyed in a message. We identify this set as “information
> resources.”""" from http://www.w3.org/TR/webarch/#id-resources.
> Please tell me which of the essential characteristics of a URI cannot
> be conveyed in a message. I don't see any. How is a URI less of an
> information resource than a web page, image, product catalog, or that
> document itself?
>
> In other words, the architecture is not that strings of bits are
> self-describing. It is not that you can guess what a string of
> bits is intended to convey when you meet it on the street. It is
> that the content-type tells you how to interpret it. So, the same
> string of bits may signify the source markup of an HTML page when
> paired with text/plain and the document as represented in HTML (the
> normal browser case) when paired with text/html.
>
> So, strictly, you can say that an IR has a representation which is
> 48 bytes long, but not that the IR is 45 bytes long.
>
> When I access a representation of that information resource identified
> by http://kashori.com/ontology/MyURI and capture the full HTTP return
> with Paros, I do in fact get a Content-Type:
> HTTP/1.1 200 OK
> Date: Wed, 27 Jun 2007 03:14:43 GMT
> Server: Apache/2.0.51 (Fedora)
> Last-Modified: Mon, 25 Jun 2007 12:08:07 GMT
> ETag: "aff01a2-2a-dd9f17c0"
> Accept-Ranges: bytes
> Content-Length: 42
> Connection: close
> Content-Type: text/plain; charset=UTF-8
> As you can see, that representation has a Content-Type of
> "text/plain". How is that different from "...the source markup of an
> HTML page..."? And if I embed it in HTML, and return that
> representation, as a URI as represented in HTML, how is that different
> from a "...document as represented in HTML"? Why is a URI less of an
> information resource than a document?
>
>>
>> I have created a web page representation of this information
>> resource at http://kashori.com/ontology/MyURI according to
>> standard REST web architecture principles. As the owner of and
>> therefore the authority about the referent of that URI, I hereby
>> proclaim that this web URI denotes my RDF FOAF URI,
>> http://kashori.com/JohnBlack/foaf.rdf#jpb.
>
> In other words we would say <http://kashori.com/ontology/MyURI>
> owl:sameAs "http://kashori.com/JohnBlack/foaf.rdf#jpb".
>
> The thing denoted by the MyURI is the string "..#jpb".
>
> You mean without the base file? Why is that?
>
>
> Well, yes, but is this useful?
>
> You mean useful to anyone, ever? Well, I wasn't yet at the point of
> deciding the utility of this method for everyone for all time. But if
> you think, as I do, that most of the semantics in RDF to date is
> accomplished by the incorporation of natural language words inside of
> URI identifiers, I should think it may be helpful to be able to parse
> them and use those embedded components at the level of RDF statements.
>
>
>> This uses web technologies to identify that FOAF URI by another
>> URI. In particular, as an information resource, something that
>> can be completely characterized by a message, I can identify it
>> directly with a 'slash' URI. I don't need a 303 or a 'hash' URI.
>
> Oh, Yes you do, as a literal string is not an information resource.
>
> As I said, this is incomprehensible to me. Many 'documents' can be
> represented as literal strings. Why can't a URI be represented that
> way also?
>
>
>> Now I can talk directly about, or mention, that FOAF URI in RDF.
>>
>> <http://kashori.com/ontology/MyURI> str:numOfCharacters 41.
>>
>> In this case, the RDF statement is about the identifier. This
>> contradicts your statement that "...RDF statements always are
>> about the referents, and never about the identifier." Here the
>> referent is the identifier.
>
> No, not THE identifier, a different identifier.
>
> Yes, that's what I meant: the URI used in the RDF statement denotes an
> identifier that is mentioned in the RDF statement.
>
>
>> I am talking as directly about my FOAF URI as I am talking
>> directly about any other information resource as represented by a
>> web page by stating in RDF:
>>
>> 1. <http://kashori.com/ontology/MyURI> owl:sameAs
>> "http://kashori.com/JohnBlack/foaf.rdf#jpb"^^xsd:anyURI.
>> 2. <http://kashori.com/ontology/MyURI> dc:creator
>> <http://kashori.com/JohnBlack/foaf.rdf#jpb>.
>>
>> In natural language, 1. that FOAF URI is the same as that literal
>> URI. and 2. that FOAF URI has a creator that is John Black.
>>
>> Finally, consider this URI:
>> http://kashori.com/ontology/self-referential. This URI
>> identifies/denotes itself. So we can say
>>
>> <http://kashori.com/ontology/self-referential> owl:sameAs
>> "http://kashori.com/ontology/self-referential"^^xsd:anyURI.
>>
>> Only problem is, these URIs are ambiguous: we can't tell if they
>> identify the identifiers or the web pages representing the
>> identifiers.
>
> No, they are not ambiguous, you said they represent the
> identifiers and so they must NOT return 200.
>
> Ok. Here is where I must draw a line in the sand with my toe. Here I
> will not cross. I interpret this to mean that you classify a URI along
> with cars and people and other non-information resources, and claim
> that best practices require that I set up a 303 redirect for it. I
> can't comprehend that. For if that is required because I called it an
> 'identifier' then why would it not be true if I call a document a
> 'contract', for example? But it also brings up another problem for me.
> For years I have been under the impression that an HTTP URI
> identifies/denotes the content that is returned when a GET is
> performed using that URI. But lately I have learned that is not the
> case. The URI identifies an "information resource" that is represented
> by the content that is returned. As a result, doesn't it now become
> impossible to distinguish between a URI that identifies a
> representation of an information resource from one that identifies the
> information resource? Which does this URI identify,
> http://www.w3.org/TR/webarch/, the document or the content that is
> returned with GET? If the former, how do I identify the latter? And if
> the W3C asserts that the "information resource" identified is a
> 'recommendation', does that mean it must NOT return 200? If not, then
> how can you say that because I call a text string an 'identifier', it
> must NOT return a 200?
>
>
> As far as I can see, the semantic web has a consistent
> architecture which works.
>
> (I am not sure whether you are trying to understand it or to
> suggest an alternative or try to show it doesn't work, or just
> check the seals. :-)
>
> Once again thrown back to just trying to understand it, as I said. But
> in general, for several years now, I have been investigating
> alternative ways to establish and convey the reference
> (denotation/interpretation) of an RDF URI using HTTP technology. I
> believe there must be something more powerful than just to 'return
> useful information'. However, many of my ideas are apparently outlawed
> (or strongly discouraged) by the Architecture. So I have tried to show
> where the Architecture that outlaws these alternatives may not be
> optimal - or at least show that it has leaks.
> John
>
>
> Tim
>
>>
>> John Black
>> www.kashori.com
>
Received on Thursday, 28 June 2007 12:17:14 UTC