- From: Ioachim Drugus <sw@semanticsoft.net>
- Date: Wed, 27 Jun 2007 11:24:24 -0700
- To: John Black <JohnBlack@kashori.com>
- CC: Tim Berners-Lee <timbl@w3.org>, Richard Cyganiak <richard@cyganiak.de>, Jacek Kopecky <jacek.kopecky@deri.org>, Bernard Vatant <bernard.vatant@mondeca.com>, semantic-web@w3.org
I am new to this list, but I have been working on these notions, including as architect at www.semanicsoft.net, and I hope my thoughts will be useful.

1. To distinguish information from data, I follow the principle: Information = Data + Interpretation. Without a content-type I cannot interpret the data; therefore, what comes without a content-type is not information. I believe that in the web Architecture, the content type draws a perfect distinction between data and information.

2. When I call non-interpreted data "information", I refer to the *potentiality* for data to be interpreted, or to the "intention" of an agent for the data to be information. But we cannot regularly call something by the name of what it can *potentially* be, or based on the "intention" of an agent; a better name would come from what *it is*. So, non-interpreted data is just this: data.

3. Whether a piece of data is information is relative to an agent, software or human. If you, as an agent, can interpret a piece of data, then you have a content-type (which might be written in your own format). Another agent, like a program, without an appropriate content-type will not be able to interpret the data. I might find that the data format coincides with a system of music notation and play a melody, which somebody will treat as a cacophony and others as a new style in music. All this sums up to the statement that a piece of data can serve as different pieces of information for different agents, because they use different content-types to interpret the data.

4. A resource must necessarily have a URI. Resources and their URIs are in the relationship of "intentionality" as understood in philosophy and informally treated as "aboutness" (http://en.wikipedia.org/wiki/Intentionality). I believe the semantic web architects were aware of this when they used the term "about" to make the connection between a resource and its URI. Now, according to point 4, a URI is *not* an information resource.
Moreover, a URI is *not* a resource. To become a resource, the URI should have its own URI ("URI of URI"). To become an information resource, the "inner URI" should also come with one or several content types. If my understanding in point 4 is interesting, I can share it in more detail.

Joe
Ioachim Drugus, Ph.D.
Architect
Semantic Soft, Inc.

John Black wrote:
> Tim, > Ok. Now I am officially freaked out. I thought I was illustrating > another difficulty with eliminating ambiguity. But after your response > below, wherein you say a text string, in a text file, on my server, > representing a URI, is NOT a representation of an "information > resource", I am thrown back again to just trying to understand. If > your response is accurate then the idea of an "information resource" > has become incomprehensible to me. > On 2007-06-26, at 19:25, Tim Berners-Lee wrote: > > On 2007-06-25, at 11:00, John Black wrote: >> [...] But surely a URI is an information resource in the same way >> that a blog post is and so it can be represented by a web page >> the same way a blog post is represented by the web page you get >> through HTTP. >> >> Now my FOAF URI is this >> http://kashori.com/JohnBlack/foaf.rdf#jpb. As a URI, it is an >> information resource, namely a string of characters conforming to >> rfc3986. > Well, that is not how Information Resource is used in the web > Architecture. An Information Resource conveys information, and in > the web architecture it can have several representations, but any one of > them must have a content-type (and possibly other metadata) as > well as a string of bits. > > I am going by something like this: """We do not limit the scope of > what might be a resource. The term "resource" is used in a general > sense for whatever might be identified by a URI. It is conventional on > the hypertext Web to describe Web pages, images, product catalogs, > etc. as “resources”.
The distinguishing characteristic of these > resources is that all of their essential characteristics can be > conveyed in a message. We identify this set as “information > resources.”""" from http://www.w3.org/TR/webarch/#id-resources. > Please tell me which of the essential characteristics of a URI cannot > be conveyed in a message. I don't see any. How is a URI less of an > information resource than a web page, image, product catalog, or that > document itself? > > In other words, the architecture is not that strings of bits are > self-describing. It is not that you can guess what a string of > bits is intended to convey when you meet it on the street. It is > that the content-type tells you how to interpret it. So, the same > string of bits may signify the source markup of an HTML page when > paired with text/plain and the document as represented in HTML (the > normal browser case) when paired with text/html. > > So, strictly, you can say that an IR has a representation which is > 48 bytes long, but not that the IR is 45 bytes long. > > When I access a representation of that information resource identified > by http://kashori.com/ontology/MyURI and capture the full HTTP return > with Paros, I do in fact get a Content-Type:
> HTTP/1.1 200 OK
> Date: Wed, 27 Jun 2007 03:14:43 GMT
> Server: Apache/2.0.51 (Fedora)
> Last-Modified: Mon, 25 Jun 2007 12:08:07 GMT
> ETag: "aff01a2-2a-dd9f17c0"
> Accept-Ranges: bytes
> Content-Length: 42
> Connection: close
> Content-Type: text/plain; charset=UTF-8
> As you can see, that representation has a Content-Type of > "text/plain". How is that different from "...the source markup of an > HTML page..."? And if I embed it in HTML, and return that > representation, as a URI as represented in HTML, how is that different > from a "...document as represented in HTML"? Why is a URI less of an > information resource than a document?
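
The point quoted above, that the content-type tells the recipient how to interpret a string of bits, can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function `interpret`, the helper class `_TextExtractor`, and the sample bytes are hypothetical and do not come from the thread, and the "rendering" of text/html is reduced to extracting the text between tags.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects the character data of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def interpret(body: bytes, content_type: str) -> str:
    """Interpret the same bytes differently depending on the Content-Type."""
    media_type, _, params = content_type.partition(";")
    media_type = media_type.strip().lower()

    # Pick up an optional charset parameter; assume UTF-8 when absent.
    charset = "utf-8"
    for param in params.split(";"):
        name, _, value = param.strip().partition("=")
        if name.lower() == "charset" and value:
            charset = value.strip('"')
    text = body.decode(charset)

    if media_type == "text/plain":
        # The bytes are shown as they are: markup appears as source text.
        return text
    if media_type == "text/html":
        # The bytes are treated as a document: only its text content is kept.
        extractor = _TextExtractor()
        extractor.feed(text)
        extractor.close()
        return "".join(extractor.chunks)
    # Without a known content-type there is no interpretation to apply.
    raise ValueError(f"no interpretation for content type {content_type!r}")


sample = b"<p>Hello <b>web</b></p>"
as_source = interpret(sample, "text/plain; charset=UTF-8")
as_document = interpret(sample, "text/html")
```

Here `as_source` keeps the markup visible while `as_document` contains only the words, although both calls receive identical bytes. An unknown media type raises an error, which mirrors the claim that data without a usable content-type cannot be interpreted.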
> >> >> I have created a web page representation of this information >> resource at http://kashori.com/ontology/MyURI according to >> standard REST web architecture principles. As the owner of and >> therefore the authority about the referent of that URI, I hereby >> proclaim that this web URI denotes my RDF FOAF URI, >> http://kashori.com/JohnBlack/foaf.rdf#jpb. > > In other words we would say <http://kashori.com/ontology/MyURI> > owl:sameAs "http://kashori.com/JohnBlack/foaf.rdf#jpb". > > The thing denoted by the MyURI is the string "..#jpb". > > You mean without the base file? Why is that? > > > Well, yes, but is this useful? > > You mean useful to anyone, ever? Well, I wasn't yet at the point of > deciding the utility of this method for everyone for all time. But if > you think, as I do, that most the semantics in RDF to date is > accomplished by the incorporation of natural language words inside of > URI identifiers, I should think it may be helpful to be able to parse > them and use those embedded components at the level of RDF statements. > > >> This uses web technologies to identify that FOAF URI by another >> URI. In particular, as an information resource, something that >> can be completely characterized by a message, I can identify it >> directly with a 'slash' URI. I don't need a 303 or a 'hash' URI. > > Oh, Yes you do, as a literal string is not an information resource. > > As I said, this is incomprehensible to me. Many 'documents' can be > represented as literal strings. Why can't a URI be represented that > way also? > > >> Now I can talk directly about, or mention, that FOAF URI in RDF. >> >> <http://kashori.com/ontology/MyURI> str:numOfCharacters 41. >> >> In this case, the RDF statement is about the identifier. This >> contradicts your statement that "...RDF statements always are >> about the referents, and never about the identifier." Here the >> referent is the identifier. > > No, not THE identifier, a different identifier. 
> > Yes, thats what I meant, the URI used in the RDF statement, denotes an > identifier that is mentioned in the RDF statement. > > >> I am talking as directly about my FOAF URI as I am talking >> directly about any other information resource as represented by a >> web page by stating in RDF: >> >> 1. <http://kashori.com/ontology/MyURI> owl:sameAs >> "http://kashori.com/JohnBlack/foaf.rdf#jpb"^^xsd:anyURI. >> 2. <http://kashori.com/ontology/MyURI> dc:creator >> <http://kashori.com/JohnBlack/foaf.rdf#jpb>. >> >> In natural language, 1. that FOAF URI is the same as that literal >> URI. and 2. that FOAF URI has a creator that is John Black. >> >> Finally, consider this URI: >> http://kashori.com/ontology/self-referential. This URI >> identifies/denotes itself. So we can say >> >> <http://kashori.com/ontology/self-referential> owl:sameAs >> "http://kashori.com/ontology/self-referential"^^xsd:anyURI. >> >> Only problem is, these URI are ambiguous, we can't tell if they >> identify the identifiers or the web pages representing the >> identifiers. > > No, they are not ambiguous, you said they represent the > identifiers and so they must NOT return 200. > > Ok. Here is where I must draw a line in the sand with my toe. Here I > will not cross. I interpret this to mean that you classify a URI along > with cars and people and other non-information resources, and claim > that best practices require that I set up a 303 redirect for it. I > can't comprehend that. For if that is required because I called it an > 'identifier' then why would it not be true if I call a document a > 'contract', for example? But it also brings up another problem for me. > For years I have been under the impression that an HTTP URI > identifies/denotes the content that is returned when a GET is > performed using that URI. But lately I have learned that is not the > case. The URI identifies an "information resource" that is represented > by the content that is returned. 
As a result, doesn't it now become > impossible to distinguish a URI that identifies a > representation of an information resource from one that identifies the > information resource? Which does this URI identify, > http://www.w3.org/TR/webarch/, the document or the content that is > returned with GET? If the former, how do I identify the latter? And if > the W3C asserts that the "information resource" identified is a > 'recommendation', does that mean it must NOT return 200? If not, then > how can you say that because I call a text string an 'identifier', it > must NOT return a 200? > > > As far as I can see, the semantic web has a consistent > architecture which works. > > (I am not sure whether you are trying to understand it or to > suggest an alternative or > try to show it doesn't work, or just check the seals. :-) > > Once again thrown back to just trying to understand it, as I said. But > in general, for several years now, I have been investigating > alternative ways to establish and convey the reference > (denotation/interpretation) of an RDF URI using HTTP technology. I > believe there must be something more powerful than just to 'return > useful information'. However, many of my ideas are apparently outlawed > (or strongly discouraged) by the Architecture. So I have tried to show > where the Architecture that outlaws these alternatives may not be > optimal - or at least show that it has leaks. > John > > > Tim > >> >> John Black >> www.kashori.com >
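
The rule debated in this thread (200 for an information resource, a 303 redirect or a 'hash' URI for anything else) can be sketched as a small decision function. This is only an illustration of the reading Tim gives above, not an implementation of any server. The names `plan_response` and `is_information_resource` and the `.about` description URL are hypothetical. Whether a given URI denotes an information resource is exactly the question under dispute, so that judgment is passed in as a flag rather than computed.

```python
from urllib.parse import urldefrag


def plan_response(uri, is_information_resource, description_uri=None):
    """Return (status, url) that a server would plausibly answer with.

    The classification of the resource is supplied by the caller; the
    code cannot decide it.
    """
    url, fragment = urldefrag(uri)
    if fragment:
        # 'Hash' URI: the client strips the fragment before the request,
        # so the server only ever serves the containing document.
        return (200, url)
    if is_information_resource:
        # 'Slash' URI for an information resource: answer directly.
        return (200, uri)
    if description_uri is None:
        raise ValueError("a non-information resource needs a describing document")
    # 'Slash' URI for anything else: 303 See Other to a description.
    return (303, description_uri)


# The FOAF URI from the thread is a hash URI.
foaf = plan_response("http://kashori.com/JohnBlack/foaf.rdf#jpb", False)

# The webarch document, treated as an information resource.
webarch = plan_response("http://www.w3.org/TR/webarch/", True)

# MyURI, classified as Tim suggests; the redirect target is made up.
my_uri = plan_response(
    "http://kashori.com/ontology/MyURI",
    False,
    description_uri="http://kashori.com/ontology/MyURI.about",
)
```

John's objection corresponds to calling `plan_response` with `is_information_resource=True` for MyURI, which yields a plain 200. The sketch makes visible that the disagreement is about the flag, not about the mechanics.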
Received on Thursday, 28 June 2007 12:17:14 UTC