- From: Ioachim Drugus <sw@semanticsoft.net>
- Date: Sun, 01 Jul 2007 15:48:05 -0700
- To: Renato Golin <renato@ebi.ac.uk>
- CC: John Black <JohnBlack@kashori.com>, Tim Berners-Lee <timbl@w3.org>, Richard Cyganiak <richard@cyganiak.de>, Jacek Kopecky <jacek.kopecky@deri.org>, Bernard Vatant <bernard.vatant@mondeca.com>, semantic-web@w3.org
In my "personal thesaurus", *Entity*, *Object*, *Resource*, *Information Resource* are distinct terms arranged along a path of strictly decreasing generality from left to right. From the discussions in this list I noticed that people interpret these terms in different ways. Nor do the standards seem to give a complete clarification, which is natural - standards must leave as much freedom as possible to (1) developers of the conceptuality and (2) implementors of the conceptuality in software.

Probably, a correct approach would be to set out my understanding of these terms and check how much it is in sync with the general understanding. Each person is biased by his background and views. I am biased by my background in mathematical logic (Markov's school, Moscow University) and a piece of conceptuality which I informally set out below between ****.

*************

A *representation* *represents* something, which I call the *source* of the representation, *as* something else, which I call the *target* or *result* of the representation. I call both sources and targets of representations *presentations*. So, I distinguish between two terms: *re-presentation* and *presentation*. I formalize this view by using "category theory" in mathematics, considering re-presentations as morphisms and presentations as objects of the category.

Presentations and re-presentations are ubiquitous in real life. But you do not know whether something is a representation of something else, and therefore it is usually hard to select the correct word - *representation* or *presentation*. Also, the two words differ only by the prefix re-, which creates difficulties in reading. Therefore, instead of *presentation*, I often use *phenomenon*. According to my view, Category Theory in mathematics is the formalism for *phenomena and their representations*.
I regard
- Cognition as an iterative representation activity
- Knowledge as the result of this activity - presentations in mind

In this manner, I reduce cognition and knowledge to something with which mathematicians have lots of practice - representations. This might help formalize the very complex phenomena of cognition and knowledge.

Suppose I read "XML *presentation* of...". The authors are involved in an iterative representation activity, which is a process of cognition - they write and re-write documents and publish certain versions. To me this is a *presentation*, because the source of the re-presentation is hidden. But the authors wrote it for me; therefore, they used the word which is correct according to my view :-)

Why is it so important to distinguish between the two terms? I believe the lack of discrimination between them creates many problems in interpretation, which can result in wrong implementation in software. What is worse, this lack of discrimination is due to one term *missing*. The problems are of the following type - if something A can be represented as something B, then people sometimes implicitly and unconsciously assume that "A is B", even though "ontologically" A and B are distinct. The introduction of the notion of "presentation" as separate from "representation" helps to avoid this problem.

This language also helps me talk about many things in generic terms. Say, a necessary component of an "intelligent agent" (of which there should be plenty in the semantic web) must be a *presentation system*, which plays the role of "mind as storage" in humans. A presentation system has a *presentation medium*, which puts constraints on presentations.

To be able to make reference to this conceptuality, for lack of a good or final name, I would call it MyPhenomenology. The name is not as bad as it looks, because the "My" component is "indexical" and can apply to any person who shares this view.

****************

Now I proceed to the discrimination between the notions above.

1. *Entity*

I call *entity* anything which can potentially exist (including the legendary bird Phoenix). The word *entity* comes from a Latin word which in English would sound like "existent", with plural "existents" - sure, English does not have such a word. "Entity" has the same meaning in Latin as the word "ontos" in Greek. Therefore, "entity" looks like the best candidate for a name for the most general notion in Ontology as a discipline. I say that an entity is a "piece of existence". The word Thing in OWL has the same meaning and usage as "entity".

2. *Object*

Very often people don't discriminate between *entity* and *object*, but in language you sometimes feel "conceptual discomfort" when "entity" is used instead of "object". I believe that an *object* is a particular kind of *entity* which *has* content and *is* represented in mind (here, *has* and *is* are two ontological relationships of fundamental importance, which is why I emphasized them). Also, I think that by a *resource* we mean exactly an *object*. But I found that a "theory of objects" must be very complex - I will set out the beginnings of it below in terms of MyPhenomenology.

If an entity exists "somewhere outside" and the mind has no representation of it ("no idea" of it), then this is an *entity*, but not an *object*. I call *object* only an entity which has a representation in mind - a label, a mental picture, anything which stands in mind for the entity. The same applies to software - if a "visual processor" creates an identifier for a fragment of a picture, then this fragment becomes an *object* for this software agent and starts to be part of this agent's *reality*. There is a huge number of fragments of a picture, an infinite number (if you admit infinity), and the human mind or a device cannot have a label for each of them, to say nothing of more complex presentations.
The first attempt to formalize the notion of *object* is to regard it as an ordered pair (*presentation*, *content*), where the *presentation* is in our mind and the *content* is outside (this view complies with the "form-content" dichotomy). The ordered pair above is a "rudiment" of the object and we need a special name for it - I call it the object's *manifestation*. "Manifestation" is a synonym for "phenomenon", but it is more frequently used as "manifestation of", which makes it a good candidate name for the "rudiment" *of* an object.

The *presentation* (as the first member of the ordered pair *manifestation*) of an object plays different roles in different activities - "it gives an idea", it is the target in the reification process, it is the source in identification as referencing (intentionality), and it is the initial state in the identification process (identification has many meanings). These are different activities, and we cannot expect "presentation" to be called "presentation" in all contexts. In the domains I mentioned, I call the *presentation* a *notion* or *concept*, an *identity*, an *identifier*, and a *presentation* or *re-presentation*, respectively.

Now, let us see what happens when one member of the ordered pair above varies and the other member remains the same. A cloud changes in shape, but we somehow know that it is the same cloud. Here the cloud is the content and this content is changing. In order for the mind to know that the cloud is the same, it must keep the presentation unchanged. Now, suppose the content is an elephant, which is a solid thing and does not change - so the content does not vary. But you look at it from different angles and you get different presentations. What property of the manifestations is responsible for sameness in both cases? I define sameness of manifestations this way:

*(A, B) sameAs (A', B') if and only if ((A=A') OR (B=B'))*

And so, two manifestations are the same iff at least one of their two members, presentation or content, coincides.
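For readers who prefer code, here is a minimal Python sketch of the definition above. The names `same_as` and `group_by_chains` and the sample pairs are illustrative assumptions, not part of the original message. Note that the OR test can relate two pairs that share neither member only through a third pair, so the sketch also groups manifestations by chains of sameness; that grouping step is an assumption of this sketch.

```python
def same_as(m1, m2):
    """Sameness of two manifestations (presentation, content):
    the presentations coincide OR the contents coincide."""
    (a1, b1), (a2, b2) = m1, m2
    return a1 == a2 or b1 == b2


def group_by_chains(manifestations):
    """Group manifestations that are linked by a chain of same_as steps.

    Assumption of this sketch: two manifestations belong together if one
    can be reached from the other through intermediate manifestations.
    """
    groups = []
    for m in manifestations:
        linked, rest = [], []
        for g in groups:
            # A group is linked to m if any of its members is same_as m.
            (linked if any(same_as(m, other) for other in g) else rest).append(g)
        # Merge m with every group it is linked to.
        merged = [m] + [x for g in linked for x in g]
        groups = rest + [merged]
    return groups


# Cloud: the presentation is kept, the content changes.
cloud_then = ("that cloud", "round cloud")
cloud_now = ("that cloud", "stretched cloud")
# Elephant: the content is kept, the presentations differ.
elephant_front = ("front view", "elephant")
elephant_side = ("side view", "elephant")

print(same_as(cloud_then, cloud_now))          # True
print(same_as(elephant_front, elephant_side))  # True
print(same_as(cloud_then, elephant_front))     # False
```

With the pairs `("p1", "c1")`, `("p1", "c2")` and `("p2", "c2")`, the first and the last are not `same_as` each other directly, yet `group_by_chains` puts all three into one group because the middle pair links them.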
The *defining* property of an ordered pair in set theory differs from my defining property - set theory uses the conjunction AND and I am using the disjunction OR. So, in the case of objects this is not quite an "ordered pair". I call it a *reification pair* for reasons which will become clearer below. "Reification pair" is a synonym for "manifestation", but we need yet another term to disclose the type of structure a manifestation is.

The relationship of sameness as defined above is reflexive and symmetric, and its transitive closure (sameness through a chain of intermediate manifestations) is a relationship of equivalence. Therefore, it induces a partition over the set of object manifestations as ordered pairs. I call each set of this partition an *identification class*. For an agent to be able to keep track of "sameness", it has to select exactly one reification pair (i, c) within each identification class against which to check the others for sameness. I call the presentation *i* of the selected reification pair the *identity* of an object. I think the process of identification is like this - given a presentation, the agent looks for the identification class where this presentation resides and produces the identity residing in this class.

This completes the formalization of the notion of object - we call an object a reification pair *(ID, content)*, where ID is an identity and *content* is an open world (I regard a closed world as a special case of an open world). One last thing to notice is that presentations in object manifestations are better regarded as *re-presentations* when the content exists, and as presentations when the content is yet to be found or created.

Now, the content of an object might be empty (void). This does not mean that such an object "does not have content". Just as set theory gave an identity to emptiness by introducing the notion of the empty set, I regard empty content as void content, which *is* content.
Therefore, even for agnostics who doubt the existence of things outside their mind, objects have content - void content. In terms of "intentionality", the association of void content with a presentation in mind signifies the "intendedness" of the presentation. For example, in the case of the Phoenix, even if we know that such a bird does not exist, we admit that there is something which we "intend" by Phoenix - this is because we have associated void content with an identity called by its name. We can later discover that such a bird exists - then the content changes, but the identity remains the same.

The etymology and morphology of "rei-fication" suggest the meaning of "creating things". I treat "things" as objects (not entities) and the creation of things as an activity of running through various re-presentations until the agent chooses one which is invariant enough while the content is changing, and assigns it the role of *identity* for an object. Therefore, I regard "reification" as a process of creating object *identities*. I regard the relationship of *intentionality* ("referencing") as *inverse* to the relationship of *reification*.

Due to the defining property of the "reification pair", all the manifestations within one *identification class* are manifestations of the same object. Therefore, the first members of the reification pairs within one identification class are *identifiers* of the same object. Blind people might get incomplete ideas of an elephant, but even their representations are identifiers of the elephant. And so, in the activity of referencing, presentations play the role of "identifiers".

3. *Resource*

What is a *resource* in the Web architecture? I treat it as an *object* with its specific *presentations*, called URI references, and its *content* - any piece of "content of the Universe" (which can be a piece of software, a presentation in mind, or a physical body).
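As a rough illustration of treating a resource as an object, the following Python sketch stores objects as (identity, content) pairs and maps any registered identifier to the identity of its class. The class `PresentationSystem`, its method names, and the `example.org` URI references are hypothetical choices made for this sketch only.

```python
VOID = object()  # stands for void content, which still counts as content


class PresentationSystem:
    """Toy store of objects as (identity, content) reification pairs."""

    def __init__(self):
        self._identity_of = {}  # identifier -> chosen identity of its class
        self._content_of = {}   # identity -> content (possibly VOID)

    def reify(self, identity, content=VOID):
        """Create an object: choose an identity and attach content to it."""
        self._identity_of[identity] = identity
        self._content_of[identity] = content

    def add_identifier(self, identifier, identity):
        """Register another presentation of an already reified object."""
        if identity not in self._content_of:
            raise KeyError(f"unknown identity: {identity!r}")
        self._identity_of[identifier] = identity

    def identify(self, presentation):
        """Identification: return the identity residing in the class
        of the given presentation."""
        return self._identity_of[presentation]

    def set_content(self, identity, content):
        """The content may change while the identity stays the same."""
        if identity not in self._content_of:
            raise KeyError(f"unknown identity: {identity!r}")
        self._content_of[identity] = content

    def content(self, presentation):
        """Referencing: go from any identifier to the object's content."""
        return self._content_of[self.identify(presentation)]


# Hypothetical URI references used as presentations.
PHOENIX_ID = "http://example.org/id/phoenix"
PHOENIX_ALIAS = "http://example.org/birds#Phoenix"

agent = PresentationSystem()
agent.reify(PHOENIX_ID)                       # intended, but void content
agent.add_identifier(PHOENIX_ALIAS, PHOENIX_ID)
print(agent.identify(PHOENIX_ALIAS) == PHOENIX_ID)  # True
```

If the bird is later discovered, `agent.set_content(PHOENIX_ID, ...)` replaces the void content, while both URI references keep resolving to the same identity.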
I believe this treatment of a resource as an *object* can help to better understand the notion of *resource*.

The Web is an agent, the presentation system of which has two main subsystems:
- A Web pages presentation system as an interface with human agents
- A data and information presentation system as an interface with applications

Probably the second system needs some "semantic servers" and "semantic browsers".

The name *Universal* Resource Identifiers is a prescription to all agents to reify by using these schemata. In order for the agents to be able to do this, each needs a central authority for their domain, which would maintain a uniform "scheme" within the URI schemata. But there is a huge number of domains and they all need authorities. And these authorities need guidelines. Also, I am sure that not only people will have to be involved in reification, but also software.

Now, in order for the URI to be really universal, I believe a standard on reification is needed. Currently, there is only one standard on reification, which shows how to reify only RDF "reality" - the statements.

I did not share my understanding of *Information Resource*, because it is already time to call this a message.

Ioachim

(In my first presentation of myself, I used both the short "Joe" and the original long "Ioachim". But this created an "identity crisis" - I am now called different names. So, now I selected the original "Ioachim" as "identity" for this object here :-)

Renato Golin wrote:
> Hi Ioachim,
>
> Ioachim Drugus wrote:
>
>> I think, content-type is the type that the *author* of the content
>> *intended* the content to be. Content-type helps the interpreter
>> (interpreting agent) to select the right approach to interpretation, but
>> does not guarantee that it will interpret the content as it was
>> intended by the author.
>>
>
> Exactly, it's only the author's intention, nothing more.
>
>> Availability of content-type is necessary but
>> not sufficient for a piece of data to become information.
>> What I wrote previously refers only to discrimination between data and
>> information, but it does not explain how things go further.
>>
>
> I wouldn't say not even necessary, but optional. You definitely don't
> need content-type to know an HTML when you look at it. Programs aren't
> that different, just a bit dumber.
>
> Of course it's *much* simpler to have context type, even for us. ;)
>
>
>> Now, since the interpreter is confined by the knowledge {content,
>> content-type}, the only other thing which is given to start the
>> interpretation process is *context*.
>>
>
> As content-type is a kind of context this is a bit redundant.
>
> Data + context = Information
>
> SYN-SUM(Information) = Knowledge
>
> ie. all contexts (known) about the same data, in synergy, so:
>
> SYN-SUM[N](Information) != SYN-SUM[N-1](Information) + Information N
>
> Of course things can get much more complicated, data can be a subset of
> other data in a different context and things like that but that's
> further than the discussion about the same data's contexts.
>
>
>> There is yet another aspect - the difference between *information* and
>> *information resource* on which I which I will not write here to keep
>> to the point of this discussion - discrimination between data and
>> information. This difference is clearly stated in how Tim defined the
>> information resource, but I think, after I work here a little, I will
>> come back with a " formalized" manner to put it, which might also help.
>>
>
> Yes, good thread going on about it, I couldn't help much with that,
> though... ;)
>
> cheers,
> --renato
>
Received on Sunday, 1 July 2007 22:48:14 UTC