Please suggest further alternative views.
This `small' project was started during the first month of the Working Group's life and abandoned because of complete lack of consensus in some very fundamental areas such as `resource' and `entity'. The WG's ability to find a contrary viewpoint to any case is not helpful to the reader new to the subject. I am trying here to provide a key to at least the appearance of agreement.
I have two objectives for the final glossary:
I have listed what I think are the important terms with, in most cases, my attempt at a humane definition and then a selection of sources. Occasionally, I have added a comment of my own.
To promote consensus I propose the following process. Draft 1 of this document will contain my proposed text and all the direct quotations from other sources that I can lay my hands on. My selection of sources is intended to be diverse, authoritative and a mixture of the focussed and the discursive. This is by no means a model for the final document! I intend to remove the sources: this layout just shows where the conflicts may lie. I will put this up for the group to consider and ask for email comments. Members should feel free to suggest more alternatives (!). Comments meant to change existing alternatives will spawn new alternatives. The point is to mix in all the opposing views in one document so that every idea gets its chance.
I will then try to resolve differences and issue draft 2 to ask for final resolution of conflicts. If the group cannot arrive at a resolution then I suggest there is no consensus and I will humbly abandon the project, in whole or in those parts not agreed on. I don't think I should choose definitions just because I like them if there is substantial disagreement: this would not serve any purpose for the reader.
The final draft will contain just those parts agreed by the Group (with no remaining dissenting comments).
While not wishing to produce a redundant document, I feel that a `tutorial' glossary is useful to help a reader form an internally consistent language before receiving the precise definitions of the specs. It is therefore important not to give the user redolent but false images which may cause him/her to resist the finer descriptions when they are encountered. Definitions here should be `natural' and easily assimilated.
It is hoped that reading this document before any others in the collection of specifications and discussions which present RDF will allow the reader to build a `bigger picture' of how this particular collection of words and concepts is used.
RDF places specific requirements on some of its core terms and concepts but it is intended to reflect the real world and the Semantic Web in particular. The most technical terms must therefore be understood in the broadest of contexts, that of the World Wide Web.
Almost every word defined here will be redefined more rigorously and more precisely in later documents and those definitions are the `normative' ones. This short dictionary is meant to show how these terms will fit together and forewarn the reader that, while some words will have their natural meanings, others will be used in specialized ways.
[RDFT&C] Anything which exists or has existed. Note that RFC2396 uses this term in a more restricted sense, to mean some data represents some aspect of a Web Resource.
[MH] Actually [RFC2396] doesn't attempt to define `entity'.
[MH] [RDFM&S:introduction] uses `entity' as an undefined primitive.
[RFC2616 betraying its need to talk about transport mechanisms] The information transferred as the payload of a request or response.
[MH] Of course `entity' has a meaning in the lexicography of XML which suggests that it cannot be used safely to mean something quite as general as `something'. Perhaps we need `Thing' as DAML+OIL have it.
Resources are the identifiable items in the world, the contact points between you and the world of data. They are `entities' as we need to refer to them, fixed for a short time while we talk about them.
A typical resource would be a unit of data on the Web such as a page or a significant segment of a page. Equally another person, an organization or anything else that you would wish to point at out there in this universe can be referred to as a `resource'. The significant characteristic is the identifiable nature of resources, that they have for whatever period of time an identity which makes them distinguishable.
[RDFT&C] May refer to an RDF resource or a Web Resource. Some resources may be both. In discussion of RDF, this term is often used to mean RDF Resource.
[RDFM&S:introduction] A resource may be an entire Web page; such as the HTML document "http://www.w3.org/Overview.html" for example. A resource may be a part of a Web page; e.g. a specific HTML or XML element within the document source. A resource may also be a whole collection of pages; e.g. an entire Web site. A resource may also be an object that is not directly accessible via the Web; e.g. a printed book. Resources are always named by URIs plus optional anchor ids (see [URI]). Anything can have a URI; the extensibility of URIs allows the introduction of identifiers for any entity imaginable.
[RDFM&S:glossary] An abstract object that represents either a physical object such as a person or a book or a conceptual object such as a color or the class of things that have colors. Web pages are usually considered to be physical objects, but the distinction between physical and conceptual or abstract objects is not important to RDF. A resource can also be a component of a larger object; for example, a resource can represent a specific person's left hand or a specific paragraph out of a document. As used in this specification, the term resource refers to the whole of an object if the URI does not contain a fragment (anchor) id or to the specific subunit named by the fragment or anchor id.
[Jena] Some entity. It could be a web resource such as web page, or it could be a concrete physical thing such as a tree or a car. It could be an abstract idea such as chess or football. Resources are named by URIs.
[N3] That identified by a Universal Resource Identifier (without a "#"). If the URI starts "http:", then the resource is some form of generic document.
[RDFT&C] Anything that is identified by a URI
[Dan Connolly (email)] Nope; that rules out real numbers...
[Accessibility] anything that has identity on the Web. A Web resource is identified by a URI.
[WebChar] A resource, identified by a URI, that is a member of the Web Core (The collection of resources residing on the Internet that can be accessed using any implemented version of HTTP as part of the protocol stack (or its equivalent), either directly or via an intermediary. Notes: By the term "or its equivalent" we consider any version of HTTP that is currently implemented as well as any new standards which may replace HTTP (HTTP-NG, for example). Also, we include any protocol stack including HTTP at any level, for example HTTP running over SSL.).
[RFC2616] A network data object or service that can be identified by a URI, as defined in section 3.2. Resources may be available in multiple representations (e.g. multiple languages, data formats, size, and resolutions) or vary in other ways.
[RFC2396 (in context of defining URI)] A resource can be anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources. The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity which corresponds to that mapping at any particular instance in time. Thus, a resource can remain constant even when its content---the entities to which it currently corresponds---changes over time, provided that the conceptual mapping is not changed in the process
A resource's URI is as different from another resource's URI as one resource is from another.
The actual content of a URI carries no explicit description of the resource although parts of the string describe how that resource may be contacted, downloaded or otherwise viewed. The most obvious examples are `http:...' Web pages which can be fetched with a browser and email addresses which allow unique contact with an individual.
URIs are the physical identifications of resources and connections to them. They are not the resources themselves nor are they the entities identified as resources. Entities have biographies of their own. The role they play in the information world is as resources. We use URIs to connect with them through that information world.
[RFC2396 - obviously definitive] A URI can be further classified as a locator, a name, or both. The term "Uniform Resource Locator" (URL) refers to the subset of URI that identify resources via a representation of their primary access mechanism (e.g., their network "location"), rather than identifying the resource by name or by some other attribute(s) of that resource. The term "Uniform Resource Name" (URN) refers to the subset of URI that are required to remain globally unique and persistent even when the resource ceases to exist or becomes unavailable.
[WebChar] a compact string of characters for identifying an abstract or physical resource.
[N3] The way of identifying anything (including Classes, Properties or individual things of any sort). Not everything has a URI, as you can talk about something by just using its properties. But using a URI allows other documents and systems to easily reuse your information.
The construction of URIs carries some recognition of the ways resources group together in Web sites or communities. A reference may exploit this by locating a resource only within a pre-determined location. Such a reference is referred to as `relative'. This is strictly a matter of conciseness. Behind each use of a relative reference there must be a procedure to produce an absolute reference before meaning can be attached to the reference.
Relative references, until they are resolved into absolute URIs, can be used as a local map round a locality on the Web.
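The procedure that turns a relative reference into an absolute one can be sketched with Python's standard library. This is only an illustration: the base URI `http://example.org/docs/glossary.html` and the helper name `resolve` are assumptions made for the example, not anything defined by RDF or by the sources quoted here.

```python
from urllib.parse import urljoin

# Hypothetical base: the URI of the document in which the relative
# references appear.
BASE = "http://example.org/docs/glossary.html"

def resolve(reference, base=BASE):
    """Turn a relative reference into an absolute URI against `base`."""
    return urljoin(base, reference)

for ref in ["terms.html", "../index.html", "#resource"]:
    print(ref, "->", resolve(ref))
```

A reference that is already absolute passes through unchanged, which is consistent with relative references being purely a matter of conciseness.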
Any collection of data specifically supplying information about resources. This naturally takes the form of describing the relationships between resources and their characteristics in well-defined terms.
Metadata takes form as RDF.
[RDFM&S:introduction] Metadata is "data about data" (for example, a library catalog is metadata, since it describes publications) or specifically in the context of this specification "data describing Web resources". The distinction between "data" and "metadata" is not an absolute one; it is a distinction created primarily by a particular application, and many times the same resource will be interpreted in both ways simultaneously.
[RDFT&C] Data describing Web resources.
[Accessibility] Data about data on the Web, including but not limited to authorship, classification, endorsement, policy, distribution terms, IPR, and so on. A significant use for the Semantic Web.
[quoting Barron's Dict. of Computer & Internet Terms] Data that describes data. Data dictionaries and repositories are examples of metadata. The term may also refer to any file or database that holds information about another database's structure, attributes, processing or changes.
[RDFM&S:introduction] a foundation for processing metadata; it provides interoperability between applications that exchange machine-understandable information on the Web. The broad goal of RDF is to define a mechanism for describing resources that makes no assumptions about a particular application domain, nor defines (a priori) the semantics of any application domain.
[RDFM&S:introduction] `1.There is a set called Resources.'
[RDFT&C] Note that an RDF resource is not necessarily a web resource, though any web resource can be an RDF resource. Consider: http://foo.com/#a and http://foo.com/#b may name distinct RDF resources, but if used to access web resources they both refer to the common web resource http://foo.com/. This distinction between "Web resource" and "RDF Resource" is not a desired outcome, but an interpretation of different uses of the term "resource" in different documents.
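The point about fragments can be checked mechanically. The sketch below uses the two URIs from the quotation; the helper name `web_resource` is an assumption for the example, and stripping the fragment is only a stand-in for what a Web client does before retrieval.

```python
from urllib.parse import urldefrag

def web_resource(uri):
    """Return the URI a Web client would fetch: the URI minus its fragment."""
    return urldefrag(uri).url

a = "http://foo.com/#a"
b = "http://foo.com/#b"

# As names the two strings differ, so they can name distinct RDF resources...
print(a == b)
# ...but both lead to the same retrievable Web resource.
print(web_resource(a), web_resource(b))
```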
[RDFT&C] A URI plus optional anchor ID. [RDFM&S] RDF Resource Identifiers are understood to name RDF Resources.
[MH] This seems needlessly circular.
Specifically, the word `property' is used for the `relationship' or `characteristic' and the word `value' is used for the target of the relationship or how this characteristic is expressed.
Thus if resource `A' (the one we are talking about) has the relationship `A is-the-father-of B' then `is-the-father-of' is the property here and `B' (a resource) is the value of this property of `A'. Equally `C is green' suggests a property `has-colour' and the value here `green'.
Properties (in order to be useful) must be identifiable and individual: they are thus declared to be resources (and generally have published URIs). They have all the characteristics of resources. Not all resources are properties, however.
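A minimal sketch of the two examples above, assuming properties and resources are written as URI strings and a literal value as a plain string. All the `http://example.org/...` URIs and the helper names are hypothetical and exist only for illustration.

```python
# Hypothetical URIs standing in for the resources and properties above.
A = "http://example.org/people#A"
B = "http://example.org/people#B"
C = "http://example.org/things#C"
IS_THE_FATHER_OF = "http://example.org/terms#is-the-father-of"
HAS_COLOUR = "http://example.org/terms#has-colour"

# Each entry reads: (resource described, property, value).
descriptions = [
    (A, IS_THE_FATHER_OF, B),   # the value is itself a resource
    (C, HAS_COLOUR, "green"),   # the value is plain text
]

def values(descriptions, resource, prop):
    """All values of `prop` recorded for `resource`."""
    return [v for r, p, v in descriptions if r == resource and p == prop]

def properties_used(descriptions):
    """The properties in use; each is itself identified by a URI."""
    return {p for _, p, _ in descriptions}

print(values(descriptions, A, IS_THE_FATHER_OF))
print(values(descriptions, C, HAS_COLOUR))
```

Because a property is named by a URI like any other resource, it could in turn appear in the first position of a description, which is one way to read `properties are resources'.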
[RDFM&S:introduction] a specific aspect, characteristic, attribute, or relation used to describe a resource. Each property has a specific meaning, defines its permitted values, the types of resources it can describe, and its relationship with other properties. This document does not address how the characteristics of properties are expressed; for such information, refer to the RDF Schema specification.
[RDFM&S:Section5] There is a subset of Resources called Properties.
[RDFM&S:glossary] A specific attribute with defined meaning that may be used to describe other resources. A property plus the value of that property for a specific resource is a statement about that resource. A property may define its permitted values as well as the types of resources that may be described with this property.
[Jena] A property is an attribute of a resource. For example DC.title is a property, as is RDF.type.
[N3] A sort of relationship between two things; a binary relation.
To carry information, however, this text is understood to be a legitimate value for the accompanying property.
[RDFM&S:introduction] a simple string or other primitive data-type defined by XML. In RDF terms, a literal may have content that is XML markup but is not further evaluated by the RDF processor
[RDFM&S:Section5] (any well-formed XML)
[RDFM&S:glossary] The most primitive value type represented in RDF, typically a string of characters. The content of a literal is not interpreted by RDF itself and may contain additional XML markup. Literals are distinguished from Resources in that the RDF model does not permit literals to be the subject of a statement
[Jena] A string of characters which can be the value of a property.
[MH] It would be a forward reference to talk about the literal being interpreted (or not) as XML.
[RDFM&S:introduction] The object being described (in the XML syntax indicated by the about attribute) is in RDF called the referent.
[RDFT&C] The entity or concept that an RDF Resource describes. [RDFM&S]
[MH] This `definition' (of `referent') is made in [RDFM&S:introduction] but the word is hardly ever used by anyone: is it helpful? I think not. (`Subject' is essential, of course.)
[RDFM&S:Section5] There is a set called Statements, each element of which is a triple of the form {pred, sub, obj} where pred is a property (member of Properties), sub is a resource (member of Resources), and obj is either a resource or a literal (member of Literals).
[RDFM&S:glossary] An expression following a specified grammar that names a specific resource, a specific property (attribute), and gives the value of that property for that resource. More specifically here, an RDF statement is a statement using the RDF/XML grammar specified in this document.
[Jena] An arc in an RDF graph, normally interpreted as a fact.
[N3] A subject, predicate and object which assert meaning defined by the particular predicate used.
[RDFM&S:glossary] A representation of a statement used by RDF, consisting of just the property, the resource identifier, and the property value in that order.
[Jena] A structure containing a subject, a predicate and an object. Another term for a statement.
This only makes sense if such a resource plays at least two roles within a connected set of statements (say as an object in one triple and a subject of another). It carries its identity between two references without any requirement to declare its address (or its accessibility) in the Web.
The anonymity of such a resource becomes apparent chiefly when the triples involved are expressed as text and some means must be found to refer to it without a URI.
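One way to picture this is sketched below, under stated assumptions: an anonymous resource is modelled as a Python object that has identity but no URI, and a label is invented for it only when the triples are written out as text. The class and function names and the example URIs are hypothetical; the `_:` prefix follows the convention N-Triples uses for such labels.

```python
import itertools

class Anonymous:
    """A resource with identity (the object itself) but no URI."""

def label_nodes(triples):
    """Copy the triples, replacing each anonymous resource with an
    invented local label that is the same wherever it occurs."""
    counter = itertools.count()
    labels = {}

    def term(t):
        if isinstance(t, Anonymous):
            if t not in labels:
                labels[t] = f"_:b{next(counter)}"
            return labels[t]
        return t

    return [tuple(term(t) for t in triple) for triple in triples]

address = Anonymous()  # plays two roles: object of one triple, subject of another
triples = [
    ("http://example.org/people#A", "http://example.org/terms#has-address", address),
    (address, "http://example.org/terms#city", "Sophia-Antipolis"),
]
for triple in label_nodes(triples):
    print(triple)
```

The label carries no meaning outside the text it appears in; any other consistent choice of labels would describe the same triples.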
A single triple is often expressed `graphically' as two figures connected by an `arc', a curved line that shows the relationship. A collection of these connected figures represents the mathematical concept of a `graph'.
[RDFT&C] A set of RDF Statements.
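Taking `a set of RDF Statements' literally gives a very small data structure. The sketch below is an assumption-laden illustration rather than any toolkit's API: statements are named tuples in the {pred, sub, obj} order used by [RDFM&S:Section5], the short names `A', `B' and `green' stand in for full URIs and literals, and `arcs_from` is a hypothetical helper.

```python
from collections import namedtuple

# Field order follows the {pred, sub, obj} form quoted from RDFM&S.
Statement = namedtuple("Statement", ["pred", "sub", "obj"])

graph = set()
graph.add(Statement("is-the-father-of", "A", "B"))
graph.add(Statement("has-colour", "B", "green"))
graph.add(Statement("is-the-father-of", "A", "B"))  # same statement again

def arcs_from(graph, node):
    """The arcs leaving `node`, as (property, target) pairs."""
    return sorted((s.pred, s.obj) for s in graph if s.sub == node)

print(len(graph))             # the repeated statement is not stored twice
print(arcs_from(graph, "A"))
print(arcs_from(graph, "B"))
```

Using a set means that stating the same triple twice adds nothing, which matches the reading of a graph as a collection of distinct arcs between figures.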
[MH] Note that nobody seems to want to include this in a glossary
The concept of a document is important since it defines the limits of validity of certain local devices in the expression of RDF such as anonymous resources.
[MH] Note that nobody seems to want to include this in a glossary
[RDFT&C] [See RDFM&S section 5]. This term is used in three distinct ways:
- The RDF Model, meaning the underlying structure and interpretation of RDF data
- An RDF Model, meaning an instance of a collection of RDF statements
- Logical Model, being a formal logicians' term with quite specific meaning. (see http://www-rci.rutgers.edu/~cfs/305_html/Deduction/FormalSystemDefs.html).
(This term has caused some confusion, since it has a quite specific meaning to logicians, which is not the same as some would regard as its "natural" meaning.)
[MH] Not so much a vocabulary item as a place-holder for other concepts. Over-used: stress this. For example: Document Object Model.
The distinction may be apparent in the interpretation of the RDF or may be inherent in the meaning of the property part of the triple.
Each container is labelled and defined as being one of:
[MH] Not defined elsewhere
[MH] Not defined elsewhere
To exploit this semantic information, an RDF document will be associated with a schema and this schema will be shared by all documents which require a common set of semantics.
Typically a schema is defined for a particular community or industry which requires a consistently interpreted standard for exchanged RDF.
One purpose of a schema is to define what sort of literal is a legitimate value for a particular property although it does not explain how to interpret the literal.
[Accessibility] An RDF schema denotes resources which constitute the particular unchanging versions of an RDF vocabulary at any point in time. It is used to provide semantic information (such as organization and relationship) about the interpretation of the statements in an RDF data model. It does not include the values associated with the attributes.
Once a subject's ontology has been devised and agreed upon, semantic information can be exchanged between interested parties reliably and rigorously.
[Accessibility] An ontology in RDF and Artificial Intelligence infers a document or file that formally defines the relations among terms. Ontologies establish a joint terminology between members of a community of interest. These members can be human or automated agents. The most typical kind of ontology for the Web has a taxonomy and a set of inference rules.
[ScAm] a document or file that formally defines the relations among terms
Typically, because so much of the Web is oriented towards text transmission, this will be as characters and, because so much of the world of computers is biased towards its Western origins, this will be in the most restricted set of characters.
As a side effect, RDF may appear readable to various degrees by ordinary humans.
The major form of serialization is XML. This is an example of a triple in XML form:
<rdf:Description rdf:about="http://www.profium.com/">
  <a:location>Sophia-Antipolis</a:location>
</rdf:Description>
However, this is not the only form. More readable forms exist such as N3 or N-Triples.
A serialization should promote the same interpretation as any other, and any program charged with translating to or from any serialization style should yield the same semantic information. Note that how anonymous resources are expressed is theoretically immaterial since they have no relation to anything outside of their document.
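As a rough illustration of one triple surviving a change of serialization, the sketch below reads the XML example and writes the same triple as an N-Triples-style line. Several things here are assumptions: the fragment is wrapped in an `rdf:RDF` root so that it parses on its own, the `a:` prefix is bound to a made-up namespace because the example does not declare one, and the tiny reader handles only this one shape of RDF/XML with no escaping of literal text. It is not a general RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
A_NS = "http://example.org/schema#"  # hypothetical binding for the a: prefix

RDF_XML = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:a="{A_NS}">
  <rdf:Description rdf:about="http://www.profium.com/">
    <a:location>Sophia-Antipolis</a:location>
  </rdf:Description>
</rdf:RDF>"""

def triples_from_rdfxml(text):
    """Read (subject, property, literal) triples from simple Descriptions."""
    root = ET.fromstring(text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            triples.append((subject, namespace + local, prop.text))
    return triples

def to_ntriples(triples):
    """Write each triple as one line: <subject> <property> "literal" ."""
    return "\n".join(f'<{s}> <{p}> "{o}" .' for s, p, o in triples)

print(to_ntriples(triples_from_rdfxml(RDF_XML)))
```

The subject, property and value are the same in both forms; only the notation changes.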
[RDFT&C] [See RDFM&S section 5] A resource that stands for the statement together with the four statements that describe the statement. More than one reification may exist for a given statement. (There is some debate whether multiple reifications of a statement are necessarily equivalent.)
[RDFT&C] [See RDFM&S section 5] A resource that stands for a statement in a Reification. This resource has four properties describing the statement, and maybe others.
[RDFM&S:introduction] A new resource with the above four properties represents the original statement and can both be used as the object of other statements and have additional statements made about it. The resource with these four properties is not a replacement for the original statement, it is a model of the statement. A statement and its corresponding Reified statement exist independently in an RDF graph and either may be present without the other. The RDF graph is said to contain the fact given in the statement if and only if the statement is present in the graph, irrespective of whether the corresponding reified statement is present
[RDFM&S:Section5] Reification of a triple {pred, sub, obj} of [the set of] Statements is an element r of [the set of] Resources representing the reified triple and the elements s1, s2, s3, and s4 of [the set of] Statements such that s1: {RDF:predicate, r, pred} s2: {RDF:subject, r, subj} s3: {RDF:object, r, obj} s4: {RDF:type, r, [RDF:Statement]}
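The four statements s1 to s4 can be generated mechanically. The following is a sketch under assumptions: statements are named tuples in {pred, sub, obj} order, the RDF vocabulary is spelled with the RDF syntax namespace URI, and the function name `reify`, the URI chosen for r, and the example statement are hypothetical.

```python
from collections import namedtuple

Statement = namedtuple("Statement", ["pred", "sub", "obj"])

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def reify(statement, r):
    """Return s1..s4 describing `statement`, with `r` as the resource
    that stands for it."""
    return [
        Statement(RDF + "predicate", r, statement.pred),  # s1
        Statement(RDF + "subject", r, statement.sub),     # s2
        Statement(RDF + "object", r, statement.obj),      # s3
        Statement(RDF + "type", r, RDF + "Statement"),    # s4
    ]

original = Statement("http://example.org/terms#location",
                     "http://www.profium.com/",
                     "Sophia-Antipolis")
reified = reify(original, "http://example.org/statements#s-1")

for s in reified:
    print(s)

# The reified form is a model of the statement, not the statement itself.
print(original in reified)
```

The last line reflects the point made in [RDFM&S:introduction]: a graph holding only these four statements does not thereby contain the original fact.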
[RDFT&C] A (bag/collection?) containing the reifications of the statements in an RDF Graph