Re: What *is* RDF? from Peter Frederick Patel-Schneider on 2011-03-25 (public-rdf-wg@w3.org from March 2011)

From: Peter Frederick Patel-Schneider <pfps@research.bell-labs.com>
Date: Fri, 25 Mar 2011 12:49:52 -0400
To: <msporny@digitalbazaar.com>
CC: <public-rdf-wg@w3.org>
Message-ID: <20110325.124952.1531087018520174635.pfps@research.bell-labs.com>
From: Manu Sporny <msporny@digitalbazaar.com>
Subject: What *is* RDF?
Date: Thu, 24 Mar 2011 19:30:37 -0500

> On 24 Mar 2011, at 19:32, Peter Frederick Patel-Schneider wrote:

[...]

> The same could be said about RDF, couldn't it? Where would I point a Web
> Developer to get an understanding of RDF? You could say that I'd point
> them to RDF Concepts, but that doesn't help figure out how to serialize
> the data, does it? I point them at the HTML+RDFa Primer and now they
> kind-of understand it, but they really need to go read the XHTML+RDFa
> spec to get a full understanding, which then requires them to read the
> XHTML spec, and then the XML spec, as well as the XML Namespaces spec,
> and then the URI spec, ad nauseum. Each new spec raises a slew of new
> questions. What if they ask the question "What does a URI represent in
> the semantic web?" - well, I could point them at the HTTP Range 14
> decision or the Cool URIs document, but that would just confuse them
> even more.

[...]

Well, I just whipped up the following, which I think is a first cut at
what I might give to a knowledgeable CS person (whether this group covers
enough web developers is a different question, of course).  Of course,
it is a lot longer than Richard's characterisation of JSON, but this is
only to be expected.

peter


		What is RDF(S)?

RDF(S) (Resource Description Framework (Schema)) is a logic [but don't be
scared by this] (and data model) for representing information on the
Web.  

RDF(S) uses RDF graphs to represent information.  An RDF graph is a set
of facts or RDF triples, each of which has a subject, a predicate, and
an object.  The RDF triple
  <http://www.w3.org/People/EM/contact#me> <http://xmlns.com/foaf/0.1/name> "Eric Miller"
says that the resource identified by http://www.w3.org/People/EM/contact#me
is related via the relationship identified by
http://xmlns.com/foaf/0.1/name to the string (identified by) "Eric Miller". 
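
As a rough illustration (not part of any RDF specification), an RDF
graph can be modelled in Python as a set of 3-tuples.  Writing URI
references in angle brackets and literals in double quotes is just a
convention assumed for this sketch.

```python
# Sketch only: an RDF graph as a plain Python set of (subject, predicate, object).
# The string notation (<...> for URI references, "..." for literals) is an
# assumption of this example, loosely following how the triple is written above.
triple = (
    "<http://www.w3.org/People/EM/contact#me>",
    "<http://xmlns.com/foaf/0.1/name>",
    '"Eric Miller"',
)

graph = set()
graph.add(triple)
graph.add(triple)  # stating the same fact twice changes nothing: a graph is a set

subject, predicate, obj = triple
```

Because the graph is a set, it has no ordering and no duplicates.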

RDF(S) considers everything to be a resource.  Resources can be
(non-uniquely) identified by either a URI reference, a blank node, or a
literal.  The subject of a triple can be a URI reference or a blank
node.  The predicate (or property) of a triple can only be a URI reference.
The object of a triple can be a URI reference, a blank node, or a
literal.
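
The positional rules can be sketched as a small check: a subject is a
URI reference or a blank node, a predicate is always a URI reference,
and an object may be any of the three.  The class names URIRef,
BlankNode and Literal, and the function is_valid_triple, are
hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical wrapper types for the three kinds of identifier.
@dataclass(frozen=True)
class URIRef:
    value: str

@dataclass(frozen=True)
class BlankNode:
    label: str

@dataclass(frozen=True)
class Literal:
    value: str

def is_valid_triple(subject, predicate, obj) -> bool:
    """Check which kinds of identifier appear in which position of a triple."""
    return (
        isinstance(subject, (URIRef, BlankNode))              # no literal subjects
        and isinstance(predicate, URIRef)                     # URI references only
        and isinstance(obj, (URIRef, BlankNode, Literal))     # anything goes
    )

me = URIRef("http://www.w3.org/People/EM/contact#me")
name = URIRef("http://xmlns.com/foaf/0.1/name")

ok = is_valid_triple(me, name, Literal("Eric Miller"))
literal_subject = is_valid_triple(Literal("Eric Miller"), name, me)
blank_predicate = is_valid_triple(me, BlankNode("b0"), Literal("Eric Miller"))
```

Here only the first triple passes the check; the other two put a literal
in subject position and a blank node in predicate position.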


RDF(S) uses URI references as identifiers precisely so that there are
not accidental collisions between the identifiers used in different
places.  URI references then form *universal* (non-unique) identifiers
for resources.  Because URI references are so long it is common to write
them in short forms, like em:contact#me or foaf:name.  In informal
settings the expansion from these short forms into URI references may be
left unstated, but there are official ways of specifying the expansion.
URI references that are commonly written starting rdf:, rdfs:, xsd:, and
owl: are considered to be built-in to RDF(S).
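
Expanding a short form is just string concatenation against a prefix
table.  A minimal sketch follows; the foaf: expansion matches the URI
used in the triple above, while the em: expansion is an assumption made
so that em:contact#me expands to the subject of that triple.

```python
# Prefix table for this sketch.  The "em" entry is an assumed mapping.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "em": "http://www.w3.org/People/EM/",
}

def expand(short_form: str, prefixes: dict = PREFIXES) -> str:
    """Expand a prefix:local short form into a full URI reference."""
    prefix, sep, local = short_form.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"no expansion known for {short_form!r}")
    return prefixes[prefix] + local

full_name = expand("foaf:name")
full_me = expand("em:contact#me")
```

Raising an error for an unknown prefix, rather than passing the string
through, is a design choice of the sketch: it keeps an unstated
expansion from silently turning into a different identifier.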

RDF(S) allows the identification of resources via blank nodes, that is
without the need to use an identifier!  This may seem strange at first
glance, but it really amounts to saying that there is some resource out
there that we don't currently have a good (universal) identifier for.
[This captures part of the notion of existential quantification from
logic, but, again, don't be scared by this.]  A blank node can also be
considered to be a *private* identifier for a resource.  [This is
somewhat like Skolemization in logic, but, again, don't be scared by
this.]

RDF(S) resources include data values (literals), i.e., strings, numbers,
etc.  Aside from strings with a language tag, the identifiers for these
are borrowed from XML Schema datatypes, but datatypes can be extended at
will.  It is best not to use this extension facility, and to restrict
oneself to strings with language tags plus the built-in boolean,
numeric, string, and date datatypes of XML Schema datatypes.

It is very important to realize that there is no form of any unique
names assumption in RDF(S), nor is there any assumption that two
different identifiers identify different resources unless they are both
literal identifiers identifying different values, nor is there any
assumption that an RDF graph carries all the information relevant to any
particular task at hand.


Information in RDF(S) is generally unordered.  RDF collections are used
to create ordered lists, which can be nested.  The identifiers
rdf:first, rdf:rest, rdf:nil, and rdf:List are used for this purpose,
but it is best to just write down lists using brackets.  It is also best
to not use RDF collections unless they are absolutely needed.  In
particular, RDF collections should not be used for multiple values.
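
To show what the bracket notation stands for, here is a sketch of how an
ordered list unfolds into triples: a chain of blank nodes linked by
rdf:first (the item) and rdf:rest (the remainder), ending in rdf:nil.
The function names and the _:bN blank node labels are assumptions of
this example.

```python
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def encode_list(items, fresh=None):
    """Return (head, triples) for an RDF collection holding items in order."""
    if fresh is None:
        fresh = count()          # source of fresh blank node labels
    triples = set()
    head = RDF + "nil"           # the empty list
    for item in reversed(items): # build the chain from the tail forwards
        node = f"_:b{next(fresh)}"
        triples.add((node, RDF + "first", item))
        triples.add((node, RDF + "rest", head))
        head = node
    return head, triples

def decode_list(head, triples):
    """Walk the rdf:first / rdf:rest chain starting at head."""
    first = {s: o for s, p, o in triples if p == RDF + "first"}
    rest = {s: o for s, p, o in triples if p == RDF + "rest"}
    items = []
    while head != RDF + "nil":
        items.append(first[head])
        head = rest[head]
    return items

head, list_triples = encode_list(["a", "b", "c"])
round_trip = decode_list(head, list_triples)
```

A three-item list costs six triples and three blank nodes, which is one
reason not to reach for collections when a property with several values
would do.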


RDF(S) incorporates a simple object-oriented type system (or, to be more
precise, theory of classes).  Resources belong to classes, and the
relationship between a resource and the various classes it belongs to is
carried by rdf:type.  Classes are organized into a (multiple-parent)
hierarchy; the relationship between a class and its ancestors is carried
by rdfs:subClassOf.  Properties are also organized into a
(multiple-parent) hierarchy, using rdfs:subPropertyOf.
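
The two hierarchies have simple consequences that can be sketched as
rules applied until nothing new appears: membership in a class implies
membership in its ancestors, and a triple using a property implies the
same triple using its parent properties.  The ex: identifiers below are
made up for the example, and the naive nested loop is for clarity, not
efficiency.

```python
def closure(triples):
    """Apply the subclass and subproperty rules until a fixed point."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # x rdf:type C  and  C rdfs:subClassOf D   =>  x rdf:type D
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and o == s2:
                    new.add((s, "rdf:type", o2))
                # x p y  and  p rdfs:subPropertyOf q       =>  x q y
                if p2 == "rdfs:subPropertyOf" and p == s2:
                    new.add((s, o2, o))
        if new <= inferred:      # nothing new was derived
            return inferred
        inferred |= new

graph = {
    ("ex:rex", "rdf:type", "ex:Dog"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
    ("ex:rex", "ex:hasOwner", "ex:alice"),
    ("ex:hasOwner", "rdfs:subPropertyOf", "ex:knows"),
}
result = closure(graph)
```

After the closure, ex:rex also has the types ex:Mammal and ex:Animal,
and the triple ex:rex ex:knows ex:alice is present.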

There are built-in RDF(S) classes for all properties, namely
rdf:Property; resources (i.e., everything), rdfs:Resource; literals,
rdfs:Literal; datatype classes, rdfs:Datatype; and classes, rdfs:Class.
It is best not to directly add information to any of these built-in
classes, except stating that a non-built-in identifier has type
rdfs:Class or type rdf:Property.

Properties can be given domains, via rdfs:domain, and ranges, via
rdfs:range.  A property can have multiple domains or ranges, which are
considered to be conjunctive.  It is best not to use these built-in
properties except to give domain or range information to a non-built-in
property.
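
"Conjunctive" here means that every stated domain applies to the
subject and every stated range applies to the object of a triple using
the property.  A minimal one-pass sketch, again with made-up ex:
identifiers:

```python
def apply_domain_range(triples):
    """Add the rdf:type triples implied by rdfs:domain and rdfs:range."""
    inferred = set(triples)
    for s, p, o in triples:
        for s2, p2, o2 in triples:
            if s2 != p:
                continue                              # statement is about another property
            if p2 == "rdfs:domain":
                inferred.add((s, "rdf:type", o2))     # subject belongs to the domain
            elif p2 == "rdfs:range":
                inferred.add((o, "rdf:type", o2))     # object belongs to the range
    return inferred

graph = {
    ("ex:author", "rdfs:domain", "ex:Document"),
    ("ex:author", "rdfs:domain", "ex:Work"),
    ("ex:author", "rdfs:range", "ex:Person"),
    ("ex:book1", "ex:author", "ex:alice"),
}
result = apply_domain_range(graph)
```

With two domains declared, ex:book1 ends up in both ex:Document and
ex:Work, not in one or the other.  Note that domains and ranges derive
new facts; they do not reject triples as ill-typed.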

All other parts and uses of the built-in vocabulary are controversial
and best avoided.


There is a full-fledged logic that provides the formal meaning for RDF
graphs, specified by the RDF Semantics document.  Mostly this just
states obvious things about RDF graphs, but a full understanding of
RDF(S) may require an understanding of this document.  For various
historical reasons, this document divides the meaning into several
sections, but this division can be ignored.  This document also does not
define a full set of datatypes.  The document is missing a few bits that
many users of RDF(S) consider to be part of RDF(S), notably a notion of
equality.


RDF(S) doesn't really care about serialization.  Serialization is just a
necessary evil required to transfer RDF graphs across the Web.  The
official RDF(S) serialization is RDF/XML, a serialization designed to be
simple (but it isn't), human readable (but it isn't), and compatible
with XML (but it isn't really).  Nonetheless all these annoying issues
can be ignored.  The only issue about RDF/XML that should not be
ignored is that RDF/XML cannot serialize all RDF graphs, so it is best
to not use predicates that have URIs that cannot ... [does anyone have a
good description to go here?].
Received on Friday, 25 March 2011 16:50:43 UTC