RE: RDF and XLink

I think that a lot of the issues in making a mapping
from XLinks to RDF statements have been covered in
various messages on and off various lists. Rather than
recapitulate those discussions, I think the fastest way
to get to adding an Appendix to the XLink spec is to
draft one and have people start to critique it.

So, here's a first draft. People should feel free to
critique it; you are not going to hurt my feelings.

Ron


Appendix X: Harvesting RDF Statements from XLinks

Introduction:
=============

The Resource Description Framework (RDF) [cite] is a
W3C Recommendation for providing machine-understandable
information about web resources. Both XLink and RDF provide
a way of asserting relations between resources. RDF is
primarily for describing resources and their relations, while
XLink is primarily for specifying and traversing hyperlinks.
However, the overlap between the two is sufficient that a
mapping from XLinks to statements in an RDF Model can be
defined. Such a mapping allows XML Linking elements in XML
documents to be harvested as a source of
RDF statements. XLinks thus provide an alternate syntax for
RDF information that may be useful in some situations.

This appendix is (non?)-normative.

Readers of this appendix are assumed to be familiar with the
XML Linking specification, the RDF Model and Syntax 
Recommendation, and the RDF Schema (???) Recommendation.


Principles of the Mapping
=========================

Simple RDF statements consist of a subject, a predicate,
and an object. The subject and predicate are
identified by URIs; the object may be a URI or a literal
string. To map an XLink into a statement we
need to be able to determine the URIs of the subject and
predicate. We must also be able to determine the object,
be it a URI or a literal.

The general principle behind the mapping specified in
this document is that each arc in an XLink gives rise to one RDF
statement. The starting resource of the arc is mapped to
the subject of the RDF statement. The ending resource of the
arc is mapped to the object of the RDF statement. The arcrole
is mapped to the predicate of the RDF statement. However, a
number of corner cases arise; the detailed mapping rules cover them.



[If we are going to make any of this normative, the normative
stuff starts here. Also, do we want to define a directive
to tell harvesting software not to harvest a particular XLink?
Also will need standard thing saying that when we say things like
"xlink:type" or "rdfs:Class", we mean equivalent according to
namespace spec, not literal use of those prefixes.]

Mapping Specification:
======================

Simple linking elements
-----------------------

The starting resource of the simple link is
mapped to the subject of the RDF statement. Note that
the starting resource of a simple link is the linking
element itself. Also note that the subject of
an RDF statement must be identified by a URI. Therefore, to harvest a simple
link into an RDF statement, the harvesting software must synthesize
a URI reference using an XPointer that selects the linking element.
Any number of equivalent XPointers could be synthesized.
In order to ensure identical models when different 
implementations are harvesting XLinks, the following
guidelines should (must?) be followed when synthesizing
the XPointer.
The XPointer should start from the nearest ancestor of the
linking element, including the linking element itself, that
has an attribute of type ID specified.
If there are none, then the XPointer starts from the
document root. Element name navigation is then used to
construct the path to the linking element from the starting element. For
example, ...
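
A rough sketch of these guidelines in Python, using only the
standard library, is given here. It is not a definitive
implementation. It assumes that ID-typed attributes are
literally named "id" (no DTD is consulted), it ignores
namespaces, and the xpointer(...) string it emits is only my
guess at a reasonable spelling. The name synthesize_xpointer
and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def synthesize_xpointer(root, target, id_attr="id"):
    """Build a pointer to `target`, starting from the nearest
    ancestor-or-self that carries an ID, else from the document root."""
    # ElementTree keeps no parent links, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}

    steps = []
    node = target
    # Climb until we hit an element with an ID or run out of parents.
    while node.get(id_attr) is None and node in parent:
        p = parent[node]
        same_name = [c for c in p if c.tag == node.tag]
        # Element-name navigation: name plus 1-based position among siblings.
        steps.append(f"{node.tag}[{same_name.index(node) + 1}]")
        node = p

    if node.get(id_attr) is not None:
        start = f'id("{node.get(id_attr)}")'
    else:
        start = "/" + node.tag  # no ID found: start at the document root
    return "xpointer(" + "/".join([start] + steps[::-1]) + ")"


# Hypothetical document: one section has an ID, the other does not.
doc = ET.fromstring(
    '<doc><sec id="s1"><p/><p><a/></p></sec><sec><p><a/></p></sec></doc>'
)
with_id = synthesize_xpointer(doc, doc[0][1][0])
without_id = synthesize_xpointer(doc, doc[1][0][0])
print(with_id)
print(without_id)
```

Two implementations that follow the same climbing rule should
arrive at the same string for the same element, which is the
point of the guidelines.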

The ending resource
of a simple link is mapped to the object of the RDF statement.
Note that the ending resource of a simple link is always a URI.

The value of the arcrole attribute, if one is given, is mapped
to the predicate of the RDF statement. Note that the value of
the arcrole attribute is already required, by the XLink
specification, to be a URI reference. [Do we want to add
the rest of this paragraph?] If no arcrole attribute
is specified, the element type of the linking element (x:tla
in the example given later in this section) is mapped to the
predicate of the RDF statement.
In this case the namespace URI and the NCName are concatenated
using the approach documented in the RDF M&S specification in
order to synthesize the URI reference for the predicate.

If a role attribute is specified on the simple link, it is
harvested according to the following procedure.
Two additional statements are added to the RDF model. The
first is a statement whose subject is the ending resource of
the simple link, whose predicate is "rdf:type", and whose object
is the resource identified by the role attribute. The second
is a statement whose subject is the resource identified by the
role attribute, whose predicate is "rdf:type" and whose
object is the resource "rdfs:Class".

An example of such an element is
  ... In a <x:tla xlink:type="simple"
     xlink:href="http://www.foo.com/papers/crops.txt"
     xlink:arcrole="http://links.org/namespace/cite"
     xlink:role="http://links.org/namespace/screed"
  >recent paper</x:tla>, Dr. Taylor assumes that ...

Mapping that link according to this specification results
in the RDF model shown below in figure 1:

 @recent paper@ --- cite --> http://www.foo.com/papers/crops.txt
 ...crops.txt --- rdf:type ---> ...screed
 ...screed --- rdf:type ---> rdfs:Class
Figure 1: Sample RDF Model constructed with arcrole

If the arcrole had not been specified, then the result
would have been the RDF model shown in figure 2.

 @recent paper@ --- x:tla --> http://www.foo.com/papers/crops.txt
 ...crops.txt --- rdf:type ---> ...screed
 ...screed --- rdf:type ---> rdfs:Class
Figure 2: Sample RDF Model not using arcrole attribute
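
To make the two figures concrete, here is a minimal Python
sketch of the simple-link rules. It is only an illustration
under stated assumptions: the namespace URI bound to the x
prefix (http://example.org/x#) and the SUBJECT string standing
in for the synthesized XPointer URI reference are both made
up, and harvest_simple and tag_to_uri are hypothetical names.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"


def tag_to_uri(tag):
    """Concatenate namespace URI and local name of an ElementTree tag."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns + local
    return tag


def harvest_simple(elem, subject_uri):
    """Return (subject, predicate, object) triples for one simple link."""
    def xl(name):
        return elem.get(f"{{{XLINK}}}{name}")

    href, role = xl("href"), xl("role")
    if xl("type") != "simple" or href is None:
        return []
    # Predicate: the arcrole if present, otherwise the element type.
    predicate = xl("arcrole") or tag_to_uri(elem.tag)
    triples = [(subject_uri, predicate, href)]
    if role is not None:
        triples.append((href, RDF_TYPE, role))
        triples.append((role, RDF_TYPE, RDFS_CLASS))
    return triples


# Assumption: the x prefix is bound to a made-up namespace.
link = ET.fromstring(
    '<x:tla xmlns:x="http://example.org/x#" '
    'xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:type="simple" '
    'xlink:href="http://www.foo.com/papers/crops.txt" '
    'xlink:arcrole="http://links.org/namespace/cite" '
    'xlink:role="http://links.org/namespace/screed">recent paper</x:tla>'
)
# Assumption: stand-in for the synthesized XPointer URI reference.
SUBJECT = "http://example.org/doc.xml#link"

figure1 = harvest_simple(link, SUBJECT)

# Drop the arcrole to get the model of figure 2.
del link.attrib[f"{{{XLINK}}}arcrole"]
figure2 = harvest_simple(link, SUBJECT)

for triple in figure1 + figure2:
    print(triple)
```

The only difference between the two results is the predicate of
the first statement: the arcrole URI in one case, the
concatenated element type in the other.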

Extended XML Links
------------------
Extended XML links shall be harvested into RDF statements
using the mapping rules below. We first describe the rules
for the components of an extended link (arcs, locators, and
resources). Then we describe the rules for the extended
link as a whole.

xlink:type="arc"
- - - - - - - -
XML elements with an xlink:type attribute whose value is "arc"
are known as arcs. Recall that arcs use the 'to' and
'from' attributes to specify the endpoints of 0 or more
possible traversals. Also recall that the 'from' and 'to'
attributes do not provide URIs; they provide labels which
may appear on one or more locator or resource elements.

The number of RDF statements harvested from a single arc
element is equal to the number of possible traversals
specified by the arc element. That quantity is the
product of the number of resource and/or locator elements
carrying the 'from' label and the number carrying the 'to'
label. Each RDF statement will correspond to one and only one
of the traversals.

The starting resources of the traversals will be mapped to
the subject of the RDF statement(s). The ending resources
of the traversals will be mapped to the object of the RDF
statement(s). The predicate of each RDF statement is the
value of the arcrole attribute, if one was specified.
If the arcrole attribute was not specified, the element
type of the arc is converted to a URI reference as described
in the RDF M&S specification and that is used as the
predicate for the RDF statement(s).

Note that the content of the arc element is not treated
as either a starting or ending resource. Only the 'to'
and 'from' attributes are used in the mapping.
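
The counting rule for arcs can be sketched in a few lines of
Python. This is an illustration only: harvest_arc is a
hypothetical name, the labels and example.org URIs are made
up, and the sketch assumes both 'from' and 'to' are present
and that the labelled elements have already been resolved to
URIs.

```python
from itertools import product


def harvest_arc(arc_from, arc_to, predicate, labelled):
    """Return one triple per traversal described by an arc.

    `labelled` maps each label to the URIs of the locator or
    resource elements that carry it.
    """
    starts = labelled.get(arc_from, [])
    ends = labelled.get(arc_to, [])
    # Every (start, end) pair is one traversal, hence one statement.
    return [(s, predicate, o) for s, o in product(starts, ends)]


# Hypothetical extended link: two "paper" elements, three "author" elements.
labelled = {
    "paper": ["http://example.org/p1", "http://example.org/p2"],
    "author": [
        "http://example.org/a1",
        "http://example.org/a2",
        "http://example.org/a3",
    ],
}
statements = harvest_arc(
    "paper", "author", "http://example.org/terms/writtenBy", labelled
)
print(len(statements))
```

With two starting elements and three ending elements the arc
yields six statements. A label that matches no element yields
none.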

xlink:type="locator"
- - - - - - - - - - 
Each XML element whose xlink:type attribute has a value of
"locator" is known as a locator. Each locator gives rise
to 0 or more statements in the RDF model. The subject of
all of those statements is the value of the xlink:href
attribute of the locator, except as noted below.

If the locator element provides a 'role' attribute, one
or two RDF statements are added to the model. The value
of the href attribute is mapped to the subject of the
first statement. The value of the role attribute is mapped
to the object of the statement. The predicate of the
first statement is 'rdf:type'. The value of the role
attribute is mapped to the subject of the second RDF statement.
The predicate of the second statement is 'rdf:type' and
the object of the second statement is 'rdfs:Class'. The
second statement is not added to the model if an identical
statement already exists in the model.

If the locator element provides an xlink:label attribute,
an RDF statement is added to the model. The value of
the href attribute is mapped to the subject of the statement.
The predicate of the statement is "xlink:label". The object
of the statement is the value of the label attribute.

If the locator element provides an 'xlink:title' attribute,
an RDF statement is added to the model. The value of
the href attribute is mapped to the subject of the statement.
The predicate of the statement is "xlink:title". The object
of the statement is the value of the title attribute.

XML elements with an xlink:type attribute whose value is "title"
are known as title elements. If the locator element contains
one or more title elements, one RDF statement will be added
to the model for each title element. The value of
the href attribute is mapped to the subject of each statement.
The predicate of each statement is "xlink:title". For
each RDF statement, the object
of the statement is the element content of the
corresponding title element. 
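
The locator rules can be sketched in Python as follows. Treat
this as an illustration, not as the mapping itself. The
element names, the x namespace and the example.org URIs are
made up, harvest_locator is a hypothetical name, and the
predicates are left as the prefixed names used in the text
rather than expanded to URIs. The sketch also drops any
duplicate statement, which goes slightly further than the rule
for the rdfs:Class statement.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"


def harvest_locator(loc, model):
    """Add the statements for one locator element to `model`."""
    def xl(name):
        return loc.get(f"{{{XLINK}}}{name}")

    def add(triple):
        if triple not in model:  # never add an identical statement twice
            model.append(triple)

    href, role = xl("href"), xl("role")
    if role is not None:
        add((href, "rdf:type", role))
        add((role, "rdf:type", "rdfs:Class"))
    # label and title attributes become literal-valued statements
    for name in ("label", "title"):
        if xl(name) is not None:
            add((href, f"xlink:{name}", xl(name)))
    # each child title element contributes its content as another title
    for child in loc:
        if child.get(f"{{{XLINK}}}type") == "title":
            add((href, "xlink:title", "".join(child.itertext())))
    return model


NS = 'xmlns:x="http://example.org/x#" xmlns:xlink="http://www.w3.org/1999/xlink"'
first = ET.fromstring(
    f'<x:loc {NS} xlink:type="locator" xlink:href="http://example.org/a" '
    'xlink:role="http://example.org/Paper" xlink:label="src" '
    'xlink:title="Paper A">'
    '<x:t xlink:type="title">Paper A, long title</x:t></x:loc>'
)
second = ET.fromstring(
    f'<x:loc {NS} xlink:type="locator" xlink:href="http://example.org/b" '
    'xlink:role="http://example.org/Paper"/>'
)

model = []
harvest_locator(first, model)
harvest_locator(second, model)
for triple in model:
    print(triple)
```

The second locator shares its role with the first, so it adds
only its own rdf:type statement; the rdfs:Class statement is
already in the model.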
   
xlink:type="resource"
- - - - - - - - - - 
Each XML element whose xlink:type attribute has a value of
"resource" is known as a resource. This specification uses
'Resource' with the initial capital to mean anything
identified by a URI. The lowercase 'resource' has the more
restricted meaning of an element whose xlink:type is "resource".

Each resource gives rise
to 0 or more statements in the RDF model. Unless noted
otherwise, the subject of
all of those statements is the resource element itself,
referenced using an XPointer synthesized according to the
procedure described in [section reference here, make the
XPointer thing a stand-alone section. That procedure needs
to note the special handling of elements that contain
title elements]. 

If the resource element provides a 'role' attribute, one
or two RDF statements are added to the model. The subject
of the first statement is the synthesized URI reference for
the resource. The value of the role attribute is mapped
to the object of the statement. The predicate of the
first statement is 'rdf:type'. The value of the role
attribute is mapped to the subject of the second RDF statement.
The predicate of the second statement is 'rdf:type' and
the object of the second statement is 'rdfs:Class'. The
second statement is not added to the model if an identical
statement already exists in the model.

If the resource element provides an xlink:label attribute,
an RDF statement is added to the model. The subject
of the statement is the synthesized URI reference for
the resource. 
The predicate of the statement is "xlink:label". The object
of the statement is the value of the label attribute.

If the resource element provides an 'xlink:title' attribute,
an RDF statement is added to the model. The subject
of the statement is the synthesized URI reference for
the resource. The predicate of the statement is
"xlink:title". The object of the statement is the value
of the title attribute.

XML elements with an xlink:type attribute whose value is "title"
are known as title elements. If the resource element contains
one or more title elements, one RDF statement will be added
to the model for each title element. The subject of each
RDF statement is the synthesized URI reference for the
resource. The predicate of each statement is "xlink:title".
For each RDF statement, the object of the statement is the
element content of the corresponding title element. 


Extended link as a whole
- - - - - - - - - - - -

The specifications above define a means for harvesting RDF
statements from the traversal and metadata information in
XML Links. Extended links provide additional information
on the grouping of the traversal information. Harvesting
software MAY convert that information into RDF statements
using the following procedure.  [I'm iffy on this because
we did not really specify models of models in the RDF specs.
But if we don't say something, we are going to see people
doing it their own way. If people have their own take on models
of models, that is probably fine, but...  Like I said, I'm
iffy on this part].

XML elements with an xlink:type attribute whose value is
"extended" are known as extended links. Each extended link
may result in the creation of a Bag in the RDF model. Each
RDF statement resulting from the arcs, locators, and
resources in the extended link will be referred to as
an 'underlying statement'. Each underlying statement results
in an additional statement in the model. The subject of
the additional statement is the Bag for
the extended link. The predicate for the additional statement
is one of the ordinals rdf:_1, rdf:_2, ..., selected in order
as the underlying statements are added to the model.
[Should we use a Seq instead of a Bag?].
The object of each additional statement is the Resource that
is the reification of the corresponding underlying statement.
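
For discussion, here is how I would sketch the Bag procedure
in Python. Several things in it are assumptions that the text
does not settle: the identifiers for the Bag and for the
reified statements are invented strings, the sketch types the
Bag as rdf:Bag, and it spells out each reification with the
rdf:subject, rdf:predicate, rdf:object and rdf:Statement
vocabulary from RDF M&S. The name bag_statements and the
example.org URIs are hypothetical.

```python
def bag_statements(bag_id, underlying):
    """Return the extra statements that group `underlying` in a Bag."""
    out = [(bag_id, "rdf:type", "rdf:Bag")]
    for n, (s, p, o) in enumerate(underlying, start=1):
        # Assumption: an invented identifier for the reified statement.
        stmt_id = f"{bag_id}/stmt{n}"
        out += [
            (stmt_id, "rdf:type", "rdf:Statement"),
            (stmt_id, "rdf:subject", s),
            (stmt_id, "rdf:predicate", p),
            (stmt_id, "rdf:object", o),
            # Membership uses the ordinals rdf:_1, rdf:_2, ... in order.
            (bag_id, f"rdf:_{n}", stmt_id),
        ]
    return out


underlying = [
    ("http://example.org/p1", "http://example.org/terms/cites", "http://example.org/p2"),
    ("http://example.org/p1", "xlink:title", "Paper one"),
]
extra = bag_statements("http://example.org/doc.xml#link1", underlying)
members = [t for t in extra if t[1].startswith("rdf:_")]
print(members)
```

Switching to a Seq would change only the type statement; the
ordinals already record the order in which the underlying
statements were added.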

[Ugh, I'm out of gas. Do we want to:
1) Define a Bag for the document as a whole? If we do, let's
NOT require that harvesting software make the thing when
harvesting simple links. KISS is my mantra.
2) Define a subclass of Bag to represent an extended link?
3) Do anything with the behavior attributes? I'd be happy
   not to. But if we must, I'd suggest hanging them off the
   resource that is the reification of the traversal. 
]

Later,
Ron

Received on Saturday, 13 May 2000 13:29:37 UTC