Comparing relationships XML vs. RDBs from Brian Maso on 2003-09-09 (www-ql@w3.org from July to September 2003)

From: Brian Maso <brian@blumenfeld-maso.com>
Date: Tue, 09 Sep 2003 16:29:03 -0700
To: www-ql@w3.org
Message-Id: <5.1.0.14.0.20030909154511.01e90208@blumenfeld-maso.com>
I'm having thoughts peripherally related to XQuery. Specifically what types 
of data patterns can be stored in an XDB vs. an RDB. I thought I'd try to 
spark a discussion here since this list seems so quiet...

I'm concentrating on the relationships between data entities each can 
store. Using foreign keys and primary keys in a relational database, one 
can create a 1:1 or a 1:many relationship between a row in table A and one 
(or many) row(s) in table B. This relationship is a link only between rows 
in tables A and B. One could not, for example, create a relationship 
between a row in table A and a row in *either* table B or C. Both sides of 
the link are hard-coded as to the tables that are involved.
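
As a minimal sketch of that restriction, the following uses Python's 
built-in sqlite3 module. The table names a, b and c and the column b_id 
are made up for illustration; SQLite only enforces foreign keys when the 
foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Hypothetical tables: the foreign key in "a" names table "b" and nothing else.
conn.executescript("""
    CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE c (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE a (
        id   INTEGER PRIMARY KEY,
        b_id INTEGER NOT NULL REFERENCES b(id)
    );
    INSERT INTO b VALUES (1, 'row in B');
    INSERT INTO c VALUES (2, 'row in C');
""")

# Linking a row in "a" to a row in "b" works.
conn.execute("INSERT INTO a VALUES (10, 1)")


def try_link_to_c():
    """Try to point a.b_id at the key of a row that exists only in table c."""
    try:
        conn.execute("INSERT INTO a VALUES (11, 2)")
        return True
    except sqlite3.IntegrityError:
        # The constraint only ever looks in table b, so the insert is rejected.
        return False


linked_to_c = try_link_to_c()
print("link to a row in C accepted?", linked_to_c)
```

The usual workarounds (a nullable column per target table, or an untyped 
id column plus a "which table" discriminator) give up either the single 
link or the enforced constraint.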

If you consider a table definition to be a type definition applied to row 
structures (e.g., each row in table A has a field X of type J, a field Y of 
type K, and a field Z of type L, etc.), then you could say each side of a 
relationship in an RDB schema is monomorphic as to the type of item it can 
reference.

Using XML Schemas, there is no standard way to represent a relationship 
between an element located in document A and an element in document B. The 
only inter-element relationships are within the same document. 
Specifically, the usable relationships are the XPath axes (child::, 
parent::, attribute::, etc.). So inter-entity relationships in XML Schemas 
seem a bit weaker to me than in relational schemas.
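
To make the axes concrete, here is a small sketch using the limited XPath 
subset in Python's xml.etree.ElementTree. The orders/order/line document 
is invented for the example; the point is only that every step stays 
inside one document tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical single document; all navigation below stays inside it.
doc = ET.fromstring("""
<orders>
  <order id="po-1">
    <line sku="A1" qty="2"/>
    <line sku="B7" qty="1"/>
  </order>
</orders>
""")

order = doc.find("./order")

# child axis: the <line> children of the order
children = order.findall("./line")

# parent axis: step from a <line> back up to its containing <order>
parent = doc.find(".//line[@sku='B7']/..")

# attribute axis: read an attribute of a selected element
qty = doc.find(".//line[@sku='A1']").get("qty")

print(len(children), parent.get("id"), qty)
```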

However, relationships in XML Schemas do support type polymorphism, which 
relational schemas do not. That is, specifically for the "child" role of a 
parent-child relationship any instance of the specified type *or any 
derived type* can be used. The parent's schema indicates that child X 
has type J -- but of course any instance of J or any type derived from J 
can be used.
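
A rough sketch of what that looks like in practice: the schema below 
declares child X with type J, and the instance substitutes a derived type 
through xsi:type. The type name DerivedJ and the element names label and 
extra are invented. The Python code does not validate anything; it only 
reads the derivation chain out of the schema text to show that the 
substituted type descends from the declared one.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical schema: parent/X is declared as type J; DerivedJ extends J.
schema = ET.fromstring(f"""
<xs:schema xmlns:xs="{XS}">
  <xs:complexType name="J">
    <xs:sequence><xs:element name="label" type="xs:string"/></xs:sequence>
  </xs:complexType>
  <xs:complexType name="DerivedJ">
    <xs:complexContent>
      <xs:extension base="J">
        <xs:sequence><xs:element name="extra" type="xs:string"/></xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:element name="parent">
    <xs:complexType>
      <xs:sequence><xs:element name="X" type="J"/></xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
""")

# Instance document: X carries a derived type via xsi:type.
instance = ET.fromstring(f"""
<parent xmlns:xsi="{XSI}">
  <X xsi:type="DerivedJ"><label>a</label><extra>b</extra></X>
</parent>
""")

# Map each named top-level complex type to its base type (None if it has none).
base_of = {}
for ct in schema.findall(f"{{{XS}}}complexType"):
    ext = ct.find(f".//{{{XS}}}extension")
    base_of[ct.get("name")] = ext.get("base") if ext is not None else None


def is_a(type_name, target):
    """Walk the derivation chain upward from type_name looking for target."""
    while type_name is not None:
        if type_name == target:
            return True
        type_name = base_of.get(type_name)
    return False


declared = "J"
actual = instance.find("X").get(f"{{{XSI}}}type", declared)
print(actual, "is usable where", declared, "is declared:", is_a(actual, declared))
```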

So relationships in relational schemas are extremely different from 
relationships in XML Schemas.

What are the repercussions?

1) XML Schemas can be used to model "container/contained" data patterns 
very easily: purchase order/line item, product/part, etc. Data patterns 
where the child entity has no identity away from its parent container.

Relational schemas can't represent this as well -- you have to use 
constraints, convention or other semi-standard mechanisms to indicate that 
a row in the OrderItems table is "contained by" a row in the 
"PurchaseOrder" table. The "cascaded delete" I guess is the industry 
solution to representation of this type of data pattern.

2) XML Schemas can't be used to model "association" data patterns easily: 
Employee/office floor, car model/type of oil, etc. Data patterns where 
there are two top-level entities with a relationship between them can't be 
expressed in XML Schemas very nicely, at least not in a way that the 
relationship can be easily used in an XQuery expression.
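
Both points can be sketched briefly in Python. The first half shows the 
"cascaded delete" convention from point 1 using sqlite3; the second half 
shows the kind of by-hand, value-based join that point 2 leaves you with 
when two top-level XML documents are associated. All column names, 
attribute names and sample data are invented, and the floor attribute is 
a reference only by convention: nothing in either document declares it.

```python
import sqlite3
import xml.etree.ElementTree as ET

# --- Point 1: containment in a relational schema via ON DELETE CASCADE ---
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE PurchaseOrder (id INTEGER PRIMARY KEY);
    CREATE TABLE OrderItems (
        id       INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL
                 REFERENCES PurchaseOrder(id) ON DELETE CASCADE,
        sku      TEXT
    );
    INSERT INTO PurchaseOrder VALUES (1);
    INSERT INTO OrderItems VALUES (100, 1, 'A1');
    INSERT INTO OrderItems VALUES (101, 1, 'B7');
""")

# Deleting the container row removes the contained rows with it.
conn.execute("DELETE FROM PurchaseOrder WHERE id = 1")
remaining_items = conn.execute("SELECT COUNT(*) FROM OrderItems").fetchone()[0]

# --- Point 2: association between two top-level XML documents ---
employees = ET.fromstring("""<employees>
  <employee name="Ann" floor="f2"/>
  <employee name="Raj" floor="f3"/>
</employees>""")

floors = ET.fromstring("""<floors>
  <floor id="f2" label="Second floor"/>
  <floor id="f3" label="Third floor"/>
</floors>""")

# The association has to be resolved by matching attribute values by hand.
floor_by_id = {f.get("id"): f for f in floors.findall("floor")}
floor_of = {
    e.get("name"): floor_by_id[e.get("floor")].get("label")
    for e in employees.findall("employee")
}

print(remaining_items, floor_of)
```

In XQuery the second half would likewise be written as an explicit join 
on the attribute values rather than as a step along a declared 
relationship.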

I think these two fundamental differences mean each model is "better" at 
solving different data representation problems. But obviously neither is 
hands-down a winner.

Brian Maso
Received on Wednesday, 10 September 2003 11:05:17 UTC