RE: Trying to create generic schema from Michael Kay on 2010-01-15 (xmlschema-dev@w3.org from January 2010)

From: Michael Kay <mike@saxonica.com>
Date: Fri, 15 Jan 2010 09:25:37 -0000
To: "'Jeff Faath'" <jfaath@gmail.com>, <xmlschema-dev@w3.org>
Message-ID: <7849339CF2A842C39A6629C4674B7857@Sealion>
There's no right answer to this. The generic coding imposes fewer
constraints on what can appear in the data, which is both good and bad.
Raising the level of abstraction in your design will tend to make the design
more versatile and more capable of dealing with changing requirements, at
the expense of being more complex and harder to work with. If you take it to
extremes you get to coding schemes like HL7 that appear to have raised the
level of abstraction to a point where the information is totally
incomprehensible to everyone except a few high priests.
 
XML is extensible and it's not wrong to take advantage of the fact that the
tagset can be extended. It is wrong, however, to capture "data" (as distinct
from metadata) in the tag names. But it's a very fine line between these three:
 
<mobile-phone>07778887777</mobile-phone>
 
<phone role="mobile">07778887777</phone>
 
<attribute name="mobile-phone" value="07778887777"/>
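
As a rough illustration of what each choice costs in the schema, here is how
the three forms might be declared in XSD. This is only a sketch: the "home"
role is a made-up second value, and all content is assumed to be plain strings.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Form 1: the kind of phone lives in the tag name.
       Every new kind of phone needs its own declaration. -->
  <xs:element name="mobile-phone" type="xs:string"/>

  <!-- Form 2: one element; the kind is an attribute limited to known roles. -->
  <xs:element name="phone">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="role" use="required">
            <xs:simpleType>
              <xs:restriction base="xs:string">
                <xs:enumeration value="mobile"/>
                <!-- "home" is a hypothetical second role, for illustration -->
                <xs:enumeration value="home"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:attribute>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

  <!-- Form 3: a fully generic name/value pair.
       This declaration accepts any name with any value. -->
  <xs:element name="attribute">
    <xs:complexType>
      <xs:attribute name="name" type="xs:string" use="required"/>
      <xs:attribute name="value" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Moving from form 1 to form 3, the schema needs changing less often but also
checks less.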
 
Your decision.
 
Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay 


  _____  

From: xmlschema-dev-request@w3.org [mailto:xmlschema-dev-request@w3.org] On
Behalf Of Jeff Faath
Sent: 14 January 2010 19:34
To: xmlschema-dev@w3.org
Subject: Trying to create generic schema


Greetings,

I have a general question about creating a schema that is generic (from the
business point of view) vs non-generic.  I was wondering if I could get
advice from experts on the advantages or pitfalls of these two methods.  I'm
inexperienced with building XML documents so I don't have the foresight I
would like to have.  One method might look nicer now, but I'm worried it
might have issues with future processing and extensibility needs.  The
question is probably best explained with an example.

We have multiple client applications that send data readings to a central
processor.  The context of the readings differs based on which application they
came from.  We anticipate having more applications in the future that send
more contextual data.  The schema is more complicated than this, but this
sums up the issue:

Generic

<root>
  <data type="appA">
    <reading>X</reading>
    <reading>Y</reading>
  </data>
  <data type="appB">
    <reading>1</reading>
    <reading>2</reading>
  </data>
</root>

Non-generic

<root>
  <appAdata>
    <appAreading>X</appAreading>
    <appAreading>Y</appAreading>
  </appAdata>
  <appBdata>
    <appBreading>1</appBreading>
    <appBreading>2</appBreading>
  </appBdata>
</root>

In the first example, the schema is much simpler with just three elements.
The type of data is determined by the 'type' attribute and that allows for
understanding the context of the reading values.  Having a new app installed
in the system would not require changing the schema.
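
A minimal sketch of what that three-element schema might look like in XSD,
assuming readings are carried as plain strings and that the 'type' attribute
is required (neither is stated above). Because appA sends letters and appB
sends numbers, the shared reading element has to accept both, so a schema of
this shape cannot check a reading against the app it came from.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <!-- One generic block per application; any number may appear. -->
        <xs:element name="data" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- Assumed xs:string so that both "X" and "1" are valid. -->
              <xs:element name="reading" type="xs:string"
                          minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
            <!-- Free-form app name: a new app needs no schema change. -->
            <xs:attribute name="type" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```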

In the second example, there are elements for each app and corresponding
readings.   The context is found in the explicit elements.  Adding a new app
would require adding a new element to the schema (and re-generating binding
classes).
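
A sketch of how this explicit version might be declared in XSD. The integer
type for appB readings is an assumption drawn from the sample values 1 and 2,
and the fixed order and optionality of the two blocks are likewise assumed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <!-- Block for application A; readings assumed to be strings. -->
        <xs:element name="appAdata" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="appAreading" type="xs:string"
                          minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <!-- Block for application B; readings assumed to be integers. -->
        <xs:element name="appBdata" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="appBreading" type="xs:integer"
                          minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <!-- A new application would need another block added here. -->
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Here each app's readings can carry their own type, at the price of one more
declaration per app.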

The first example seems so much nicer, but I can't help but think there may
be some pitfalls to keeping it so clean.  The second example seems verbose,
but being explicit leaves no room for doubt when processing.  Might anyone
have some insight into how to go about this?  Is there some middle ground
between the two methods?

Regards,

JF
Received on Friday, 15 January 2010 09:26:07 UTC