Describing formats [was: Caveats for Web-friendly service description]

Interesting question. I think the answer lies in the semantics of 
providing the schema; although XSD has strict validation as its primary 
use case, many people use it as a modelling language. Indeed, in Web 
services, anecdotal evidence suggests that many, if not most, 
deployments turn off Schema validation.

In other words, one of the big use cases for describing formats is on 
the receiver side -- in validation -- but another is on the sender 
side, in data binding. People also want it to be easier to create the 
documents programmatically.
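
As a rough sketch of what sender-side data binding means here, consider 
the following. The Order class, its fields, and the element names are 
all made up for illustration; they don't come from any real schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical format: these fields are illustrative only.
    order_id: str
    quantity: int


def to_xml(order: Order) -> str:
    """Serialize an Order; the binding decides element names and order."""
    root = ET.Element("order")
    ET.SubElement(root, "id").text = order.order_id
    ET.SubElement(root, "quantity").text = str(order.quantity)
    return ET.tostring(root, encoding="unicode")


doc = to_xml(Order(order_id="A-17", quantity=3))
print(doc)
```

The point is only that the sender works with an object and never 
assembles the document by hand; to generate such a binding, a tool needs 
a description that says what the document looks like.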

While it's true that there are a lot of pitfalls in data binding, 
Schematron doesn't seem to allow it at all; it's very difficult to 
reverse-engineer those XPaths into an appropriate instance document. If 
you do, you might come up with an instance that, while valid according 
to the schema, isn't what the supplier expected, and will break when 
they revise the format in the future. That's because the extensibility 
points are implicit in Schematron; you can add data that will break 
future revisions of the schema unless the schema author is aware of and 
agrees with your extension.
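
To illustrate the problem, here is a minimal sketch of rule-based 
checking in the spirit of Schematron. It is not Schematron itself: it 
uses the limited XPath subset in Python's ElementTree, and the rules and 
documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: each path must match at least one node.
RULES = ["./id", "./quantity"]


def passes(xml_text: str) -> bool:
    """Return True if every rule path matches something in the document."""
    root = ET.fromstring(xml_text)
    return all(root.find(path) is not None for path in RULES)


# What the supplier presumably had in mind.
expected = "<order><id>A-17</id><quantity>3</quantity></order>"
# What a sender might reverse-engineer from the rules alone: different
# element order, plus an extra element the rules say nothing about.
guessed = "<order><note>rush</note><quantity>3</quantity><id>A-17</id></order>"

print(passes(expected), passes(guessed))
```

Both documents satisfy the rules, but nothing tells the sender whether 
the extra note element is a safe extension or something a later revision 
of the rules will reject.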

Some people have been making noises about an XML "templating" language 
that is the reverse/complement of Schematron and similar approaches 
like GRDDL; I'm not sure what that would look like, but I wonder at the 
necessity of keeping two documents in sync.

I tend to think it would be more appropriate to use something designed 
for modelling -- like RDF Schema/OWL -- rather than hijacking a 
validation-centric language like XSD, RNG or Schematron. It's not that 
I'm necessarily a Semantic Web fanatic; it's just that the data model 
is fairly simple, has the right default rules for extensibility and 
versioning, and maps to programming languages more easily than the 
Infoset.

Tying it back to Postel's law, being strict in what you send means that 
you need to know what to send. Being loose in what you receive can be 
accomplished in a number of ways; it's up to the implementation to try 
to make sense of partial data and syntax errors. Baking that looseness 
into what's advertised as being acceptable isn't necessary, and is 
probably counter-productive.
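
A loose receiver might look something like the sketch below, where the 
leniency lives in the implementation rather than in the advertised 
format. The function name and the quantity element are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_quantity(xml_text: str) -> Optional[int]:
    """Best-effort extraction: return None instead of rejecting outright."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Not well-formed; a real receiver might attempt recovery here.
        return None
    # Look for the element anywhere below the root, ignoring unknown siblings.
    text = root.findtext(".//quantity")
    if text is None:
        return None
    try:
        return int(text.strip())
    except ValueError:
        return None


print(read_quantity("<order><extra/><quantity> 3 </quantity></order>"))
```

Unknown elements and stray whitespace are tolerated here, but none of 
that tolerance needs to appear in the description of what senders should 
produce.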


On Jun 5, 2005, at 9:11 AM, Stefan Tilkov wrote:

> On Jun 1, 2005, at 4:29 PM, Marc Hadley wrote:
>> ... if the representation is relatively simple and the description 
>> includes parameters then the schema can be ignored when processing a 
>> received representation (you just apply the XPaths and grab the 
>> information you're interested in). However in order to produce a 
>> document you need access to something that describes the format 
>> you're expected to produce in some detail.
> If the application follows Postel's law and treats incoming messages 
> (requests) differently from outgoing messages (responses), does this 
> need to be reflected in the description languages? Is something as 
> strict as XSD a good choice to describe what you send while something 
> more flexible such as Schematron is better to describe what you'll 
> accept?
> Stefan
> --
> Stefan Tilkov,,
> innoQ Deutschland GmbH, Halskestr. 17, D-40880 Ratingen, Germany
> Phone: +49 170 471 2624  Fax: +49 2102 77160-1
> ICQ: 177869128, AIM: stefantilkov, Weblog: 

Mark Nottingham

Received on Sunday, 12 June 2005 11:53:35 UTC