RE: Versioning of XML Schema and namespaces

Michael and all,

Thanks again to you all for replying to this discussion.  I am grateful for
your patience.

I think some details of one of my earlier emails may have been lost in its
length, so here is my response to Michael's comment:

> ...
> As has been said a number of times on this thread, the recipient is
> validating the document because he doesn't trust the sender.  So it makes
> no sense at all to validate against a schema nominated by the sender.
> 

My clients that generate the XML document instances may not fully understand
XML or may have used some software that doesn't fully implement the W3C XML
specifications.  I have noticed that this is not unique to my situation.
There are millions of invalid SGML and XML documents available on the
internet.

My metadata indexing and search application relies on the XML document
instances being 100% valid to provide effective search results.  For example,
dates must be in ISO 8601 format.  Therefore, I need to validate the XML
document instances against the specified XSDs to provide the best service.

The ISO 19139 XSDs are extensible.  I expect my clients to extend these XSDs
to suit their business needs.  The extended XSDs are themselves XML documents
and may not be 100% W3C XML Schema compliant.  If I don't already have a copy
of the XSDs referred to in the XML document instances then I need to download
those XSDs and validate them.

If the XSDs are not valid then I report my findings to my clients and reject
the relevant XML document instances.  If the XSDs are valid then I validate
the XML document instances against those XSDs and report my findings to my
clients.  Again, only valid XML document instances are accepted.
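
The two-stage accept/reject flow described above can be sketched roughly as
follows.  This is only an outline under stated assumptions: Report,
check_submission and the two validator callables are hypothetical names, and
the callables stand in for whatever real validator (xmllint, Xerces, etc.) is
used.  Each is assumed to return a list of error messages, empty when valid.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Report:
    accepted: bool
    stage: str          # "schema" or "instance": where checking stopped
    errors: List[str]   # findings to send back to the client


def check_submission(
    instance: str,
    schema: str,
    validate_schema: Callable[[str], List[str]],
    validate_instance: Callable[[str, str], List[str]],
) -> Report:
    """Validate the XSD first; only then validate the instance against it."""
    schema_errors = validate_schema(schema)
    if schema_errors:
        # An invalid XSD means the instance is rejected without further checks.
        return Report(False, "schema", schema_errors)
    instance_errors = validate_instance(instance, schema)
    # The instance is accepted only when no errors were found.
    return Report(not instance_errors, "instance", instance_errors)
```

Keeping the validators as parameters means the same flow can be driven by any
of the tools mentioned later without changing the accept/reject logic.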

If I do have a copy of the XSDs then I will have already validated them and I
hope to use OASIS XML Catalog files to refer to local copies of those XSDs
when validating related XML document instances.  This will of course reduce
bandwidth, time and costs and is essential when validating 40,000+ metadata
records at a time.
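
As a rough illustration of the catalogue idea, the Python sketch below reads
the "uri" and "system" entries of an OASIS XML Catalog and maps a remote XSD
location to a local copy.  It is a deliberate simplification: real catalogue
resolvers also handle entries such as rewriteURI and nextCatalog, which are
ignored here.  The example.com URL, the cache path and the function names are
made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OASIS XML Catalogs specification.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalogue content; the URL and local path are illustrative.
EXAMPLE_CATALOG = """\
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <uri name="http://example.com/schemas/1.0/metadata.xsd"
       uri="cache/1.0/metadata.xsd"/>
</catalog>
"""


def load_catalog(catalog_xml: str) -> dict:
    """Collect remote-to-local mappings from <uri> and <system> entries."""
    mapping = {}
    for entry in ET.fromstring(catalog_xml).iter():
        if entry.tag == f"{{{CATALOG_NS}}}uri":
            mapping[entry.get("name")] = entry.get("uri")
        elif entry.tag == f"{{{CATALOG_NS}}}system":
            mapping[entry.get("systemId")] = entry.get("uri")
    return mapping


def resolve(location: str, mapping: dict) -> str:
    """Return the local copy if one is catalogued, else the original location."""
    return mapping.get(location, location)
```

A location that is not in the catalogue comes back unchanged, which is the
signal that the XSD still has to be downloaded and validated.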

To achieve all this I need to know the absolute locations of XSDs via
references in the XML document instances so that I can get them if I don't
already have them.  I also need to know when XML document instances refer to
a new version of the existing XSDs as this may affect my validation results.
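
For instance, the namespace and location pairs in an xsi:schemaLocation
attribute could be read out along the following lines.  This is a sketch
only: the function names and the example.com namespace and URL are
hypothetical, and a location is treated as absolute simply when it carries a
URI scheme.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical instance; the namespace and XSD URL are illustrative only.
EXAMPLE_INSTANCE = """\
<record xmlns="http://example.com/schemas/metadata/1.1"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://example.com/schemas/metadata/1.1
                            http://example.com/schemas/1.1/metadata.xsd"/>
"""


def schema_locations(instance_xml: str) -> dict:
    """Map each namespace to the XSD location hinted at in the instance."""
    pairs = {}
    for element in ET.fromstring(instance_xml).iter():
        tokens = element.get(f"{{{XSI_NS}}}schemaLocation", "").split()
        # The attribute is a whitespace-separated list of
        # namespace/location pairs.
        for namespace, location in zip(tokens[0::2], tokens[1::2]):
            pairs[namespace] = location
    return pairs


def relative_locations(pairs: dict) -> dict:
    """Return the pairs whose location is not an absolute URI."""
    return {ns: loc for ns, loc in pairs.items() if not urlparse(loc).scheme}
```

Any entry reported by relative_locations would be one I could not reliably
fetch on my own, which is why absolute locations matter here.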

I greatly appreciate your discussions on how I should achieve this.  It seems
that a namespace identifying the closest related XSD, combined with a
schemaLocation pointing to that XSD, will solve my problem.  It also seems
that having the version number in both the namespace and the schemaLocation
will be the best option because not all parsers will read the 'version'
attribute.  I use xmllint, Xerces, oXygen and maybe Jing.
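
With the version carried in the namespace, spotting a new version of an XSD I
already hold becomes a string comparison.  The sketch below assumes, purely
for illustration, that the version is the last path segment of the namespace
URI (as in the made-up http://example.com/schemas/metadata/1.1); real
namespaces may place it elsewhere, and the function names are hypothetical.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed convention: namespace ends in "/<dotted version number>".
VERSIONED = re.compile(r"^(?P<base>.+)/(?P<version>\d+(?:\.\d+)*)$")


def split_version(namespace: str) -> Tuple[str, Optional[str]]:
    """Split a namespace into (base, version); version is None if absent."""
    match = VERSIONED.match(namespace)
    if match is None:
        return namespace, None
    return match.group("base"), match.group("version")


def is_new_version(namespace: str, known: Iterable[str]) -> bool:
    """True if the namespace is unseen but shares a base with a known one."""
    known = set(known)
    base, version = split_version(namespace)
    if version is None or namespace in known:
        return False
    return any(split_version(k)[0] == base for k in known)
```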

It may seem inconceivable to you that people would create invalid XSDs and
XML document instances but I assure you it happens all the time.  Hence my
obsession with validating XML so that my application can provide the best
results.

Thanks again for all your help.  

John

> Michael Kay
> http://www.saxonica.com/

Received on Wednesday, 11 May 2005 02:29:02 UTC