RE: Trying to validate a sample xml file against a schema I am developing.

From: Michael Burns <Michael.Burns@sas.com>
Date: Tue, 17 Oct 2000 12:50:28 -0400
Message-ID: <DB8AFFD38A30D311BA6D0090276DC8C02D5965@merc09.us.sas.com>
To: "'Martin Gudgin'" <marting@develop.com>
Cc: xmlschema-dev@w3.org
Thanks Martin so far for all your help.

I am making progress.
As per your instructions I have installed python 1.6, LT XML, and
the sources for xmlschema.  I am new to python but have managed to get
applyschema to work, sort of.  The shell I usually use on NT is Cygnus
Solutions' cygwin because I don't like command.com.  I can get
python itself to run and even run its auto test routines.

I can run the python IDLE and have a startup script that gets the xmlschema
and PyLTXML stuff on my sys.path:

['C:\\Python16\\Tools\\idle', 'C:\\Python16', 'C:\\Python16\\DLLs', 'C:\\Python16\\lib', 'C:\\Python16\\lib\\plat-win', 'C:\\Python16\\lib\\lib-tk', 'C:\\Python16', 'c:\\sasmkb\\xml\\xmlschema', 'c:\\Program Files\\HCRC LTG\\LT XML\\Python\\lib\\PyLTXML', 'c:\\Program Files\\HCRC LTG\\LT XML\\Python\\lib', 'c:\\Program Files\\HCRC LTG\\LT XML\\Python'] 

I can even open applyschema.py and run it.  Unfortunately when I do so
it does not get any arguments telling it what file to validate, which is why
you suggested running it from a shell with the python command.

However when I try to do so python complains:

$ python applyschema.py
Traceback (most recent call last):
  File "applyschema.py", line 8 in ?
    from PyLTXML import *
ImportError: No module named PyLTXML 

If I bring up python itself with no args at all, I can 
enter the same 'from PyLTXML import *' by itself and it works fine. 
I can then 'import applyschema' and that does not complain.
I just don't know how to run it with arguments at that point. 

BTW, I tried using command.com and managed to get python to
run applyschema.py and get the same error importing PyLTXML.
Once again, I can do the import manually without any complaints.
Any clues as to why applyschema cannot do the import?
Is my sys.path goofy?
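
One possible explanation, offered as an assumption rather than a diagnosis: the IDLE startup script is what adds the xmlschema and LT XML directories to sys.path, and a plain "python applyschema.py" never runs that script. A minimal sketch of a check, using the directories from the sys.path listing above; the helper name ensure_on_path is hypothetical and not part of XSV or PyLTXML. Setting the PYTHONPATH environment variable before starting python is another way to get the same effect.

```python
import sys

# Directories copied from the sys.path listing above; adjust to the local install.
NEEDED_DIRS = [
    r"c:\sasmkb\xml\xmlschema",
    r"c:\Program Files\HCRC LTG\LT XML\Python\lib\PyLTXML",
    r"c:\Program Files\HCRC LTG\LT XML\Python\lib",
    r"c:\Program Files\HCRC LTG\LT XML\Python",
]


def ensure_on_path(dirs, path=None):
    """Append each directory not already on path; return the ones that were added."""
    if path is None:
        path = sys.path
    added = []
    for d in dirs:
        if d not in path:
            path.append(d)
            added.append(d)
    return added


if __name__ == "__main__":
    # Anything printed here was missing from sys.path in this interpreter.
    for d in ensure_on_path(NEEDED_DIRS):
        print("added to sys.path: " + d)
```

Run once from the shell that shows the ImportError: if it reports directories as added, the shell's python was not seeing them.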

Failing that, I actually edited applyschema.py and simply set
the arguments with an assignment statement:
argl=[ '-o', 'c:\\sasmkb\\xml\\xmlschema\\triv.out', 'c:\\sasmkb\\xml\\xmlschema\\trivBad.xml', 'c:\\sasmkb\\xml\\xmlschema\\triv.xsd']
around line 805.  
That works OK, but it is kind of a pain to edit the program for every
run I want to make.
Is there an easier way?
Is there a way to run this program with arguments from the Python shell
at the >>> prompt?
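
A minimal sketch of one way to do this from the >>> prompt: set sys.argv by hand and then execute the script file in a namespace whose __name__ is "__main__". The helper name run_with_args is hypothetical, and the sketch assumes the script reads its arguments from sys.argv, which has not been checked against applyschema.py. It is written for a current Python; on an interpreter of the 1.6 era, assigning sys.argv and then calling the built-in execfile on the script is the rough equivalent.

```python
import sys


def run_with_args(script_path, args):
    """Run script_path as if invoked as 'python script_path args...'.

    Returns the global namespace the script ran in.
    """
    saved_argv = sys.argv
    sys.argv = [script_path] + list(args)
    namespace = {"__name__": "__main__"}
    try:
        with open(script_path) as f:
            source = f.read()
        # Compile with the real file name so tracebacks point at the script.
        exec(compile(source, script_path, "exec"), namespace)
    finally:
        # Restore the interpreter's own argv even if the script fails or exits.
        sys.argv = saved_argv
    return namespace
```

For example, run_with_args('applyschema.py', ['-o', 'triv.out', 'trivBad.xml', 'triv.xsd']) mirrors the argument list shown above. If the script calls sys.exit, the resulting SystemExit propagates to the prompt.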

Next question: actually getting it to do some validation. 
I started with the triv.xml file that comes with applyschema.py using
the arguments shown above.  When I run this the -o file (triv.out) looks 
like this: 

<?xml version='1.0'?>
<xsv outcome='validation not attempted' schemaDocs='c:\sasmkb\xml\xmlschema\triv.xsd' target='c:\sasmkb\xml\xmlschema\triv.xml' version='XSV 1.167/1.77 of 2000/09/28 15:54:50' xmlns='http://www.w3.org/2000/05/xsv'/>

Which seems OK, except that the outcome= says that no validation was done. 
I thought that maybe that was a false alarm and it actually validated clean
with no errors.  So I copied triv.xml and made trivBad.xml and edited in a
blatant error:  <icky bad=1>  inside the innermost <f:root>.  When I run that
it still does not complain about anything.  
What's up?
What am I missing? 
I even tried creating ill-formed xml by adding an element tag that
was never closed (<foo> without a </foo>) and it did not complain at all.

Help?  
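
Two independent checks can at least narrow this down; the sketch below does not explain why XSV reports 'validation not attempted'. It uses only the Python standard library (not XSV or LT XML), and the function names are hypothetical. The first function reads the outcome attribute from a report like the triv.out shown above; the second confirms that a deliberately broken file really is ill-formed.

```python
import xml.etree.ElementTree as ET


def xsv_outcome(report_text):
    """Return the outcome attribute of the root element of an XSV report."""
    return ET.fromstring(report_text).get("outcome")


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# The report text quoted above, kept as a raw string because of the backslashes.
REPORT = r"""<?xml version='1.0'?>
<xsv outcome='validation not attempted' schemaDocs='c:\sasmkb\xml\xmlschema\triv.xsd' target='c:\sasmkb\xml\xmlschema\triv.xml' version='XSV 1.167/1.77 of 2000/09/28 15:54:50' xmlns='http://www.w3.org/2000/05/xsv'/>"""

print(xsv_outcome(REPORT))                   # validation not attempted
print(is_well_formed("<root><foo></root>"))  # False: <foo> is never closed
```

Note that the target attribute in that report names triv.xml, not trivBad.xml, so it is also worth confirming which instance file each run actually read.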

Per your comments below about the personal-schema.xml file I was trying:

<MJG>
The schema at http://www.realtime.net/~mburns/personal-2000.xsd is invalid
per http://www.w3.org/2000/10/XMLSchema which I think explains why you are
getting the output you are from XSV.

You need to amend the anonymous simpleType definition inside the attribute
'contr' inside the anonymous complex type inside the element decl 'person'
to read as follows;

    <simpleType>
      <restriction base='string'>
        <enumeration value='true'/>
        <enumeration value='false' />
      </restriction>
    </simpleType>

I'm also curious as to why you didn't just use a restriction of boolean
here?
</MJG>

This example is one that ships with the xerces parser.  I did not touch
it other than to change the namespace from 1999 to 2000/10.  Granted
it may not be valid per the 2000/10 specs.  I was just trying to get
something to work.  So, I can't answer your curiosity about using a 
restriction of boolean.  I don't know why they did not do that.
Frankly, I have not gotten too deeply into the details, I am just
trying to get a development and testing framework in place.  Then
I'll dive into the details.  

Nothing like trying to be on the bleeding edge!
Thanks again for all your help.  I'll keep trying. 


-----Original Message-----
From: Martin Gudgin [mailto:marting@develop.com]
Sent: Saturday, October 14, 2000 2:58 PM
To: Michael.Burns@sas.com
Cc: xmlschema-dev@w3.org
Subject: Re: Trying to validate a sample xml file against a schema I am
developing.


[inline] ( Look for <MJG></MJG> )
----- Original Message -----
From: "Michael Burns" <Michael.Burns@sas.com>
To: "Martin Gudgin" <marting@develop.com>
Cc: "Michael Burns" <michael.burns@sas.com>; <xmlschema-dev@w3.org>
Sent: Wednesday, October 11, 2000 6:08 PM
Subject: Re: Trying to validate a sample xml file against a schema I am
developing.


I previously got xsv to validate some sample xml files (can't remember
exactly which ones though), but cannot get it to validate much of
anything now.

I have put several attempts on my personal web site:

http://www.realtime.net/~mburns/po.xml
cut and paste directly out of the primer, and the schema for it:
http://www.realtime.net/~mburns/po.xsd
The only thing I changed in the xml file was to add
    xmlns:xsi = "http://www.w3.org/2000/10/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation = "po.xsd"

That brings up one of my burning questions:  Given the initial contents
of po.xml and po.xsd as shown in the primer, how in the world is a
parser supposed to connect the two?

What are all, or some, of the recommended ways for a parser to determine
which schema to use?

<MJG>
The Working Draft is careful not to specify this because we did not wish to
limit schema processors with respect to where they find schemas.

One company's schema processor may be initialised with a set of schema files
when it first starts, maybe these are the only schemas the company in
question cares about. Other processors may always look for
xsi:schemaLocation ( or xsi:noNamespaceSchemaLocation ). Others may
dereference namespace URIs... Some may do all of these and more. The point
is that although the link between the instance and the schema is always the
namespace URI of the instance, how a schema processor finds the schema for a
given namespace is entirely up to that processor.
</MJG>
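
As an illustration of just one of the strategies listed above, here is a minimal sketch of a processor reading the xsi:noNamespaceSchemaLocation and xsi:schemaLocation hints from the document element of an instance. It is not how XSV or Xerces locate schemas; the function name schema_hints is hypothetical, and the 2000/10 instance namespace is the one used in po.xml. xsi:schemaLocation is treated as whitespace-separated pairs of namespace URI and location.

```python
import xml.etree.ElementTree as ET

# Instance namespace as used in po.xml; other drafts use a different URI.
XSI = "http://www.w3.org/2000/10/XMLSchema-instance"


def schema_hints(instance_text):
    """Map namespace URI (None for no namespace) to the hinted schema location."""
    root = ET.fromstring(instance_text)
    hints = {}
    no_ns = root.get("{%s}noNamespaceSchemaLocation" % XSI)
    if no_ns is not None:
        hints[None] = no_ns
    # schemaLocation holds alternating namespace URIs and locations.
    pairs = (root.get("{%s}schemaLocation" % XSI) or "").split()
    for namespace, location in zip(pairs[0::2], pairs[1::2]):
        hints[namespace] = location
    return hints


SAMPLE = """<purchaseOrder
    xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="po.xsd"/>"""

print(schema_hints(SAMPLE))  # {None: 'po.xsd'}
```

A processor taking this approach would still have to resolve the relative location po.xsd against the instance's own URL before fetching it.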

I added these two attributes to the top level element in po.xml and it
still does not seem to find the schema and use it.  When I run the
xerces parser on this with schema validation turned on it complains
that:

[Error] po.xml:6:4: Element type "purchaseOrder" must be declared.
[Error] po.xml:7:26: Element type "shipTo" must be declared.
[Error] po.xml:8:15: Element type "name" must be declared.
[Error] po.xml:9:17: Element type "street" must be declared.
[Error] po.xml:10:15: Element type "city" must be declared.
[Error] po.xml:11:16: Element type "state" must be declared.
...

If I rename the po.xsd file so it can't be found, then xerces does not
complain, so it appears it is not looking for the specified file.  What
am I missing?

It appears clear to me that the xerces parser only recognizes the 1999
spec and I am guessing that the xsv only recognizes the 2000/10 spec.

<MJG>
I've not done much with Xerces yet so I don't know how they locate schemas,
I'll try and look at this over the next week and get back to you. You are
correct that at the moment Xerces supports the
http://www.w3.org/1999/XMLSchema namespace and not
http://www.w3.org/2000/10/XMLSchema. Apache have said that they will update
Xerces but it is likely to take them a few months to do so.

XSV has support for both http://www.w3.org/1999/XMLSchema ( which you will
find at[1] ) and http://www.w3.org/2000/10/XMLSchema ( which you will find
at[2] )
</MJG>

I have created a bunch of test cases using both the sample that comes
with xerces (personal-schema.xml + personal.xsd) and the po.xml + po.xsd
that I copied from the primer.  I have created both 1999 and 2000
versions of each and still have been having poor luck getting much to
validate properly.

All of these are in http://www.realtime.net/~mburns/

1999 spec samples:

http://www.realtime.net/~mburns/po99.xml + po99.xsd
    <- copied from primer and changed to 1999 format
http://www.realtime.net/~mburns/personal-schema.xml + personal.xsd
    <- works with xerces as expected
http://www.realtime.net/~mburns/personal-schema-fails.xml + personal.xsd
    <- xerces reports errors as expected

2000 spec samples:

http://www.realtime.net/~mburns/po.xml + po.xsd
    <- copied from primer
http://www.realtime.net/~mburns/personal-schema-2000.xml + personal-2000.xsd
http://www.realtime.net/~mburns/personal-schema-2000-fails.xml + personal-2000.xsd

Those last two I can get to run through xsv, but I don't understand the
results.  It does not seem to be validating and I don't understand why
not:

<MJG>
The schema at http://www.realtime.net/~mburns/personal-2000.xsd is invalid
per http://www.w3.org/2000/10/XMLSchema which I think explains why you are
getting the output you are from XSV.

You need to amend the anonymous simpleType definition inside the attribute
'contr' inside the anonymous complex type inside the element decl 'person'
to read as follows;

    <simpleType>
      <restriction base='string'>
        <enumeration value='true'/>
        <enumeration value='false' />
      </restriction>
    </simpleType>

I'm also curious as to why you didn't just use a restriction of boolean
here?
</MJG>


- - - - - - - - snip - - - - - - - - - -

Schema validating with XSV 1.166/1.77 of 2000/09/28 15:54:50

 Target: http://www.realtime.net/~mburns/po.xml
   (Real name: http://www.realtime.net/~mburns/po.xml
    Length: 1106 bytes
    Last Modified: Wed, 11 Oct 2000 15:28:00 GMT
    Server: Apache/1.2.5)
 docElt: {None}purchaseOrder
 Validation was strict, starting with type {None}:PurchaseOrderType
 schemaLocs: None -> po.xsd
 The schema(s) used for schema-validation had no errors
 instanceAssessed: true
 No schema-validity problems were found in the target




Low-level XML well-formedness and/or validity processing output


Warning: Document has no DTD, validating abandoned
 (detected at end of prolog of document
http://www.realtime.net/~mburns/po.xsd)




Schema resources involved

Attempt to import a schema document from
http://www.realtime.net/~mburns/po.xsd for no namespace, succeeded

/- - - - - - - - snip - - - - - - - - - -

At least it seems to be finding the po.xsd, but this tells me that it
did not validate the schema.

<MJG>
Ignore the "Warning: Document has no DTD, validating abandoned " this just
means that the underlying *XML* parser did not validate the schema ( or the
instance ) against a DTD. Look at the "The schema(s) used for
schema-validation had no errors " message to figure out whether your schemas
are OK or not.
</MJG>

<SNIP/>

I would also like to have a copy of xsv locally so I can use it offline.
There was a hint that it was possible but not easy.  I got ltxml to
build locally, what else is required for xsv?

<MJG>
You need to install Python 1.6[3] and then download the CVS source tree for
XSV[4]. You can run it with

    python <pathtoxsvsource>\applyschema.py -s <stylesheet> <instance> <schema>
</MJG>

Thanks for all your help.

<MJG>
No worries...

Regards

Martin


[1] http://www.w3.org/2000/06/webdata/xsv
[2] http://www.w3.org/2000/09/webdata/xsv
[3] http://www.python.org
[4] http://dev.w3.org/cvsweb/xmlschema/
</MJG>
Received on Tuesday, 17 October 2000 12:54:41 UTC
