
Re: SV: SV: SV: SV: Schema help

From: Jack Lindsey <tuquenukem@hotmail.com>
Date: Sun, 04 Dec 2005 16:33:20 -0500
Message-ID: <BAY102-F3243F7FD4934363D6BC6D0D74E0@phx.gbl>
To: noah_mendelsohn@us.ibm.com, brs@itst.dk
Cc: mike@saxonica.com, petexmldev@tech-know-ware.com, xmlschema-dev@w3.org

Schematron expresses validation rules in machine-readable form as XPath 
expressions.   W3C Schema already uses XPath expressions for identity 
constraints.  There is not a principle at stake here, is there?
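
To make the comparison concrete, here is a minimal sketch of the Schematron idea: each rule pairs a context path with a test that must hold for every matching node. It is only an illustration in Python, using the limited XPath subset of the standard library's ElementTree rather than a real Schematron processor. The element names (payment, cardNumber, invoiceAddress) and the validate function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical co-occurrence rules in the spirit of Schematron:
# (context path, child that must exist in that context, message).
RULES = [
    (".//payment[@method='card']", "cardNumber",
     "card payments need a cardNumber"),
    (".//payment[@method='invoice']", "invoiceAddress",
     "invoice payments need an invoiceAddress"),
]


def validate(xml_text):
    """Return the messages of all rules that fail for the document."""
    root = ET.fromstring(xml_text)
    failures = []
    for context, test, message in RULES:
        # Every node selected by the context path must satisfy the test.
        for node in root.findall(context):
            if node.find(test) is None:
                failures.append(message)
    return failures


SAMPLE = """
<orders>
  <payment method='card'><cardNumber>1234</cardNumber></payment>
  <payment method='invoice'/>
</orders>
"""

if __name__ == "__main__":
    for problem in validate(SAMPLE):
        print(problem)
```

The rules stay plain data (paths and messages), so another program could list or analyse them without running anything.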

Do you not endorse reuse and the orthogonal alignment of W3C offerings?

Tim's RDF/Java example does not seem to be about the degree to which you 
should use the features of a language but how the objectives of a language 
can fundamentally impact your flexibility when you try to repurpose data 
instances expressed in them.  I am not seeing the parallel with validation.  
Can you please enlighten me?

Jack Lindsey

>From: noah_mendelsohn@us.ibm.com
>To: Bryan Rasmussen <brs@itst.dk>
>CC: "'Michael Kay'" <mike@saxonica.com>,        
>"'petexmldev@tech-know-ware.com'" <petexmldev@tech-know-ware.com>,        
>"'xmlschema-dev@w3.org'" <xmlschema-dev@w3.org>
>Subject: Re: SV: SV: SV: SV: Schema help
>Date: Fri, 2 Dec 2005 19:02:01 -0500
>Should Schematron be seriously considered?  Absolutely.  It has many
>attractive qualities and seems to be doing well for at least some users.
>Is it such an obvious choice that we should rush it?  I don't think so.
>Picking up on just one point mentioned in this thread...
>Bryan Rasmussen writes:
> > Given that there is likely to be some argument in W3C as to how
> > far such constraints should be implemented I doubt they will
> > come out as powerful as Schematron constraints
>Probably true, but that doesn't necessarily make the Schematron approach
>the best.  I think we need to also consider Tim Berners-Lee's Principle of
>Least Power [1].  Since what Tim has written on this is just a few
>paragraphs, I'll quote them all here:
>"In choosing computer languages, there are classes of program which range
>from the plainly descriptive (such as Dublin Core metadata, or the content
>of most databases, or HTML) through logical languages of limited power
>(such as access control lists, or conneg content negotiation) which
>include limited propositional logic, through declarative languages which
>verge on the Turing Complete (PDF) through those which are in fact Turing
>Complete though one is led not to use them that way (XSLT, SQL) to those
>which are unashamedly procedural (Java, C).
>The choice of language is a common design choice. The low power end of the
>scale is typically simpler to design, implement and use, but the high
>power end of the scale has all the attraction of being an open-ended hook
>into which anything can be placed: a door to uses bounded only by the
>imagination of the programmer.
>Computer Science in the 1960s to 80s spent a lot of effort making
>languages which were as powerful as possible. Nowadays we have to
>appreciate the reasons for picking not the most powerful solution but the
>least powerful. The reason for this is that the less powerful the
>language, the more you can do with the data stored in that language. If
>you write it in a simple declarative form, anyone can write a program to
>analyze it in many ways. The Semantic Web is an attempt, largely, to map
>large quantities of existing data onto a common language so that the data
>can be analyzed in ways never dreamed of by its creators. If, for example,
>a web page with weather data has RDF describing that data, a user can
>retrieve it as a table, perhaps average it, plot it, deduce things from it
>in combination with other information. At the other end of the scale is
>the weather information portrayed by the cunning Java applet. While this
>might allow a very cool user interface, it cannot be analyzed at all. The
>search engine finding the page will have no idea of what the data is or
>what it is about. The only way to find out what a Java applet means
>is to set it running in front of a person.
>I hope that is a good enough explanation of this principle. There are
>millions of examples of the choice. I chose HTML not to be a programming
>language because I wanted different programs to do different things with
>it: present it differently, extract tables of contents, index it, and so
>on."
>I think we need to consider the choice of constraint mechanisms from this
>perspective too.  At least in principle, the ideal would be something just
>powerful enough, but no more.  I know that Schematron is sometimes
>implemented on top of XSLT, and full XSLT is much too powerful for me to
>be comfortable using it as part of validation.  I would like to study
>whether Schematron per se may be more appropriately limited, and I have
>not yet looked into it in detail.  Perhaps someone who knows Schematron
>better than I do can enlighten me?
>[1] http://www.w3.org/DesignIssues/Principles.html#PLP
>Noah Mendelsohn
>IBM Corporation
>One Rogers Street
>Cambridge, MA 02142
>Bryan Rasmussen <brs@itst.dk>
>11/18/2005 03:31 AM
>         To:     "'noah_mendelsohn@us.ibm.com'"
>         cc:     "'petexmldev@tech-know-ware.com'"
><petexmldev@tech-know-ware.com>, "'xmlschema-dev@w3.org'"
><xmlschema-dev@w3.org>, "'Michael Kay'" <mike@saxonica.com>
>         Subject:        SV: SV: SV: SV: Schema help
>Well, on the subject of co-occurrence constraints I would just like to
>reiterate what I said earlier, with some extension:
>Given that there is likely to be some argument in W3C as to how far such
>constraints should be implemented, I doubt they will come out as powerful
>as Schematron constraints. Furthermore, I have a hard time seeing this
>producing a syntax as nice as Schematron's, so I would really like to
>see something like:
>1. XML Schema adopts Schematron as an extension language of some sort.
>2. XML Schema puts some thought into how Schematron can be combined with
>Schema to the benefit of both, beyond the usual method of dropping it
>into appinfo.
>I have some ideas on #2, but I'm somewhat conflicted about them - what
>makes sense, syntax etc. so I don't really want to just blurt out with it.
>I'd be more interested in hearing what kinds of things other people could
>see connecting the two languages.
>Bryan Rasmussen
>-----Original Message-----
>From: noah_mendelsohn@us.ibm.com [mailto:noah_mendelsohn@us.ibm.com]
>Sent: 17 November 2005 18:51
>To: Bryan Rasmussen
>Cc: 'Michael Kay'; 'petexmldev@tech-know-ware.com';
>Subject: Re: SV: SV: SV: Schema help
>Well, I think there are good reasons from time to time to revisit the
>effectiveness of the W3C process and the compromises embodied therein. I'm
>not convinced that a deep dive on that is the best use of this particular
>mailing list.   I happen to like the working groups I've been on that do
>their work in public (in my case, both the TAG and XMLP) and I'd be happy
>for schema to go the same way.  Then again, I really don't think that's a
>substitute for having people who have 30% of their time committed to
>working on a technology.  There's a lot of detail work and care required
>to revise a specification even if there's agreement on the general ideas.
>The discussions need to involve people who have the knowledge and the time
>commitment to work through interactions with existing features of the
>specification.  In the case of co-constraints, it would seem to me that
>there ought to be a careful look taken at the relationship between the
>existing key/keyref/unique constraint mechanisms and anything new that's
>proposed.  It would be nice to believe that we wouldn't just be sprouting
>new and uncoordinated ways of doing things every few years.
>So, I personally welcome broader input, but what we're really short of are
>the people who can edit the specification text, draft prose, be
>responsible for the details, etc.  Of course, there are also lots of other
>messy issues to consider when you change the working mode of a group
>including anti-trust laws in various jurisdictions, IP issues, etc.  If
>people feel that they have ideas for how the W3C can do these things
>better, I think the right place to go would be to the W3C staff and/or the
>workgroup chairs.  I personally would not be against having the schema WG
>switch to using a public mailing list for its discussions.  I suspect that
>requires a recharter, but in principle I'm fine with it.  I don't think
>that will solve much of our resource problems.  We don't lack for people
>with good ideas, in email or in person.  We're missing the people to do
>the architecture and drafting work that goes into making all the details
>fit together.   It's hard to do that well without meeting F2F from time to
>time.
>Noah Mendelsohn
>IBM Corporation
>One Rogers Street
>Cambridge, MA 02142
>Bryan Rasmussen <brs@itst.dk>
>Sent by: xmlschema-dev-request@w3.org
>11/17/2005 04:59 AM
>         To:     "'Michael Kay'" <mike@saxonica.com>
>         cc:     "'xmlschema-dev@w3.org'" <xmlschema-dev@w3.org>,
>"'petexmldev@tech-know-ware.com'" <petexmldev@tech-know-ware.com>, (bcc:
>Noah Mendelsohn/Cambridge/IBM)
>         Subject:        SV: SV: SV: Schema help
>Damn, an earlier typo in the email address of Pete Cordell added in by me
>was replicated in your email. Just on the off chance this thread goes any
>further I thought I should correct it. I've cc'ed Pete on this mail. Sorry
>for the problem.
>Bryan Rasmussen
>-----Original Message-----
>From: Michael Kay [mailto:mike@saxonica.com]
>Sent: 17 November 2005 10:47
>To: noah_mendelsohn@us.ibm.com; Bryan Rasmussen
>Cc: xmlschema-dev@w3.org; ',petexmldev@tech-know-ware.com'
>Subject: RE: SV: SV: Schema help
> > 1) Although most widely used schema validators are fairly
> > slow, one can in
> > fact implement the XML schema rules at quite high speed.  My team is
> > hoping to publish some work in that area in coming months,
> > and I suspect
> > that others in the industry are working in the same
> > direction.  I think
> > it's important to the success of any technology we choose
> > that it be able
> > to meet the performance needs of our customers.
>I would resist this kind of thinking. SQL was successful because it put
>functionality first, and left implementors to devise optimisation
>strategies. Users need a constraint language that is capable of expressing
>arbitrary constraints on the content of a document, and it should be left
>to the implementor to work out which of these constraints can be evaluated
>in streaming mode and which can't.
>SQL today allows the full power of the query language to be used to
>express integrity constraints, and users learn when they need to restrict
>their ambitions to meet performance requirements. 90% of applications
>aren't performance critical anyway.
>There's no point telling users to go and use some other technology to do
>their validation; the other technology isn't going to be fast either.
>Michael Kay

Received on Sunday, 4 December 2005 21:33:26 UTC
