FW: XML Question

> -----Original Message-----
> From:	Babich, Alan [SMTP:ABabich@filenet.com]
> Sent:	May 18, 1998 10:32 AM
> To:	'Jim Davis'; Babich, Alan
> Subject:	RE: XML Question
> 
> Thanks very much for answering my question.
> Unfortunately, I don't have a parser available either.
> Since I e-mailed you, I have found out that
> that was not quite correct syntax. For one thing,
> you can't have commas on the same level with vertical
> bars. You have to enclose the alternatives in 
> parentheses to push the commas to a lower level.
> Thus, 
> <!ELEMENT foo  a, b | c , d >
> is incorrect. Correct is:
> <!ELEMENT foo ((a, b) | (c, d))>
> 
> I have understood that XML DTD wasn't BNF for a
> while now. I originally tried to figure out how to express
> the DTD for the beginning and ending tags, so I could
> control what tags appeared in the actual queries. I soon found 
> out that was impossible. The beginning and ending tags
> are implicit -- you get tags every time you define a new 
> identifier. So, I immediately stopped being the least bit 
> concerned about the extra tags that appear. XML DTD
> syntax provides no reasonable way to control their
> appearance. So, in my opinion, one then makes a virtue 
> out of necessity. One uses the syntax the way it
> was apparently designed to be used, i.e., do the
> appropriate syntactic factoring, thereby generating some
> intermediate tags as a side effect.
> 
> After thinking about it some more as a result of your
> e-mail, I still think the right thing to optimize is the 
> compactness and readability of the DTD,
> not eliminating the extra intermediate tags. The
> queries will be generated programmatically, for the
> most part, anyway. It feels wrong to me instinctively
> to eliminate the intermediate tags. Objectively,
> the approach doesn't scale when you have 
> polymorphic n-ary operators.
> 
> For example, let's take your proposal farther. 
> You can see how many cases
> there are for just the six binary relational operators.
> (You even mentioned computer generation of the DTD
> grammar, which also feels wrong to me.)
> But, we already have n-ary operators (AND and OR),
> and we want more in the future. AND and OR aren't
> a big problem, because they aren't polymorphic.
> (You only have 36 alternatives.)
> However, take addition of numbers, for
> example. Suppose you want to have a polymorphic
> n-ary addition operator. With integer expressions
> already defined in my current proposal as integer_expr, 
> it's simple, clear, easy, and compact:
> 
> <!ELEMENT add ( ( integer_expr , integer_expr+ ) |
>                 ( integer_expr , real_expr+ ) |
>                 ( real_expr , real_expr+ ) |
>                 ( real_expr , integer_expr+ ) ) >
> 
> That's the whole increment to the DTD, given my proposal.
> 
> Now consider your proposal. The
> cases explode exponentially, and, no matter what,
> you have to stop at some arbitrary n, because
> if you don't, the DTD is infinitely long. Assume
> you have integer_op, integer_prop, integer_const,
> real_op, real_prop, real_const. Then you have 6 base
> elements. For 2-ary addition, you have 6 squared
> cases. Then you add the alternatives for
>  3-ary addition, which has 6 cubed
> cases. Then you add 4-ary addition, which has
> 6 to the fourth power cases, etc. You can see
> that the DTD for addition would make
> the spec so large, even for a modest
> value of N, that it would be totally unacceptable
> to me, at least.
> 
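The counting argument above can be checked with a few lines of Python. This is only a rough sketch: it assumes the six base elements named above and that every operand combination is listed explicitly for each arity from 2 up to some cutoff N. The function name `alternatives_needed` is made up for the illustration.

```python
def alternatives_needed(base_elements: int, max_arity: int) -> int:
    """Count the alternatives a flat DTD would have to list.

    For each arity k from 2 to max_arity there are base_elements ** k
    ordered operand combinations, and the flat approach lists them all.
    """
    return sum(base_elements ** k for k in range(2, max_arity + 1))


# Six base elements: integer_op, integer_prop, integer_const,
# real_op, real_prop, real_const.
for n in range(2, 7):
    print(n, alternatives_needed(6, n))
```

With a cutoff of N = 4 the total is already 36 + 216 + 1296 = 1548 alternatives for addition alone.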
> Thus, if you try to optimize the wrong thing
> (i.e., try to eliminate a few intermediate tags instead of 
> making the DTD grammar general, extensible,
> simple, easy, and compact),
> you discover that approach doesn't scale when you have
> n-ary polymorphic operators.
> 
> Alan Babich
> 
> 
> > -----Original Message-----
> > From:	Jim Davis [SMTP:jdavis@parc.xerox.com]
> > Sent:	May 17, 1998 2:00 PM
> > To:	Babich, Alan
> > Subject:	Re: XML Question
> > 
> > At 04:17 PM 5/7/98 PDT, you wrote:
> > >	Jim: 
> > >
> > >	Is the following legal XML DTD syntax?
> > 
> > Sorry to take so long to be able to reply.
> > 
> > It looks legal to me.  I don't have a DTD-processor to check.
> > 
> > I have some reactions to it, though:
> > 
> > 1) This syntax makes for quite cumbersome queries.  An example
> > would be
> > 
> > <where>
> >  <boolean_op>
> >   <eq>
> >     <float_expr>
> >       <float_prop>
> >        <prop>SOME PROP</prop>
> >       </float_prop>
> >     </float_expr>
> >     <int_expr>
> >      <int_const>74</int_const>
> >     </int_expr>
> >    </eq>
> >  </boolean_op>
> > </where>
> > 
> > I think we could get a grammar where the equivalent XML would be
> > 
> > <where>
> >  <eq>
> >    <float_prop>SOME PROP</float_prop>
> >    <int_const>74</int_const>
> >  </eq>
> > </where>
> > 
> > This makes the grammar more complicated, but the XML expressions
> > simpler.
> > 
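To put rough numbers on the difference between the two query forms above, the following sketch parses both with Python's standard `xml.etree.ElementTree` and compares nesting depth and element count. The helper names `depth` and `element_count` are invented for this illustration; the XML strings are the two examples quoted above.

```python
import xml.etree.ElementTree as ET

VERBOSE = """
<where>
 <boolean_op>
  <eq>
    <float_expr>
      <float_prop>
       <prop>SOME PROP</prop>
      </float_prop>
    </float_expr>
    <int_expr>
     <int_const>74</int_const>
    </int_expr>
   </eq>
 </boolean_op>
</where>
"""

COMPACT = """
<where>
 <eq>
   <float_prop>SOME PROP</float_prop>
   <int_const>74</int_const>
 </eq>
</where>
"""


def depth(element):
    """Nesting depth of the tree rooted at element (root counts as 1)."""
    return 1 + max((depth(child) for child in element), default=0)


def element_count(element):
    """Number of elements in the tree, including the root."""
    return sum(1 for _ in element.iter())


verbose_root = ET.fromstring(VERBOSE.strip())
compact_root = ET.fromstring(COMPACT.strip())

print("verbose:", depth(verbose_root), "levels,", element_count(verbose_root), "elements")
print("compact:", depth(compact_root), "levels,", element_count(compact_root), "elements")
```

The verbose form nests six levels deep with eight elements; the compact form needs three levels and four elements for the same comparison.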
> > For example, where eq is now defined to take just six possible
> > combinations of tags (e.g. int_expr, int_expr) it would take far
> > more.  I can't calculate easily the total number, but it would be
> > at least
> > 
> > <!ELEMENT eq (float_prop,   float_prop |
> >               float_prop,   float_const |
> >               float_prop,   int_prop     |
> >               float_prop,   int_const     |
> >               float_const,  float_prop |
> >               float_const,  float_const |
> >               float_const,  int_prop     |
> >               float_const,  int_const     |
> >                 ... likewise for int_prop and int_const
> > 
> > Luckily this grammar can be produced by mechanical processes.
> > 
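The mechanical production mentioned above could look something like the sketch below. It is an assumption-laden illustration, not the tool anyone on this thread used: the function name `flat_eq_declaration` is hypothetical, the operand list is the four names from the partial declaration above, and each pair is wrapped in its own parentheses (with one outer pair around the whole choice) so that commas and vertical bars never sit on the same level.

```python
from itertools import product


def flat_eq_declaration(operands):
    """Build an <!ELEMENT eq ...> declaration listing every ordered pair."""
    pairs = [f"({left}, {right})" for left, right in product(operands, repeat=2)]
    body = " |\n              ".join(pairs)
    return f"<!ELEMENT eq ({body})>"


# Operand tags taken from the declaration quoted above.
OPERANDS = ["float_prop", "float_const", "int_prop", "int_const"]

declaration = flat_eq_declaration(OPERANDS)
print(declaration)
```

Four operand tags give 16 alternatives. Note that alternatives sharing the same first element may conflict with the XML 1.0 rule on deterministic content models, so a real DTD would probably need them factored, for example as (float_prop, (float_prop | float_const | int_prop | int_const)).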
> > Likewise, where would be defined to take explicitly all the known
> > boolean-valued expressions (e.g. and, or, not, gt, ge, eq, lt, le,
> > plus bool_const, bool_prop).
> > 
> > By the way, I don't know whether this applies to you or not, but
> > when I first started writing XML grammars I was always getting
> > tripped up by thinking of them as BNF.  But unlike BNF, these DTDs
> > define compositions of tags, not concatenations of strings, so every
> > level of nesting results in an extra level of syntax in the XML.
> > Until you get used to this, it's easy to write a DTD that would be
> > clean and elegant as a context-free grammar but results in a tedious
> > XML syntax on the wire.  Does this make sense?
> > 
> > best regards
> > 
> > Jim

Received on Monday, 18 May 1998 15:13:04 UTC