Re: SOAP 1.1 Validator - predefined-entity counter questions from Dave Winer on 2001-02-09 (xml-dist-app@w3.org from February 2001)

From: Dave Winer <dave@userland.com>
Date: Fri, 9 Feb 2001 13:34:52 -0800
To: "Daniel Barclay" <Daniel.Barclay@digitalfocus.com>, <xml-dist-app@w3.org>
Cc: <SOAP@DISCUSS.DEVELOP.COM>
Message-ID: <20de01c092e0$23e53f50$33a1dc40@murphy2>

Daniel:

This is really embarrassing for two reasons:

1. That test is not actually performed by the validator. In other words it's
documented but not implemented.

2. The explanation is unclear as you indicate. The purpose of the test is to
find out if your XML parser is properly encoding and decoding potentially
troublesome characters. The wording is not accurate.
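
As a rough illustration of that intent (not the validator's actual code), a
handler could simply count the five characters in the string after the XML
parser has decoded it. The function name count_the_entities and the use of
Python are assumptions made for this sketch; only the five struct field names
come from the test description.

```python
def count_the_entities(s: str) -> dict:
    """Count the five 'troublesome' characters in an already-decoded string.

    Hypothetical handler: it assumes the XML parser has already turned
    references such as &lt; or &#60; back into plain characters.
    """
    return {
        "ctLeftAngleBrackets": s.count("<"),
        "ctRightAngleBrackets": s.count(">"),
        "ctAmpersands": s.count("&"),
        "ctApostrophes": s.count("'"),
        "ctQuotes": s.count('"'),
    }


# Example call on a decoded string containing each character at least once.
result = count_the_entities("<a> & 'b' \"c\"")
```

If the counts come back wrong, the likely culprit is the encoding or decoding
step on the wire rather than the counting itself.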

Dave


----- Original Message -----
From: "Daniel Barclay" <Daniel.Barclay@digitalfocus.com>
To: <xml-dist-app@w3.org>
Cc: <dave@userland.com>; <SOAP@DISCUSS.DEVELOP.COM>
Sent: Friday, February 09, 2001 1:10 PM
Subject: SOAP 1.1 Validator - predefined-entity counter questions


> Dave,
>
> I'm a little confused about one of the tests performed by your
> SOAP validator at http://soap.weblogs.com/validator1.
>
> The description of the first test (countTheEntities)
> (at http://soap.weblogs.com/validator1#countTheEntities) says:
>
>   validator1.countTheEntities (string) returns struct
>
>              This handler takes a single parameter, a string, that contains
>              any number of predefined entities, namely <, >, &, ' and ".
>
>              Your handler must return a struct that contains five fields,
>              all numbers: ctLeftAngleBrackets, ctRightAngleBrackets,
>              ctAmpersands, ctApostrophes, ctQuotes.
>
>              To validate, the numbers must be correct.
>
>
> 1.  When you say that the string contains some "predefined entities,
>     namely <, >, &, ... ", what exactly do you mean?  (The character "<"
>     is not a predefined entity, nor even a reference to a predefined
>     entity.)
>
>     Do you mean that the XML source string contains predefined entity
>     _references_ that specify those characters (e.g., "&lt;", "&gt;",
>     etc.)?
>
>     Or do you mean that the XML source string contains some representation
>     of the characters "<", ">", etc. (the characters for which there are
>     predefined entities), regardless of which representation is used
>     (predefined entity references, any other entity references, character
>     references, or CDATA sections)?
>
>
> 2.  If you do mean entity references to predefined entities, is that
>     actually information that a SOAP application can count on getting
>     from XML processors?
>
>     Entity references are not part of the XML Information Set, right?
>
>     The XML Information Set draft (at http://www.w3.org/TR/xml-infoset/)
>     says, in section 1 [emphasis mine]:
>
>         "An information set describes its XML document *with entity
>         references already expanded*, that is, represented by the
>         information items corresponding to their replacement text."
>
>     Additionally, Section 2.5 says that any validating parser will not
>     report references to entities [emphasis mine]:
>
>         2.5. Unexpanded Entity Reference Information Items
>
>               XML Definition: Section 4.4.3, Included If Validating
>
>             A unexpanded entity reference information item serves as a
>             place-holder by which an XML processor can indicate that it
>             has not expanded an external parsed entity.  ...  *A validating
>             XML processor, or a non-validating processor that reads all
>             external general entities, *will never generate* unexpanded
>             entity reference information items for a valid document.*
>
>    So, if you did mean to count entity references, this all seems to
>    indicate that a SOAP processor using a validating XML parser can
>    *never* pass your first test.
>
>    Am I missing anything?
>
> Daniel
> --
> Daniel Barclay
> Digital Focus
> Daniel.Barclay@digitalfocus.com
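
The distinction drawn in questions 1 and 2 can be checked with a small
experiment. The sketch below uses Python's standard xml.etree.ElementTree
module, which is an assumption here (the thread names no particular parser),
and the helper name decoded_text and the element name "s" are hypothetical.
It feeds the parser four different source representations of the same
character and compares what the application receives.

```python
import xml.etree.ElementTree as ET


def decoded_text(xml_source: str) -> str:
    """Return the character data of the root element after parsing."""
    return ET.fromstring(xml_source).text


# Four ways of writing "a < b" in XML source (illustrative inputs only).
variants = [
    "<s>a &lt; b</s>",           # predefined entity reference
    "<s>a &#60; b</s>",          # decimal character reference
    "<s>a &#x3C; b</s>",         # hexadecimal character reference
    "<s><![CDATA[a < b]]></s>",  # CDATA section
]

texts = [decoded_text(v) for v in variants]

# All four parse to the same string, so the application cannot tell
# which representation the sender used.
all_same = len(set(texts)) == 1
```

With this parser the original form of the reference is gone by the time the
application sees the text, which is consistent with the Infoset wording
quoted above: only the resulting characters can be counted.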

Received on Friday, 9 February 2001 16:36:02 UTC