Comments on 12 April 2011 draft of
XML processor profiles
C. M. Sperberg-McQueen, Black Mesa Technologies LLC
15 April 2011
This document comments on the Last Call draft of 12 April 2011 of
the XML processor profiles specification (hereinafter XPP). The
comments are the sole
responsibility of the author, who is not here speaking for any other
persons or organizations.
I have tried to organize my comments into distinct observations
that can be dealt with separately, but I have been unable to
make them wholly discrete and orthogonal.
For each topic I have tried to indicate whether I believe it to
be a substantive or an editorial point, but the distinction is
not always helpful and I have not tried to apply it in all cases.
Trivial points of style and spelling are gathered together at the
end.
1. Choice of facets for characterizing processors
Any formulation of profiles for a specification chooses to clump
some things together and to separate others, by placing those things
(here, XML processors) either into the same class (processors
satisfying a given profile) or into different classes. By defining
profiles in terms of certain chosen criteria and not in terms of other
criteria, any such spec makes an implicit choice among the infinite
number of facets that could be used for characterizing the objects
being classified. It is almost always helpful if that choice is made
explicit rather than implicit. The current draft leaves the choice
unexplained, its rationale unarticulated.
The introduction should describe at least briefly the rationale for
the choice made and characterize both the dimensions along which the
profiles defined here distinguish among processors and also some of
the more obvious dimensions along which the profiles do
not distinguish processors. At the very least, the spec
needs to acknowledge explicitly that some properties have been left
out of account instead of being made the basis for defining different
profiles, for example by saying explicitly that properties P, Q, and R
are not taken into consideration in defining the profiles.
For example: in many kinds of practical work the most important
characteristic among processors for any given programming language is
probably the distinction between processors with event-based and those
with tree-based interfaces, as exemplified by the difference between
the SAX and DOM interfaces. A reader of XPP might plausibly expect,
therefore, that if XPP is intended for practical use, that distinction
will show up in the definition of profiles, in order to allow the
profiles to provide a useful characterization of processors. What
should such a reader infer from the failure of XPP even to mention
this distinction? That you did not think the distinction important?
That you thought it was not practically relevant? That it is too
difficult to specify cleanly? Or that it didn't occur to you? Right
now the text of the spec is compatible with each of these inferences;
I think that if you explain your choice you may be able to eliminate
at least the last one.
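To make the distinction concrete, here is a minimal sketch in Python of the same trivial task done through an event-based interface (SAX) and through a tree-based one (DOM). The sample document and the class name ParagraphCounter are invented for illustration; only the standard-library modules xml.sax and xml.dom.minidom are assumed.

```python
import xml.sax
from xml.dom import minidom

# Invented sample input for the illustration.
DOC = b"<doc><p>one</p><p>two</p></doc>"


class ParagraphCounter(xml.sax.ContentHandler):
    """Event-based style: the parser calls back once per start tag,
    and no tree of the document is retained."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "p":
            self.count += 1


handler = ParagraphCounter()
xml.sax.parseString(DOC, handler)
sax_count = handler.count

# Tree-based style: the whole document is built in memory, then queried.
tree = minidom.parseString(DOC)
dom_count = len(tree.getElementsByTagName("p"))

print(sax_count, dom_count)
```

Both styles see the same two p elements, but the application code around them is shaped quite differently, which is why a reader might expect the distinction to be at least mentioned.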
N.B. this issue is not about the choice of facets (concerning which
there are other comments elsewhere) but about the need to say clearly
that a choice has been made and to indicate why some facets were
chosen and others not. Improving the choice of facets, as I hope you
will, will not excuse you from the need to justify the choice, or at
least identify and explain it.
Suggested fix: explicitly acknowledge that XPP
involves a choice among possible ways of characterizing processors;
identify the processor properties used as the basis for the
classification proposed and identify at least some potential
properties which are not used in the classification. Explain the
basis for the choice.
2. Respect for the stand-alone declaration
It would be helpful, I think, for the processor profiles
to distinguish more carefully the different behaviors possible
with regard to the stand-alone declaration in the input
XML document.
- All declarations are read and handled appropriately,
so documents with standalone='no' are processed without
information loss.
- No external declarations are read if standalone='yes';
if standalone='no' then external declarations are read, so all
documents are processed without information loss.
- No external declarations are read; if standalone='yes',
the document is processed without information loss, and
if standalone='no', the processor signals an inability to
process the document without the possibility of information
loss.
- No external declarations are read, so documents with
standalone='yes' are processed without information loss, and
information will typically be lost in the processing of documents
with standalone='no'. (Since documents may have standalone='no'
even if standalone='yes' would be permitted, there can be cases
where no information is lost in practice.)
In particular, it would be helpful for users of XML and
for writers of specifications for XML-based processing to
distinguish the last case from the others, in order to exclude
it.
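The four behaviors listed above can be sketched as a small decision table. This is only an illustration of the classification proposed in this comment; the names Policy and outcome, and the three result strings, are hypothetical and appear nowhere in XPP.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical labels for the four behaviors listed above."""
    READ_ALL = 1               # external declarations are always read
    READ_IF_NEEDED = 2         # read only when standalone='no'
    REFUSE_NON_STANDALONE = 3  # never read; signal when standalone='no'
    IGNORE_SILENTLY = 4        # never read; never signal


def outcome(policy, standalone):
    """Classify what happens to a document declaring standalone='yes' or 'no'.

    Returns 'lossless', 'error', or 'possible loss'.
    """
    if standalone not in ("yes", "no"):
        raise ValueError("standalone must be 'yes' or 'no'")
    if standalone == "yes":
        return "lossless"
    if policy in (Policy.READ_ALL, Policy.READ_IF_NEEDED):
        return "lossless"
    if policy is Policy.REFUSE_NON_STANDALONE:
        return "error"
    return "possible loss"


for policy in Policy:
    print(policy.name, outcome(policy, "yes"), outcome(policy, "no"))
```

Only the last policy ever yields 'possible loss', and it does so without telling anyone; that is the case a spec writer would want a profile name to exclude.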
Suggested fix: augment the basic profile to require
either that external declarations be read when necessary or that the
processor signal an inability to handle non-standalone documents
properly. Optionally also keep the profile now called basic, giving
it a new name (personally, I could go for “sub-optimal”, but some
people might think that that name was ungenerous).
3. Validating processors
Why do none of the defined profiles include validation? Are there no
validating XML parsers? Or is it the view of the authors that in
characterizing an XML processor it is unimportant whether the
processor performs validation or not?
This reader particularly objects to the use of the word
recommended to name a processor profile which
does not involve validation of the input. It might be better to avoid
the value judgement intrinsic to the word. But if you are going to
make the value judgement, then I think you should make the correct
value judgement, which is that validation is a more valuable service
than non-validating well-formedness checking.
Suggested fix:
- Define at least one profile for validating processors.
- Either eliminate the name recommended
or reserve it for validating processors.
4. Definitions of terms
The specification has a section on terminology; I think this is
helpful. It could be made more helpful if the terminology section
were more systematic.
A section on terminology can help the reader of a spec by
identifying the key concepts of the specification and defining the
terms used to denote them. It can help the authors of the spec by
forcing them to attempt a coherent statement of what they mean by a
given term; the effort of providing a formal definition is often
repaid by the discovery of inconsistencies or infelicities in the
authors' understandings of key concepts. In the current draft,
however, the terminology section does not have much success with
either of these tasks. Some important concepts appealed to in the
current draft are not defined at all; others are hyperlinked to other
specs but the definitions of the terms are not repeated in the current
document, which makes the document unnecessarily hard to read.
Among the terms that ought to be defined, since they are
crucial to the intellectual work of the spec, are these.
- XML processor (the definition
given in the XML spec is quoted in section 1; it might usefully
be given again in the terminology section)
- rigid (used to characterize
mappings from XML documents to data model)
- profile (what is
a profile? How is it distinguished from a thing which
is not a profile? Is my cup of coffee an XML processor
profile? Why not?)
- processor profile (ditto)
- data model (in addition to
specifying what is meant by this term in XPP, the spec
should probably also take a little more care to distinguish
data models and data model instances)
- faithful provision
- expose (is faithful provision the
act of exposing? or is it possible to expose information
in a way that does not constitute faithful provision?
faithless provision?)
- construction (esp. of data
models)
- identification as IDs (esp. of
xml:id attributes)
- reading (esp. of external markup
declarations)
- processing (esp. of XML documents and
of external markup declarations)
- packaging (of information; is packaging
the same as faithful provision and/or exposure?)
- provide (of information items and
properties; identical to or different from packaging? faithful
provision? exposure?)
- implementation-defined
It is not a coincidence that defining these terms well will require
clarity in the central concepts they denote. If the concepts are
already sufficiently clear to the authors of the spec, you owe it to
your readers to share that clarity with them. If they are not
currently clear enough, then you owe it to the potential users of your
spec to sharpen them; muddy concepts lead to poor designs.
A careful, explicit definition of profile will
make it easier for readers to see whether the profiles described
later are well defined or not, and easier to judge their
utility. It might also help the authors of the spec improve the
specification of the profiles.
Among the terms that are appealed to but not defined locally are
- well-formed
- namespace well-formed
Suggested fix:
Use the terminology section to provide explicit definitions for
your key terms. Use the exercise of defining the terms to
clarify your key concepts. Revise the rest of the spec to
reflect the increased clarity of the analysis.
5. Are the profiles disjoint?
Is it intended that every processor conform to at most one profile
from among those defined here? Or is the set of processors conforming
to the minimum profile intended to be a superset of those conforming
to the basic profile?
It would be helpful to say, one way or the other.
Suggested fix: Say, one way or the other.
6. Identification of xml:id attributes as IDs
In 2.1, 2.2, and 2.3, what does it mean to “[identify] all
xml:id attributes as IDs as required by [xml:id Version
1.0]”? How does one distinguish between a processor which
satisfies this requirement and one which fails to satisfy it?
Is identification of xml:id attributes as
IDs distinct from “processing” them as IDs? As far as I
can tell, it's much easier to find rules in the xml:id spec that
define processing than to find rules that define identification
of attributes as IDs.
Suggested fix: define identification as
an ID as it is used here. (Since [attribute type] is a
class-A property, I think that identifying an attribute as an ID
simply means that attributes named xml:id are given
[attribute type] = ID. If that is what you mean, say so. Do not
assume that the reader of XPP has perfect recall of the full text of
the Infoset and xml:id specs.)
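If my reading is the intended one, the requirement is simple enough to sketch in a few lines. The following Python fragment is an assumption-laden illustration, not a statement of what XPP or the xml:id spec requires: the function name attribute_types is invented, ElementTree stands in for an infoset, and None stands in for whatever value [attribute type] would otherwise have when no declaration has been read.

```python
import xml.etree.ElementTree as ET

# ElementTree reports xml:id under the name of the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def attribute_types(root):
    """Map (element name, attribute name) to a stand-in [attribute type].

    Hypothetical reading: attributes named xml:id get type 'ID'; every
    other attribute is left as None because no declarations were read.
    """
    types = {}
    for elem in root.iter():
        for name in elem.attrib:
            types[(elem.tag, name)] = "ID" if name == XML_ID else None
    return types


root = ET.fromstring('<doc xml:id="d1"><p xml:id="p1" class="x"/></doc>')
types = attribute_types(root)
for key, value in types.items():
    print(key, value)
```

A definition at this level of explicitness would let a reader test whether a given processor satisfies the requirement or not.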
7. Processing of external declarations
Sections 2.3 and 2.4 specify “reading and processing all external
markup declarations”; what kind of processing is involved? Is
there a difference between reading and processing? (The fact that
both are mentioned may suggest that there is; the spec's habit of
using three words where one would do may suggest that there is not.)
Suggested fix: Define reading
and processing, as they apply to external
markup declarations. If they are synonymous, drop one of them.
8. Providing information items
Section 3 describes class A as “Items and properties which must
be provided by all profiles.”
This reader finds the relative clause confusing. In ordinary
English, I would have thought that XML processors provide access to,
or expose, the information in the XML document. The words quoted
suggest that it is not processors but
profiles that provide (access to) the information, which
in turn persuades me that you are either using one or more of the
terms item, property,
provide, and profile in
some special sense distinct from ordinary-language usage, or that you
are using words too carelessly. If nothing else, this sentence seems
to underscore the desirability of defining terms explicitly.
Suggested fix: Decide what it means to provide
information, and whether the act is performed by processors or by
profiles, or (by a kind of type overloading) both. Define the
relevant terms. Recast the spec to use the terms as defined.
9. Data models and information sets
At several points, the words of the spec suggest that XPP believes
itself to be in the business of talking about how processors map from
sequences of characters matching the document production
in the XML spec to data models (or more probably instances of data
models). But nothing in the spec actually talks about anything that
seems recognizable as a data model (in my understanding of any of the
many ways the term is normally used); the most visible differences among
the profiles have to do with how much information they provide, and in
some cases with whether particular items in the input are
characterized in accordance with their declarations or not.
Suggested fix: Either define what you mean by
construction of a data model and recast the
definitions of the profiles so that they do in fact say how it is to
be done, or stop talking about data models and admit that XPP is about
which subset of the defined infoset is provided by processors.
10. Rigidity
Section 1 reads in part
Such definitions [of XML
applications] have suffered to some extent from an uncertainty
inherent in using that kind of foundation, in that the mapping XML
processors perform from XML documents to data model is not rigid.
This way of putting things seems to convey an implied rebuke to the
authors of the XML specification and to reflect an assumption that
they were trying to define a rigid mapping, but that they failed. The
rebuke may be warranted (although I think the failure to specify the
information a parser must provide is more clearly an error than the
decision not to specify a data model), but the historical assumption
is false. I think it would be historically more accurate to say that
the authors of the XML specification intended the XML specification to
be compatible with a wide variety of data models and processor
interfaces and that the XML specification, as a result, provides a
flexibility and generality that the authors of other specifications
have not always found helpful. In search of flexibility, the XML spec
leaves to other specifications some responsibilities which,
empirically, other specifications have not always bothered to
discharge.
This is primarily a rhetorical issue, but it seems important
because it touches on the purpose of this specification and the task
it sets itself to solve.
Suggested fix: If you think the XML spec screwed up,
say so cleanly and say how XPP proposes to mitigate the damage. If
you think the XML spec got it right but later users of XML have
screwed up by not meeting their responsibilities, then say so cleanly
and say how XPP proposes to mitigate the damage. If you don't think
anyone screwed up, strictly speaking, but the situation can still be
improved, then find a non-pejorative characterization of how the
current situation arose and explain in neutral terms how it can be
improved.
11. Relation of profiles to current practice
When profiles are defined for a new specification, they involve
predictions about which kinds of variation in processor behavior are
likely to be interesting and useful to developers and users. In the
case of new specifications, there is no existing practice that could
be appealed to as a justification of the classification or profiles,
or to provide examples of software fitting one profile or another.
That is not the case here, and I think the specification should not
progress until an empirical survey of existing processor
characteristics is performed, as a simple way of field-testing the
profiles defined here for applicability in the real world and of
clarifying the intent of the profiles by providing examples, where
applicable, of existing interfaces or processors that satisfy the
profile.
In particular, I could have sworn (but I am too lazy to look it up
now) that I had used some parser interfaces which did not provide
access to namespace prefixes, and other interfaces which provided only
inconvenient access to namespace names. Is a set of profiles which
assumes that namespace name, local name, and prefix are always all
three provided a good match for a world in which some parser
implementors give their users a choice (prefix plus local name or
namespace name plus local name)?
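One interface of the kind I have in mind can be shown without looking far: the Element objects produced by ElementTree in the Python standard library report namespace name plus local name, and do not carry the prefix used in the source. The sample document below is invented; this is a small illustration, not a survey.

```python
import xml.etree.ElementTree as ET

# Invented sample: one element in a namespace bound to the prefix "p".
root = ET.fromstring('<p:doc xmlns:p="urn:example:ns"><p:item/></p:doc>')

# The element name arrives as "{namespace name}local name".
print(root.tag)     # prints {urn:example:ns}doc

# The namespace declaration is not reported as an attribute either,
# so the prefix "p" cannot be recovered from the parsed element.
print(root.attrib)  # prints {}
```

A processor with this interface exposes two of the three properties (namespace name and local name) but not the third (prefix), which is exactly the kind of case a field test of the profiles would bring to light.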
Note that actually classifying real parsers will require a crisp
definition of what it means to make a particular information item
available; that will be a good thing, although it is likely to involve
some work.
Suggested fix: Identify ten or twenty existing XML
processors with different behaviors (for purposes of this exercise,
all conforming SAX processors may well turn out to be alike; ditto for
conforming DOM parsers). Using the definitions given in XPP, identify
which profile(s) each parser matches, if any. If there are
significant numbers of parsers which match no profile, consider
whether the profiles need to be revised to provide a better connection
with existing practice. Use a non-normative document to provide
examples of processors matching the different profiles.
12. Implementability of the spec
The status section says
[T]his specification is not implementable
as such ....
This makes no sense.
The specification defines processor profiles for XML processors.
On the face of it, it seems entirely possible for XML processors to
make meaningful claims to having implemented one or more of the
profiles defined here — not only meaningful, but desirable as a
way of simplifying communication between software provider and
software user. If a vendor claims that an XML parser provides an
interface which matches the modest processor profile
defined here, it would (or so it seems) be quite possible to put that
claim to the test and decide rationally whether the claim is true or
not. In what sense, then, is this spec not implementable?
The status section says, further down, that XPP “is intended for
use by other specifications which themselves define one or more XML
languages”. I take that to mean that the idea is that just as XSD,
for example, now specifies that its input documents must be exposed in
a way that makes certain infoset properties available, specs might in
future say that their upstream processor must conform to this or that
processor profile. But if XPP denies that an XML processor can
implement XPP, it must follow that the processor cannot conform to
XPP, or to any profile defined by XPP. So how can your intended
clients use XPP to characterize the class of processors they
require?
As things stand, they cannot; XPP provides no conformance rules for
XML processors, so it is not in fact very useful as a tool for
identifying classes of processors.
Suggested fix:
Remove the claim that XPP is not implementable.
13. Conformance clause
The conformance clause suggests that other specifications can make
use of this one by using words such as “Conforming implementations
must construct input data models from XML documents as required by the
recommended XML processor profile.” There are at least two problems
with this formulation.
First, as noted elsewhere, XPP does not in fact define what a data
model is or how to construct one. So it's hard to see just how to
construe the phrase “construct input data models from XML documents
as required by the ... profile”.
Second, requirement is merely a name used in
specification writing to denote criteria which must be satisfied by
objects making true claims of conformance to a specification. (To
quote the ISO/IEC Directives for the structure and drafting of
international standards, a requirement is an “expression in the
content of a document conveying criteria to be fulfilled if compliance
with the document is to be claimed and from which no deviation is
permitted”.) If a spec does not define conformance for a given
class of objects, it follows logically that the spec does not (and
logically cannot) define requirements in this sense for objects of
that class.
Perhaps you are using requirement in some other sense? But no,
section 1.1 specifies that you use the word
“as described in [RFC 2119]”. RFC 2119 unfortunately provides
only an implicit definition for the term requirement, which has the
unfortunate additional property of being circular:
MUST This word, or the terms
"REQUIRED" or "SHALL", mean that the definition is an absolute
requirement of the specification
From the equation of required with must and shall, however, it seems
likely that the intent of RFC 2119 is similar to that of the ISO/IEC
Directives.
That is, as far as I can tell, the usual usage of the terms in W3C and
IETF specifications. So I infer that the word
requirement really does mean here a property
whose absence will invalidate any claim to conformance.
Now, the XPP spec explicitly denies that processors can conform to XPP.
It follows logically that neither XPP nor the profiles defined in XPP
define requirements for XML processors, and that any spec that
requires XML processors to conform to a given processor profile
defined in XPP is asking them to perform the impossible, to conform to
a specification which denies that they can possibly conform to it, and
also making a vacuous requirement, that they satisfy the requirements
of a specification which formally defines no requirements.
Suggested fix:
- Replace the phrase “construct input data models ...” with
some words which are given a meaningful definition by the spec. Or
alternatively define what is meant by data
model, construct, etc.
- Define criteria for conformance of XML processors to
the profiles defined.
14. Documentation of implementation-defined features
In section 3, the definitions of classes V and X say
that processors “should document whether
they provide this information to applications or not.”
The term implementation-defined is used by
other specifications, both within and outside the W3C, to characterize
features or behaviors which conforming processors are required to
document as part of a claim to conform to the specification. If the
term is taken to have that meaning here (XPP does not define the term,
so it's hard to be sure), then the statement that processors
“should” document their behavior is logically
inconsistent: if the behavior is implementation-defined, then the
correct verb is must, not
should.
If the intent is to specify that behavior is allowed to vary from
processor to processor and that processors are not required to
document their behavior as a condition of conformance, then the term
implementation-dependent is less flagrantly
inconsistent with usage in other W3C specifications. It, too,
however, is incompatible with the following sentence, since by default
implementation-dependent is used to
characterize features and behaviors which should not be
documented, since users of the technology in question are to be
discouraged from relying on particular processor-dependent behaviors.
In any case, I do not believe there can be a good reason for a
processor not to provide the documentation in question, so I think
should is out of place. It should be a
requirement, and the verb to use is must.
I note in passing that the use of should
here is logically incompatible with XPP's failure to define
conformance criteria for XML processors.
Suggested fix: Add a definition of
implementation-defined compatible with that
used in XPath 2.0 and related specs, and delete the sentence saying
that processors “should” document the behavior in
question. (Optionally add a redundant statement saying that they
“must” document the behavior, or better a note
observing that it is a consequence of the behavior's being
implementation-defined that the implementation must define it.)
15. The information expressed in XML documents
Section 3 begins
For the profile definitions above
and the invariants below, we categorize the information expressed in
XML documents into a number of (overlapping) classes.
This is incorrect. What is characterized in section 3
is not the information expressed in an XML document, but
the particular subset of that information for which the
Infoset spec defines names. The two are the same neither in
theory nor in practice.
Suggested fix: Replace the sentence quoted with one
that's not false. Perhaps “For the profile definitions above and the
invariants below, we categorize the information identified and named
in [XML Information Set] into a number of (overlapping)
classes.”
16. The information classes
A reader might plausibly wonder about the principles which guided
the classification of information items and properties in section
3. At least, this reader wonders. After reading the spec, I'm
still wondering. An explicit statement of the principles which
guided the classification should be provided.
Some description of why the letters A, B, P, V, and X were chosen
as the names for their classes would also help.
It would be preferable, I think, for the classes to be
characterized in terms of their content rather than solely in terms of
which processor profiles are required to expose them. As it is, the
statements that class A consists of “Items and properties which must
be provided by all profiles” and class B of “Items and properties
which must be provided by 2.3 The modest XML processor profile and 2.4
The recommended XML processor profile” look as if they are intended
to serve as definitions, but as definitions they are wholly unsuitable
and as normative statements they are wholly redundant with 2.1 through
2.4.
Suggested fix: Characterize classes A, B, etc. not
in terms of which profiles they are associated with but in terms of
what information they contain. Either explain the choice of letters,
or label the classes A through F. Optionally make the classes
disjoint to reduce confusion.
17. Recursive XInclude processing
It might be helpful to readers to remind them in a note that
XInclude requires recursive processing of include
elements, so that the output of a processor matching the
‘recommended’ profile will be guaranteed never to
contain xi:include elements.
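The effect of recursive processing can be sketched in Python as follows. This is a deliberately simplified stand-in, not an XInclude implementation: it handles only include elements with an href and XML parsing, ignores parse="text", xpointer, and fallback, and performs none of the namespace, xml:base, or xml:lang fixup. The document names, the DOCS table, and the function expand are all invented for the illustration.

```python
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"
XI_NS = 'xmlns:xi="http://www.w3.org/2001/XInclude"'

# Hypothetical in-memory "files", so the sketch needs no file system:
# a.xml includes b.xml, which in turn includes c.xml.
DOCS = {
    "a.xml": '<a %s><xi:include href="b.xml"/></a>' % XI_NS,
    "b.xml": '<b %s><xi:include href="c.xml"/></b>' % XI_NS,
    "c.xml": "<c>leaf</c>",
}


def expand(elem, seen=()):
    """Replace each include child with the expanded root of its target.

    The included document is itself expanded before it is inserted,
    which is what makes the processing recursive.
    """
    for i, child in enumerate(list(elem)):
        if child.tag == XI_INCLUDE:
            href = child.get("href")
            if href in seen:
                raise ValueError("inclusion loop at " + href)
            included = expand(ET.fromstring(DOCS[href]), seen + (href,))
            included.tail = child.tail
            elem[i] = included
        else:
            expand(child, seen)
    return elem


result = expand(ET.fromstring(DOCS["a.xml"]), ("a.xml",))
remaining = [e for e in result.iter() if e.tag == XI_INCLUDE]
print([e.tag for e in result.iter()], len(remaining))
```

Because every included document is expanded before insertion, no include element survives in the result, however deeply the inclusions nest.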
Suggested fix:
Say it explicitly. Do not assume your readers have memorized
the XInclude spec.
18. Minor editorial points, typos, etc.
Some typographic and editorial problems caught my eye.
-
The term profile is not used in this
specification to denote any thing other than the processor
profiles defined here. The terms profile
and processor profile denote the
same thing. So in most of the twenty-five or so occurrences
of the phrase “processor profile”, the first word
supplies no information or meaning not already supplied
by the second word. The spec would be shorter and easier
to read if the first word were struck from, say, twenty or
so of those occurrences.
-
The status section says
[T]his specification is
not implementable as such ....
What does “as such” mean
here? Normally, one would take “such” to have an anaphoric
reference, so the sentence would be equivalent to saying “this
specification is not implementable as a specification”, but the
meaning of this rephrasing is also opaque to me. How would that
be different from not being implementable? Perhaps the anaphoric
reference is to the concept of implementation, so the phrase ought
to be expanded to “not implementable by means of an
implementation in the strict sense”. This does not seem very
promising, either.
Perhaps the simplest repair of the stylistic problem would be
to delete “as such” without replacement. (But the stylistic
problem is not the only problem here; see
12, “Implementability of the spec”.)
-
In section 1, horizontal ellipses are used with whitespace between
the full stops but without whitespace before or after the ellipsis.
For “a software module. . .used”, read “a software
module ... used” or optionally “a software module …
used” (the latter using the standard hellip
entity for character U+2026).
-
Section 1 reads in part
Another kind of
uncertainty stems from the growth of the XML family of
specifications: if the input document includes uses of XInclude,
for instance.
This is not a well-formed English sentence. Perhaps a
continuation of the sentence has been lost? Something
like “, then the results
provided by the XML processor may vary among processors, so that
the application does not know what to expect”?
-
All the manuals of style I know frown on subdividing sections
of a document into fewer than two subsections. Section 1.1 on
terminology should either be given a sibling, or folded into its
parent section, or promoted to be a sibling of its current
section.
Since terminology is not really part of the background
of the specification, the last possibility seems best.
-
In 1.1, the paragraph about base URI
says the term is used “as it is defined in [RFC 3986]”. But
RFC 3986 does not provide any definition properly so called for
the term base URI. It specifies rules for
establishing and using a base URI, but it does not “define”
it.
I think what is meant is that XPP assumes that the base
URI is established and used as specified in RFC 3986. So perhaps
read
A base URI is an absolute URI against which relative
URIs are applied; this specification assumes that base URIs are
established and used as specified in [RFC 3986].
But you should probably also decide whether XPP assumes it
or requires it.
-
In the first paragraph of 2, the phrase the steps
necessary to construct a data model from a well-formed and
namespace well-formed XML document seems ill chosen.
The descriptions that follow are not, in fact, procedural in
nature, so steps doesn't seem right.
Nor do they in fact redeem the promise of information on how
to construct a data model (or even a data model instance).
In principle, I'd like to propose better wording, but I can't
because I don't know what you are trying to say here. I think you
are mostly just trying to characterize the sections which follow
by talking about what the profiles do or are. Unfortunately, I
also don't understand precisely what you mean by the word
profile. Judging by this phrase's
flamboyant failure to connect with anything that actually happens
in sections 2.1 through 2.4, you may be experiencing some
trouble in that area, too.
-
In 2.1, the mention of information set classes A, B′, P,
and X comes out of nowhere; this reader felt completely
blind-sided.
It would be better if somewhere closer to the top of the
document there were some words to say something like
Profiles are defined in terms of a processor's
behavior with regard to external markup declarations, its support
or lack of support for xml:id and XInclude, and the
information it provides to its downstream applications. For this
purpose, section 3 of this specification partitions the
information items and properties defined by [XML Information Set]
into classes; each profile specifies which classes of information
are exposed by processors in that class.
-
In 2, the clauses about faithful provision of the
information in the document all take the form “Faithful provision
of the information ... corresponding to information items
and properties ...”.
Perhaps it would suffice to provide, or expose, the information
items and properties specified.
If it is absolutely necessary to provide not the information
items and properties themselves but instead information
corresponding to (but, implicitly, not identical to?)
the specified items and properties, then I think the spec has an
obligation to explain clearly what the difference is, and why
exposing the items and properties does not satisfy the
requirements of the spec. In particular, you need to provide an
answer to the reader who is asking “How can a piece of
information correspond to an information item without
being indistinguishable from it (qua
information) and thus without being that information
item?”
The editors might do well to review their dusty copies of
Strunk and White's Elements of style, especially
the maxim “Omit needless words”, and to revise accordingly.
If they do, the individuals corresponding to their readers will
feel an emotion corresponding to gratitude. (Or, at least, a
diminished desire to seek out sharp objects and perform dangerous
acts with them.)
-
2.4 reads (rule 4):
Replacement of all include
elements in the XInclude namespace, and namespace, xml:base and
xml:lang fixup of the result, as required for conformance to [XML
Inclusions (XInclude) Version 1.0 (Second Edition)];
This sentence seems unnecessarily awkward; this reader, at
least, found it hard to read and follow.
Perhaps “and fixup of the namespace, xml:base, and
xml:lang properties of the result ... ”?
-
In
section 3, the
labels of the classes are reduplicated. For “Class AClass A”
read “Class A”, and similarly for the other classes, and
for the lists of information items in section 4.2.
-
In 4.2.2 and 4.2.3, a number of items are described in terms
like “Entirely, per the Element case above.” The spec would
be clearer if full sentences were used; this reader is not certain
whether the intended verb is “may differ” or “may be
absent”, and I do not know what the subject of the sentence is
intended to be.
Also, my Oxford American dictionary defines
per as meaning ‘in accordance
with’, but what is meant here seems to be something more
like ‘as described in’. The Collins COBUILD
dictionary agrees with Oxford in saying that the use of
per in this way normally involves things
happening or being done “in the way that the plan, system, or
set of instructions says it should be done”. But I don't think
the description of the element case includes any instructions or
provides any sense of what should or should not be done; I think
per is out of place here.
-
In 4.2.2 and 4.2.3, the list of differences between information
sets (I think that's what is being listed) is made unnecessarily
long and opaque by being organized around classes of information
items instead of around cases of difference.
In 4.2.3, there is a list of seven items, six of which turn out
on inspection to carry as explanation the words “Entirely, for
exactly the same reason”. It would be a lot easier for the
reader to see what is going on if the list were replaced with
a paragraph:
Where processors conforming to the modest profile report
an xinclude element, processors conforming to
the recommended profile will report the result of
XInclude processing, which will consist of zero or more
element, processing instruction, unexpanded-entity,
character, or comment information items. In consequence,
the results reported by processors matching these two profiles
may differ in the presence or absence of those information
items, as well as in the presence or absence of
attribute and namespace information items on the elements
in question.
Though more detailed and clearer than the current description,
this takes less space than the current formulation.
-
In 4.2.2 and 4.2.4 various occurrences of the word
element are capitalized for no discernible
reason. Samuel Johnson is dead; it is too late to bring back his
capitalization habit.