A thought about profiles

[yes, I know, I have real editors' work to do on this spec, but this
came up in another context]

What does anyone think of the following:

   An infoset is a single-rooted, directed graph with node labels (item
   types) and edge labels (properties), where we treat literals (strings,
   integers, URIs, [anything else?]), which are always leaves, as nodes
   of type Literal.  The graph may contain cycles (courtesy of the
   [references] property on attributes and the [attributes] property on
   elements).

   We can define a *profile* (sc. for infosets) as a set of node labels N
   (which always contains at least Literal) and a set of edge labels E,
   and a *profiled infoset* P of an infoset I wrt such a profile as
   follows:

    If Document is not in N, then P is the empty graph

    Otherwise, P consists of the Document node from I (call it d), plus
    all nodes in I whose labels are in N and which are reachable from d
    by at least one path all of whose edge labels are in E and all of
    whose node labels are in N

   Or alternatively, by construction:

    If 'Document' is in N, then the Document node of I is in P;

    If a node n is in P, then
      each node
       a) which is connected from n by an edge whose label is in E, and
       b) whose own label is in N
      is in P.
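
   A rough sketch of that construction in Python, in case it helps to
   see it run.  The representation is my own assumption, not anything
   from the spec: nodes are a dict from an id to an item-type label,
   edges are (source, property, target) triples, and the names
   profiled_nodes, nodes and edges are purely illustrative.

```python
from collections import deque


def profiled_nodes(nodes, edges, root, N, E):
    """Return the ids of the nodes in the profiled infoset.

    nodes: dict mapping node id -> item-type label (assumed encoding)
    edges: iterable of (source id, property label, target id)
    root:  id of the Document node
    N, E:  the profile's sets of node labels and edge labels
    """
    # If Document is not in N, the profiled infoset is empty.
    if "Document" not in N or nodes[root] != "Document":
        return set()

    # Index outgoing edges by source node.
    outgoing = {}
    for src, prop, dst in edges:
        outgoing.setdefault(src, []).append((prop, dst))

    kept = {root}
    queue = deque([root])
    while queue:
        n = queue.popleft()
        for prop, dst in outgoing.get(n, []):
            # Follow only edges labelled in E to nodes labelled in N.
            # The 'not in kept' check is what makes cycles harmless.
            if prop in E and nodes[dst] in N and dst not in kept:
                kept.add(dst)
                queue.append(dst)
    return kept


# A toy infoset: a document, one element, one attribute, one comment.
nodes = {"d": "Document", "e": "Element", "a": "Attribute",
         "c": "Comment", "t": "Literal"}
edges = [("d", "children", "e"),
         ("e", "attributes", "a"),
         ("e", "children", "c"),
         ("a", "normalized value", "t"),
         ("a", "references", "e")]   # closes a cycle e -> a -> e

no_comments = {"Document", "Element", "Attribute", "Literal"}
props = {"children", "attributes", "normalized value", "references"}
print(sorted(profiled_nodes(nodes, edges, "d", no_comments, props)))
```

   Note that dropping Attribute from N also drops the literal t, since
   the only path to it runs through an excluded node.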

   If two XML documents D1 and D2 have identical profiled infosets for
   some profile (N, E), then any application which specifies that
   profile as its input profile SHOULD treat D1 and D2
   indistinguishably.
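
   To illustrate the intent, here is a small Python sketch that
   compares two toy infosets under a profile.  It leans on several
   assumptions of mine: nodes carry an (item type, literal value) pair,
   the edges of the profiled infoset are taken to be those labelled in
   E between kept nodes, and corresponding nodes in the two documents
   already share ids.  A real comparison would need something closer to
   graph isomorphism.  The names profile_view, D1 and D2 are
   illustrative only.

```python
def profile_view(nodes, edges, root, N, E):
    """Return (kept nodes, kept edges) of the profiled infoset as frozensets.

    nodes: dict id -> (item type, literal value or None)  (assumed encoding)
    edges: list of (source id, property label, target id)
    """
    if "Document" not in N or nodes[root][0] != "Document":
        return frozenset(), frozenset()

    kept, stack = {root}, [root]
    while stack:
        n = stack.pop()
        for src, prop, dst in edges:
            if (src == n and prop in E
                    and nodes[dst][0] in N and dst not in kept):
                kept.add(dst)
                stack.append(dst)

    kept_nodes = frozenset((i,) + nodes[i] for i in kept)
    kept_edges = frozenset((s, p, t) for s, p, t in edges
                           if s in kept and t in kept and p in E)
    return kept_nodes, kept_edges


# D1 and D2 differ only in that D1 carries a comment.
D1 = ({"d": ("Document", None), "e": ("Element", None),
       "t": ("Literal", "hi"), "c": ("Comment", None)},
      [("d", "children", "e"), ("e", "children", "t"),
       ("e", "children", "c")])
D2 = ({"d": ("Document", None), "e": ("Element", None),
       "t": ("Literal", "hi")},
      [("d", "children", "e"), ("e", "children", "t")])

E = {"children"}
without_comments = {"Document", "Element", "Literal"}
with_comments = without_comments | {"Comment"}

same = (profile_view(*D1, "d", without_comments, E)
        == profile_view(*D2, "d", without_comments, E))
differ = (profile_view(*D1, "d", with_comments, E)
          != profile_view(*D2, "d", with_comments, E))
print(same, differ)
```

   So an application whose input profile omits Comment has no grounds
   to distinguish D1 from D2, while one whose profile includes Comment
   does.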

I feel like it might be helpful to include this, as it sort of answers
the question "what are profiles _for_?" in a different and possibly
useful way. . .

ht
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]

Received on Tuesday, 3 July 2012 23:47:29 UTC