RE: Change tracking processing instructions and deletion

I misread the syntax rule for processing instructions.  XML does not forbid ">" in PIs; it forbids only "?>".  That sequence could arise in change-tracking cases (for example, when the deleted content itself contains a PI), but escaping PI-embedded PIs should be easier.
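
To make the point concrete, here is a minimal Python sketch. It first checks that a standard parser accepts a bare ">" inside a PI, then shows one possible way to escape a PI embedded in another PI. The escaping scheme and the names escape_pi_data and unescape_pi_data are my own assumptions for illustration; they are not taken from any existing change-tracking system.

```python
from xml.dom import minidom


def escape_pi_data(text: str) -> str:
    """Hypothetical scheme: make text safe to embed as PI data."""
    # Escape "&" first so the "&gt;" introduced below stays unambiguous.
    return text.replace("&", "&amp;").replace("?>", "?&gt;")


def unescape_pi_data(data: str) -> str:
    """Reverse escape_pi_data (replacements undone in the opposite order)."""
    return data.replace("?&gt;", "?>").replace("&amp;", "&")


# A bare ">" inside a PI is accepted by the parser.
plain = minidom.parseString(
    "<doc><?change a > b ?></doc>"
).documentElement.firstChild

# A deleted PI carried inside a change-tracking PI only needs "?>" escaped.
inner = '<?inner x="1"?>'
nested = minidom.parseString(
    "<doc><?change " + escape_pi_data(inner) + "?></doc>"
).documentElement.firstChild
recovered = unescape_pi_data(nested.data)
```

The parser never expands "&gt;" inside PI data, so undoing the escape is left to whatever application reads the PI.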

 - Dennis

-----Original Message-----
From: Dennis E. Hamilton [mailto:dennis.hamilton@acm.org] 
Sent: Wednesday, August 27, 2014 09:21
To: public-change@w3.org
Cc: nigel.whitaker@deltaxml.com
Subject: RE: Change tracking processing instructions and deletion

Hi Nigel,

I think one reason for the escaped attribute content (although it could just as easily be escaped element content) is that the deletion need not be well-formed, something that matters for cross-cutting deletions in schemes such as whatever it is that ODF implementations actually do.  

Furthermore, XML forbids “<” in attributes and forbids “>” in PIs.

So you can’t avoid some escaping scheme in PIs, and if attribute-like components are to follow the XML rules for attributes, I think you end up having to use &amp;lt; and &amp;gt; to keep things straight.

Although ODF does not use Processing Instructions, its <text:deletion> element is along the lines of your <delete> example (although the provenance information is separated from the deleted material differently and well-formedness of cross-cutting extractions is achieved by adding start and end tags as necessary).

- Dennis

PS: Some list-management systems turn plaintext into HTML without properly escaping literal appearances of angle brackets and ampersands.  I am hopeful this list does better than that, considering where we are.

 - - - - Original Message - - - -
From: Nigel Whitaker [mailto:nigel.whitaker@deltaxml.com] 
Sent: Wednesday, August 27, 2014 07:59
To: public-change@w3.org
Subject: Change tracking processing instructions and deletion

Hello everyone,

There's an aspect of the existing change tracking PIs that are used in a number of systems that I've often wondered about:

The PIs that are used follow the convention of using an attribute-like syntax.  It's a convention that's been adopted for standard PIs such as xml-stylesheet and xml-model.
While it's a convention, the XML spec itself doesn't say a lot about what you can/can't do in a PI.

When content including elements and attributes is deleted in change tracking systems, the content is typically escaped so that it's a legal attribute value.

Suppose I were to delete this paragraph:  <p xml:lang="en">Hello World</p>

We may see something like this (I'm generalising from what I've seen in a number of systems):

<?change user="nigel" time="2014-08-27 15:12:00" delete="&lt;p xml:lang=&quot;en&quot;&gt;Hello World&lt;/p&gt;" ?>


The angle brackets and quotes have been 'escaped' to make it a legal attribute value.  I've got code to deal with this process, but I do wonder if it's necessary and if things could be simplified?
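
For readers who haven't seen this escaping step, the following is a minimal Python sketch of it, using only the standard library. The function names are illustrative assumptions and do not come from any of the systems mentioned; the PI layout simply mirrors the generalised example above.

```python
from xml.sax.saxutils import escape, unescape


def escape_for_pi_attribute(fragment: str) -> str:
    """Escape markup so it can sit inside a double-quoted pseudo-attribute."""
    # escape() always handles &, < and >; the extra mapping adds the quote.
    return escape(fragment, {'"': "&quot;"})


def unescape_from_pi_attribute(value: str) -> str:
    """Recover the original markup from the escaped pseudo-attribute value."""
    return unescape(value, {"&quot;": '"'})


deleted = '<p xml:lang="en">Hello World</p>'
escaped = escape_for_pi_attribute(deleted)

# Build the change-tracking PI as plain text around the escaped value.
pi_text = '<?change user="nigel" time="2014-08-27 15:12:00" delete="%s" ?>' % escaped
```

Note that an XML parser hands back the PI data as one opaque string, so splitting it into pseudo-attributes and unescaping each value is the application's job.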

If we don't use attributes we could perhaps do this:

<?change
  <delete>
    <dc:creator>nigel</dc:creator>
    <dc:time>2014-08-27 15:12:00</dc:time>
    <deletedContent><p xml:lang="en">Hello World</p></deletedContent>
  </delete>
?>
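
As a rough check of this idea, the Python sketch below parses a document containing the PI above and then re-parses the PI data as markup. Two things in it are assumptions rather than part of the proposal: the <doc> and <wrapper> element names are placeholders, and the dc prefix is bound to the Dublin Core elements namespace, since the example does not say which namespace is intended.

```python
from xml.dom import minidom

# Assumed binding for the "dc" prefix; the example above does not declare one.
DC_NS = "http://purl.org/dc/elements/1.1/"
XML_NS = "http://www.w3.org/XML/1998/namespace"

document = """<doc><?change
  <delete>
    <dc:creator>nigel</dc:creator>
    <dc:time>2014-08-27 15:12:00</dc:time>
    <deletedContent><p xml:lang="en">Hello World</p></deletedContent>
  </delete>
?></doc>"""

# The parser accepts the PI and exposes its body as one opaque string.
pi = minidom.parseString(document).documentElement.firstChild

# PI data sees no namespace declarations from the host document, so a
# wrapper element supplies the prefix binding before re-parsing.
wrapped = '<wrapper xmlns:dc="%s">%s</wrapper>' % (DC_NS, pi.data)
fragment = minidom.parseString(wrapped)

creator = fragment.getElementsByTagNameNS(DC_NS, "creator")[0].firstChild.data
deleted_p = fragment.getElementsByTagName("p")[0]
lang = deleted_p.getAttributeNS(XML_NS, "lang")
```

The unescaped form works here only because the deleted content contains no "?>" and is itself well-formed; content that fails either condition would still need some escaping.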

[ ... ]

Received on Thursday, 4 September 2014 18:36:24 UTC