Re: Conditional Levels of a Schema from Arshad Noor on 2009-04-08 (xmlschema-dev@w3.org from April 2009)

From: Arshad Noor <arshad.noor@strongauth.com>
Date: Tue, 07 Apr 2009 18:46:50 -0700
To: Dieter Menne <dieter.menne@menne-biomed.de>
CC: xmlschema-dev@w3.org
Message-ID: <49DC020A.3010709@strongauth.com>

Dieter,

I am going to attempt to answer your question by providing a
solution from a different perspective - the Security one -
only because the issue you've raised stems from a security
requirement: preserving patient confidentiality based on
where the data exists/is used.

I am not an XML Schema expert and come to this forum occasionally
to get my own questions answered.  But my career is currently
focused on addressing complex data-security issues, and I believe
that the solution to your question deserves another approach.

If the security requirement I've stated above is correct, then
your approach to the solution is flawed.  You are making many
assumptions about the software and environment to preserve the
confidentiality of the patient data.  However, it is those very
assumptions by data-model designers and software programmers
that have, unfortunately, resulted in many vulnerable systems
today.  But you can solve your current design problem *and* the
security problem with what I've outlined below.

The security landscape is a vastly different environment today
than it was even five years ago, with professional attackers
being far superior to many standard software developers, IMHO,
in their knowledge of systems, software and vulnerabilities.
Evidence of this superiority is visible in the *known* breaches
listed at datalossdb.org; more troubling are the breaches that
have already happened but will only be discovered months or
years later (as with Heartland Payment Systems).

That said, I believe the solution to the problem is simple:
assume that the network is compromised and that the host is
compromised.  You may be able to rely on the software not having
been replaced on the computer/device which uses your data, but
you cannot rely on there being nothing else running on the
system that's reading files and watching traffic go by on the
network adapter.

If the network/host are assumed to be compromised, how do you
address this problem?  By securing the data itself, through
message-level encryption within the application!

By encrypting the data and placing just a reference to the
key-identifier (using the XML Encryption XSD), you can now
use a single XSD for your own data and leave the patient
data in there all the time (minOccurs="1" all the time).
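
As a rough sketch of what that single XSD could look like: this
assumes the patients element is changed to wrap an
xenc:EncryptedData element instead of the plaintext patientsType
(my assumption, not something taken from the hrmconsensus
schema), and that a local copy of the W3C XML Encryption schema
is available under the hypothetical file name xenc-schema.xsd.
deviceType is left undefined, as in the simplified example quoted
at the end of this message.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:xenc="http://www.w3.org/2001/04/xmlenc#"
           version="0.2">
  <!-- Assumption: local copy of the XML Encryption schema -->
  <xs:import namespace="http://www.w3.org/2001/04/xmlenc#"
             schemaLocation="xenc-schema.xsd"/>
  <xs:element name="xhrm">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="device" type="deviceType"/>
        <!-- Always present (minOccurs defaults to 1), always encrypted -->
        <xs:element name="patients">
          <xs:complexType>
            <xs:sequence>
              <xs:element ref="xenc:EncryptedData"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```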

The difference is, those who need to see the actual data -
the hospital, for instance - would have the authorization to
retrieve the decryption-key from their key-management system
and read the data, while all others would not be able to see
it, despite having the data, knowing the key-identifier and
even knowing where to retrieve it from (we create open-source
software that provides this level of security).
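
For illustration only, the instance document that every
recipient receives might look roughly like the following,
assuming the schema lets patients carry an xenc:EncryptedData
child.  The key name, the choice of algorithm and the cipher
value are placeholders I made up; the element names and URIs
come from the XML Encryption and XML Signature specifications.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xhrm xmlns:xenc="http://www.w3.org/2001/04/xmlenc#"
      xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
  <device><!-- device details, as defined by deviceType --></device>
  <patients>
    <xenc:EncryptedData Type="http://www.w3.org/2001/04/xmlenc#Content">
      <xenc:EncryptionMethod
          Algorithm="http://www.w3.org/2001/04/xmlenc#aes256-cbc"/>
      <ds:KeyInfo>
        <!-- Hypothetical key-identifier; the key itself is not here -->
        <ds:KeyName>hypothetical-key-id-0001</ds:KeyName>
      </ds:KeyInfo>
      <xenc:CipherData>
        <!-- Placeholder Base64, not real ciphertext -->
        <xenc:CipherValue>cGxhY2Vob2xkZXI=</xenc:CipherValue>
      </xenc:CipherData>
    </xenc:EncryptedData>
  </patients>
</xhrm>
```

Anyone can read the KeyName, but only a party authorized by the
key-management system can turn it into the decryption key.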

This is a radically new paradigm for data-protection.

It allows you to stop worrying about whether the data belongs
in a specific place/application/device/etc. and lets you focus
on just managing access control to your keys.

It also solves your data-design problem: the patient data is
always present in the XSD, and the application rules are simple
too: unless the recipients are authorized to retrieve the key,
the extra data is just noise.  (That is the only downside: the
data is always present.  But in these days of megabit speeds to
mobile devices, and gigabit to desktops/laptops, I'm not so sure
it's an issue for new applications.)

I'm not trying to detract from the interesting discussion on
conditional processing of XSD elements - I'm sure there are many
other examples where such rules must be addressed.  I've only
offered this alternative, because of the underlying security
requirement in the problem statement.

Regards,

Arshad Noor
StrongAuth, Inc.


Dieter Menne wrote:
> Hi,
> 
> we are currently defining a format for medical data storage
> (hrmconsensus.org). The full version is available at
> http://hrmconsensus.org/media/hrm/xhrm/xhrm02/xhrm0_2.xsd
> 
> In the simplified example below, we have the always mandatory deviceType. For
> patientsType, we would like to have a global conditional switch so that
> three flavors are possible
> 
> -- minOccurs = "0" for internal clinical use
> -- minOccurs = "1" for archiving, must contain patient info
> -- minOccurs = "never" anonymized, must not contain patient info
> 
> I know that the latter is not possible, that conditionals are not supported
> in XSD, and that Schematron would be an alternative.  Note that the
> conditionals occur in several nesting levels, so that we cannot easily
> combine versions of a master element with details, but they are always of
> the type "may", "must", "must not".
> 
> We would like to avoid having several xsd files and prefer a common file
> with branching. Any ideas or references to ideas are appreciated.
> 
> Dieter Menne 
> on behalf of the hrmconsensus group.
> 
> 
> <?xml version="1.0" encoding="utf-8"?>
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" version="0.2">
> 	<xs:element name="xhrm">
> 		<xs:complexType>
> 			<xs:sequence>
> 				<xs:element name="device" type="deviceType"/>
> 				<xs:element name="patients" type="patientsType" minOccurs="0"/>
> 			</xs:sequence>
> 		</xs:complexType>
> 	</xs:element>
> </xs:schema>
>
Received on Wednesday, 8 April 2009 01:47:46 UTC