- From: Butler, Mark <Mark_Butler@hplb.hpl.hp.com>
- Date: Fri, 21 Mar 2003 15:38:17 -0000
- To: "'www-rdf-comments@w3.org'" <www-rdf-comments@w3.org>
- Cc: "'Java Community Process JSR #188 Expert List'" <JSR-188-EG@JCP.ORG>
Dear Colleagues,

> Mark and Colleagues,
>
> Thank you for taking the time to review RDFCore's documents and provide
> feedback.
>
> Your comment has been recorded as
>
> http://www.w3.org/2001/sw/RDFCore/20030123-issues/#jsr188-01
>
> The WG will consider your comment and you will hear further from us in due
> course.
>
> I note your concerns with local datatyping as:
>
> a) it uses up more network bandwidth.
> b) you are concerned about inconsistency
>
> Concerning a:
>
> - is there a quantitative assessment of the impact on bandwidth
> - has the use of entity declarations to provide a more compact
>   representation been considered
> - has the use of DTD default attributes to provide a more compact
>   representation been considered
>
> Concerning b:
>
> - could you provide an example of the sort of inconsistency you are
>   concerned about.

I'm sorry about the delay in replying with these additional comments. I circulated an earlier version of these comments to the JSR-188 EG and revised them based on the group's feedback. However, I have not had time to circulate the revised comments to the JSR-188 EG for a second review. Therefore please regard these comments as being submitted by me rather than by JSR-188 for the moment.

==================

Firstly, to help you resolve this issue, I would like to explain how I would like to see it resolved:

1. Most preferred solution: Allow both global and local datatyping.

2. Secondary solution: The use of DTD default attributes (DTD-DA) is a good suggestion and one I had not considered. The primary barrier to adopting it is the difficulty of using DTDs with the current RDF/XML serialisation. However, if the RDF/XML serialisation were simplified so that it is more compatible with DTDs, as proposed in http://www.w3.org/2001/sw/RDFCore/20030123-issues/#xmlsch-12, then CC/PP and UAProf could adopt the simplified syntax. This in itself would resolve many of the problems I outline below.
In addition, the use of DTD default attributes would avoid the need for authors to include datatyping information "by hand" in CC/PP and UAProf documents. So here my solution would be to resolve xmlsch-12 by setting up a work item to produce a document that proposes a simplified RDF/XML syntax that allows the DTD-DA solution to be used. This could then be used by CC/PP and UAProf. Note that this work item should not delay progress on the existing RDF documents.

==================

Secondly, here are some more detailed answers to your questions:

Of these two issues, I feel that inconsistency, and the added difficulty it creates for RDF/XML authors, is the more important.

RDF/XML is often described as a format for machines, not people. However, currently many people using RDF, including authors of UAProf and CC/PP profiles, have to author RDF/XML directly. From my work on CC/PP and UAProf I have observed that authors have difficulty authoring RDF/XML, and I attribute this to a number of reasons:

- unfamiliarity with the RDF/XML format
- confusion over the multiple serialisation forms
- difficulty reading the striped syntax
- confusion arising from revisions in the format (for example, most profiles do not place a namespace prefix in front of attributes - see http://w3development.de/rdf/uaprof_repository/)
- the lack of automatic validation tools. Here I don't mean tools like the W3C RDF validator that just validate documents as being valid RDF. I mean tools that validate a document to check that it conforms to a specific schema structure and a controlled vocabulary. For example, in UAProf it is common to encounter misspelt properties e.g.

<prf:AudioEncorder> instead of <prf:AudioEncoder>
<prf:BitsPerPixels> instead of <prf:BitsPerPixel>

In XML it is common practice to use DTDs or Schemas to validate documents.
Although it is possible to use RDFS to perform some validation of RDF/XML, this necessitates the creation of custom validation tools, which is unnecessary with XML - for a fuller description of the problems see http://www.hpl.hp.com/techreports/2002/HPL-2002-268.html

As issue http://www.w3.org/2001/sw/RDFCore/20030123-issues/#xmlsch-12 points out, it is difficult to use the existing XML approaches with the current RDF/XML format.

As a result of this, I am wary of any change in RDF/XML which will place additional burdens on authors. Specifically in the case of CC/PP, all CC/PP and UAProf profiles which conform to a specific vocabulary will give the same data type to specific attributes, so why include this information in every document (i.e. local datatyping), increasing the chance of errors on the part of the author? Just as it is common practice in relational databases to normalise schema design in order to avoid field replication and data integrity issues, it is desirable to do the same thing here with data type information.

I anticipate the current data type decision will cause authors to make errors such as

- omitting the data type definition altogether
- giving the attribute the wrong data type
- introducing typing errors in the URL used to indicate the data type, which will go undetected without suitable validation tools

As for the network bandwidth problem, I anticipate that adopting this approach to datatyping will make profiles 20-30% larger than existing profiles, depending on whether or not profiles use DTD entities, as you suggest. I provide details of my calculation below.

INCREASES RESULTING FROM DATATYPING

In UAProf version 20010330, 20 out of the 62 attributes will require datatyping. Here we will assume that the estimated length of an attribute definition is 41 characters e.g.

<prf:ColorCapable>Yes</prf:ColorCapable>

With datatyping, this increases to 97 characters.

<prf:ColorCapable rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">Yes</prf:ColorCapable>

Profiles have a static amount of structure, which is approximately 1270 characters, e.g.

Namespace information

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:prf="http://www.wapforum.org/profiles/UAPROF/ccppschema-20010430#">
<rdf:Description rdf:ID="MyDeviceProfile">

and six component declarations

<prf:component>
<rdf:Description rdf:ID="HardwarePlatform">
<rdf:type rdf:resource="http://www.wapforum.org/profiles/UAPROF/ccppschema-20010430#HardwarePlatform"/>
</prf:component>

Therefore current profile length
= 1270 + (62 * 41)
= 1270 + 2542
= 3812

After datatyping, profile length
= 1270 + (20 * 97) + (42 * 41)
= 1270 + 1940 + 1722
= 4932

This increases profile length by approximately 30%.

USING ENTITIES

Declaring entities at the top requires a fixed section 247 characters long

<!DOCTYPE rdf:RDF [
<!ENTITY type-boolean 'http://www.w3.org/2001/XMLSchema#boolean'>
<!ENTITY type-number 'http://www.w3.org/2001/XMLSchema#number'>
<!ENTITY type-dimension 'http://www.wapforum.org/profiles/UAPROF/ccppschema-20010430#dimension'>
]>

Using entities gives an estimated attribute length of 70 characters

<prf:ColorCapable rdf:datatype="&type-boolean;">Yes</prf:ColorCapable>

Therefore using entities, profile length
= 1270 + 247 + (20 * 70) + (42 * 41)
= 1270 + 247 + 1400 + 1722
= 4639

Here profile length is increased by approximately 20%.

Mark H. Butler, PhD
Research Scientist
HP Labs Bristol
mark-h_butler@hp.com
Internet: http://www-uk.hpl.hp.com/people/marbut/
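
The size estimates in the message above can be re-run with a short script. What follows is only a sketch in Python and is not part of the original calculation. The constant and function names are illustrative, and the per-element lengths are the estimates quoted in the message, not values measured from real profiles.

```python
# Figures quoted in the message; all lengths are the author's estimates.
STATIC_CHARS = 1270        # namespace information plus six component declarations
TOTAL_ATTRS = 62           # attributes in the UAProf vocabulary
TYPED_ATTRS = 20           # attributes that would need rdf:datatype
PLAIN_LEN = 41             # attribute element without datatyping
TYPED_LEN = 97             # attribute element with a full datatype URL
ENTITY_DECL_CHARS = 247    # fixed DOCTYPE section declaring the entities
ENTITY_TYPED_LEN = 70      # attribute element using an entity reference


def profile_length(typed_len, fixed_overhead=0):
    """Estimated profile size when typed attributes take typed_len characters."""
    untyped_attrs = TOTAL_ATTRS - TYPED_ATTRS
    return (STATIC_CHARS + fixed_overhead
            + TYPED_ATTRS * typed_len
            + untyped_attrs * PLAIN_LEN)


def percent_increase(new, old):
    """Relative growth of new over old, in percent."""
    return 100.0 * (new - old) / old


current = profile_length(PLAIN_LEN)
local_typed = profile_length(TYPED_LEN)
with_entities = profile_length(ENTITY_TYPED_LEN, ENTITY_DECL_CHARS)

print(current, local_typed, with_entities)
print(round(percent_increase(local_typed, current), 1))
print(round(percent_increase(with_entities, current), 1))
```

Changing TYPED_ATTRS or the per-element lengths shows how sensitive the 20-30% range is to those assumptions.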
Received on Friday, 21 March 2003 10:38:48 UTC