- From: Sebastian Hammer <quinn@indexdata.dk>
- Date: Wed, 11 Sep 2002 16:06:14 +0200
- To: Alan Kent <ajk@mds.rmit.edu.au>, ZIG <www-zig@w3.org>
Hiya,

We took a long look at nested attributes as a more expressive, direct way to search structured data, such as data models that are abstractly viewed as "XML-like" rather than "GRS-1-like". Our conclusion so far has been that nested attributes are both too clunky and not powerful enough. If you consider them as a mechanism, say, to increase the appeal of Z39.50 (or more likely, SRW) to XML-oriented communities, the obvious question may be, "what's wrong with XPath?".

By introducing XML as a record syntax (rather than telling everybody to munge their XML into GRS-1), we began what I think is a healthy path towards adopting popular mechanisms from other communities rather than telling everybody that it's our way or the highway. Put another way, the adoption of XML only makes sense (to me) if we see it as part of a greater move where we look to increase the power (and market appeal) of Z39.50 by bringing in the best of the XML family of languages where they have something to offer.

We'd like to propose an alternative model to nested attributes in which profiled subsets of the XPath expression language are used to identify "parts" of abstract records for searching. The benefit of this is partly that it allows people to deploy Z39.50/SRW without squeezing their data models into flat lists of USE attributes; partly that it gives even us old-timers a more powerful language.

For instance, a search for Library of Congress subject headings might be represented as:

    Access point = /bibliographic/subject[@scheme='LCSH'], term = "computer science"

and so forth.

Open issues would be how you address this in the search with an attribute set identifier. You could imagine per-schema OIDs allocated by communities who need this type of communication, or a single core OID identifying the practice. The ideal would be an extension of the Type-1 query which allowed us to identify an XML schema (or namespace?).
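To make the access-point idea concrete, here is a minimal sketch in Python of evaluating such a path against an XML record. It is only an illustration under stated assumptions: the sample record, the `matches` helper, and the case-insensitive substring test are all hypothetical, and it uses the limited XPath subset built into the standard library's `xml.etree.ElementTree`. It is not how Zebra or any Z39.50 server implements this.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the element names follow the example path above.
RECORD = """
<bibliographic>
  <title>Introduction to Algorithms</title>
  <subject scheme="LCSH">Computer science</subject>
  <subject scheme="MeSH">Algorithms</subject>
</bibliographic>
"""

def matches(record_xml, access_point, term):
    """Return True if `term` occurs in any element selected by `access_point`.

    Assumption: `access_point` is an absolute path whose first step is the
    plain name of the record's root element (no predicate on that step).
    """
    root = ET.fromstring(record_xml)
    # ElementTree evaluates paths relative to an element, so split off the
    # root step and check it by hand.
    first, _, rest = access_point.lstrip("/").partition("/")
    if first != root.tag:
        return False
    nodes = root.findall("./" + rest) if rest else [root]
    # Assumed matching rule: case-insensitive substring of the element text.
    return any(term.lower() in "".join(n.itertext()).lower() for n in nodes)

print(matches(RECORD, "/bibliographic/subject[@scheme='LCSH']", "computer science"))
print(matches(RECORD, "/bibliographic/subject[@scheme='MeSH']", "computer science"))
```

A profile would fix the set of permitted path expressions in advance, so a server could map each one to a prebuilt index rather than evaluating paths at query time as this sketch does.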
I would much rather see a scheme like this introduced than a primitive mapping of an XPath subset into nested attributes. The richness/complexity of XPath doesn't have to be a factor in any given application because profiles will be free to, well, profile a specific, fixed subset of expressions (they then become equivalent to flat lists of numerical attributes, only in a somewhat more intuitive space than flat lists of integers).

We have the XML records. Let's not butcher the XML model by trying to squeeze it through a hole which isn't really big enough and which we haven't really formed a tradition for using anyway.

We have server code which supports the practice described here, so if anyone would like to try interoperability, give me a holler or download Zebra.

--Sebastian

At 12:45 11-09-2002 +1000, Alan Kent wrote:
>I had a question regarding nested attributes in the new attribute
>architecture. I was trying to work out the maximal power they can
>deliver.
>
>Rather than use numeric values, I will use XPath like syntax with
>element names. (Values can be strings after all!)
>
>My understanding is nested attributes will allow me to do queries
>such as
>
>    Access Point: /head/title
>    Term: Lessons in Life
>
>The attribute list for the access point would list two values ('head'
>followed by 'title') for the one attribute type ("1" for Access Point).
>I can also do wild paths allowing
>
>    Access Point: //title
>    Term: Lessons in Life
>
>That is, search in *any* access point where the last attribute value is
>title.
>
>What does the following mean?
>
>    Access Point: //title
>    Format/Structure: All these words
>    Term: Lessons Life
>
>Do the two search terms have to appear under the same 'title' or can they
>appear in different 'title' attributes? (If 'Lessons' appears under
>/head/title and 'Life' appears under /body/title in the same record,
>should the record match?)
>
>Then, pushing things a bit further, can I say under the same 'author'
>access point the 'firstname' access point must equal "John" and the
>'lastname' access point must equal "Smith"?
>
>The following query does not mandate first name and last name be for
>the same author in the record (if there are multiple authors)
>
>    /author/firstname = John
>    AND
>    /author/lastname = Smith
>
>You need something like a PROX operator with an attribute list:
>
>    /author/firstname=John
>    Within-the-same /author
>    /author/lastname=Smith
>
>Maybe hijack the 'private' choice of 'proximityUnitCode' of the proximity
>operator to specify the leading path length that has to be the same...
>Ok, pretty yucky. The current KnownProximityUnits are actually not
>very useful as I really want to specify proximity with respect to
>the same attribute lists being specified in the query. How about
>a third CHOICE under proximityUnitCode being an AttributeList?
>
>Just wondering since nested attributes were put in how far someone had
>thought them through. Getting a simple path indexing scheme into Z39.50
>would certainly be a nice extension! And much more feasible to implement
>efficiently than full XPath or XML Query etc.
>
>Alan
>--
>Alan Kent (mailto:Alan.Kent@teratext.com.au, http://www.mds.rmit.edu.au/~ajk/)
>Project: TeraText Technical Director (http://teratext.com.au) InQuirion
>Pty Ltd
>Postal: Multimedia Database Systems, RMIT, GPO Box 2476V, Melbourne 3001.
>Where: RMIT MDS, Bld 91, Level 3, 110 Victoria St, Carlton 3053, VIC
>Australia.
>Phone: +61 3 9925 4114 Reception: +61 3 9925 4099 Fax: +61 3 9925 4098

--
Sebastian Hammer, Index Data <http://www.indexdata.dk/>
Ph: +45 3341 0100, Fax: +45 3341 0101
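
The same-author question raised in the quoted message can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the `<record>` wrapper element, the sample data, and both helper functions are hypothetical, and the code only contrasts the two semantics; it does not model the Type-1 query or the proximity operator.

```python
import xml.etree.ElementTree as ET

# Hypothetical record with two authors; neither one is "John Smith".
RECORD = """
<record>
  <author><firstname>John</firstname><lastname>Doe</lastname></author>
  <author><firstname>Jane</firstname><lastname>Smith</lastname></author>
</record>
"""

def record_level_and(root):
    """Two independent path searches ANDed at the record level."""
    has_john = any(e.text == "John" for e in root.findall("./author/firstname"))
    has_smith = any(e.text == "Smith" for e in root.findall("./author/lastname"))
    return has_john and has_smith

def same_author(root):
    """Both conditions must hold within a single <author> element."""
    return any(
        a.findtext("firstname") == "John" and a.findtext("lastname") == "Smith"
        for a in root.findall("./author")
    )

root = ET.fromstring(RECORD)
print(record_level_and(root))  # the record matches although no author is John Smith
print(same_author(root))       # no single author satisfies both conditions
```

In XPath terms the second behaviour corresponds to putting both conditions in predicates on the same step, along the lines of `/author[firstname='John'][lastname='Smith']`, which is the kind of scoping the "Within-the-same" idea is after.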
Received on Wednesday, 11 September 2002 10:05:00 UTC