- From: Sebastian Hammer <quinn@indexdata.dk>
- Date: Fri, 13 Sep 2002 09:08:26 +0200
- To: Alan Kent <ajk@mds.rmit.edu.au>, ZIG <www-zig@w3.org>
At 12:07 13-09-2002 +1000, Alan Kent wrote:
>Interesting. A few questions if I may to tease things out a bit.
>You say above there were open issues on how to put it into Z39.50,
>then you have something implemented - how did you put it into the
>protocol? (I was curious to the level of functionality you thought
>you would need.)

As you suggested yourself, we use a string-valued (complex) attribute. The big difference from the model hinted at in the attribute architecture is that nested attributes are not used -- all of the path information is held in a single string-valued attribute, and the syntax used to pick apart elements is an XPath subset.

>For example, I could imagine using a 'string' attribute value which
>was an XPath expression (you can use string values using 'complex'
>attribute values). So you define a special set with a single type
>(1 = Access Point?) where the value for the type is the XPath expression.

Precisely. In our server we take the liberty of cheating a little to make testing more comfortable: we have assigned an OID for the generic practice, in our own OID space, but if our server receives a string-valued attribute of type 1, it currently assumes that it's probably an XPath expression (that's the kind of thing that will come back to bite you if the concept takes off, but that seems like a luxury problem at the moment).

>But you mention another point above which is an XPath expression
>with namespaces requires scope information - namespace prefixes
>for use in the XPath expression itself. (I did not fully understand
>what you meant when talked about XML schemas - you don't need a
>schema to evaluate an XPath expression - or is that the logical
>schema you want to search - separating logical schemas from whatever
>the physical representation of the data is).

Your parenthesized guess is correct. It's true you don't need the scope information, and we currently don't use it.
I described it because it seems to me that if we want to retain the possibility of a split between the search and retrieval data models, then it would be useful to identify a (possibly abstract) set of elements at search time. Our current implementation is based strictly on the contents of the internal record, though.

>Or is the idea that the 'XML' representation is just a semantic model
>mapped on to the real physical model (whatever it is, including MARC).
>That way, we can just say semantic models should not use namespaces.

I think so.

>I am not (yet!) completely convinced introducing XPath as a means
>of specifying access points into Z39.50 is a good thing. Most systems
>predefine indexes on particular access points etc. Having a completely
>dynamic scheme such as XPath for identifying access points may require
>a completely different indexing and query engine to existing systems.

It certainly makes *possible* a completely different indexing engine... our conclusion has been that if we want our engine to be relevant in IR applications outside of libraries, then we'd better think hard about how to allow people to express access points in language they find natural. But it doesn't *require* a different indexing engine, because just as profiles mandate exact attribute combinations today, so they may well mandate specific element patterns in lieu of USE attributes, and simple servers can be implemented using string-matching against predefined patterns. The string patterns are more verbose than USE attributes, for sure, but you can make that argument against XML as a whole, and verbosity is clearly not what people choose their information structuring framework by.

>Hence it's unlikely any existing systems would move forward to use it.
>The alternative is to enumerate all the paths a system will support
>so you can use XPath expressions to identify a path, but it's really
>just a fancier naming scheme (attributes instead of a number have
>a really long string - but the Z39.50 server just treats the string
>effectively as a name - if the XPath expression is in the list of
>supported expressions, great it's supported.)

It's a fancier naming scheme that's directly intuitive and accessible to anyone with a background in XML. In that respect, it's an attempt to open up Z39.50 for use by these groups without altering the fundamental model of the protocol. I'd say the cost of migration to this practice could easily be less than that required for SRW adoption... but then, migration is not required -- the purpose of this is chiefly to make the protocol more appealing to new communities, and to make Z39.50-based IR systems more versatile.

>Another option (not debating merit yet) is to *encode* requests using
>existing numeric attribute schemes. Client applications may allow users
>to enter an XPath like syntax. A new Explain category could always be
>introduced containing the information for how to convert XPath expressions
>into attribute lists etc. The main benefit is that it's not that radical a
>change to Z39.50 as is. Another thing is any namespace stuff can be
>done by the client when mapping to attribute values. (The namespaces
>are almost more like OIDs identifying the set they come out of.)

The problem is that you're superimposing a fairly complex mapping scheme on top of something which is very simple and intuitive to many people (XPath). I just don't think that will fly.

>So I guess the bottom line is how radical you are willing to get
>in terms of a model shift. Internally here, major changes are much
>less likely to get up.
I think the key may be to think about this as an incremental step towards adopting current industry standards in a way that doesn't have to be very intrusive to present systems, but which can potentially increase the value of Z39.50-based systems manifold by freeing them from certain conventions in Z39.50 that are (perhaps) even more obscure and unfriendly to outsiders than our use of BER.

>Interesting area though.

For sure. Btw, the current version of the PQN-decoder that comes with YAZ has experimental support for this. It'd be interesting to look at how it could fit into CCL-derived query languages as well... although my guess is that it's not complicated.

--Sebastian
--
Sebastian Hammer, Index Data <http://www.indexdata.dk/>
Ph: +45 3341 0100, Fax: +45 3341 0101
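
For illustration, here is a minimal sketch of the "simple server" case discussed above, where a string-valued attribute of type 1 is matched against a predefined list of element patterns rather than evaluated as XPath. Everything in it is an assumption made for the example: the pattern strings, the index names, and the function names are hypothetical and are not taken from YAZ, from our server, or from any profile.

```python
# Hypothetical table of element patterns that a profile might mandate,
# each mapped to the name of a predefined index. Illustrative values only.
SUPPORTED_PATTERNS = {
    "/record/title": "title_index",
    "/record/author": "author_index",
    "/record/subject": "subject_index",
}


def normalize(pattern):
    """Trim whitespace and one trailing slash so trivial variants still match."""
    pattern = pattern.strip()
    if len(pattern) > 1 and pattern.endswith("/"):
        pattern = pattern[:-1]
    return pattern


def resolve_access_point(attr_type, attr_value):
    """Return the index name for a type-1 string-valued attribute.

    Returns None when the attribute is not type 1, is not a string,
    or names a pattern that is not in the predefined list.
    """
    if attr_type != 1 or not isinstance(attr_value, str):
        return None
    # Plain dictionary lookup: the string is treated as a name,
    # not parsed or evaluated as an XPath expression.
    return SUPPORTED_PATTERNS.get(normalize(attr_value))
```

A server built this way would presumably reject an unsupported pattern the same way it rejects an unsupported USE attribute today; a server with a dynamic indexing engine could instead hand the string to a real XPath evaluator.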
Received on Friday, 13 September 2002 03:07:11 UTC