- From: Butler, Mark <Mark_Butler@hplb.hpl.hp.com>
- Date: Mon, 10 Jun 2002 18:23:08 +0100
- To: "'Sam Lerouge'" <sam.lerouge@rug.ac.be>, "'www-mobile@w3.org'" <www-mobile@w3.org>
Hi Sam,

The problem here is that there is a huge difference between what is available now and what people have speculatively said will be available in the future. In any kind of engineering, it is always desirable to minimize the number of dependencies on untried and untested technology. Unfortunately the W3C does not seem to be following that rule - it wants to use RDF for everything even though it is not finished! This is frustrating for companies who would like to deploy technology today.

This is not limited to CC/PP. For example, all W3C working groups manage issue lists, but they all do it in different ways, with varying success, often using different pieces of software hand-written by group chairs. Recently it was suggested that the W3C should investigate the possibility of supplying standardised tools to all groups. The W3C decided that RDF would be an ideal basis for this! I would have thought that taking an existing piece of open source software - say Bugzilla - would be a much quicker, more efficient way of solving the problem. The fact that RDF still has no agreed mechanism for datatypes is surely indicative that it is not a finished technology ready for industrial scale deployment?

The other problem is that RDF, like a number of computer technologies before it, has recently had a great deal of hype. I think this has frustrated people (it frustrates me on a daily basis) because, instead of honest discussions on what RDF can do, what it is good at, what it is not good at, what bits are missing, what needs further work etc., many discussions overemphasise the future possibilities and even have a kind of "religious" fervour (e.g. "you can't do that, it's not in the spirit of RDF").
Furthermore, as RDF is complicated and keeps changing, arguments often end up centering on minor issues to do with the serialisation rather than major ones ("you need to qualify your attributes with namespaces", "why don't you use typedNodes", "you haven't typed your component", "you should use IDs not about's" etc).

Now I may be coming across as being very negative about RDF, so I just want to say I am not dismissing RDF or the Semantic Web: I think they are promising technologies. As I see it, at about the same time that relational databases were developed, a whole set of important technologies was developed by the AI community. However, whereas we see relational databases widely used today, we don't see these AI techniques everywhere. So if the Semantic Web just means these technologies are more widely deployed and easily available to developers, that will be a good thing. However, as anyone familiar with AI knows, at the time people were tremendously disappointed because the technologies did not deliver on the hype that surrounded them, even though they did have uses. It would be a shame if the same thing happened to the Semantic Web. The other thing GOFAI (good old fashioned AI) taught us is that the knowledge encoding problem is hard and there are no real shortcuts, a fact which is equally relevant to RDF today.

So really what I am advocating is that more caution is exercised before building dependencies on unfinished technology. People seem to use the argument that we should be using RDF because in the future everything will be using RDF, so it's easier to migrate now. Contrary to what other people might think, my experience of software engineering is that the key point is to get the design right, because even if you are not using the most up-to-date technology, if the design is right it's much easier to migrate. This is because a key indicator of good design is simplicity, and if you have simplicity then migration is always easier.
If the design is poor, it doesn't matter if you are using the most up-to-date technology: with computers there is a very high likelihood that at some point you will have to migrate technology, and this is much harder with a poor design. Recent interest in techniques like "Extreme Programming" has highlighted how achieving simplicity in design helps.

> The advantages come when you start using RDF Schema. Or better, when different
> vocabularies are used, that refer to other vocabularies. These
> "inter-vocabulary relationships" are not known in XML Schema, I believe. The
> previous version of the RDF Schema specification (see
> http://www.w3.org/TR/2000/CR-rdf-schema-20000327/) gives some basic hints on
> how one could use this.

The problem is they were only hints, and as you note they are only in the previous version of this spec, not the current one. So really RDFS is at exactly the same stage as XSD: neither has a standard way of declaring inter-vocabulary relationships, and any application that requires this at the moment has to develop its own idiosyncratic way of doing it.

> I think RDF is all about machine-readability, rather than
> human-readability.

That's true, but that's a good reason why CC/PP should not use RDF. Experience has shown that CC/PP profiles tend to be generated by hand, so the fact they are written in the XML serialisation of RDF (which is more complicated than vanilla XML) creates additional difficulty, along with the fact that there is no support in RDF for validating data entered by hand as there is in XML.

> The interesting part of RDF is that a software agent that has a basic
> knowledge on some constructs (e.g. the CC/PP model and core vocabulary) can
> learn to use other vocabularies when you feed him a new RDF Schema that
> refers to the vocabularies he already knows.

People have said this (in fact I'm sure I've said it to Carl - sorry Carl!) but I don't believe it any more. This stems from my experience with UAProf.
With UAProf they followed the RDF Schema guidelines, adopting a brand new namespace for each vocabulary every time they wanted to add new attributes. When they did this, they copied all the existing attributes to this namespace, but often changed the resolution rules, the components or the data types of these attributes for good measure. This has created a nightmare scenario where you have nearly as many different vocabularies as devices, so processors have to process each device in a different way. It also means that there are lots of potential problems where you link to UAProf devices that use slightly different versions of the same vocabulary in a chain, as there is no agreed way of merging different vocabulary versions, particularly when attributes have the same meaning but some of their properties (e.g. data type, resolution rule) have changed. This problem has never been resolved.

So I think there is no getting away from the fact that
i) we need to agree on vocabularies
ii) once we've agreed on vocabularies, we need to think very carefully before we change them, because changing them creates problems and breaks things
iii) we definitely should not change vocabularies just to add new attributes; we should create new vocabularies and use the new and old vocabularies concurrently

These rules are just sensible rules for using namespaces and they are equally applicable in RDF or XML. In fact I think there is a general need to rethink how namespaces are used in association with schemas.

To give an example of this, the CC/PP WG has recently been working on a new version of the CC/PP Structure and Vocabulary document in order to move to candidate recommendation. In order to do this, the group decided it was necessary to change the namespace of the CC/PP schema. I was very reluctant to do this as I was conscious that it would break existing CC/PP processors. However, eventually the rest of the group prevailed, as it was seen as being good W3C practice to do this.
Personally I think if namespaces are to be used in this way they need two "axes" (separate data fields): a namespace axis (that identifies what it is) and a version axis (which identifies which version it is). However such changes are clearly beyond the remit of the CC/PP working group. As it is, I imagine there are lots of processors that try to solve this with regular expressions (i.e. if this namespace contains the string "CCPP", process it as CC/PP) but I don't think this is a satisfactory long term solution.

> One strong point of RDF Schema is the ability to express of relationships
> between different vocabularies. I am thinking of a useful application: when
> a content provider knows that
> a) "requested_file --mime-type--> image/jpeg", and
> b) "user_agent --accepts--> [text/html, text/plain, image/jpeg, image/gif]",
> then he should be able to deduce the client will be able to process the
> data. In order to do so, he must know the relationship between the
> "mime-type" property, that belongs to a multimedia metadata vocabulary, and
> the "accepts" property, that belongs to some CC/PP vocabulary.

Yes, but why not use the same term for both the content and the user agent, as HTTP/1.1 content negotiation does? Instead of trying to find a solution to the problem, why not avoid the problem altogether?

> Using RDF Schema (or one of the related technologies, such as DAML+OIL) to
> express both vocabularies, and their relationships, would enable the content
> provider to learn new vocabularies and their use.

The thing is I don't think content providers want to learn new vocabularies. Content providers are busy. They want very simple vocabularies which tell them exactly what they need to know. Also they don't want different devices to use different vocabularies. They want one common vocabulary so they can use it to support device independence. So whereas we *could* do this, I think we *should* be trying hard not to do this.
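
As an aside on the namespace point made earlier in this message, the difference between the substring heuristic and a namespace carrying two separate axes (what it is, and which version it is) might be sketched roughly as follows. This is only an illustration: the `VocabularyId` type, the lookup table, the example.org URIs and the version strings are all hypothetical, and are not taken from CC/PP, UAProf or any W3C specification.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical namespace URIs, invented purely for illustration.
OLD_NS = "http://example.org/2000/ccpp#"
NEW_NS = "http://example.org/2002/ccpp-schema#"
OTHER_NS = "http://example.org/notes/why-ccpp-is-hard#"


def looks_like_ccpp(namespace: str) -> bool:
    """The substring heuristic: accept any URI that mentions 'ccpp'."""
    return re.search(r"ccpp", namespace, re.IGNORECASE) is not None


@dataclass(frozen=True)
class VocabularyId:
    """Two separate fields: what the vocabulary is, and which version."""
    name: str
    version: str


# Assumed mapping from published namespace URIs to (name, version) pairs.
KNOWN = {
    OLD_NS: VocabularyId("ccpp", "2000"),
    NEW_NS: VocabularyId("ccpp", "2002"),
}


def identify(namespace: str) -> Optional[VocabularyId]:
    """Return the vocabulary identity, or None if the URI is unknown."""
    return KNOWN.get(namespace)


def same_vocabulary(a: str, b: str) -> bool:
    """True when both URIs name the same vocabulary, whatever the version."""
    id_a, id_b = identify(a), identify(b)
    return id_a is not None and id_b is not None and id_a.name == id_b.name
```

In this sketch the heuristic also accepts `OTHER_NS`, which merely mentions the string, whereas the explicit table recognises `OLD_NS` and `NEW_NS` as two versions of one vocabulary and rejects the unrelated URI. The cost is that someone has to maintain the table, which is the kind of agreement on vocabularies argued for above.
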
However, again this is verging on a "religious" style argument for some people.

Of course if we are going to use RDF, then I think that using RDF Schema for vocabularies is a very good idea. However currently CC/PP does not require this, and without validation (remember this!) there is no guarantee that even if schemas do exist they will be correct. This means there are plenty of barriers to interoperability here if people don't use the technology correctly, regardless of whether it is RDF or XML.

If

> Hope this wasn't too much information in one time.

Well I wrote for much longer than you, so not at all - I hope this wasn't too much. I guess we will see by the number of people who unsubscribe themselves this time :-)

Just one thing though - please don't take any of these comments personally. Others have definitely expressed the views you express before; it's just that the implementation experience has been rather different.

best regards

Mark H. Butler, PhD
Research Scientist
HP Labs Bristol
mark-h_butler@hp.com
Internet: http://www-uk.hpl.hp.com/people/marbut/
Received on Monday, 10 June 2002 13:24:43 UTC