- From: <juanrgonzaleza@canonicalscience.com>
- Date: Thu, 4 May 2006 04:40:04 -0700 (PDT)
- To: <www-math@w3.org>
Stan Devitt wrote:

> <juanrgonzaleza@canonicalscience.com> wrote:
> ...
>
>> ...
>>
>> What in TeX is encoded as \dot{q} was encoded in MathML in four
>> different ways...
>
> So let's look at the TeX source. If I tell you that the \dot{q}'s were
> written by 4 different authors in 4 different countries working in 4
> different branches of mathematics and cultures, then finding all uses of
> the \dot{q}'s in (even) the TeX source document still does NOT match the
> occurrences of your mathematical concept.
>
> This kind of assumption is a heuristic. ... "I think the author used this
> special character ... and it is pretty unique, so let's hope there are not
> too many of them and maybe one of them will be the one I want ..."
>
> Heuristics have their value and role, but should not be confused with
> accurate and reliable search based on semantic markup. Understand them for
> what they are.
>
> ...
>
>> Moreover, Unicode is also designed for search, and this would help search
>> engines to match.
>
> ... Sure you can find all uses of certain normalized characters, but since
> the information (authorship, subject area, concept association) is not
> known, you could have made a serious mistake by assuming that all the
> authors intended the same concept.
>
> Unicode has not dictated that all mathematicians in the world avoid using
> the given character unless it has a very precise mathematical meaning.
> Authors are free to (and must be free to) re-use notation in new contexts.
>
> The challenge here is in understanding when they do so and in
> communicating that information to the reader long after the author is no
> longer there to explain away any misconceptions.
>
>> Note the requirements and points illustrated here. Whereas I almost agree
>> with Stan Devitt's last post
>>
>> [http://lists.w3.org/Archives/Public/www-math/2006May/0010.html]
>>
>> I think that he has somewhat missed the point when he says
>> ...
>> In the above examples, one is not comparing different notations; one is
>> comparing THE SAME notation but expressed in different ways in
>> presentational MathML markup.
>
> As soon as you take such a notation out of context by, for example,
> searching across documents or through an archive of mathematical
> documents, you are inferring a meaning for all uses of that "same
> notation", and that may not have been intended by the authors. The crucial
> distinguishing information is not available in such documents.
>
> Even just searching the test document I proposed, in which we discuss
> multiple uses of a single notation, is a problem. If we mark it up with
> just a simple character representation then we already get incorrect
> matches.
>
>> <blockquote>
>> 2. It is unreasonable to expect a single concept to be "presented"
>> uniformly by all authors or applications (even as a multi-character
>> ...
>> </blockquote>
>>
>> Maybe a complete unification (a single way) was impossible, but note that
>> all outputs were generated from the same input: \dot{q}.
>
> ... the intended meaning of which may have been entirely different for the
> different authors using \dot{q}, and you have no way to tell the
> difference.
>
>> Two or three representations are preferable to a dozen. Note also that
>> Unicode defines ways to compare different codes.
>
> For accurate semantic searches, the ability to explicitly associate
> expressions to concepts is perhaps the only absolute requirement, but is
> just that -- an absolute requirement. Once you have that, the specific
> presentation chosen is almost a moot point.
>
> Heuristics are very important, but they must be understood as such and
> their proper role must be understood.
>
> Stan

Since this whole message apparently suggests that my previous messages did
not appreciate the advantages of content-oriented markup, and since that
was not the objective of this thread (which focused on a presentational
issue: mover/munder vs. Unicode), I can only recommend reading the whole
thread with care. I can also add a link to an old document

[http://canonicalscience.blogspot.com/2006/02/choosing-notationsyntax-for-canonmath.html]

with some thoughts on the advantages of content-oriented markup.

Juan R.

Center for CANONICAL |SCIENCE)
Received on Thursday, 4 May 2006 11:40:13 UTC