tag for notion and compound indication

From: <acc10-2005-67@gmx.de>
Date: Thu, 4 Aug 2005 02:13:13 +0200 (MEST)
To: www-html@w3.org
Message-ID: <20480.1123114393@www51.gmx.net>

Dear Ladies and Gentlemen, 

may I suggest the addition of one or two further kinds of tags that indicate

 a) compounds (assembled words) and
 b) words/word sets for indexing (notions)?

This issue is closely linked to semantics, and I think it is relevant to
the current XHTML design work.

- Two examples, one in German and one in English - 

In German, more complex meanings arise from nouns that are joined
together:

  So for example the compound
  "Bundesregierung" stands for 
  "federal government" and results 
  from "Bund" (federal) and 
  "Regierung" (government). 

In an index it might be useful to find a hyperlink (e.g. automatically
generated from the document) to this subject* in alphabetical order
under

  B (like) Bundesregierung 

as well as 

  R (like) Regierung, Bundes- 

and for the English counterpart under

  F (like) federal government 

as well as 

  G (like) government, federal 

( *namely the containing paragraph or something else that can be hyperlinked to )

For this purpose it is necessary to show bindings and break points. 
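
As a rough illustration of why the break points matter, here is a minimal
Python sketch that derives the two index entries shown above once the parts
of a compound are known. The function name `index_entries` and its `joined`
flag are hypothetical and only serve this example. The trailing hyphen and
the capitalized head for joined compounds simply follow the German example
above and are not a general rule.

```python
def index_entries(parts, joined):
    """Return the plain and the inverted index entry for a compound.

    parts  -- the parts of the compound, in reading order
    joined -- True if the compound is written as one word (German style),
              False if the parts are separated by spaces (English style)
    """
    if joined:
        whole = "".join(parts)
        # Assumption: the head noun is capitalized and the modifier
        # keeps a trailing hyphen, as in "Regierung, Bundes-".
        head = parts[-1].capitalize()
        modifier = "".join(parts[:-1]) + "-"
    else:
        whole = " ".join(parts)
        head = parts[-1]
        modifier = " ".join(parts[:-1])
    return [whole, f"{head}, {modifier}"]


print(index_entries(["Bundes", "regierung"], joined=True))
# ['Bundesregierung', 'Regierung, Bundes-']
print(index_entries(["federal", "government"], joined=False))
# ['federal government', 'government, federal']
```

Without the break point between "Bundes" and "regierung", a program has no
reliable way to find the second entry.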

You can imagine similar needs for personal names, when it is useful to be
able to locate them in a reference by the first as well as the second
name.

E.g.: a street index for my hometown Dresden should include the

   Gret-Palucca-Str. 

   ("Gret Palucca" was a dancer,
    "Str." is the German abbreviation for Street)

under G as well as P (but not under S).

- Suggested Solution - 

To keep the number of tags limited, I could imagine a single tag fulfilling
both of the needs mentioned. As I strongly prefer human-readable and editable
XML, a one-character tag would be preferable, to keep the source code
readable.

E.g. "<n>" for "notion": 

  Bundes<n/>regierung

  <n>federal government</n>

  <n>Gret-<n>Palucca-Str.</n></n>

I am open to discussion on whether a bare <n/> surrounded by text, as in
"Bundes<n/>regierung", should already be allowed to do the job. It can be
interpreted correctly only by determining that there is no surrounding <n>,
in which case the notion is bounded forwards and backwards by the nearest
whitespace or tag (I would prefer such a light solution). Otherwise more
explicitness would be needed, such as:

  <n>Bundes<n/>regierung</n>

You see, I assign some semantic meaning to whitespace and text (it already
has some, by definition), or in other words, to "non-letter characters".
Maybe this won't harmonize with the DOM, in case you consider this topic an
issue for the DOM.

So in case such sophistication is needed, I suppose it would have to
look like

  <n>federal <n/>government</n>

( <n/> as separator should be enough for precise addressing of both parts, as
far as I know the DOM )

or even

  <n><n>Bundes</n><n>regierung</n></n>
  <n><n>federal</n> <n>government</n></n>

then. 

( I do not like this last, effort-costing syntax at all, but I would still
consider its existence better than no implementation at all. )
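
The remark that a bare <n/> separator should be enough to address both parts
through the DOM can be checked with a small sketch. It assumes Python's
standard `xml.dom.minidom` as the DOM implementation. The helper name
`separated_parts` is hypothetical, and <n> is of course only the tag
proposed here, not an existing XHTML element.

```python
from xml.dom.minidom import parseString


def separated_parts(markup):
    """Return the text on either side of empty <n/> separators.

    In the DOM, the empty element splits the surrounding text into
    separate sibling text nodes, so each part is addressable on its own.
    """
    root = parseString(markup).documentElement
    return [
        node.data.strip()
        for node in root.childNodes
        if node.nodeType == node.TEXT_NODE
    ]


print(separated_parts("<n>federal <n/>government</n>"))
# ['federal', 'government']
print(separated_parts("<n>Bundes<n/>regierung</n>"))
# ['Bundes', 'regierung']
```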

Even more complex structures can be imagined: 

  <n><n>Financial Services</n> Authority</n>

  to be noted among
  F (Financial Services Authority) 
  A (Authority, Financial Services)
  but not S (Services Authority, Financial) 

  but

  <n>River Thames Bridge</n> 
    (or sophisticated: 
    <n><n>River</n> <n>Thames</n> <n>Bridge</n></n>)

  to be noted among R, T and B 

This means that the first-level <n> indicates the notion as a whole, while
deeper levels do the structuring (as whitespace does as well) or the
splitting of compounds.
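
The rule above can be sketched in a few lines of Python. This is only one
possible reading of the proposal, not a definitive algorithm. It assumes
that text directly inside the outer <n> is split at whitespace, that a
nested non-empty <n> counts as one indivisible unit, and that an empty <n/>
acts purely as a break point. The function names `index_units` and
`index_letters` are hypothetical.

```python
import xml.etree.ElementTree as ET


def index_units(markup):
    """Split one top-level <n> notion into its indexable units."""
    root = ET.fromstring(markup)
    if root.tag != "n":
        raise ValueError("expected a top-level <n> element")
    # Text directly inside the outer <n> is structured by whitespace.
    units = (root.text or "").split()
    for child in root:
        # A nested <n> with content is kept together as a single unit;
        # an empty <n/> contributes nothing and only marks a break.
        inner = "".join(child.itertext()).strip()
        if inner:
            units.append(inner)
        # Text following the nested element is again split at whitespace.
        units.extend((child.tail or "").split())
    return units


def index_letters(markup):
    """Return the index letters under which the notion would be filed."""
    return [unit[0].upper() for unit in index_units(markup)]


print(index_letters("<n><n>Financial Services</n> Authority</n>"))
# ['F', 'A']
print(index_letters("<n>River Thames Bridge</n>"))
# ['R', 'T', 'B']
print(index_letters("<n>Gret-<n>Palucca-Str.</n></n>"))
# ['G', 'P']
print(index_letters("<n>Bundes<n/>regierung</n>"))
# ['B', 'R']
```

Under this reading, "Services" and "Str." never become index entries,
because each sits inside a nested unit.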

Of course I know that I could already achieve the handling that such
structuring allows with existing tags as well, but then it is only my private
"convention", not shareable with others (software products etc.).

  Bundes<span class="notion"/>regierung
  <span class="notion">federal government</span>

( and furthermore it looks horrible and costs too many characters to type
when writing source code manually )

Of course, notions should work for single-word non-compounds as well, in
which case the tag simply qualifies its content for the indexing process as
a single entry:

  <n>wordofevidance</n>

I am looking forward to your thoughts on this topic.

As it is of semantic relevance (e.g. a great help for an intelligent web
search as well...) I would like to see it implemented with XHTML 2.0.

With kind regards,
Benjamin Hartung
Dresden, Germany 
Received on Thursday, 4 August 2005 02:30:18 GMT