RE: QandA : A proposed enhancement to HTML from Ernest Cline on 2004-02-03 (www-html@w3.org from February 2004)

From: Ernest Cline <ernestcline@mindspring.com>
Date: Tue, 3 Feb 2004 18:59:47 -0500
To: "rogergarrett@thunderbirdtechnology.com" <rogergarrett@thunderbirdtechnology.com>, www-html@w3.org
Message-ID: <410-2200422323594746@mindspring.com>
----- Original Message ----- 
From: Roger Garrett

> ASSERTION :
>
> One can argue that there is a need for a distinct element for handling
> Q&A instead of using the <dl> element for that type of content.  I don't
> agree with that argument, but I can see its validity.  However, there
> are severe problems with your proposal.
>
> RESPONSE:
> The <dl> tag is a very generic tag used for a wide range of purposes.
> ...
>  A search engine looking through web pages would have no way of knowing
> which <dl> elements were intended as providing question-and-answer
> information and which were intended for their more conventional use.
> In addition, the <dl> tag does not provide the extra features that I
> believe are important to the QANDA concept.

I'll grant that a search engine has no way of distinguishing between a <dl>
used for a definition list and a <dl> used for a question and answer format,
but does it need to?

It is impossible for an author to determine all possible (or even probable)
ways that a question could be asked in a natural language format.  So
a search engine that attempts to make use of the <qanda> element that
you propose would face the exact same problem that it would if the <dl>
element were used instead.

Example:

<qanda>
 <question>What is direct current?</question>
 <question>What is DC?</question>
 <answer>Direct current is ...</answer>
</qanda>

<dl class="qanda">
 <dt class="question">What is direct current?</dt>
 <dt class="question">What is DC?</dt>
 <dd class="answer">Direct current is ...</dd>
</dl>

<dl>
 <dt>direct current</dt>
 <dt>DC</dt>
 <dd>Direct current is ...</dd>
</dl>

Is there any reason a search engine should favor one over the other?
Would a natural language search engine be any better able, with <qanda>
than with either <dl> version above, to answer a question such as
"What makes current DC?" that the author of the text did not anticipate?
As far as I can see, the answer to both questions is no.  As far
as a search engine is concerned, all that a simple <qanda> element
would be is an alternate version of a <dl>.  The extra metainfo attributes
that you suggest for <qanda> are the only possible benefit for a search
engine compared to <dl>, but they could just as easily be added to
existing elements if needed, instead of creating a new element
in XHTML to support them.

Now it is possible that an author might desire to differentiate out
question and answer <dl>'s from generic <dl>'s, but the class
attribute provides a way for him to do so already, without requiring
that a new element be added to XHTML.
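
As an illustration of that point, here is a minimal sketch of how an
indexer could pick out question-and-answer lists by class alone.  It uses
only Python's standard html.parser module.  The class name "qanda" is
just the convention from the example above, not anything XHTML defines,
and QandAExtractor and extract_qanda are hypothetical names.

```python
from html.parser import HTMLParser


class QandAExtractor(HTMLParser):
    """Collect question/answer groups from <dl class="qanda"> lists only."""

    def __init__(self):
        super().__init__()
        self.dl_stack = []   # one bool per open <dl>: is it a "qanda" list?
        self.current = None  # "dt" or "dd" while inside a qanda item
        self.groups = []     # [{"questions": [...], "answer": str or None}]

    def handle_starttag(self, tag, attrs):
        if tag == "dl":
            classes = (dict(attrs).get("class") or "").split()
            self.dl_stack.append("qanda" in classes)
        elif tag in ("dt", "dd") and self.dl_stack and self.dl_stack[-1]:
            self.current = tag
            # A <dt> that follows an answered group starts a new group.
            if not self.groups or (
                tag == "dt" and self.groups[-1]["answer"] is not None
            ):
                self.groups.append({"questions": [], "answer": None})
            if tag == "dt":
                self.groups[-1]["questions"].append("")
            elif self.groups[-1]["answer"] is None:
                self.groups[-1]["answer"] = ""

    def handle_endtag(self, tag):
        if tag == "dl" and self.dl_stack:
            self.dl_stack.pop()
        elif tag in ("dt", "dd"):
            self.current = None

    def handle_data(self, data):
        # Text is kept only while inside a <dt> or <dd> of a qanda list.
        if self.current == "dt":
            self.groups[-1]["questions"][-1] += data
        elif self.current == "dd":
            self.groups[-1]["answer"] += data


def extract_qanda(markup):
    """Return the question/answer groups found in the given markup."""
    parser = QandAExtractor()
    parser.feed(markup)
    parser.close()
    return parser.groups
```

Fed the second and third examples above, this would return the two
questions and the answer from the class="qanda" list and skip the plain
definition list, which is all the distinction a dedicated element
would buy.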

> ASSERTION:
> As you have proposed it, there is no way of applying other relevant
> tags such as <em> to portions of either the question or the answer.
>
> RESPONSE:
> First of all, the question part of the QANDA tag is not intended for
> rendering on a web page by a browser. Thus, there is no need for
> any additional tags within the question part. The question part is
> only intended to be read by a search engine as it spiders web pages
> looking for QANDA tags, so that it (the spidering search engine) can
> place the information into its database and later make matches
> between the question parts of the QANDA tags and the actual
> questions that are entered at the search engine web page by users
>  of the search engine.

To repeat myself from above:
"It is impossible for an author to determine all possible (or even probable)
ways that a question could be asked in a natural language format."
So if it isn't intended to be seen by the user, why bother to put the
"question" into natural language in the first place?

> Secondly, there is no reason why additional tags cannot be applied
> to the quoted text of the answer part...

If it's going to use markup, it should be an element, not an attribute.
Also, you can't have multiple attributes with the same attribute name
on the same element instance, so using elements instead of attributes
will be necessary if you want more than k versions of the question,
where k is the number of different question attributes available.
(In your proposal, k=2, question and alternatequestion.)
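
The duplicate-attribute restriction is easy to check with any XML
parser.  The sketch below uses Python's standard xml.etree.ElementTree.
The attribute-based markup is only a guess at what a repeated question
attribute would look like, not a quotation from the proposal, and
is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical attribute form: the same attribute name used twice.
with_attributes = (
    '<qanda question="What is direct current?" question="What is DC?" />'
)

# Element form: any number of <question> children is allowed.
with_elements = (
    "<qanda>"
    "<question>What is direct current?</question>"
    "<question>What is DC?</question>"
    "<answer>Direct current is ...</answer>"
    "</qanda>"
)

attributes_ok = is_well_formed(with_attributes)  # parser rejects the duplicate
elements_ok = is_well_formed(with_elements)
question_count = len(ET.fromstring(with_elements).findall("question"))
```

The attribute version is rejected outright as not well-formed, while the
element version parses and can carry as many questions as the author
cares to write.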

> ...
>
> ASSERTION:
> 
> As search engines long ago discovered, relying on authors to provide
> accurate and relevant keywords depends upon them not playing tricks
> to get their site referred to.  That is why most search engines pay scant
> attention to keywords.  As such, any argument that relies upon search
> engines processing a keyword argument is largely bogus.
>
> ...
>
> RESPONSE:
 [A fairly long response which I shall not repeat,  documenting several
  proposed ways that a search engine might use the <qanda> attributes
  once it had decided by other means to trust such arguments.]

The problem is, there is no reason why such schemes could not be done
now, if the search engine runners desired.  Furthermore, the arguments
you give could be used in support of adding ways of attaching metainfo or
alternate content to XHTML elements in general and not just your
proposed <qanda> element.

SUMMARY:
Roger has proposed three separate ideas, all combined
into a single proposal.
1) That there should be a <qanda> element distinct from
    the <dl> element.
2) That there should be a means of attaching metainfo to
    individual elements and not just the whole document.
3) That there should be a means of indicating alternate
   formats of the same content.

Despite Roger's proposal containing all three, I feel that there is no reason
that the three parts should not be considered independently instead,
as no reason has been shown why metainfo or alternate content should
be restricted to just "question and answer" data.

1> I disagree with Roger on the first point because in my opinion,
no advantage (in the context of XHTML) has been shown for making
that difference.

2> As for attaching metainfo to specific elements, the idea has merit,
but I am not happy about the proposed implementation.  Except for
specific pieces of metainfo, such as language or MIME type, that
directly affect how a document accesses or displays the data,
(X)HTML has traditionally done little with metainfo.  If generic metainfo
is to be added for any element and not just the document, it probably
should be done by allowing the <meta> and <link> elements
already in XHTML to be children of elements other than the <head>.
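
To make that suggestion concrete, here is a sketch of how a consumer
might read element-scoped metainfo if <meta> were allowed outside the
<head>.  This is purely hypothetical: current XHTML does not permit
<meta> as a child of <dl>, the sample markup is invented for
illustration, and scoped_metadata is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: a <meta> child scoped to one <dl>, not the document.
SAMPLE = """
<body>
  <dl>
    <meta name="keywords" content="direct current, DC" />
    <dt>What is direct current?</dt>
    <dd>Direct current is ...</dd>
  </dl>
  <p>Unrelated paragraph.</p>
</body>
"""


def scoped_metadata(root):
    """List (tag, {name: content}) for each element with <meta> children."""
    result = []
    for element in root.iter():
        metas = element.findall("meta")  # direct children only
        if metas:
            info = {m.get("name"): m.get("content") for m in metas}
            result.append((element.tag, info))
    return result


found = scoped_metadata(ET.fromstring(SAMPLE))
```

Nothing here is specific to questions and answers: the same lookup would
work for metainfo attached to a paragraph, a table, or any other element.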

3> As for the alternate content, the Embedding Attribute Collection and
the Object Module of the current XHTML2 working draft [1] provide a
method of providing alternate content that allows for the author
to indicate a preference for which format is to be used.  Also, when
content is accessed via HTTP, a choice of formats can be provided
via a single URL, with the user in control of which is preferred.
Although this facility is currently used mainly for providing alternate
image formats for a picture or audio formats for a sound, there
is nothing that prevents, for example, a video file, an audio file, and
a text file of a speech from sharing the same URI.
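
As a rough sketch of how that HTTP mechanism works, the Python below
picks one of several variants of a resource from the q-values in a
request's Accept header.  It is a simplified illustration, not a
complete implementation of HTTP content negotiation: it ignores media
type parameters other than q, and the function names and the list of
available types are assumptions made for the example.

```python
def parse_accept(header):
    """Return {media_range: q} from an Accept header (simplified)."""
    prefs = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1 when not given
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs[media_range] = q
    return prefs


def quality(media_type, prefs):
    """Most specific match wins: exact type, then type/*, then */*."""
    major = media_type.split("/")[0]
    for key in (media_type, major + "/*", "*/*"):
        if key in prefs:
            return prefs[key]
    return 0.0


def choose_variant(available, accept_header):
    """Pick the available media type the user agent rates highest."""
    prefs = parse_accept(accept_header)
    best = max(available, key=lambda t: quality(t, prefs))
    return best if quality(best, prefs) > 0 else None


# Three hypothetical variants of one speech behind a single URI.
variants = ["video/mpeg", "audio/mpeg", "text/plain"]
chosen = choose_variant(variants, "text/plain;q=0.5, audio/*;q=0.8, */*;q=0.1")
```

With that header the audio variant is chosen, since the user agent rates
any audio type above plain text and everything else lowest.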

Still, there is room for improvement in the handling of alternate content
in XHTML, since user determined preference is currently only possible
in the context of HTTP, while ideally it should not matter what protocol
is being used.  However, given the total lack of any information on how
either the author or the user is supposed to express a preference for
one format over another, I can't see any merit in the proposal as it
currently stands on this point either.

[1] http://www.w3.org/TR/2003/WD-xhtml2-20030506/
Received on Tuesday, 3 February 2004 18:59:45 UTC