Re: updated cite definition - please review from Jukka K. Korpela on 2013-08-28 (public-html@w3.org from August 2013)

From: Jukka K. Korpela <jukka.k.korpela@kolumbus.fi>
Date: Wed, 28 Aug 2013 23:43:13 +0300
To: HTMLWG WG <public-html@w3.org>
Message-ID: <521E60E1.2010201@kolumbus.fi>

2013-08-28 22:58, Charles McCathie Nevile wrote:
> On Wed, 28 Aug 2013 10:32:41 +0200, Jukka K. Korpela 
> <jukka.k.korpela@kolumbus.fi> wrote:

[...]
>> All that matters in the common existing practice is that <cite> is by 
>> default rendered in italic (when possible). Everything else is just 
>> idle and confusing “semantics” in the worst meaning of the word – 
>> unless someone can come up with an example (even a very theoretical 
>> thought experiment) of what could possibly be done with <cite> on the 
>> basis of the proposed semantic definition.
>
> There's quite a lot of software out there used to detect plagiarism. 

Yes, such software has been available for decades. How does it relate to 
<cite>?

> There's also a lot of translation and automated translation.

Yes, for a long time. How does it relate to <cite>? No matter which of 
the proposed definitions you expect to be applied, you cannot use the 
definition to determine how <cite> content should be translated.

> Knowing when something is attributed and being able to compare it 
> based on a search, even across languages, provides a pretty powerful 
> plagiarism detection tool with the ability to save many people a lot 
> of very boring mechanical work and focus on the real academic merits 
> of something - or to go home earlier, or whatever...

You cannot know that "something is attributed" by looking at <cite> 
elements. If <cite> is allowed to mean just a title of a work mentioned 
in text, then you cannot know whether attribution of any kind is 
involved. Even if you expect it to relate to attribution, you cannot 
know what that something might be. Far from offering a pretty 
powerful plagiarism detection tool, <cite> at most suggests that 
something may be attributed, or may not. How useful is that?
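
To make the point concrete, here is a minimal sketch of what a tool can 
actually get out of <cite>. The two sample fragments and the names 
CiteCollector and extract_cites are hypothetical, made up for this 
illustration; the parser is Python's standard html.parser. One fragment 
merely mentions a title, the other credits a source, yet the extracted 
data is identical in both cases.

```python
from html.parser import HTMLParser


class CiteCollector(HTMLParser):
    """Collect the text content of every <cite> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while we are inside a <cite> element
        self.cites = []   # text of each outermost <cite> found so far

    def handle_starttag(self, tag, attrs):
        if tag == "cite":
            self.depth += 1
            if self.depth == 1:
                self.cites.append("")

    def handle_endtag(self, tag):
        if tag == "cite" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.cites[-1] += data


def extract_cites(html):
    """Return the text of all <cite> elements in an HTML fragment."""
    parser = CiteCollector()
    parser.feed(html)
    parser.close()
    return parser.cites


# Assumed sample markup: a title merely mentioned in running text ...
mention = "<p>I read <cite>Hamlet</cite> on the train.</p>"
# ... and a title given as the source of a quotation.
source = "<p>To be, or not to be (<cite>Hamlet</cite>).</p>"

# Both fragments yield the same list; nothing tells the two uses apart.
print(extract_cites(mention) == extract_cites(source))
```

All that the markup yields is the string inside the element. Whether 
anything is attributed, and which part of the text the attribution would 
cover, has to be guessed from the surrounding prose.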

So I don't think this counts even as a hypothetical idea of how <cite> 
might be used. If we were talking about genuinely structured markup for 
citations, then we would have elements that relate a part of a document 
to a credit statement, probably with at least minimal internal 
structure. Such markup might have uses in various analyses, if it were 
widely used. But <cite> is a lost cause, both because it lacks any 
structural features and because it has been used for a variety of 
meanings. It really suffices to say that since it can mean either a 
reference to a source or just a title of a work mentioned, you cannot 
know whether it indicates the source of some content in the document.

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/

Received on Wednesday, 28 August 2013 20:43:35 UTC