Re: Citation semantics in HTML

David,


HTML is now maintained by the WHATWG [1]. However, ideas and proposals 
are incubated at W3C in the Web Platform Incubator Community Group [2].


Hope this helps

Léonie


[1] https://github.com/whatwg/html

[2] https://github.com/WICG/proposals/


On 03/07/2024 17:27, David Greenwood [od21dg] wrote:
> Dear Sir or Madam,
>
> Re: Citations in HTML
>
> I'm working on an HTML editor frontend for a distributed authoring and 
> versioning application to implement a read-write web.
>
> Forgive me for addressing this email so generally. I was unable to 
> find a chapter for the United Kingdom. I'm writing because I have an 
> idea for a future version of HTML that may help uphold democratic 
> values as well as improve academic and research publishing.
>
> Here's my idea:
>
> Supporting semantic citation data for the <cite> tag would have a 
> number of advantages. It would:
>
> 1) improve reliability of information on the world wide web
> 2) improve the accessibility of reliable information for academic and 
> research purposes
> 3) enable GPT-type AI implementations to properly support their 
> responses with citations (this is a real problem for people using GPT)
> 4) encourage a more thorough approach to publishing verifiable 
> information in a media culture so often concerned with impact over truth.
>
> Given that referencing standards already exist and are relied on 
> heavily in academia (e.g. Harvard, Chicago), would the inclusion of 
> citation data not be feasible in a future specification?
>
> An example implementation might be as follows:
>
> <p id="introduction"> In
>   <cite
>       authors="Berners-Lee, T. and Fischetti, M."
>       date="1999"
>       title="Weaving the Web : the past, present and future of the World Wide Web by its inventor"
>       publisher="London: Orion Business."
>     >
>       Weaving the Web : the past, present and future of the World Wide 
> Web by its inventor (Berners-Lee, T. and Fischetti, M., 1999)
>   </cite>,
> the world wide web was originally envisaged as a creative space for 
> collaborative writing, editing and dissemination of information, 
> marked up semantically and structurally to both format and categorize 
> the parts of a document/text according to meaning and the context of 
> the content.</p>
>
> Or, perhaps:
>
> <p id="abstract"> Our analysis of the dataset &quot;
>   <cite
>       authors="Tatman, R."
>       date="2017"
>       title="Every Pub in England"
>       location="online"
>       address="https://www.kaggle.com/datasets/rtatman/every-pub-in-england"
>       accessed="July 03, 2024"
>     >
>       Every Pub in England
>   </cite>
>   discovered at least 2 ancient establishments that may date to the
>   Dark Ages, but all important historical information about them has
>   been lost due to historians not making use of semantic web
>   technologies to allow machines to process, store and disseminate
>   information to humanity's benefit.
> </p>
>
> With the above citation attributes, it would be possible to do this:
>
> <footer>
>
> <p>See also:</p>
> <ol id="bibliography">
> </ol>
>
> <script>
>
>   const bibliography = document.querySelector( 'ol#bibliography');
>   const sources = document.querySelectorAll( 'cite');
>
>   sources.forEach( el => {
>     const li = document.createElement('li');
>     // The publisher value already ends with a full stop, so none is added here.
>     li.innerHTML = `${el.authors} (${el.date}) <em>${el.title}</em>. ${el.publisher}`;
>     bibliography.appendChild(li);
>   });
>
> </script>
>
> </footer>
>
> Hence, the information in the citations is readily usable for further 
> computational purposes and for user consumption.
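A hedged aside on the script above: in current browsers, attributes that HTML does not define (such as the proposed authors, date and publisher) are not exposed as element properties, so el.authors would be undefined today and the values would need to be read with getAttribute. The sketch below assumes exactly that. The helper names (escapeHtml, readCitation, formatEntry) are hypothetical and not part of any specification. It also escapes the values before building markup.

```javascript
// Hypothetical helpers; none of these names come from an HTML specification.

// Escape text so attribute values cannot inject markup into the list item.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Read the proposed attributes with getAttribute, which also works for
// attributes the browser does not know about. Missing ones become ''.
function readCitation(el) {
  const fields = ['authors', 'date', 'title', 'publisher'];
  return Object.fromEntries(
    fields.map(name => [name, el.getAttribute(name) ?? ''])
  );
}

// Build the inner HTML for one bibliography entry.
function formatEntry({ authors, date, title, publisher }) {
  let html = `${escapeHtml(authors)} (${escapeHtml(date)}) <em>${escapeHtml(title)}</em>.`;
  if (publisher) {
    // Avoid a doubled full stop when the value already ends with one.
    html += ` ${escapeHtml(publisher.replace(/\.$/, ''))}.`;
  }
  return html;
}
```

Inside the forEach loop of the original script, this could be used as li.innerHTML = formatEntry(readCitation(el)).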
>
> A corollary to this approach is the possibility of a new tag 
> <bibliography> that would be automatically populated with the citation 
> data in the page content.
>
> <bibliography for="introduction abstract" />
>
> A single "for" attribute would allow a space-separated list of content 
> elements by id that should be parsed for the population of the 
> bibliography, producing:
>
> - Berners-Lee, T. and Fischetti, M. (1999) Weaving the Web : the past, 
> present and future of the World Wide Web by its inventor. London: 
> Orion Business.
> - Tatman, R. (2017) Every Pub in England. Kaggle.com. [Online] 
> [Accessed on 3rd July 2024] 
> https://www.kaggle.com/datasets/rtatman/every-pub-in-england.
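One possible reading of the "for" attribute, sketched under assumptions: the ids are resolved in the order listed, unknown ids are skipped, and a source cited more than once appears only once. None of this behaviour is defined anywhere; parseIdList and collectCitations are hypothetical names, and treating authors, date and title together as the identity of a source is an illustrative choice.

```javascript
// Hypothetical sketch of resolving <bibliography for="...">.

// Split a space-separated id list, ignoring extra whitespace and duplicates.
function parseIdList(value) {
  return [...new Set(value.split(/\s+/).filter(Boolean))];
}

// Gather <cite> elements from each referenced element, in the order the
// ids are listed, dropping repeats of the same source.
// `root` only needs getElementById, so `document` works in a browser.
function collectCitations(forValue, root) {
  const seen = new Set();
  const result = [];
  for (const id of parseIdList(forValue)) {
    const container = root.getElementById(id);
    if (!container) continue; // unknown ids are skipped
    for (const cite of container.querySelectorAll('cite')) {
      // Assumption: authors + date + title identify a source.
      const key = ['authors', 'date', 'title']
        .map(name => cite.getAttribute(name) ?? '')
        .join('|');
      if (seen.has(key)) continue;
      seen.add(key);
      result.push(cite);
    }
  }
  return result;
}
```

In a page, collectCitations('introduction abstract', document) would return the cite elements to list.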
>
> There are a great many possibilities for making better use of citation 
> metadata. It would be possible to have numbered, superscript-style 
> linking such that inline references link to footnote bibliography 
> entries much as in Wikipedia.
>
> A great deal of the user's markup work can be automated. The parsing 
> of URL schemes and namespaces in the user's citation data would make 
> citation markup easier. For example, URN schemes such as urn:isbn 
> could be used for book citations, or DOI for papers. A JavaScript 
> Citations API would be able to recognise when the user is referencing 
> a particular type of source, making use of namespaces, and formatting 
> citations properly by way of a Citation Builder.
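As a rough illustration of recognising the type of source from its identifier, the sketch below classifies a string as a book (urn:isbn), a paper (DOI) or a web page (http or https URL). The function name classifySource and the type labels are hypothetical, and the DOI pattern is a simplified assumption rather than a full validation.

```javascript
// Hypothetical classifier; the returned type names are illustrative only.
function classifySource(identifier) {
  const value = identifier.trim();

  // urn:isbn:<digits and hyphens, possibly ending in X>
  const isbn = /^urn:isbn:([0-9Xx-]+)$/i.exec(value);
  if (isbn) {
    return { type: 'book', id: isbn[1].replace(/-/g, '').toUpperCase() };
  }

  // DOIs written as "doi:10.x/y" or as a doi.org link (simplified pattern).
  const doi = /^(?:doi:|https?:\/\/(?:dx\.)?doi\.org\/)(10\.\d{4,9}\/\S+)$/i.exec(value);
  if (doi) {
    return { type: 'paper', id: doi[1] };
  }

  // Anything else that parses as an http(s) URL is treated as a web page.
  try {
    const url = new URL(value);
    if (url.protocol === 'http:' || url.protocol === 'https:') {
      return { type: 'web', id: url.href };
    }
  } catch {
    // Not a URL; fall through to 'unknown'.
  }
  return { type: 'unknown', id: value };
}
```

A citation builder could then pick a formatting template based on the returned type.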
>
> The Citations API would support the indexing, summarisation and 
> management of citations, exposed as custom JavaScript classes with 
> properties such as title, date, last-accessed, authors, etc.
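To make the idea of citation classes a little more concrete, here is a minimal sketch. The message names the properties but not the shape of the API, so the class names Citation and CitationIndex and the methods add, byAuthor and countsByYear are assumptions made purely for illustration.

```javascript
// Hypothetical classes; only the property names come from the proposal.
class Citation {
  constructor({ title, date, authors = [], lastAccessed = null }) {
    this.title = title;
    this.date = date;
    this.authors = authors;
    this.lastAccessed = lastAccessed;
  }
}

class CitationIndex {
  #all = [];
  #byAuthor = new Map();

  // Store a citation and index it under each of its authors.
  add(citation) {
    this.#all.push(citation);
    for (const author of citation.authors) {
      const list = this.#byAuthor.get(author) ?? [];
      list.push(citation);
      this.#byAuthor.set(author, list);
    }
    return this;
  }

  // All citations that list the given author (empty array if none).
  byAuthor(author) {
    return this.#byAuthor.get(author) ?? [];
  }

  // A simple summary: how many citations there are per date value.
  countsByYear() {
    const counts = {};
    for (const { date } of this.#all) {
      counts[date] = (counts[date] ?? 0) + 1;
    }
    return counts;
  }
}
```

For example, new CitationIndex().add(new Citation({ title: 'Every Pub in England', date: '2017', authors: ['Tatman, R.'] })) yields an index that can be queried with byAuthor('Tatman, R.').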
>
> There's a world of opportunity here, and like the lost historic pubs 
> of yesterday, how much knowledge is being lost by our not leveraging 
> our very best technology? Moreover, how much is democracy being abused 
> and undermined by fake news and bogus, unverifiable information 
> because we can't build a web of trust with insufficient tracing of 
> sources?
>
> Kind Regards,
>
> David Philip Greenwood

-- 
Léonie Watson (She/Her)
Director
https://tetralogical.com

Received on Thursday, 4 July 2024 07:43:27 UTC