- From: Chris Lilley <Chris.Lilley@sophia.inria.fr>
- Date: Fri, 10 Jan 1997 19:58:48 +0100 (MET)
- To: bosak@atlantic-83.Eng.Sun.COM (Jon Bosak), www-style@www10.w3.org
On Jan 10, 9:21am, Jon Bosak wrote:

> a group of researchers at Sun has
> developed an entire speech synthesis language based on HTML,

This of course differs from the approach taken here, where the speech synthesis part resides in the stylesheet and the document is in HTML. It sounds like interesting work, though - is it publicly available? Can you provide a reference to it?

> This team was also forced to add
> nonstandard attributes to HTML in order to convey the complete set of
> information needed properly to drive speech synthesis.

Yes, we should be able to avoid this.

> the process they were forced to add CLASS attribute values to SPAN
> such as sentence, phrase, compound, and a number of others in order to
> distinguish language features in the text itself.

Interesting. The examples of text-to-speech synthesis that I have seen seem able to cope with normal sentence and phrase structure as described by the punctuation. Indeed, I have heard two systems which can even correctly say something like

    Dr Smith lives at 42 Oak Dr Southport, and likes fishing.

The problem seems to arise when punctuation is used for something other than punctuating.

> [Dave Raggett:]
>
> | I think there is a good case for markup to indicate how to speak
> | certain words or phrases, when this also serves to amplify the
> | semantics.

> This is consistent with the experience of the team referred to above.

Right. It should be a last fallback, however - it is easier and more generally useful to add semantic elements (person, date, time) and leave the phonetics for specific individual cases. I can see people adding IPA in an attribute when describing how to pronounce their name on their personal home page, for example, or in scholarly publications, online phrasebooks, and the like, but I don't see IPA in attributes being something folk will want to use a lot of in general.

> | It would be great to collect suggestions for enlarging the set
> | of phrase tags for future versions of HTML. Is there a core
> | set that will meet say 95% of people's needs?

Yes. Now we need to find out, as Dave says, which items of semantic markup will fit the 80:20 rule (20% of the possible markup covers 80% of common cases). 95% might be stretching the normal distribution a little too far ;-)

> If you continue to add more standardized tags every time you encounter
> a new problem domain you will destroy HTML as a simple markup
> language.

I don't think that person, place, date and time are going to make HTML significantly more complicated to hand-author, given the weight of tags that already exists for tables, font, etc.

> SGML went down this road a long time ago in the attempt to
> define universal tag sets

Yes, but typical documents using other applications of SGML are very rich semantically and are used within a much more restricted context than typical HTML documents. Even so, some DTDs have found widespread usage, for example DocBook.

> XML provides
> such a standardized extensible language definition mechanism for the
> Internet, which is why I suggested that it be looked into for this
> purpose.

Yes, the work on XML is exciting and hopefully (my personal view here) future versions of HTML will be defined in terms of XML and will thus be extensible. And hopefully by the time that happens some of the stylistic parts of the existing HTML specification will no longer be required.

For people who haven't heard of XML yet, see:

http://www.w3.org/pub/WWW/TR/WD-xml-961114.html
http://www.w3.org/pub/WWW/MarkUp/SGML/Activity

-- 
Chris Lilley, W3C                          [ http://www.w3.org/ ]
Graphics and Fonts Guy            The World Wide Web Consortium
http://www.w3.org/people/chris/              INRIA, Projet W3C
chris@w3.org                       2004 Rt des Lucioles / BP 93
+33 (0)4 93 65 79 87       06902 Sophia Antipolis Cedex, France
Received on Friday, 10 January 1997 13:59:28 UTC