- From: Joseph J Panjikaran <josephj@wipinfo.soft.net>
- Date: Tue, 16 Feb 1999 18:11:17 +0530
- To: <quint@w3.org>
- Cc: "Amaya mailing list" <www-amaya@w3.org>
- Message-ID: <01be59a9$a5fcb620$1d14a8c0@Anthurium.wipinfo.soft.net>
Hello.

Thank you for providing valuable insights into the design decisions.

I have been trying to modify the HTML.S file to get a cut-down set of tags
and attributes to be recognised by Amaya. The new HTML.S is now compiled to
HTML.STR and HTML.H using compiler.exe. I have also removed the respective
tags and attributes from HTML.A. I then use the new HTML.STR and HTML.H
when building Amaya.

There seem to be some problems. When parsing reaches this part of the code
in tree.c:

/*************************/
    case CsIdentity:
      /* structure is the same as that defined by another rule of the */
      /* same scheme */
      create = FALSE;
      pSRule2 = &pSS->SsRule[pSRule->SrIdentRule - 1];
      if (pSRule2->SrParamElem ||
          pSRule2->SrAssocElem ||
          pSRule2->SrConstruct == CsBasicElement ||
          pSRule2->SrNInclusions > 0 ||
          pSRule2->SrNExclusions > 0 ||
          pSRule2->SrConstruct == CsConstant ||
          pSRule2->SrConstruct == CsChoice ||
          pSRule2->SrConstruct == CsPairedElement ||
          pSRule2->SrConstruct == CsReference ||
          pSRule2->SrConstruct == CsNatureSchema)
        create = TRUE;
      t1 = NewSubtree (pSRule->SrIdentRule, pSS, pDoc, assocNum, Desc,
                       create, withAttr, withLabel);
      if (pEl == NULL)
        pEl = t1;
      else
        InsertFirstChild (pEl, t1);
      break;
/***************************************/

it goes into infinite recursion by calling the NewSubtree () function with
the typenum parameter = 0.

Any insights will be appreciated.

-----Original Message-----
From: Vincent Quint <quint@w3.org>
To: Joseph J Panjikaran <josephj@wipinfo.soft.net>
Cc: Amaya mailing list <www-amaya@w3.org>
Date: Monday, February 15, 1999 1:41 PM
Subject: Re: Grammar enforcement while parsing in Amaya

>Joseph J Panjikaran wrote:
>
>> I was viewing a FRAMESET based HTML page in Amaya1.4.
>> Inside the FRAMESET tag an H1 tag had crept in.
>>
>> The file was not written using Amaya and I am just using it to view it.
>> Unfortunately, the H1 tag is recognised inside FRAMESET.
>>
>> I checked the grammar specified in HTML.S file.
>> H1 is not defined inside FRAMESET.
>
>You could also have checked the SGML DTD for HTML 4.0:
>
>   http://www.w3.org/TR/REC-html40/sgml/loosedtd.html
>
>as both the DTD and the HTML.S file define the same structure.
>You are right: a FRAMESET element cannot contain an H1 element as a child.
>
>> I don't know about IE, but Netscape4.04 does not allow this.
>
>The only safe reference I know is the HTML 4.0 specification:
>
>   http://www.w3.org/TR/REC-html40/
>
>> The parser should enforce the grammar to some extent and should not
>> allow such blatant aberrations and reject the tag.
>
>In principle, you are right. But Amaya has to cope with existing Web
>pages, and, as you know, very few of them validate against the HTML DTD.
>When designing Amaya, we were faced with a difficult choice:
>(1) adopt a strict position and reject all invalid pages. Most users
>    would be very disappointed not to be able to see many pages that
>    other Web clients can display;
>(2) accept invalid pages and let Amaya fix the most common bugs.
>
>We took position (2) and decided that Amaya should try to fix bugs, but
>without losing information. If an element is not valid in a given context,
>Amaya tries to change the structure locally to make that element valid,
>but it doesn't delete the element or move it to a different place, which
>could change its meaning. That's why the H1 element is kept in the FRAMESET.
>
>Another important design decision that has been made for Amaya is that, even
>if it accepts invalid documents, the structure and markup that it produces
>is always valid. Obviously, only elements created or changed by Amaya itself
>are concerned here. Some invalid parts of the original document may remain
>when the document is saved.
>
>W3C has also developed a validator that allows you to check documents.
>Have a look at:
>
>   http://validator.w3.org/
>
>You could also use HTML tidy to fix erroneous documents:
>
>   http://www.w3.org/People/Raggett/tidy/
>
>> Another classical case is allowing INPUT tag without an enclosing FORM tag.
>> I checked for TABLE related tags, some degree of checking has been hard
>> coded to ContextOK() of html2thot.c file
>
>This is an example of the errors that Amaya tries to recover from. If the
>structure of a table is wrong, Amaya cannot edit it. For that reason, it
>applies some transformations that make the structure correct.
>
>> I don't think this is the right way of grammar enforcement, since the whole
>> purpose (as I see it)
>> of having a '.S' and '.STR' file is to make the code independent of the
>> grammar to a large extent.
>
>The issue is that a DTD or a .S file only specifies the structure of a
>document class, not its semantics. When you consider an invalid document,
>there are often several ways to transform its structure to make it valid,
>but each transformation may have a different impact on the document semantics.
>The DTD or .S file does not allow you to choose the right transformation.
>A specific piece of code is then needed.
>
>> What is the reason for a very liberal grammar enforcement?
>
>See above.
>
>> BTW I am an ardent fan of Amaya. I really enjoyed using Compiler application
>> given in amaya1.4
>> Good error messages are given when compiling "HTML.S". It helped a lot!!
>> Eagerly waiting for the goodies in Amaya1.5
>
>Thanks for your support.
>
>> regards
>> Joseph
>
>Vincent.
>
>-------------------------------------------------------
>Vincent Quint                       INRIA Rhone-Alpes
>W3C/INRIA                           ZIRST
>e-mail: Vincent.Quint@w3.org        655 avenue de l'Europe
>Tel.: +33 4 76 61 53 62             38330 Montbonnot St Martin
>Fax:  +33 4 76 61 52 07             France
>
>
>
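
A note on the recursion reported at the top of this message: NewSubtree () receiving typenum = 0 from the CsIdentity case means that SrIdentRule was 0 for the rule being expanded, so SsRule[SrIdentRule - 1] indexes outside the rule table. One way to catch this before recursing is to walk the chain of identity rules and stop when a reference is out of range or loops back on itself. The sketch below is only an illustration under assumptions: struct Rule, struct Schema, the CS_* constants and resolve_identity are hypothetical, simplified stand-ins, not the actual Thot/Amaya types or functions, and it does not say why the reference ended up as 0 in the modified schema.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the structure-schema types.
   They are NOT the real Thot declarations. */
enum Construct { CS_BASIC, CS_IDENTITY };

struct Rule {
    enum Construct construct;
    int ident_rule;            /* 1-based target rule; 0 = unresolved */
};

struct Schema {
    const struct Rule *rules;  /* rule number n is stored at rules[n - 1] */
    int n_rules;
};

/* Follows a chain of identity rules starting at rule number type_num.
   Returns the number of the first non-identity rule reached, or 0 when
   the chain points outside the rule table or never terminates. */
int resolve_identity(const struct Schema *ss, int type_num)
{
    int steps;

    assert(ss != NULL);
    /* A chain of distinct rules visits at most n_rules entries, so
       running past that bound means the chain is cyclic. */
    for (steps = 0; steps <= ss->n_rules; steps++) {
        const struct Rule *r;

        if (type_num < 1 || type_num > ss->n_rules)
            return 0;                  /* dangling reference, e.g. 0 */
        r = &ss->rules[type_num - 1];
        if (r->construct != CS_IDENTITY)
            return type_num;           /* reached a concrete rule */
        type_num = r->ident_rule;
    }
    return 0;                          /* identity rules form a cycle */
}
```

A caller in the position of the CsIdentity case could run such a check first and report a schema error when it returns 0, instead of passing the unresolved number on to the recursive call.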
Attachments
- application/octet-stream attachment: HTML.s
Received on Tuesday, 16 February 1999 07:36:03 UTC