- From: Gerald Oskoboiny <gerald@w3.org>
- Date: Fri, 24 Oct 1997 17:01:11 -0400 (EDT)
- To: www-html-editor@w3.org
- cc: Dan Connolly <connolly@w3.org>
Hi,

I've run into some problems trying to get the HTML 4.0 SGML stuff (DTDs and HTML4.decl) to work on Unix systems.

I'm using the most recent version of everything (SP v1.2.1, and the HTML 4.0 materials as taken from the CVS repository in html4-src as of about an hour ago), and it seems that the UCS-4 decl uses characters that are too big for modern Unix systems to understand.

I've tested it using SP 1.2.1 on Solaris 5.5.1 and on Red Hat Linux with kernel 2.0.30, and SP is compiled with -DSP_MULTI_BYTE.

When I run

    nsgmls -s -c sgml/HTML4.cat sgml/HTML4.decl ~/file.html

using the files produced in the "sgml" directory after a "make" in html4-src, I get:

    nsgmls:HTML4.decl:21:29:W: characters in the document character set with numbers exceeding 65535 not supported

I discussed this with Ian on IRC, and he noted that a "make check" checks the spec against its own DTDs, and that works fine. However, I see that "make check" does this:

    check: all
            @for i in $(MAINOBJS) $(APPENDIXES) $(REFS) $(INDEXES) ; \
            do echo checking $$i...; $(NSGMLS) -s -c sgml/HTML4.cat $$i; done; \
            echo checking done.

so it isn't using HTML4.decl; it must be using the SGML declaration that's compiled into nsgmls?

To use the HTML4.decl, I believe the command is:

    nsgmls -s -c sgml/HTML4.cat sgml/HTML4.decl ~/file.html

and that produces the error I quoted above.

The line it's complaining about in HTML4.decl is:

    160 1113952 160

(it's complaining about the 1113952 being too large.)

I understand that you recently changed this number from what it was previously (2147483486), to get around the NAMELEN problem (which requires these numbers to be 8 characters or less). But it seems that this new number, 1113952, is still too large on all the Unix systems I've tried it on.

My somewhat uninformed diagnosis of this is: modern Unix systems are not capable of handling UCS-4; they can only do UCS-2.

So: any ideas?
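To make the arithmetic behind the warning concrete, here is a rough Python sketch. It assumes the three numbers on the line are the starting character number, the number of characters, and the base-set character number, and that the only limit that matters is the 65535 quoted in the warning. The names `describe_range` and `SP_MAX_CHAR` are made up for illustration and are not part of SP.

```python
# Illustrative only: checks a range line such as "160 1113952 160"
# against the limit quoted in the nsgmls warning.
SP_MAX_CHAR = 65535  # assumption: taken from the warning text, not from SP's source


def describe_range(line):
    """Return the start, count, last character number, and whether it fits."""
    start, count, _base = (int(field) for field in line.split())
    last = start + count - 1  # highest character number the range describes
    return {
        "start": start,
        "count": count,
        "last": last,
        "fits": last <= SP_MAX_CHAR,
    }


if __name__ == "__main__":
    for line in ("160 1113952 160", "160 65376 160"):
        info = describe_range(line)
        print(line, "-> last character", hex(info["last"]), "fits:", info["fits"])
```

Under these assumptions, the current line describes characters up to 0x10FFFF, well past the limit, while the largest count starting from 160 that stays within 65535 would be 65376.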
I need to get this to work for the HTML validation service; currently I'm using the HTML4.decl that was shipped with the 970708 snapshot of the HTML 4.0 materials:

    http://www.w3.org/TR/WD-html40-970708/sgml/HTML4.decl

but apparently that's UCS-2, not UCS-4.

Maybe you need to ship two decls with the HTML 4.0 materials, one which is UCS-4 and one which is UCS-2 for systems that aren't capable of UCS-4?

Or maybe it's not necessary to use HTML4.decl at all? Is it only for use on systems which can support it, and optional on others? (How important is it that this HTML4.decl be used and not some other one?)

If it *is* important, I believe something needs to change in the HTML4.decl for it to be useful to people using nsgmls on Unix systems.

I'm in way over my head here, but I've tested this pretty thoroughly. If you like, I can bring this up on comp.text.sgml to see if any of the SGML gurus there can help.

Thanks,

Gerald

--
Gerald Oskoboiny <gerald@w3.org>  +1 617 253 2920
System Administrator, W3C  http://www.w3.org/People/Gerald/
World Wide Web Consortium, MIT Laboratory for Computer Science
545 Technology Square, Room NE43-353
Cambridge MA 02139 USA
Received on Friday, 24 October 1997 17:01:59 UTC