Re: [OpenJade-devel] Re: Any C++ programmers around? (was: Unix --> NT (source code stuff))

From: Adam Di Carlo <adam@onshore.com>
Date: Sat, 17 Mar 2001 05:28:40 -0500 (EST)
To: openjade-devel@lists.sourceforge.net
Cc: Nick Kew <nick@webthing.com>, "Michael D. Crawford" <crawford@goingware.com>, W3C Validator <www-validator@w3.org>
Message-ID: <oa66h8ofjn.fsf@arroz.fake>

Terje Bless <link@tss.no> writes:

> [ Any objections? Adam? Brandon? Anyone?                            ]

Nope, not at all!

> >But again, I've had to do that to run it from the Web, so a commandline
> >variant should be equally straightforward (though a hack).
> 
> Well, the point is that (Open)SP expects some things to be specified in
> environment variables instead of as switches on the command line. This is a
> nice feature addition for the command line -- as you won't have to specify
> the switches every time -- and kinda works in CGI land -- because the
> environment goes away with each invocation -- but it's a sordid mess when
> you move to mod_perl or other persistent interpreters where the lifetime of
> the environment (parent process) spans several invocations of SP.
> 
> We're currently futzing around with SGML_CATALOG_FILES, SGML_SEARCH_PATH,

Well, if you build with the right settings, you don't need to do this.

> SP_CHARSET_FIXED, and SP_ENCODING. In particular, SP_CHARSET_FIXED and
> SP_ENCODING are "magical" in that they are necessary to enable XML mode.

Hmm.  I find that XML mode works pretty well without those.  But maybe
I haven't looked deeply enough.  Can someone enlighten me?

Anyhow, in principle (aside from SGML_CATALOG_FILES and
SGML_SEARCH_PATH), I agree there should be cmd-line equivalents.
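
Until such switches exist, one way around the persistent-interpreter problem described above is to never touch the parent's environment and instead hand each parser run its own copy. A minimal sketch, in Python for brevity (the same idea applies under mod_perl). The helper names `build_sp_env` and `run_parser` are hypothetical, the values "YES" and "XML" are my assumption of what enables XML mode, and the default command `onsgmls -s` is only an illustration:

```python
import os
import subprocess

# Variables named in this thread; SP reads them from its environment.
XML_MODE_VARS = ("SP_CHARSET_FIXED", "SP_ENCODING")


def build_sp_env(base_env, catalog_files, search_path=(), xml_mode=False):
    """Return a fresh environment dict for one parser run.

    base_env is copied, never modified, so nothing leaks from one
    request to the next inside a long-lived parent process.
    """
    env = dict(base_env)
    env["SGML_CATALOG_FILES"] = os.pathsep.join(catalog_files)
    if search_path:
        env["SGML_SEARCH_PATH"] = os.pathsep.join(search_path)
    else:
        env.pop("SGML_SEARCH_PATH", None)
    if xml_mode:
        # Assumed values, not taken from this thread.
        env["SP_CHARSET_FIXED"] = "YES"
        env["SP_ENCODING"] = "XML"
    else:
        # Drop anything stale inherited from the parent.
        for name in XML_MODE_VARS:
            env.pop(name, None)
    return env


def run_parser(path, env, command=("onsgmls", "-s")):
    """Run the parser on one file with the given per-call environment."""
    return subprocess.run(
        [*command, path], env=env, capture_output=True, text=True
    )
```

A request handler would call `run_parser(path, build_sp_env(os.environ, ["xhtml.cat"], xml_mode=True))`, where `xhtml.cat` is a made-up catalog name, and `os.environ` itself stays untouched between requests.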

> >>* Ability to say: "use this SGML Declaration and this DTD".
> >>  -- SGML Open Catalogs are fine and dandy and all, but for some things it
> >>     would be much less painful to say "use this" on the command line.
> >
> >I'd like to, but that'll be a longer-term thing.  I'd like it still better
> >if someone with a much deeper knowledge of SGML than mine looked at it.
> 
> Et tu, Brute? Aren't there any real SGML gurus around that could help my
> poor tortured brain -- and Nick's, apparently :-) -- tackle SGML issues? I
> barely understand half of what the SP man pages are trying to tell me
> because they speak in SGML-alese (i.e. in tongues for the good it does me)
> and XML is "double the fun" (that was your cue Sean! ;D).

Oh, I pointed out in my last email that DTDDECL in your SGML Open
catalogs will fix this. We ship a lot of DTDs in Debian this way (but
I had to hack Jade for Debian so Jade wouldn't bitch about the
directive, which Jade doesn't support).

Perhaps I should give an example:

PUBLIC  "-//W3C//DTD XHTML 1.0 Strict//EN"        "xhtml1-strict.dtd"
DTDDECL "-//W3C//DTD XHTML 1.0 Strict//EN"        "xhtml1.dcl"
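
To make the pairing concrete: the two entries share a public identifier, so a tool can look up both the DTD and its SGML declaration from that one key. A rough sketch in Python, assuming one entry per line with exactly a keyword and two literals, and ignoring everything else the catalog format allows. `parse_catalog` is a hypothetical helper, not part of SP:

```python
import shlex


def parse_catalog(text):
    """Map each public identifier to its catalog entries by keyword.

    Only handles simple 'KEYWORD "public id" "system id"' lines;
    any line that does not split into three tokens is skipped.
    """
    entries = {}
    for line in text.splitlines():
        tokens = shlex.split(line)
        if len(tokens) != 3:
            continue
        keyword, public_id, system_id = tokens
        entries.setdefault(public_id, {})[keyword.upper()] = system_id
    return entries


CATALOG = '''
PUBLIC  "-//W3C//DTD XHTML 1.0 Strict//EN"        "xhtml1-strict.dtd"
DTDDECL "-//W3C//DTD XHTML 1.0 Strict//EN"        "xhtml1.dcl"
'''

xhtml = parse_catalog(CATALOG)["-//W3C//DTD XHTML 1.0 Strict//EN"]
```

Here `xhtml["PUBLIC"]` gives the DTD file and `xhtml["DTDDECL"]` gives the declaration to use with it, which is the "use this SGML Declaration and this DTD" pairing asked for above.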


> >> * Blue Sky: A Perl (XS) Module Interface

> >I don't see myself getting involved with that.  If I'm hacking SP,
> >that's because I don't want to wrap it in something else - like Perl.
[...]
> In consolation, it should be a pretty easy task if you A) know C++ and B) a
> little about Perl XS modules. Perl comes with tools to automate parts of
> the job and man pages that describe the process fairly well (at least, it
> looked pretty good to my untrained eyes even if I can't do it myself ;D).
> To all appearances it's mainly a mechanical job of doing data type
> conversions from/to Perl/C++ and similar things.

Yes, I agree it would be pretty easy and quite nice.  Even just
getting the entity mgmt function from libsp (or libosp) would be
pretty sweet.

> >> * Blue Sky: Configurable Error Format
> >>   -- The error messages are an exception for most SGML Processors, but
> >>      for the Validator they are the norm. Being able to play tricks with
> >>      the format and fields of the error output would be useful.
> >>      Reporting context a bonus!
> >
> >I've looked at that a little, and I've implemented a compile-time option
> >to switch between JJC's format and HTML-ised format for Code Valet.
> >I'll be doing some more work in this field in due course.
> 
> Well, the reason I'm so gung ho on switching to OpenSP is that it has a
> switch "-n" that outputs message numbers ("relevant clauses") with error
> messages: "onsgmls:OSPF<0>:1:1:1:E: DOCTYPE Missing" meaning it detected
> that there was no DOCTYPE on line one, character one, and this violates
> clause number one (of some ISO standard presumably). Since we're wrapping
> SP in a Perl CGI app, it's much easier to parse out the error _number_ (or
> some other semi-unique identifier) than the free-form text message.

Doesn't that solve this wishlist item, then?
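
For what it's worth, pulling the number out of such a line is short work for a wrapper. A sketch in Python, assuming the field order is exactly what the sample line above shows (program, source, line, column, message number, severity, text). I have not checked that against the OpenSP docs, and `SpMessage` and `parse_message` are names I made up:

```python
import re
from typing import NamedTuple, Optional


class SpMessage(NamedTuple):
    program: str
    source: str
    line: int
    column: int
    number: int
    severity: str
    text: str


# Field order assumed from the sample line quoted in this thread.
# The source field is matched lazily so it may itself contain colons.
_PATTERN = re.compile(
    r"^(?P<program>[^:]+):(?P<source>.+?):(?P<line>\d+):(?P<column>\d+)"
    r":(?P<number>\d+):(?P<severity>[A-Z]):\s*(?P<text>.*)$"
)


def parse_message(raw: str) -> Optional[SpMessage]:
    """Split one message line into fields, or return None if it does not fit."""
    m = _PATTERN.match(raw.strip())
    if m is None:
        return None
    return SpMessage(
        m["program"],
        m["source"],
        int(m["line"]),
        int(m["column"]),
        int(m["number"]),
        m["severity"],
        m["text"],
    )


msg = parse_message("onsgmls:OSPF<0>:1:1:1:E: DOCTYPE Missing")
```

The wrapper can then key its own explanations off `msg.number` and ignore `msg.text`. Lines without a message number do not match this pattern and come back as None.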

> Other useful things to have in an easily parseable format are the
> containing element (last opened element) and warnings of the form
> "expected foo, but got bar, assuming baz" so we have a way to report when
> someone forgot to close their TD or puts weird stuff in the HEAD section
> that will implicitly close HEAD and open BODY.

I use 'onsgmls -gues file' and that tells me containing element.

> BTW, since I'm yelling about SGML gurus and C++... Did anyone ever have any
> ideas about why some errors get reported only once with a HTML 4.01 Strict
> DTD, but multiple times with the HTML 4.01 Transitional DTD? Either this is
> an intentional difference in the two DTDs -- one that I can't find or
> understand the point of (I didn't even know this was possible to express in
> a DTD!) -- or a bug in all SP-based parsers. In particular, a bogus
> attribute on the IMG element gets reported only once with strict.dtd, but
> at every occurrence in loose.dtd, using lq-nsgmls, JJC/SP nsgmls, and
> OpenJade's OpenSP.

I think I saw that email come in but I haven't had time to look yet.

FYI, did anyone but me notice that XHTML 1.0 is not clean according to
onsgmls -Wall?  'xhtml1-strict.dtd' contains bogus stuff like an
entity 'FrameTarget' which is defined but never used.

I emailed the address given for comments, but have gotten no feedback.
I wish the spec committees would at least try to validate their DTDs
fully before publishing the spec...


-- 
.....Adam Di Carlo....adam@onShore.com.....<URL:http://www.onShore.com/>
Received on Monday, 19 March 2001 01:46:14 GMT
