- From: Sam Ruby <rubys@intertwingly.net>
- Date: Thu, 25 May 2006 07:39:00 -0400
- To: www-validator@w3.org
- CC: Neil Smith <Neil_Smith@hargreaveslansdown.co.uk>
David Dorward wrote:
> On Thu, May 25, 2006 at 11:15:27AM +0100, Neil Smith wrote:
>
>>When submitting a document in Atom format to the feed validator service
>>http://validator.w3.org/feed/check.cgi
>>
>>Inclusion of an &amp; entity followed by a single character in the
>>range a-zA-Z only, before the closing <title /> element tag causes
>>the feed validator to report "EOF in middle of entity":
>
> I'm not an expert on ATOM, but I believe this is what is happening:
>
> Your title element has a type attribute that specifies it contains
> HTML and so the text must have special characters represented by
> character references.
>
> This HTML is being represented in XML, so any special characters in
> the HTML source must also be represented as character entities.
>
> Thus: foo&bar in text becomes
>       foo&amp;bar in HTML and
>       foo&amp;amp;bar in XML encoded HTML
>
> You've only encoded the ampersand once, so are getting a warning.

Exactly.

>>Use of more than one alpha character after the &amp; entity does not
>>cause this error in the validator. It should of course be
>>reasonable to end a title element in for example E&O, or in our
>>case the abbreviation for a company, i.e. A&L
>
> I'm now entering the realm of guesswork, but I suspect that you can't
> have named entities with only one letter, so the parser knows that &O;
> isn't a real entity, but that &Ox; could be.

It seems that the parser doesn't like unclosed entities at the end of
the string. If you have access to Python, you can experiment with the
following code:

---
text="Viridian results higher on Irish businessE&O"

from HTMLParser import HTMLParser, HTMLParseError
from xml.sax.saxutils import unescape

try:
  # Undo the XML layer of escaping, then parse the result as HTML
  parser=HTMLParser()
  parser.feed(unescape(text))
  parser.close()
  print 'ok'
except HTMLParseError, error:
  print error
---

> (I read the mailing list, please address responses there and do not CC
> me.)

OK ;-)

- Sam Ruby
Received on Thursday, 25 May 2006 11:39:43 UTC
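
The two layers of escaping discussed in the message above can be illustrated with a minimal sketch. It is written for Python 3 and uses only `escape` and `unescape` from `xml.sax.saxutils`. The helper name `encode_html_title` is hypothetical and not part of the feed validator; the sketch only shows the string transformations, not how the validator itself works.

```python
from xml.sax.saxutils import escape, unescape


def encode_html_title(text):
    """Hypothetical helper: escape once for HTML, then again for XML."""
    html_form = escape(text)      # plain text -> HTML source
    xml_form = escape(html_form)  # HTML source -> XML-encoded HTML
    return html_form, xml_form


html_form, xml_form = encode_html_title("foo&bar")
print(html_form)  # foo&amp;bar
print(xml_form)   # foo&amp;amp;bar

# Removing the XML layer of a doubly encoded title leaves valid HTML.
print(unescape(xml_form))  # foo&amp;bar

# Removing the XML layer of a singly encoded title leaves a bare
# ampersand, which an HTML parser may read as the start of an entity.
print(unescape("foo&amp;bar"))  # foo&bar
```

Under this reading, a title meant to display as A&L would be written as `A&amp;amp;L` inside a title element whose type is HTML.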