RE: Charsets, encodings, http-request, unescape-markup, and convenience, oh my!

>   In my humble opinion, I think those problems wouldn't happen if HTML
> content was parsed as a document node directly by the http-request
> step.  The step can access the HTTP response context (including the
> charset if any) and parse the HTML content directly into a document
> node, e.g. following the same rules as in escape-markup.  Or did I
> miss something?

I guess we could give HTML special significance in p:http-request (similar to application/xml) and make the step behave like p:unescape-markup for HTML responses... But my personal feeling is that the less magic that happens in p:http-request, the better. I think p:http-request should really only give you the 'raw' data that came with the response. If you want to treat the response data as HTML, you can apply p:unescape-markup to it. But if you want to treat the (HTML) response data as a sequence of bytes, you should still be able to do that.
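
To make the two-step approach concrete, here is a rough, untested sketch of an XProc 1.0 pipeline. The URL is only a placeholder, and it assumes the processor's p:unescape-markup supports content-type="text/html", which is implementation-dependent (only application/xml is guaranteed):

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                version="1.0">
  <p:output port="result"/>

  <!-- Step 1: fetch the resource. For a text/html response the step
       returns a c:body element holding the content as escaped text;
       nothing is parsed as HTML here. -->
  <p:http-request>
    <p:input port="source">
      <p:inline>
        <!-- Placeholder URL, for illustration only -->
        <c:request method="GET" href="http://example.org/page.html"/>
      </p:inline>
    </p:input>
  </p:http-request>

  <!-- Step 2 (optional): only if you want markup, parse the text
       content of c:body as HTML. Assumes text/html is supported. -->
  <p:unescape-markup content-type="text/html"/>
</p:declare-step>
```

If you leave out the p:unescape-markup step, the pipeline simply hands you the c:body as it came back from p:http-request, which is the point: the parsing is something you opt into, not something the request step decides for you.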

Regards,
Vojtech

--
Vojtech Toman
Consultant Software Engineer
EMC | Information Intelligence Group
vojtech.toman@emc.com
http://developer.emc.com/xmltech

Received on Monday, 10 October 2011 11:36:28 UTC