Re: How should I query ActivityStreams objects containing both JSON and HTML?

On 1 Apr 2023, at 1:24, Aaron Gray wrote:

> On Sat, 1 Apr 2023 at 00:12, Ryan B <w3c@ryanb.org> wrote:
>
>> Individual situations vary, but in general, real-world HTML markup tends
>> to be much more brittle and change more frequently than JSON data. I've
>> spent a lot of time on tools that both scrape HTML in the wild and handle
>> medium to large JSON objects, and I've had better luck keeping the two
>> fairly separate. I'd suggest a pre-processing pass to scrape the HTML into
>> a consistent JSON schema, and only then handle that and your native JSON
>> together, however you like.
>>
>
> Yes, extracting the HTML and processing it separately is probably the
> easiest option :)
>

What use case do you have in mind?
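
For concreteness, here is a rough sketch of the kind of pre-processing
pass Ryan describes, assuming the HTML lives in the object's `content`
property and that plain text plus link targets are all you need from it.
The names `ContentExtractor`, `normalize`, `content_text` and
`content_links` are made up for illustration and are not part of
ActivityStreams or any library; a real schema would depend on your use
case.

```python
import json
from html.parser import HTMLParser


class ContentExtractor(HTMLParser):
    """Collect visible text and link targets from an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Record the href of every anchor that has one.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Keep non-empty text runs, trimmed of surrounding whitespace.
        if data.strip():
            self.text_parts.append(data.strip())


def normalize(obj):
    """Return a copy of obj with the HTML `content` replaced by plain JSON fields.

    `content_text` and `content_links` are hypothetical field names.
    """
    parser = ContentExtractor()
    parser.feed(obj.get("content") or "")
    parser.close()
    result = {k: v for k, v in obj.items() if k != "content"}
    result["content_text"] = " ".join(parser.text_parts)
    result["content_links"] = parser.links
    return result


# Example object; the id and URLs are placeholders.
note = {
    "type": "Note",
    "id": "https://example.com/notes/1",
    "content": '<p>Hello <a href="https://example.org/">world</a>!</p>',
}
normalized = normalize(note)
print(json.dumps(normalized, indent=2))
```

After this pass everything is ordinary JSON, so whatever query tool you
pick only has to deal with one format.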

Marcus

Received on Friday, 31 March 2023 23:48:58 UTC