Re: Whither the JSON-LD context?

Wes Turner
On Jan 7, 2014 6:53 AM, "Dan Brickley" <> wrote:
> On 7 January 2014 10:16, Markus Lanthaler <>
> >> W3C's experience with XML parsers that auto-fetch
> >> and
> >> when parsing XML is relevant here:
> > [...]
> >>
> >> If JSON is the new XML and JSON-LD is the emerging best practice for
> >> interoperable JSON, it isn't unreasonable to expect XML-levels of
> >> usage. So let's try to learn from the W3C XML DTD experience.
> >
> > I think there's a very important difference to that experience. XML
> > namespaces are not links and are thus not *expected* to be dereferenced.
> > Thus, AFAICT, for a long time those URLs returned non-cacheable HTTP
> > error responses. If you know that a document is going to be requested
> > often, you can plan for it (CDN, long cache validity, etc.). I know it's
> > important to keep these things in mind, but I'm still not convinced that
> > serving a small static file (even if it is requested millions of times)
> > costs much. Otherwise, all the free JavaScript library CDNs would have
> > been shut down a long time ago.

Last time I tried to run a free CDN, it wasn't inexpensive.

Can we create a validated whitelist of schema URIs, or should we rely upon
sensible server-side caching (ETag, Cache-Control)?
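
For the caching route, the conditional-GET mechanics are simple enough to
sketch. A minimal, framework-free illustration (the context body, ETag
scheme, and one-day lifetime are invented, not anyone's actual deployment):

```python
import hashlib

# Hypothetical static JSON-LD context body.
context_body = b'{"@context": {"name": "http://schema.org/name"}}'

def make_etag(body):
    """Derive a strong ETag from the body; any stable hash will do."""
    return '"%s"' % hashlib.sha256(body).hexdigest()[:16]

def respond(if_none_match=None):
    """Return (status, headers, body) for a (conditional) GET."""
    etag = make_etag(context_body)
    headers = {
        "ETag": etag,
        "Cache-Control": "public, max-age=86400",  # revalidate after a day
        "Content-Type": "application/ld+json",
    }
    if if_none_match == etag:
        # Client's cached copy is still valid: empty 304, no body transfer.
        return 304, headers, b""
    return 200, headers, context_body

status, _, body = respond()                        # first request: full body
status2, _, body2 = respond(make_etag(context_body))  # revalidation: 304
```

With headers like these, well-behaved clients pay for the body once per
max-age window; every revalidation is a near-free 304.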

The Norvig XKCD solution may be helpful here.

> The main lesson from
> is DTD-related rather than schema-related. Schema-fetching is
> generally seen as more optional.

So, <link> elements with full URL/URIs, or @vocab and/or prefixes that a
'normal' client won't prefetch or needlessly dereference?

> The important difference is (I haven't found the exact spec reference
> but...) the XML 1.0 spec says that when parsing XML with validation
> enabled, external references to DTDs must be de-referenced.
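
For comparison, this is how one can explicitly switch off external-entity
fetching in Python's stock SAX parser; the DTD URL below is a placeholder
and is never actually requested:

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges

class TextCollector(ContentHandler):
    """Accumulate character data from the parsed document."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)

doc = b"""<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "http://example.org/never-fetched.dtd">
<note>hello</note>"""

parser = xml.sax.make_parser()
# Do not resolve external general entities (no network fetch of the DTD).
parser.setFeature(feature_external_ges, False)
handler = TextCollector()
parser.setContentHandler(handler)
parser.parse(io.BytesIO(doc))
text = "".join(handler.chunks)
```

The point stands either way: whether a parser dereferences is a library
default, invisible to most application authors.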

A separate thread is probably more appropriate for a question like "which
libraries / frameworks / toolkits / user-agents are making needless
requests for server resources?"

> This is something that anyone learning-through-doing XML handling
> might not even think about, if they're using a spec-compliant XML
> parser. Many users of such libraries have no idea that their
> application code is hammering with repeated HTTP requests.
> Let's think about how we can help novice JSON-LD toolkit users avoid
> finding themselves in the same position.

> Perhaps the default behaviour of a
> JSON-LD toolkit / parser could keep a global fetches-per-minute count,
> and complain to STDERR if the application is over-fetching? (alongside
> sensible caching defaults etc)
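
That suggestion is easy to sketch; the threshold, window, and warning text
below are invented, not any toolkit's actual behaviour:

```python
import sys
import time
from collections import deque

class FetchBudget:
    """Track context fetches and warn on STDERR when a per-minute
    budget is exceeded (a sketch of the idea above)."""

    def __init__(self, max_per_minute=10):
        self.max_per_minute = max_per_minute
        self.timestamps = deque()
        self.warned = False

    def record_fetch(self, url, now=None):
        now = time.monotonic() if now is None else now
        self.timestamps.append(now)
        # Drop entries older than the 60-second window.
        while self.timestamps and now - self.timestamps[0] > 60:
            self.timestamps.popleft()
        if len(self.timestamps) > self.max_per_minute and not self.warned:
            print("warning: %d context fetches in the last minute; "
                  "consider caching %s" % (len(self.timestamps), url),
                  file=sys.stderr)
            self.warned = True
        return len(self.timestamps)

budget = FetchBudget(max_per_minute=3)
for i in range(5):
    count = budget.record_fetch("http://example.org/context.jsonld",
                                now=float(i))
```

A global instance inside a toolkit's document loader would surface
over-fetching to the developer without breaking anything.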

IIRC the reddit PRAW API enforces client-side rate-limiting on top of the
requests library, but requests_cache seems not to work with it.
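
Even without requests_cache, a toolkit can memoize contexts in-process.
This sketch uses an invented loader signature, not PRAW's or any JSON-LD
library's actual API:

```python
# In-process cache for a hypothetical JSON-LD document loader.
_cache = {}

def cached_loader(url, fetch):
    """fetch is the real network call; results are memoized per URL,
    so repeated expansions of the same context hit the network once."""
    if url not in _cache:
        _cache[url] = fetch(url)
    return _cache[url]

calls = []
def fake_fetch(url):
    """Stand-in for an HTTP GET; records each invocation."""
    calls.append(url)
    return '{"@context": {}}'

doc1 = cached_loader("http://example.org/ctx.jsonld", fake_fetch)
doc2 = cached_loader("http://example.org/ctx.jsonld", fake_fetch)
```

HTTP-level caching (ETags, max-age) is still the right long-term answer,
but a one-line memo like this already removes the repeated-request problem
within a single process.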

Received on Tuesday, 7 January 2014 16:17:52 UTC