- From: Simon Spero <sesuncedu@gmail.com>
- Date: Tue, 7 Oct 2014 14:16:53 -0400
- To: Phillip Lord <phillip.lord@newcastle.ac.uk>
- Cc: semantic-web@w3.org, Linked Data community <public-lod@w3.org>, Luca Matteis <lmatteis@gmail.com>, Alexander Garcia Castro <alexgarciac@gmail.com>, Norman Gray <norman@astro.gla.ac.uk>
- Message-ID: <CADE8KM78xzogmdsqX4WZ8AWcjEjhYzBOBnaT-0=FHS=tejAp7Q@mail.gmail.com>
BLUF: This is where information science comes in. Technology must meet the needs of real users. It may be better to generate better Tagged PDFs, and to experiment, using some existing methodology annotation ontologies, with generating auxiliary files of triples. This might require new/changed LaTeX packages, new div/span classes, etc. \huge But what is really needed is actually working with SMEs to discover the cultural practices within the field and subfield, and developing systems that support their work styles. This is why Information Science is important.

If there are changes in practices that would be beneficial, and these benefits can be demonstrated to the appropriate audiences, then these can be suggested. If existing programs, libraries, and operating systems can be modified to provide these wins transparently, then it is easier to get the changes adopted.

If the benefits require additional work, then that additional work must give proportionate benefits to those doing the work, or be both of great benefit to funding agencies or other gatekeepers *and* easily verifiable. An example might be a proof (or justified belief) that a paper and its supplemental materials do, or do not, contain everything required to attempt to replicate the results. This might be feasible in many fields through a combination of annotation with a sufficiently powerful KR language and reasoning system.

Similarly, relatively simple meta-statistical analysis can note common errors (like multiple comparisons that do not correct for the False Discovery Rate). This can be easy if the analysis code is embedded in the paper (e.g. Sweave), or if the adjustment method is part of the annotation, and the decision process need not be total. This kind of validation can be useful to researchers (less embarrassment) and useful to gatekeepers (less to review manually).

Convincing communities working with large datasets to use RDF as a native data format is unlikely to work. The primary problem is that it isn't a very good one. It's great for combining data from multiple sources, as long as every datum is true. If you want to be less credulous, KMAC YOYO. Convincing people to add metadata describing values in structures as owl/rdfs datatypes or classes is much easier; for example, as HDF5 attributes (a rough sketch appears at the end of this message).

If the benefits require major changes to the cultural practices within a given knowledge community, then they must be extremely important *to that community*, and will still be resisted, especially by those most acculturated into that knowledge community. An example of this kind of change might be the inclusion in supplemental materials of analyses and data that did not give positive results. This reduces the file drawer effect, and may improve the justified level of belief in the significance of published results (p < 1.0).

This level of change may require a "blood upgrade" (<https://www.goodreads.com/quotes/4079-a-new-scientific-truth-does-not-triumph-by-convincing-its>). It might also be imposable from above by extreme measures (if more than 10% of your claimed significant results can't be replicated, and you can't provide a reasonable explanation in a court of law, you may be held liable for consequential damages incurred by others reasonably relying on your work, plus reasonable costs and possible punitive damages for costs incurred attempting to replicate. Repeat offenders will be fed to a ravenous mob of psychology undergraduates, or forced to teach introductory creative writing).
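To make the HDF5 suggestion concrete, here is a minimal sketch using h5py; the attribute names and the example.org IRIs are placeholders of my own invention, not an established convention:

    # A sketch only: annotate an ordinary HDF5 dataset with the OWL/RDFS
    # class and XSD datatype its values are meant to instantiate.
    # The example.org IRIs and the attribute names are illustrative placeholders.
    import h5py
    import numpy as np

    with h5py.File("observations.h5", "w") as f:
        dset = f.create_dataset("sea_surface_temperature",
                                data=np.array([287.1, 287.4, 286.9]))
        # Attach lightweight semantic annotations as ordinary HDF5 attributes;
        # tools that do not care about them can simply ignore them.
        dset.attrs["rdf_type"] = "http://example.org/ontology#SeaSurfaceTemperature"
        dset.attrs["xsd_datatype"] = "http://www.w3.org/2001/XMLSchema#double"
        dset.attrs["unit"] = "http://example.org/units#Kelvin"

A downstream harvester could walk the file and emit an auxiliary file of triples from these attributes, without the data producers ever having to touch RDF directly.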
Simon

P.S.

[dvips was much easier if you had access to Distiller]

It is possible to add mathematical content to HTML pages, but it is not easy. MathML is not something that browser developers want, which means that the only viable approach is MathJax (<http://mathjax.org>). MathJax is impressive, and supports a nice subset of LaTeX (including some AMS). However, it adds a noticeable delay to page rendering, as it is heavy-duty ECMAScript and is computing layout on the fly.

It does not require server-side support, so it is usable from static sites like GitHub Pages (see e.g. the tests at the bottom of <http://who-wg.github.io>). However, the common deployment pattern, loading MathJax from their CDN, adds archival dependencies (a rough sketch of vendoring a local copy instead follows the quoted message below).

From a processing perspective, this does not make semantic processing of the text much easier, as it may require ECMAScript code to be executed.

On Oct 7, 2014 8:14 AM, "Phillip Lord" <phillip.lord@newcastle.ac.uk> wrote:
>
> On 10/07/2014 05:20 AM, Phillip Lord wrote:
>> "Peter F. Patel-Schneider" <pfpschneider@gmail.com> writes:
>>
>>>> tex4ht takes the slightly strange approach of having a strange and
>>>> incomprehensible command line, and then lots of scripts which do default
>>>> options, of which xhmlatex is one. In my installation, they've only put
>>>> the basic ones into the path, so I ran this with
>>>> /usr/share/tex4ht/xhmlatex.
>>>>
>>>> Phil
>>>
>>> So someone has to package this up so that it can be easily used. Before
>>> then, how can it be required for conferences?
>>
>> http://svn.gnu.org.ua/sources/tex4ht/trunk/bin/ht/unix/xhmlatex
>
> Somehow this is not in my tex4ht package.
>
> In any case, the HTML output it produces is dreadful. Text characters,
> even outside math, are replaced by numeric XML character entity references.
>
> peter
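P.P.S. A rough sketch of the local-vendoring idea mentioned above (my own illustration; the CDN URL and file locations are placeholders, so check what your generator actually emits):

    # A sketch only: rewrite a generated HTML page so that it loads a locally
    # vendored copy of MathJax instead of the CDN, removing the external
    # archival dependency.  The URLs below are illustrative placeholders.
    from pathlib import Path

    CDN_SRC = "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"
    LOCAL_SRC = "mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"  # copy shipped next to the page

    def vendor_mathjax(html_path: str) -> None:
        page = Path(html_path)
        text = page.read_text(encoding="utf-8")
        if CDN_SRC in text:
            page.write_text(text.replace(CDN_SRC, LOCAL_SRC), encoding="utf-8")

    if __name__ == "__main__":
        vendor_mathjax("index.html")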
Received on Tuesday, 7 October 2014 18:17:20 UTC