- From: Larry Masinter <masinter@adobe.com>
- Date: Thu, 26 Oct 2017 18:31:57 +0000
- To: "public-pdf-open-data@w3.org" <public-pdf-open-data@w3.org>
- Message-ID: <3E996DFA-3AF6-4C64-9C7A-3592613E097F@adobe.com>
I put together some notes about documents and data that I’d like to discuss.
This is an outline of discussion points.
We should look at the use cases to see whether this analysis gives a better way to evaluate them.
The 5-star ratings confuse modality with format
Data, Documents, Access methods all have a place
Different modalities have different requirements
I’d like to suggest another look at how to evaluate “Open Publication”,
one that separates the modes.
* Documents should be accessible, transportable,
searchable, translatable, portable, and in an open format.
Open format is about current and future tools.
.docx .pdf .xps are all document formats.
Documents in source format (LaTeX, etc.) often depend on local context
PDF can too (e.g., non-embedded fonts)
Technology is moving rapidly
Special place for image scan of paper
Good and bad scans
Printouts of spreadsheets — acknowledge
paper has dominated work practice for the last few centuries
And most of current law
The world is moving slowly to data
OCR is improving too, but doesn’t currently
do very well with bad scans of paper tables
* Data should be in a reusable data format
Extra points for
- documenting the schema
- using a standard schema
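
A minimal sketch of what "documenting the schema" could look like in practice, assuming the data is published as CSV next to a small machine-readable field list. The column names, types, and the `read_with_schema` helper are hypothetical and only for illustration; they are not any portal's actual schema.

```python
import csv
import io
import json

# Hypothetical schema: field names, types, and descriptions are illustrative.
SCHEMA = {
    "fields": [
        {"name": "year", "type": "integer", "description": "Calendar year"},
        {"name": "region", "type": "string", "description": "Region name"},
        {"name": "amount", "type": "number", "description": "Amount in USD"},
    ]
}

# Map the schema's declared types to Python conversion functions.
CASTS = {"integer": int, "number": float, "string": str}


def read_with_schema(csv_text, schema):
    """Parse CSV text and cast each column to the type the schema declares."""
    reader = csv.DictReader(io.StringIO(csv_text))
    expected = [field["name"] for field in schema["fields"]]
    if reader.fieldnames != expected:
        raise ValueError(f"header {reader.fieldnames} does not match schema {expected}")
    rows = []
    for row in reader:
        rows.append(
            {f["name"]: CASTS[f["type"]](row[f["name"]]) for f in schema["fields"]}
        )
    return rows


csv_text = "year,region,amount\n2016,North,1200.50\n2017,South,980\n"
rows = read_with_schema(csv_text, SCHEMA)

# The schema can be published alongside the data as plain JSON.
print(json.dumps(SCHEMA, indent=2))
print(rows)
```

The point is only that a consumer who has the field list can check and type the data without reading the accompanying document.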
* Data needs explanation — hypertext, web applications are great
- accessibility and multilingual support are important
- let people download as data, also as document
* Hybrid forms of document + data are interesting
PDF with data attachments
If the document explains the schema
HTML with RDFa or microdata
Use Schema.org?
Forms and form data (e.g., publishing tax returns in the US, where Form 1040 is the schema)
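
To make the "HTML with microdata" case concrete, here is a rough sketch of pulling the data back out of a human-readable page, using only Python's standard `html.parser`. The sample page and the `ItempropParser` class are hypothetical, and the parser handles only flat, non-nested `itemprop` values; a real consumer would want a full microdata implementation.

```python
from html.parser import HTMLParser


class ItempropParser(HTMLParser):
    """Collect itemprop name/value pairs from simple, non-nested microdata."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._current = None  # itemprop whose text content is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        if name is None:
            return
        if "content" in attrs:
            # e.g. <meta itemprop="..." content="...">
            self.props[name] = attrs["content"]
        else:
            # Value is the element's text, gathered in handle_data.
            self._current = name
            self.props[name] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.props[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


# Hypothetical page: a document for people that also carries the data.
PAGE = """
<div itemscope itemtype="https://schema.org/Dataset">
  <span itemprop="name">City budget 2017</span>
  <meta itemprop="datePublished" content="2017-10-26">
  <span itemprop="description">Annual spending by department.</span>
</div>
"""

parser = ItempropParser()
parser.feed(PAGE)
print(parser.props)
```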
* None of the data portals I’ve seen cares about the 4th and 5th stars
Those stars are about hybrid forms and a dream, but are not so practical
Except HTML with microdata
Documents can be doctored or edited, even PDFs
Best practice should be to give people a way to validate
QR-code with URL to official site?
Use digital signatures
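
As a rough illustration of giving people a way to validate, the sketch below compares a downloaded file against a SHA-256 digest that the official site is assumed to publish. This is a plain checksum, which is weaker than a digital signature: it only helps if the digest itself comes from a trusted channel. The function names and the sample bytes are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_published_digest(document: bytes, published_hex: str) -> bool:
    """Check a document against a digest obtained from the official site."""
    return hmac.compare_digest(sha256_hex(document), published_hex.strip().lower())


# Assumption: these bytes stand in for the official document, and
# `published` stands in for the digest shown on the official site.
original = b"%PDF-1.7 example document bytes"
published = sha256_hex(original)

tampered = original + b" edited"

print(matches_published_digest(original, published))  # True
print(matches_published_digest(tampered, published))  # False
```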
Larry
--
http://LarryMasinter.net
Received on Thursday, 26 October 2017 18:32:26 UTC