Extensible Validation Report Language (XVRL)
Editor's Draft
- This Version:
- Local build
- Latest Version:
- http://spec.xproc.org/master/head/xvrl/
- Editors:
- Gerrit Imsieke
- Matthieu Ricaud-Dussarget
- Norman Walsh
- Repository:
- This specification on GitHub
- Report an issue
This document is also available in these non-normative formats: XML.
Copyright © 2019 @@FIXME:
Abstract
This specification describes a unified vocabulary for validation reports. Its main focus is to express the findings of the most common XML validation languages: Schematron, XML Schema, DTD, and Relax NG. It is meant to be extensible in multiple ways. It should be able to express the results both of other XML validation methods and of validation methods that apply to non-XML formats such as JSON or RDF graphs (irrespective of their serialization format). Another extension axis is that it allows the addition of custom attributes or elements. While XVRL at its core is specified in terms of an XML vocabulary with a Relax NG schema, there may also be non-normative serialization formats and schemas, namely a JSON serialization and schema.
Status of this Document
This document is an editor's draft that has no official standing.
This section describes the status of this document at the time of its publication. Other documents may supersede this document.
1 Introduction
XVRL provides a unified format for expressing the results of possibly multiple validation methods, applied to possibly multiple documents. The need arises because not every validation language has a standardized report format, making it difficult to render the results of multiple validations in a single report.
2 XVRL Vocabulary
XVRL elements are in the namespace http://www.xproc.org/ns/xvrl. XVRL documents may contain elements in other namespaces at certain locations. The XVRL elements and attributes and their semantics are given in the following lists. More details about the XVRL grammar are encoded in the Relax NG Compact Syntax version of the XVRL schema, which is also normative.
2.1 Document Structure
detection
A single finding, typically with an associated error code and/or message(s). A report element primarily contains detection elements. See Section 2.2, “Detection” for details.
digest
A report may contain a digest element in order to provide a summary of the detection elements. For the distinct severity levels, counts of the detection elements for a given level may be specified on digest, for example in an @error-count attribute. In addition, the @worst attribute may give the highest severity level that occurs in the detection elements that are contained by the digest’s parent element.
A digest element may occur in addition to or instead of detection elements. If no detection element is included, a digest element must be included.
All information in digest is understood to be aggregated at some point from the actual detection elements. It is the responsibility of an XVRL creating/processing application to keep them up to date or to remove them when the underlying detection information is changed. A digest may be inserted either before or after the detection elements.
metadata
Information about the time of validation, the validator used, the document(s) under test, etc. See Section 2.3, “Metadata” for details.
A single metadata element need not contain all relevant metadata. Metadata information is inherited from surrounding reports/metadata elements: if a given metadata does not provide validator information but the parent reports/metadata does, the parent’s metadata/validator will also pertain to the current metadata element’s siblings and their descendants, unless overridden further down.
report
The result of a single validation method, typically using a single schema, typically applied to a single document (also referred to as the source document). The individual errors (or other findings) are included as detection elements.
Naming things…
Previously, what is called “detection” here was called “report”, while a collection of detections was called “validation-report”. Now this collection is called “report”, while a collection of (new-terminology) reports is now called “reports” (previously: “validation-reports”). I changed the names because I didn’t think that an individual finding and a collection of individual findings should both be called a “report”, with only the prefix “validation-” distinguishing between them. Since the individual finding is also the result of a validation, there is no reason it couldn’t have been called “validation-report” in the first place. It took quite some time to come up with a term for the individual findings. Candidates were: “finding”, “observation”, “detection”, “incidence”, and “incident”. I’m willing to rename it to something that seems more fitting.
reports
A collection of report elements. It may contain the same metadata information as a single report in order to denote common information, for example if all validations have been applied to the same document or if all validations use the same schema or validation engine.
reports elements may nest in order to group report elements with common sets of metadata.
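The aggregation that a digest performs over the detection elements of its parent can be sketched as follows. This is only an illustration under stated assumptions, not a normative algorithm: the sample uses the element names of this section (not those of the schema in Appendix A), only @error-count and @worst are named in the text above, and the other count attribute names simply follow the same pattern. The validator name and error codes are made up.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XVRL_NS = "http://www.xproc.org/ns/xvrl"
NS = {"x": XVRL_NS}
# Severity levels from lowest to highest impact, as listed in Section 2.2.
SEVERITY_ORDER = ["info", "warning", "error", "fatal-error"]

# Hypothetical report; validator name and codes are invented for illustration.
SAMPLE = f"""<report xmlns="{XVRL_NS}">
  <metadata><validator name="example-validator"/></metadata>
  <detection severity="warning" code="W01"><message>Deprecated element</message></detection>
  <detection severity="error" code="E07"><message>Missing attribute</message></detection>
  <detection severity="error" code="E09"><message>Invalid value</message></detection>
</report>"""


def build_digest(report):
    """Return a new digest element that summarizes the report's detection children."""
    counts = Counter(
        d.get("severity", "unspecified") for d in report.findall("x:detection", NS)
    )
    digest = ET.Element(f"{{{XVRL_NS}}}digest")
    for level in SEVERITY_ORDER:
        if counts[level]:
            # Assumption: one "<level>-count" attribute per level, modelled on @error-count.
            digest.set(f"{level}-count", str(counts[level]))
    present = [level for level in SEVERITY_ORDER if counts[level]]
    if present:
        digest.set("worst", present[-1])  # highest severity that actually occurs
    return digest


report = ET.fromstring(SAMPLE)
digest = build_digest(report)
report.append(digest)  # a digest may be placed before or after the detections
print(digest.attrib)
```

An application that later adds or removes detection elements would have to call such a function again (or drop the digest), since the digest is only a derived view.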
2.2 Detection
As described in Section 2.1, “Document Structure”, detection is the main container for individual validation findings. It contains optional severity and code attributes, and the following elements in arbitrary order:
category
In order to filter or group messages for a formatted report, individual detections may be categorized according to arbitrary category systems, using the repeatable category element. Its optional attribute vocabulary can hold a string that designates the category system. There are no pre-defined values to choose from.
code attribute
An error code. The term “error code” is used in a colloquial sense here. It need not relate to an error; it may relate to any kind of message that has a distinctive identifying string.
context element
The purpose of this element is to present a piece of content that surrounds the element that the detection pertains to. It contains an optional location element, followed by (optional) arbitrary text or non-XVRL element content.
location
Within a single detection element, the location in the source document that the validation error, warning, etc. pertains to is given by the location element’s attributes.
If not present, href is taken from the closest ancestor’s metadata/document/@href attribute. If there are multiple document/@href attributes in the closest ancestor’s metadata, the href attribute should not be omitted on location, or at least a disambiguating relative URI should be given in the location/@href attribute.
The attribute xpath contains an XPath expression that gives the location within the document. The in-scope value of the attribute xpath-default-namespace that is permitted on any element may give a namespace for the element names in this XPath expression. Apart from that, the Q{namespace-uri}local-name syntax should be used, but in-scope namespace prefixes or XPath predicates like [namespace-uri() = 'uri'] may also be used.
The attributes line and column may also be used to point at lines and columns in a textual representation of the source document.
The attribute octet-position may be used to give the byte position (1-based) of the error. This may be useful for binary documents.
In order to support JSON document validations, the attributes jsonpath and jsonpointer may be used.
Giving multiple alternative pointers is not forbidden. However, it is beyond the scope of this specification to define mechanisms to enforce or check consistency between the attribute values. It is evident that jsonpointer or jsonpath are meaningless in the context of XML documents.
Other attributes are permitted if they are in a non-XVRL namespace.
message
An error message that pertains to a detection. There may be multiple message elements in a single detection element, typically to convey localized versions of essentially the same message. A message may contain arbitrary markup in non-XVRL namespaces. Messages are typically generated for consumption by humans.
Note
Whenever the term “error message” is used in a colloquial sense (that is, not highlighted as the severity level “error” or as the XVRL element “message”) throughout this specification, a detection element with any @severity level, not necessarily “error”, and any number of localized messages is implied. Likewise, the term “error code” does not imply the severity level “error”.
provenance
In multi-step conversion pipelines it is sometimes required to save a common origin location that a portion of the validated document is derived from. This may be necessary in order to patch back error messages of later conversion stages into the source document.
The optional provenance element within a detection conveys exactly this information, in a contained location element that points to the provenance location in the original source document. A provenance element may contain multiple location elements; it is up to processing applications to discern between different roles that they may have.
Although it is possible to omit the @href attribute in the contained location elements, this URI is not inherited from a containing element’s metadata/document/@href attribute.
severity attribute
The severity attribute is permitted on a detection element. XVRL establishes a finite set of error levels that correspond to the impact of a detected issue. Each detection element may have a severity level, from highest (worst impact) to lowest, of “fatal-error”, “error”, “warning”, or “info”. In addition, the severity attribute may have the value “unspecified”, which is equivalent to omitting the attribute.
Note
Which severity level is attached to a given error code depends on, among other things, the audience that the validation report is prepared for. For Schematron’s SVRL output, the values of @role will typically translate to XVRL @severity attributes, but this mapping may be configured; see Appendix B, Parameters for Controlling XVRL Generation.
summary element(s)
An abstract of a report, a reports collection, or an individual detection. This element is repeatable, for example, in order to support multiple natural languages. In the context of detection, it can serve as an abridged version of a full message that contains lengthy lists and the like.
supplemental element(s)
This repeatable element may contain arbitrary textual or non-XVRL element content. It may appear in metadata and in detection. Its role attribute may be used to further classify the purpose of its content. Like any other XVRL element, it may be localized using the xml:lang attribute.
Purposes can be, but are not limited to, conveying the SVRL source that the XVRL report was created from, or a disclaimer, a confidentiality statement, or introductory content that should be included in a rendered report.
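The href fallback described for location can be sketched as follows. This is a minimal illustration, not a reference implementation: the sample document, its URIs and codes are hypothetical, it uses the element names of Section 2 (not those of the schema in Appendix A), and returning None for the ambiguous case is a choice made for this sketch only.

```python
import xml.etree.ElementTree as ET

XVRL_NS = "http://www.xproc.org/ns/xvrl"
NS = {"x": XVRL_NS}

# Hypothetical sample; URIs, codes and messages are invented for illustration.
SAMPLE = f"""<reports xmlns="{XVRL_NS}">
  <metadata><document href="file:/data/book.xml"/></metadata>
  <report>
    <metadata><validator name="example-validator"/></metadata>
    <detection severity="error" code="E07">
      <location xpath="/book/chapter[2]/title[1]" line="42" column="7"/>
      <provenance><location line="10"/></provenance>
      <message xml:lang="en">Title must not be empty</message>
    </detection>
    <detection severity="warning">
      <location href="file:/data/other.xml" line="3"/>
      <message xml:lang="en">Unusual encoding</message>
    </detection>
  </report>
</reports>"""


def effective_href(location, parents):
    """Resolve the document URI that a location element refers to, or None."""
    href = location.get("href")
    if href is not None:
        return href
    parent = parents.get(location)
    if parent is not None and parent.tag == f"{{{XVRL_NS}}}provenance":
        return None  # locations inside provenance do not inherit an href
    node = parent
    while node is not None:
        docs = node.findall("x:metadata/x:document[@href]", NS)
        if len(docs) == 1:
            return docs[0].get("href")
        if len(docs) > 1:
            return None  # ambiguous; the text asks for an explicit href here
        node = parents.get(node)
    return None


root = ET.fromstring(SAMPLE)
# ElementTree keeps no parent pointers, so build a child-to-parent map once.
parents = {child: parent for parent in root.iter() for child in parent}
hrefs = [effective_href(loc, parents) for loc in root.iter(f"{{{XVRL_NS}}}location")]
print(hrefs)
```

The first location inherits the URI from the enclosing reports/metadata because its own report’s metadata names no document; the location inside provenance stays unresolved; the last one carries its own href.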
2.3 Metadata
All elements in this section are optional within metadata. The order in which they appear is arbitrary. Some are repeatable.
creator element
Information about the tool that created the source document. There is no content, only the optional attributes @name, @version, and @invocation. The @invocation attribute is meant to hold a command line that contains the invocation of the program that was responsible for generating the source document. This information can be useful for later diagnosing dependencies between errors and command line options.
category element(s)
See category in Section 2.2, “Detection”.
document element(s)
The URI of a source document may be specified in the href attribute. In addition or instead, the document may be given as the element content. See also location about inheritance of href.
schema element(s)
The URI of a schema document may be specified in the href attribute. In addition or instead, the document may be given as the element content. The namespace of the schema may be given in the attribute schematypens. The version of the schema language may be given in the attribute version.
summary element(s)
See summary element(s) in Section 2.2, “Detection”.
supplemental element(s)
See supplemental element(s) in Section 2.2, “Detection”.
timestamp element
The content needs to be an xsd:dateTime value, for example “2017-12-04T12:21:37.381+01:00”.
title element(s)
The title of a report, a reports collection, or an individual detection. This element is repeatable, for example, in order to support multiple natural languages.
validator element
Information about the validation program that generated the report(s) or the underlying messages (if XVRL was not generated natively by the program). There are optional attributes @name and @version; both are arbitrary strings. Arbitrary text or element content (in a non-XVRL namespace) may be contained, for example to describe a configuration or to include an actual configuration file.
A The XVRL Schema
# Schema for XML Validation Report Language; adapted from a proposal
# by Matthieu Ricaud incorporating some suggestions by Gerrit Imsieke
# See https://github.com/xproc/3.0-steps/issues/15
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
namespace xvrl = "http://www.xproc.org/ns/xvrl"
default namespace = "http://www.xproc.org/ns/xvrl"
namespace local = ""
start = validation-reports
xmllang.attr = attribute xml:lang { xsd:language }
xmlid.attr = attribute xml:id { xsd:ID }
xmlbase.attr = attribute xml:base { xsd:anyURI }
# Default namespace URI for location XPaths:
xpdns.attr = attribute xpath-default-namespace { xsd:anyURI }
anyother.attr = attribute (* - (local:* | xvrl:* | xml:*)) { text }
any.attr = attribute (* - xvrl:*) { text }
message.attr = attribute (* - (xvrl:* | xml:*)) { text }
common.attr = xmllang.attr? & xmlid.attr? & xmlbase.attr? & xpdns.attr? & anyother.attr*
any.element =
  element (* - xvrl:*) {
    (any.attr | text | any.element)*
  }
message.element =
  element (* - (xvrl:* - xvrl:value)) {
    (message.attr | text | message.element | value)*
  }
validation-reports =
  element validation-reports {
    common.attr,
    attribute href { xsd:anyURI }?,
    validation-reports.metadata,
    validation-report+
  }
validation-reports.metadata =
  element metadata {
    common.attr,
    timestamp?,
    title*,
    summary*,
    category*,
    any.element*
  }
validation-report =
  element validation-report {
    common.attr,
    attribute href { xsd:anyURI }?,
    validation-report.metadata,
    ((digest?, report+) | (report+, digest) | digest)
  }
## All information in digest is understood to be aggregated at some point from the actual report elements.
## It is the responsibility of an XVRL creating/processing application to keep them up to date or to remove them
## when the underlying report information is changed. If the individual reports are omitted, a digest must be present.
## A digest may be inserted either before or after the report elements.
digest =
  element digest {
    common.attr,
    attribute valid { "true" | "false" | "partial" }?,
    attribute fatal-errors { xsd:integer }?,
    attribute errors { xsd:integer }?,
    attribute warnings { xsd:integer }?
  }
validation-report.metadata =
  element metadata {
    common.attr,
    timestamp?,
    (validator
     & creator?
     & title*
     & summary*
     & category*
     & schema*
     & any.element*
    )
  }
report =
  element report {
    common.attr,
    attribute severity { "info" | "warning" | "error" | "fatal-error" },
    attribute code { text }?,
    location?,
    provenance?,
    let*,
    title*,
    summary*,
    category*,
    message+,
    supplemental*
  }
location =
  element location {
    location.model
  }
location.model =
  xpdns.attr?,
  # XPaths may use the Q{namespace-uri}local-name notation.
  attribute xpath { text }?,
  # These are different syntaxes to address JSON documents.
  # JSON docs may be represented as XPath maps and arrays
  # and then addressed via, e.g., xpath=".(3)('foo')"
  # for the 3rd array item, which is a map, and then the map’s
  # value for the 'foo' key.
  attribute jsonpointer { text }?,
  attribute jsonpath { text }?,
  attribute href { xsd:anyURI }?,
  attribute line { xsd:positiveInteger }?,
  attribute column { xsd:positiveInteger }?,
  # For binary data:
  attribute octet-position { xsd:positiveInteger }?,
  anyother.attr*
provenance =
  element provenance {
    location.model
  }
message =
  element message {
    common.attr,
    attribute template { xsd:boolean }?,
    (text | message.element)*
  }
let =
  element let {
    common.attr,
    attribute name { xsd:QName },
    (attribute value { xsd:string } | (text | any.element))*
  }
value =
  element value {
    common.attr,
    attribute name { xsd:QName }
  }
supplemental =
  element supplemental {
    common.attr,
    (text | any.element)*
  }
validator =
  element validator {
    common.attr,
    attribute name { text },
    attribute version { text }?
  }
creator =
  element creator {
    common.attr,
    attribute name { text },
    attribute version { text }?,
    element invocation { text }?
  }
schema =
  element schema {
    common.attr,
    attribute href { xsd:anyURI }?,
    attribute schematypens { xsd:anyURI },
    attribute version { text }?
  }
title =
  element title {
    common.attr,
    (text | any.element)*
  }
summary =
  element summary {
    common.attr,
    (text | any.element)*
  }
category =
  element category {
    common.attr,
    attribute vocabulary { xsd:token }?,
    (text | any.element)*
  }
timestamp =
  element timestamp {
    common.attr,
    xsd:dateTime
  }
B Parameters for Controlling XVRL Generation
The following parameters should be understood by XVRL report generators when converting underlying validation reports, for example, from SVRL or from the XProc error vocabulary, c:errors etc.
xvrl:default-severity
-
When no severity is associated with a source vocabulary element that is mapped to detection, this property can be specified in order to assign a default severity to any of these source vocabulary constructs. It can be argued that the XProc error vocabulary, c:error, already conveys the severity level error. The view that this specification takes is to regard these messages as generic findings of severity “error”, but xvrl:default-severity may be given to override this.
Implementations are free to provide other parameters, in a different namespace, that permit a more detailed mapping, for example from error code to severity.
xvrl:format
-
Anticipates future alternative serializations. If no value is given, xml is assumed. Other possible values may be, but are not limited to, json, rdf/xml, turtle.
xvrl:language
-
A space-separated list of language abbreviations, typically according to ISO 639-1. The preferred language is given first, followed by fallback languages. The result is that localized elements within a detection will be reduced to messages, categories, or summaries in the preferred language. Example: de en instructs the XVRL generator to include German messages only and to use an English message when no German message is present. If no language matches for a given localizable element in a detection context, a corresponding element with the same attributes, but with no xml:lang attribute, should be included. Localizable elements with an xml:lang attribute that is not listed in this property should be ignored.
xvrl:map-to-severity
-
This parameter contains space-separated QNames that correspond to elements or attributes of an underlying reporting language, in particular SVRL attributes. A value of flag role instructs an SVRL to XVRL converter to preferentially map the SVRL flag attribute to the XVRL severity attribute. If it is not present or its value cannot be mapped, it should try to map the SVRL role value to XVRL’s severity.
The following attribute values are considered mappable, after folding the source value to lower case: “information”, “informational” map to “info”; “warn” maps to “warning”; “fatal” maps to “fatal-error”. A conversion tool may consider other variants, including translations that correspond to the natural language of the corresponding error message, for mapping.
If the content of an (for example) SVRL attribute cannot be mapped, it should be attached to the corresponding XVRL detection either as a category or as a namespaced attribute (that is, role="foo" in SVRL may become svrl:role="foo" in XVRL, with xmlns:svrl="http://purl.oclc.org/dsdl/svrl").
xvrl:xpath-notation
-
This parameter controls how XPath attributes given in location elements should be structured. Possible values are “Q”, “namespace-uri”, and “name”.
Example: The path /TEI/text[1] in the namespace http://www.tei-c.org/ns/1.0 will be represented in these notations as follows:
Q
-
/Q{http://www.tei-c.org/ns/1.0}TEI/Q{http://www.tei-c.org/ns/1.0}text[1]
namespace-uri
-
/TEI[namespace-uri()='http://www.tei-c.org/ns/1.0']/text[namespace-uri()='http://www.tei-c.org/ns/1.0'][1]
This corresponds to the parameter setting full-path-notation=1 in the SVRL output of the Schematron skeleton implementation.
name
-
/tei:TEI/tei:text[1]
This corresponds to the parameter setting full-path-notation=2 in the SVRL output of the Schematron skeleton implementation. It takes namespace prefix declarations from the source document and it needs to copy these declarations to an appropriate location in the resulting XVRL document.
If the XVRL attribute xpath-default-namespace is present on an ancestor element, the namespace URI given in this attribute on the closest ancestor should be used to omit this namespace from the resulting XPath. If xpath-default-namespace="http://www.tei-c.org/ns/1.0" is in force in a given context, the paths in any of the three notations should reduce to /TEI/text[1].
If an XVRL-generating application is unable to generate the preferred notation, any XPath notation that it can produce is acceptable.
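The three notations, and the effect of an in-scope default namespace, can be sketched as a small rendering function. The Step type, the render_xpath function and its parameter names are hypothetical helpers introduced for this illustration only; a real generator would derive the steps from the source document.

```python
from typing import Dict, List, NamedTuple, Optional


class Step(NamedTuple):
    """One element step of a location path (hypothetical helper type)."""
    namespace: str                  # namespace URI, "" for no namespace
    local_name: str
    position: Optional[int] = None  # positional predicate, if any


def render_xpath(
    steps: List[Step],
    notation: str = "Q",
    prefixes: Optional[Dict[str, str]] = None,
    default_ns: Optional[str] = None,
) -> str:
    """Render steps in one of the notations "Q", "namespace-uri" or "name"."""
    prefixes = prefixes or {}
    parts = []
    for step in steps:
        pos = f"[{step.position}]" if step.position is not None else ""
        if not step.namespace or step.namespace == default_ns:
            # No namespace, or covered by the in-scope default namespace.
            parts.append(f"{step.local_name}{pos}")
        elif notation == "Q":
            parts.append(f"Q{{{step.namespace}}}{step.local_name}{pos}")
        elif notation == "namespace-uri":
            parts.append(
                f"{step.local_name}[namespace-uri()='{step.namespace}']{pos}"
            )
        elif notation == "name":
            parts.append(f"{prefixes[step.namespace]}:{step.local_name}{pos}")
        else:
            raise ValueError(f"unknown notation: {notation}")
    return "/" + "/".join(parts)


TEI = "http://www.tei-c.org/ns/1.0"
path = [Step(TEI, "TEI"), Step(TEI, "text", 1)]

for notation in ("Q", "namespace-uri", "name"):
    print(render_xpath(path, notation, prefixes={TEI: "tei"}))
print(render_xpath(path, "Q", default_ns=TEI))
```

With the default namespace set to the TEI namespace, all three notations collapse to the same short path, which is the reduction described above.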
C XSLT Stylesheets (Non-Normative)
As a convenience, XSLT stylesheets will be made available for the following purposes:
- SVRL → XVRL conversion
-
It is recommended that XProc processor vendors make this XSLT available under the import URI http://xproc.org/xvrl/xsl/svrl2xvrl.xsl.
- c:errors → XVRL conversion
-
It is recommended that XProc processor vendors make this XSLT available under the import URI http://xproc.org/xvrl/xsl/c-errors2xvrl.xsl.
- Filter/transform XVRL
-
It is recommended that XProc processor vendors make this XSLT available under the import URI http://xproc.org/xvrl/xsl/xvrl2xvrl.xsl.
All stylesheets accept the parameters given in Appendix B, Parameters for Controlling XVRL Generation.
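To illustrate how a converter might apply two of these parameters, the following sketch (in Python rather than XSLT, and not part of any of the stylesheets) maps SVRL attribute values to an XVRL severity according to xvrl:map-to-severity and xvrl:default-severity as described in Appendix B. The function name map_severity and its signature are hypothetical. For simplicity the QNames in the parameter are treated as plain unprefixed attribute names, and values that already equal an XVRL severity level are assumed to map to themselves.

```python
SVRL_NS = "http://purl.oclc.org/dsdl/svrl"
XVRL_SEVERITIES = {"info", "warning", "error", "fatal-error"}
# Mappable variants listed in Appendix B (compared after lower-casing).
ALIASES = {
    "information": "info",
    "informational": "info",
    "warn": "warning",
    "fatal": "fatal-error",
}


def map_severity(svrl_attrs, map_to_severity="flag role", default_severity=None):
    """Return (severity, leftover attributes) for one SVRL message.

    svrl_attrs: dict of SVRL attribute names to values.
    leftover: unmappable values, keyed by a name in the SVRL namespace so that
    they can be attached to the detection as namespaced attributes.
    """
    severity = None
    leftover = {}
    for name in map_to_severity.split():  # earlier names take precedence
        raw = svrl_attrs.get(name)
        if raw is None:
            continue
        folded = raw.strip().lower()
        mapped = ALIASES.get(folded)
        if mapped is None and folded in XVRL_SEVERITIES:
            mapped = folded  # assumption: XVRL's own level names pass through
        if mapped is None:
            leftover[f"{{{SVRL_NS}}}{name}"] = raw
        elif severity is None:
            severity = mapped
    if severity is None:
        severity = default_severity  # xvrl:default-severity, if given
    return severity, leftover


print(map_severity({"flag": "WARN", "role": "foo"}))
print(map_severity({"role": "Fatal"}))
print(map_severity({"role": "foo"}, default_severity="error"))
```

A converter could equally attach the unmappable value as a category instead of a namespaced attribute; the text leaves that choice open.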