Copyright © 2022 the document editors/authors. All rights reserved.
LegalHTML is an extension of the [HTML] language designed for representing legal documents.
This section is non-normative.
LegalHTML is an extension of the [HTML] language designed for representing legal documents. It provides an explicit domain language addressing all structural aspects of an act, such as articles, paragraphs, items and references. It is enhanced with rich metadata for describing the editorial and jurisdictional history of an act and includes references to domain entities. Furthermore, LegalHTML addresses the consolidation of an act and its subsequent modifications by representing it in a single document using a tree-like model of the original content and of its modified versions. The associated implementation of LegalHTML introduces a default presentation and default behavior for compliant documents, which become "active" as they support table of contents generation, footnote cross-references and, moreover, point-in-time visualization of consolidated documents.
LegalHTML is a unifying standard for the semantics of legal documents on the one hand and their visual representation on the other. This is in contrast with other approaches, where such aspects are managed through separate models and are consequently expressed through different document instances. As a result, at the publication level, there is a single document that incorporates all of these aspects.
Adopting a unifying format covering structure, presentation and semantics is not only beneficial in terms of publication. It also supports agile editorial workflows in which a single document can be initially drafted at the mere text level and progressively refined by specialists concerned with the different aspects of the document (proper legal formulation of the content, document structure, cosmetics, semantics, entity references, etc.).
This document describes how to use LegalHTML-specific elements for describing legal documents. However, a legal document is also composed of figures, tables, legends and ordinary text (e.g. in memoranda) that do not require these elements for the legal domain and can instead be represented through the HTML Standard. Such representations are not described in the present document. The reader may refer to the HTML Standard for a complete compendium on document editing for the Web.
Sections of ordinary text can still benefit, though, from some elements of the LegalHTML vocabulary, when these help represent the content properly. These exceptions are discussed throughout this document.
LegalHTML distinguishes three semantic layers with different concerns:
LegalHTML addresses these different concerns using the various extensibility mechanisms of the HTML Living Standard, making an informed decision about which of them to use in each context to achieve a concise and clear representation.
Furthermore, LegalHTML is not bound to a specific legal tradition and is designed to be extensible through the use of different ontologies and authority resources; these can be adopted interchangeably, providing the vocabulary and semantics required by any given tradition.
A LegalHTML document is an [HTML] document that conforms to the constraints specified herein. These constraints include the requirement to link the [JavaScript] file and the [CSS-2021] style sheet that together implement LegalHTML. The following example illustrates the structure of legal documents in LegalHTML.
The script element with type text/turtle contains RDF metadata conveniently serialized in [turtle] syntax. The body of the document consists of a number of custom sections matching the structure of the specific legal document to represent.
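A minimal sketch of this overall structure is given below. The file names legalhtml.css and legalhtml.js are hypothetical placeholders for the style sheet and script that implement LegalHTML, and the choice of body sections is only illustrative; the actual content depends on the act being encoded.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Example act</title>
    <!-- Hypothetical file names for the LegalHTML implementation -->
    <link rel="stylesheet" href="legalhtml.css">
    <script src="legalhtml.js"></script>
    <!-- RDF metadata about the document, serialized in Turtle -->
    <script type="text/turtle">
      @prefix lh: <https://w3id.org/legalhtml/ov#> .
      # metadata triples go here
    </script>
  </head>
  <body>
    <!-- Custom sections matching the structure of the act -->
    <section is="lh-preamble">...</section>
    <section is="lh-enacting-terms">...</section>
  </body>
</html>
```

The is attribute is the standard HTML mechanism for customized built-in elements; the two sections shown are described later in this document.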
As for the body, LegalHTML offers a broad set of elements covering different requirements. The actual content of a LegalHTML document then depends on the nature of the legal document being encoded.
Metadata provides global information about the document, possibly including references to external domain entities.
The LegalHTML specification includes a companion ontology, the LegalHTML Ontology, associated with the namespace https://w3id.org/legalhtml/ov#, the preferred prefix of which is lh.
As already mentioned, this ontology can then be complemented with other ontologies and authority resources providing the vocabulary, semantics and entity references required by any given tradition.
In all the examples throughout the specifications, the [ELI] ontology and the metadata profile found in [EUR-LEX] and authority tables of the Publications Office of the European Union have been adopted for additional semantics and entity reference.
A detailed description of the ontology can be found in the associated documentation. The example below illustrates the metadata describing an act and three subsequent amendments.
LegalHTML uses the script element with type text/turtle as a means to store metadata inside the [HTML] document using the [turtle] syntax.
The resource <http://data.europa.eu/eli/dec/2008/589/2012-08-10>, which coincides with the base URI associated with the example document, is first declared to be an lh:ConsolidatedResource; for the sake of interoperability, its type_document according to the [ELI] ontology is also stated to be <http://publications.europa.eu/resource/authority/resource-type/CONS_TEXT>. As such, the document consolidates (lh:consolidates) a number of documents, one being the original act and the others being amendments (the last of which is an indirect amendment). The document is also associated with the corresponding change sets (via the property lh:changeSet), each one being associated with an act (lh:changingAct), some dates (e.g. the property lh:entryIntoForce holds the date of first entry into force) and, finally, the actual changes. The ontology defines a number of properties to hold different types of change, such as lh:forceChange and lh:textualChange for force changes and textual changes, respectively. Textual changes are further specialized into distinct subclasses representing different types of textual change. In the example, an lh:Substitution represents the fact that some subdivision (lh:amendingText) of the changing act has specified the replacement of a subdivision (lh:amendedText) of the act being consolidated. The last change set in the example illustrates an indirect amendment, in which the change affects a subdivision of an earlier amendment to the current act, entailing the presence of two values for the property lh:amendedText. The aforementioned properties are complemented by similar ones, which instead hold technical references to the actual [HTML] elements in the multiple-version document that hold the affected portions of the act being consolidated.
In the example, the <../change1> resource substitutes the content of the second paragraph: the property lh:replacedContent is a relative URL pointing, by means of a fragment identifier, to the lh-version that represents the version of the paragraph being substituted, while the property lh:replacement is a similar reference to the lh-version that represents the new version of that paragraph (both elements have to be children of the same lh-cons custom element).
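The description above can be condensed into the following sketch of a metadata block with a single change set. Only the lh: terms, the base URI, the CONS_TEXT resource and <../change1> come from the text; every https://example.org/ IRI, the fragment identifiers and the date are hypothetical placeholders, and the way the properties are chained is an assumption that should be checked against the ontology documentation.

```html
<script type="text/turtle">
  @prefix lh:  <https://w3id.org/legalhtml/ov#> .
  @prefix eli: <http://data.europa.eu/eli/ontology#> .
  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

  # The consolidated document (its IRI coincides with the base URI)
  <http://data.europa.eu/eli/dec/2008/589/2012-08-10>
      a lh:ConsolidatedResource ;
      eli:type_document
          <http://publications.europa.eu/resource/authority/resource-type/CONS_TEXT> ;
      # Hypothetical IRIs for the original act and one amending act
      lh:consolidates <https://example.org/original-act> ,
                      <https://example.org/amending-act> ;
      lh:changeSet <https://example.org/changeset1> .

  # One change set: the changing act, a date and the actual change
  <https://example.org/changeset1>
      lh:changingAct <https://example.org/amending-act> ;
      lh:entryIntoForce "2012-08-10"^^xsd:date ;   # illustrative date
      lh:textualChange <../change1> .

  # A substitution: legal references plus technical references
  # to the lh-version elements inside the same lh-cons element
  <../change1>
      a lh:Substitution ;
      lh:amendingText <https://example.org/amending-act#art_1> ;
      lh:amendedText  <https://example.org/original-act#art_2> ;
      lh:replacedContent <#par2-v1> ;
      lh:replacement     <#par2-v2> .
</script>
```

The two fragment identifiers are expected to match the id attributes of the corresponding lh-version elements in the body of the document.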
Additionally, [rdfa-core] makes it possible to interleave machine-readable metadata and textual content, when the information of interest appears in the document such as in the case of the concluding formulas.
A number of properties from the LegalHTML Ontology can indeed be represented by text present in the act. In such cases, [rdfa-core] should be used to semantically annotate relevant passages, providing metadata inline, rather than placing them separately, as described in the previous section.
The LegalHTML Ontology is usually combined with other ontologies, vocabularies, and authority tables, to deal with semantic concepts that are specific to a particular legal tradition and thus could not be included in the general ontology.
The following examples use some authority tables from the Publications Office of the EU (e.g., for the property values), and the [ELI] ontology (to a lesser extent), all dealing with semantics related to EU legislation.
Prefix | Namespace | Reference |
---|---|---|
corpbody | http://publications.europa.eu/resource/authority/corporate-body/ | [OP-CORPORATE-BODY] |
file-status | http://publications.europa.eu/resource/authority/file-status/ | [OP-FILE-STATUS] |
language | http://publications.europa.eu/resource/authority/language/ | [OP-LANGUAGE] |
procphase | http://publications.europa.eu/resource/authority/procedure-phase | [OP-PROCEDURE-PHASE] |
role | http://publications.europa.eu/resource/authority/role/ | [OP-ROLE] |
restype | http://publications.europa.eu/resource/authority/resource-type/ | [OP-RESOURCE-TYPE] |
Each of the acting entities of a document can be represented by means of the lh:actingEntity property.
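As a sketch of such an inline annotation, the fragment below marks the mention of an acting entity with RDFa. The wrapping elements and the code corpbody:CONSIL are assumptions made for illustration; the actual value should be looked up in the Corporate body authority table.

```html
<div prefix="lh: https://w3id.org/legalhtml/ov#
             corpbody: http://publications.europa.eu/resource/authority/corporate-body/">
  <p>
    <!-- The annotated text is the mention found in the act;
         the resource attribute carries the normalized entity (assumed code) -->
    <span property="lh:actingEntity"
          resource="corpbody:CONSIL">THE COUNCIL OF THE EUROPEAN UNION</span>,
  </p>
</div>
```

The same pattern (a property attribute plus a resource attribute pointing to an authority entry) would apply to the other entity-valued properties described in this section.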
A document (e.g. a decision) can be addressed to specific entities, which can be represented by means of the lh:addressee property.
The typology of the document, specified with respect to a jurisdiction-specific repertoire, is best represented in a meta element inside the head of the document, making it readily available to HTML processors without the need for extracting the information as RDF.
The entities that originated a document can be annotated individually with the lh:issuer property.
The lh:legislature property indicates, by means of an explicit reference to an authority resource, the legislative term of a given body under which a certain act has been issued.
The language in which a document has been originally drafted can be represented by means of the lh:originalLanguage property. The value has to be taken from an adopted reference set of language IRIs.
The unique reference for a procedure can be represented by means of the lh:procedureID property, with the value taken from the content text.
The procedure stage can be reported, in a document, by using full wording or a dedicated code. In ontology data, the property lh:procedureStage should be used to represent such information, possibly normalizing the value, especially in the case of full wording, to a formal code.
Each of the entities that have proposed a document can be represented by means of the lh:proposingEntity property.
A Reference is an identifier assigned to a document by an entity; it can be expressed by means of the lh:reference property.
Since the reference is usually expressed as a string, there is no need for normalization (though it is permitted to provide one) and its value can be implicitly assumed to be the annotated portion of the content text, as in the following example:
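A minimal sketch of this case is shown below. The surrounding p element and sentence are illustrative; the reference string reuses the number of the Council document mentioned elsewhere in this section.

```html
<p prefix="lh: https://w3id.org/legalhtml/ov#">
  <!-- No resource or content attribute: the literal value of the
       property is the annotated text itself -->
  Council document <span property="lh:reference">7940/19</span>
</p>
```

Because the span carries neither a resource nor a content attribute, an RDFa processor takes the enclosed text as the literal value of lh:reference.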
The lh:reference property can be used to indicate that the subject matter of an act is of (any sort of) relevance for a certain body.
In the following example, the RDFa annotation expresses the fact that the subject matter of the act is governed by the Agreement on the European Economic Area (EEA).
The status of a document can be represented by means of the lh:status property.
The title of the document can be expressed through an RDFa annotation. The advantage of such an annotation is that it can be added to different structural elements of the document, as in the following case of Council internal document 7940/19, where the title of the document is not present in a proper title structure (which will be described later in title).
The title section of an act, which is represented by the customized built-in element lh-title, includes, according to the [OJ-STYLE-GUIDE]:
The aforementioned components of a title are represented as different h1 elements. Text-level semantic elements can be used to further annotate the content: in the example, a time element is used to contain the canonical value of a date in [ISO8601] format.
Please note the custom element lh-effective-title, denoting the portion of the title that specifically addresses the subject of the document and that is never rephrased when citing the document. Indeed, in amendments and references to an act, the full title is often not cited “verbatim”.
A legal act can be broken down into a number of subdivisions that account for distinctions above the basic unit (e.g., the article), which could be, depending on the legal tradition, parts, titles, chapters or sections.
These subdivisions are represented as customized built-in elements section/lh-section with the attribute data-type indicating the subdivision type, as one of: part, title, chapter or section. As already mentioned, it is out of the scope of LegalHTML to define a controlled list of such levels, let alone constrain how they can be nested.
The customized built-in element must contain a nested section/lh-sectionheader, which in turn contains the customized built-in elements h2/lh-section-number (with attribute data-value for the normalized number of the section) and h3/lh-section-title.
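The nesting described above can be sketched as follows. The heading texts and the format of the normalized number in data-value are illustrative assumptions.

```html
<section is="lh-section" data-type="chapter">
  <!-- Mandatory header with number and title of the subdivision -->
  <section is="lh-sectionheader">
    <h2 is="lh-section-number" data-value="1">CHAPTER I</h2>
    <h3 is="lh-section-title">General provisions</h3>
  </section>
  <!-- The basic units (articles) of the subdivision follow -->
  <article>...</article>
</section>
```

A subdivision of another level (e.g. a section within this chapter) would be a further section/lh-section element with its own data-type value.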
The preamble section of an act, which is represented by the section/lh-preamble customized built-in element, covers, according to the [OJ-STYLE-GUIDE], the content between the title and the enacting terms. It consists of an introductory element (represented by a div/lh-preamble-init customized built-in element, usually conveying the acting entities), citations, recitals and a final clause (represented by a div/lh-preamble-final customized built-in element, usually containing the solemn form that introduces the enacting terms).
The citations section may refer, according to the [OJ-STYLE-GUIDE], to the legal basis of the act, preparatory acts and, in legislative acts, the transmission of the document and the procedure followed. These are represented by customized built-in elements div/lh-citation enclosed within a customized built-in element div/lh-citations.
In the following example, there is a semantic annotation for the mention of an article in the TFEU that sets the legal basis for the adoption of the act. The nature of the reference is represented with the property eli:based_on from the [ELI] ontology, while the target is identified using the URI provided by the OP implementation of ELI.
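A sketch of a preamble with one annotated citation is given below. The wording of the act is illustrative, the property is written as eli:based_on as defined in the ELI ontology, and the target IRI is a hypothetical placeholder rather than the actual URI provided by the OP.

```html
<section is="lh-preamble"
         prefix="eli: http://data.europa.eu/eli/ontology#">
  <div is="lh-preamble-init">THE EUROPEAN COMMISSION,</div>
  <div is="lh-citations">
    <div is="lh-citation">
      Having regard to the Treaty on the Functioning of the European Union,
      and in particular
      <!-- Placeholder IRI: replace with the ELI URI of the cited article -->
      <span property="eli:based_on"
            resource="https://example.org/tfeu/art_114">Article 114</span>
      thereof,
    </div>
  </div>
  <!-- The recitals (div/lh-recitals) would be placed here -->
  <div is="lh-preamble-final">HAS ADOPTED THIS DECISION:</div>
</section>
```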
The recitals briefly discuss the reasons for the content of the enacting terms. These are grouped into a customized built-in element div/lh-recitals, which in turn contains:
The customized built-in element div/lh-recital-init contains the introduction to the recitals section, which is usually a phrase like "whereas". Subsequently, the customized built-in element ol/lh-recital-list contains the actual list of recitals. Each recital gets its own li element, which in turn contains:
The enacting terms section of an act, which is represented by the customized built-in element section/lh-enacting-terms, contains, according to the [OJ-STYLE-GUIDE], the normative content of the act, articulated into articles, (numbered) paragraphs, points and other minor subdivisions.
An article is represented with a standard article element, which is used solely for that purpose in LegalHTML.
An article must contain a header that contains:
Paragraphs are represented through the built-in element div customized as lh-paragraph.
LegalHTML allows for a flexible representation of the textual content of the paragraph. If the paragraph is not structured, its content can be represented as text content directly within the paragraph element. If there is a structure, then p elements should be used for the simple text content.
Numbered paragraphs are enclosed in a customized built-in element ol/lh-paragraphs.
Each paragraph is then associated with a distinct li element containing:
Points express an enumeration, which can be numbered or not.
p
element to represent it.
Numbered points are represented using a customized built-in element ol/lh-numpoints. Avoiding generated text, each li child must contain a customized built-in element span/lh-number annotating the point number (whose normalized value is in the attribute data-value) and a customized built-in element span/lh-point annotating the actual content of the point.
Bullet points are represented using a customized built-in element ol/lh-bullpoints. Avoiding generated text, each li child must contain a customized built-in element span/lh-bullet annotating the marker and a customized built-in element span/lh-point annotating the actual content of the point.
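The numbered case can be sketched as follows; the point texts and the form of the normalized values are illustrative assumptions. Bullet points would follow the same pattern, with lh-bullpoints in place of lh-numpoints and lh-bullet in place of lh-number.

```html
<ol is="lh-numpoints">
  <li>
    <!-- The number is written explicitly instead of being generated -->
    <span is="lh-number" data-value="a">(a)</span>
    <span is="lh-point">the first point;</span>
  </li>
  <li>
    <span is="lh-number" data-value="b">(b)</span>
    <span is="lh-point">the second point.</span>
  </li>
</ol>
```

Since the markers are part of the content, the default list markers of the ol element are presumably suppressed by the LegalHTML style sheet.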
LegalHTML provides the customized built-in element div/lh-formula to represent (mathematical) formulas. For the content, MathML is supported, which is the official language for representing mathematical formulas in HTML, but other languages (e.g. LaTeX) can be supported through dedicated scripting in future versions of LegalHTML.
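A minimal sketch of a formula encoded this way is shown below; the formula itself (the area of a circle) is only an illustration.

```html
<div is="lh-formula">
  <!-- Presentation MathML for: A = π r² -->
  <math display="block">
    <mi>A</mi>
    <mo>=</mo>
    <mi>π</mi>
    <msup>
      <mi>r</mi>
      <mn>2</mn>
    </msup>
  </math>
</div>
```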
The code above using MathML to represent a mathematical formula results in the following rendering in the browser:
According to the [OJ-STYLE-GUIDE], the concluding formulas, which are represented by an lh-concluding-formulas custom element, report the date and place an act has been signed as well as the signatory.
Each signature is represented in LegalHTML through the div/lh-signature customized built-in element.
Each signatory can be semantically annotated for the organization they represent (property from the LegalHTML ontology: lh:signatoryOrganization), their role in the organization (lh:signatoryRole) and the signatory themself (lh:signatory).
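The following sketch combines these annotations. The place, date and name are invented placeholders, the code corpbody:CONSIL is an assumption, and representing the role and the signatory as plain literals is only one possible choice.

```html
<lh-concluding-formulas
    prefix="lh: https://w3id.org/legalhtml/ov#
            corpbody: http://publications.europa.eu/resource/authority/corporate-body/">
  <!-- Placeholder place and date of signature -->
  <p>Done at Brussels, <time datetime="2020-01-01">1 January 2020</time>.</p>
  <div is="lh-signature">
    <span property="lh:signatoryOrganization"
          resource="corpbody:CONSIL">For the Council</span>
    <span property="lh:signatoryRole">The President</span>
    <span property="lh:signatory">A. N. Other</span>
  </div>
</lh-concluding-formulas>
```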
An active modification is a portion of text describing an amendment performed on another act.
The whole text of the amendment is wrapped by a div/lh-placedate customized built-in element.
The portions of text that are to be inserted, replaced, etc. can be cited explicitly through the blockquote or q elements. It is worth mentioning that:
lh-amendment covers the whole amending paragraph
blockquote is used for quotes of structures that may range from a simple element to structures of several nested tags. As an alternative, q can be used for pure replacing/replaced text, e.g. to say that “dog must be replaced by fog”, both dog and fog would be covered by distinct q elements
blockquote and q must possess the attributes data-start-quote and data-end-quote, specifying the characters used in the article to surround the active modification
Multiple amendments may be introduced by sentences qualifying part of the information (e.g. part of the reference) that is relevant for all of the amendments. In this case, the div/lh-amendment-group element is adopted as an umbrella over the whole set of amendments. Furthermore, this element is a non-terminal one: lh-amendment-group tags can be nested to introduce progressively more specific information about progressively smaller numbers of amendments, as in the following example:
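A sketch of two nested amendment groups is given below. The wording of the amendments is invented, and the assumption that lh-amendment customizes the div element is not confirmed by the text above, which only names lh-amendment as the element covering the whole amending paragraph.

```html
<div is="lh-amendment-group">
  <p>The Decision is amended as follows:</p>
  <div is="lh-amendment-group">
    <p>in Article 2:</p>
    <!-- Assumption: lh-amendment is a customized div -->
    <div is="lh-amendment">
      in paragraph 1,
      <q data-start-quote="‘" data-end-quote="’">dog</q>
      is replaced by
      <q data-start-quote="‘" data-end-quote="’">fog</q>;
    </div>
    <div is="lh-amendment">
      the following point is added:
      <blockquote data-start-quote="‘" data-end-quote="’">
        <!-- Stands in for an li element, which cannot be placed here -->
        <lh-doppelganger data-target="li">(c) a new point.</lh-doppelganger>
      </blockquote>
    </div>
  </div>
</div>
```

The outer group carries the information shared by all amendments (the amended act), while the inner group narrows it down to a single article.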
The lh-doppelganger element
Please note in this last example the use of the autonomous custom lh-doppelganger element. This element is meant to replace a specific [HTML] element if this cannot be placed in that position. For instance, in the above example, the root of the blockquote should be a li element. However, [HTML] restricts the use of li to positions under an ol or ul element. The lh-doppelganger thus represents a sort of escaping of the element, replacing it and denoting the type of the replaced element through the data-target attribute. If the replaced element has further attributes, these are replaced by attributes named data-attr-<originalattributename>.
E.g. if the element tag:
<td colspan="2">
is to be replaced, the following would appear:
<lh-doppelganger data-target="td" data-attr-colspan="2">
The notation of direct applicability is specific to European Union legislation, in which certain acts (e.g., regulations and decisions with an addressee) are legally binding without the need for implementing laws in the member states. Such acts contain a special formula after the last article, which can be represented as a customized built-in element section/lh-direct-applicability.
The enclosed content of this section can be annotated in more detail through the lh:applicability property. The value of the property should be taken from the Corporate body authority table of the OP. In the following case, the corpbody:EUMS resource corresponds to all EU Member States, as expressed in the text of the direct applicability of the act.
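A sketch of such an annotation follows. Placing the RDFa attributes directly on the section element is an assumption made for brevity; they could equally sit on an inner element wrapping the relevant text.

```html
<section is="lh-direct-applicability"
         prefix="lh: https://w3id.org/legalhtml/ov#
                 corpbody: http://publications.europa.eu/resource/authority/corporate-body/"
         property="lh:applicability"
         resource="corpbody:EUMS">
  This Regulation shall be binding in its entirety and directly
  applicable in all Member States.
</section>
```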
An act can contain other documents that support it, complement it or are the subject of it. In this section we describe various sectioning elements available in LegalHTML for describing such embedded documents. In general, all these elements are represented by customized versions of the built-in element section.
In the context of ordinary legislative procedures, a body's position on a certain subject can be consolidated over time. The result of this consolidation can be described in a dedicated section with section/lh-consolidated-text.
The element can be further decorated with RDFa code expressing the nature of the consolidated text.
In the following example, the eli:type_document property and the restype:CONS_TEXT value from the EU authority table Resource type are used to describe the consolidated text.
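One possible way of writing this annotation is sketched below. The id value and the use of the about attribute to make the embedded section the subject of the statement are assumptions; other RDFa patterns would work as well.

```html
<section is="lh-consolidated-text"
         id="consolidated-text"
         prefix="eli: http://data.europa.eu/eli/ontology#
                 restype: http://publications.europa.eu/resource/authority/resource-type/"
         about="#consolidated-text"
         property="eli:type_document"
         resource="restype:CONS_TEXT">
  <!-- Content of the consolidated text -->
  ...
</section>
```

The same pattern, with a different restype: value, would apply to the other embedded-document elements described in this section.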
The customized built-in element section/lh-legislative-resolution can be used to express the position of a body on a given draft legislative act.
The element can be further decorated with RDFa code expressing the nature of the legislative resolution.
In the following example, the eli:type_document property and the restype:RES_LEGIS value from the EU authority table Resource type are used to describe the document as a legislative resolution.
The act that is the subject of a proposal can be embedded into the latter through the customized built-in element section/lh-proposed-act.
The element can be further decorated with RDFa code better qualifying the type of act.
In the following example, the eli:type_document property and the restype:DEC value from the EU authority table Resource type are used to describe a Decision proposed in a "Proposal for a Decision".
The act that is the subject of an enclosing legal document (e.g. a position) can be embedded into the latter through the customized built-in element section/lh-target-act.
The element can be further decorated with RDFa code better qualifying the type of act.
In the following example, the eli:type_document property and the restype:DEC value from the EU authority table Resource type are used to describe a Decision on which the EU Council has expressed its position.
A solution for general cases of documents meant to support or provide a rationale for the main act is provided by section/lh-accompanying-document.
The element can be further decorated with RDFa code better qualifying the type of accompanying document and thus its role in the main act.
In the following example, the eli:type_document property and the restype:STAT_REASON value from the EU authority table Resource type are used to describe a document providing the reasons supporting a Council's position.
The marker with the broadest semantics for embedded documents is represented by section/lh-embedded-document.
As for accompanying documents, this element can be further decorated with RDFa code better qualifying the type of embedded document and thus its role in the main act.
An annex contains, according to the [OJ-STYLE-GUIDE], rules and technical data that do not appear in the enacting terms. Annexes are represented as a customized section typed lh-annex, which is not constrained in its content. Indeed, annexes usually comprise tables and lists, but they might also contain figures, chemical formulas, and other specialized data. For this reason, LegalHTML grants a very high degree of freedom in writing annexes, which might benefit from other Web standards with native support in browsers or via JavaScript polyfills:
The mhchem extension of [MATH-JAX] implements the chemical equation macros of the LaTeX mhchem package
An lh-annex must contain a header that contains:
Footnotes are represented in a footer as an lh-footnote custom element.
References to a footnote are written in the document as a custom a element in [HTML] named lh-footnote-ref.
The href attribute must be a relative [URL] utilizing a fragment reference to the id of the target footnote.
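The relation between a footnote and its reference can be sketched as follows. The id value, the marker text and the assumption that lh-footnote is an autonomous custom element are illustrative.

```html
<p>
  Text of a provision<a is="lh-footnote-ref" href="#fn1">(1)</a>.
</p>

<footer>
  <!-- The id is the target of the fragment in the href above -->
  <lh-footnote id="fn1">Text of the footnote.</lh-footnote>
</footer>
```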
LegalHTML embraces the single-document, multiple-version approach to consolidation that is being developed in [AKN4EU]: a legal act, together with any subsequent amendments, including indirect ones, is consolidated into a single document that represents a superposition of different versions. Part of the default behavior of a LegalHTML document is to materialize each of them as a different view of the act, depending on factors including the applicability of each change.
At the structural level, different versions of the same section or internal part thereof are wrapped in different lh-version custom elements, which are then grouped into an lh-cons custom element.
The information needed to select the right version (e.g. of the effective title, in the example above) is stored in RDF as part of the metadata embedded in the document.
Inside an lh-version custom element, it is possible to find the same structure of consolidated versions, if there are later changes to parts of that version.
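The grouping and nesting of versions can be sketched as follows; the id values and the text are placeholders. The ids are what the lh:replacedContent and lh:replacement properties in the metadata would point to by means of fragment identifiers.

```html
<lh-cons>
  <!-- Original wording -->
  <lh-version id="p2-v1">original wording of the paragraph</lh-version>
  <!-- Wording after a first amendment, part of which was changed again -->
  <lh-version id="p2-v2">
    wording after the first amendment, of which
    <lh-cons>
      <lh-version id="p2-v2-part-v1">this part</lh-version>
      <lh-version id="p2-v2-part-v2">this later part</lh-version>
    </lh-cons>
    was subsequently changed
  </lh-version>
</lh-cons>
```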
The default presentation of a LegalHTML document borrows a lot from the look and feel of [EUR-LEX].
A document in LegalHTML also has a default behavior that addresses different concerns:
LegalHTML adds the special property consleg to the document object upon load. For a consolidated resource, this property allows for switching to a view of the document as it was in force on a given date.
The consleg property hosts a reasoner able to render the document according to different provision requirements. The current API allows for switching between the versions in force at the requested date of provision. However, since the consolidation model of this specification and the LegalHTML ontology foresee the representation of events related to the publication and applicability of legal acts and of their modifications, further evolutions of the API provided by this property are possible without requiring changes to the LegalHTML language. For instance, it could be relevant to have an API for rendering a document considering all modifications that are applicable, rather than merely in force, at the requested date of provision. Complex combinations can also be implemented, e.g. considering all modifications that are not only effective at the date of provision, but have also been determined by modifying acts that were published before that date. This can be relevant for determining, in disputes, the good faith of someone who, at a certain point in time, acted unaware of a law that, though retroactive, was published after that point in time.
As clearly expressed in the introduction and in the following sections, LegalHTML has not been conceived for a specific legal tradition. Nonetheless, its initial development has been funded and supported by an effort of the European Commission to simplify the editorial and publication workflow of legal documents.
In the context of this effort, LegalHTML is guaranteed to cover and properly represent all legal documents to be published in the Official Journal of the EU, by following the [OJ-STYLE-GUIDE] and by testing the developed specifications against a reference corpus of documents from the Official Journal.
The following list reports all document types that have informed the development of LegalHTML:
legal acts
other legal documents
act proposals
embedded documents; these documents exist only attached to other legal documents
The general structure of a legal act consists of the following sections:
Regulations also specify the direct applicability between the enacting terms and the concluding formulas.
Some recurring patterns in the structure of documents that are not acts include:
All documents that are not acts usually open with several metadata entries, such as:
(lh:actingEntity property)
(lh:reference)
All proposals for an act (i.e. decision, directive, regulation) feature:
eli:type_document set to restype:EXPL_MEMORANDUM
eli:type_document set to restype:STAT_REASON
(lh:legislature property)
two embedded documents: