Copyright © 2022 the document editors/authors. All rights reserved.
LegalHTML is an extension of the [HTML] language designed for representing legal documents.
This section is non-normative.
LegalHTML is an extension of the [HTML] language designed for representing legal documents. It provides an explicit domain language addressing all structural aspects of an act such as articles, paragraphs, items and references. It is enhanced with rich metadata for describing the editorial and jurisdictional history of an act and includes references to domain entities. Furthermore, LegalHTML addresses the consolidation of an act and its subsequent modifications by representing it in a single document using a tree-like model of the original content and of its modified versions. The associated implementation of LegalHTML introduces a default presentation and default behavior for compliant documents, which become "active" as they support table of contents generation, footnote cross-references and, moreover, point-in-time visualization of consolidated documents.
LegalHTML is a unifying standard for the semantics of legal documents on the one hand and their visual representation on the other. This contrasts with other approaches, where such aspects are managed through separate models and are consequently expressed through different document instances. As a result, at the publication level, there is a single document that incorporates:
Adopting a unifying format covering structure, presentation and semantics is beneficial not only in terms of publication. It also supports agile editorial workflows in which a single document can be initially drafted at the mere text level and progressively refined by specialists concerned with the different aspects of the document (proper legal formulation of the content, document structure, cosmetics, semantics, entity references, etc.).
This document describes how to use LegalHTML-specific elements for describing legal documents. However, a legal document is also composed of figures, tables, legends and ordinary text (e.g. in memoranda) that do not require these domain-specific elements and can instead be represented through the HTML Standard. Such representations are not described in the present document. The reader may refer to the HTML Standard for a complete compendium on document editing for the Web.
Sections of ordinary text can still benefit, though, from some elements of the LegalHTML vocabulary, when these help represent the content properly. These exceptions are discussed throughout this document.
LegalHTML distinguishes three semantic layers with different concerns:
LegalHTML addresses these different concerns utilizing the various extensibility mechanisms in the HTML Living Standard, making an informed decision on which of them to use in each context to achieve a concise and clear representation.
Furthermore, LegalHTML is not bound to a specific legal tradition and is designed to be extensible through the use of different ontologies and authority resources; these can be interchangeably adopted, providing the vocabulary and semantics required by any given tradition.
A LegalHTML document is an [HTML] document that conforms to the constraints specified herein. These constraints include the requirement to link the [JavaScript] file and the [CSS-2021] style sheet that together implement LegalHTML. The following example illustrates the structure of legal documents in LegalHTML.
The script element with type text/turtle contains RDF metadata conveniently
serialized in [turtle] syntax. The body of the
document consists of a number of custom sections matching the structure of the specific legal document to
represent.
As for the body, LegalHTML offers a broad set of elements covering different requirements. The actual content of a LegalHTML document then depends on the nature of the legal document being encoded.
Metadata provides global information about the document, possibly including references to external domain entities.
The LegalHTML specification includes a companion ontology, the LegalHTML Ontology, associated with the namespace https://w3id.org/legalhtml/ov#, whose preferred prefix is lh.
As already mentioned, this ontology can then be complemented with other ontologies and authority resources providing the vocabulary, semantics and entity references required by any given tradition.
In all the examples throughout this specification, the [ELI] ontology, the metadata profile found in [EUR-LEX] and the authority tables of the Publications Office of the European Union have been adopted for additional semantics and entity references.
A detailed description of the ontology can be found in the associated documentation. The example below illustrates the metadata describing an act and three subsequent amendments.
LegalHTML uses the script element with type text/turtle as
a means to store metadata inside the [HTML] document using the [turtle] syntax.
The resource <http://data.europa.eu/eli/dec/2008/589/2012-08-10>, which coincides with the
base URI associated with the example document, is firstly said to be an lh:ConsolidatedResource, and
for the sake of interoperability it is also said that its type_document according to the [ELI]
ontology is <http://publications.europa.eu/resource/authority/resource-type/CONS_TEXT>. As
such, the document consolidates (lh:consolidates) a number of documents, one being the original act
and the others being amendments (the last of which is an indirect amendment). The document is also associated
with the corresponding change sets (via the property lh:changeSet), each one being associated with an
act (lh:changingAct), some dates (e.g. the property lh:entryIntoForce holds the date of
first entry into force), and, finally, the actual changes. Indeed, the ontology defines a number of properties to
hold different types of change, such as lh:forceChange and lh:textualChange for force
changes and textual changes, respectively. The latter is, indeed, specialized into distinct subclasses
representing different types of textual changes. In the example, an lh:Substitution represents the
fact that some subdivision (lh:amendingText) of the changing act has specified the replacement of a
subdivision (lh:amendedText) of the act being consolidated. The last change set in the
example illustrates an indirect amendment, in which the change affects a subdivision of an earlier amendment to the
current act, entailing the presence of two values for the property lh:amendedText. The aforementioned
properties are complemented by similar ones, which instead hold technical references to the actual [HTML]
elements in the multiple-version document that hold the affected portions of the act being consolidated.
In the example, the <../change1> resource substitutes the content of the second
paragraph: the property lh:replacedContent is a relative URL pointing by means of a fragment
identifier to the lh-version that represents the version of the paragraph being substituted, while
the property lh:replacement is a similar reference to the lh-version that represents the
new version of that paragraph (both elements have to be children of the same lh-cons custom element).
Additionally, [rdfa-core] makes it possible to interleave machine-readable metadata and textual content, when the information of interest appears in the document such as in the case of the concluding formulas.
A number of properties from the LegalHTML Ontology can indeed be represented by text present in the act. In such cases, [rdfa-core] should be used to semantically annotate relevant passages, providing metadata inline rather than placing them separately, as described in the previous section.
The LegalHTML Ontology is usually combined with other ontologies, vocabularies and authority tables to deal with semantic concepts that are specific to a particular legal tradition and thus could not be included in the general ontology.
The following examples use some authority tables from the Publications Office of the EU (e.g., for the property values), and the [ELI] ontology (to a lesser extent), all dealing with semantics related to EU legislation.
| Prefix | Namespace | Reference |
|---|---|---|
| corpbody | http://publications.europa.eu/resource/authority/corporate-body/ | [OP-CORPORATE-BODY] |
| file-status | http://publications.europa.eu/resource/authority/file-status/ | [OP-FILE-STATUS] |
| language | http://publications.europa.eu/resource/authority/language/ | [OP-LANGUAGE] |
| procphase | http://publications.europa.eu/resource/authority/procedure-phase | [OP-PROCEDURE-PHASE] |
| role | http://publications.europa.eu/resource/authority/role/ | [OP-ROLE] |
| restype | http://publications.europa.eu/resource/authority/resource-type/ | [OP-RESOURCE-TYPE] |
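As an illustration of how the prefixes above relate to full IRIs, the following minimal sketch expands a prefixed name such as restype:CONS_TEXT. The helper name expandPrefixedName and the PREFIXES object are assumptions made for this example; they are not part of LegalHTML or of its implementation, and only some rows of the table are copied.

```javascript
// Prefix table: the "lh" entry is the namespace of the LegalHTML Ontology,
// the others are copied from the table of authority resources.
const PREFIXES = {
  lh: "https://w3id.org/legalhtml/ov#",
  corpbody: "http://publications.europa.eu/resource/authority/corporate-body/",
  language: "http://publications.europa.eu/resource/authority/language/",
  restype: "http://publications.europa.eu/resource/authority/resource-type/",
};

// Hypothetical helper: turns "prefix:local" into a full IRI by concatenating
// the namespace bound to the prefix with the local name.
function expandPrefixedName(name, prefixes = PREFIXES) {
  const colon = name.indexOf(":");
  if (colon < 0) {
    throw new Error(`Not a prefixed name: ${name}`);
  }
  const prefix = name.slice(0, colon);
  const local = name.slice(colon + 1);
  if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) {
    throw new Error(`Unknown prefix: ${prefix}`);
  }
  return prefixes[prefix] + local;
}

console.log(expandPrefixedName("restype:CONS_TEXT"));
console.log(expandPrefixedName("lh:changeSet"));
```

In a real document this expansion is performed by the Turtle or RDFa processor from the prefix declarations; the sketch only makes the mapping explicit.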
Each of the acting entities of a document can be represented by means of the lh:actingEntity property.
A document (e.g. a decision) can be addressed to specific entities, which can be represented by means of the lh:addressee property.
The typology of the document, specified with respect to a jurisdiction-specific repertoire, is best represented in a meta element inside the head of the document, making it readily available to HTML processors without the need for extracting the information as RDF.
The entities that originated a document can be annotated individually with the lh:issuer property.
The lh:legislature property indicates, by means of an explicit reference to an authority resource, the legislative term of a given body under which a certain act has been issued.
The language in which a document was originally drafted can be represented by means of the lh:originalLanguage property. The value has to be taken from an adopted reference set of language IRIs.
The unique reference for a procedure can be represented by means of the lh:procedureID property, with the value taken from the content text.
The procedure stage can be reported, in a document, by using full wording or a dedicated code. In ontology data, the property lh:procedureStage should be used to represent such information, possibly normalizing the value to a formal code, especially in the case of full wording.
Each of the entities that have proposed a document can be represented by means of the lh:proposingEntity property.
A Reference is an identifier assigned to a document by an entity; it can be expressed by means of the lh:reference property.
Since the reference is usually expressed as a string, there is no need for normalization (though it is permitted to provide one) and its value can be implicitly assumed to be the annotated portion of the content text, as in the following example:
The lh:reference property can be used to indicate that the subject matter of an act is of (any sort of) relevance for a certain body.
In the following example, the RDFa annotation expresses the fact that the subject matter of the act is governed by the Agreement on the European Economic Area (EEA).
The status of a document can be represented by means of the lh:status property.
The title of the document can be expressed through an RDFa annotation. The advantage of such an annotation is that it can be added to different structural elements of the document, as in the following case of Council internal document 7940/19, where the title of the document is not present in a proper title structure (which is described later in this document).
The title section of an act, which is represented by the customized built-in element lh-title,
includes according to the [OJ-STYLE-GUIDE]:
The aforementioned components of a title are represented as different h1 elements. Text-level
semantic elements can be used to further annotate the content: in the example, a time element is used to contain the canonical value of a date in [ISO8601] format.
Please note the custom element lh-effective-title, denoting the portion of the title that specifically addresses the subject of the document and that is never rephrased when citing the document. Indeed, in amendments and references to an act, the full title is often not cited “verbatim”.
A legal act can be broken down into a number of subdivisions that account for distinctions above the basic unit (e.g., the article), which could be, depending on the legal tradition, parts, titles, chapters or sections.
These subdivisions are represented as custom built-in elements section/lh-section with the attribute data-type indicating the subdivision type, as one of: part, title, chapter or section. As already mentioned, it is out of the scope of LegalHTML to define a controlled list of such levels, let alone constrain how they can be nested.
The custom built-in element must contain a nested section/lh-sectionheader, which in turn contains the custom built-in elements h2/lh-section-number (with the attribute data-value for the normalized number of the section) and h3/lh-section-title.
The preamble section of an act, which is represented by the section/lh-preamble customized built-in element, covers according to the [OJ-STYLE-GUIDE] the content between the title and the enacting terms,
consisting of an introductory element (represented by a div/lh-preamble-init customized built-in element, usually conveying the acting entities), citations, recitals and a final clause (represented by a div/lh-preamble-final customized built-in element, usually containing the solemn form that introduces the enacting terms).
The citations section may refer, according to the [OJ-STYLE-GUIDE], to the legal basis of the act, preparatory acts and,
in legislative acts, the transmission of the document
and the procedure followed. These are represented by customized built-in elements div/lh-citation enclosed within a customized built-in element div/lh-citations.
In the following example, there is a semantic annotation for the mention of an article in the TFEU that sets the legal basis for the adoption of the act. The nature of the reference is represented with the property eli:based_on from the [ELI] ontology, while the target is identified using the URI provided by the OP implementation of ELI.
The recitals briefly discuss the reasons for the content of the enacting terms. These are grouped into a customized built-in element div/lh-recitals, which in turn contains:
The customized built-in element div/lh-recital-init contains the introduction to the recitals section, which is
usually a phrase like "whereas". Subsequently, the customized built-in element ol/lh-recital-list contains the actual list of recitals. Each recital gets its own li element, which in turn contains:
The enacting terms section of an act, which is represented by the customized built-in element section/lh-enacting-terms, contains according to the [OJ-STYLE-GUIDE] the normative content of the act, articulated into articles, (numbered) paragraphs, points and other
minor subdivisions.
An article is represented with a standard article element, which is used solely for that purpose in LegalHTML.
An article must contain a header that contains:
Paragraphs are represented through the built-in element div customized as lh-paragraph.
LegalHTML allows for a flexible representation of the textual content of a paragraph. If the paragraph is not structured, its content can be placed as text directly inside the paragraph element. If there is a structure, then <p> elements should be used for the simple text content.
Numbered paragraphs are enclosed in a customized built-in element ol/lh-paragraphs.
Each paragraph is then associated with a distinct li element containing:
Points express an enumeration, which can be numbered or not.
p element to represent it.
Numbered points are represented using a customized built-in element ol/lh-numpoints. Avoiding generated text, each li child must contain a customized built-in element span/lh-number annotating the point number (whose normalized value is in the attribute data-value) and a customized built-in element span/lh-point annotating the actual content of the point.
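The structure described above for numbered points can be sketched as a small function that serializes a list of points. This is only an illustration: renderNumberedPoints and escapeHtml are hypothetical helpers, and the use of the HTML is attribute to instantiate the customized built-in elements is an assumption based on how customized built-in elements are declared in [HTML], not something taken from the LegalHTML implementation.

```javascript
// Minimal HTML escaping for text nodes and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper. Each point is { label, value, text }:
// the number as printed, its normalized value, and the content of the point.
// The number is written explicitly in the markup (no generated text).
function renderNumberedPoints(points) {
  const items = points.map(
    (p) =>
      `  <li>` +
      `<span is="lh-number" data-value="${escapeHtml(p.value)}">${escapeHtml(p.label)}</span> ` +
      `<span is="lh-point">${escapeHtml(p.text)}</span>` +
      `</li>`
  );
  return [`<ol is="lh-numpoints">`, ...items, `</ol>`].join("\n");
}

console.log(
  renderNumberedPoints([
    { label: "(a)", value: "a", text: "first point;" },
    { label: "(b)", value: "b", text: "second point." },
  ])
);
```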
Bullet points are represented using a customized built-in element ol/lh-bullpoints. Avoiding generated text, each li child must contain a customized built-in element span/lh-bullet annotating the marker and a customized built-in element span/lh-point annotating the actual content of the point.
LegalHTML provides a few customized built-in elements to structure the content of captions, for example those associated with figures.
LegalHTML provides the customized built-in element div/lh-formula to represent (mathematical) formulas. For the content, MathML is supported, which is the official language for representing mathematical formulas in HTML, but other languages (e.g. LaTeX) can be supported through dedicated scripting in future versions of LegalHTML.
The code above using MathML to represent a mathematical formula results in the following rendering in the browser:
According to the [OJ-STYLE-GUIDE], the concluding formulas, which are represented by an
lh-concluding-formulas custom element, report the date and place an act has been signed as well as
the signatory.
Each signature is represented in LegalHTML through the customized built-in element div/lh-signature.
Each signatory can be semantically annotated for the organization they represent (property from the LegalHTML ontology: lh:signatoryOrganization), their role in the organization (lh:signatoryRole) and the signatory themself (lh:signatory).
An active modification is a portion of text describing an amendment performed on another act.
The whole text of the amendment is wrapped by a customized built-in element div/lh-placedate.
The portions of text that are to be inserted, replaced, etc. can be cited explicitly through the blockquote or q elements. It is worth mentioning that:
lh-amendment covers the whole amending paragraph
blockquote is used for quotes of structures that may range from a simple element to structures of several nested tags. As an alternative, q can be used for pure replacing/replaced text, e.g. to say that “dog must be replaced by fog”, both dog and fog would be covered by distinct q elements
blockquote and q must possess the attributes data-start-quote and data-end-quote specifying the characters used in the article to surround the active modification
Multiple amendments may be introduced by sentences qualifying part of the information (e.g. part of the reference) that is relevant for all of the amendments. In this case, the div/lh-amendment-group element is adopted as an umbrella over the whole set of amendments. Furthermore, this element is a non-terminal one: multiple lh-amendment-group tags can be nested to introduce progressively more specific information about progressively smaller numbers of amendments, as in the following example:
The lh-doppelganger element
Please note in this last example the use of the autonomous custom element lh-doppelganger. This element is meant to replace a specific [HTML] element when the latter cannot be placed in a given position. For instance, in the above example, the root of the blockquote should be a li element. However, [HTML] only allows li to appear under an ol or ul element. The lh-doppelganger thus acts as a sort of escape for the element, replacing it and denoting the type of the replaced element through the data-target attribute. If the replaced element carries further attributes, these are replaced by attributes named data-attr-<originalattributename>.
E.g. if the element tag:
<td colspan="2">
is to be replaced, the following would appear:
<lh-doppelganger data-target="td" data-attr-colspan="2">
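The renaming rule just described is mechanical, so it can be sketched as a small function. The helper name toDoppelgangerStartTag and its plain-object input are assumptions made for this illustration; the LegalHTML implementation may handle the replacement differently.

```javascript
// Escape characters that would break a double-quoted attribute value.
function escapeAttribute(value) {
  return String(value).replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Hypothetical helper: given the tag name and attributes of an element that
// cannot appear in a given position, build the start tag of the
// lh-doppelganger that stands in for it.
// - the original tag name goes into data-target
// - each original attribute "name" becomes "data-attr-name"
function toDoppelgangerStartTag(tagName, attributes = {}) {
  const parts = [`data-target="${escapeAttribute(tagName)}"`];
  for (const [name, value] of Object.entries(attributes)) {
    parts.push(`data-attr-${name}="${escapeAttribute(value)}"`);
  }
  return `<lh-doppelganger ${parts.join(" ")}>`;
}

console.log(toDoppelgangerStartTag("td", { colspan: "2" }));
// prints: <lh-doppelganger data-target="td" data-attr-colspan="2">
```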
The notion of direct applicability is specific to European Union legislation, in which certain acts (e.g., regulations and decisions with an addressee) are legally binding without the need for implementing laws in the Member States. Such acts contain a special formula after the last article, which can be represented as a customized built-in element section/lh-direct-applicability.
The enclosed content of this section can be annotated in more detail through the lh:applicability property. The value of the property should be taken from the Corporate body authority table of the OP. In the following case, the corpbody:EUMS resource corresponds to all EU Member States, as expressed in the text of the direct applicability of the act.
An act can contain other documents that support it, complement it or are the subject of it. This section describes the various sectioning elements available in LegalHTML for such embedded documents. In general, all these elements are represented by customized versions of the built-in element section.
In the context of ordinary legislative procedures, a body's position on a certain subject can be consolidated over time. The result of this consolidation can be described in a dedicated section with section/lh-consolidated-text.
The element can be further decorated with RDFa code expressing the nature of the consolidated text.
In the following example, the eli:type_document property and the restype:CONS_TEXT value from the EU authority table Resource type are used to describe the consolidated text.
The customized built-in element section/lh-legislative-resolution can be used to express the position of a body on a given draft legislative act.
The element can be further decorated with RDFa code expressing the nature of the legislative resolution.
In the following example, the eli:type_document property and the restype:RES_LEGIS value from the EU authority table Resource type are used to describe the document as a legislative resolution.
The act that is the subject of a proposal can be embedded into the latter through the customized built-in element section/lh-proposed-act.
The element can be further decorated with RDFa code better qualifying the type of act.
In the following example, the eli:type_document property and the restype:DEC value from the EU authority table Resource type are used to describe a Decision proposed in a "Proposal for a Decision".
The act that is the subject of an enclosing legal document (e.g. a position) can be embedded into the latter through the customized built-in element section/lh-target-act.
The element can be further decorated with RDFa code better qualifying the type of act.
In the following example, the eli:type_document property and the restype:DEC value from the EU authority table Resource type are used to describe a Decision on which the EU Council has expressed its position.
A solution for general cases of documents meant to support or provide a rationale for the main act is provided by section/lh-accompanying-document.
The element can be further decorated with RDFa code better qualifying the type of accompanying document and thus its role in the main act.
In the following example, the eli:type_document property and the restype:STAT_REASON value from the EU authority table Resource type are used to describe a document providing the reasons supporting a Council's position.
The marker with the broadest semantics for embedded documents is represented by section/lh-embedded-document.
As for accompanying documents, this element can be further decorated with RDFa code better qualifying the type of embedded document and thus its role in the main act.
An annex contains, according to the [OJ-STYLE-GUIDE], rules and technical data that do not appear in
the enacting terms. Annexes are represented as a customized section typed lh-annex,
which is not constrained in its content. Indeed, annexes usually comprise tables and lists, but they might also
contain figures, chemical formulas, and other specialized data. For this reason, LegalHTML grants a high
degree of freedom in writing annexes, which might benefit from other Web standards with native support in
browsers or via JavaScript polyfills:
the mhchem extension of [MATH-JAX] implements the chemical equation macros of the LaTeX
mhchem package
An lh-annex must contain a header that contains:
Footnotes are represented in a footer as an lh-footnote custom element.
References to a footnote are written in the document as a custom a element in [HTML] named
lh-footnote-ref.
The href attribute must be a relative [URL] utilizing a fragment reference to the id of the
target footnote.
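The constraint on the href attribute lends itself to a simple consistency check. The sketch below is an illustration only: footnoteIdFromHref and danglingFootnoteRefs are hypothetical helpers that work on plain strings, and the sketch assumes that footnote ids need no percent-decoding.

```javascript
// Hypothetical helper: returns the footnote id addressed by href, or null if
// href is an absolute URL (it starts with a scheme such as "https:") or
// carries no fragment.
function footnoteIdFromHref(href) {
  if (/^[a-zA-Z][a-zA-Z0-9+.-]*:/.test(href)) {
    return null;
  }
  const hash = href.indexOf("#");
  if (hash < 0 || hash === href.length - 1) {
    return null;
  }
  return href.slice(hash + 1);
}

// Hypothetical helper: lists the hrefs that do not resolve to a known footnote id.
function danglingFootnoteRefs(hrefs, footnoteIds) {
  const known = new Set(footnoteIds);
  return hrefs.filter((href) => {
    const id = footnoteIdFromHref(href);
    return id === null || !known.has(id);
  });
}

console.log(danglingFootnoteRefs(["#fn1", "#fn9"], ["fn1", "fn2"]));
```

In a browser, the two input lists would be collected from the lh-footnote-ref anchors and the lh-footnote elements of the document.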
LegalHTML embraces the single-document, multiple-version approach to consolidation that is being developed in [AKN4EU]: a legal act together with any subsequent amendments, including indirect ones, is consolidated into a single document that represents a superposition of different versions, while part of the default behavior of a LegalHTML document is to materialize each of them as a different view of the act, depending on factors including the applicability of each change.
At the structural level, different versions of the same section or internal part thereof are
wrapped in different lh-version custom elements, which are then grouped into an lh-cons
custom element.
The information needed to select the right version (e.g. of the effective title, in the example above) is stored in RDF as part of the metadata embedded in the document.
Inside an lh-version custom element, it is possible to find the same structure of consolidated
versions, if there are later changes to parts of that version.
The default presentation of a LegalHTML document borrows a lot from the look and feel of [EUR-LEX].
A document in LegalHTML also has a default behavior that addresses different concerns:
LegalHTML adds the special property consleg to the document object upon load. For a
consolidated resource, this property allows for switching to a view of the document as it was in force on a
given date.
The consleg property hosts a reasoner able to render the document according to different provision requirements. The current API allows for switching to the different versions in force at the requested date of provision. However, since the consolidation model of this specification and the LegalHTML ontology foresee the representation of events related to the publication and applicability of legal acts and of their modifications, further evolutions of the API provided by this property are possible without requiring changes to the LegalHTML language. For instance, it could be relevant to have an API for rendering a document considering all modifications that are not merely in force but actually applicable at the requested date of provision. Complex combinations can also be implemented, e.g. considering all modifications that are not only effective at the date of provision, but have also been determined by modifying acts published before that date. This can be relevant for determining, in disputes, the good faith of someone who, at a certain point in time, acted unaware of a law that, though retroactive, was published after that point in time.
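The core of a point-in-time view can be sketched as a selection among the versions grouped in one lh-cons element. The sketch below does not use or describe the consleg API: versionInForce is a hypothetical function, the version objects are a simplified stand-in for the RDF change sets, the dates are invented for illustration, and only the date of entry into force is considered (not end of force, publication or applicability).

```javascript
// Hypothetical data model: each version has an id (that of its lh-version
// element) and the ISO 8601 date on which it entered into force.
// ISO dates in YYYY-MM-DD form can be compared as plain strings.
function versionInForce(versions, date) {
  let selected = null;
  for (const version of versions) {
    if (version.entryIntoForce > date) {
      continue; // not yet in force at the requested date
    }
    if (selected === null || version.entryIntoForce > selected.entryIntoForce) {
      selected = version; // keep the most recent version already in force
    }
  }
  return selected;
}

// Illustrative values only.
const titleVersions = [
  { id: "title-v1", entryIntoForce: "2008-07-01" },
  { id: "title-v2", entryIntoForce: "2012-08-10" },
];

console.log(versionInForce(titleVersions, "2010-01-01"));
console.log(versionInForce(titleVersions, "2012-08-10"));
console.log(versionInForce(titleVersions, "2000-01-01"));
```

A renderer would then show the selected lh-version and hide its siblings, and would repeat the selection recursively for any lh-cons nested inside the chosen version.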
As clearly expressed in the introduction and in the following sections, LegalHTML has not been conceived for a specific legal tradition. Nonetheless, its initial development has been funded and supported by an effort of the EU Commission to simplify the editorial and publication workflow of legal documents.
In the context of this effort, LegalHTML is guaranteed to cover and properly represent all legal documents to be published in the Official Journal of the EU, by following the [OJ-STYLE-GUIDE] and by testing the developed specifications against a reference corpus of documents from the Official Journal.
The following list reports all document types that have informed the development of LegalHTML:
legal acts
other legal documents
act proposals
embedded documents; these documents exist only attached to other legal documents
The general structure of a legal act consists of the following sections:
Regulations also specify the direct applicability between the enacting terms and the concluding formulas.
Some recurring patterns in the structure of documents that are not acts include:
All documents that are not acts usually open with several metadata entries, such as:
acting entity (lh-actingEntity property)
reference (lh-reference)
All proposals for an act (i.e. decision, directive, regulation) feature:
eli:type_document set to restype:EXPL_MEMORANDUM
eli:type_document set to restype:STAT_REASON
lh-legislature property
two embedded documents: