Institute of Legal Information Theory and Techniques (ITTIG)

Research Project: JurWordNet, law semantic lexicon

JurWordNet

Like the other WordNets, JurWordNet is a semantic lexicon, that is, a lexicon in which the meaning of each term is defined and made explicit so that computer programs can process it.

The synsets that make up the net represent concepts: each synset groups all the terms that express the same concept (house, home, dwelling, domicile…), linked by a semantic relation close to synonymy. In effect, synsets are classes of words that share a meaning and represent either a concept (a class) or an individual (an instance).
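A minimal sketch of how a synset could be represented in code. The `Synset` class, its field names, and the identifier scheme are assumptions made for illustration; they are not the data model actually used by JurWordNet.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """A set of terms that express the same concept (illustrative only)."""
    synset_id: str             # hypothetical identifier scheme
    terms: frozenset[str]      # variants linked by near-synonymy
    is_instance: bool = False  # True for an individual, False for a class


def find_synsets(term: str, synsets: list[Synset]) -> list[Synset]:
    """Return every synset containing the term (several if it is polysemous)."""
    needle = term.lower()
    return [s for s in synsets if needle in s.terms]


dwelling = Synset("syn-001", frozenset({"house", "home", "dwelling", "domicile"}))
president = Synset("syn-002", frozenset({"president of the republic"}), is_instance=True)
lexicon = [dwelling, president]
```

Looking up any variant, for example `find_synsets("Home", lexicon)`, returns the same synset, which is what lets a program treat the variants as one concept.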

JurWordNet is a terminological lexicon, linked as a specialized resource to the generic resource for common Italian (IWN). Besides vertical taxonomic relations, the synsets of the legal lexicon also have horizontal associative relations, whereas relations of semantic equivalence (variants) are few: this is typical of terminological lexicons, which abound in technical terms and where synonyms are rare. Conversely, it is important to create equivalence relations between legal language and common Italian (links between the legal wordnet and the generic wordnet), to compensate for the imprecision of non-experts who search for legal information using everyday terms instead of legal terminology. In a terminological lexicon, the disambiguation of polysemy should be understood broadly, as also covering the distinction between a term's common meaning and its technical one. Legal language has very few terms that denote instances rather than classes (President of the Republic, Finance Minister, Head of Government, …).
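The role of the links between the generic and the legal wordnet can be sketched as a query-expansion step. The mapping below is hypothetical: the everyday terms and their pairing with legal terms are invented for illustration, and `expand_query` is not part of any real JurWordNet interface.

```python
# Hypothetical equivalence links: everyday term -> legal terms.
COMMON_TO_LEGAL = {
    "rent": ["lease contract"],
    "fake contract": ["sham contract"],
}


def expand_query(terms):
    """Add the linked legal terms to a non-expert query, keeping the original order."""
    expanded = []
    for term in terms:
        key = term.lower()
        # Keep the user's own term, then append any linked legal terminology.
        for candidate in [key, *COMMON_TO_LEGAL.get(key, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded
```

A query such as `expand_query(["rent", "dispute"])` then also searches for "lease contract", while terms without a link pass through unchanged.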

Consistent with all WordNet projects, the development methodology favors the reuse and harmonization of existing lexical resources. Relevant concepts were identified bottom-up, taking terms from the queries submitted to legal information systems. In particular, the query lists of the Italgiure/Find system, the largest Italian legal information system, developed by the Court of Cassation, yielded:

    • the Semi database, 11,00 keywords and the lemmas conceptually connected to them;
    • the list of terms that users combine with AND, from which the list of syntagms is derived, a set of about 13,000 two-word expressions;
    • the list of terms that users combine with OR, the so-called analogical chains. Analogical chains are made up of synonyms, or of terms that, in at least a certain number of searches, the majority of users treated as interchangeable.
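As a rough illustration of how analogical chains might be extracted from OR queries, the sketch below counts how often two terms appear together in an OR group and merges the pairs that reach a threshold. The function name, the `min_support` threshold, and the sample queries are assumptions; a plain count is used here as a stand-in for the "majority of users" criterion, whose exact definition the source does not give.

```python
from collections import Counter
from itertools import combinations


def analogical_chains(or_queries, min_support=2):
    """Group terms that users repeatedly combined with OR (illustrative only)."""
    pair_counts = Counter()
    for query in or_queries:
        for a, b in combinations(sorted(set(query)), 2):
            pair_counts[(a, b)] += 1

    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge only the pairs seen together often enough.
    for (a, b), count in pair_counts.items():
        if count >= min_support:
            parent[find(a)] = find(b)

    chains = {}
    for term in list(parent):
        chains.setdefault(find(term), set()).add(term)
    return sorted(sorted(chain) for chain in chains.values())


# Hypothetical query log: each inner list is one OR group typed by a user.
sample_queries = [
    ["lease", "tenancy"],
    ["tenancy", "lease"],
    ["lease", "loan"],
    ["home", "domicile", "dwelling"],
    ["dwelling", "home"],
]
```

With this sample log, "lease" and "tenancy" form a chain, as do "dwelling" and "home", while pairs seen only once are left out.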

From the syntagms, the taxonomy was created automatically by attaching each syntagm to its main term; the top levels of the resulting trees were derived semi-automatically from dictionary glosses. The horizontal connections between concepts and the disambiguation of meanings were done by hand. There is now a reasonably consolidated corpus of about 2,000 synsets, which will be extended largely automatically by linking it to the thesauri and descriptors of legal databases.
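The automatic first pass can be pictured as grouping two-word expressions under their main term. This is only a sketch: `build_taxonomy` is a hypothetical helper, and taking the last word as the main term is an assumption that suits the English examples (in Italian the main term would usually come first).

```python
from collections import defaultdict


def build_taxonomy(syntagms, head_of=lambda s: s.split()[-1]):
    """Group syntagms under their main term as candidate hyponyms."""
    taxonomy = defaultdict(list)
    for syntagm in syntagms:
        taxonomy[head_of(syntagm)].append(syntagm)
    return dict(taxonomy)


tree = build_taxonomy(["lease contract", "sham contract", "valid contract"])
```

Here `tree` maps "contract" to its three candidate hyponyms. A grouping like this only proposes links; checking them and adding the horizontal relations remains manual work.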

Syntagms are handled according to practical criteria of search efficiency: they are normally treated as hyponyms of their main term, but when that link is not relevant, the syntagm is connected to other synsets through a different semantic relation. For instance, hearing minutes is connected not to minutes but to hearing, through a role relation (specifically, Means – to). Nor are all hyponyms strict subclasses of the higher term: alongside lease contract as a subclass of contract, we also find valid contract, sham contract, …
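The rule described above (hyponym of the main term by default, a different relation when an editor decides the default is not relevant) could be sketched as a lookup with manual overrides. The `Relation` type, the relation labels, and the `OVERRIDES` table are hypothetical names chosen for this example, and the last-word heuristic again assumes English word order.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    source: str
    relation: str
    target: str


# Hypothetical table of manual decisions that replace the default link.
OVERRIDES = {
    "hearing minutes": Relation("hearing minutes", "role:means-to", "hearing"),
}


def relate(syntagm: str) -> Relation:
    """Return the manual relation if one exists, else a default hyponym link."""
    if syntagm in OVERRIDES:
        return OVERRIDES[syntagm]
    main_term = syntagm.split()[-1]  # assumption: last word is the main term
    return Relation(syntagm, "hyponym-of", main_term)
```

Under these assumptions, `relate("lease contract")` yields a hyponym link to "contract", while `relate("hearing minutes")` yields the role relation to "hearing".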

Disambiguation of polysemies: linguistic and ontological levels

The Role of Ontology

From lexicon level to ontological level

Connection between legal and generic WordNet

Using JurWordNet
