In the course of our R&D activities, and as instrumental assets for the execution of our projects, we have developed or are developing the following tools and resources:
Computational grammar for deep linguistic processing of Portuguese. Developed within the DELPH-IN consortium.
Named Entity Recognizer.
Fully fledged automatic verbal conjugator for Portuguese, including all forms of clitic conjugation.
Fully fledged automatic verbal lemmatizer for Portuguese.
Fully fledged automatic nominal lemmatizer for Portuguese.
Full coverage tables with rules and exceptions for Portuguese verbal and nominal inflection.
The three tools below are available online as the LX-Suite.
Automatic part of speech tagger for Portuguese.
Automatic segmenter of lexemes of Portuguese.
Automatic segmenter of paragraphs and sentences of Portuguese.
LX-Closed Classes Lexicon
Full coverage lexicon for the Portuguese POS closed classes.
LX-Parser is a freely available online service for constituency parsing of Portuguese sentences.
CINTIL-Treebank Online Searcher is a freely available online service to search and view the parse and dependency trees of the CINTIL-Treebank.
MWNPT-International WordNet of Portuguese
WordNet of Portuguese with ca. 16 500 concepts and 21 000 word senses (May 2008). Developed in cooperation with the MultiWordnet project of ITC-Irst from Trento, Italy.
CINTIL - Corpus Internacional do Português
High quality, linguistically interpreted corpus of 1 million tokens, accurately hand tagged with respect to POS, inflection and NER. To be distributed soon via ELRA. Developed and maintained in cooperation with CLUL-Centro de Linguística da Universidade de Lisboa.
Advanced, freely available online concordancer for the CINTIL corpus. Developed and maintained in cooperation with CLUL-Centro de Linguística da Universidade de Lisboa.
Exhaustive set of part of speech tags for Portuguese, including coverage of transcriptions of verbal productions. This is the tagset used in the annotation of the CINTIL corpus. It is also the tagset assumed for the operation of the tools LX-Tagger, LX-Inflector, LX-Conjugator and LX-Lemmatizer.
CINTIL Annotation Manual
The companion manual of the CINTIL corpus, with explicit guidelines for annotation/interpretation.
The CINTIL-Treebank is a corpus of syntactic constituency trees, composed of sentences taken from the CINTIL-International Corpus of Portuguese.
Corpus with the transcriptions of syllogistic reasoning protocols.