Friday seminar: Éric Laporte

When:
2 October, 2009 @ 14:15 – 16:00
Where:
Room HF-216

Éric Laporte
Université Paris-Est – Laboratoire d’Informatique Gaspard-Monge (LIGM):

Lexicons and grammars for language processing: industrial or handcrafted products?

Summary

In recent years, the use of language resources for language processing (semantic ambiguity resolution, translation…) has increased steadily. A few years ago, nearly all the language resources used for this purpose were collections of texts such as the Brown Corpus and the Penn Treebank, but the use of electronic lexicons (WordNet, FrameNet, VerbNet, ComLex, Lexicon-Grammar…) and formal grammars (TAG…) has developed more recently. This development is slow because most of the work of constructing lexicons and grammars is manual, whereas the construction of corpora has always been highly automated.
However, more and more specialists in language processing realize that the information content of lexicons and grammars is richer than that of corpora, and hence that the former make more elaborate processing possible. The difference in construction time is likely to be connected with the difference in information content: handcrafting by linguists would make lexicons and grammars more informative than automatically generated data.
This situation can evolve in two directions: either specialists in language technology gradually get used to handling manually constructed resources, which are more informative and more complex, or the construction of lexicons and grammars is automated and industrialized, which is the mainstream perspective. Both developments are already in progress, and there is a tension between them. The relation between linguists and computer scientists depends on how these developments unfold, since the first implies training and hiring numerous linguists, whereas the second depends essentially on solutions devised by computer engineers.
The aim of this talk is to analyse practical examples of the language resources in question, and to discuss which of the two trends, handcrafting or industrial generation, or a combination of both, can give the best results or is the most realistic.
