Back to the Roots of Genres

Text Classification by Language Function

Authors: Henning Wachsmuth, Kathrin Bujna
Abstract: The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.
External organisation(s): Universität Paderborn
Type: Paper in conference proceedings
Pages: 632-640
Number of pages: 9
Publication date: November 2011
Publication status: Published
ASJC Scopus subject areas: Language and Linguistics, Artificial Intelligence, Software, Linguistics and Language
Electronic version(s): https://aclanthology.org/I11-1071.pdf (Access: Open)

BibTeX

@inproceedings{41358c80f81f46a6a6af36f02ec78b04,
title = "Back to the Roots of Genres: Text Classification by Language Function",
abstract = "The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.",
author = "Henning Wachsmuth and Kathrin Bujna",
note = "Funding Information: This work was partly funded by the German Federal Ministry of Education and Research (BMBF) under contract number 01IS08007A.; 5th International Joint Conference on Natural Language Processing, IJCNLP 2011 ; Conference date: 08-11-2011 Through 13-11-2011",
year = "2011",
month = nov,
language = "English",
pages = "632--640",
editor = "Haifeng Wang and David Yarowsky",
booktitle = "Proceedings of the 5th International Joint Conference on Natural Language Processing",
publisher = "Association for Computational Linguistics (ACL)",
}
