Automated text categorization in a dead language. The detection of genres in Late Egyptian
Gohy, Stéphanie ; Martin Leon, Benjamin ; Polis, Stéphane
in Polis, Stéphane; Winand, Jean (Eds.) Texts, Languages & Information Technology in Egyptology. Selected papers from the meeting of the Computer Working Group of the International Association of Egyptologists (Informatique & Égyptologie), Liège, 6-8 July 2010 (2013)
This paper is a first step in applying machine learning methods typical of Automated Text Categorization (ATC) for Automatic Genre Identification (AGI) in Late Egyptian, a language written in either hieroglyphic or hieratic scripts that is found in documents from Ancient Egypt dating from ca. 1350-700 BCE. The study is divided into three parts. After a general introduction on AGI (§1), we introduce the levels of annotation that are integrated in the Ramses corpus and can be used when performing AGI on Late Egyptian (§2). In the following section (§3) we offer a brief survey of the types of features that have been discussed in the literature on AGI, before proceeding with three case studies where we apply supervised machine learning methods, namely the naïve Bayes classifier (§4.1), the Support Vector Machine (§4.2), and the Segment and Combine approach (§4.3), to a selection of texts in the corpus. Their respective performances are tested using lexical, part-of-speech and inflectional features.

Détection automatique des textes épistolaires du corpus néo-égyptien : méthodes exploitant la récurrence de motifs discriminants
Gohy, Stéphanie ; Martin Leon, Benjamin
in Purnelle, Gérald; Longrée, Dominique; Dister, Anne (Eds.) Actes des 11es Journées internationales d'Analyse statistique des Données Textuelles (2012, June 15)

Détection automatique des textes épistolaires du corpus néo-égyptien : méthodes exploitant la récurrence de motifs discriminants
Gohy, Stéphanie ; Martin Leon, Benjamin
in Dister, Anne; Longrée, Dominique; Purnelle, Gérald (Eds.) Actes des 11es Journées internationales d'Analyse statistique des Données Textuelles (2012)
In this paper, we develop two methods for the automatic detection of the Late Egyptian epistolary genre. Among the criteria that can be used to identify different genres within a corpus, the study of "motifs" ("patterns") is a particularly promising approach that has already been successfully exploited for a corpus of Latin texts. We suggest applying this process to the Late Egyptian corpus, and more particularly to the epistolary genre. Two methods are applied to our corpus to identify whether or not particular documents belong to the epistolary genre. We begin by explaining how these two methods work. The results obtained are then analyzed, and we try to understand why certain documents were improperly classified.

Identification of 'Textsorten' in the Late Egyptian Corpus
Gohy, Stéphanie ; Martin Leon, Benjamin
in Winand, Jean; Polis, Stéphane (Eds.) Texts, Languages & Information Technology in Egyptology. Selected papers from the meeting of the Computer Working Group of the International Association of Egyptologists (Informatique & Égyptologie), Liège, 6-8 July 2010 (2010, July 08)

Classification automatique de textes néo-égyptiens selon leur genre littéraire
Martin Leon, Benjamin
Master's dissertation (2010)

Projet « Ramsès » : Réalisation d'une bibliothèque de traitement à états finis
Martin Leon, Benjamin
Master's dissertation (2009)