Building the Valency Lexicon of Arabic Verbs

Viktor Bielický, Otakar Smrž
Institute of Formal and Applied Linguistics, Charles University in Prague
Malostranské náměstí 25, Prague 1, 118 00, Czech Republic
padt@ufal.mff.cuni.cz

Abstract

This paper describes the building of a valency lexicon of Arabic verbs using a morphologically and syntactically annotated corpus, the Prague Arabic Dependency Treebank, as its primary source. We present the theoretical account of valency developed within the Functional Generative Description theory. We apply the framework to Arabic and discuss various valency-related phenomena with respect to examples from the corpus. We then outline the methodology and the linguistic and technical resources used in the building of the lexicon. Valency lexicons can find application in automatic parsing as well as in language generation.

1. Introduction

Valency of a lexical unit, in particular a verb, is a set of its obligatory and/or optional arguments potentially or actually realized in an utterance. Valency information is useful in restoring the syntactic structure of an utterance, and has consequences for the study of the meaning.

The goal of this paper is to prepare the theoretical (Sections 2 and 3) and methodological (Sections 4 and 5) background for creating the valency lexicon of the most frequent Arabic verbs, exploiting various resources of information. Our approach is inspired by the VALLEX lexicon of Czech verbs (Lopatková et al., 2006, 2008) and its treebank-oriented twin project, the PDT-VALLEX (Hajič et al., 2003, 2006).

In our case, we focus on Modern Standard Arabic (MSA) and take as reference the Prague Arabic Dependency Treebank (PADT). It provides refined linguistic annotations whose multi-level description scheme discerns functional morphology, analytical dependency syntax, and tectogrammatical representation of linguistic meaning. The current, largely extended version of PADT (cf. Hajič et al., 2004; Smrž, 2007) covers over one million words of text.

    Actant   Meaning     Example
    ACT      Actor       Peter read a letter.
    ADDR     Addressee   Peter gave Mary a book.
    PAT      Patient     I saw him.
    EFF      Effect      We made her the secretary.
    ORIG     Origin      She made a cake from apples.

Table 1: Types of actants (inner participants) illustrated on English sentences (Lopatková et al., 2006: xvi).

    Adjunct  Meaning
    MANN     Manner
    MEANS    Means
    LOC      Location
    DIR1     Direction from
    DIR3     Direction to
    TWHEN    Time when
    THO      Time how often

Table 2: Types of adjunct (free modifications) appearing in this paper. For the complete list, cf. (Mikulová et al., 2006).

2. Theory of Valency in FGD

Before we focus on some issues concerning verbal valency in Arabic and our proposed methodology for creating the valency lexicon, let us briefly outline the theoretical framework we have adopted. The Functional Generative Description (FGD) theory, which has been elaborated since the 1960s (in particular in Sgall et al., 1986; Hajičová and Sgall, 2003), is a multi-stratal dependency-oriented description of language. The valency theory of verbs has been thoroughly researched within the framework of FGD since the 1970s (Panevová, 1974, 1975, 1994; Lopatková and Panevová, 2005). The question of valency is closely associated with the underlying tectogrammatical level of language description representing the meaning of the discourse.

According to the valency theory of FGD, the valency information of a given verb is defined by the valency frame (the sequence of frame slots), which is filled by a specific number of various valency complements, i.e. a variety of either required or specifically permitted syntactic units dependent on a verb. Each verb has at least one valency frame. The exact number of valency frames depends on the number of different meanings of the particular verb. For expressing relations between a verb and its complements, FGD uses various functors. These functors are divided into actants (inner participants, arguments) and free (adverbial) modifications (adjuncts). There are five actants in total (for examples in English, see Table 1):

    ACTor: usually the agent (the surface subject) or the bearer of some property/quality;
    PATient: the goal/target or the object affected by the action, with consequences for its morphemic representation (the case in inflectional languages) brought about by verbal government (usually the direct object of transitive verbs);
    ADDRessee: usually the indirect object on the surface;
    ORIGin: this participant is probably never obligatory;
    EFFect: usually the second (inanimate) object, the predicative complement or the adverbial of result.

The actants have to fulfill two conditions. The first is that the set of actants is characteristic for a particular verb; in other words, not every actant can depend on every verb. The second is that every actant can occur only once as a complement of the given verb, disregarding coordination or apposition.

In contrast, there are different kinds of free modifications denoting various types of adverbial complementation (e.g. time, location, direction, manner, aim, cause, regard, accompaniment). These free modifications can appear more than once with a single verb and theoretically can modify any verb. This means that they are not restricted to a certain group of verbs, as is the case with actants. For examples of free modifications, see Section 3.

The verbal valency frame in its narrow sense consists of both obligatory and optional inner participants and of obligatory free modifications (see Table 3). The criterion of obligatoriness or optionality of verbal complements was introduced in the dialogue test by (Panevová, 1974, 1975) with respect to the possibility of intentionally omitting a contextually bound obligatory complement on the surface morphemic level of representation through ellipsis or, for instance, as a general (“dummy”) subject or object, etc.

                                     obligatory   optional
    inner participants (actants)         +           +
    free modifications (adjuncts)        +           −

Table 3: Members included in the valency frames.

The approach adopted by FGD takes into account both syntactic and semantic criteria for assigning functors to verbal complements (contrary to other, more semantically based approaches). Within this approach the concept of “shifting of cognitive roles” was adopted (Panevová, 1974, 1975, 1994). This “shifting” denotes the application of primarily syntactic criteria for identifying the first two actants (Actor and Patient). As a result, the first actant of a given verb is always identified as Actor and the second one as Patient, regardless of their actual semantics. In contrast, semantic criteria are applied when assigning functors to the other actants as well as to all free (adverbial) modifications of a verb. For the concept of “shifting” see Figure 1. Some examples of “shifting” will be illustrated on Arabic in Section 3.

[Figure 1 diagram; node labels: ORIG, ACT, PAT, ADDR, EFF]

Figure 1: Shifting of cognitive roles as a criterion for assigning functors to actants (inner participants) of a verbal frame (Panevová, 1994: 234).

The valency frames as they appear in the valency lexicon of Czech verbs VALLEX are enriched with two other sets of complements, namely quasi-valency complements and typical complements (Lopatková and Panevová, 2005; Lopatková, 2003). The former, quasi-valency type (consisting of the newly introduced Obstacle and Difference and of the revised, previously existing complements Intention and Mediator) is the kind of complement lying somewhere in between the free modifications and actants, while the latter, typical type denotes an optional free modification usually co-occurring with a particular verb. Those complements will not be taken into consideration in our present study of Arabic and will be the subject of our further research. In this preliminary phase of our research, we adopt the valency frames in their narrow sense, i.e. including obligatory and optional actants and only obligatory free modifications, as indicated in Table 3.

3. Valency in Arabic: Preliminary Overview

In this section, we will adapt some aspects of the above-mentioned theoretical approach of FGD for Arabic in order to make our preliminary observations about the valency behavior of Arabic verbs and their verbonominal derivatives.

The only elaborate work on valency in Arabic which has come to our knowledge is (Al-Qahtani, 2004). Contrary to FGD, al-Qahtani has adopted a predominantly semantic approach, since he deals with verbal valency in terms of Case Grammar theory. He applies the Matrix Model of (Cook, 1979) to the semantic classification of Arabic verbs (state, action, and process verbs). To each class a specific set of required semantic complements (“deep cases”) is assigned, namely Agent, Experiencer, Benefactive, Object, and Locative. The obligatory Object is omnipresent with every verb (in contrast to Actor in FGD) and can occur more than once in a case frame. Experiencer, Benefactive, and Locative are mutually exclusive. Sometimes, a particular case is not realized on the surface (“covert case role”), i.e. it is either partially covert (“deletable”) or totally covert (“coreferential” or “lexicalized”). The deletable case roles can be omitted on the surface (optional or elided complements in terms of FGD, see (X) and Table 4), whereas the so-called coreferential and lexicalized case roles are always absent from the surface. The coreferential roles denote instances where a single noun cumulates two case roles simultaneously (not permitted in FGD, see (Y)), while the lexicalized roles include instances where a certain case role (usually Object) is incorporated in the semantics of the verb (see (Z)). No shift of case roles takes place in this approach.

(Al-Qahtani, 2004: 148, 178)

(X) qāla Zaydun maqūlata-hu
    he-said Zayd said-of-him
    Zayd said what he had to say
    qāl AEO/E-del (Experiencer is deleted)

(Y) darasa Zaydun al-kitāba
    he-studied Zayd the-book
    Zayd studied the book
    daras AEO/A=E (Agent equals Experiencer)

(Z) ʿamila Zaydun
    he-worked Zayd
    Zayd worked = Zayd did some work
    ʿamil AO/O-lex (Object is lexicalized)

    (subject)   li- (prep.)   ʿan (prep.)   4-/ʾinna (conj.)
    obl         opt           opt           obl
    ACT         ADDR          PAT           EFF
    someone     to someone    about sth.    something/that

Table 4: Valency frame of the verb qāl قال ‘to say’.

3.1. Verbal valency

First, let us demonstrate some basic issues postulated by the FGD approach. In all the following examples in this section, the complements highlighted in bold are considered to be obligatory, the others are optional. Some examples derived from available corpora had to be abridged.

If the verbal valency frame consists of only one inner participant, it is always Actor, whatever the semantics of that complement may be. Here, the syntactic criteria play the major role in assigning the functor to a complement. Those verbs are typically intransitive stative (1) or passive/reflexive (2).

(1) kāna yanāmu ʿādatan fī šāriʿin ṣaġīrin
    he-was ⌈ACT he-sleeps ⌈THO usually ⌈LOC in a-street a-small
    he usually slept in a small street

(2) intaḥara bi-taṣwībi-hi ’l-musaddasa ʾilā raʾsi-hi
    ⌈ACT he-committed-a-suicide ⌈MEANS by-aiming-of-him the-gun at head-of-him
    he committed suicide by aiming the gun at his head

If the valency frame includes two actants, the first actant is considered to be Actor and the other Patient. Some verbs are directly transitive (3), whereas others are transitive indirectly through a preposition (4).

(3) ʿaqadat-i ’l-laǧnatu ’l-munaẓẓimatu muʾtamaran ṣiḥāfīyan ʾawwala min ʾamsi
    it-held ⌈ACT the-committee the-organizational ⌈PAT a-conference a-press ⌈TWHEN a-first-day from yesterday
    the organizational committee held a press conference the day before yesterday

(4) naǧaḥa ʿulamāʾu faransīyūna fī ’stinsāḫi ʾarāniba
    he-succeeded ⌈ACT scientists French ⌈PAT in cloning-of rabbits
    French scientists succeeded in cloning rabbits

In the case of verbs with three or more actants (no matter if obligatory or optional) where Patient is, from the semantic viewpoint, not realized in the valency frame, the above-mentioned preference for syntactic criteria is applied. This means that another actant (EFF, ORIG, or ADDR) undergoes the “shift” (see Section 2, esp. Figure 1) to occupy the unfilled slot of Patient. To the remaining actants, functors are assigned according to the semantic criteria. In example (5), Effect undergoes the shift to Patient.

(5) taḥawwalat-i ’l-munaẓẓamatu min ʾadāti muwāǧahatin ʾilā ʾadātin li-’l-baḥṯi
    it-changed ⌈ACT the-organization ⌈ORIG from instrument-of a-confrontation ⌈PAT to an-instrument for-the-research (EFF→PAT)
    the organization changed from an instrument of confrontation to an instrument for research

In the examples below, some verbs with three and more actants are illustrated. Valency frames with ACT, ADDR, PAT:

(6) samaḥa la-hu bi-’d-duḫūli ʾilā ’l-bayti
    ⌈ACT he-permitted ⌈ADDR to-him ⌈PAT by-the-entering into the-house
    he permitted him to enter the house

(7) šārakat zawǧa-hā fī ’l-ḥukmi
    ⌈ACT she-shared ⌈ADDR husband-of-her ⌈PAT in the-reign
    she shared the reign with her husband

It is worth mentioning that the usual word order of double transitive verbs such as ʾaʿṭā IV ‘to give’, where both objects (ADDR, PAT) are in the accusative (8), can also regularly be reversed (PAT, ADDR). In that case, the indirect object (i.e., ADDR) appears with the preposition li- (9).

(8) ʾaʿṭā-hu ’l-furṣata
    ⌈ACT he-gave-⌈ADDR him ⌈PAT the-opportunity
    he gave him the opportunity

(9) ʾaʿṭat-i ’s-sayṭarata li-’l-bunūki
    ⌈ACT it-gave ⌈PAT the-power ⌈ADDR to-the-banks
    it gave the power to the banks

Valency frames with ACT, PAT, EFF. The following verbs are also double transitive:

(10) ʿayyana-hu ḥākiman li-’l-Kuwayti
     ⌈ACT he-appointed-⌈PAT him ⌈EFF a-ruler for-Kuwait
     he appointed him as a ruler of Kuwait

(11) iʿtabara ʾAdūnīs ʾūrūbīyan
     ⌈ACT he-considered ⌈PAT Adonis ⌈EFF a-European
     he considered Adonis to be European

Valency frames with four actants ACT, PAT, ORIG, EFF:

(12) tarǧama ʾakṯara min ḫamsīna kitāban min-a ’l-fārisīyati ʾilā ’l-ʿarabīyati
     ⌈ACT he-translated ⌈PAT a-more than fifty a-book ⌈ORIG from Persian ⌈EFF into Arabic
     he translated more than fifty books from Persian into Arabic

(13) ġayyarat [aš-šarikatu] našāṭa-hā min ʾintāǧi ’l-qamḥi ʾilā ʾintāǧi ’l-buḏūri
     ⌈ACT it-changed [the companies] ⌈PAT activity-of-it ⌈ORIG from production-of the-wheat ⌈EFF to production-of the-seeds
     [the companies] changed their activity from the production of grain to the production of seeds

In the following examples, let us mention some verbal valency frames that comprise some type of free (adverbial) modification.

(14) badʾu ’l-ḥarbi waḍaʿa-hu ʾamāma ʾamrin wāqiʿin
     ⌈ACT beginning-of the-war it-put-⌈PAT him ⌈DIR3 in-front-of a-thing a-real
     the beginning of the war put him in front of the reality

(15) ʿādat min-a ’l-Qāhirati ʾilā Bayrūta
     ⌈ACT she-returned ⌈DIR1 from Cairo ⌈DIR3 to Beirut
     she returned from Cairo to Beirut

It should be pointed out that we distinguish verbal frames when assigning a functor to a verbal complement that could be semantically regarded as a free modification (e.g. some directional meaning), but that on the surface level is the direct object in the accusative. In this case, the syntactic (or morpho-syntactic) viewpoint (verbal government affects the morphemic form of a complement, i.e. the criterion of direct transitivity) is preferred, and consequently the functor of Patient is assigned, as in sentence (16). If there are two (or more) different morphemic realizations on the surface (i.e. prepositional phrase versus direct verbal government), although the meaning of the verb is the very same in both cases, two (or more) different valency frames are distinguished ((17) with DIR3 and (18) with PAT).

(16) ġādarat-i ’l-Qāhirata ʾilā Tall ʾAbīb
     ⌈ACT she-left ⌈PAT Cairo ⌈DIR3 to Tel Aviv
     she left Cairo for Tel Aviv

(17) waṣala ’l-muntaḫabu ʾilā madīnati Sālīrnū ’l-ʾīṭālīyati
     it-arrived ⌈ACT the-representation ⌈DIR3 to town-of Salerno the-Italian
     the representation arrived in the Italian town of Salerno

(18) waṣaltu-hā [Dimašq] min-a ’d-Dawḥati
     ⌈ACT I-arrived-to-⌈PAT it [Damascus] ⌈DIR1 from ad-Dawha
     I arrived there [to Damascus] from ad-Dawha

In contrast, when a more abstract meaning of a particular verb occurs, the complement is no longer considered to be a free modification (directional meaning), and both (or more) variants, the one with a prepositional phrase and the other with a direct object, are regarded as morphemic variants of the same actant (Patient in this case) within one single valency frame (19) and (20).

(19) waṣalat qīmatu-hā ʾilā ḫamsati yūrūhātin
     it-reached ⌈ACT value-of-it ⌈PAT to five euros
     its value reached 5 euros

(20) waṣalat qīmatu ’ṣ-ṣādirāti 625 milyūna dūlārin
     it-reached ⌈ACT value-of the-exports ⌈PAT 625 million-of a-dollar
     the value of export reached 625 million dollars

When dealing with verbal valency in MSA, some issues concerning diathesis should be briefly discussed as well. MSA (contrary to Arabic dialects), as the successor of Classical Arabic, has preserved one of its characteristic features: a regularly formed passive, produced by changing the vowel pattern of the active verb (the so-called inflectional, internal or apophonic passive), which is usually used when the agent of an action is not known or is preferred not to be mentioned. With some rare exceptions, only transitive verbs (1) undergo passivisation, no matter if they are transitive directly or indirectly through a preposition. In the passive, the position of the underlying Actor is reduced and the understood surface object (usually PAT or ADDR) of the active verb becomes a subject (Agameya, 2008: 558). However, besides this type of diathetical transformation, another type of passive exists in Arabic, namely “a derivational passive, where a derivational verb form (typically V, VII, or VIII) is used to convey a passive, reflexive or mediopassive sense of the action involved in the verb” (Ryding, 2005: 657). Those cases are then, as a result of derivation through verbal morpho-semantic patterns, autonomous lexicalized passive or passive-related verbs with their own valency frames, with a more or less probable word-formational relation to some active verb (causative, factitive, etc.) they are derived from, cf. Figure 2.

    (1) Some intransitive verbs, esp. those of movement, become transitive secondarily when taking a preposition, e.g. ǧāʾ ‘to come’ → ǧāʾ bi- ‘to bring sth.’; qām ‘to stand up’ → qām bi- ‘to carry out sth.’ (Badawi et al., 2004: 382–383; Drozdík, 2001).

With a directly transitive verb, Actor is reduced and the direct object (Patient) becomes the grammatical subject (compare to the active voice in (3)).

(21) wa-lam yuʿqad ʾayyu muʾtamarin ṣiḥāfīyin muštarakin
     and-not it-was-held ⌈PAT any a-conference a-press a-joint
     and no joint press conference was held

With double transitive verbs, those with complements Patient and Effect, the first object (PAT) substitutes the grammatical (surface) subject while the second object (EFF) remains in the accusative (compare to the active voice (10)).

(22) ʿuyyina ’d-duktūru Mawṣilī wakīlan li-kullīyati ’ṭ-ṭibbi
     he-was-appointed ⌈PAT the-doctor Mawsili ⌈EFF an-assistant-dean to-faculty-of the-medicine
     doctor Mawsili was appointed as an assistant dean of the faculty of medicine

It is to be pointed out that double transitive verbs with complements ADDR and PAT (verbs such as ʾaʿṭā ‘to give’) might be passivized in two ways: either the former object, usually referred to as indirect (23), or the latter, direct object (24) can substitute the grammatical (surface) subject (Agameya, 2008: 559) (compare the active voice (8) and (9)).

(23) ʾuʿṭiyat furṣatan ṯāniyatan
     ⌈ADDR she-was-given ⌈PAT a-chance a-second
     she was given the second chance

(24) ʾuʿṭiya ’ḍ-ḍawʾu ’l-ʾaḫḍaru li-’l-malikati
     it-was-given ⌈PAT the-light the-green ⌈ADDR to-the-queen
     the queen was given the green light

When verbs that are indirectly transitive through prepositions are passivized, the verb itself always remains in the 3rd person masculine singular in the passive, while the surface subject (the previous object of the active verb) goes after the preposition (Badawi et al., 2004: 387–388; Ryding, 2005: 666–667).

(25) yuḥkamu ʿalay-hi bi-’s-siǧni
     it-is-sentenced ⌈PAT upon-him ⌈EFF by-the-jail
     he is sentenced to jail

Sometimes, the agent (Actor on the underlying level of representation) of some verb in the passive voice is expressed periphrastically after particular prepositional phrases like min qibali ‘by, on part of’, etc. (Badawi et al., 2004: 385–386; Retsö, 2006: 624–625).

(26) šarikātun tudāru min qibali mudarāʾa mutaḫaṣṣiṣīna
     ⌈PAT companies is-managed ⌈ACT from directions-of managers specialist
     companies are managed by professional managers

It is worth mentioning that several Arabic verbs in the passive voice have undergone some kind of semantic shift and are used figuratively. Due to this fact they have to be considered idiomatic, since they are no longer real semantic counterparts of their active forms. We can mention at this point two very frequent verbs, (27) tuwuffiy V ‘to die, to pass away’ and (28) ustušhid X ‘to die as a martyr’, reflecting some degree of euphemism in connection with religious feeling. Those verbs would be treated in our proposed lexicon as separate word entries.

(27) tuwuffiya wālidu-hu fī ḥādiṯi sayyāratin
     he-died ⌈ACT father-of-him ⌈MANN in accident-of a-car
     his father died in a car accident

(28) ustušhida ʿāma 1991 fī ḥādiṯi ’ġtiyālin
     ⌈ACT he-died-as-a-martyr ⌈TWHEN year-of 1991 ⌈ in