DIFFICULTIES FOR LEARNING PERSIAN AS A SECOND LANGUAGE
Karine Megerdoomian
MITRE
Approved for Public Release; Distribution Unlimited. Case number 08-0171. ©2008 The MITRE Corporation. All rights reserved.

Source: American Educational Research Foundation, adapted from National Virtual Translation Center (2006), “Language Learning Difficulty for English Speakers”

Goals of this talk
1. What are the factors that place Persian in Category II of language difficulty for a speaker of English?
  i.e., more difficult than French, Swedish or German but easier than Arabic or Chinese
2. How can linguistics help in the classroom?
  Use linguistic patterns to explain difficult issues
  Use computational tools to discover and practice language patterns

Language Relatedness
How close are Persian and English?
They are both members of the Indo-European language family
Source: The American Heritage® Dictionary of the English Language, Fourth Edition. Copyright © 2000 by Houghton Mifflin Company.
Source: Circle of Ancient Iranian Studies, SOAS

Language Relatedness
Persian and English are both Indo-European
  They share certain linguistic features (verb conjugation, morphology patterns)
But their writing systems are very different
  Iranian and Dari Persian use an extended version of the Arabic script (4 additional letters)
  Tajiki Persian uses an extended version of the Cyrillic script

Writing System
Writing goes from right to left
Letters are completely unfamiliar for English speakers
Letters must be connected together rather than being written separately
Letters have up to 4 different shapes based on context in the word (هـ ـهـ ـه ه)
Diacritics are generally not written

Writing System: Diacritics
Tanvin: اقلا (aqla), read aqalan
  اقلن به خودم مي گفت
Alef maksura: حتی (hti), read hattâ
  رژیمی که حتا قادر به گرفتن دست دوستان خود نیست
Vowels (a, e, o): کرم (krm) can be read kerm (worm), korom (chrome), karm (vine), karám (generosity), káram (I’m deaf), kerem (cream)

Writing System: Ambiguous Letters
‘he’: /h/ after a vowel and
/é/ after a consonant
  mâhâné (ماهانه, monthly), kuh (کوه, mountain), sé (سه, three)
  به: beh (quince) vs. bé (to)
‘vav’: /v/ after a vowel and /u/, /ow/ or /o/ after consonants
  sovashun (سووشون, Sovashun), gâv (گاو, cow), ravand (روند, process)
  dur (دور, far), dowr (دور, around)

Writing System
Difficult things:
  Missing diacritics
  Ambiguous letters (he and vav)
  Sound-to-spelling correspondence (dictation) is ambiguous: e.g., /z/ can map to several letters
    /z/ = ز ذ ض ظ
    /s/ = س ص ث
    /h/ = هـ ح
  But the pronunciation is simplified

Writing System
Easy things:
  Alphabet system (unlike Chinese)
  Regular, one-to-one correspondence of letter to sound overall (more than French or English!)

Writing System: Handwriting!!
Forms of the letters can be quite different from the typeset (naskh) form
Used very commonly in writing letters, manuscripts, class notes, but also street signs, ads, at grocery stores

Outline
  Language Relatedness
  Writing System
  Lexicon
  Morphology
  Syntax
  Social issues
  Computational linguistics

Lexicon
Difficulties:
  Lack of cognates for English speakers (in contrast with French, Spanish)
  Distinctions that do not exist in English
    time: دفعه (vez) vs. وقت (tiempo) vs. ساعت (hora)
    know: دانستن (saber) vs. شناختن (conocer)

Lexicon: Language Contact
Persian has been in contact with many languages historically and has many borrowings
Other Iranian languages (e.g., Parthian)
  اژدها eždehâ (dragon), آژیر âžir (siren), چهره chehre (face), پیغام peyghâm (message), فرشته fereshte (angel)
Arabic forms a large part of the vocabulary
  سعی کنیم از کلمات عربی استفاده نکنیم ‘Let us try not to use Arabic words’
But watch out for false friends!
  جامعه jâme’e (society), جواب javâb (answer), کنیسا kenisâ (synagogue)

Lexicon: Language Contact
French words entered the language in late 19th/early 20th century
  دموکراسی (démocratie), مونارشی (monarchie), سندیکا (syndicat), مرسی (merci), تئاتر (théâtre), سکسوالیته (sexualité), نوستالژی (nostalgie), شوفاژ (chauffage)
Most new technical words are English
  ایمیل (e-mail), دانلود (download), آنلاین (online), فیلترینگ (filtering), کامپیوتر (computer), کلیک کردن (click) …
  Note: Dari Persian has more English borrowings
Most common, everyday words have to be learned as new words as they are usually not borrowed

Morphology
Easy Parts:
  Concatenative like English: use of prefixes and suffixes
    Ex. sâz from sâkhtan (‘to build’ or ‘to agree’)
    سازش sâzesh (agreement), نساز nasâz (uncompromising, disagreeing), سازگار sâzegâr (agreeable), ناسازگاری nâsâzegâri (disagreement), سازش پذیری sâzesh paziri (compatibility)
  No gender, case, dual number
  No agreement of adjective or demonstrative with noun

MorphoSyntax
But Persian is more free in suffixation
Words that equal a sentence
  کوچکترینهایشانند
  kuchek-tarin-hâ-yeshân-and
  small-Sup-Pl-Pron.3pl-Cop.3pl
  ‘They are the smallest ones of them.’

  پسرخاله‌هامونین
  pesar khâle-hâ-mun-in
  son aunt-Pl-Pron.1pl-Cop.2pl
  ‘You are our cousins.’

Morphology
Difficult Parts:
  Many forms for the plural
    کتاب‌ها ketâbhâ (books), بینندگان binandegân (viewers), کلمات kalamât (words), مسافرین mosâferin (travelers), روحانیون ruhâniyun (clergy)
  Arabic morphology system (اشتقاق)
    Mostly plural and participial forms
    Based on a root template: شعر، شاعر، مشاعره، شعرا، اشعار
  But there are patterns that can be learned

Morphology Patterns: Plurals
Plurals with -ân (only on animate nouns)
  Mothers: مادران
  Children: کودکان
  Residents: ساکنان
  Iranians: ایرانیان
  Students: دانش‌آموزان
  Animals: جانوران
Plurals with -hâ
  Books: کتاب‌ها
  Centuries: قرن‌ها
  Opinions: عقیده‌ها
  Iranians: ایرانی‌ها
  Houses: خانه‌ها
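
A minimal Python sketch of the animacy restriction above. It is not part of the original talk: the function name plural_options, the animate flag, and the Latin transliterations are assumptions made for illustration, and only consonant-final nouns are handled. The suffix -hâ is offered for any noun, while -ân is offered only for animate ones.

```python
def plural_options(noun: str, animate: bool) -> list[str]:
    """Return candidate plurals for a transliterated, consonant-final noun.

    Hypothetical helper: -hâ attaches to any noun, while -ân is
    offered only when the caller marks the noun as animate.
    """
    options = [noun + "hâ"]
    if animate:
        options.append(noun + "ân")
    return options


print(plural_options("mâdar", animate=True))   # 'mother' (animate)
print(plural_options("ketâb", animate=False))  # 'book' (inanimate)
```

Animacy is passed in by the caller because it is a lexical property of the noun; it cannot be read off the spelling.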
Morphology Patterns: Plurals
Plural ending -gân (nouns ending in ‘e’, written as he)
  parandegân پرندگان
  nevisandegân نویسندگان
  fereshtegân فرشتگان
  zelzele zadegân زلزله‌زدگان
Plural ending -yân (nouns ending in vowel ‘â’ or ‘u’)
  âshnâyân آشنایان
  dâneshjuyân دانشجویان
  binavâyân بینوایان
  zurguyân زورگویان
Plural ending -ân (nouns ending in consonant)
  dustân دوستان
  farzandân فرزندان
  mehmânân مهمانان
  zanân زنان

Morphology Patterns: Plurals
Singular form, root, plural form:
  thought: فکر fkr, thoughts: افکار afkar
  wave: موج mvj, waves: امواج amvaj
  individual: فرد frd, individuals: افراد afrad
  tribe: قوم qvm, tribes: اقوام aqvam
  poem: شعر shEr, poems: اشعار ashEar
  Arab: عرب Erb, Arabs: اعراب aErab
  goal: هدف hdf, goals: اهداف ahdaf
Template: xyz becomes axyaz (in the script, فکر becomes ا فـکـــا ر)

Knowledge of Arabic
Can help in:
  Writing / Reading
  Vocabulary
  Morphology based on Arabic patterns
But Persian syntax is very different from Arabic

Syntax: Word Order
Word order different from English: verb-final
Scrambling allowed (free word order)
Very long sentences in print
Linking element (ezafe) is not written
  makes it hard to identify boundaries

Source: Hamshahri
رئیس‌جمهوری که در شب عاشورای حسینی در جمع دانشجویان عزادار كوي دانشگاه تهران حضور یافته بود، در سخنانی که واحد مرکزی خبر آن را مخابره کرده است، افزود: موضوع دیگري که آنها مي خواستند به ایران تحمیل کنند، این بود که در ایران هرگونه اقدامي براي حرکت در مسیر فناوري هسته اي مي‌بایست قبلا به اطلاع قدرتهاي غربي برسد و ملت ایران این کار را بدتر از قرارداد ترکمنچاي تشخیص داد و آن را رد کرد که این امر بدون یاري و قدرت الهي امکان نداشت.

The President, that on the night of Husseini Ashura was present among the group of the mourning students of Tehran University Street, in a speech that the central news unit has broadcast added: Another matter that they wanted to impose on Iran was that any step in Iran moving towards nuclear technology must first be notified to the
Western powers and the people of Iran recognized this issue as worse than the Treaty of Torkmanchay and rejected it, which could not have been possible without divine help and power.

Syntax: Word Order
رئیس‌جمهوری که در شب عاشورای حسینی در جمع دانشجویان عزادار كوي دانشگاه تهران حضور یافته بود، ...
president that on night Ashura-of Hosseini in group university-students mourning street university tehran presence had found, …
The President, that on the night of Husseini Ashura was present among the group of the mourning students of Tehran University Street, …

Syntax: Word Order
Noun Phrase word order is quite structured
  این دو تا کتاب کهنه خودش رو بهم هدیه داد.
  this two CL book old his-own OBJ to-me gift gave
  ‘He gave me these two old books of his as a gift.’
Order: Det, Num, Cl, Noun, Adjectives, Possessor
Elements after the Noun are linked by the ezafe
Elements before the Noun have no ezafe

Syntax: Ezafe
Noun + modifiers + possessor: کتاب کهنه پیام ‘Payam’s old book’
Preposition + complement: روی کتاب کهنه پیام ‘on Payam’s old book’
Proper names: احمد شاملو ‘Ahmad Shamlu’
Geographic names: دریای مازندران ‘Caspian Sea’, کوه دماوند ‘Damavand Mountain’

Syntax: râ
Specificity, rather than definiteness, is marked
  دیشب کتاب خوندم ‘I read a book/books last night’
  دیشب کتاب رو خوندم ‘I read the book last night’
  من دیشب یه کتاب خوندم ‘I read a book last night’
  من دیشب یه کتاب رو خوندم ‘I read a (specific) book last night’
Specific direct objects are marked by را ‘râ’ (رو ‘ro’)
Absence of article does not mean indefinite or definite
Bare nouns have ambiguous roles

Syntax: Light verb constructions
Word for word translation:
  Water to boil came. _______
  His friend to cry threw. _______
  I window open made. _______
  Superman her rescue gave. _______
  How much you worry are eating! _______
Persian sentence:
  آب به جوش آمد. ______
  دوستش رو به گریه انداخت. ______
  من پنجره را باز کردم. ______
  سوپرمن او را نجات داد. ______
  چقدر تو غصه میخوری! ______

Syntax: Light verb constructions
keshidan ‘pull, drag’ (focus on the duration of the event)
  be in pain: درد کشیدن
  to take pains: زحمت کشیدن
  to wait: انتظار کشیدن
  to scream: داد کشیدن
  to be ashamed: خجالت کشیدن
  to yell: فریاد کشیدن
  to suffer: رنج کشیدن
  to last: طول کشیدن
khordan ‘eat, collide’ (subject is affected, negatively)
  worry: غصه خوردن
  catch a cold: سرما خوردن
  be deceived: گول خوردن
  be slapped: سیلی خوردن
  be beaten: کتک خوردن
  be shot: تیر خوردن
  be defeated: شکست خوردن
zadan ‘hit’ (repetitive event using an instrument)
  comb: شانه زدن
  brush teeth: مسواک زدن
  sweep: جارو زدن
  whip: شلاق زدن
  stab: چاقو زدن
  pedal: پا زدن
  beat (w/ wood): چوب زدن
  wax: واکس زدن

Syntax: Interference from English
Interference: Transference of elements of one language to another
Embedded questions
  *میخوام بدونم اگه میتونی بیای. ‘I want to know if you can come’
Preposition subcategorization
  Write it in English: به انگلیسی بنویسید
  I am on the phone: پا تلفنم
  We arrived on time: ما سر ساعت رسیدیم
  He used your book: از کتابت استفاده کرد

Syntax: Subjunctive mood
Used in more contexts than, e.g., Romance languages:
  Possibility, probability, necessity, ability
  Desire, will, preference, hope, command
  Doubt implied
  Expressions of emotion
  Adjectival clauses
  Purpose clause
  Deliberative interrogative
  Temporal expressions
Examples:
  ممکنه بخوابم ‘I may sleep’
  باید بخوابم ‘I must sleep’
  نمیتونم بخوابم ‘I can’t sleep’
  میخواد بخوابم ‘He wants that I sleep’
  اجازه میدی بخوابم؟ ‘Would you allow me to sleep?’
  اگه بخوابم سرحالتر میشم ‘If I sleep I’ll feel better’
  فکر کنم خوابیده باشه ‘I think he may be asleep’
  میترسم بخوابم ‘I am afraid I might sleep’
  هرکسی که بخوابه کباب نمیگیره ‘Whoever sleeps will get no kabob’
  رفتم خانه که بخوابم ‘I went home to sleep’
  آخه کی بخوابم؟ ‘Well, when should I sleep?’
  قبل از اینکه بخوابم ‘Before I sleep …’

Social Issues: Taarof
Taarof: Honorific system is quite complex and affects morphology and syntax
Pronoun usage
  2nd person: تو vs. شما
  3rd person: او vs. ایشان
Verb choice
  come: بیاین becomes تشریف بیارین
  as I told you: همونجور که بهتون گفتم becomes همونجور که خدمت شما عرض کردم

Social Issues: Diglossia
Diglossia: Two distinct varieties of the language coexist in society
  Literary variant (used in newsprint, official documents, and literature)
  Conversational variant (used for everyday conversations, writing letters, in some weblogs, modern literature)
Most textbooks teach only the literary variant
What is the goal of the learner?
  To speak with people? To give lectures?
  To study literature? To read and analyze newsprint only? Blogs? Letters?

Teaching only the literary language has shortcomings:
1. Students learn obsolete things
  Future ‘I will go’: literary خواهم رفت khâham raft, conversational می رم miram
  Past ‘I ate dinner’: literary شام خوردم shâm khórdam, conversational شام خوردم shâm khórdam
  Present Perfect ‘I have eaten dinner’: literary شام خورده‌ام shâm khordé am, conversational شام خوردم shâm khordám

Teaching only the literary language has shortcomings:
2.
Students don’t learn conversational forms
  Definite marker: مرده marde ‘the man’
  Verb forms:
    ‘I say’ [subj]: literary بگویم beguyam, conversational بگم begam
    ‘they are coming’: literary می‌آیند mi-âyand, conversational میان miân
    ‘she’s throwing’: literary می‌اندازد mi-andâzad, conversational میندازه mindâze
  Use of pronominal clitics and free word order:
    دادم بهش dâdam behesh ‘I gave it to him’
    ازش
  عین خر داره خون میاد
  هیچوقت حس استیصال بهم دست نداده بود
  تهران

Computational Linguistics
Study of language from a computational perspective
Components
  Analyze the word form so you can look up the word automatically in a dictionary or use it to search documents containing variants of that word
  Parse sentences by analyzing their structures and identifying their constituents
  Translate sentences into, e.g., English; this requires all of the above as well as semantic analysis to disambiguate word choices

Computational Linguistics
Using computational linguistics in the language classroom
  Intelligent tutoring systems: automatically detect errors in web-based exercises and provide feedback to learners
  Finding language patterns: can be used by teachers or learners to detect language patterns in authentic media (e.g., conversational language in weblogs)
  Automatic classifiers: can help teachers find relevant material for a lesson
  Evaluation: is the learner’s writing improving?

Persian has high web presence and is among top 10 blog languages in the world

Issues of Persian for Computational Approaches
  Writing system
  Detecting noun phrase boundaries (where is the ezafe?)
  Morphological patterns
  Word order issues (free word order)
  Light verb constructions (especially when separated)
  Ambiguities in the lexicon
  Conversational Persian forms (very different from Literary variant)

Conclusion
Studied factors that make Persian more or less difficult for speakers of English
  writing system, morphology, lexicon, syntax, social issues
Emphasis has been placed on communicative approaches to the detriment of explanations
What may seem difficult (e.g., plurals, Arabic morphology, light verb constructions) often follows patterns that can be generalized
  no need to learn by heart necessarily
Computational linguistics can be used in the classroom and on authentic media to discover language patterns or to generate feedback
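
Two of the plural patterns from the morphology slides can be written as short rules, which illustrates the point that they generalize. The Python sketch below is an illustration only and is not taken from the talk: the function names animate_plural and broken_plural are hypothetical, input is assumed to be in the Latin transliteration used on the slides, and which nouns actually take each pattern is a lexical fact that the code does not check.

```python
import unicodedata


def animate_plural(noun: str) -> str:
    """Attach the -ân plural to a transliterated animate noun.

    Three cases, as on the plurals slide: final 'e' takes -gân,
    final 'â' or 'u' takes -yân, a final consonant takes plain -ân.
    """
    # Normalize so that 'â' is one code point however it was typed.
    noun = unicodedata.normalize("NFC", noun)
    if noun.endswith("e"):
        return noun + "gân"
    if noun.endswith(("â", "u")):
        return noun + "yân"
    return noun + "ân"


def broken_plural(root: tuple[str, str, str]) -> str:
    """Fill the xyz -> axyaz template with three root consonants.

    The root is a tuple because one consonant may be written with
    two Latin letters (e.g. 'sh').
    """
    x, y, z = root
    return f"a{x}{y}a{z}"


for singular in ["parande", "dâneshju", "âshnâ", "dust"]:
    print(singular, "->", animate_plural(singular))

for root in [("f", "k", "r"), ("sh", "E", "r"), ("h", "d", "f")]:
    print("".join(root), "->", broken_plural(root))
```

A classroom tool could use rules of this kind to generate practice items or to check a learner's answer, with a word list supplying which nouns are animate and which Arabic loans take the broken plural.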