SlaviCorp 2018


SlaviCorp 2018
24–26 September 2018
Charles University, Prague
Book of Abstracts

Table of Contents

Plenaries

Björn Hansen, Edyta Jurkiewicz-Rohrbacher, Zrinka Kolaković
Detecting constraints on clitic climbing – with the help of corpora and psycholinguistic tests

Alexandr Rosen
The merits of a parallel corpus and how to get the most out of it

Ruprecht von Waldenfels
Variation on many levels: why and how comparing corpora and (Slavic) languages makes sense

Full papers

Magdalena Adamczyk
A contrastive look at discursive uses of English ‘now’ and Polish ‘teraz’

Dorota Adamiec, Renata Bronikowska, Włodzimierz Gruszczyński, Emanuel Modrzejewski, Aleksandra Wieczorek
The Electronic Corpus of the 17th and 18th c. Polish Texts (up to 1772). The final result

Anastasiia Baranchikova, Anna Dmitrieva, Mariia Fedorova, Aleksandr Klimov, Olesya Kisselev, Mikhail Kopotev, Svetlana Toldova, Natalia Zevakhina
CAT&kittens: a corpus-based text-analytic tool for Russian academic writing

Vladimír Benko, Radovan Garabík
Ensemble Tagging Slovak Web Data

Neil Bermel, Luděk Knittl
The fate of variant forms in historical corpora: Tracing locative exponents in DIAKON

Martina Berrocal
A corpus-assisted study of the Presidential interviews of Milos Zeman

Katja Brankačkec
Productivity and Meaning of the Prefix nad- in the Word-Formation of Upper Sorbian, Lower Sorbian and Czech in a Diachronic Perspective: Evidence from Corpora

Kat Dziwirek
To taste is to live and love: Verbs of taste in Polish and English

Hanne Martine Eckhoff, Aleksandrs Berdičevskis, Marius Jøhndal
From diachronic treebank to dictionary resource: the Varangian Rus project

Tomaž Erjavec, Nikola Ljubešić, Darja Fišer
Training data and tools for processing user-generated content in Slovene, Croatian and Serbian

Matea Filko, Krešimir Šojat, Marko Tadić
Construction za infinitive – evidence from the Croatian corpora

Olga Goritskaya, Mikita Suprunchuk
Frequency Dictionary of Belarusian Borrowings in the Belarusian Variety of the Russian Language

Natalia Grabar, Olga Kanishcheva, Thierry Hamon
Multilingual aligned corpus with Ukrainian as the target language

Jane Hacking, Erin Schnur, Fernando Rubio
MuSSeL: Designing and building a corpus of multilingual second language speech

Juho Härme
Last year but not yesterday? Explaining differences in the locations of Finnish and Russian time adverbials using comparable corpora

Stefan Heck, Eugen Kravchenko
Polish być w trakcie verbal noun – a progressive periphrastic construction?

Milena Hnátková, Tomáš Jelínek, Marie Kopřivová, Vladimír Petkevič, Alexandr Rosen, Hana Skoumalová, Pavel Vondřička
Multiword Expressions in Czech: Typology and Lexicon

Jakob Horsch
A Construction Grammar Account of the Slovak Comparative Correlative Construction

Laura Janda, Francis Tyers
Parts Give More Than Wholes: Paradigms from the Perspective of Corpus Data

Tomáš Jelínek
New error annotation of Czech learner corpora

Tomáš Káňa
Terminology in and around Diminutives

Witold Kieraś, Łukasz Kobyliński, Maciej Ogrodniczuk
Korpusomat – new functionalities and future development

Witold Kieraś, Marcin Woliński
Basic natural language processing toolkit for 19th century Polish

Valeria Kolosova, Ksenia Zaytseva, Kira Kovalenko
PhytoLex – the Database of Russian Phytonyms: from Idea to Implementation

Lucie Kopáčková
Oprahin or Opražin? How to Correctly Form Possessive Adjective from Female First Name or Surname of Foreign Origin in Contemporary Written Czech Language?

Natalia Kotsyba, Bohdan Moskalevskyi
An essential infrastructure of Ukrainian language resources and its possible applications

Anna Kryvenko
A reference corpus for discourse dynamics analysis in Ukrainian?

Miroslav Kubát, Jan Hůla, Radek Čech, David Číž, Kateřina Pelegrinová
Context Specificity of Lemma. Diachronic analysis

Moulay Zaidan Lahjouji
The Corpus of Spoken Rusyn – A user-friendly resource for research on Rusyn dialects

Nikola Ljubešić, Tanja Samardžić, Tomaž Erjavec, Darja Fišer, Maja Miličević Petrović, Simon Krek
“Kad se mnogo malih složi”: Collaborative development of gold resources for Slovene, Croatian and Serbian

David Lukeš, Zuzana Komrsková, Marie Kopřivová, Petra Poukarová
Pronunciation of casual spoken Czech: A quantitative survey

Lucie Lukešová (Chlumská), Dominika Kováříková
Extracting Multi-word Expressions for the Czech Academic Phrase List

Marek Łaziński
Actional Interpretation of Verbal Aspect in Legal Texts - Corpus Analysis

Jiří Milička, Alžběta Růžičková
Slovak Vowel Phonotactics: Slavic Origins vs. Hungarian Influences

Tore Nesset
Cascading S-curves: What corpus linguistics tells us about language change

Jana Nová, Vít Michalec, Zdeňka Opavská, Renáta Neprašová
Frequency (not) sacred: The headword list of a contemporary Czech monolingual dictionary and corpora

Tatiana Perevozchikova
Pronominal expression of possession in noun phrases in Russian, Czech, and Bulgarian

Alexander Piperski
Aspect-Specific Keywords in Russian

Adam Przepiórkowski, Agnieszka Patejuk
An Enhanced Universal Dependencies Treebank of Polish

Anna Řehořková
Czech conditional verb forms in assertive complement clauses

Thomas Samuelsson
The Russian adjectives antirossijskij, antirusskij and antisovetskij in Russian media: a corpus study

Ranka Stanković, Miloš Utvić, Aleksandra Tomašević, Ivan Obradović, Biljana Lazić
Development and application of a domain specific corpus for mining engineering

Ilona Starý Kořánová
Aspectual homonymy and polysemy in Czech

Marcin Szczepański
Recent challenges and advances in the development of Lower Sorbian corpus resources

Magda Ševčíková, Adéla Kalužová, Zdeněk Žabokrtský
A language resource specialized in Czech word-formation: Recent achievements in developing the DeriNet database

Svatava Škodová
Sebrat se a . a construction between coordination and subordination in contemporary Czech

Petar Vuković
The second future tense in contemporary Croatian: A corpus-driven study in grammatical semantics

Adrian Jan Zasina
Evaluating a corpus-driven approach in L2 classroom on the example of Czech

Adrian Jan Zasina, Michal Škrabal
Morfio.pl – the possibilities for the application of Czech corpus tools to other languages

Jan Patrick Zeller
Syntagmatic corpus analyses of mixed speech: code-shifting in Belarusian trasyanka and Ukrainian suržyk

Plenaries

Björn Hansen
Universität […]
Edyta Jurkiewicz-Rohrbacher
Universität Regensburg
Edyta.Jurkiewicz-Rohrbacher@ur.de
Zrinka Kolaković
Universität […]

Detecting constraints on clitic climbing – with the help of corpora and psycholinguistic tests

The talk aims to show how corpora can be used to study fairly complex phenomena. We will base the discussion on the example of constraints on clitic climbing in Bosnian, Croatian and Serbian (BCS). Descriptively speaking, clitic climbing (CC) “refers to constructions in which the clitic is associated with a verb complex in a subordinate clause but is actually pronounced in constructions with a higher predicate” (Spencer & Luís 2012: 162). An example of CC out of an infinitival complement is given in (1), where the clitic pronoun ga ‘him’ is realised in the second position of the matrix clause (Wackernagel position); in other cases, however, CC does not take place, as in (2), where the clitic ih stays in the complement.

(1) […] him.acc must.3prs see.inf
    ‘Milan must see him.’

(2) […] them.acc
    ‘I am afraid to test them.’ (Stjepanović 2004: […])

Although clitics in Bosnian, Croatian and Serbian (BCS) have attracted considerable attention in the syntactic literature (cf. Franks & King 2000, Browne 2014, or Bošković 2004), the syntactic conditions and constraints for CC are seriously understudied in comparison to e.g. Czech (e.g. Junghanns 2002). There are only very few studies on CC in BCS: Stjepanović (2004) and Aljović (2004, 2005) mainly deal with theoretical considerations based on a small selection of constructed examples.

Jurkiewicz-Rohrbacher et al. (2017a, 2017b) and Hansen et al. (2018) are the first descriptions of CC in BCS based on empirical investigations. Based on data from the massive web corpora {bs,hr,sr}WaC (Ljubešić & Klubička 2014), they show the raising-control dichotomy of matrix predicates to be a relevant factor in CC. In addition, reflexivity is found to play a major role. Kolaković et al. (accepted), on the other hand, tackle the question of register as a relevant factor by comparing results from the Forum subcorpus of hrWaC v2.2, the Croatian Language Repository (Ćavar & Brozović-Rončević 2012) and the Croatian National Corpus (Tadić 2009) while examining the same types of matrix predicates.

First, the talk presents the results of the corpus-based and corpus-driven studies mentioned above and discusses in detail the particular steps of a corpus approach, ranging from the formulation of queries and coping with tagging errors to the statistical analysis of the data. Second, it will show how these results feed into a major psycholinguistic experiment recently carried out in Croatia (7 experiments × 40 participants = 280 participants). Mixed-effects logistic regression models fitted to data from speeded yes-no grammaticality judgment tasks, run with the free software OpenSesame, provide additional evidence for constraints on CC.

Aljović, N. (2004) “Cliticization Domains: Clitic Climbing in Romance and in Serbo-Croatian”. In: Crouzet, O. et alii (eds.) Proceedings of JEL’2004 Domain(e)s, Université de Nantes, 169-175.
Aljović, N. (2005) “On clitic climbing in Bosnian/Croatian/Serbian”. In: Leko, N. (ed.) Lingvistički vidici 34:(05). Sarajevo: Forum, 58-84.
Bošković, Ž. (2001) On the nature of the syntax-phonology interface: cliticization and related phenomena. Amsterdam: Elsevier.
Browne, W. (2014) “Groups of Clitics in West and South Slavic Languages”. In: Kaczmarska, E.; Nomachi, M. (eds.) Slavic and German in Contact: Studies from Areal and Contrastive Linguistics. Slavic Eurasian Studies 26, 81-96.
Ćavar, D.; Brozović-Rončević, D. (2012) “Riznica: The Croatian Language Corpus”. In: Prace filologiczne 63, 51-65.
Franks, S.; King, T. H. (2000) A Handbook of Slavic clitics. Oxford: OUP.
Hansen, B.; Kolaković, Z.; Jurkiewicz-Rohrbacher, E. (2018) “Clitic climbing and infinitive clusters in Bosnian, Croatian and Serbian – a corpus-driven study”. In: Fuß, Eric et al., Grammar and Corpora 2016. Heidelberg: Heidelberg University Publishing (heiUP).
Junghanns, U. (2002) “Clitic climbing im Tschechischen”. In: Linguistische Arbeitsberichte 80, 57-90.
Jurkiewicz-Rohrbacher, E.; Kolaković, Z.; Hansen, B. (2017) “Web Corpora – the best possible solution for tracking rare phenomena in underresourced languages: clitics in Bosnian, Croatian and Serbian”. In: Bański, P. et al. (eds.): Proceedings of the Workshop on Challenges in the Management of Large Corpora and Big Data and Natural Language Processing (CMLC-5 BigNLP) 2017 including the papers from the Web-as-Corpus (WAC-XI) guest section. Ma
