Scientific Journal “Modern Linguistic and Methodical-and-Didactic Researches” Issue 3 (26), 2019 ISSN 2587-8093

stand him. At school it seems to him that he has the most mischievous classmates and homeroom teacher. The name of the main character, Horrid Henry, contains a characterizing component. The personal name Henry is preceded by Horrid, which means terrible, disgusting or creepy and characterizes the boy's behavior. This nickname is assigned to him because he is negatively disposed in advance toward everyone and everything around him; he tries to do what suits him but constantly gets into awkward situations and becomes the object of indignation of adults and children. Every day his parents ask him not to be horrible to his younger brother and to them, and so he considers himself horrible.

Aunt Ruby's name is a charactonym: ruby is the name of a gemstone and is associated with wealth. Aunt Ruby always gives expensive gifts to her son Steve, which makes her nephew envious and leads him to think that his own parents are poor.

The main character's class teacher is a strict lady with the charactonym Battle-Axe. The surname denotes a battle axe. The teacher is very strict; her words and phrases sound abrupt and sharp, as if she were chopping them with an axe. She horrifies the boy by asking him to read many books and do a huge number of exercises and creative projects.

Her complete opposite is the teacher of Horrid Henry's younger brother and rival. She has the charactonym Honey. She is kind and sweet to her students, does not give them as much homework (or so it seems to the main character), and always praises and never scolds or shames her pupils.

At the school that the child characters attend there are many children's theatrical performances. The preparations for such events are led by Miss Tutu and Miss Thumpe, two teachers at a children's dance studio. Both characters have charactonyms that reflect their sphere of activity. The noun tutu denotes a ballet skirt, and the young teacher who directs all the dances, Miss Tutu, has the appearance and movements of a ballerina. Her accompanist, Miss Thumpe, plays the piano, so the author gave her a fitting charactonym: the onomatopoeic noun thump conveys the sound of a knock or thud and is derived from the verb thump, meaning to knock, hit or beat.

In one of the stories, in which the pupils are vaccinated, a nurse and a doctor with the charactonyms Needle and Dr Dettol appear. Both charactonyms relate to the characters' professional activities. The nurse gives injections, which all the children are afraid of, and to make the situation comical the author gives her the surname Needle. Dettol is the name of an antiseptic well known in the UK, so the school doctor gets this surname, since his job is to prevent the appearance and spread of infection among the children at school.

Perfect Peter – Horrid Henry's younger brother, Peter, is the perfect child of the family: he always obeys his parents, willingly eats healthy but not tasty food, and does his school tasks in all subjects without being reminded. He is never late for school and is not rude to his older friends; his parents are happy with him and always praise him. Henry does not like this at all; he is jealous and always tries to frame his brother in order to spoil his ideal image, which is why he "endows" his brother with an ironic characterizing nickname expressed by the adjective perfect (flawless).

Goody-Goody Gordon is another child character who received an ironic nickname, meaning good and quiet. Gordon is Horrid Henry's classmate who tries to be a good student and to be friends with everyone. Henry does not like this behavior, since Gordon gets on well with Henry's school "enemies", so the protagonist perceives his peaceable and friendly nature as a negative quality. The nominative characterizing component of the nickname is the reduplicated noun goody.

Spotless Sam – this characterizing nickname is given to another good boy whom the main character does not like either. The adjective spotless means flawless, immaculate. Sam is well-mannered, not naughty at school and always ready for lessons; he is an exemplary student. His impeccable behavior is the opposite of the main character's, so Horrid Henry does not like this classmate either.

Brainy Brian – in Horrid Henry's class there is a very clever and sensible boy who studies excellently, is good at all subjects and can answer any question. The main character envies him, so Brian receives the corresponding ironic characterizing nickname from him. The adjective brainy, the common-noun component of the characterizing nickname, means smart or intelligent and is a qualitative characteristic of the character.

Jolly Josh is a cheerful boy who never gets discouraged; he loves to play with classmates and friends and to make them laugh. Horrid Henry is jealous of him and gives him an ironic, outwardly positive characterizing nickname. The nominative component describes the boy's character and is conveyed by the adjective jolly.

Athletic Ashton is another of Horrid Henry's classmates. He is in excellent physical shape for his young age because he is keen on sports; he is strong and active. This becomes an object of Henry's envy, which he fixes in a characterizing nickname expressed by the adjective athletic (sporty).

Anxious Andrew – this classmate of Horrid Henry's is constantly worried and nervous about any trifle and any school assessment. He always gets agitated when he goes to the blackboard or talks to his classmates. His behavior irritates Henry, and the adjective anxious, which characterizes such behavior, becomes the nominative component of the given nickname.

Dizzy Dave is an insecure and timid boy who is shy about answering at the blackboard; when communicating with other children he speaks very quietly or confusedly. Other children find his behavior and reactions during games difficult to understand and have the impression that he behaves this way because he is dizzy. This trait is conveyed by the nominative component of the characterizing nickname, expressed by the adjective dizzy (feeling giddy).

Rude Ralf is an ill-mannered boy who is rude to all the children around him, whether a defiant older pupil or a classmate; he may fight with other boys at his school, so Henry is afraid of him and tries not to mess with him. The nominative component of the characterizing nickname is the adjective rude (ill-mannered).

Babbling Bob is a very talkative boy who talks constantly and everywhere: in the classroom, during breaks and when visiting friends. The main character is not well disposed toward him, as Bob does not give anyone the opportunity to say anything and often speaks beside the point. The participle babbling means chattering or prattling, and as a noun it denotes empty chatter; thus, the nominative component expressed by this participle becomes a speech characteristic of the character's behavior.

Weepy William is a classmate who annoys the main character with his constant whimpering over every grade he receives and his fear of injections and vaccinations. He can cry on any occasion: if he does not get a toy or is accidentally pushed during a game. Every little thing upsets him and causes tears, so the main character gives him a characterizing nickname corresponding to William's behavior in such situations. The adjective weepy (whiny, tearful) becomes the characterizing component of the nickname, aptly defining this trait of the child character.

Greedy Graham is a child character who received his characterizing nickname because of a negative trait of character. Graham is Horrid Henry's greedy classmate; he never shares a candy or an apple with his friends and is reluctant to spend his pocket money on sweets or toys. Henry does not like to play with him because Graham never lets him play with his toys and never shares treats at school as other children do. Thus, he receives from the main character a characterizing nickname expressed by the adjective greedy.

Pimply Paul is another boy who causes irritation and possibly disgust in the protagonist. In this case, the characterizing component of the nickname reflects not a trait of character or behavior but appearance. Pimply means covered with pimples; the shortcomings and flaws of a person's appearance often become the object of ridicule, and as a result the owner of such an appearance receives an offensive characterizing nickname.

Beefy Bert is Henry's classmate, Bert, who loves burgers; he is constantly eating something, and beef patties are his favorite dish. He sometimes finds it difficult to run and do exercises in his physical education lessons because of his weight. Excess weight has always been and remains a subject of school jeers, so the characterizing component of the nickname (the adjective beefy, i.e. fleshy) reflects only physical characteristics and not the character traits of a person.

Stuck-up Steve is Horrid Henry's cousin and his biggest "enemy". He attends a prestigious private school, studies French and has all the toys and gadgets that Henry dreams of. Steve is a bit older, and during visits to his relatives he always mocks Henry and plays nasty tricks on him, showing his superiority and thereby his arrogant character, as evidenced by his characterizing nickname, expressed by a set combination of a verb and a postposition: stuck-up (smug).

Vomiting Vera is another relative, Henry's cousin. She is just a baby who is often sick after feeding, which is quite common in young children. This physiological characteristic became the basis of the nominative component of the characterizing nickname, the participle vomiting.

Moody Margaret is a classmate and the girl next door. The main character does not like to communicate with her because of her constant whims and discontent on any occasion; she is always annoyed if she is not the leader in games. The adjective moody has several meanings, including capricious, gloomy and sullen, and conveys her character traits in the given characterizing nickname.

Vain Violet is another pupil in the main character's class. In Henry's opinion, Violet does not have the best character among the girls. She always strives to look the best in the eyes of others, to hear flattering compliments about her outfit or a good assessment from the girls or adults, and to emphasize her superiority. Based on this character trait, the author chooses the adjective vain, as if Horrid Henry himself had done so. The adjective means conceited, smug, narcissistic and is used as the nominative component of the characterizing nickname.

Sour Susan is Moody Margaret's classmate and friend. She is constantly in a depressed mood because Margaret likes to order her around at school and during joint games. The adjective sour, meaning gloomy or sullen, describes her character traits and is aptly reflected in this girl's characterizing nickname.

Lisping Lily. In one of the stories there is a very small girl of three or four, the sister of one of the main character's classmates. Lily is so small that she does not yet speak distinctly, but she actively tries to communicate with Henry; she cannot pronounce many sounds, in particular [r], and says Henwy instead of Henry. Thus, the individual phonetic feature of a little child's speech gives rise to a characterizing nickname in which the adjective lisping becomes the characterizing component.

Clever Clare is the only girl who gets a characterizing nickname with a positive connotation. She really is a very kind and nice girl; she is smart and sensible and always knows how to find a common language with any classmate at school and with all the children in joint games. Perhaps Horrid Henry has some sort of affection for her, so he gives her a characterizing nickname using the adjective clever (smart), which characterizes the girl exclusively from her positive side.

In this study 22 characterizing nicknames and 7 charactonyms were identified and analyzed. The results of the study are shown in Table 1 below:

Table 1 Forming models of characterizing nicknames and charactonyms

 

                            adjective   noun   participle   verb
Characterizing nicknames       18         1        2          1
Charactonyms                    -         7        -          -

As the component structure of the analyzed characterizing nicknames of the child characters shows, most of the nominative components (18 of 22) are represented by an adjective. This is the simplest and most common forming model of a characterizing nickname. The participle, noun and verb are the least productive characterizing components (2, 1 and 1, respectively). All seven charactonyms analyzed are expressed by nouns.

Conclusion

From the analysis of 29 illustrative examples of charactonyms and characterizing nicknames it can be concluded that the nominative component of a characterizing nickname mostly describes a certain character trait, while the charactonyms characterize the appearance or emphasize a certain physical feature.

Moreover, both elements of every characterizing nickname begin with the same letter: Anxious Andrew, Clever Clare, Moody Margaret. The author uses alliteration for a "special melodic effect" [10, p. 80] and to amuse the reader. Alliteration in characterizing nicknames and the use of charactonyms are characteristic techniques of the author, F. Simon. Their use in her works also contributes to the formation of the reader's linguistic representations and broadens the reader's linguistic horizons.
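The alliteration claim can be checked mechanically. The following is a minimal Python sketch, assuming that comparing the first letters of the two elements is an acceptable stand-in for alliteration (it compares spelling, not sounds); the list reproduces the 22 nicknames discussed above, and the function name is illustrative only.

```python
# The 22 characterizing nicknames as spelled in the article.
NICKNAMES = [
    "Horrid Henry", "Perfect Peter", "Goody-Goody Gordon", "Spotless Sam",
    "Brainy Brian", "Jolly Josh", "Athletic Ashton", "Anxious Andrew",
    "Dizzy Dave", "Rude Ralf", "Babbling Bob", "Weepy William",
    "Greedy Graham", "Pimply Paul", "Beefy Bert", "Stuck-up Steve",
    "Vomiting Vera", "Moody Margaret", "Vain Violet", "Sour Susan",
    "Lisping Lily", "Clever Clare",
]


def is_alliterative(nickname: str) -> bool:
    """Return True if the characterizing word and the name share an initial letter."""
    characterizing, personal_name = nickname.split(" ", 1)
    return characterizing[0].lower() == personal_name[0].lower()


# Nicknames that would contradict the claim (expected to be empty).
non_alliterative = [n for n in NICKNAMES if not is_alliterative(n)]
print(len(NICKNAMES), non_alliterative)
```

A letter-based test is a simplification: it would wrongly accept pairs such as "Cheerful Clare", where the letters match but the sounds do not.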

The use of characterizing nicknames and charactonyms is one of the ways to implement the comic effect function. Since such names carry a certain code, the process of decoding them brings great pleasure and makes the reader smile.

Since the main readership consists of preschoolers and children of primary school age, revealing the characters' features with the help of charactonyms and characterizing nicknames gives young readers a better understanding of the behavior and reactions of literary characters in particular situations in the stories. It helps them to understand the likes and dislikes of the protagonist in his attitude to those characters. Thus, the use of charactonyms and characterizing nicknames develops the cognitive abilities of the reader and realizes the developmental function.

The author of the stories uses names with a common nicknaming element in order to characterize the child characters by certain activities or external features. Reading these stories, children understand that all the heroes have characterizing nicknames given to them by the main character, who himself likes to play pranks and manipulate other children. This makes the reader think about the behavior and character of the child characters and about whether it is good or bad to give nicknames to other people. Thus, the educational function of using nicknames in texts for children is realized.

References

[1] Razhina V.A. Onomastic realities: linguistic, cultural and pragmatic aspects: dis. kand. fil. nauk / V.A. Razhina. – Rostov-na-Donu, 2007. – 156 s.

[2] Matveev A.K. Onomastics and onomatology: a terminological inquiry / A.K. Matveev // Voprosy onomastiki. 2005 g. № 2. – P. 5-11. URL: http://www.ruslang.ru/doc/onomastica/onomastica2text.pdf (vremya obrashcheniya – 07.07.19).

[3] Wilard L. Naming characters: charactonym / L. Wilard. URL: https://larawillard.com/2012/06/06/naming-characters/ (vremya obrashcheniya – 06.07.19).

[4] Balteiro, Isabel. Word-formation and the translation of Marvel comic book charactonyms. 2010. № 7. – P. 31-53. URL: https://www.researchgate.net/publication/290652782_Word-formation_and_the_translation_of_Marvel_comic_book_charactonyms (vremya obrashcheniya – 06.07.19).

[5] Kalashnikov A. Shakespearean Charactonyms in Translations into Russian / A. Kalashnikov // National Research University Higher School of Economics Journal of Language & Education. Issue 2, 2016. № 2. – P. 14-22. URL: https://jle.hse.ru/article/view/1354 (vremya obrashcheniya – 06.07.19).

[6] Boër S. E. Attributive names / S. E. Boër // Notre Dame J. Formal Logic. 19 (1978), № 1. – P. 177-185. URL: https://projecteuclid.org/euclid.ndjfl/1093888224 (vremya obrashcheniya – 06.07.19).

[7] Felecan O. Name and naming: synchronic and diachronic perspectives. Newcastle-upon-Tyne: Cambridge Scholars Publishing, 2012. – 470 p. URL: (vremya obrashcheniya – 06.07.19).

[8] Strel'cova M.YU. Characterizing names in Russian language: dis. kand. fil. nauk / M.YU. Strel'cova. – Vladivostok, 2010. – 325 s.

[9] Kapkova S.YU. The translation of realities and nicknames (based on the cycle of stories «Horrid Henry») // Nauchnyj vestnik Voronezhskogo gosudarstvennogo arhitekturno-stroitel'nogo universiteta. Seriya: Sovremennye lingvisticheskie i metodiko-didakticheskie issledovaniya. 2011. № 16. – P. 174-182.

[10] Kapkova S.YU. Features of the individual author's style in the cycle of children's humorous stories by F. Simon // Nauka i obrazovanie v XXI veke, sbornik nauchnyh trudov po materialam Mezhdunarodnoj nauchno-prakticheskoj konferencii: v 34 chastyah. – Tambov. – Izd-vo: TROOO «Biznes.Nauka.Obshchestvo», 2013. – P. 70-72.

Analyzed sources

[1*] Simon F. Horrid Henry’s Wicked Ways. London: Orion Children’s Books, 2006. – 190 p.

Dictionaries used

[1**] Onomastic terminology. URL: https://www.merriamwebster.com/dictionary/charactonym (vremya obrashcheniya – 17.07.19).

[2**] Dictionary of Russian onomastic terminology / N.V. Podol'skaya. – Moscow : Nauka, 1978. – 200 s.

[3**] Dictionary-reference. Electronic edition. Siberian federal University / Pod redakciej A.P. Skovorodnikova. Krasnoyarsk, 2014. – P. 101. URL: http://lib3.sfu-kras.ru/ft/lib2/elib/b81/i-489924.pdf (vremya obrashcheniya – 17.07.19).

[4**] Chernec L.V., Semenov V.B., Skiba V.A. School dictionary of literary terms and concepts / L.V. Chernec, V.B. Semenov, V.A Skiba. – Izd: Prosveshchenie, 2013. – 558 s.

[5**] Ozhegov S.I., Shvedova N.YU. Dictionary of the Russian language / S.I. Ozhegov, N.Yu. Shvedova. URL: https://classes.ru/all-russian/russian-dictionary-Ozhegov-term-12234.htm (vremya obrashcheniya – 17.07.19).

[6**] Efremova T.F. New Russian dictionary. Explanatory and word-formation / T.F. Efremova. URL: https://classes.ru/all-russian/russian-dictionary-Efremova-term-85978.htmv (vremya obrashcheniya – 18.07.19).

[7**] Brokgauz F.A., Efron I.A. Jenciklopedicheskij slovar' / F.A. Brokgauz, I.I. Efron. – Sankt - Peterburg, 1890-1907. URL: dic.academic.ru/dic.nsf/brokgauz_efron/83784/Prozvishha.]. (vremja obrashhenija – 19.07.19).

[8**] Merriam-Webster. URL: https://www.merriam-webster.com/dictionary/nickname (vremja obrashhenija – 19.07.19).

[9**] Cambridge Dictionary. URL: https://dictionary.cambridge.org/dictionary/english/ (vremya obrashcheniya – 19.07.19).


THEORY AND PRACTICE OF TRANSLATION

UDC 81-13

THE TECHNOLOGY OF PARALLEL TEXT CORPORA AND ITS USE

IN THE PROCESS OF TRANSLATION TRAINING

А.А. Аvdeev

____________________________________________________________________________

Voronezh State Technical University

Candidate of Philological Science, Assistant Professor of the Chair of foreign languages and translation technology

Alexander Aleksandrovich Avdeev e-mail: alexander77777@mail.ru

____________________________________________________________________________

Statement of the problem. The objective of this article is to identify the essence, goals and features of using the technology of creating text corpora as a means of optimizing the automated translation process. The peculiarities of structural organization, properties and mechanisms of text corpora functioning are described, as well as the main tools of corpus linguistics, the goals of their creation, their role in studying the aspects of translation, the nature and the types of using the corpus methodology in translation training.

Results. We have revealed the significance of the parallel text corpora technology as a means of teaching translation and a component of automated translation systems. The article reviews the principles of structural and linguistic organization of text collections, defines the stages of the technology of creating text corpora and discloses their role in methodological and didactic support of translation training.

Conclusion. A corpus of parallel texts is an effective tool for translation teachers and practitioners, providing authentic and correct translation of texts, related to general or specific discourse, from one language to another.

Key words: corpus linguistics, text corpus, parallel text corpora, aspects of translation, technology, tools, cross-cultural communication, tagging, text alignment, concordance.

For citation: Аvdeev А.А. The technology of parallel text corpora and its use in the process of translation training / А.А. Аvdeev // Scientific Journal “Modern Linguistic and Methodical-and-didactic Researches”. – 2019. - № 3 (26). – P. 109-117.

Introduction

It is known that, in the process of translating from one language into another, specialists often lack aids that simplify the search for equivalent units and for the similarities and differences between two language systems. Many translation problems can be solved by employing the methods of corpus linguistics. This methodology not only supports translation training based on the data obtained, but also makes it possible to search for texts in large parallel corpora using natural language processing methods.

The concept of «text corpus», basic for corpus linguistics, occupies an increasingly important place in the scientific discourse of specialists in linguistics and computer translation. Most well-known linguists regard a corpus as a collection of texts selected and processed according to certain rules and used as a basis for studying a language. The aims of a corpus also include the statistical analysis of language units and phenomena, as well as the confirmation of the rules of the given language. Among the main characteristics of a text corpus, the following can be singled out: it is electronic, representative (able to represent adequately the object being modelled), tagged (as distinct from an ordinary text collection) and pragmatically oriented (i.e. created for a specific functional-pragmatic task) [1, p. 223].

_________________

© Аvdeev А.А., 2019


The emergence of parallel text corpora has changed the methods used by specialists in linguistics and translation. Lexicologists who have unlimited access to a text corpus, or to another set of texts subject to automatic processing, can sample, analyze and count examples of the use of a word or phrase in the shortest time, using a database that contains millions of text units of different size. In turn, translators using this technology can easily choose the most frequent translation variant of a source-language unit, which usually turns out to be correct. The large number of examples studied makes the resulting descriptions more complete and accurate. Because text corpus data include a large amount of meta-information (i.e. the author of the text, its style and genre, the date of writing, regional, social and cultural differences, etc.), it is easy to link the use of individual units or phrases to a particular style, region, etc.
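The idea of choosing the most frequent translation variant can be illustrated with a minimal Python sketch. The aligned fragments below are invented for illustration and are not taken from the portals named in the article; the function name `rank_variants` and the simple substring matching are likewise assumptions, and a real system would work on lemmatized, tagged data.

```python
from collections import Counter

# Hypothetical aligned fragments: (English source, Russian translation).
PAIRS = [
    ("A text corpus is tagged.", "Корпус текстов размечен."),
    ("The text corpus is open.", "Корпус текстов является открытым."),
    ("Each text corpus has metadata.", "Каждый текстовый корпус содержит метаданные."),
    ("Alignment is hard.", "Выравнивание является сложной задачей."),
]


def rank_variants(pairs, source_unit, candidates):
    """Count how often each candidate translation co-occurs with the source unit."""
    counts = Counter()
    for source, target in pairs:
        if source_unit.lower() not in source.lower():
            continue  # the source unit does not occur in this fragment
        for candidate in candidates:
            if candidate.lower() in target.lower():
                counts[candidate] += 1
    return counts.most_common()  # most frequent variant first


ranked = rank_variants(PAIRS, "text corpus", ["корпус текстов", "текстовый корпус"])
print(ranked)
```

The most frequent variant is only a first suggestion; the meta-information mentioned above (genre, style, date) would normally be used to narrow the fragments before counting.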

A number of researchers emphasize the special relevance of using text corpora in the process of two-way written translation. For instance, the research of L.N. Belyaeva is focused on applied aspects of corpus linguistics, one of which, in her opinion, can be the creation of automated lexicographic systems necessary for the work of written translators [2, p. 86; 3]. V.N. Shevchuk notes that the application of the text corpus technology makes it possible to improve the quality of written translation and provides a full picture of all varieties of norms (lexical, grammatical, syntactic, stylistic, orthographic and punctuational) existing in the modern language [3]. In addition, there is no doubt about the role of text corpora in solving numerous linguistic and extralinguistic problems arising in the process of translation.

The methodology of the research

In the present article, the object of the research is the corpora of parallel Russian and English texts presented on the web portals www.linguee.ru and www.context.reverso.net. The subject of the research is the main structural and linguistic features of text corpora, the mechanisms of their functioning and use for various tasks in language theory and translation, and the basic tools of corpus linguistics that contribute to the study of translation aspects.

The objectives of the research are to review the principles of structural arrangement and linguistic organization of text corpora, describe the algorithm of their creation and investigate the features of their use as a means of methodological and didactic support of classroom and independent work of students in the field of translation. The practical task of the study is to analyze the main forms of translation training, based on parallel text corpora, with examples of using language material presented on the portals of contextual search systems www.linguee.ru and www.context.reverso.net.

The material of the research consists of Russian and English text fragments presented in the relevant corpora on the e-portals www.linguee.ru and www.context.reverso.net. The study employs the comparative and contrastive methods, as well as the method of contextual analysis of language units.

The results of the research

The study of theoretical works in the field of corpus linguistics and the comparative analysis of textual material (i.e. the databases of the portals www.linguee.ru and www.context.reverso.net) made it possible to disclose the principles of structural and linguistic organization of the text collections underlying parallel corpora in the Russian and English languages. The typology of text corpora is presented, the main types of text data meta-tagging are described, and the main tools and mechanisms that allow text corpora to realize their methodological and didactic potential are considered. The practical material under study enabled us to identify the nature and main types of using the corpus technique in teaching professionally oriented translation.

The use of text corpora and related programs aims to achieve the following goals:

- the search and selection of lexical and grammatical language units of different levels, i.e. words, word forms, grammatical categories and word combinations. This procedure is performed using a special function called a morphological descriptor, which is applied for the comprehensive analysis of the language unit under study on the basis of its grammatical categories and properties;

- the search for the required word form in contexts defined by specified parameters and distinguished according to certain criteria. This function aims, firstly, to collect the factual data and, secondly, to form the theoretical base of research (i.e. dictionaries, grammar guides, exercise books, methodological and didactic manuals, practical courses);

- the analysis of concordance (the joint use of a language unit with others in contexts of fixed length);

- obtaining linguo-statistical data in order to identify the frequency of use of certain forms, words or phrases. The solution of this problem includes defining semantic differences in the structure of synonyms; detecting contexts and occasional variants of usage and translation typical of synonymous language units; and identifying, distinguishing and analysing genre and stylistic features, as well as shades of word meanings;

- the etymological analysis and description of the nature of semantic transformations of a language unit [4].
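Two of the goals listed above, the analysis of concordance in contexts of fixed length and the collection of frequency data, can be sketched in a few lines of Python. This is a minimal illustration on an invented sample text, assuming a naive regular-expression tokenizer; the function names are hypothetical and do not describe the tools used on the portals.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-case word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def concordance(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for every occurrence.

    `width` is the fixed number of tokens kept on each side of the keyword.
    """
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            hits.append((" ".join(left), token, " ".join(right)))
    return hits


# Invented sample text used only for illustration.
sample = ("The corpus is tagged. A parallel corpus stores aligned texts. "
          "Translators query the corpus daily.")
tokens = tokenize(sample)
frequencies = Counter(tokens)          # linguo-statistical data: token counts
hits = concordance(tokens, "corpus", width=2)

for left, key, right in hits:
    print(f"{left:>15} [{key}] {right}")
```

Real corpus tools add lemmatization and morphological tags, so that all forms of a word are found by one query; the sketch matches exact tokens only.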

Let us consider in greater detail the issue, related to the properties of a text corpus, its structure and mechanisms of parallel text corpora functioning in the process of automated translation.

In our view, representativeness and openness must be emphasized among the main properties of a text corpus. V.P. Zakharov defines representativeness as the sufficient and proportional representation in the text corpus of texts related to various periods, genres, styles, authors, etc. [5]. The functional potential of corpus investigation makes it possible to process a vast array of language material (both spoken and written), which ensures the necessary typicality of data and completeness in the presentation of the entire range of linguistic phenomena. The corpus content consists of numerous natural written and oral texts of all types that exist in a particular historical period in a particular socio-cultural environment. Moreover, various data are presented in a text array in their natural contextual form and typical language environment, which creates the possibility of their comprehensive and objective study.

In turn, the openness of a text corpus implies the unlimited possibility of its replenishment with new facts, as a result of which researchers and translators have the opportunity to get an idea of the latest trends in the language, based on the analysis of using one or another word in a corpus.

By their structure, text corpora can be differentiated as monolingual, bilingual and multilingual. A multilingual corpus includes several monolingual corpora, similar in structure. Parallel corpora also include other types of text arrays, in particular, diachronic corpora (those comprising texts in the earlier form of language evolution, as well as their translations into a modern language), and transcriptional corpora, composed of texts in the literary language, read by users of its different dialects and territorial varieties [6].

The significance of parallel corpora is due to the fact that they provide the possibility to present and objectively establish the methods by which translators cope with translation problems in practice. In addition, a variety of text arrays represent an important tool for the elaboration of objective translation models for beginning specialists.

Another major application area of corpus technology is the study of the translation norm in certain sociocultural and historical contexts, as well as of the extent to which texts conform to regulatory requirements or deviate from them. In the field of applied linguistics, arrays of parallel texts provide the data necessary for the approbation and testing of automatic translation systems, the formation, supplementation and operation of translation memory systems, and the development of an automatic system of search for translation equivalents. Other tasks successfully solved with the use of parallel text corpora include the contrastive, statistical and frequency analysis of language material, the detection of the degree of inaccuracy and information loss in the process of translation, the development and implementation of various translation strategies and the choice of appropriate translation techniques.

It should be noted that, in the preparation of parallel text corpora, it is necessary to consider the factor of intercultural links. If they are extremely weak or missing, the task of creating a full-fledged parallel text corpus becomes impossible. The texts in the source language, despite their primacy, are selected taking into account the availability of equivalents in the target language. The materials used may come from various sources, namely, texts on special subjects, newspaper and informative texts, scientific and technical texts, and literary texts. The combination of these sources gives a parallel corpus the property of representativeness mentioned above.

The structural organization of the text corpus can be very diverse, which is determined by the pragmatic tasks of its compiler or user. The most common options are:

1) a traditional text with reference to translation;

2) a tabular «mirror» form, providing the convenience of perception, comparison and analysis;

3) a database (a structure, which is an integral part of automatic processing systems).
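
The second and third options can be illustrated with a minimal sketch. It is not taken from any particular corpus system: the table name `segments`, the helper functions and the English-German sentence pairs are all invented for illustration, and an in-memory SQLite database merely stands in for the storage layer of an automatic processing system.

```python
import sqlite3

# Invented sentence pairs (source, target) used only for illustration.
pairs = [
    ("Good morning.", "Guten Morgen."),
    ("Thank you very much.", "Vielen Dank."),
]


def mirror(pairs, width=24):
    """Tabular 'mirror' form: source on the left, translation on the right."""
    return [f"{src:<{width}}| {tgt}" for src, tgt in pairs]


def to_database(pairs):
    """Store the aligned segments in a (hypothetical) one-table database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE segments (id INTEGER PRIMARY KEY, source TEXT, target TEXT)"
    )
    conn.executemany("INSERT INTO segments (source, target) VALUES (?, ?)", pairs)
    return conn


def find_equivalents(conn, word):
    """Return every aligned pair whose source segment contains the word."""
    rows = conn.execute(
        "SELECT source, target FROM segments WHERE source LIKE ?", (f"%{word}%",)
    )
    return rows.fetchall()


for line in mirror(pairs):
    print(line)
print(find_equivalents(to_database(pairs), "Thank"))
```

The mirror form serves the human reader, while the database form makes the same pairs searchable by a program, which is what the search for translation equivalents requires.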

One of the most important conditions for creating a full-fledged text corpus is the alignment of texts. Alignment is the identification of correlating fragments in both parts of the corpus. The necessity for this procedure is determined by two factors. First, the differences in the semantic structure and combinability of words and in the grammatical and syntactic structures of the two languages, together with idiomatic and stylistic differences between the original text and the translation, rule out a linear correspondence at the level of language units. Secondly, the most literal version of the translation is not always preferable in practice, so a translator uses various techniques, in particular, the splitting of original syntactic structures. Thus, the alignment of parallel texts is the basis of any work related to their statistical analysis. The alignment of the parallel corpus at the sentence level is an important condition for conducting applied linguistic research. In the process of translation, the sentences of a text may undergo various transformations (splitting, combination, insertion of sentences, changes in word order). For this reason, alignment often proves to be a complicated task.
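
Sentence-level alignment that tolerates splitting, combination and insertion can be sketched as follows. The sketch assumes a simplified length-based heuristic (a longer source sentence tends to have a longer translation) solved by dynamic programming. This technique is not described in the present article; the cost function, the penalty values and the example sentences are hypothetical and untuned.

```python
from typing import List, Tuple

# Allowed correspondences: (source sentences consumed, target sentences consumed).
# 1-1 is a plain match, 1-2 a split, 2-1 a combination, 1-0 and 0-1 an
# omission and an insertion respectively.
BEADS = [(1, 1), (1, 2), (2, 1), (1, 0), (0, 1)]
# Hypothetical penalties that make anything other than 1-1 less attractive.
PENALTY = {(1, 1): 0.0, (1, 2): 0.2, (2, 1): 0.2, (1, 0): 1.0, (0, 1): 1.0}


def cost(src_len: int, tgt_len: int, bead: Tuple[int, int]) -> float:
    """Relative mismatch of character lengths plus the penalty for the bead type."""
    return abs(src_len - tgt_len) / max(src_len, tgt_len, 1) + PENALTY[bead]


def align(src: List[str], tgt: List[str]) -> List[Tuple[List[str], List[str]]]:
    """Return the cheapest sequence of (source group, target group) pairs."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    best = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if best[i][j] == inf:
                continue
            for di, dj in BEADS:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                total = best[i][j] + cost(s_len, t_len, (di, dj))
                if total < best[ni][nj]:
                    best[ni][nj] = total
                    back[ni][nj] = (i, j)
    # Walk back from the end of both texts to recover the chosen groups.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    pairs.reverse()
    return pairs


source = ["The cat sat on the mat and then it fell asleep.", "It rained."]
target = ["Die Katze setzte sich auf die Matte.", "Dann schlief sie ein.", "Es regnete."]
for src_group, tgt_group in align(source, target):
    print(src_group, "<->", tgt_group)
```

In this invented example the translator has split the first source sentence in two, and the procedure pairs it with the first two target sentences. Real aligners usually combine length information with lexical evidence, which this sketch leaves out.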

The availability of a data array is neither a sufficient nor the only condition for solving various problems in the field of text linguistics and translation. The comprehensive analysis of textual material necessarily requires the presence of additional linguistic and extralinguistic information in texts. This resulted in the need for proper text corpus structuring, which found its reflection in the idea of a tagged corpus. The idea of tagging is that text units and their components receive special tags, so that a specialist can get a complete picture of the properties of a given text and the required prior knowledge. Linguistic tags, as a rule, contain data on the linguistic characteristics of the text itself (lexical, grammatical and others). Structural tags contain data on the features of its structural organization and the levels of text material (chapter, heading, paragraph, sentence, word combination, word form), while extralinguistic tags include information about the author and general data on the text under study (author, title, date of publication, genre, subject, stylistic identity, pragmatic function, etc.). The general objectives of the study, the needs of specialists in a particular case of linguistic or translational analysis, and the possibility of introducing additional features into the text are the most important factors determining which corpus data specialists (translators) turn to.
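
The three kinds of tags can be pictured as three layers of one record. The following sketch is only an illustration under assumed names: the classes `Token`, `Sentence` and `CorpusText`, the function `select` and the sample texts are hypothetical and do not reproduce the format of any existing corpus.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Token:
    form: str                   # word form as it appears in the text
    linguistic: Dict[str, str]  # linguistic tags, e.g. lemma and part of speech


@dataclass
class Sentence:
    tokens: List[Token]


@dataclass
class CorpusText:
    # Extralinguistic tags: author, title, date, genre and so on.
    meta: Dict[str, str]
    # Structural tags are expressed by nesting: text > paragraph > sentence > token.
    paragraphs: List[List[Sentence]] = field(default_factory=list)


def select(corpus: List[CorpusText], **criteria: str) -> List[CorpusText]:
    """Return the texts whose extralinguistic tags match every criterion."""
    return [
        text for text in corpus
        if all(text.meta.get(key) == value for key, value in criteria.items())
    ]


# Two invented texts with minimal tagging.
corpus = [
    CorpusText(
        meta={"title": "Storm warning", "genre": "news", "date": "2019"},
        paragraphs=[[Sentence([Token("Storms", {"lemma": "storm", "pos": "NOUN"})])]],
    ),
    CorpusText(
        meta={"title": "A winter tale", "genre": "fiction", "date": "2018"},
        paragraphs=[[Sentence([Token("Once", {"lemma": "once", "pos": "ADV"})])]],
    ),
]

print([text.meta["title"] for text in select(corpus, genre="news")])
```

Filtering by extralinguistic tags is what allows a specialist to restrict an analysis to, for instance, newspaper texts of a given year before any linguistic tags are consulted.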

The analysis of works in the field of structural organization of a text corpus (A.M. Amieva, I.M. Boguslavsky) allows us to conclude that linguistic tagging includes the following types [7; 8]:

- morphological or part-of-speech tagging. This type of tagging serves to indicate parts of speech and their inherent grammatical categories. Most large corpora are morphologically tagged, which creates the prerequisites for comprehensive morphological and semantic-syntactic analysis, while the achievements in computational morphology allow for automatic tagging of large corpora;

- syntactic tagging, which results from syntactic analysis (parsing) and is based on the data of morphological analysis. This type of tagging is used to describe syntactic links between lexical units, that is, the surface syntactic structure of a sentence, and various syntactic structures (for example, a subordinate clause, a verb phrase, etc.). The information obtained helps to represent more fully the set of logical roles in the situation described by a particular sentence, that is, to understand the deep structure of this sentence;

- semantic tagging, which serves to designate the semantic categories to which particular language units belong, as well as their more specific subcategories;

- anaphoric tagging, which marks reference links (for example, pronominal ones) in the given sentence;

- prosodic tagging, containing the markers of stress and intonation. This type of tagging is often accompanied by so-called discourse tagging, which serves to mark pauses, repetitions, etc.
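
The way these types of tagging coexist on one sentence can be shown with a small sketch. The field names and the two helper functions are hypothetical; the part-of-speech and relation labels loosely follow Universal Dependencies naming, which is an assumption of this example rather than a convention used by the cited authors. Prosodic tagging is omitted because the example is a written sentence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    form: str
    pos: str                          # morphological layer: part of speech
    feats: str                        # morphological layer: grammatical categories
    head: Optional[int]               # syntactic layer: index of the governing token
    relation: str                     # syntactic layer: type of the syntactic link
    sem: Optional[str] = None         # semantic layer: semantic category
    antecedent: Optional[int] = None  # anaphoric layer: index of the referent


# "Mary lost her keys." tagged by hand for illustration.
sentence = [
    Token("Mary", "PROPN", "Number=Sing", 1, "nsubj", sem="person"),
    Token("lost", "VERB", "Tense=Past", None, "root", sem="event"),
    Token("her", "PRON", "Poss=Yes", 3, "nmod:poss", antecedent=0),
    Token("keys", "NOUN", "Number=Plur", 1, "obj", sem="artifact"),
]


def dependents(tokens: List[Token], head_index: int) -> List[str]:
    """Word forms directly governed by the token at head_index."""
    return [t.form for t in tokens if t.head == head_index]


def resolve(tokens: List[Token], index: int) -> Optional[str]:
    """Word form the token at index refers back to, if it is anaphoric."""
    ant = tokens[index].antecedent
    return tokens[ant].form if ant is not None else None


print(dependents(sentence, 1))  # tokens governed by "lost"
print(resolve(sentence, 2))     # referent of "her"
```

Each layer answers a different question about the same token: what it is, what it depends on, what it denotes and what it refers to.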

The creation of a text corpus is a complex technological process, which includes both linguistic procedures proper and a set of actions related to automation of this process. It can be presented in the form of the following stages or steps:

- The search and selection of information resources. At present, these problems are solved using universal search engines and directories (for example, Google), as well as national search engines. The use of advanced search mechanisms allows us to find the necessary texts on sites in a particular language or with a specific territorial affiliation.

- Digitization of texts (conversion to computer form). It should be noted that, at present, this problem is solved relatively easily, at least in relation to modern texts and spelling. This is due to the improved possibilities of optical input (scanning) and text recognition, as well as to the growing computerization of modern life, including in areas related to the search and processing of text data. The main methods for obtaining e-texts for corpus creation are manual input, scanning, the use of authors' copies and a number of other methods.

- Preliminary text processing. At this stage, the text material intended for creating a corpus and obtained from different sources is subject to preliminary editing and correction, and is also provided with the necessary bibliographic and extralinguistic description.

- Conversion, transcoding, formatting and graphematic analysis. At this stage, the texts are subject to complex machine processing in one or several steps. If necessary, it involves transcoding, the processing of non-textual elements (figures, drawings, tables), often by deleting or transforming them, dehyphenation and other operations. As a rule, these operations are performed automatically. In addition, this stage includes text fragmentation, that is, the selection of its structural components for further analysis.

- Text tagging. This stage is associated with the compilation of a meta-description for text units, that is, with the attribution of additional information (metadata) to texts and their constituent elements (both structural and lexical). This operation greatly simplifies the automatic analysis of the text corpus. The content elements of the meta-description have already been mentioned above, and the formal elements indicate the technical parameters of a text unit (e.g. file name, encoding parameters, version of the tagging language, performers of specific work stages). This information is usually entered manually, while structural and linguistic tagging are performed automatically.

- Proofreading. At this stage, the results of automatic tagging are checked in order to correct errors and remove the ambiguity caused by various linguistic and extralinguistic factors.
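
Part of the machine processing listed above (dehyphenation, removal of non-textual residue, fragmentation and the attachment of a meta-description) can be sketched as follows. All function names are hypothetical, the sentence splitter is deliberately naive, and the rules are assumptions that a real corpus project would refine.

```python
import re
from typing import Dict, List


def dehyphenate(text: str) -> str:
    """Join words split by a hyphen at a line break: 'identifi-' + 'cation'.

    Caveat: a genuine compound that happens to break at its hyphen
    would also lose the hyphen under this simple rule.
    """
    return re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)


def clean(text: str) -> str:
    """Dehyphenate, drop page-number lines and normalise whitespace."""
    text = dehyphenate(text)
    # A line consisting only of digits is treated as a page number.
    lines = [ln for ln in text.splitlines() if not re.fullmatch(r"\s*\d+\s*", ln)]
    # Blank lines separate paragraphs; line breaks inside a paragraph become spaces.
    paragraphs = re.split(r"\n\s*\n", "\n".join(lines))
    return "\n\n".join(" ".join(p.split()) for p in paragraphs if p.strip())


def fragment(text: str) -> List[List[str]]:
    """Split cleaned text into paragraphs and (naively) into sentences."""
    result = []
    for paragraph in text.split("\n\n"):
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
        result.append(sentences)
    return result


def build_record(raw: str, meta: Dict[str, str]) -> Dict[str, object]:
    """Attach a manually supplied meta-description to the fragmented text."""
    return {"meta": meta, "paragraphs": fragment(clean(raw))}


raw = (
    "Alignment is the identifi-\ncation of fragments. It is\nhard.\n\n"
    "112\n\n"
    "Tagging adds metadata."
)
record = build_record(raw, {"file": "sample.txt", "encoding": "utf-8"})
for paragraph in record["paragraphs"]:
    print(paragraph)
```

The output of such a step still requires the proofreading described in the last stage, since rules of this kind inevitably make mistakes on abbreviations, compounds and numbers that are part of the text.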
