<?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE article PUBLIC "-//NLM/DTD JATS (Z39.96) Journal Publishing DTD v1.2 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
    <!--<?xml-stylesheet type="text/xsl" href="article.xsl">-->
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.2" xml:lang="en">
	<front>
		<journal-meta>
			<journal-id journal-id-type="issn">2313-0288</journal-id>
			<journal-id journal-id-type="eissn">2411-2968</journal-id>
			<journal-title-group>
				<journal-title>Russian Linguistic Bulletin</journal-title>
			</journal-title-group>
			<issn pub-type="epub">2313-0288</issn>
			<publisher>
				<publisher-name>Cifra LLC</publisher-name>
			</publisher>
		</journal-meta>
		<article-meta>
			<article-id pub-id-type="doi">10.60797/RULB.2025.71.6</article-id>
			<article-categories>
				<subj-group>
					<subject>Brief communication</subject>
				</subj-group>
			</article-categories>
			<title-group>
				<article-title>CORPUS APPROACH IN TRANSLATION STUDIES</article-title>
			</title-group>
			<contrib-group>
				<contrib contrib-type="author" corresp="yes">
					<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-1990-3061</contrib-id>
					<contrib-id contrib-id-type="rid">https://publons.com/researcher/C-1924-2016</contrib-id>
					<name>
						<surname>Ivanova</surname>
						<given-names>Elizaveta Vasilievna</given-names>
					</name>
					<email>e.v.ivanova@spbu.ru</email>
					<xref ref-type="aff" rid="aff-1">1</xref>
				</contrib>
			</contrib-group>
			<aff id="aff-1">
				<label>1</label>
				<institution>Saint Petersburg State University</institution>
			</aff>
			<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2025-11-10">
				<day>10</day>
				<month>11</month>
				<year>2025</year>
			</pub-date>
			<pub-date pub-type="collection">
				<year>2025</year>
			</pub-date>
			<volume>3</volume>
			<issue>71</issue>
			<fpage>1</fpage>
			<lpage>3</lpage>
			<history>
				<date date-type="received" iso-8601-date="2025-09-16">
					<day>16</day>
					<month>09</month>
					<year>2025</year>
				</date>
				<date date-type="accepted" iso-8601-date="2025-10-10">
					<day>10</day>
					<month>10</month>
					<year>2025</year>
				</date>
			</history>
			<permissions>
				<copyright-statement>Copyright: &#x00A9; 2022 The Author(s)</copyright-statement>
				<copyright-year>2022</copyright-year>
				<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
					<license-p>
						This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See 
						<uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.
					</license-p>
				</license>
			</permissions>
			<self-uri xlink:href="https://rulb.org/archive/11-71-2025-november/10.60797/RULB.2025.71.6"/>
			<abstract>
				<p>The article examines the advantages of corpus analysis for exploring and choosing the most appropriate English translation equivalents for Russian set expressions. Its goal is a consistent description of the English translation equivalents in terms of their frequency in the corpus and the features of the contexts in which they are used. This description yields additional strategies for choosing the most suitable translation equivalent. Further research along these lines will contribute both to a more detailed and complete description of the principles of choice and to their practical application.</p>
			</abstract>
			<kwd-group>
				<kwd>set expression</kwd>
				<kwd>translation equivalent</kwd>
				<kwd>corpus</kwd>
				<kwd>corpus analysis</kwd>
				<kwd>frequency</kwd>
			</kwd-group>
		</article-meta>
	</front>
	<body>
		<sec>
			<p>1. Introduction</p>
			<p>Both the theory and the practice of translation have changed significantly in the first quarter of the 21st century. This is due, on the one hand, to the impact of cognitive science and cognitive linguistics and, on the other, to the introduction of corpus data and computer technologies in general into linguistic research, including translation studies. A corpus approach to translation studies can be based on parallel corpora, in which texts in the source and target languages are aligned sentence by sentence, or on comparable corpora, which contain texts in different languages on the same topics that are not direct translations of each other. Corpus data can also be employed to select domain-specific terms and their translations, which is useful for compiling dictionaries and glossaries. In this paper, we consider some aspects of using monolingual corpus data for choosing translation equivalents.</p>
			<p>The aim of this paper is to examine the use of corpus data in translating set combinations of words, i.e. set expressions. The expressions are drawn from the academic sphere. To achieve this aim, definition analysis, contextual analysis, translation analysis and corpus analysis are used. The corpus analysis is based on the Corpus of Contemporary American English [5].</p>
			<p>2. Main results</p>
			<p>The study allows us to outline the following results:</p>
			<p>1. Computer technologies are widely used in translation studies, providing researchers and translators with useful tools both for theoretical reflection on translation processes and for practical work: achieving the pragmatic aims of translation and developing translation techniques. The contribution of corpus data in this respect can hardly be overestimated.</p>
			<p>2. Corpus analysis of the frequencies of the set phrases “practice exam/test”, “trial exam/test”, “preliminary exam/test”, “mock exam/test” and “simulated exam/test” identifies the most plausible versions for translating the Russian set expression “тренировочный экзамен/тест”. Nevertheless, although frequency has a strong impact on the choice of the target-language unit, other factors, such as the context as a whole, should not be ignored.</p>
			<p>3. The large number of varied contexts that corpus data supply for a given set phrase helps researchers and translators delineate subtle semantic differences and make the appropriate choice in translation. The example of “language use” and “language usage” discussed in this article illustrates the importance of considering these contexts.</p>
			<p>3. Discussion</p>
			<p>There now exists a vast and varied body of translation studies based on corpus analysis and on computer technologies in general [4], [6], [8], [9], [10]. In particular, the importance of corpus data for translation has been described in detail in the works of D.O. Dobrovolsky and E.V. Pivovarova [1], [2], [3]. These scholars outlined the irreplaceable value of corpora for examining the contextual suitability of phraseological equivalents in source and target languages. In this article, set expressions that are characterized by reproducibility but not by imagery are considered with regard to their translation equivalents.</p>
			<p>It is well known that word combinations are idiomatic and language-specific, and that these properties differ from language to language. A classic example: “высокий мужчина” corresponds to “a tall man”, whereas “высокое здание” corresponds to “a tall/high building”.</p>
			<p>For this reason, it is often impossible to replace the words in a combination, especially a set combination, with their dictionary equivalents in the target language. For example, “тренировочный экзамен/тест” could be translated into English as “practice exam/test”, “trial exam/test”, “preliminary exam/test”, “simulated exam/test” or “mock exam/test”.</p>
			<p>Let’s look at the frequency of these set combinations in the corpus.</p>
			<p>Among the “exam” combinations, “preliminary exam” has the highest frequency (17). “Preliminary test” occurs 49 times, but only 5 of these occurrences refer to the academic sphere.</p>
			<p>The interested students usually take a preliminary test to determine whether they want to attempt the qualifying exam.</p>
			<p>The combination “practice exam”, on the other hand, is not registered at all, while “practice test” has the highest frequency of all the analysed combinations (82), and only 7 of these cases fall outside the academic sphere.</p>
			<p>I went online and took the practice test. I knew I struggled with math and science.</p>
			<p>The next set expressions are far less frequent: “mock exam” occurs 3 times and “mock test” 4 times.</p>
			<p>Lastly, even though the students participated in a full, eight-hour timed mock exam, it only simulates the real MCAT testing situation,</p>
			<p>Alternatively, we'll be happy to mail you an in-home mock test.</p>
			<p>The combination “trial exam” is not encountered in the corpus, and “trial test” is used 4 times, but not in the academic sense:</p>
			<p>This week, the California-based producer of baby carrots launched a trial test of its newest product</p>
			<p>There are no examples for “simulated exam”, while “simulated test” is registered 5 times, but only 1 example refers to the academic domain:</p>
			<p>…undergraduate courses such as general psychology or personality theory. In this situation, simulated test items should be used to demonstrate any given device or technique.</p>
			<p>If we choose the translation purely by corpus frequency, the best options are “preliminary exam” and “practice test”. However, the pair “mock exam/test” is more balanced, because the same first word combines with both nouns at similar frequencies. Another factor that may influence the translator’s decision is the existence of the noun “mock/mocks”, which designates an exam or test taken for practice. The corpus frequency of this noun, however, requires manual processing, because the counts for the noun and for the corresponding verb are not differentiated.</p>
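			<p>The comparison above can be reproduced with a short script. The following is only a minimal sketch: the counts are copied from the figures reported in this section, the names Candidate, best_by_head_noun and rank_by_academic_hits are hypothetical helpers introduced for illustration, and an academic count is set to None wherever the article does not report one.</p>
			<code language="python"><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    phrase: str
    total: int                # all corpus hits reported in the article
    academic: Optional[int]   # hits in the academic sense; None if not reported


# Counts copied from the discussion above; nothing is queried from the corpus here.
CANDIDATES = [
    Candidate("preliminary exam", 17, None),
    Candidate("preliminary test", 49, 5),
    Candidate("practice exam", 0, 0),
    Candidate("practice test", 82, 82 - 7),  # 7 hits fall outside the academic sphere
    Candidate("mock exam", 3, None),
    Candidate("mock test", 4, None),
    Candidate("trial exam", 0, 0),
    Candidate("trial test", 4, 0),
    Candidate("simulated exam", 0, 0),
    Candidate("simulated test", 5, 1),
]


def best_by_head_noun(candidates, head):
    """Return the most frequent candidate whose last word is `head`."""
    pool = [c for c in candidates if c.phrase.split()[-1] == head]
    return max(pool, key=lambda c: c.total)


def rank_by_academic_hits(candidates):
    """Rank candidates with a reported academic count, highest first."""
    known = [c for c in candidates if c.academic is not None]
    return sorted(known, key=lambda c: c.academic, reverse=True)


if __name__ == "__main__":
    for head in ("exam", "test"):
        top = best_by_head_noun(CANDIDATES, head)
        print(f"most frequent '{head}' phrase: {top.phrase} ({top.total})")
    for c in rank_by_academic_hits(CANDIDATES)[:3]:
        print(f"{c.phrase}: {c.academic} academic hits")
```
]]></code>
			<p>Keeping the total and the academic counts apart makes it visible that a high raw frequency, as with “preliminary test”, does not necessarily mean a high frequency in the relevant domain.</p>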
			<p>The translation equivalents for “оценка научно-исследовательской работы” have similar frequencies in the corpus: “evaluation of research” occurs 12 times and “assessment of research” 10 times. Going by frequency alone, either can be chosen.</p>
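			<p>Whether two counts as close as 12 and 10 differ meaningfully can be checked with a simple significance test. The sketch below is an illustration only and is not part of the analysis reported in this article: it assumes an exact two-sided binomial test against an even split, and the function name two_sided_binomial_p is hypothetical.</p>
			<code language="python"><![CDATA[
```python
from math import comb


def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for k successes in n trials under p = 0.5."""
    probs = [comb(n, i) / 2 ** n for i in range(n + 1)]
    observed = probs[k]
    # Sum the probabilities of all outcomes no more likely than the observed one
    # (the small tolerance guards against floating-point ties).
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


EVALUATION = 12  # hits for "evaluation of research", as reported above
ASSESSMENT = 10  # hits for "assessment of research", as reported above

p_value = two_sided_binomial_p(EVALUATION, EVALUATION + ASSESSMENT)

if __name__ == "__main__":
    print(f"p = {p_value:.3f}")
```
]]></code>
			<p>A p-value far above conventional thresholds such as 0.05 would be consistent with treating the two equivalents as interchangeable on frequency grounds.</p>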
			<p>Another aspect that should be taken into account when looking for translation equivalents is the context.</p>
			<p>Let’s look at the frequently encountered Russian set expression “использование языка”. The dictionary [7] offers the following definitions for the nouns “use” and “usage”:</p>
			<p>Use — the act of using something; the state of being used</p>
			<p>Usage — the way in which words are used in a language: current English usage</p>
			<p>At first glance, “language usage” looks like the more appropriate translation. But the corpus data cast some doubt on this.</p>
			<p>“language use” — 429</p>
			<p>Lying can cause behavioral change in language use because it is cognitively demanding</p>
			<p>… problems associated with poor communication between patients and doctors, including issues of language use,</p>
			<p>“language usage” — 128</p>
			<p>While collocation can reveal new patterns in language usage, it tends to be an exploratory tool</p>
			<p>The author notes that the areas in which students struggled were mainly centred on language usage, expressed by the educators as ‘the inability of students to express themselves'.</p>
			<p>Still, if the goal of proper language usage is to be understood by others, clarity is better than complexity.</p>
			<p>It can be assumed that the contexts with “language usage” are more particular and concrete than those with “language use”, which look more generalized, but this difference is very subtle and cannot be traced in all sentences.</p>
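			<p>The relative weight of the two variants can be expressed as shares of their combined frequency. The following minimal sketch uses only the two counts reported above; the helper name shares is hypothetical.</p>
			<code language="python"><![CDATA[
```python
def shares(counts):
    """Map each phrase to its share of the combined hit count."""
    total = sum(counts.values())
    return {phrase: n / total for phrase, n in counts.items()}


# Counts as reported above.
COUNTS = {"language use": 429, "language usage": 128}

if __name__ == "__main__":
    for phrase, share in shares(COUNTS).items():
        print(f"{phrase}: {share:.1%}")
    ratio = COUNTS["language use"] / COUNTS["language usage"]
    print(f"use : usage ratio = {ratio:.2f}")
```
]]></code>
			<p>On these figures “language use” is more than three times as frequent as “language usage”, which is why the dictionary definition alone is not a sufficient basis for the choice.</p>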
			<p>4. Conclusion</p>
			<p>Computer technologies in general and corpus analysis in particular play, and will continue to play, an increasing role in translation, which calls for further theoretical and practical study of various cases of their implementation. Such case studies will provide valuable material for the development of translation theory and translation techniques. In this paper, only two aspects of using a monolingual corpus for translation are examined: the frequency of candidate equivalents and the contexts in which they occur.</p>
		</sec>
	</body>
	<back>
		<ref-list>
			<ref id="B1">
				<label>1</label>
				<mixed-citation publication-type="confproc">Dobrovolskii D.O. Korpusi tekstov i dvuyazichnaya frazeografiya [Text corpora and two-language phraseology] / D.O. Dobrovolskii // Vestnik Novosibirskogo gosudarstvennogo pedagogicheskogo universiteta [Bulletin of Novosibirsk State Pedagogical University]. — 2015. — № 5 (27). — P. 23–37. [in Russian]</mixed-citation>
			</ref>
			<ref id="B2">
				<label>2</label>
				<mixed-citation publication-type="confproc">Dobrovolskii D.O. Korpusnii podkhod k issledovaniyu frazeologii: novie rezultati po dannim parallelnikh korpusov [Corpus approach to phraseological studies: new results based on parallel corpus data] / D.O. Dobrovolskii // Vestnik Sankt-Peterburgskogo universiteta. Yazik i literatura [Bulletin of Saint-Petersburg state university. Language and Literature]. — 2020. — Vol. 17. — № 3. — P. 398–411. [in Russian]</mixed-citation>
			</ref>
			<ref id="B3">
				<label>3</label>
				<mixed-citation publication-type="confproc">Pivovarova Ye.V. Metod korpusnogo analiza v izuchenii frazeologii nemetskogo yazika. Teoreticheskii obzor [Method of corpus analysis in German phraseology studies. Theory review] / Ye.V. Pivovarova // Filologicheskie nauki. Voprosi teorii i praktiki [Philological Sciences. Theoretical and Practical Issues]. — 2019. — Vol. 12. — № 12. — P. 263–268. [in Russian]</mixed-citation>
			</ref>
			<ref id="B4">
				<label>4</label>
				<mixed-citation publication-type="confproc">Baker M. Corpus linguistics and translation studies: Implications and Applications / M. Baker // Text and Technology: In Honour of John Sinclair. — Amsterdam/Philadelphia : John Benjamins, 1993. — P. 233–252.</mixed-citation>
			</ref>
			<ref id="B5">
				<label>5</label>
				<mixed-citation publication-type="confproc">Corpus of Contemporary American English. — 2025. — URL: https://www.english-corpora.org/coca (accessed: 04.09.2025).</mixed-citation>
			</ref>
			<ref id="B6">
				<label>6</label>
				<mixed-citation publication-type="confproc">Ding J. Corpus-based translation studies: Examining media language through a linguistic lens / J. Ding // SHS Web of Conferences. — 2024. — 185. — URL: https://creativecommons.org/licenses/by/4.0/ (accessed: 12.09.2025).</mixed-citation>
			</ref>
			<ref id="B7">
				<label>7</label>
				<mixed-citation publication-type="confproc">Oxford Advanced Learner’s Dictionary. — 6th edition. — Oxford : Oxford University Press, 2000. — 1540 p.</mixed-citation>
			</ref>
			<ref id="B8">
				<label>8</label>
				<mixed-citation publication-type="confproc">Saldanha G. Principles of corpus linguistics and their application to translation studies research / G. Saldanha // Revista Tradumatica. — 2009. — 7 p.</mixed-citation>
			</ref>
			<ref id="B9">
				<label>9</label>
				<mixed-citation publication-type="confproc">Umerova M.V. Parallel corpora in translation studies / M.V. Umerova // Sciences of Europe. — 2018. — № 29–3 (29). — P. 56–59.</mixed-citation>
			</ref>
			<ref id="B10">
				<label>10</label>
				<mixed-citation publication-type="confproc">Wang G. An analytical framework for corpus-based translation studies / G. Wang, Y. Xyn // Humanities and Social Sciences Communications. — 2024. — Vol. 11. — № 1709. — URL: https://www.nature.com/articles/s41599-024-04250-4 (accessed: 04.09.2025).</mixed-citation>
			</ref>
		</ref-list>
	</back>
</article>