LINGUOTRANSLATION ASPECT OF DEVELOPING A MULTILINGUAL ONTOLOGY OF A SPECIAL ACADEMIC DISCIPLINE

Research article
DOI: https://doi.org/10.18454/RULB.8.09
Issue: No. 4 (8), 2016

Abstract

The article addresses the representation of industry-related knowledge in a structured form (an ontology) in several natural languages. It describes a purpose-built tool for managing industry-related educational knowledge, the multilingual educational complex (MEC), whose content component is represented as an ontology of the academic discipline “Introductory Course on Railways” and a subject ontology “Railway Transport” in Russian, English, Chinese and Korean. The article presents the stages of translating a syntactically reduced text written in controlled Russian, which serves as the basis of the academic discipline ontology.

Industry-related knowledge is a specific subclass of knowledge that is formed in the “industry – industry educational establishments” system under the direct supervision of industry administrative bodies. Such knowledge is also a result of the teaching-and-learning process of the industry workforce at all levels of professional education.

It has been identified that industry-related knowledge is complex in nature and has philosophical, psychoeducational, socioeconomic, information-technological, linguotranslation and other aspects. Let us consider the linguotranslation aspect of industry-related knowledge in detail.

Industry-related knowledge, like any other type of knowledge, is an information object represented in a symbolic form. Until knowledge has been transferred into a symbolic form, we are not able to manipulate it; in particular, we cannot transmit or enrich it. However, once we represent knowledge in a symbolic form (in symbols of natural or artificial languages, or even by graphic aids), we can manage it. For example, industry-related knowledge adapted to the educational process (at a college or university) and represented by means of a natural language can form a textbook [1]. Such a textbook can be used by students of industry universities, both residents of the Russian Federation and students from other countries who speak foreign languages, to enrich the components of industry-related knowledge (fundamental, general professional, narrowly specialized and corporate knowledge). To this end, students need a particular level of education (including second-language knowledge), some intellectual effort, agreement between the terminology systems of two or more languages [2], and a special tool to manipulate knowledge.

Such a tool is being developed at Siberian Transport University (Novosibirsk, Russia) within a multidisciplinary research project [3]. The tool is offered as a multilingual educational complex (MEC) of an academic discipline, whose content is represented as a complex of ontologies: an academic discipline ontology (the Introductory Course on Railways) and a subject ontology (Railway Transport).

To develop the MEC ontologies, innovative computer technologies (Semantic Web technologies) are used. The development of the MEC requires the creation of several alternative ontologies in different languages. The main language is Russian; the other languages are English, Chinese and Korean. Their choice is motivated by the fact that developing cooperation with East Asia, in particular with China and South Korea, is a top priority for the Russian railway industry. In turn, English serves as an intermediary language for international interaction for users who speak languages not represented in the MEC.

Thus, the content of the Introductory Course on Railways is considered to be multilingual. Multilingualism means, on the one hand, the development of several versions of the content (ontologies) represented in different natural languages and aligned as equivalents through a common terminology system. On the other hand, it is a function of the MEC software.

To create the academic discipline ontology within the MEC, a purpose-built method was applied. This method includes the following stages:

1. At the first stage, the original text (an excerpt from a textbook), represented as a linear text, was converted into a simple-syntax text called a Controlled Natural Language text (in its Russian version), CNL-R.

A CNL keeps the syntax and semantics of a natural language unchanged, since a CNL is a sublanguage of that natural language. The term Controlled Natural Language is justified by the fact that the natural language syntax is severely restricted to limit its expressive power. At the same time, this sublanguage retains sufficient expressive power to describe a subject domain. The imposed restrictions are intended to disambiguate the natural language and make a CNL text machine-readable.

An example of CNL-R is shown below.

устройства и сооружения

предназначены для

нормального обеспечения перевозок

посредством

железнодорожного транспорта,

расположены вдоль

пути;

расположены над

путём,

подразделяются на

пассажирские платформы,

здания,

опоры контактной сети,

сигнальные и путевые знаки,

приводы электрической централизации стрелок,

путепроводы,

мосты,

провода связи и энергоснабжения,

другие устройства и сооружения.

As the example above shows, a simple form of structuring lets us highlight the key concepts in the CNL text (given in bold) and the related properties (underlined) as “Subject – Predicate – Object” triples, which form the basis of modern ontology representation languages.
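The reading of a CNL text as triples can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the MEC project: the predicates are marked by underlining in the article, which plain text does not preserve, so the hypothetical `PREDICATES` set lists them by hand, and the fragment omits the nested "посредством" construction of the full example.

```python
# Hypothetical predicate list: the article marks predicates by underlining,
# so they are enumerated manually here for the sketch.
PREDICATES = {"расположены вдоль", "расположены над", "подразделяются на"}

# Shortened fragment of the CNL-R example (one phrase per line).
CNL_R_FRAGMENT = """\
устройства и сооружения
расположены вдоль
пути;
расположены над
путём,
подразделяются на
пассажирские платформы,
здания,
мосты,
другие устройства и сооружения.
"""


def cnl_to_triples(text, predicates):
    """Turn a one-phrase-per-line CNL fragment into (subject, predicate, object) triples."""
    # Drop empty lines and the trailing punctuation that separates phrases.
    lines = [ln.strip().rstrip(",;.") for ln in text.splitlines() if ln.strip()]
    subject, rest = lines[0], lines[1:]  # the first line names the subject
    triples = []
    current_predicate = None
    for line in rest:
        if line in predicates:
            current_predicate = line  # following lines are objects of this predicate
        elif current_predicate is not None:
            triples.append((subject, current_predicate, line))
    return triples


triples = cnl_to_triples(CNL_R_FRAGMENT, PREDICATES)
for triple in triples:
    print(triple)
```

Each printed tuple pairs the single subject with one predicate and one object, which is the shape that ontology representation languages expect.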

2. At the second stage, the CNL-R text was translated into English, Chinese and Korean (CNL-Eng, CNL-Ch and CNL-K).

Based on the stages of the translation process suggested by I. S. Alekseeva [4], we defined the scope of translation work needed to translate a piece of text represented in CNL-R into the other languages of the MEC.

The preparatory stage included the following actions of a translator: 1) to characterize the source (linear) text in Russian; 2) to determine the invariant of the source text and its logic; 3) to determine a translation strategy; 4) to select the translator’s tools (dictionaries, reference books, Internet sites, etc.).

At the second stage, pre-translation text analysis was conducted. This stage included the following actions of a translator: 1) to characterize the author of the source linear text; 2) to characterize the recipient of the source linear text; 3) to determine the dominant information type; 4) to determine the information density; 5) to characterize the language of the source linear text at the a) syntactic, b) semantic and c) pragmatic levels; 6) to identify the communicative task of the source linear text and the target CNL text.

At the third stage, 1) an analytic search for language units in the source CNL-R text was conducted in order to convey them equivalently in the target CNL (Eng, Ch, K) text; 2) the translation difficulties were determined; 3) key methods to eliminate these difficulties (by using translation transformations) were defined; 4) a draft translation was produced.

The final stage involved 1) textual revision; 2) editing of the target text; 3) verification of the translation strategy.

It is important to note that the work within the translation stages described above does not depend on a specific target language; that is, it is similar for translation into English, Chinese, Korean or any other language. The differences appear at the stage of translation proper, when a translator searches for language equivalents.

In the course of translation, a number of specific features of the source text represented in CNL-R that reduce the translator’s effort were identified. These features include the following: 1) unification of concepts (terms) throughout the target CNL text is simplified; 2) the number of translation transformations is reduced due to the simpler syntax of the source CNL-R text and the reduction of the emotive components present in the source linear text.

In particular, a great number of translation transformations are not used. Among them are 1) lexical transformations: generalization, specialization, meaning extension, stylistic neutralization; 2) grammatical transformations: replacement of syntactic constructions, sentence integration, sentence fragmentation.

Following the translation from CNL-R to CNL-Eng, we concluded that the key requirements for representing the English version of the MEC ontology were generally met: the structure and format of the source CNL-R text were not changed.

Due to the grammatical structure of a Chinese sentence, most of the source CNL-R text had to be rearranged. Chinese required the reordering of parts in complex collocations. In simple sentences the structure of the source CNL-R text was unchanged. In complex sentences, however, this structure was unavoidably broken.

Thus, we could not keep the structure and format of the source CNL-R text unchanged in the target CNL-Ch text.

Examples of the English and Chinese text versions are given below.

railway facilities and structures

are designed

to provide regular transportation service

by

rail,

are situated along

the track;

are situated over

the track,

are divided into

passenger platforms,

buildings,

catenary supports,

restricted traffic and wayside signs,

electric interlocking point machines,

viaducts,

bridges,

communication and electric wires,

other facilities and structures.

建筑物和设备

确保铁路正常通行

沿    轨道    设置

   轨道   上面 设置,

分为

客运平台

建筑物

接触电线网支柱

信号标和路标

箭头电动集中化的驱动

高架桥

桥梁

通讯电缆及电源

等其他设备和设施。

3. At the third stage, the CNL-Eng, CNL-Ch and CNL-K texts were entered into the software environment using the purpose-built ontology editor Onto.plus. Then, to check the completeness and consistency of this ontological model, the free, open-source ontology editor Protégé was used.
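One simple completeness property of a multilingual model is that every concept carries a label in each MEC language. The sketch below illustrates that single check in plain Python; it does not reproduce how Onto.plus or Protégé work. The language codes, the concept keys and the function name `missing_labels` are assumptions made for the illustration. The labels are copied from the CNL examples in this article, which give no Korean text, so the Korean labels are deliberately absent.

```python
# Language codes are assumptions for this sketch, not identifiers from the MEC.
MEC_LANGUAGES = ("ru", "en", "zh", "ko")

# Labels taken from the CNL examples in the article; no Korean version is
# printed there, so "ko" is left out on purpose.
concept_labels = {
    "bridges": {"ru": "мосты", "en": "bridges", "zh": "桥梁"},
    "viaducts": {"ru": "путепроводы", "en": "viaducts", "zh": "高架桥"},
}


def missing_labels(labels, languages):
    """Return {concept: [languages with no non-empty label]} for incomplete concepts."""
    report = {}
    for concept, by_language in labels.items():
        absent = [lang for lang in languages if not by_language.get(lang, "").strip()]
        if absent:
            report[concept] = absent
    return report


report = missing_labels(concept_labels, MEC_LANGUAGES)
print(report)
```

An empty report would mean that every concept is labelled in all four languages; here both concepts are reported as lacking a Korean label.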

4. At the fourth stage, to enrich the model, we added glossary articles for the key railway terms [5]. An example of a glossary article is given below:

A term in Russian: Вагонный замедлитель

Translation: Retarder

Transcription: [rɪˈtɑːdə]

Grammatical characteristics: noun, UK, sg = сущ., брит., ед.ч.

Definition: a device installed in a classification yard used to reduce the speed of freight cars as they are sorted into trains.

The term in a context: Each retarder consists of a series of stationary brakes surrounding a short section of each rail on the track that grip and slow the cars' wheels through friction as they roll through them. 

Translation: Каждый замедлитель состоит из ряда стационарных тормозов, расположенных вокруг небольшого участка рельса, эти тормоза зажимают и замедляют колеса вагонов при соприкосновении с ними.

Visualization: (picture, drawing, video, etc.)
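The fields of such a glossary article map naturally onto a small record type. The following Python sketch shows one possible representation; the class name `GlossaryEntry`, its field names and the `lookup` helper are hypothetical and are not taken from the MEC software. The values are copied from the example article above.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class GlossaryEntry:
    """One glossary article; field names are assumptions made for this sketch."""
    term_ru: str
    translation: str
    transcription: str
    grammar: str
    definition: str
    context: str
    context_translation: str
    visualization: Optional[str] = None  # picture, drawing or video; none is given in the article


entry = GlossaryEntry(
    term_ru="Вагонный замедлитель",
    translation="Retarder",
    transcription="[rɪˈtɑːdə]",
    grammar="noun, UK, sg",
    definition=("a device installed in a classification yard used to reduce "
                "the speed of freight cars as they are sorted into trains."),
    context=("Each retarder consists of a series of stationary brakes surrounding "
             "a short section of each rail on the track that grip and slow the "
             "cars' wheels through friction as they roll through them."),
    context_translation=("Каждый замедлитель состоит из ряда стационарных тормозов, "
                         "расположенных вокруг небольшого участка рельса, эти тормоза "
                         "зажимают и замедляют колеса вагонов при соприкосновении с ними."),
)


def lookup(entries, term):
    """Find an entry by its Russian term or its translation, ignoring case."""
    key = term.casefold()
    for candidate in entries:
        if key in (candidate.term_ru.casefold(), candidate.translation.casefold()):
            return candidate
    return None


found = lookup([entry], "retarder")
print(found.term_ru if found else "not found")
```

Looking an entry up by either language reflects the glossary's role as a bridge between the terminology systems; `asdict(entry)` would give a plain dictionary suitable for export.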

As a result, the MEC ontologies have been created.

Conclusion:

In view of the above, we can draw the following conclusions:

1. The MEC considered here, with the function of an integrative glossary, is an information and education resource in several natural languages. Semantic Web technologies could help to integrate the ontologies of the developed MEC with the ontologies of other academic disciplines and subject ontologies, forming a common system of terms in different languages.

2. The stages of translating CNL-R into English, Chinese, Korean and other languages are typical and do not depend on a specific target CNL language.

3. In translation, the structure and format of the source CNL-R text were preserved in the target CNL-Eng text, in contrast to the target CNL-Ch text.

4. The MEC project could contribute to solving an important task that the International Association of Transport Universities of Asia-Pacific Countries identifies as follows: interaction and coordination of the activity of universities for the unification of transport terminology [6].

References

1. Айсмонтас Б. Б. Педагогическая психология: Учебное пособие для студентов / Б. Б. Айсмонтас. – М.: МГППУ, 2004. – 368 с.

2. Седякин В. П. Информация и знания / В. П. Седякин // Научные ведомости БелГУ. Серия: Философия. Социология. Право. – 2009. – № 8 (63). – С. 180–187.

3. Государственный контракт № 30/16 от 30.06.2016 г. на разработку мультиязычного обучающего комплекса в виде русско-англо-китайской предметной онтологии с использованием технологий семантического веба (на примере дисциплины «Общий курс железных дорог»).

4. Алексеева И. С. Текст и перевод. Вопросы теории: монография / И. С. Алексеева. – Москва: Международные отношения, 2008. – 184 с.

5. Волегжанина И. С. Мультиязычный глоссарий как средство унификации терминологии при создании онтологий учебных дисциплин / И. С. Волегжанина // European Social Science Journal. – 2015. – № 10. – С. 209–217.

6. Международная ассоциация транспортных университетов стран Азиатско-Тихоокеанского региона: официальный сайт [Электронный ресурс]. – URL: http://iastu-ap.org (дата обращения 21.10.2016).