Source: Assistive Technology for Visually Impaired and Blind People, Hersh and Johnson, 2008.

14 Speech, Text and Braille Conversion Technology

Learning Objectives

Text in electronic form is a key and increasingly important intermediary in allowing visually impaired and blind people to access information using assistive technology. Once text is in electronic form, it can be transmitted to distant recipients, read aloud using synthetic speech, converted to Braille media and displayed in large print for visually impaired readers. Text can be produced in electronic form using input from a keyboard, speech and/or Braille. Conversion technology is thus the enabler or intermediary which makes possible the various input and output combinations that allow blind and visually impaired people, as well as other disabled people, to access electronic text. The key conversion technologies of speech-to-text (STT), text-to-speech (TTS), Braille-to-text (BTT) and text-to-Braille (TTB) are the focus of this chapter, which has the following learning objectives:

Gaining an in-depth understanding of the fundamental scientific principles that support spoken language technology.

Learning some of the details of speech-to-text and text-to-speech conversion.

Understanding the basic concepts of Braille conversion technologies.

Gaining an appreciation of the application of these conversion technologies to assistive technology systems for visually impaired and blind people.

Learning about the benefits and limitations of the current state-of-the-art technologies for conversion applications.

14.1 Introduction

14.1.1 Introducing Mode Conversion

Human communication is multimodal. People use a number of different types of communication, including images, text, gestures, oral speech, sign languages, touch, mime, body language, facial expressions and music, to communicate with each other. Two or more means of communication are frequently combined to improve comprehension. For instance, in face-to-face communication, speech is often combined with gestures and mime. In addition, most people are able to switch between different types of communication signals, for instance, from saying “please could you give me that loaf of bread” to actually pointing at the item desired. Each communication signal is received and analysed by one of the senses. This fact can be used to categorise the various communication signals into modes (acoustical, visual and tactile). Blind people have no or only very limited access to visual communication modes. Technical support is therefore required to convert information that was originally intended for the visual mode into other modes, including the following.

Speech

Linguistic information, which is very important for communication, can be presented in either written or spoken form. Only the spoken form is accessible to blind people and therefore the conversion of text to speech (and vice versa) is a key procedure for assistive technology. The fundamental differences between written and spoken language increase the complexity of the conversion technology. Spoken language technology (SLT) is therefore a demanding and still evolving field of research and development. The state of the art in this field, which is still unsatisfactory, particularly when compared to human speech output, will form the main topic of this chapter.

Other audio information

The visual channel is continuously exposed to a wide range of non-linguistic information, selected domains of which can be converted into audio information using assistive technology. A typical example is the transformation of document structures to audio information. In such cases, there are benefits in complementing speech information with additional audio information. The conversion technology involved is of increasing importance, as it is the key for blind people (but not deafblind people) to access the World Wide Web successfully.

Tactile information

Some information in the visual channel can be successfully converted into tactile information. Maps and diagrams are typical examples, as they can be equipped with contours which can be felt with the fingertips. However, choosing the appropriate amount and type of detail raises issues. Tactile diagrams are discussed in more detail in Chapter 4. Text can be represented in tactile form using text-to-Braille conversion technology. Braille is well-established, though computer programmes for converting text to Braille are considerably more recent. Learning to read Braille requires training, time and effort (as does learning to read text) and is difficult for older people. Braille is increasingly being replaced by speech-based methods. However, these methods are not appropriate for deafblind people, and blind people would benefit from being able to choose whether to access text through Braille or speech.
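
As a rough illustration of the text-to-Braille direction mentioned above, the following Python sketch maps the 26 basic Latin letters to uncontracted six-dot Braille cells and renders them as Unicode Braille pattern characters. It is not the conversion method described in this chapter. The function names are hypothetical, and capital signs, numbers, punctuation and contractions, which a real Braille translator has to handle, are deliberately left out.

```python
# Illustrative sketch only: uncontracted letter-by-letter mapping.
# Dot numbers (1-6) for the first ten letters of the Braille alphabet.
A_TO_J = {
    "a": (1,), "b": (1, 2), "c": (1, 4), "d": (1, 4, 5), "e": (1, 5),
    "f": (1, 2, 4), "g": (1, 2, 4, 5), "h": (1, 2, 5), "i": (2, 4),
    "j": (2, 4, 5),
}


def build_letter_table():
    """Derive the remaining letters from a-j, as the Braille alphabet does."""
    table = dict(A_TO_J)
    # k-t repeat a-j with dot 3 added.
    for base, letter in zip("abcdefghij", "klmnopqrst"):
        table[letter] = A_TO_J[base] + (3,)
    # u, v, x, y, z repeat a-e with dots 3 and 6 added.
    for base, letter in zip("abcde", "uvxyz"):
        table[letter] = A_TO_J[base] + (3, 6)
    # w does not follow the pattern: it is j with dot 6 added.
    table["w"] = A_TO_J["j"] + (6,)
    # A space becomes the empty cell.
    table[" "] = ()
    return table


def dots_to_cell(dots):
    """Turn dot numbers into a Unicode Braille pattern character.

    The Braille Patterns block starts at U+2800, and dot n sets bit n-1.
    """
    code = 0x2800
    for dot in dots:
        code |= 1 << (dot - 1)
    return chr(code)


LETTERS = build_letter_table()


def text_to_braille(text):
    """Convert letters and spaces; pass every other character through."""
    cells = []
    # Assumption: capitals are simply lower-cased, so case is lost here.
    for ch in text.lower():
        dots = LETTERS.get(ch)
        cells.append(ch if dots is None else dots_to_cell(dots))
    return "".join(cells)


if __name__ == "__main__":
    print(text_to_braille("braille"))
```

The table is built from the first ten letters rather than typed out in full, which mirrors the way the Braille alphabet itself is constructed. Characters outside the table are passed through unchanged so that the gaps in this sketch stay visible in the output.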

Blind people require a number of different converters in order to be able to access visual information in their preferred format. This chapter will focus on the conversion tasks that are closely related to linguistic information, namely speech-to-text, text-to-speech, and text-to-Braille.

14.1.2 Outline of the Chapter

The main aim of this chapter is to provide an understanding of spoken language technology (SLT) and its applications to support visually impaired and blind people. Although SLT is still not very widespread, it has very large potential for future development. This very practical goal rests on underlying theory, so much of the chapter is necessarily theoretical in order to provide the background for understanding the practical applications. SLT is complicated and the currently available equipment has yet to be perfected, which at times results in problems and unsatisfactory performance. A good understanding of SLT fundamentals is required to recognize the reasons for these problems as well as the potential for future development of SLT.

The technical presentation of the chapter begins with Section 14.2, which provides the reader with the prerequisites necessary for understanding speech- and language-related technology. This includes aspects of signal processing and, in particular, spectral analysis (Section 14.2.1), as well as some aspects of linguistics (Section 14.2.2). This leads to a general scheme for a speech processing system, shown in Figure 14.4, which will serve as a didactic framework for the discussion presented in the rest of the chapter.

A detailed explanation of speech-to-text conversion is given next, in Section 14.3, as this is required to assess the capabilities and the problems of speech recognition equipment. Selected principles of pattern recognition in general are presented in Section 14.3.1, and speech recognition in particular is described in Section 14.3.2. Selected classes of speech recognizers are described in Section 14.3.3.

A key technology in assistive technology and rehabilitation engineering is text-to-speech (TTS) conversion and a detailed explanation of this technology is presented in Section 14.4. The principles of speech production in general are given in Section 14.4.1, with particular attention to the audio output (Section 14.4.2). Finally, an overview of the existing classes of TTS systems is presented (Section 14.4.3).

The basic principles of Braille conversion are introduced in Section 14.5 and, since Braille technology only appears at the level of written language (text level), this is a short section. It is followed by Section 14.6, which looks at the application of different conversion technologies to commercial equipment for blind and visually impaired people. This is a very large field, which is partially covered by other chapters in this book. Therefore, this section is restricted to summarising the technology that is considered to be of most practical use.

Finally, in Section 14.7, some open problems and their potential solutions are discussed. This section commences with a few remarks on the current state of the art (Section 14.7.1). A number of problems are identified (Section 14.7.2) and