
12.4 Audio-transcription of Printed Information

Blind and visually impaired people require access to a wide variety of printed information, including books, newspapers, menus and timetables. One of the earliest approaches to making print accessible was the talking book. This involved making a recording of the book being read, generally by volunteer readers. Once the original recording was made, multiple copies were produced, originally on tape and more recently on cassette or CD, and distributed to be played on an appropriate player. This approach has also been used to produce talking newspapers. Its main advantage is that the recording sounds natural, since a person, rather than synthetic speech, has been used to produce it. Its drawbacks are that it is time intensive, and expensive if the recordings are made by paid staff. In addition, it is most suitable for items, such as books, with a stable text that will be used for an extended period. It is less practicable for items such as menus, timetables and theatre programmes that change frequently.

This gives rise to the need for reading systems or devices that can read items as they are presented, rather than relying on a recording prepared in advance. Such reading systems generally include text-to-speech conversion software, which is discussed in more detail in Chapter 14. A simple classification of reading systems is given in Figure 12.9. One of the main distinctions is between stand-alone reading systems and computer-based reading systems. Stand-alone reading systems and the Read IT project are discussed in the next two sections.

DAISY technology (discussed in Chapter 15) has been developed as a standard for audio output of printed material. The idea of a standard navigable format for visually impaired end-users to access information in audio form is clearly a good one. However, the time lag due to a number of factors, including the time spent in developing the format and in working for its acceptance, has meant that the technology has evolved in the meantime. Despite considerable hard work to publicise DAISY, it has not been taken up on a large scale by publishers, and many publishers are unaware of it or of how they could use it. It is hoped that this situation will change in due course.

From the end-user perspective, there are advantages in output that can be played on widely available standard devices, such as a CD, cassette or MP3 player, rather than requiring a special player. As this indicates, there are advantages in a design-for-all approach to providing information in audio format to anyone and everyone who might want to use it, including visually impaired and other print-disabled people. This may mean revisiting and updating the DAISY standard from a design-for-all perspective, or ensuring that visually impaired and other end-users have a choice of a number of different formats, including DAISY, so that they can choose the one best suited to their needs, or even use different formats in different circumstances.

12.4.1 Stand-alone Reading Systems

Stand-alone reading systems are independent systems that are able to scan a printed document (including letters, books, leaflets and newspapers) and produce an audio (or tactile) version of the document for visually impaired and blind readers. The sequence of operations carried out by reading systems is shown in Figure 12.10.

12 Accessible Information: An Overview

Figure 12.9. Stand-alone text-to-speech (TTS) technologies

Figure 12.10. Block diagram of reading system operations

These operations comprise the following three main stages:

Stage 1. The camera and scanning mechanism create an image file.

Stage 2. Optical recognition software and/or hardware converts the image file to a text file.

Stage 3. Text-to-speech software uses the text file to drive a speech synthesizer card and speaker unit, thereby producing an audio speech output.
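The three stages above amount to a simple linear pipeline. The following sketch illustrates that structure only; the stage functions are injected as callables so that real OCR and speech-synthesis engines could be substituted, and the stub implementations shown are invented placeholders, not part of any actual reading system.

```python
# Minimal sketch of the three-stage reading-system pipeline described above.
# The stages are injected as callables; the stubs below are illustrative
# placeholders, not real OCR or TTS engines.

from typing import Callable

def reading_pipeline(capture: Callable[[], bytes],
                     ocr: Callable[[bytes], str],
                     tts: Callable[[str], list]) -> list:
    """Run the three stages of a stand-alone reading system in sequence."""
    image = capture()   # Stage 1: camera/scanner produces an image file
    text = ocr(image)   # Stage 2: optical character recognition -> text file
    return tts(text)    # Stage 3: text drives the speech synthesizer

# Placeholder stages, for demonstration only
fake_capture = lambda: b"scanned-page"
fake_ocr = lambda img: "Bus 42 departs at 10:15"
fake_tts = lambda text: text.split()   # stands in for synthesized audio frames

audio = reading_pipeline(fake_capture, fake_ocr, fake_tts)
```

Separating the stages behind callable interfaces also reflects how commercial systems swap input and output options (scanner versus stored file, speaker versus screen) without changing the overall flow.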

Commercial reading systems generally have alternative input and output options. For instance, on the input side the system may be able to read from CD ROM or from stored files and, in addition to audio output using a speaker and headphones, the output may be displayed as text on a computer screen.


Figure 12.11a,b. Scanning and Reading Appliance (SARA™): a SARA™ in action; b the SARA™ control panel (photographs reproduced by kind permission of Freedom Scientific, USA)

Figure 12.11 shows photographs of the Scanning and Reading Appliance (SARA™) developed by Freedom Scientific, USA. Some of the key functions of SARA are listed in Table 12.1. As can be seen from Figure 12.11 and Table 12.1, SARA provides a highly convenient route to the audio transcription of printed information. However, it is not very portable and is therefore more suitable for applications in a fixed location than for use while moving around. A portable system, called Read IT, is discussed in the next section.

12.4.2 Read IT Project

Portable devices generally have the advantage of reducing costs, since the same device can be used in different locations, and are easier to handle. In the case of reading systems, there is a wide range of textual information, including menus, price tags, bus and train timetables, indicator boards and theatre programmes, found in different locations, that could not be read with a fixed system. In addition to the technical issues associated with portability, further technical challenges are posed by the wide variety of material to be read, the difficulties associated with reading handwritten and poor-quality texts, and the fact that some textual information, such as street signs and indicator boards, generally has to be read at a distance.


Table 12.1. Some technical features of the SARA™ reading appliance

Controls: Large, colour-coded, with tactile markings and symbols (see Figure 12.11b for layout). Search facilities: single word; single line; fast forward; rewind; move up page; move down page.

Speech control: Controls for speech rate and volume; selection of the voice from a voice set; choice of language from 17 options.

Input: Scanned documents (background scanning operation); files (.txt, .rtf, .doc, .pdf, .html); CD-ROM drive; DAISY books; microphone input.

Output: Stereo speakers (integral to the appliance); audio jack for headphones; text output to a computer screen with display options.

Some technical specifications: Power: 100–240 V, rear power jack input. Size: 50.8 × 8.89 × 30.48 cm. Weight: 8.16 kg. 20 GB hard disk drive; 256 MB RAM; 600 MHz processor.

One approach to producing a portable reading system for blind people is the prototype Read IT project (Chmiel et al. 2005), carried out by a student team from the Department of Computer Science and Management of Poznan University of Technology, Poland, under the guidance of Dr. Jacek Jelonek.

End-user aspects of the Read IT system

End-user involvement in the development of (assistive) technology systems from the start is crucial to ensure that the resulting device meets the needs of the end-user community and to reduce the likelihood of it being rejected. In the Read IT project, the development team worked with the Polish Association of Blind People to draw up a list of end-user requirements, which included the following (Chmiel et al. 2005):

1. The device should be comfortable (portable and lightweight) to wear and should integrate the user into the wider community, not identify them as different.

2. The user should have their hands free to engage in other activities whilst listening to the speech output.

3. The user should be able to hear other sounds as well as the generated speech.

4. The user should be able to move on to other tasks once positioning and capture of the text are complete.

5. The generated speech should be clearly understandable and resemble human speech.

These requirements from the end-user community were translated into design specifications and influenced the final design and implementation. Requirement 1 led to the device being lightweight, portable and as unobtrusive as possible. Requirements 2, 3 and 4 arise from safety considerations and the requirement for the user to, for instance, have their hands free to use a long cane and/or carry shopping while using the device. They resulted in the speech output being delivered to only one earphone, inserted directly into the ear. Requirement 4 enables the end-user to either relax or move on to other tasks once the text to be read has been located and captured. Requirement 5 was translated into a design specification for the quality of the speech synthesizer card used in the device.

Engineering issues and implementation

As illustrated in Figure 12.12, an image of the text is captured by a video camera positioned in the user’s sunglasses. The image is analysed and the identified text content drives the speech synthesizer card. The resulting speech output is then delivered to the user via a single earphone. The signal processing unit, speech synthesizer card and battery power supply are housed in a small box worn at waist level. Manual control is via a small hand-held Braille keypad. A full description of the development process is given in the Read IT report (Chmiel et al. 2005). In view of the requirement that the device should integrate the user into the wider community, the video camera could presumably be worn on glasses with plain lenses when there is little sun. However, difficulties could be encountered in transferring the device between different spectacle frames.

In contrast to technologies, such as mobile phones, where miniaturisation causes difficulties for blind and visually impaired people, it is component miniaturisation that has made portable reading devices, such as Read IT, feasible. In particular, an important feature is the miniature video camera that can be unobtrusively mounted on the user’s sunglasses. A PVI-430D video camera was selected; it captures 30 frames per second at a resolution of 640 × 480 pixels. The very small size of the video capture unit can be seen from Figure 12.13, where the microcontroller chip is just 5 × 5 mm. The camera range is between 0.4 and 0.8 m for standard-sized fonts. It is generally feasible to approach to within this distance of bus timetables, indicator boards and street signs.

Figure 12.12. Overview of the Read IT system (Chmiel et al. 2005)

Figure 12.13. Video capture unit (Chmiel et al. 2005)
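The camera figures quoted above (640 × 480 pixels at 30 frames per second) imply a substantial raw data rate, which helps explain why all signal processing is done on the worn device itself. The back-of-envelope calculation below uses those figures; the pixel formats are assumptions for illustration, as the source does not specify one.

```python
# Back-of-envelope raw data rate for the camera described above:
# 640 x 480 pixels at 30 frames per second.

WIDTH, HEIGHT, FPS = 640, 480, 30
pixels_per_second = WIDTH * HEIGHT * FPS   # 9,216,000 pixels/s

# Bytes per pixel under two common raw formats (assumed, not from the source)
BYTES_PER_PIXEL = {"grayscale (8-bit)": 1.0, "YUV 4:2:0": 1.5}

rates_mb_s = {fmt: pixels_per_second * bpp / 1e6
              for fmt, bpp in BYTES_PER_PIXEL.items()}
# grayscale -> 9.216 MB/s; YUV 4:2:0 -> 13.824 MB/s
```

Data rates of this order are straightforward to process locally but would have been impractical to stream continuously from a battery-powered wearable device in 2005.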

Software issues

The Read IT project used a mix of standard software and customised software and DSP algorithms developed by the project team. The steps involved in the digital signal processing are shown in Figure 12.14.

Two aspects of this digital signal processing architecture are especially interesting. First, there is a “navigation task” with associated “navigation messages”. This module generates voiced directional instructions to ensure that the user directs the camera at the text to be read. The audio message feedback loop was designed to optimise and enhance image capture and identification within the device. Once a satisfactory image has been captured, the important operations of analysing the captured video text can proceed. This involves a number of subtasks, including text segmentation, enhancement and recognition. This is

Figure 12.14. Digital signal processing framework for Read IT (Chmiel et al. 2005)
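The navigation feedback loop described above can be sketched as follows. This is an illustrative reconstruction, not the project’s actual code: the frames, the offset detector and the spoken messages are simplified stand-ins, invented here to show how voiced hints steer the camera until the text is usably framed and the image is handed on to segmentation and recognition.

```python
# Illustrative sketch of the Read IT navigation feedback loop (not the
# project's actual code). Spoken directional hints steer the camera until
# the detected text offset is within tolerance; the accepted frame is then
# passed on to the segmentation/enhancement/recognition subtasks.

def navigate_and_capture(frames, detect_offset, speak, tolerance=0.1):
    """Iterate over camera frames, issuing directional messages until the
    detected text offset (fraction of frame width, +ve = text right of
    centre) is within tolerance; return the first acceptable frame."""
    for frame in frames:
        offset = detect_offset(frame)
        if abs(offset) <= tolerance:
            speak("text located")
            return frame                 # ready for segmentation/recognition
        # Text right of centre -> user should turn left, and vice versa
        speak("move left" if offset > 0 else "move right")
    return None                          # text never found in the stream

# Demonstration with fake data: offsets shrink as the user re-aims the camera
messages = []
frames = [{"offset": 0.6}, {"offset": -0.3}, {"offset": 0.05}]
good = navigate_and_capture(frames, lambda f: f["offset"], messages.append)
```

In the real device the offset detection would itself be a DSP subtask operating on the video stream, and the messages would be rendered by the speech synthesizer card rather than collected in a list.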