14.6.6 Access to Telecommunication Devices

There has been considerable development of telecommunication systems over the last decade, and this has had an impact on telecommunication use by visually impaired and blind people. “Historically, people with a hearing disability have been the group facing the most problems when using telephones; however, with the ever increasing reliance on visual means for displaying information, it is increasingly visually impaired people who have been confronted with access problems” (Roe, 2001, p. 30).

Speech technology can provide potential solutions, as in the case of the following input/output functions for mobile phones:

Speech recognition is frequently used for voice-dialling. This feature was originally developed mainly for hands-free telephony in cars.

Speech synthesis will be increasingly used for improving the user interface (speech MMI), caller name announcement, reading short messages (SMS) and remote access to e-mails.

Although these features were developed originally to provide improved performance for sighted users, they are very useful for visually impaired people and illustrate the benefits of a design for all approach. The technical prerequisites are the development of embedded speech input/output solutions (Hoffmann et al. 2004).

Despite the benefits of design for all, it cannot resolve all problems, and visually impaired telecommunications users therefore also require some special equipment. For instance, Figure 14.21b illustrates the Braillino system in combination with the Nokia Communicator. However, it can be used more generally with any mobile phone that runs the Symbian operating system, the global industry standard operating system for smartphones (www.symbian.com). This includes the Series 60 phones (without an alphanumeric keyboard) and the Series 80 phones (with an organizer function and an alphanumeric keyboard). The connection can be wireless, via Bluetooth. From the functional point of view, the communication software (called Talks&Braille) acts as a screen reader for the Symbian operating system.

14.7 Discussion and the Future Outlook

14.7.1 End-user Studies

Potential users of speech technology would like to have (comparative) information on the performance of the available systems. However, it is difficult to obtain global comparative evaluations, due to the complexity of the systems and the fact that the evaluation criteria depend on the intended application. The studies carried out to date can be grouped and discussed as follows.

544 14 Speech, Text and Braille Conversion Technology

Evaluation of research systems

Progress in speech technology is normally measured in terms of improved word recognition rates (for recognizers) or improved scores when rating the naturalness (for synthesizers). Therefore, there are presentations giving an ongoing evaluation of research systems at the leading conferences. The availability of common databases allows the results of the evaluation of different systems to be compared. However, these research-oriented results relate to systems that are not yet commercially available, rather than the current state of the market.
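The recognition rates reported in such evaluations are usually derived from the word error rate (WER) between the recognizer output and a reference transcription. The following sketch computes WER via word-level Levenshtein distance; the sample utterances are invented for illustration.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed as word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution or match
    return dp[-1][-1] / len(ref)

# One substitution ("the" -> "a") out of four reference words:
wer = word_error_rate("switch the light on", "switch a light on")  # 0.25
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, which is one reason quoted recognition "rates" are not always directly comparable across studies.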

Comparison with human performance

It is natural to compare speech technology with human performance. Every user of speech technology soon notices that it does not perform nearly as well as a person, but there are few quantitative assessments of this difference in performance. A fundamental investigation was carried out by Lippmann (1997) for speech recognizers. He demonstrated how the recognition rate breaks down in the presence of environmental noise, whereas human listeners perform substantially better. Corresponding results can be obtained by rating the quality of speech synthesis using a mean opinion score (MOS) scale ranging from 1 (bad) to 5 (excellent). The naturalness of human speech is rated close to 5, but the output of TTS systems is generally rated somewhere in the middle range: between 1.73 and 3.74 according to the survey by Alvarez and Huckvale (2002). Considerable further research will be required to close the gap to a human speaker or listener in both speech synthesis and speech recognition, respectively.
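The MOS figures quoted above are simply arithmetic means of listener ratings on the five-point scale. A minimal sketch, using invented ratings for a single synthetic utterance:

```python
def mean_opinion_score(ratings):
    """MOS: arithmetic mean of listener ratings on the
    1 (bad) to 5 (excellent) scale."""
    if not ratings or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return sum(ratings) / len(ratings)

# Hypothetical ratings from ten listeners for one TTS utterance:
mos = mean_opinion_score([3, 4, 3, 2, 4, 3, 3, 4, 2, 3])  # 3.1
```

A score of 3.1 would fall squarely in the middle range reported by Alvarez and Huckvale for TTS systems.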

Evaluation of commercial systems

Before including speech technology in a product, a company generally evaluates a number of competing systems, though the results are only published occasionally. This type of study gives an interesting insight into the real performance of the available products. For example, Maase et al. (2003) investigated the performance of command and control speech recognizers for controlling a kitchen device. Usability studies showed that users accept this kind of control for recognition rates greater than 85%. Tests with eight different products showed that this performance was never reached in real environments. Representative results are shown in Figure 14.23.
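The acceptance criterion from the Maase et al. study reduces to comparing a measured recognition rate against the 85% threshold. A minimal sketch with invented trial outcomes (the study's own data appear in Figure 14.23):

```python
def recognition_rate(results):
    """Fraction of command trials that were correctly recognized."""
    return sum(results) / len(results)

def acceptable(rate, threshold=0.85):
    """Acceptance criterion from the usability study:
    the recognition rate must exceed 85%."""
    return rate > threshold

# Hypothetical outcome of 20 command-and-control trials
# (True = command correctly recognized):
trials = [True] * 16 + [False] * 4
rate = recognition_rate(trials)      # 0.8
ok = acceptable(rate)                # False: below the 85% threshold
```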

General product studies are very time-consuming and expensive. They therefore require a sponsor without a vested interest in any of the products. For instance, the study of ten different dictation systems (Flach et al. 2000) mentioned in Section 14.3.3 was originally produced for a computer journal. A more recent study of six dictation systems (Stiftung Warentest 2004) was carried out without publishing the recognition rates. The best-performing system is indicated in Table 14.8, shown previously.

Figure 14.23a,b. Selected results from the study of Maase et al. (2003). The diagrams show the recognition rate of selected C&C recognizers for different noises (a) and different speaker positions (b). The speaker positions correspond to different places in the usability lab with increasing distance (from 1 to 7 m). Reprinted by courtesy of the authors.

Evaluation for user groups with special needs

There is clearly a need for studies of speech support and dictation systems for blind and visually impaired people. Unfortunately, there is a distinct lack of large-scale user studies of speech support systems for this user group. However, there are several more general studies which include some consideration of speech technology. A number of such investigations have considered improving learning environments for blind students (Kahlisch 1998). Another emerging field is the study of the assistive technology needs of elderly people. Since many elderly people have acquired visual impairments, these studies include useful material on speech-related technologies. Figure 14.24 presents an example.

14.7.2 Discussion and Issues Arising

An overview of the remarks in this chapter shows that the performance of speech input/output systems is by no means perfect, despite improved algorithms, larger databases, increased memory, and growing computing power. In general, this still somewhat disappointing performance is due to the extreme complexity of human speech processing, which is difficult to approximate satisfactorily with technical systems. Although there is not space to discuss the reasons for this in detail, some of them are briefly summarized in Table 14.10.

Figure 14.24. Example of a usability study. The diagram shows the acceptance of speech-controlled services by different user groups according to the study of Hampicke (2004). Reprinted by courtesy of the author. Score of 6: in any case; score of 3: medium; score of 0: in no case. The legend describes the grade of visual impairment.

Examining this table leads to the following conclusions about important future directions for basic research in speech technology:

Speech understanding.

Acoustic front end.

Modelling human speech and language processing.

These topics are all highly interdisciplinary and will require close collaboration across the disciplines involved.

14.7.3 Future Developments

As discussed in this chapter, speech technology has established itself as a stable and successful component of assistive technology. It is also becoming increasingly successful in other fields with a greater economic impact, including the telecommunications area, for communication with call centres and telephone banking. Although beyond the remit of this chapter, a survey of user opinions of this technology would be interesting, since there is at least anecdotal evidence that users prefer to communicate with a person and are highly dissatisfied with call centres. According to recent data (Sohn 2004), worldwide turnover in business applications of speech technology will grow from $540 million currently to $1600 million in 2007.
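Assuming "currently" refers to 2004, the year of the Sohn reference, the quoted figures imply a compound annual growth rate of roughly 44%; a quick check:

```python
# Implied compound annual growth rate for the quoted market figures,
# assuming the baseline year is 2004 (the year of the Sohn reference):
start, end = 540, 1600          # turnover in millions of dollars
years = 2007 - 2004
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")            # prints "43.6%"
```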

This growth in the use of speech technology is not surprising in view of the importance of speech in telephone applications and consequently also for their automation. The importance of speech input/output systems relative to other media is likely to grow, as can be seen from Table 14.11.

What will this tendency mean for blind and visually impaired people? Developments in speech technology will improve access to interfaces for an increasing range of applications for this group (though not for deafblind people). The resulting benefits are likely to be substantial and cover applications ranging from access to numerous knowledge sources to improved accessibility of household appliances.

Table 14.10. Current research problems in speech technology, explained by means of the general scheme of a speech processing system (Figure 14.4). The columns give: where the problems are localized in Figure 14.4; how they can be described; what research can do to solve them; and examples of first solutions.

At the top of the figure: Our systems do not understand what they do; the scheme ends at the text level without semantic components. Research direction: develop speech understanding, in cooperation with computational linguistics, AI and semiotics. First solutions: speech-to-speech translation systems such as Verbmobil (Wahlster 2000); in speech synthesis, concept-to-speech (CTS) instead of TTS.

At the bottom of the figure: The acoustic channel between the user and the converters (microphone or loudspeaker, respectively) is still neglected in most cases. Research direction: consider the system (recognizer or synthesizer, respectively) and the acoustic environment as a unit, and develop the “acoustic front end”. First solutions: acoustic signal processing such as microphone arrays, noise suppression, source separation and directed sound supply.

In the components of the figure: Because our understanding of human speech processing is far from an applicable level, the models we use are more or less mathematical or empirical. Research direction: although a technical system need not be a close copy of its biological counterpart, substantially more knowledge of human speech production and perception is required. First solutions: many activities in modelling prosody, in close cooperation between engineers and phoneticians during the last decade; research systems which model human acoustic processing.

Table 14.11. How to interact with future systems? An overview from Weyrich (2003)

Small devices: speech
Service robots: speech and gestures, artificial skin, emotions
Federation of systems: speech and gestures, emotions
e-Business: active dialogue systems, interactive multimedia
Augmented reality systems: speech, gestures

Talking products that are of interest to both sighted and visually impaired people are more attractive to companies because of their larger markets. This type of product is therefore more likely to be widely available from standard suppliers, and at a reasonable price, than specialised products for visually impaired people. For instance, blind and many visually impaired users require speech (or tactile) output to state the function of the key being pressed or the knob setting on the (complex) control panel of a washing machine. This audio option may also be of interest to sighted users. The inclusion of both speech and tactile output could be considered part of a design for all approach; but, as already indicated, though design for all should be part of good design practice, it will never totally replace the need for assistive devices.

There is therefore considerable potential for increasing accessibility for blind and visually impaired people, though further technical developments will be required. However, it should also be noted that access to new technologies is limited by a number of factors, including geography and poverty. The term ‘digital divide’ is often used to describe the difference between people who do and do not have access to modern technologies and the resulting disadvantages, whereas the term eInclusion is used for access to the information society by disabled people and other potentially disadvantaged groups. While it is important to ensure that blind and visually impaired people are able to participate fully in the information society, it should also be recognised that some people, both blind and sighted, do not like technology. It will therefore be important to ensure that low-technology accessibility solutions exist for blind and visually impaired people and that information is available in a number of different formats, including but not solely electronically.

Speech and language technology will always be compared to natural human speech and language. Therefore, regardless of progress, they are likely to be found wanting for a long time to come, if not permanently. This presents an ongoing challenge, which is probably much greater than that encountered in many other disciplines. As Waibel and Lee (1990) state in their preface to Readings in Speech Recognition: “Many advances have been made during these past decades; but every new technique and every solved puzzle opens a host of new questions and points us in new directions. Indeed, speech is such an intimate expression of our humanity—of our thoughts and emotions—that speech recognition is likely to remain an intellectual frontier as long as we search for a deeper understanding of ourselves in general, and intelligent behaviour in particular.”

Acknowledgement. As can be seen from the list of references, the material in this chapter is based on research results and teaching material of the chair for speech communication at the Technische Universität Dresden. The author would like to take the opportunity to thank his team for their fruitful cooperation on many projects.

Special thanks for helpful discussions and support to Professor Dieter Mehnert, formerly at the Humboldt-Universität zu Berlin, Professor Klaus Fellbaum, Brandenburgische Technische Universität Cottbus, Professor Gerhard Weber, Universität Kiel, and Dr. Lothar Seveke, Computer für Behinderte GmbH, Dresden.