562 15 Accessing Books and Documents

ognized. OCR developers use automated tools to evaluate this with sets of pages called test decks. The test deck pages are carefully scanned and then proofread to ensure that they are accurate. The OCR engine to be tested then recognizes the image and the resulting text is compared to the proofread text. There are many types of errors that can be evaluated (some of these overlap with each other):

Misrecs: where the correct character is misrecognized as a different character, for example reporting the letter ‘c’ when it is really the letter ‘e’.

Splits: where the character is turned into multiple characters, often because the contrast is too light: ‘iii’ for the letter ‘m’.

Joins: multiple characters are incorrectly recognized as one: ‘m’ for ‘rn’.

Nonrecs: where the OCR engine detects the symbol, but has no idea what the symbol is.

Drops: the OCR engine fails to put anything out for a character.

Adds: characters are output that were not characters on the original, such as a speck of dirt coming out as a period or quote mark.
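The comparison of OCR output against proofread text can be sketched in a few lines. The following is a rough illustration only, not the tooling OCR developers actually use: it aligns the two strings with Python's standard `difflib.SequenceMatcher` and sorts each difference into the categories above. The function name `classify_ocr_errors` and the use of `~` as the engine's "unrecognized symbol" marker are assumptions made for this example.

```python
from collections import Counter
from difflib import SequenceMatcher


def classify_ocr_errors(truth: str, ocr: str, nonrec_mark: str = "~") -> Counter:
    """Count error types by aligning OCR output with proofread text.

    Heuristic sketch: `nonrec_mark` is an assumed placeholder that the
    (hypothetical) OCR engine emits for a symbol it could not identify.
    """
    counts = Counter()
    matcher = SequenceMatcher(None, truth, ocr, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        n_truth, n_ocr = i2 - i1, j2 - j1
        if tag == "equal":
            continue
        if tag == "delete":          # text in the original, nothing output
            counts["drop"] += n_truth
        elif tag == "insert":        # output with no counterpart in the original
            counts["add"] += n_ocr
        elif nonrec_mark in ocr[j1:j2]:
            counts["nonrec"] += ocr[j1:j2].count(nonrec_mark)
        elif n_truth == n_ocr:       # one-for-one substitution
            counts["misrec"] += n_truth
        elif n_truth < n_ocr:        # fewer characters became more
            counts["split"] += 1
        else:                        # more characters became fewer
            counts["join"] += 1
    return counts


print(classify_ocr_errors("modern", "modem"))  # Counter({'join': 1})
```

Because the categories overlap, as noted above, a character-level alignment like this can only approximate the classification; for instance, a split next to a misrec may be counted as a single split.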

This sort of approach is quite time-consuming. Still, modern OCR systems are quite accurate, doing very well on simple text-oriented documents such as novels. If the scanning is done carefully, the accuracy can be very close to perfect, with no errors on a typical page. Highly complex documents are very difficult to read using OCR: complicated textbooks, fancy magazine articles and hand-written documents are typically unreadable. OCR users become adept at quickly judging how to get the best results out of their OCR technology, as well as recognizing its limitations.

Speed has become less of an issue over the years as computers have become faster and faster. It is usually the scanner that is the limiting factor in the speed of converting documents, especially if the pages are being turned by hand and the book is being placed on a flat-bed scanner.

Institutions scanning large numbers of books often use higher-capacity production tools to increase throughput: chopping devices to remove bindings, high-speed page scanners and OCR software. Much commercial OCR software is highly visual, providing detailed images of pages for direct interaction. For visually impaired users, the solution is a reading system that integrates OCR and can be operated without seeing the page.

15.3 Reading Systems

Reading systems are designed for use by people with disabilities. They use OCR to provide the printed word in accessible form. There are three main elements of a reading system: the scanner, the OCR system and the accessible presentation.

Reading systems are also packaged differently from conventional OCR. The majority of reading systems are built on a standard personal computer, where the user is likely to also have the use of a screen reader or screen magnifier for other tasks. In addition, there are stand-alone reading systems, which bundle all of the components into a single-purpose unit that reads; the user is shielded from the computer capabilities of the device. The functioning of the device is the same, but the complexity is hidden to make the device less intimidating.

Once the text of the document is available, the user has many ways of accessing it. Accessible presentation technologies include text-to-speech, Braille and enlargement. These same technologies are often used to provide control feedback to users to operate the reading system. Stand-alone systems use text-to-speech almost exclusively. PC-based systems use one or more of the following access techniques.

Text-to-speech (TTS) is the most widely used technology for providing access to printed information with reading systems, because it is quite inexpensive and works for the great majority of visually impaired users. Synthetic speech sounds artificial, even after the major technology advances made over the past decade. This makes reading systems unattractive to individuals who are reluctant users of technology, especially seniors. However, the progress in speech technology means that TTS is much closer to sounding like a human narrator than the early computerized voices many consumers may still remember.

Braille technology is very popular with the segment of the blind population who read Braille, as it is more precise and efficient than speech output. Because the majority of the legally blind population are seniors who generally do not learn Braille, fewer than 20% of blind people are effective Braille readers. However, the educational and economic success of blind persons who are Braille readers tends to be much higher than that of non-Braille readers. Canadian surveys in the 1990s (Campbell et al. 1996) showed that employment-aged Braille readers had an unemployment rate lower than that of the general population, and also had better economic and educational attainment.

There are certain barriers to the use of Braille; it is very expensive and not included as a standard component of reading systems. Refreshable Braille displays typically have 20, 40 or 80 characters of Braille using plastic tactile pins that pop up to display the text, emulating the feel of dots embossed onto paper as is used in hardcopy Braille. This technology is indispensable to people who are both deaf and blind, and cannot use audible speech output. Reading systems also print Braille documents using specialized printers called embossers, which punch Braille dots into paper to create hardcopy Braille books and documents.

The last accessible presentation technology is enlargement. This is the electronic equivalent of the video magnifier, but access to the underlying text using OCR can make it easier to display the text usably to the low vision reader. For example, when working in a large font, wrapping the text on a screen is easier to do with digital text than with a picture of a page of text. The colours and contrast can be adjusted through a wider range of options than in a direct video magnifier.
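To illustrate why wrapping is easier with digital text, here is a minimal sketch that re-wraps recognized text to fit a screen at a chosen font size, using Python's standard `textwrap` module. The function name, the pixel-based parameters and the assumed average character width of 0.6 times the font size are all illustrative assumptions, not details of any particular reading system.

```python
import textwrap


def reflow_for_magnification(text: str, screen_width_px: int, font_px: int,
                             char_aspect: float = 0.6) -> list[str]:
    """Re-wrap recognized text so every line fits the screen width.

    `char_aspect` is an assumed average glyph width as a fraction of the
    font size; a real display would measure the actual font.
    """
    chars_per_line = max(1, int(screen_width_px / (font_px * char_aspect)))
    return textwrap.wrap(text, width=chars_per_line)


page = "Reading systems use OCR to provide the printed word in accessible form."
for line in reflow_for_magnification(page, screen_width_px=1200, font_px=100):
    print(line)
```

Doubling `font_px` simply halves the number of characters per line and the text reflows, whereas an enlarged page image would force the reader to scroll sideways along each printed line.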

Combinations of accessible technologies are quite common. Low vision users appreciate the option to view enlarged text and listen to it at the same time. This “bi-modal” display technique, originally designed for low vision users, is also very popular with people with learning disabilities such as dyslexia. Braille readers often use a combination of a Braille display with TTS.

At the time of writing, the leading reading systems for the visually impaired are products from companies such as Freedom Scientific (OPENBook) and Kurzweil (the Kurzweil 1000). These are PC-based software solutions, where adding the reading system software and a scanner turns a PC into a talking reading system. Users who are not very skilled with a PC can often use these reading systems because they can be operated effectively with a couple of keys. The user places a page of print on the scanner and presses a key that tells the reading system to scan the page, do the OCR and start reading the page aloud. The other important key starts and stops the speech output reading of the page.
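The two-key interaction described here can be modelled as a very small state machine. The sketch below is hypothetical and does not reflect how OPENBook or the Kurzweil 1000 are implemented: the class `TwoKeyReader`, its method names, and the `scan` and `recognize` callables standing in for the scanner and OCR engine are all invented for illustration, and splitting on full stops is a crude stand-in for real sentence detection.

```python
from typing import Callable, List, Optional


class TwoKeyReader:
    """Hypothetical model of a reading system driven by two keys."""

    def __init__(self, scan: Callable[[], bytes],
                 recognize: Callable[[bytes], str]) -> None:
        self.scan = scan              # stand-in for the scanner driver
        self.recognize = recognize    # stand-in for the OCR engine
        self.sentences: List[str] = []
        self.position = 0
        self.speaking = False

    def press_scan_key(self) -> None:
        """Scan the page, run OCR, and begin reading from the top."""
        text = self.recognize(self.scan())
        self.sentences = [s.strip() for s in text.split(".") if s.strip()]
        self.position = 0
        self.speaking = bool(self.sentences)

    def press_speech_key(self) -> None:
        """Start or stop speech output without losing the place."""
        if self.position < len(self.sentences):
            self.speaking = not self.speaking

    def next_utterance(self) -> Optional[str]:
        """Return the next sentence to hand to TTS, or None if silent."""
        if not self.speaking:
            return None
        sentence = self.sentences[self.position]
        self.position += 1
        if self.position == len(self.sentences):
            self.speaking = False     # end of page reached
        return sentence
```

Keeping the scanner and OCR behind plain callables means the same control logic could sit in front of a PC-based or a stand-alone unit; only the two key handlers are exposed to the user.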

As an alternative to PC-based systems, Freedom Scientific also makes a standalone version of their reading system, called SARA. In addition, there are a variety of other reading systems produced in different countries, such as the Poet by Baum in Germany and the ScannaR from HumanWare in New Zealand. These systems hide the computer details inside a single housing, with the controls and speakers generally built into the front of the scanner. While these are more expensive than PC-based systems for users who already own a PC, they are far less intimidating to the non-technical user.

There is also a new portable reading machine, the Kurzweil-National Federation of the Blind Reader, which is the marriage of a digital camera to a personal digital assistant. It weighs less than 400 g (under 1 lb). The user aims the camera lens at a page of text and the Reader speaks it aloud with TTS. It is relatively expensive, costing more than a stand-alone reading machine with a flatbed scanner. The typical user would be unlikely to read an entire book using such a device, but it seems well suited to daily reading tasks like reading mail and other short documents.

Other than the K-NFB Reader, reading systems are not very portable. This is especially true when considering reading entire books. Since portability in reading is important, users have come up with a number of ways to take their reading with them. Many Braille displays are designed to be portable as a notetaker device, enabling the user to store the scanned book in the device’s memory and carry around a small library of books. Given the large size of hardcopy Braille books, it is quite exciting to have hundreds of books stored inside a notetaker, which is smaller than a single Braille hardcopy volume.

There are also specialized portable devices with TTS synthesizers built in. The first type consists of TTS notetakers, either with QWERTY full keyboards or Braille chording keyboards (where text is entered by chording the six or eight Braille dots for each character). These are less expensive than the equivalent Braille notetakers which are equipped with refreshable Braille displays because of the expense of the Braille cell technology. In addition, there are also devices designed to simply be text readers, such as the BookCourier and Book Port. These have built-in TTS and simple numeric pads, and read books aloud through a headphone jack.
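Chorded entry can be illustrated with a short sketch: the set of dot keys held down together is turned into a single Braille cell. The code relies on the Unicode Braille Patterns block, where the cell for a set of dots is U+2800 plus a bitmask with bit n-1 set for dot n. The function names and the small letter table (standard Braille a to e only) are choices made for this example; a real notetaker would use a full translation table.

```python
from typing import Iterable

BRAILLE_BASE = 0x2800  # start of the Unicode Braille Patterns block

# Partial table for illustration: standard Braille letters a to e.
CHORD_TO_LETTER = {
    frozenset({1}): "a",
    frozenset({1, 2}): "b",
    frozenset({1, 4}): "c",
    frozenset({1, 4, 5}): "d",
    frozenset({1, 5}): "e",
}


def chord_to_cell(dots: Iterable[int]) -> str:
    """Return the Unicode Braille cell for the dots pressed together."""
    mask = 0
    for dot in dots:
        if not 1 <= dot <= 8:
            raise ValueError(f"Braille dots are numbered 1 to 8, got {dot}")
        mask |= 1 << (dot - 1)    # dot n corresponds to bit n-1
    return chr(BRAILLE_BASE + mask)


def chord_to_letter(dots: Iterable[int]) -> str:
    """Look up the print letter for a chord, or '?' if not in the table."""
    return CHORD_TO_LETTER.get(frozenset(dots), "?")


for chord in ({1}, {1, 2}, {1, 4}):
    print(sorted(chord), chord_to_cell(chord), chord_to_letter(chord))
```

Representing a chord as a set reflects the fact that the order in which the keys go down does not matter, only which keys are held when the chord is released.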

PC users are increasingly using software that creates MP3 digital audio files using TTS, so that commonly available music players can be used to play books aloud. Reading systems such as OPENBook and K1000 have an option to create MP3 files from scanned text. A software program that performs just this task is TextAloud, which takes a text file and creates an MP3 audio file. The quality depends on which