CHAPTER 16
Speech Recognition and Synthesis
Speech Recognition
It’s important to distinguish between the speech engine and the applications that call the engine. The Mac OS X speech-recognition engine
■ is speaker independent. Users don’t have to invest any time training it to recognize their voice before they can use it.
■ supports continuous speech. Users don’t have to pause between words.
■ has a large vocabulary (more than 120,000 words) and linguistic analysis to predict the correct pronunciation of words not in its dictionary.
■ works with “far-field” microphones, so users don’t have to tether themselves to the computer with a headset. In addition, most Macintosh computers have a built-in microphone. All microphones in Macintosh computers are optimized to work well with the Mac OS X speech-recognition engine.
■ works with a finite-state grammar. This is the most successful general-purpose speech technology and is optimal for uses such as interactive dialogs, command-and-control, and language/literacy. It is not optimal, however, for unrestricted dictation.
In order to have the most flexibility in using speech recognition, an application should call the speech engine functions directly. Doing so requires the following steps:
1. Tell the engine what to listen for (that is, define what users can say).
2. Start listening.
3. Act on the message sent to the application when the engine hears a defined command.
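The three steps above can be sketched in a few lines. The example below is a toy model only: `ToyRecognizer`, `define_command`, `start_listening`, and `hear` are hypothetical names invented for illustration and are not part of the Mac OS X speech engine. Typed text stands in for recognized speech.

```python
from typing import Callable, Dict, List, Optional


class ToyRecognizer:
    """Hypothetical stand-in for a speech engine; NOT an Apple API."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], str]] = {}
        self._listening = False

    def define_command(self, phrase: str, handler: Callable[[], str]) -> None:
        # Step 1: tell the engine what to listen for.
        self._handlers[phrase.lower()] = handler

    def start_listening(self) -> None:
        # Step 2: start listening.
        self._listening = True

    def hear(self, utterance: str) -> Optional[str]:
        # Step 3: when a defined command is heard, notify the application
        # by calling the handler it registered for that phrase.
        if not self._listening:
            return None
        handler = self._handlers.get(utterance.lower())
        return handler() if handler else None


log: List[str] = []


def open_inbox() -> str:
    log.append("inbox opened")
    return "inbox opened"


recognizer = ToyRecognizer()
recognizer.define_command("Open my inbox", open_inbox)
before = recognizer.hear("open my inbox")   # ignored: not listening yet
recognizer.start_listening()
after = recognizer.hear("open my inbox")    # the handler runs
unknown = recognizer.hear("close the pod bay doors")  # not a defined command
```

In this sketch the application never polls the engine. It registers what users can say, then reacts only when a defined command arrives, which mirrors the message-driven flow the steps describe.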
Alternatively, you can easily provide basic speech control of your application by taking advantage of Apple’s Speakable Items application (see “Speakable Items”).
Apple Computer, Inc. June 2002
Speakable Items
Speakable Items, an application built in to the Mac OS X user interface, calls the speech-recognition engine and provides all users with the ability to control their computer by voice. It does this by creating a folder in the user’s Library folder (Library/Speech/Speakable Items). Anything in the Speakable Items folder is launched when the user speaks its name; saying its name is equivalent to double-clicking that item’s icon, except that it works even if the folder is not visible or the Finder is not active.
Developers can add their own items to the Speakable Items folder, such as AppleScript scripts, documents, templates, applications, or aliases. When the user speaks an item’s name, the item opens or runs just as if it had been double-clicked.
The Speakable Items folder can also contain XML files that associate spoken commands with keyboard shortcuts. “Make this bold,” for example, sends Command-B; “Copy this to the Clipboard” sends Command-C.
The Speakable Items folder also contains an Application Speakable Items folder, which contains a subfolder for each application, so that you can create spoken commands that apply only to your application. Items in your application’s folder are speakable only when your application is active (frontmost). Alternatively, you can include application-specific speakable items in a folder within your application bundle.
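The folder layout described above can be illustrated with a short sketch. It builds a throwaway copy of the Library/Speech/Speakable Items hierarchy in a temporary directory and works out which item names would be speakable for a given frontmost application. The function `speakable_items`, the application names, and the item names are all hypothetical, and the sketch does not model the alternative of storing items inside the application bundle.

```python
import tempfile
from pathlib import Path
from typing import List


def speakable_items(speakable_root: Path, active_app: str) -> List[str]:
    """Names speakable right now: global items plus the active app's items."""
    app_root = speakable_root / "Application Speakable Items"
    # Global items: everything at the top level except the per-application folder.
    names = [p.name for p in speakable_root.iterdir() if p != app_root]
    # Application-specific items count only while that application is frontmost.
    app_dir = app_root / active_app
    if app_dir.is_dir():
        names += [p.name for p in app_dir.iterdir()]
    return sorted(names)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical contents, created only for this demonstration.
    root = Path(tmp) / "Library" / "Speech" / "Speakable Items"
    (root / "Application Speakable Items" / "MyEditor").mkdir(parents=True)
    (root / "Application Speakable Items" / "OtherApp").mkdir()
    (root / "Open my browser").touch()
    (root / "Application Speakable Items" / "MyEditor" / "Insert the date").touch()
    (root / "Application Speakable Items" / "OtherApp" / "Start the slide show").touch()

    in_editor = speakable_items(root, "MyEditor")
    in_finder = speakable_items(root, "Finder")
```

With MyEditor frontmost, both the global item and the MyEditor item are available. With any other application frontmost, only the global item remains.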
The Speech Recognition Interface
Mac OS X provides a consistent, well-integrated user interface for speech recognition across all applications. This interface comprises the following items:
■ The Speech pane of System Preferences is where users can control general speech-recognition settings, regardless of which application is using it. These settings include microphone volume (helpful for using non-Apple microphones) and the listening mode (push-to-talk versus continuous listening). Developers get these interface features for free regardless of how they use the speech-recognition engine.
The Speech pane also contains controls specific to the Speakable Items application, such as whether Speakable Items is on or off and whether it applies to menus and window controls.
■ The speech feedback window provides information about the level of sound input, whether the system is actively listening, and which listening method the user has chosen.
Figure 16-1 The speech feedback window
■ The Speech Commands window shows users what they can say at a specific time. It also displays what the speech-recognition engine “heard” and what it spoke to the user in response.
Figure 16-2 The Speech Commands window
Speech-Recognition Errors
Because speech and sounds can be ambiguous, the speech-recognition process sometimes produces errors. Two such errors are the following:
■ A rejection error occurs when the system hears something it considers speech (rather than noise) but can’t match the sound to a known command. By default, this kind of error returns “???”; your application can specify its own rejection word or other response.
■ A substitution error occurs when the system incorrectly interprets a sound, for example by recognizing the command “cut” as “quit.”
Because substitution errors are generally more annoying to users than rejection errors, the speech-recognition engine has been tuned to prefer to reject rather than substitute.
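The trade-off between the two error types can be illustrated with a simple matching threshold. The sketch below is an analogy, not the engine’s algorithm: it compares typed text with Python’s `difflib.SequenceMatcher` as a crude stand-in for acoustic matching, and the `recognize` function and its threshold values are assumptions chosen for the demonstration. Only the default rejection word “???” comes from the text above.

```python
from difflib import SequenceMatcher
from typing import List

REJECTION_WORD = "???"  # default rejection response described above


def recognize(heard: str, commands: List[str], threshold: float) -> str:
    """Return the best-matching command, or the rejection word if no
    command is similar enough to what was heard."""
    best, best_score = REJECTION_WORD, 0.0
    for command in commands:
        score = SequenceMatcher(None, heard.lower(), command.lower()).ratio()
        if score > best_score:
            best, best_score = command, score
    return best if best_score >= threshold else REJECTION_WORD


commands = ["cut", "quit"]
# "kit" stands in for an unclear utterance that matches neither command well.
strict = recognize("kit", commands, threshold=0.9)    # rejected
lenient = recognize("kit", commands, threshold=0.5)   # guessed as "quit"
exact = recognize("quit", commands, threshold=0.9)    # clear match
```

A high threshold turns the unclear input into a harmless rejection. A low threshold turns it into a guess, and a wrong guess such as “quit” is exactly the substitution error users find more annoying.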
With the “Listen for” AppleScript command, you can easily test your application’s spoken commands without writing any code. Consult the Mac OS X Developer Tools CD for examples of using the speech-recognition server’s “Listen for” command.
Guidelines for Implementing Speech Recognition
To minimize speech-recognition errors, observe the following guidelines in designing your spoken interface.
■ Avoid commands that sound similar but have different meanings. For example, “Turn backups on” and “Turn backups off” differ by only one phoneme and might be confused by the recognizer in a noisy environment.
■ Avoid single-word commands; they are less distinctive and can be confused by the recognizer. “Cut” sounds similar to “Quit,” for example. Phrases that are from three to six words long are more distinctive and will be better recognized.
■ Define commands that are easy to remember, feel natural to say, and don’t conflict with menu items or controls.
■ Provide speech-recognition commands that add value by doing more than can be accomplished through a single click or keyboard equivalent.
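Some of these guidelines can be checked mechanically before any user testing. The sketch below is one possible pre-flight check, written under stated assumptions: `lint_commands` is a hypothetical helper, spelling similarity from `difflib.SequenceMatcher` is only a rough proxy for how alike two phrases sound, and the 0.8 limit is an arbitrary starting point rather than a recommended value.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import List


def lint_commands(commands: List[str], similarity_limit: float = 0.8) -> List[str]:
    """Flag commands that go against the length and distinctiveness guidelines."""
    warnings = []
    for command in commands:
        words = len(command.split())
        if words == 1:
            warnings.append(f"single-word command: {command!r}")
        elif not 3 <= words <= 6:
            warnings.append(f"outside three to six words: {command!r}")
    # Compare every pair once; similar spelling hints at similar sound.
    for a, b in combinations(commands, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= similarity_limit:
            warnings.append(f"sound-alike risk: {a!r} / {b!r}")
    return warnings


warnings = lint_commands([
    "Quit",
    "Turn backups on",
    "Turn backups off",
    "Start the nightly backup now",
])
```

Here the check reports the single-word command and the on/off pair. A check like this cannot judge whether a phrase is easy to remember or natural to say, so it complements listening tests rather than replacing them.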