Diff for "Voice Input" - 4ourth Mobile Design Pattern Library

Differences between revisions 4 and 5

Direct Voice Input (DVI), sometimes called Voice Input Control (VIC), is a style of Human-Machine Interaction (HMI) in which the user speaks voice commands to issue instructions to the machine. It has found some usage in the cockpit design of several modern military aircraft, particularly the Eurofighter Typhoon, the F-35 Lightning II, the Dassault Rafale and the JAS 39 Gripen, having been trialled on earlier fast jets such as the AV-8B Harrier and F-16 VISTA. A study has also been undertaken by the Royal Netherlands Air Force using voice control in an F-16 simulator.[1]

Always use "user-independent" systems for general use. Build user voice profiles (user-dependent systems) only when needed, such as for specialized languages or libraries of words.

Problem

A method must be provided to control some or all of the functions of the mobile device, or provide text input, without handling the device.

Solution

Both the needs of low-vision ... safety such as navigation while driving...

While this has... mobile uniquely positioned to accept this due to the sensors... many already have speakers, microphones designed for voice quality communications, and embedded into the device chipset.

Since most mobile devices are now connected, or are only useful when connected to the network, an increasingly useful option is for a remote server to perform all the speech recognition functions. This can even be used for fairly core functions, such as dialing the handset, as long as a network connection is required for the function to be performed anyway. For mobile handsets, the use of the voice channel is especially advantageous, as no special effort must be made to gather or encode the input audio.
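The rule of thumb above (remote recognition is acceptable when the function needs the network anyway) can be sketched as a small decision function. This is only an illustration: `VoiceFunction`, `choose_recognizer` and the "local" fallback for offline functions are hypothetical names and assumptions, not part of any real platform API.

```python
from dataclasses import dataclass


@dataclass
class VoiceFunction:
    """A device function that can be triggered by voice (hypothetical type)."""
    name: str
    needs_network: bool  # True if the function cannot complete offline anyway


def choose_recognizer(function: VoiceFunction, network_up: bool) -> str:
    """Return 'remote', 'local', or 'unavailable' for a requested function."""
    if function.needs_network:
        # The function fails without a connection regardless, so sending
        # the audio to a server adds no new dependency.
        return "remote" if network_up else "unavailable"
    # Assumption: functions that work offline keep recognition on the device
    # so they do not gain a network dependency.
    return "local"
```

For example, dialing needs the network, so `choose_recognizer(VoiceFunction("dial", True), network_up=True)` selects the remote recognizer, while an offline function stays on the device.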

Variations

Voice Command - Use voice to input a limited number of commands, akin to the use of Accesskeys but with a larger set of commands. Affordance is a big problem: much like gestural or other touch commands, the available commands are not shown on screen, and generally cannot be due to space...
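A minimal sketch of a limited command vocabulary follows, assuming the recognizer has already produced a text phrase. The command names, the `match_command` and `spoken_help` helpers, and the 0.75 similarity cutoff are all illustrative assumptions; the spoken list of commands is one possible way to offset the affordance problem, not something the pattern prescribes.

```python
import difflib

# Hypothetical command vocabulary: spoken phrase -> internal action name.
COMMANDS = {
    "call": "start_call",
    "send message": "compose_message",
    "open menu": "open_menu",
    "go back": "navigate_back",
}


def match_command(heard, cutoff=0.75):
    """Map a recognized phrase to an action, or None if nothing is close."""
    phrase = " ".join(heard.lower().split())  # normalize case and spacing
    candidates = difflib.get_close_matches(phrase, list(COMMANDS), n=1, cutoff=cutoff)
    return COMMANDS[candidates[0]] if candidates else None


def spoken_help():
    """Text to read aloud, since the commands are not visible on screen."""
    return "You can say: " + ", ".join(COMMANDS)
```

Keeping the vocabulary small makes fuzzy matching tolerable; unmatched input returns `None` so the caller can re-prompt instead of guessing.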

Text - Speech recognition (the term "voice recognition" implies user-dependent input)... used to type with the voice

Interaction Details

Mobile devices usually use key or touch input and visual output, so voice input has to be initiated from one of these methods... To support low-vision users or eyes-off use cases, a key or key combination is suggested. A common choice is a long press of a key already associated with audio, such as speakerphone.

When voice input becomes active, the system should play a Tone or a voice readback/reminder of the condition (e.g. "Say a command")... After this, the system accepts input.

When input is complete, the system should usually read back what was entered...

During the readback, much like pen input where a correction period is provided, saying "no" wipes the entry or allows selection from a list...
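The sequence described above (key press, prompt, input, readback, correction) can be sketched as a small state machine. This is a hedged illustration only: the `VoiceSession` class, its method names, and the readback wording are assumptions, the list `spoken` stands in for real audio output, and the "selection from a list" branch is left out.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    LISTENING = auto()
    CONFIRMING = auto()


class VoiceSession:
    """Hypothetical controller for one voice-input interaction."""

    def __init__(self):
        self.state = State.IDLE
        self.pending = None   # entry awaiting confirmation
        self.accepted = []    # entries that survived the correction period
        self.spoken = []      # stand-in for tones and speech output

    def _say(self, text):
        self.spoken.append(text)

    def long_press(self):
        """Key-initiated start: announce the condition, then accept input."""
        if self.state is State.IDLE:
            self._say("Say a command")
            self.state = State.LISTENING

    def hear(self, text):
        if self.state is State.LISTENING:
            self.pending = text
            self._say(f"You said: {text}")  # read back what was entered
            self.state = State.CONFIRMING
        elif self.state is State.CONFIRMING and text.strip().lower() == "no":
            self.pending = None             # "no" wipes the entry
            self._say("Say a command")
            self.state = State.LISTENING

    def correction_window_elapsed(self):
        """No correction arrived in time, so the entry is accepted."""
        if self.state is State.CONFIRMING:
            self.accepted.append(self.pending)
            self.pending = None
            self.state = State.IDLE
```

Modeling the correction period as an explicit event keeps the timing policy outside the state machine, so the length of the window can be tuned without touching the flow.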

For Voice Command, as much interactivity as practical should be provided. When controlling the device OS, it must be possible to perform all the basic functions, by offering controls such as Directional Entry and the ability to activate menus. This may also mean that a complete scroll-and-select style focus assignment system is required, even for devices that otherwise rely purely on touch or pen input.

Presentation Details

Antipatterns

Audio systems and processing cannot be relied on to be full duplex, so do not let too-fast responses get in the way of input, etc.
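One way to avoid this antipattern is to serialize audio output and input with a short guard interval between them. The sketch below assumes a single shared audio path; the `HalfDuplexGate` class, its method names, and the guard duration are hypothetical, and times are plain seconds supplied by the caller rather than read from a real audio stack.

```python
class HalfDuplexGate:
    """Keeps audio output and input from overlapping on a half-duplex path."""

    def __init__(self, guard_seconds=0.3):
        self.guard_seconds = guard_seconds
        self.busy_until = 0.0  # time before which the audio path is in use

    def start_output(self, now, duration):
        """Schedule a prompt; return the time at which it may actually start."""
        start = max(now, self.busy_until)
        self.busy_until = start + duration + self.guard_seconds
        return start

    def end_input(self, now):
        """Record that the user stopped speaking at `now`."""
        self.busy_until = max(self.busy_until, now + self.guard_seconds)

    def can_listen(self, now):
        """True once any prompt, plus the guard interval, has finished."""
        return now >= self.busy_until
```

With a 0.5 second guard, a response requested 0.1 seconds after the user stops speaking is held back until the guard has passed, rather than cutting across the tail of the input.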

- Revision 4 as of 2011-04-04 23:21:41
  Size: 2069
  Editor: shoobe01
+ Revision 5 as of 2011-04-05 00:44:39
  Size: 3433
  Editor: shoobe01
 Line 9:
+A method must be provided to control some or all of the functions of the mobile device, or provide text input, without handling the device.
-Line 11:
+Line 13:
+Both the needs of low-vision ... safety such as navigation while driving... 

While this has... mobile uniquely positioned to accept this due to the sensors... many already have speakers, microphones designed for voice quality communications, and embedded into the device chipset. 

Since most mobile devices are now connected, or only are useful when connected to the network, an increasingly useful option is for a remote server to perform all the speech recognition functions. This can even be used for fairly core functions, such as dialing the handset, as long as a network connection is required for the function to be performed anyway. For mobile handsets, the use of the voice channel is especially advantageous as no special effort must be made to gather or encode the input audio.
-Line 27:
+Line 37:
+For Voice Command, as much interactivity as practical should be provided. When controlling the device OS, all the basic functions must be able to be performed, by offering controls such as '''[[Directional Entry]]''' and the ability to activate menus. This also may mean that a complete scroll-and-select style focus assignment system is required, even for devices that otherwise rely purely on touch or pen input.