Advances in the Human Computer Interface

The goal of improving the Human Computer Interface (HCI) is to make our work with the computer more "natural." The aim is to capitalize upon our sensory network (sight, hearing, speech, touch, taste, and smell) and to eliminate the "crude" devices with which we must now deal: the keyboard, mouse, and computer monitor.

A number of interesting HCI devices and areas are emerging. However, we will examine only three of them in this presentation. These interfaces are particularly applicable to Department of Defense work. We will examine them in some depth and then synthesize the material in a fielded application arena. The three areas are video teleconferencing, automatic speech recognition, and virtual reality.

Video Teleconferencing

Video teleconferencing (VTC) has been around for some time now. Back in the mid-1980s a one-hour, two-way VTC session would have cost thousands of dollars. More recently, with the introduction of high-bandwidth lines at low cost, it has become far more widespread, and that same connection may now be procured for tens of dollars. With Digital Subscriber Line (DSL) delivering 640 kbps of bandwidth to the consumer desktop, quality VTC becomes a more viable option, even for the home computer.

At the same time, over the past 5 years many DOD installations have built VTC centers with high-end equipment from vendors such as PictureTel. They have also procured permanent links from major carriers such as MCI and AT&T for image transmission. Virtually all senior-level DOD personnel have now experienced a VTC session, and many use it regularly for routine meetings over distances. The emerging opinion is that, for the right type of task, it saves travel hassles, per diem dollars, and time.

Defining the "right type of task" is less obvious. VTC delivers well when the meeting is routine, i.e., the participants have met before and are covering material that does not require particularly serious deliberation. Caution must be exercised when making serious decisions in this mode, and when meeting with individuals who have not previously met. An often-cited example of the danger of remote conferencing for serious decision making is the Challenger accident. Morton Thiokol engineers conferred with NASA officials by teleconference on the evening before the launch, and the seriousness of their concerns apparently did not come across fully.

The chart below provides some criteria for judging the cost effectiveness of the medium, along with the issues raised by each category. Ultimately you get what you pay for. But there is no doubt that costs in this area will continue to fall and bandwidth will continue to increase.

Speech and Natural Language Processing

The ability of our computing machines to recognize speech (not wreck a nice beach) can be better understood by dividing the task into two areas - automatic speech recognition (ASR) and natural language processing (NLP). Speech recognition has to do with capturing the sound waves, matching them against a user's pre-recorded phoneme sound patterns, and then pulling a word from a dictionary of words stored by phoneme. The focus of NLP is to "understand," i.e., to interpret what has been said in order to provide the user with feedback or an answer to a query.
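The recognition path described above (match each captured sound against the user's stored phoneme patterns, then look the resulting phoneme sequence up in a dictionary) can be sketched as a toy in Python. Everything here is an assumption made for illustration: the two-number "sound patterns," the `PHONEME_TEMPLATES` table, and the two-word `LEXICON` are invented and do not describe how any real recognizer works.

```python
import math

# Hypothetical per-user templates: one made-up 2-number "sound pattern"
# per phoneme, as if recorded during an enrollment session.
PHONEME_TEMPLATES = {
    "S": (0.1, 0.9),
    "P": (0.9, 0.1),
    "B": (0.8, 0.3),
    "IY": (0.5, 0.5),
    "CH": (0.2, 0.2),
}

# Hypothetical dictionary of words stored by phoneme sequence.
LEXICON = {
    ("S", "P", "IY", "CH"): "speech",
    ("B", "IY", "CH"): "beach",
}


def nearest_phoneme(frame):
    """Return the phoneme whose stored template is closest to the frame."""
    return min(
        PHONEME_TEMPLATES,
        key=lambda p: math.dist(frame, PHONEME_TEMPLATES[p]),
    )


def recognize(frames):
    """Map captured frames to phonemes, then look the sequence up.

    Returns None when the phoneme sequence is not in the dictionary.
    """
    phonemes = tuple(nearest_phoneme(f) for f in frames)
    return LEXICON.get(phonemes)


if __name__ == "__main__":
    captured = [(0.12, 0.88), (0.88, 0.12), (0.52, 0.48), (0.20, 0.25)]
    print(recognize(captured))
```

Note that the sketch stops at producing a word. Deciding what the speaker meant by the word is the separate NLP problem.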

If you had delved into the first area, ASR, ten years ago, you would have found it very expensive and very unreliable. Progress since then has been remarkable. Now for $29.95 you can buy "Point and Speak," a package from Dragon Systems that lets you dictate into your word processor in an almost normal tone of voice and speed, with very few errors. Point and Speak will work with any application on your computer, such as e-mail or a program for producing viewgraphs.

The problem of natural language processing, however, remains a challenge. The issue can be illustrated quite effectively with a couple of short sentences.

Consider the sentence, "The chickens are ready to eat." What does this mean? Are the barnyard chickens interested in getting their grain, or are the household residents interested in getting their chicken, mashed potatoes, and gravy? What is important here is something known as context dependency. The meaning of that sentence depends upon where, when, and how it is spoken. Here are some other interesting examples, taken from actual newspaper headlines.

"Ban on Nude Dancing on Governor's Desk" from a Georgia newspaper column discussing current legislation

"Lebanese chief limits access to private parts" - talking about an Army General's initiative

"Death may ease tension" - an article about the death of Colonel Jean-Claude Paul in Haiti

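Context dependency, as in "The chickens are ready to eat," can be illustrated with a very small Python sketch. The two readings and the context labels (`"barnyard"`, `"dinner table"`) are invented for this example. A real NLP system would have to derive both the readings and the context itself, which is exactly the hard part.

```python
# Hypothetical table: one sentence, two readings, keyed by an invented
# context label. Nothing here is derived automatically.
READINGS = {
    "The chickens are ready to eat.": {
        "barnyard": "The chickens are hungry and want their grain.",
        "dinner table": "The cooked chicken is ready to be served.",
    }
}


def interpret(sentence, context=None):
    """Return the reading selected by context, or every candidate reading.

    Without a known context the sentence stays ambiguous, so all
    candidate readings are returned.
    """
    candidates = READINGS.get(sentence, {})
    if context in candidates:
        return [candidates[context]]
    return list(candidates.values())


if __name__ == "__main__":
    sentence = "The chickens are ready to eat."
    print(interpret(sentence))              # ambiguous: two readings
    print(interpret(sentence, "barnyard"))  # context selects one
```

The words alone leave two candidates. Only the added context argument narrows them to one.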
There have been, and continue to be, efforts to corral the massive natural language interpretation problem. Scientists such as Roger Schank at Northwestern University and Douglas Lenat of Cycorp, a spin-off of MCC, a technology think tank and development center in Austin, have created ontologies that allow language to be properly framed and then queried. But it is a very large issue. Dr. Lenat has created a very large knowledge base called CYC, which already holds an estimated 5 million axioms, such as "a dog has four legs" or "when the sun is shining it is likely daytime."
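To give a feel for what storing and querying such axioms involves, here is a minimal Python sketch built around the two example axioms. It is an assumption-laden toy, not CYC's actual representation or query language: the `IS_A`, `PROPERTIES`, and `RULES` tables and the entity "Fido" are invented for illustration.

```python
# Toy axioms, loosely modeled on the two examples in the text.
IS_A = {"Fido": "dog", "dog": "mammal"}        # category membership
PROPERTIES = {"dog": {"legs": 4}}              # "a dog has four legs"
RULES = [("the sun is shining", "it is likely daytime")]  # if-then axiom


def lookup(entity, prop):
    """Walk up the is-a chain until some category supplies the property."""
    while entity is not None:
        if prop in PROPERTIES.get(entity, {}):
            return PROPERTIES[entity][prop]
        entity = IS_A.get(entity)
    return None


def infer(observations):
    """Apply each if-then rule once to the set of observed facts."""
    return {conclusion for condition, conclusion in RULES
            if condition in observations}


if __name__ == "__main__":
    print(lookup("Fido", "legs"))
    print(infer({"the sun is shining"}))
```

Even this toy shows why the problem is large: the system knows nothing about Fido's legs unless someone has stated both that Fido is a dog and that dogs have four legs.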

The Wearable Computer and Virtual Reality

When we think of computers we typically associate them with two large boxes - the main computer casing and the monitor. Inside that casing we have circuit cards, computer chips and storage devices. On that monitor we look at and manipulate numbers and words. The drawbacks to both the large casing and to the computer monitor are numerous.

Portable computers have been created to help us move about with the computing power that we need. While there have been significant advances in portable PCs, they remain very "unnatural" devices. There are better alternatives: wearable computers and virtual reality. Much more common use of these capabilities is just over the horizon.

Wearable computers (WC), such as the one pictured below from Xybernaut Corporation, are becoming popular for a variety of computing tasks. Combined with heads-up displays and Internet connections, the machine can be used very effectively wherever the computer needs to be taken to the data, rather than the data to the computer. In manufacturing, workers can record or access significant data while moving around the factory floor. In agriculture, the farmer can move around the crops, fields, or orchards to quickly record or access pertinent data. In hardship areas, the wearable computer provides access to information that can make life much easier, more effective, or both.

The WC combined with virtual reality (VR) is a particularly potent combination. Five years ago VR existed only in very well funded national-level laboratories. Creating virtual reality "worlds" required specialized programming skills and $500,000 computers. But that scenario has changed substantially and will continue to change even more dramatically. Now many virtual reality worlds run easily on PCs, and hundreds of these worlds can be found free on the Internet. The Virtual World Wide Web is a good place to start exploring these exciting worlds.

VR provides us with the capability to move "into" new worlds rather than viewing them from the outside. Most people think of virtual reality as a method for simulating "concrete" worlds, for example as a way to let us operate cars, airplanes, and tanks in a safe environment. But VR combined with a heads-up display can also place an otherwise invisible wiring diagram on top of a visible aircraft so that a technician can quickly locate junctions or components. Or it can help a doctor performing brain surgery locate a tumor much more quickly and effectively by mapping a Magnetic Resonance Imaging (MRI) picture on top of the actual brain.