Chapter 4
The Concerns of Artificial Intelligence
Artificial Intelligence (AI) / Knowledge Based Systems (KBS) methods
share three general qualities. They are 1) heuristic in nature, 2)
concerned primarily with satisficing, and 3) reliant on symbolic and
qualitative processing techniques. These methods are applied within a
number of AI subdisciplines, including Expert Systems (ES), Image/Vision
Processing, Knowledge Representation (KR) & Heuristics, Machine Learning
(including Artificial Neural Networks), Natural Language Processing &
Automatic Speech Recognition (NLP & ASR), and Robotics.

Each of these topics is a field in its own
right, with its own group of scientists, conferences, publications, books,
and research centers. Often each discipline also has its own specially
built representation schemes, computers, and programming languages. As
partial proof of this, the Encyclopedia of Artificial Intelligence [AI1]
has hundreds of contributors and thousands of entries. Attempting to cover
all these areas, even superficially, in this short chapter would serve
little purpose. Instead, the coverage here is limited to those areas within
the very large discipline of AI which apply most directly to generic
problem solving and decision making. These areas include knowledge
representation and heuristics, expert systems, natural language processing,
and machine learning.

Knowledge Representation & Heuristics

KR may be
seen as the foundation area for all the other AI sub-disciplines. Systems
built to process vision or natural language use structures and techniques
developed within the scope of KR and heuristics. Minsky [AI8] has detailed
the primary knowledge representation schemes as Rules, Frames, Semantic
Nets, Neural Networks, and Predicate Logic.

Rules are an [IF x THEN y] deductive representation, with inference
mechanisms to control the flow of rule execution.
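
As a rough illustration of how an inference mechanism can control the flow of rule execution, the sketch below fires IF-THEN rules in a data driven (forward chaining) manner until no new conclusion can be drawn. The rule contents and the names RULES and forward_chain are hypothetical, chosen only for this example, and the loop is one simple way to build such a mechanism, not the method of any particular system.

```python
# Hypothetical rules: each is (set of IF conditions, THEN conclusion).
RULES = [
    ({"has_feathers"}, "is_bird"),
    ({"is_bird", "cannot_fly", "swims"}, "is_penguin"),
    ({"is_bird", "can_fly"}, "can_migrate"),
]


def forward_chain(facts, rules):
    """Fire every rule whose conditions hold until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its IF conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


derived = forward_chain({"has_feathers", "cannot_fly", "swims"}, RULES)
print(sorted(derived))
```

Starting from three observed facts, the first rule adds is_bird, which in turn lets the second rule add is_penguin. The third rule never fires because can_fly is not among the facts.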
Two general inference mechanisms exist: forward chaining and backward
chaining. Frames are more of a structural representation, which makes
analogical reasoning easy. Procedure-like code is attached to a frame but
lies dormant until the frame is accessed, at which time it may be executed
if appropriate. Semantic Nets are graphs, possibly containing cycles, whose
links represent relationships among elements. Neural Networks are an
inductive approach which "reads" example data and then establishes
connection strengths in a network based upon the frequency of example
occurrences. The strengths are constantly updated, exhibiting a "learning"
mechanism as the system matures.

Predicate Logic uses a formal logic representation combined with a
technique called resolution to perform logical inferencing. Other KR
schemes growing in popularity include genetic algorithms and fuzzy logic.

Fikes and Kehler [ES1] have provided an encompassing model of the types of
knowledge needing representation in a knowledge base. The diagram below
shows these
sources.

Figure 4-1. Types of Knowledge

The knowledge representation schemes discussed
above can be shown to be effective in representing these knowledge types.
For example, behavior descriptions and beliefs are well represented by
frames, by rules (using an attachment called a confidence factor), and
by fuzzy logic. Objects and relationships are an ideal match for both
semantic nets and frames. Heuristics and decision rules are naturally
represented by rules, procedures by procedural code, and typical situations
by frames.
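
To make the match between objects, relationships, and these schemes more concrete, the sketch below shows one possible encoding of a frame (slots, inheritance from a more general frame, and attached procedures that run only when accessed) and of a semantic net (a list of node-relationship-node triples). The class Frame, the vehicle and truck frames, and the NET triples are all hypothetical illustrations, not structures taken from any cited system.

```python
class Frame:
    """A frame: named slots, an optional parent, and attached procedures."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)
        self.procedures = {}  # slot name -> function, run only on access

    def attach(self, slot, procedure):
        self.procedures[slot] = procedure

    def get(self, slot):
        if slot in self.slots:
            return self.slots[slot]
        if slot in self.procedures:      # attached code is dormant until now
            return self.procedures[slot](self)
        if self.parent is not None:      # inherit from the more general frame
            return self.parent.get(slot)
        raise KeyError(slot)


# Hypothetical frames for illustration only.
vehicle = Frame("vehicle", wheels=4)
truck = Frame("truck", parent=vehicle, payload_tons=10, load_tons=4)
truck.attach("spare_capacity",
             lambda f: f.get("payload_tons") - f.get("load_tons"))

# A semantic net as (node, relationship, node) triples.
NET = [
    ("truck", "is_a", "vehicle"),
    ("vehicle", "has_part", "engine"),
    ("truck", "carries", "cargo"),
]


def related(node, relationship):
    """Follow one kind of link outward from a node."""
    return [target for source, rel, target in NET
            if source == node and rel == relationship]


print(truck.get("wheels"), truck.get("spare_capacity"), related("truck", "is_a"))
```

The truck frame has no wheels slot of its own, so the value is inherited from the vehicle frame, while spare_capacity is computed by the attached procedure only at the moment it is requested.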
The foundations of KR are tied closely to the methods of thinking or
reasoning outlined in Chapter 3, including deductive, inductive,
analogical, formal logic, procedural, and meta reasoning. An integral part
of any of these representations is the encoding of symbolic knowledge and
the later search for, and extraction of, this knowledge. Therefore both
heuristics and search are key concepts in knowledge engineering and in
problem solving and decision making.

Heuristics and Search

Heuristics are best described as methods which have proven generally
reliable but are not always correct. Often they are referred to as "rules
of thumb". Some examples of heuristics follow:

   If it ain't broke, don't fix it.
   In chess, select moves that protect the center of the board.
   If it looks like rain, carry an umbrella.
   To open the door, try pulling or pushing.

Figure 4-2. Examples of Heuristics

Heuristics often revolve around concepts
that cannot be reduced to simple numbers, or any very specific data relationships.
They serve to channel the effort applied to a problem or a goal.

Douglas Lenat, beginning with his doctoral work at Stanford, researched the
application of heuristics to problem solving. In this process he gathered
hundreds of fairly general heuristics and placed them into a problem
solving program. His program, Eurisko, proved very effective at
rediscovering "well known" discoveries across diverse domains such as set
theory, war gaming, and computer programming.
Dr. Lenat found some heuristics to be particularly valuable. These included
extreme cases, coalescence, and fortuitous accident. Extreme Cases is self
explanatory: begin by paying special attention to outliers. Why do these
"worst" or "best" points exist? Can we capitalize upon this knowledge?
Coalescence means "growing together". Using this concept, Eurisko formulated
the approaches of self destruction for damaged war fighting machines,
recursion for computer programs, and doubling and squaring in the math
domain. Finally, Fortuitous Accident forces continual re-examination of
results to see if somehow, while plodding along, you have bypassed an
intermediate goal and reached the top goal.

Search has been said to be a necessary, and the most important, element of
most artificial intelligence methods. To that end AI researchers have
formulated effective methods for search and discovery.
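
As a hedged illustration of what such search methods look like in practice, the sketch below implements depth first, breadth first, and best first search over a small graph. The graph GRAPH, the goal node "G", and the estimates in H are made up for this example. The three functions differ mainly in which partial path is taken up next: the newest, the oldest, or the one whose end point appears closest to the goal.

```python
import heapq
from collections import deque

# Hypothetical graph: node -> neighbours. H is a made-up estimate of the
# remaining distance to the goal "G", used only by best first search.
GRAPH = {
    "S": ["A", "B"],
    "A": ["C", "D"],
    "B": ["D", "G"],
    "C": [],
    "D": ["G"],
    "G": [],
}
H = {"S": 3, "A": 2, "B": 1, "C": 4, "D": 1, "G": 0}


def depth_first(start, goal):
    stack = [[start]]                      # most recently added path first
    while stack:
        path = stack.pop()
        if path[-1] == goal:
            return path
        for nxt in reversed(GRAPH[path[-1]]):
            if nxt not in path:            # never revisit a node on this path
                stack.append(path + [nxt])
    return None


def breadth_first(start, goal):
    queue = deque([[start]])               # oldest (shortest) path first
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in GRAPH[path[-1]]:
            if nxt not in path:
                queue.append(path + [nxt])
    return None


def best_first(start, goal):
    frontier = [(H[start], [start])]       # lowest estimate first
    while frontier:
        _, path = heapq.heappop(frontier)
        if path[-1] == goal:
            return path
        for nxt in GRAPH[path[-1]]:
            if nxt not in path:
                heapq.heappush(frontier, (H[nxt], path + [nxt]))
    return None


print(depth_first("S", "G"), breadth_first("S", "G"), best_first("S", "G"))
```

On this graph depth first search dives through A and C before finding the goal by way of D, while breadth first and best first both return the shorter route through B. Best first gets there with fewer expansions because the estimates steer it away from A.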
Many of these methods originated in the decision sciences and have been
"borrowed" and put to use in working with knowledge bases. There are dozens
of search methods and sub-methods. The more popular ones include:

- depth first
- breadth first
- hill climbing
- British Museum
- best first
- branch & bound
- A*
- minimax
- alpha beta

The first four of these are comparatively simple algorithmic procedures,
while the last five utilize additional information to pare down the search
space at each branch of the network path. The viability of each method
depends largely upon the purpose of the search. For example, depth first is
effective when most of the paths being investigated do not get very deep,
since deep searches that end without success are wasteful. Hill climbing is
effective when there is some "good" natural measure of the remaining
distance between where we are and our goal. With the branch and bound
method, originally developed in the operations research world, the shortest
of all uncompleted paths is evaluated and extended at each iteration. See
Winston [AI11] or Pearl [AI9] for excellent, in-depth discussions of this
subject.

Heuristics and search routines are only of value if they can be used in a
practical application. The AI discipline of expert systems provides just
such a playing field.

Expert Systems

The class of systems which deal with the application
of human expertise to problem solving is known as expert systems. Frequently
this definition of expert systems is extended to include all applications
where knowledge is applied to problem solving. In a strict sense, however,
an expert, that is, a person with many years of knowledge and experience,
must be intimately involved with the system's development. Systems where a
formal
expert is not involved are known as knowledge based systems.

Figure 4-3. Diagram of an Expert System

Expert Systems are typically built
using one or a combination of the knowledge representation schemes mentioned
earlier. As can be seen in the diagram above and in the table below, the
key element in the operation of an expert system is the inferencing mechanism.
In expert system lingo this part is known as the inference engine. 
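
A very small inference engine can be sketched in a few lines. The example below is a goal driven (backward chaining) engine over a hypothetical rule base; the rule names, the dictionary layout, and the function backward_chain are assumptions made for illustration. A practical engine would also query the user or a sensor when a needed fact is unknown and would guard against circular rules, both of which are omitted here.

```python
# Hypothetical rule base: conclusion -> list of alternative condition lists.
RULES = {
    "replace_battery": [["engine_wont_crank", "battery_dead"]],
    "battery_dead": [["lights_dim"], ["battery_old", "cold_morning"]],
}


def backward_chain(goal, facts, rules):
    """Return True if the goal can be established from the known facts."""
    if goal in facts:
        return True
    # Try each rule that concludes the goal; every one of its conditions
    # becomes a sub-goal to be established in turn.
    for conditions in rules.get(goal, []):
        if all(backward_chain(c, facts, rules) for c in conditions):
            return True
    return False


observed = {"engine_wont_crank", "battery_old", "cold_morning"}
print(backward_chain("replace_battery", observed, RULES))
```

The engine starts from the conclusion of interest and works back toward the observations, so only the facts relevant to that conclusion are ever examined.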
Inputs               Processing           Outputs
Q&A or               Inferencing          Expert Advice or
Sensors                                   Feedback Control

Figure 4-4. Expert System I-P-O.

This inferencing mechanism controls
the order in which users or sensors are queried, as well as the progress
toward deriving conclusions. The table below depicts some of the methods
used with particular knowledge representation schemes for inferencing.
Knowledge Representation             Inference Mechanism                   
Rules                                Forward Chaining or                   
                                     Backward Chaining                     
Predicate Logic                      Resolution                            
Examples                             ID3 Algorithm                         
Figure 4-5. Inference Mechanisms

These methods each provide a unique
approach toward finding solutions to problems. Forward Chaining is termed
a "data driven" approach because the system first collects substantial
amounts of data, and then sifts through this data, narrowing the potential
causes as it goes. This approach is best if only small amounts of data
are necessary. Backward Chaining is termed "goal driven" because it begins
with a stated goal, and then works to narrow the search space of
combinations that could lead to that particular goal. This approach is
better when many different combinations of data/clues are evident.

Resolution is a very interesting approach which is effectively "proof by
refutation". It works by first negating the logical statement you are trying
to prove. This negation is then added to the list of axioms underlying the
proof, and the list is resolved using logical equivalencies. If the result
is NIL (the empty clause, signalling a contradiction), the theorem is true;
if no contradiction can be derived, the theorem has not been proven. This
logic "programming" may be quite involved, and therefore requires
sophisticated mechanisms for converting the propositions into Well Formed
Formulas (WFFs). For example, the following WFF says "Every brick is on
something."

$$\forall x\,[\mathrm{brick}(x) \rightarrow \exists y\,[\mathrm{on}(x, y)]]$$

This is a calculus which applies quantifiers, predicates, variables, and
functions to propositions, which is why it is termed predicate calculus.

Since the output of an expert system is in the form of expert advice, or
may be used as a control mechanism, expert systems are frequently used as
support within other problem solving systems.
For example American Express uses an expert system to help decide whether
"unusual" credit card transactions should be approved. Before the expert
system was built these decisions were made by floor managers. Other examples
abound. Currently most expert systems are built using tools called expert
system shells. These tools bundle one or more of the major paradigms into
an easily controlled environment. Because of their relatively low cost
($200 - $10,000 for personal computers) these tools can provide significant
aid for improving productivity in problem areas. Paybacks of between 10
to 1 and 100 to 1 are commonplace.

Natural Language Processing/Understanding (NLP/U)

NLP
is a subject of research and progress in AI which directly attacks the
cognitive area of language and communication. In Chapter 3, language and
communication were addressed using a layered model, starting with phonetics
and moving up through "real world" modeling. Much of the research works
with this general model, either within a single layer or across the entire
ladder.
The purpose of most practical efforts in NLP is to create an environment
where humans can easily interact with the computer - i.e. through natural
language. This may mean relatively simple tasks such as English-like queries
to databases. Or it may mean much more complex tasks such as interpreting
the "meaning" hidden or overt in textual messages, ranging from wire service
news stories to technical professional journals. The potential for helping
to recognize problems is considerable. With the capability to digest large
volumes of textual material, an NLU system could alert a decision maker
(DM) whenever unusual signs or items of interest appear, whether in
internal information or from external sources. While the discussion in
Chapter 3 covered basic terminology, this section outlines some of the
larger scale efforts to conquer man-machine communication. The table below
provides a list of some of the
more significant projects.

Parsing
Schank's Conceptual Dependency Diagrams
Frames & Scripts
The Message Understanding Conference
MCC's CYC Project
Commercial Database Access Products

Figure 4-6. NLU Systems

Parsing is a method of decomposing or breaking
apart English statements in an attempt to better understand relationships
within the text. These techniques work from the phonological level up through
the syntactic and even semantic levels. Parse trees, which are similar
to sentence diagrams, are used to classify words and clauses in a sentence.
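
The sketch below builds such a parse tree for one short sentence using a simple top down parser. The lexicon, the four grammar rules shown in the comment, and the function parse are hypothetical and cover only this kind of sentence; the sketch illustrates the idea of a parse tree and is not a general English parser.

```python
# Hypothetical lexicon and grammar, for illustration only:
#   S -> NP VP      NP -> Det Adj* Noun      VP -> Verb PP      PP -> Prep NP
LEXICON = {
    "the": "Det", "red": "Adj", "brick": "Noun", "table": "Noun",
    "is": "Verb", "on": "Prep",
}


def parse(sentence):
    words = sentence.lower().split()
    pos = 0

    def expect(category):
        """Consume the next word if it belongs to the given category."""
        nonlocal pos
        if pos < len(words) and LEXICON.get(words[pos]) == category:
            pos += 1
            return (category, words[pos - 1])
        raise ValueError(f"expected {category} at word {pos}")

    def noun_phrase():
        parts = [expect("Det")]
        while pos < len(words) and LEXICON.get(words[pos]) == "Adj":
            parts.append(expect("Adj"))
        parts.append(expect("Noun"))
        return ("NP", *parts)

    def prep_phrase():
        return ("PP", expect("Prep"), noun_phrase())

    def verb_phrase():
        return ("VP", expect("Verb"), prep_phrase())

    tree = ("S", noun_phrase(), verb_phrase())
    if pos != len(words):
        raise ValueError("unparsed words remain")
    return tree


tree = parse("The red brick is on the table")
print(tree)
```

Each nested tuple is one node of the tree: its first element names the clause or word class, and the remaining elements are its children.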
Some variations on parsing include transitional grammars, which focus on
inter- and intra-sentence transition; nondeterministic methods, which can
use bottom up or top down analysis; deterministic methods, which delay
analysis until a sufficient "look ahead" is complete; and augmented
transition nets (ATNs), which force sentences into pre-arranged
relationships.

Conceptual Dependency Diagramming (CDD) is a method which operates
primarily at the semantic level. Roger Schank, while at Yale, developed his
own "calculus" of objects, relationships, and symbolic manipulation. CDDs
could be viewed as extensions to ATNs. The thrust of Schank's system was to
place all language within a finite logical set of objects and
relationships. In this manner the underlying "meaning" of the text could be
understood. Once established, a CDD could be moved into a frame/script to
derive even further "general knowledge" about a situation.

Another very recent effort in building theory
in the NLU arena has been the Message Understanding Conferences (MUC).
This series of three conferences has focused upon the interpretation of,
and extraction of meaning from, text. The conferences are sponsored by the
U.S. Naval Ocean Systems Center in San Diego. Each year a "contest" is held
to determine which software best interprets (via a set of Q&A) a text
message which is unknown (other than its general domain) before the
conference.
The MUC efforts can be seen as "locally" oriented. They only attempt to
derive meaning from the information that is provided - no additional external
information is available to help interpret the text. This contrasts with
other major projects, such as the one following, whose purpose is to add
information. These systems help place the meaning into a wider context.
Perhaps the most ambitious effort to date is the CYC (short for encyclopedia
and/or psychology) Project. This is a 10 year $35 million project being
administered at Microelectronics and Computer Technology Corporation (MCC)
in Austin, Texas. MCC is a research consortium consisting of major computer,
communications, aerospace, and other high technology companies. CYC is
an effort to embed 100 million axioms of knowledge into a database. Its
primary purpose is to overcome "brittleness" in current KBSs, i.e., to
install "Common Sense" into the computer. As humans, if we do not
understand a particular situation we "back up" and attempt to view the
reasoning surrounding an event based upon our wider, more general
knowledge. CYC is attempting to mimic this ability. Again the potential for
assisting problem solvers and decision makers is tremendous. An ability to
identify otherwise unforeseen problems, to draw upon analogies not seen, or
otherwise to use the vast network of heuristics already programmed into CYC
would then be at a DM's fingertips.

On a practical "now available" note, several software developers
now market natural language front ends for database products such as Oracle
and DB2. Some of these products include Intellect, Ramis, and Natural.
The existing limitations of these products revolve around both the sentence
structure they accept, i.e. Noun-Verb-Object, and the limited word sets
available for retrieval. Texas Instruments has copyrighted an approach
which is pseudo-menu driven. Using this approach, a user may select any
word from a list within a series of boxes on the screen. Users have found
this approach satisfying and accommodating.

Advances in natural language understanding should continue. Future work
looks toward the melding of NLU with speech recognition, the ability to
handle non-domain specific tasks, the incorporation of parallel computers
to speed up processing, and the potential of neural networks (following)
for incorporating learning into the understanding equation.

Neural Networks

Neural Networks (NN) are
a growing technology with strong potential for aiding in areas of human
cognitive weakness. The technology is modeled after human neural activity,
hence the name. NNs show particular strengths in pattern recognition,
classification, and adaptive processing schemes. They have alternatively
been called Machine Learning Systems, Connectionist Networks, and Artificial
Neural Systems. Figure 4-7 below is a diagram of a typical neural net.

Figure 4-7. Diagram of a Neural Network

As can be seen from this diagram, a NN consists of a set of inputs and
outputs along with what is termed a hidden layer. Each of the circles in
this diagram is called a neuron.
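
A single neuron is commonly modeled as a weighted summation of its inputs followed by a transfer function. The sketch below assumes a sigmoid transfer function and uses made-up weights; the function names neuron and layer are hypothetical. It shows how signals pass from the inputs through a hidden layer to an output, without any learning.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted summation followed by a sigmoid transfer function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def layer(inputs, weight_rows, biases):
    """Feed the same inputs to every neuron in one layer."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden = layer([1.0, 0.0], [[0.5, -0.5], [-1.0, 1.0]], [0.0, 0.0])
output = layer(hidden, [[1.0, -1.0]], [0.0])
print(hidden, output)
```

With the sigmoid transfer function every neuron's output lies between 0 and 1, so an output neuron can be read as being more or less strongly "turned on".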
The outputs consist of a response, such as identification of a specific
object or selection of a class within a classification scheme. The inputs
are the set of characteristics which are indicative of the objects being
identified. The hidden layer acts as a filter to accumulate the input
signals and turn on the correct output neuron.

As an example of a neural net application, the network outputs might
consist of different types of aircraft to be identified, such as Boeing 747,
DC-10, B-52, or C-5A. Inputs would consist of characteristics such as wing
shape, fuselage length, tail height, and number of engines.

Input                                Output
Wing Shape: Long/Thin
Fuselage:   Elongated
Tail:       High                     B-52
Engines:    Eight

The NN is "trained" using sets of these
inputs and the correct output for each input set. All input-output combinations
are fed to the network repeatedly until the network "settles in", i.e.
adapts to the particular task environment. At that point it would correctly
identify particular outputs based upon the set of inputs fed into the net.
Input sets not seen before would nonetheless trigger an output response.
In the above aircraft network it would select that aircraft which is most
like the new input set. This is the most interesting attribute of neural
nets and what sets them apart from a technology like ES: they can identify
something that is close to, but not exactly like, an object or a class
they have seen before. NNs can also provide a "proximity of fit" so that
a user may be made aware of differences in previously unidentified objects.
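
The training cycle described above can be sketched with a deliberately simplified network. The example below has no hidden layer: it uses one output neuron per aircraft and a perceptron style correction rule, which is only one of several possible learning algorithms. The yes/no feature encodings in TRAINING are hypothetical values invented for this illustration, not measurements of the actual aircraft.

```python
# Hypothetical yes/no encodings of the characteristics named in the text:
# [long thin wing, elongated fuselage, high tail, more than four engines]
TRAINING = [
    ([0, 1, 0, 0], "Boeing 747"),
    ([0, 0, 0, 0], "DC-10"),
    ([1, 1, 1, 1], "B-52"),
    ([0, 1, 1, 0], "C-5A"),
]
CLASSES = [name for _, name in TRAINING]


def scores(weights, features):
    x = features + [1]                      # constant bias input
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def classify(weights, features):
    s = scores(weights, features)
    return CLASSES[max(range(len(s)), key=lambda c: s[c])]


def train(max_epochs=100):
    weights = [[0] * 5 for _ in CLASSES]    # one output neuron per aircraft
    for _ in range(max_epochs):
        errors = 0
        for features, target in TRAINING:
            guess = classify(weights, features)
            if guess != target:             # strengthen right, weaken wrong
                x = features + [1]
                t, g = CLASSES.index(target), CLASSES.index(guess)
                weights[t] = [w + v for w, v in zip(weights[t], x)]
                weights[g] = [w - v for w, v in zip(weights[g], x)]
                errors += 1
        if errors == 0:                     # the network has "settled in"
            break
    return weights


weights = train()
print(classify(weights, [1, 1, 0, 1]))      # an input set never seen in training
```

The final input set matches none of the training examples (it resembles the B-52 entry but with a low tail), yet the network still responds with the aircraft it most resembles. The values returned by scores give a rough indication of how close the fit is.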
Most practical applications utilize a training approach known as
"supervised" to establish the net. However, some networks can also be
created without being shown the correct outputs. The purpose of these
networks, termed "unsupervised", is simply to classify. The network creator
establishes x number of classes, and the network then proceeds to divide
the set of inputs into x output categories based upon the collective input
characteristics. Non-parametric statistical methods exist which deal with
this same type of problem. There
are a number of approaches toward establishing the layers, the learning
mechanisms, and other characteristics of a neural network. Some of the
generalized network architectures created by researchers include perceptron,
back propagation, adaptive resonance theory, and adaptive linear element.
The diagram below details some of the elements in NN creation which are
subject to fine tuning based upon the chosen architecture.

   Network Architecture
      # Layers, # Clusters/Layer
      Layer Associativity/Connections
      # Neurons/Layer
      Directions of Feed/Recall

   Neuron Architecture
      Initial Weighting
      Initial Activation Level
      Summation Function
      Transfer Function
      Learning Algorithm

   Training Method
      Learning Algorithm
      Supervised or Unsupervised
      Weight Steps, degree changes

   Data
      Scale Type
      Conversion/Normalization

Figure 4-8. Art of Neural Network Creation

As will be detailed in Chapter
6, NNs have purposes considerably similar to those behind many statistical
methods, mathematical curve fitting, and another AI technique termed Memory
Based Reasoning [AI10].

Summary

The areas of knowledge representation and
heuristics, expert systems, natural language processing, and machine learning
have been examined. By effectively codifying knowledge through
representation schemes and heuristics, the power of a DM may be
substantially enhanced. This aid may take the form of judgmental refinement
or amplification, but is typically made available through easier access to
the expert opinion resident in expert systems. The ability of natural
language processing to help solve problems and aid the decision maker
remains crude at present. But efforts in working with large knowledge bases,
together with very fast computers and refined techniques, will create an
interface with the machine of monumental proportions, likely before the
turn of the century. Finally, neural networks continue to advance, with
potential for solving optimization type problems on a more exacting scale
and for attacking the classification problem on a new level. This
technology has shown considerable promise for providing solutions to
continuous scale problems such as trend prediction and to diagnosis
problems of all types.