Probabilistic Methods for Intelligent Software Systems
Computer scientists and programmers increasingly develop complex software
solutions to intelligent tasks such as expert systems, speech and natural
language understanding, vision, knowledge discovery, automatic text indexing,
image matching and indexing, image clustering and classification,
robotics and agents, systems monitoring, health management and diagnosis,
scientific instrumentation, applied physics, molecular biology, networking
and communications, and so forth. These tasks involve a high degree of
uncertainty for the following reasons:
Complexity:
- Intractable search problems (like aircraft scheduling), very large data
problems (like image analysis of terabyte earth science data), and complex
physical models (atmospheric modeling or lighting models in graphics) have
computational complexity that defies exact or optimal solution. One can
approximate a solution, but is then uncertain as to how good the approximation
is. In addition, at each choice point during the computation, uncertainty
exists as to the consequences of that choice for the rest of the computation.
Incompleteness:
- The inputs to the problem may be inadequate to yield a unique solution:
for instance, in learning only a small number of examples may be given,
or in diagnosis there may be many possible explanations of the phenomena.
In image interpretation, one will be given a coarse pixel representation
of a complex object. One is uncertain as to which of the possible solutions
may be the best.
Intrinsic uncertainty:
- "Noise" in its pure form crops up in many problems, for instance,
due to sampling, truncation, instrument drift, and human error. This is
uncertainty in its statistical sense.
Approximation:
- How good is an approximation, and under which conditions does it work?
How can the approximation be improved, or its parameters tweaked for different
conditions? Uncertainty exists as to the quality of the approximation,
and in setting the parameters to tune the approximation.
Language:
- Spoken and written input is finite and therefore vague in some aspects,
especially when human frailties intervene: for instance a victim's description
of an attacker, or a physician's summary of the knowledge gleaned from
50 case histories. Uncertainty exists in interpreting the precise meaning
of language, and incorporating this with other information.
Information fusion:
- Intelligent systems increasingly acquire information from disparate sources:
for instance, speech recognition systems use information about the speaker,
about the context of the utterance, and about grammar. Geographic information
systems combine information about hydrology, soil-type, climate and satellite
data from several different instruments. Uncertainty exists in determining
how to assign relative importance to disparate information sources.
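
As one illustration of how the probability calculus can assign relative
importance to disparate sources, the sketch below fuses several noisy
estimates of the same quantity by weighting each with its precision
(the inverse of its variance). It is a minimal sketch under strong
assumptions that do not come from the text above: the sources are
independent, their errors are Gaussian with known variances, and no prior
information is used. The function name `fuse_gaussian` and the sample
readings are hypothetical.

```python
def fuse_gaussian(estimates):
    """Fuse independent Gaussian estimates of a single quantity.

    `estimates` is a list of (mean, variance) pairs, one per source.
    Assumes independent sources and known, positive variances.
    Returns the fused (mean, variance).
    """
    # Precision (1 / variance) acts as the weight of each source:
    # a less noisy source counts for more in the fused estimate.
    precisions = [1.0 / variance for _, variance in estimates]
    total_precision = sum(precisions)
    fused_mean = sum(
        p * mean for p, (mean, _) in zip(precisions, estimates)
    ) / total_precision
    # Precisions add, so the fused variance is never larger than
    # the variance of the best single source.
    fused_variance = 1.0 / total_precision
    return fused_mean, fused_variance


# Hypothetical readings: a noisy instrument and a more precise one.
readings = [(10.0, 4.0), (12.0, 1.0)]
fused_mean, fused_variance = fuse_gaussian(readings)
print(fused_mean, fused_variance)
```

The fused estimate lies closer to the more precise source, and its
variance is smaller than that of either source alone. Real fusion problems
rarely satisfy the independence and known-variance assumptions exactly,
which is part of the uncertainty described above.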
Uncertainty is fundamental in intelligent software systems.
While many models exist for addressing uncertainty, analysis using the
probability calculus is perhaps the most general. Most well-known frameworks
for analysis can be modeled within the probability calculus, often
leading to significant insight. Uncertainty models included in this category
are fuzzy logic, classical frequentist statistics, minimum complexity methods
such as description length, and maximum entropy methods. Probability calculus
now sees widespread use in neural networks, vision, graphics, natural language,
and text processing, as well as in its original stronghold of statistical
analysis. In many cases, these areas make only partial use of the full power
of the probability calculus because they employ a classical frequentist
interpretation, which implies a sample space. Many problems in intelligent
systems are unfortunately one-off, so no such sample space is available. The Artificial
Intelligence community originally saw logic as a powerful calculus that
could be the theoretical basis for intelligence. While logic has fundamental
contributions to make in representation and programming languages, uncertainty
invariably arises and other analytic tools are required, for instance the
probability calculus.
Returning now to the design of intelligent systems, the probability
calculus has computational variants in much the same way that logic has
its computational variants. The understanding of probability, its use
within a computation, and its efficient implementation within some broader
application are issues of general concern in the design of intelligent
systems.
Last change: Fri, Nov 8th, 10:38am, 1996
wray@ic.eecs.berkeley.edu