Thesis
You may download a complete postscript version of my thesis.
Abstract
This dissertation investigates the field of inductive logic programming(ILP) and in so doing an ILP system, lime, is designed and developed.
lime addresses the problem of noisy training examples; learning from
only positive, only negative, or both positive and negative examples;
efficiently biasing and searching the hypothesis space; and handling
recursion efficiently and effectively.
The Q-heuristic is introduced to address the problem of learning with
both noisy training examples and fixed numbers of positive and negative
training examples. This heuristics is based on Bayes rule. Both a
justification of its derivation and a description of the context in
which
it is appropriately applied are given. Because of the general
nature of
this heuristic its application is not restricted to ILP.
Instead of employing a greedy covering approach to constructing
clauses, lime employs the Q-heuristic to evaluate entire logic
programs as hypotheses. To tame the inevitable explosion in the
search space, the notion of a simple clause is introduced.
These sets of literals may be viewed as subparts of clauses that are
effectively independent in terms of variables used. Instead of
growing
a clause one literal at a time, lime efficiently combines simple clauses
to construct a set of gainful candidate clauses. Subsets
of these
candidate clauses are evaluated using the Q-heuristic to find the
final hypothesis. Details of the algorithms and data structures
of lime are discussed. lime's handling of recursive logic programs
is also described.
Experimental results are provided to illustrate how \lime achieves its
design goals of better noise handling, learning from a fixed set of
examples (e.g., from only positive data), and of learning recursive
logic programs. These results compare the performance of
\lime with
other leading ILP systems like Foil and Progol in a variety
of domains. Empirical results
with a boosted version of lime are also reported.
The dissertation also makes an attempt to characterise the learning
ability
of the Q-heuristic. To this end, some preliminary results are
presented
to show that the Q-heuristic stochastically learns in the limit some
restricted classes of concepts. This includes cases where the
training
set may contain: noise, only positive examples, or only negative examples.