Towards a Parallel Data Mining Toolbox
Peter Christen, Markus Hegland, Ole M. Nielsen, Stephen Roberts,
Peter E. Strazdins and Irfan Altas, Efficient Data Mining:
Scripting and Scalable Parallel Algorithms, PDDM-01:
4th International Workshop on Parallel and Distributed Data Mining,
in conjunction with IPDPS'2001, San Francisco, April 2001.
Abstract:
This paper presents our approach to data mining that allows the coupling
of parallel applications with a scripting language, resulting in an
efficient and flexible toolbox. Parallel algorithms that scale with both
data size and number of processors are key to solving the ever-growing
problems in data mining. On the other hand,
data mining applications should be flexible to allow interactive data
exploration. By using a toolbox written in a scripting language, we are
able to steer parallel applications in a flexible way, thus fulfilling
the needs of a data miner for fast interactive data analysis. The
chosen approach is discussed and first results are presented.
Keywords
data mining, thin plate splines, additive models, scripting,
wavelets, parallel linear systems, symmetric indefinite systems