10.3.1 Memory Usage and Performance of 'process-gnaf.py'

The process-gnaf.py program works by building in-memory hash table data structures for all the fields (or attributes) in the G-NAF data files. It therefore needs a large amount of main memory and a considerable amount of processing time. For example, processing the New South Wales part of G-NAF (containing around 4 million address site records) on a SUN Enterprise 450 shared memory (SMP) server with four 480 MHz UltraSPARC II processors and 4 Gigabytes of main memory used around 3,300 Megabytes (3.3 Gigabytes) of main memory and took around 34.5 hours (with all processing flags in process-gnaf.py set to True). Ways to reduce the amount of memory needed are to

Note that these results are specific to the computing platform described above. Actual running times may depend heavily on processor speed, memory access times, and disk input and output times. Note also that the process-gnaf.py program currently runs only sequentially; we are planning to develop a parallel version in the future.
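
To illustrate why building a hash table for every field is memory hungry, the following is a minimal sketch of the general idea, not the actual code in process-gnaf.py. The sample data, the field names (record_id, street_name, locality_name, postcode) and the function build_field_indexes are all hypothetical and do not reflect the real G-NAF file layout. The sketch builds one Python dictionary per field, mapping each field value to the set of record identifiers that contain it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: field names and values are made up for
# illustration and do not reflect the actual G-NAF file layout.
SAMPLE_CSV = """record_id,street_name,locality_name,postcode
1,miller,dickson,2602
2,miller,lyneham,2602
3,smith,dickson,2602
"""


def build_field_indexes(csv_file, id_field):
    """Build one in-memory hash table (dict) per field.

    Each table maps a field value to the set of record identifiers
    in which that value occurs.
    """
    indexes = defaultdict(lambda: defaultdict(set))
    for row in csv.DictReader(csv_file):
        rec_id = row[id_field]
        for field, value in row.items():
            # Skip the identifier column itself and any empty values
            if field != id_field and value:
                indexes[field][value].add(rec_id)
    return indexes


indexes = build_field_indexes(io.StringIO(SAMPLE_CSV), "record_id")

print(sorted(indexes["street_name"]["miller"]))     # ['1', '2']
print(sorted(indexes["locality_name"]["dickson"]))  # ['1', '3']
```

Because every distinct value and every record identifier is held in memory at the same time, the size of such tables grows with both the number of records and the number of fields indexed.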