14.5 Neighbouring Region Look-up Table

Neighbouring region look-up tables are used to find the neighbours of regions within a geocoding system, for example of postcodes or suburbs. For a given region, such a table contains all its neighbours. These look-up tables can be created using geographical data extracted from a GIS system, and are used in the Febrl geocode matching engine (see Chapter 10 for more details).

Look-up tables of both direct and indirect neighbours (i.e. neighbours of direct neighbours) are used in the geocode matching engine to find matches in addresses where no exact postcode or suburb match can be found. Experience shows that people often record different postcode or suburb values if a neighbouring postcode or suburb has a higher perceived social status (e.g. 'Double Bay' and 'Edgecliff'), or if they live close to the border of such regions.
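The relationship between direct and indirect neighbours can be illustrated with a minimal sketch. It is not the code used by Febrl or by get-neighbour-regions.py; the function name add_indirect_neighbours and the region labels 'a' to 'd' are hypothetical and only serve the illustration.

```python
def add_indirect_neighbours(direct):
    """Return a dictionary that maps each region to the sorted list of its
    direct neighbours plus the neighbours of those direct neighbours."""
    combined = {}
    for region, neighbours in direct.items():
        reachable = set(neighbours)
        for neighbour in neighbours:
            # Add the neighbours of each direct neighbour (indirect ones)
            reachable.update(direct.get(neighbour, []))
        reachable.discard(region)  # A region is not its own neighbour
        combined[region] = sorted(reachable)
    return combined

# Hypothetical regions arranged in a row: a - b - c - d
direct = {
    'a': ['b'],
    'b': ['a', 'c'],
    'c': ['b', 'd'],
    'd': ['c'],
}

combined = add_indirect_neighbours(direct)
print(combined['a'])  # Direct neighbour 'b' and indirect neighbour 'c'
```

In this sketch region 'a' only borders 'b', but 'c' is also returned because it borders 'b'.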

Neighbouring region look-up tables can be created using the program get-neighbour-regions.py which is described in Section 10.5.1.

The format of these look-up tables is as follows. Each line starts with the value of a region (for example a postcode or suburb name), followed by a colon ':' and a list of the neighbours of this region. The list is written in Python format: its entries are comma separated, the list is surrounded by square brackets, and every entry must be either single or double quoted. The neighbours can be both direct and indirect, no distinction is made between them. The following example is taken from a neighbouring region look-up table with suburb names.

# ====================================================================

tura beach: ['berrambool','bournda','merimbula']
turill: ['bungaba','cassilis','mogo','uarbry','ulan']
turlinjah: ['bodalla','coila','tuross head','wamban']
turner: ['acton','braddon','city','dickson','lyneham',"o'connor"]
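Because the neighbour list is written in Python format, a line of such a file can be read with the standard library alone. The following is a minimal sketch of one way to do this, not the loading code of lookup.py; the function name parse_neighbour_line is hypothetical, and skipping blank lines and lines starting with '#' is an assumption.

```python
import ast

def parse_neighbour_line(line):
    """Split one 'region: [neighbours]' line into (region, neighbour list)."""
    # Split at the first colon only (assumes region names contain no colon)
    region, _, list_text = line.partition(':')
    # literal_eval safely evaluates the quoted, bracketed Python list
    neighbours = ast.literal_eval(list_text.strip())
    return region.strip(), neighbours

sample = '''
turlinjah: ['bodalla','coila','tuross head','wamban']
turner: ['acton','braddon','city','dickson','lyneham',"o'connor"]
'''

table = {}
for line in sample.splitlines():
    # Assumption: blank lines and lines starting with '#' carry no data
    if line.strip() and not line.startswith('#'):
        region, neighbours = parse_neighbour_line(line)
        table[region] = neighbours

print(table['turner'])
```

The double quoted entry "o'connor" shows why both quoting styles are allowed: a name containing an apostrophe cannot be written inside single quotes without escaping.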

The attribute default has the value [] (an empty list) unless set otherwise. If a value that does not exist in a table is looked up, an empty list is therefore returned and the calling program can act upon this. A different default value can be set with the default argument when a neighbouring region look-up table is initialised.
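The effect of a default value can be mimicked with a small dictionary subclass. This is only a sketch of the behaviour described above, assuming nothing about how NeighbourLookupTable is implemented; the class name DefaultingTable is hypothetical.

```python
class DefaultingTable(dict):
    """Hypothetical stand-in (not Febrl's NeighbourLookupTable) that returns
    a default value when a missing key is looked up."""

    def __init__(self, default=None):
        dict.__init__(self)
        # Use an empty list unless another default value is given
        self.default = [] if default is None else default

    def __missing__(self, key):
        # Called by dict when the key is absent; the key is not stored
        return self.default

table = DefaultingTable()
table['turner'] = ['acton', 'braddon']

neighbours = table['no such suburb']
if not neighbours:
    # The calling program decides what an empty result means
    print('No neighbours known for this region')
```

Returning a default instead of raising an exception lets the calling code treat unknown regions and regions without neighbours in the same way.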

Assuming the lookup.py module has been imported using the import lookup command, an example neighbouring region look-up table can be initialised and loaded from a file as shown in the following example.

# ====================================================================

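import os
import lookup

dirsep = os.sep  # Directory separator (assumed here, may be defined elsewhere)

# Create an empty table, then load the suburb neighbours from a file
sub_neighbour_table = lookup.NeighbourLookupTable(name='Suburbs')
sub_neighbour_table.load(file_names = 'geocode'+dirsep+ \
                                      'sub-neighbours.txt')

print sub_neighbour_table.length  # Number of entries in the table

print sub_neighbour_table['aarons pass']  # Neighbours of this suburb
print sub_neighbour_table['akolele']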