10.1 G-NAF - A Geocoded National Address File

In many countries, geographical data is collected by various state and territory agencies. In Australia, for example, each state and territory has its own government agency that collects data for land planning and for property, infrastructure or resource management. In addition, national organisations such as postal and telecommunications providers, electoral commissions and statistics bureaus collect their own data. Each of these data sets is collected for a specific purpose, so they vary in content and are stored in different formats.

Figure 10.2: Simplified G-NAF data model (10 main files only). Links 1-n denote one-to-many relationships, and links 1-1 denote one-to-one relationships.
\includegraphics[width=0.55\textwidth]{g-naf-files}

The need for a nation-wide, standardised and high-quality geocoded data set has been recognised in Australia since 1990 [27]. After years of planning, collaboration and development, G-NAF was first released in March 2004. Approximately 32 million address records from 13 organisations were put through a five-phase cleaning and integration process, resulting in a database of 22 normalised files (or tables). Figure 10.2 shows the simplified data model of the 10 main G-NAF files.


Table 10.1: Characteristics of the 10 main G-NAF files (NSW data only).
\begin{table}
\begin{center}
\begin{tableiii}{l\vert c\vert r}{textrm}{G-NAF dat...
...
\lineiii{~ }{~ }{ LOCALITY\_PID } \hline
\end{tableiii}\end{center}\end{table}


G-NAF is based on a hierarchical model, which stores information about address sites separately from locations and streets. A single address can have multiple geocoded locations, and vice versa, and aliases are available at various levels. Three geocode files contain location (longitude and latitude) information at different levels of detail. If an exact address match can be found, its location can be retrieved from the ADDRESS_SITE_GEOCODE file. If there is a match only at street level (but not at street number level), the STREET_LOCALITY_GEOCODE file provides an overall street geocode. Finally, if no street level match can be found, the LOCALITY_GEOCODE file contains geocode information for localities (e.g. towns and suburbs). Both the STREET_LOCALITY_GEOCODE and LOCALITY_GEOCODE files also contain information about the extent of the street and locality.
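The three-level fallback described above can be illustrated with a minimal Python sketch. It is not the actual G-NAF access code: the three dictionaries are hypothetical in-memory stand-ins for the ADDRESS_SITE_GEOCODE, STREET_LOCALITY_GEOCODE and LOCALITY_GEOCODE files, the function name lookup_geocode is invented for this example, and the keys and coordinates are made-up sample values. The real files are linked through persistent identifiers rather than plain strings.

```python
from typing import Dict, Optional, Tuple

Geocode = Tuple[float, float]  # (longitude, latitude)

# Hypothetical stand-ins for the three geocode files (invented sample data).
address_site_geocode: Dict[Tuple[str, str, str], Geocode] = {
    ("12", "miller street", "north sydney"): (151.2071, -33.8386),
}
street_locality_geocode: Dict[Tuple[str, str], Geocode] = {
    ("miller street", "north sydney"): (151.2065, -33.8350),
}
locality_geocode: Dict[str, Geocode] = {
    "north sydney": (151.2070, -33.8400),
}


def lookup_geocode(number: str, street: str,
                   locality: str) -> Optional[Tuple[str, Geocode]]:
    """Return (match_level, geocode) using the most precise level available."""
    # 1. Exact address match: street number, street and locality all agree.
    exact = address_site_geocode.get((number, street, locality))
    if exact is not None:
        return ("address", exact)

    # 2. Street level match: the street is known but the number is not.
    street_match = street_locality_geocode.get((street, locality))
    if street_match is not None:
        return ("street", street_match)

    # 3. Locality level match: only the town or suburb is known.
    locality_match = locality_geocode.get(locality)
    if locality_match is not None:
        return ("locality", locality_match)

    # No match at any level.
    return None


if __name__ == "__main__":
    print(lookup_geocode("12", "miller street", "north sydney"))
    print(lookup_geocode("99", "miller street", "north sydney"))
    print(lookup_geocode("1", "unknown road", "north sydney"))
```

Returning the match level together with the coordinates lets a caller judge how precise the geocode is, since a locality level result is far coarser than an exact address match.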

For our project, we used only the G-NAF records covering the Australian state of New South Wales (NSW), which contain around 4 million address, 60,000 street and 5,000 locality records. Table 10.1 gives an overview of the size and content of the 10 main G-NAF data files used.