project object one method (routine) is available to
define and start a standardisation process. Assuming that a
project object has been created (by copying and modifying the
template module project-standardise.py) and that input and output
data sets, as well as component and record standardisers, have been
defined, a data set can be standardised with a single call to the
method standardise, as shown in the following example.
# ====================================================================

myproject.standardise(input_dataset = hospital_data,
                      output_dataset = clean_hospital_data,
                      rec_standardiser = hospital_standardiser,
                      first_record = 0,
                      number_records = 100000)
In this example, the first 100,000 records (numbers 0 to 99,999) of a fictitious hospital data set are standardised and written to an output data set, which is assumed to have been initialised.
The following arguments need to be defined for the standardisation process.
append access mode. The output data set can be any data set implementation except a memory-based one, since all standardised records would be lost once the program finishes. See Chapter 13 for more information on data set implementations.
None (default), the first record (i.e. the record with number 0) is taken.
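
As a rough illustration of how first_record and number_records might together select a block of records, the following sketch computes the range of record numbers to process. The helper resolve_record_range and its total_records parameter are hypothetical and not part of the project module. Treating number_records = None as "all remaining records" and clipping the range at the end of the data set are assumptions made here for illustration only.

```python
def resolve_record_range(first_record=None, number_records=None,
                         total_records=0):
    """Return (start, stop) as a half-open range of record numbers.

    Hypothetical helper for illustration; not part of the project module.
    """
    # As described in the text: None means start at record number 0.
    start = 0 if first_record is None else first_record
    if start < 0 or start > total_records:
        raise ValueError("first_record is outside the data set")

    if number_records is None:
        # Assumption: None means all records from start to the end.
        stop = total_records
    else:
        if number_records < 0:
            raise ValueError("number_records must not be negative")
        # Assumption: never read past the end of the data set.
        stop = min(start + number_records, total_records)

    return start, stop


# Mirrors the example call: first_record = 0, number_records = 100000,
# applied to an assumed data set of 250,000 records.
start, stop = resolve_record_range(0, 100000, total_records=250000)
print(start, stop)
```

Under these assumptions the example call would cover record numbers 0 up to, but not including, 100000.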