In many data sets some fields are already in a cleaned and
standardised form, and can therefore be copied directly into the
corresponding output fields (so that they are available for a later
linkage or deduplication process). The simple PassFieldStandardiser
can be used for this. It copies the values from the input fields into
the corresponding output fields without any modification. The number
of input fields must be the same as the number of output fields.
Values from the first input field are copied into the first output
field, values from the second input field into the second output
field, and so on. The example code below shows how to apply the field
pass standardiser to two fields in a fictitious hospital data set.
The following arguments must be given when a field pass standardiser is initialised.
name
description
input_fields
output_fields
In the example given in the following code block, values from the
input field 'ohoscode' are passed (copied) into the output field
'hosp_code', and values from input field 'rseqnum' are copied into
output field 'seq_num'.
# ====================================================================

pass_fields = PassFieldStandardiser(name = 'Field passer',
                                    description = 'Passes OK fields',
                                    input_fields = ['ohoscode', 'rseqnum'],
                                    output_fields = ['hosp_code', 'seq_num'])
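The positional pairing of input and output fields described in this section can be illustrated with a small stand-alone sketch. It is not the PassFieldStandardiser implementation: the function copy_passed_fields, the dictionary-based record and the sample values are hypothetical and serve only to show the copying rule and the requirement that both field lists have the same length.

```python
def copy_passed_fields(record, input_fields, output_fields):
    """Copy values unchanged from input fields to output fields.

    Hypothetical helper for illustration only; it mimics the copying
    rule described in the text, not the library's internals.
    """
    # The number of input fields must equal the number of output fields.
    if len(input_fields) != len(output_fields):
        raise ValueError('input_fields and output_fields must have the same length')

    # Pair fields by position: first input -> first output, and so on.
    return {out_name: record[in_name]
            for in_name, out_name in zip(input_fields, output_fields)}


# A made-up input record; the values are invented for this example.
record = {'ohoscode': 'H123', 'rseqnum': '0007', 'surname': 'smith'}

passed = copy_passed_fields(record,
                            ['ohoscode', 'rseqnum'],
                            ['hosp_code', 'seq_num'])
print(passed)  # {'hosp_code': 'H123', 'seq_num': '0007'}
```

Only the listed input fields are copied, so 'surname' does not appear in the result, and the values themselves are left unmodified.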