7.1 Hidden Markov Model Implementation Module 'simplehmm.py'

The hidden Markov model (HMM) functionality used in the Febrl system is implemented in the simplehmm.py module. This module provides a class hmm with methods to initialise an HMM, to set its transition and observation probabilities, to train an HMM, to save it to and load it from a text file, and to apply the Viterbi algorithm to an observation sequence. Additionally, it provides methods to print an HMM (its states, its observation symbols, and all the probabilities) and to check its probabilities (whether they sum to one in each state).
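To illustrate what the Viterbi step computes, the following is a minimal stand-alone sketch and not the code from simplehmm.py. The function name viterbi_sketch, the dictionary layout, and all probability values are assumptions made for this illustration only; the sketch also assumes a non-empty observation sequence.

```python
def viterbi_sketch(states, init_prob, trans_prob, obser_prob, obs_seq):
    """Return (most likely state sequence, its probability).

    Hypothetical helper for illustration; missing dictionary entries
    are treated as probability 0.0.
    """
    # best[t][s]: probability of the best path that ends in state s at step t
    best = [{s: init_prob.get(s, 0.0) * obser_prob[s].get(obs_seq[0], 0.0)
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(obs_seq)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor state that maximises the path probability
            prev = max(states,
                       key=lambda p: best[t - 1][p] * trans_prob[p].get(s, 0.0))
            best[t][s] = (best[t - 1][prev] * trans_prob[prev].get(s, 0.0)
                          * obser_prob[s].get(obs_seq[t], 0.0))
            back[t][s] = prev

    # Follow the back pointers from the most probable final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs_seq) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Made-up probabilities, chosen only to keep the example easy to follow
states = ['title', 'givenname', 'surname']
init_prob = {'title': 0.5, 'givenname': 0.5}
trans_prob = {'title': {'givenname': 1.0},
              'givenname': {'surname': 1.0},
              'surname': {'surname': 1.0}}
obser_prob = {'title': {'TI': 1.0},
              'givenname': {'GM': 0.5, 'GF': 0.5},
              'surname': {'SN': 1.0}}

path, prob = viterbi_sketch(states, init_prob, trans_prob, obser_prob,
                            ['TI', 'GM', 'SN'])
print(path, prob)
```

The sketch multiplies raw probabilities, which is adequate for short tag sequences such as names; for long sequences an implementation would normally work with log probabilities to avoid underflow.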

For more details we refer the reader to the source code of the simplehmm.py module and its unit testing module simplehmmTest.py. The following example program code (mainly taken from the simplehmmTest.py module) shows how to initialise, train, use, save and load an HMM using the simplehmm.py module. It is assumed that the simplehmm.py module has been imported using the Python command import simplehmm.

# ====================================================================

# Define HMM state list and observation list

test_hmm_states = ['title', 'givenname', 'surname']
test_hmm_observ = ['TI', 'GM', 'GF', 'SN', 'UN']

# Some example training records (one per line) with state/tag pairs

train_data = [[('title','TI'),('givenname','GF'),('surname','SN')]]

# Some test examples (observation (tag) sequences), one per line

test_data = [['TI','GM','SN']]

# Initialise a new HMM and train it

test_hmm = simplehmm.hmm('Test HMM', test_hmm_states, test_hmm_observ)
test_hmm.train(train_data)  # Train the HMM

test_hmm.check_prob()  # Check its probabilities
test_hmm.print_hmm()   # Print it out

# Apply the Viterbi algorithm to each sequence of the test data

for test_rec in test_data:
  [state_sequence, sequence_probability] = test_hmm.viterbi(test_rec)

# Initialise and train a second HMM using the same training data and
# applying Laplace smoothing

test_hmm2 = simplehmm.hmm('Test HMM 2', test_hmm_states, test_hmm_observ)
test_hmm2.train(train_data, smoothing='laplace')

# Save the second HMM into a text file


# Initialise a third HMM, then load the previously saved HMM into it

test_hmm3 = simplehmm.hmm('Test HMM 3',  ['dummy'], ['dummy'])
test_hmm3.print_hmm()  # Print it out
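
The effect of training with and without Laplace smoothing can be sketched independently of the module. The code below is an illustration under stated assumptions, not the implementation in simplehmm.py: the function name train_counts_sketch is hypothetical, and it uses plain add-one smoothing, which may differ in detail from what the module does. It estimates transition and observation probabilities by counting the state/tag pairs in the training records and then normalising each state's row.

```python
def train_counts_sketch(states, observations, train_data, laplace=False):
    """Estimate transition and observation probabilities by counting.

    Hypothetical helper for illustration only. With laplace=True every
    count starts at 1 instead of 0 (add-one smoothing), so transitions
    and observations never seen in training keep a small probability.
    """
    start = 1.0 if laplace else 0.0
    trans = {s: {t: start for t in states} for s in states}
    obser = {s: {o: start for o in observations} for s in states}

    for record in train_data:
        prev_state = None
        for state, tag in record:
            obser[state][tag] += 1
            if prev_state is not None:
                trans[prev_state][state] += 1
            prev_state = state

    # Normalise each row so that it sums to one; rows with no counts
    # at all (possible only without smoothing) are left at zero
    for table in (trans, obser):
        for row in table.values():
            total = sum(row.values())
            if total > 0:
                for key in row:
                    row[key] /= total
    return trans, obser


hmm_states = ['title', 'givenname', 'surname']
hmm_observ = ['TI', 'GM', 'GF', 'SN', 'UN']
records = [[('title', 'TI'), ('givenname', 'GF'), ('surname', 'SN')]]

plain_trans, plain_obser = train_counts_sketch(hmm_states, hmm_observ, records)
smooth_trans, smooth_obser = train_counts_sketch(hmm_states, hmm_observ,
                                                 records, laplace=True)

print(plain_trans['title'])   # unseen transitions have probability zero
print(smooth_trans['title'])  # unseen transitions get a small probability
```

Without smoothing, any observation sequence containing a transition or tag that never occurred in the training data receives probability zero, which is why smoothing matters when the training set is small.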