The hidden Markov model (HMM) functionality used in the Febrl system is implemented in the simplehmm.py module. This module provides a class hmm with methods to initialise a HMM, to set its transition and observation probabilities, to train a HMM, to save it to and load it from a text file, and to apply the Viterbi algorithm to an observation sequence. Additionally, methods are provided to print a HMM (its states, observation symbols and all the probabilities) and to check its probabilities (whether they sum up correctly in each state).
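To illustrate what such a probability check amounts to, the following is a minimal stand-alone sketch. It is not the simplehmm.py implementation: the function name check_rows_sum_to_one, the dictionary layout and the example numbers are assumptions made for illustration only.

```python
# Hypothetical stand-alone sketch, not taken from simplehmm.py.

def check_rows_sum_to_one(prob_table, tolerance=1e-6):
    """Return the names of all rows whose probabilities do not sum to 1.

    prob_table maps a state name to a dictionary of probabilities, for
    example the transition probabilities out of that state.
    """
    bad_rows = []
    for row_name, row in prob_table.items():
        if abs(sum(row.values()) - 1.0) > tolerance:
            bad_rows.append(row_name)
    return bad_rows


# Made-up transition probabilities; the 'surname' row is deliberately wrong
transitions = {
    'title':     {'title': 0.0, 'givenname': 0.8, 'surname': 0.2},
    'givenname': {'title': 0.0, 'givenname': 0.1, 'surname': 0.9},
    'surname':   {'title': 0.0, 'givenname': 0.5, 'surname': 0.4},
}

bad = check_rows_sum_to_one(transitions)
print(bad)
```

A tolerance is used rather than an exact comparison because probabilities obtained by dividing counts are floating-point values and rarely sum to exactly 1.0.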
For more details we refer the reader to the source code of the simplehmm.py module and its unit testing module simplehmmTest.py. The following example program code (mainly taken from the simplehmmTest.py module) shows how to initialise, train, use, save and load a HMM using the simplehmm.py module. It is assumed that the simplehmm.py module has been imported using the Python command import simplehmm.
    # ====================================================================

    # Define HMM state list and observation list
    test_hmm_states = ['title', 'givenname', 'surname']
    test_hmm_observ = ['TI', 'GM', 'GF', 'SN', 'UN']

    # Some example training records (one per line) with state/tag pairs
    train_data = [[('title','TI'),('givenname','GF'),('surname','SN')],
                  [('givenname','GM'),('surname','UN')],
                  [('title','UN'),('givenname','GM'),('surname','UN')],
                  [('title','TI'),('givenname','SN'),('surname','SN')],
                  [('givenname','GM'),('surname','SN')],
                  [('title','TI'),('givenname','GF'),('surname','SN')],
                  [('title','TI'),('surname','SN'),('givenname','GM')],
                  [('surname','UN'),('givenname','UN')],
                  [('givenname','GF'),('surname','GF'),('surname','SN')]]

    # Some test examples (observation (tag) sequences), one per line
    test_data = [['TI','GM','SN'],
                 ['UN','SN'],
                 ['TI','UN','UN'],
                 ['TI','GF','UN'],
                 ['UN','UN','UN','UN'],
                 ['TI','GM','UN','SN'],
                 ['GF','UN']]

    # Initialise a new HMM and train it
    test_hmm = simplehmm.hmm('Test HMM', test_hmm_states, test_hmm_observ)
    test_hmm.train(train_data)  # Train the HMM
    test_hmm.check_prob()       # Check its probabilities
    test_hmm.print_hmm()        # Print it out

    # Apply the Viterbi algorithm to each sequence of the test data
    for test_rec in test_data:
        [state_sequence, sequence_probability] = test_hmm.viterbi(test_rec)

    # Initialise and train a second HMM using the same training data and
    # applying Laplace smoothing
    test_hmm2 = simplehmm.hmm('Test HMM 2', test_hmm_states, test_hmm_observ)
    test_hmm2.train(train_data, smoothing='laplace')

    # Save the second HMM into a text file
    test_hmm2.save_hmm('testhmm2.hmm')

    # Initialise a third HMM, then load the previously saved HMM into it
    test_hmm3 = simplehmm.hmm('Test HMM 3', ['dummy'], ['dummy'])
    test_hmm3.load_hmm('testhmm2.hmm')
    test_hmm3.print_hmm()  # Print it out
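For readers who want to see what training and Viterbi decoding do conceptually, the following stand-alone sketch estimates the probabilities by counting the state/tag pairs of the training records above and then decodes one test sequence. It does not use simplehmm.py and is not its implementation: the function names train_counts and viterbi_decode are hypothetical, and treating Laplace smoothing as "add one to every count, including the initial state counts" is an assumption that may differ from what the module does.

```python
# Hypothetical stand-alone sketch; it does not import or reproduce simplehmm.py.

def train_counts(states, observations, train_data, smoothing=None):
    """Estimate initial, transition and observation probabilities by counting.

    Assumption: smoothing='laplace' adds one to every count before normalising.
    """
    add = 1.0 if smoothing == 'laplace' else 0.0
    init = {s: add for s in states}
    trans = {s: {t: add for t in states} for s in states}
    obser = {s: {o: add for o in observations} for s in states}

    for record in train_data:
        prev_state = None
        for state, obs in record:
            if prev_state is None:
                init[state] += 1          # first state of the record
            else:
                trans[prev_state][state] += 1
            obser[state][obs] += 1
            prev_state = state

    def normalise(counts):
        total = sum(counts.values())
        return {k: (v / total if total > 0 else 0.0) for k, v in counts.items()}

    return (normalise(init),
            {s: normalise(trans[s]) for s in states},
            {s: normalise(obser[s]) for s in states})


def viterbi_decode(states, init, trans, obser, obs_seq):
    """Return (most likely state sequence, its probability) for obs_seq."""
    if not obs_seq:
        return [], 0.0
    # best[i][s]: probability of the best path that ends in state s at step i
    best = [{s: init[s] * obser[s][obs_seq[0]] for s in states}]
    back = [{}]
    for obs in obs_seq[1:]:
        prev = best[-1]
        column, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: prev[r] * trans[r][s])
            pointers[s] = best_prev
            column[s] = prev[best_prev] * trans[best_prev][s] * obser[s][obs]
        best.append(column)
        back.append(pointers)
    # Follow the back pointers from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


states = ['title', 'givenname', 'surname']
observ = ['TI', 'GM', 'GF', 'SN', 'UN']
train_data = [[('title','TI'),('givenname','GF'),('surname','SN')],
              [('givenname','GM'),('surname','UN')],
              [('title','UN'),('givenname','GM'),('surname','UN')],
              [('title','TI'),('givenname','SN'),('surname','SN')],
              [('givenname','GM'),('surname','SN')],
              [('title','TI'),('givenname','GF'),('surname','SN')],
              [('title','TI'),('surname','SN'),('givenname','GM')],
              [('surname','UN'),('givenname','UN')],
              [('givenname','GF'),('surname','GF'),('surname','SN')]]

init, trans, obser = train_counts(states, observ, train_data, smoothing='laplace')
path, prob = viterbi_decode(states, init, trans, obser, ['TI', 'GM', 'SN'])
print(path, prob)  # path is ['title', 'givenname', 'surname'] for this data
```

Without smoothing, any transition or observation that never occurs in the training data gets probability zero, so a test sequence containing it can end up with an overall probability of zero. Adding one to every count avoids this, at the cost of slightly flattening the estimates.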