class SpeechHMM

This class implements a special case of Hidden Markov Models that can be used for connected-word speech recognition with a small vocabulary, using embedded training.

Inheritance: SpeechHMM inherits from HMM, Distribution, GradientMachine, Machine and Object.


Public Fields

int n_models
the number of basic phoneme models
HMM** models
the basic phoneme models
char** model_names
for each model, a unique name (used for translation)
EMTrainer* model_trainer
if an initial alignment and an EMTrainer for each model are given, the trainer is used to train the models after k-means during reset
Dictionary* dict
the dictionary of accepted words, using indices instead of names
Grammar* grammar
the grammar of accepted sentences
real word_entrance_penalty
word entrance penalty: during Viterbi decoding, penalizes long sentences
bool** word_transitions
true if the given transition is a transition between words
int max_n_states
the maximum number of states in the graph (used for allocation)
int* states_to_model_states
the relation between model states and SpeechHMM states
int* states_to_model
the relation between models and SpeechHMM states
int* states_to_word
the relation between words and SpeechHMM states
int* word_sequence
the word sequence corresponding to the state sequence
int* target_word_sequence
the target word sequence
int target_word_sequence_size
the length of the target word sequence
int target_word_sequence_max_size
the length of the longest target word sequence
EditDistance* edit_distance
this object is used to compute the decoding error

Public Methods

SpeechHMM(int n_models_, HMM** models_, char** model_names_, Dictionary* dict_, Grammar* grammar_, real word_entrance_penalty_ = LOG_ONE, EMTrainer* model_trainer_ = NULL)
To create a SpeechHMM, provide an array of n_models_ HMMs with their corresponding names, a dictionary and a grammar. Optionally, provide a word_entrance_penalty and a trainer that can be used to initialize each model independently.
virtual void decode(List* input)
this method computes the sentence associated with the input
virtual void prepareTrainModel(List* input)
this method prepares the transition graph associated with a given training sentence
virtual void prepareTestModel(List* input)
this method prepares the transition graph associated with a given test sentence
virtual int addWordToModel(int word, int current_state)
this method is used by prepareTrainModel and prepareTestModel to prepare the model.
virtual void addConnectionsBetweenWordsToModel(int word, int next_word, int current_state, int next_current_state, real log_n_next)
this method is used by prepareTrainModel and prepareTestModel to prepare the model.
virtual void realloc(int n_frames, int n_states_)
this method reallocates the structures to accommodate a new sequence
virtual int nStatesInGrammar()
this method returns the number of states in the grammar
virtual int nStatesInWord(int word)
this method returns the number of states in a given word


Inherited from HMM:

Public Fields

SeqDataSet* data
int n_states
real prior_transitions
Distribution** states
Distribution** unique_states
int n_unique_states
real** transitions
real** log_transitions
real** dlog_transitions
real** transitions_acc
real** log_alpha
real** log_beta
int** arg_viterbi
int* viterbi_sequence
real** log_probabilities_s

Public Methods

virtual void printTransitions(bool real_values=false, bool transitions_only=false)
virtual void logAlpha(SeqExample* ex)
virtual void logBeta(SeqExample* ex)
virtual void logViterbi(SeqExample* ex)
virtual void logProbabilities(List* inputs)


Inherited from Distribution:

Public Fields

int n_observations
int tot_n_frames
int max_n_frames
real log_probability
real* log_probabilities

Public Methods

virtual real logProbability(List* inputs)
virtual real viterbiLogProbability(List* inputs)
virtual real frameLogProbability(real* observations, real* inputs, int t)
virtual void frameExpectation(real* observations, real* inputs, int t)
virtual void eMIterInitialize()
virtual void iterInitialize()
virtual void eMSequenceInitialize(List* inputs)
virtual void sequenceInitialize(List* inputs)
virtual void eMAccPosteriors(List* inputs, real log_posterior)
virtual void frameEMAccPosteriors(real* observations, real log_posterior, real* inputs, int t)
virtual void viterbiAccPosteriors(List* inputs, real log_posterior)
virtual void frameViterbiAccPosteriors(real* observations, real log_posterior, real* inputs, int t)
virtual void eMUpdate()
virtual void eMForward(List* inputs)
virtual void viterbiForward(List* inputs)
virtual void frameBackward(real* observations, real* alpha, real* inputs, int t)
virtual void viterbiBackward(List* inputs, real* alpha)


Inherited from GradientMachine:

Public Fields

bool is_free
List* params
List* der_params
int n_params
real* beta

Public Methods

virtual void init()
virtual int numberOfParams()
virtual void backward(List* inputs, real* alpha)
virtual void allocateMemory()
virtual void freeMemory()
virtual void loadFILE(FILE* file)
virtual void saveFILE(FILE* file)


Inherited from Machine:

Public Fields

int n_inputs
int n_outputs
List* outputs

Public Methods

virtual void forward(List* inputs)
virtual void reset()


Inherited from Object:

Public Methods

void addOption(const char* name, int size, void* ptr, const char* help="", bool is_allowed_after_init=false)
void addIOption(const char* name, int* ptr, int init_value, const char* help="", bool is_allowed_after_init=false)
void addROption(const char* name, real* ptr, real init_value, const char* help="", bool is_allowed_after_init=false)
void addBOption(const char* name, bool* ptr, bool init_value, const char* help="", bool is_allowed_after_init=false)
void setOption(const char* name, void* ptr)
void setIOption(const char* name, int option)
void setROption(const char* name, real option)
void setBOption(const char* name, bool option)
void load(const char* filename)
void save(const char* filename)


Documentation

This class implements a special case of Hidden Markov Models that can be used for connected-word speech recognition with a small vocabulary, using embedded training.

It contains a set of phoneme models (represented by HMMs), a dictionary of words (which are sequences of phonemes) and a grammar (which states the legal sentences of the language).

Decoding is done by building the whole transition matrix, so the class is not suited to large-vocabulary problems.
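
To give a feel for why a full transition matrix limits the vocabulary size, here is a rough sketch. It assumes every word gets its own copy of the states of its phoneme models and ignores any extra grammar states; totalStates and denseTransitionEntries are hypothetical helpers, not members of SpeechHMM.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library): number of states obtained when
// every word is expanded into a chain of copies of its phoneme model states.
std::size_t totalStates(const std::vector<std::vector<int>>& word_phonemes,
                        const std::vector<int>& states_per_phoneme) {
    std::size_t n = 0;
    for (const auto& word : word_phonemes) {
        for (int phoneme : word) {
            assert(phoneme >= 0 &&
                   static_cast<std::size_t>(phoneme) < states_per_phoneme.size());
            n += static_cast<std::size_t>(states_per_phoneme[phoneme]);
        }
    }
    return n;
}

// A dense transition matrix holds one entry per (from, to) pair of states,
// so its size grows with the square of the number of states.
std::size_t denseTransitionEntries(std::size_t n_states) {
    return n_states * n_states;
}
```

Under these assumptions, doubling the number of words roughly doubles the number of states and quadruples the number of matrix entries.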

int n_models
the number of basic phoneme models

HMM** models
the basic phoneme models

char** model_names
for each model, a unique name (used for translation)

EMTrainer* model_trainer
if an initial alignment and an EMTrainer for each model are given, the trainer is used to train the models after k-means during reset

Dictionary* dict
the dictionary of accepted words, using indices instead of names

Grammar* grammar
the grammar of accepted sentences

real word_entrance_penalty
word entrance penalty: during Viterbi decoding, penalizes long sentences

bool** word_transitions
true if the given transition is a transition between words
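
The interplay between word_entrance_penalty and word_transitions can be illustrated with a small sketch. It assumes the penalty is a log-domain value added to the path score each time a transition between words is taken (the default LOG_ONE suggests that zero means no penalty); the actual sign convention and indexing order in the library may differ. pathLogScore is a hypothetical helper, and emission scores are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: log score of a fixed state path, ignoring emissions.
// In this sketch log_transitions[i][j] is the log-probability of moving from
// state i to state j, and word_transitions[i][j] is true when that move
// crosses a word boundary.
double pathLogScore(const std::vector<int>& path,
                    const std::vector<std::vector<double>>& log_transitions,
                    const std::vector<std::vector<bool>>& word_transitions,
                    double word_entrance_penalty) {
    double score = 0.0;
    for (std::size_t t = 1; t < path.size(); ++t) {
        const int from = path[t - 1];
        const int to = path[t];
        assert(from >= 0 && to >= 0);
        score += log_transitions[from][to];
        if (word_transitions[from][to]) {
            // Assumed convention: a negative penalty lowers the score of
            // paths that enter many words, i.e. long sentences.
            score += word_entrance_penalty;
        }
    }
    return score;
}
```

With a penalty of zero the score is the plain sum of log transitions; a negative value makes each additional word more costly.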

int max_n_states
the maximum number of states in the graph (used for allocation)

int* states_to_model_states
the relation between model states and SpeechHMM states

int* states_to_model
the relation between models and SpeechHMM states

int* states_to_word
the relation between words and SpeechHMM states
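
One plausible way to picture the three mapping arrays is sketched below. It assumes the graph is built by laying out, for each word in turn, the states of each of its phoneme models; StateMaps and buildStateMaps are hypothetical names, and the real layout (for instance the handling of non-emitting or grammar states) may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container mirroring the three documented arrays.
struct StateMaps {
    std::vector<int> states_to_model_states;  // state index inside its phoneme model
    std::vector<int> states_to_model;         // phoneme model index
    std::vector<int> states_to_word;          // word index
};

// Lay out the states word by word, model by model, and record for each
// global state where it comes from.
StateMaps buildStateMaps(const std::vector<std::vector<int>>& word_phonemes,
                         const std::vector<int>& states_per_model) {
    StateMaps maps;
    for (std::size_t w = 0; w < word_phonemes.size(); ++w) {
        for (int model : word_phonemes[w]) {
            assert(model >= 0 &&
                   static_cast<std::size_t>(model) < states_per_model.size());
            for (int s = 0; s < states_per_model[model]; ++s) {
                maps.states_to_model_states.push_back(s);
                maps.states_to_model.push_back(model);
                maps.states_to_word.push_back(static_cast<int>(w));
            }
        }
    }
    return maps;
}
```

Each global state index can then be translated back to a word, a phoneme model and a state inside that model with three array lookups.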

int* word_sequence
the word sequence corresponding to the state sequence
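
As an illustration, a word sequence could be recovered from a decoded state sequence as sketched below. This assumes a word is emitted at the first state and every time a transition flagged as a word transition is taken; wordsFromStates is a hypothetical helper and is not the library's decoding code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: turn a state sequence into a word sequence.
// word_transitions[i][j] is true when moving from state i to state j
// enters a new word (possibly the same word again).
std::vector<int> wordsFromStates(
        const std::vector<int>& state_sequence,
        const std::vector<int>& states_to_word,
        const std::vector<std::vector<bool>>& word_transitions) {
    std::vector<int> words;
    for (std::size_t t = 0; t < state_sequence.size(); ++t) {
        const int s = state_sequence[t];
        assert(s >= 0 && static_cast<std::size_t>(s) < states_to_word.size());
        const bool enters_word =
            (t == 0) || word_transitions[state_sequence[t - 1]][s];
        if (enters_word) {
            words.push_back(states_to_word[s]);
        }
    }
    return words;
}
```

Relying on the transition flags rather than on a change of word index lets the sketch distinguish one long word from the same word spoken twice in a row.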

int* target_word_sequence
the target word sequence

int target_word_sequence_size
the length of the target word sequence

int target_word_sequence_max_size
the length of the longest target word sequence

EditDistance* edit_distance
this object is used to compute the decoding error
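
The decoding error between word_sequence and target_word_sequence is typically an edit distance. The sketch below is a standard Levenshtein distance with unit costs for insertions, deletions and substitutions; the EditDistance class may use different costs or report the three counts separately, so editDistance here is only an illustrative stand-in.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in: Levenshtein distance between a decoded word
// sequence and the target word sequence, with unit costs.
int editDistance(const std::vector<int>& decoded, const std::vector<int>& target) {
    std::vector<int> prev(target.size() + 1);
    std::vector<int> cur(target.size() + 1);
    // Distance from an empty decoded prefix: insert every target word.
    for (std::size_t j = 0; j <= target.size(); ++j) {
        prev[j] = static_cast<int>(j);
    }
    for (std::size_t i = 1; i <= decoded.size(); ++i) {
        cur[0] = static_cast<int>(i);  // delete every decoded word so far
        for (std::size_t j = 1; j <= target.size(); ++j) {
            const int substitution =
                prev[j - 1] + (decoded[i - 1] != target[j - 1] ? 1 : 0);
            const int deletion = prev[j] + 1;
            const int insertion = cur[j - 1] + 1;
            cur[j] = std::min({substitution, deletion, insertion});
        }
        std::swap(prev, cur);
    }
    return prev[target.size()];
}
```

Dividing the result by the target length gives a word error rate in the usual sense.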

SpeechHMM(int n_models_, HMM** models_, char** model_names_, Dictionary* dict_, Grammar* grammar_, real word_entrance_penalty_ = LOG_ONE, EMTrainer* model_trainer_ = NULL)
To create a SpeechHMM, provide an array of n_models_ HMMs with their corresponding names, a dictionary and a grammar. Optionally, provide a word_entrance_penalty and a trainer that can be used to initialize each model independently.

virtual void decode(List* input)
this method computes the sentence associated with the input

virtual void prepareTrainModel(List* input)
this method prepares the transition graph associated with a given training sentence

virtual void prepareTestModel(List* input)
this method prepares the transition graph associated with a given test sentence

virtual int addWordToModel(int word, int current_state)
this method is used by prepareTrainModel and prepareTestModel to prepare the model. It adds a given word to the current graph.

virtual void addConnectionsBetweenWordsToModel(int word, int next_word, int current_state, int next_current_state, real log_n_next)
this method is used by prepareTrainModel and prepareTestModel to prepare the model. It adds the connections between words.

virtual void realloc(int n_frames, int n_states_)
this method reallocates the structures to accommodate a new sequence

virtual int nStatesInGrammar()
this method returns the number of states in the grammar

virtual int nStatesInWord(int word)
this method returns the number of states in a given word


This class has no child classes.
Author:
Samy Bengio (bengio@idiap.ch)
