Package edu.berkeley.nlp.lm
Class StupidBackoffLm<W>
java.lang.Object
edu.berkeley.nlp.lm.AbstractNgramLanguageModel<W>
edu.berkeley.nlp.lm.AbstractArrayEncodedNgramLanguageModel<W>
edu.berkeley.nlp.lm.StupidBackoffLm<W>
- Type Parameters:
W-
- All Implemented Interfaces:
ArrayEncodedNgramLanguageModel<W>,NgramLanguageModel<W>,Serializable
public class StupidBackoffLm<W>
extends AbstractArrayEncodedNgramLanguageModel<W>
implements ArrayEncodedNgramLanguageModel<W>, Serializable
Language model implementation which uses stupid backoff (Brants et al., 2007)
computation. Note that stupid backoff does not properly normalize, so the
scores this LM computes are not in fact probabilities. Also, unlike LMs estimated
using LmReaders.createKneserNeyLmFromTextFiles, this model returns natural
logarithms instead of log10.
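To make the two caveats above concrete, here is a minimal, self-contained sketch of stupid backoff scoring. It is not the berkeleylm implementation: the class `StupidBackoffSketch`, its string-keyed count table, and the methods `train`, `logScore`, and `toLog10` are all hypothetical names introduced for illustration. The backoff factor 0.4 is the value suggested by Brants et al. (2007); whether this class uses the same value is an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the berkeleylm implementation.
class StupidBackoffSketch {
    // Backoff factor; 0.4 is the value suggested by Brants et al. (2007).
    static final double ALPHA = 0.4;

    // Hypothetical count table keyed by space-joined n-grams.
    private final Map<String, Long> counts = new HashMap<>();
    private final int order;
    private long totalTokens = 0;

    StupidBackoffSketch(int order) {
        this.order = order;
    }

    // Count every n-gram of length 1..order in the token sequence.
    void train(List<String> tokens) {
        totalTokens += tokens.size();
        for (int end = 1; end <= tokens.size(); end++) {
            for (int start = Math.max(0, end - order); start < end; start++) {
                counts.merge(String.join(" ", tokens.subList(start, end)), 1L, Long::sum);
            }
        }
    }

    // Natural-log score of the last word given the preceding words.
    double logScore(List<String> ngram) {
        double penalty = 0.0;
        // Contexts longer than the model order are ignored.
        for (int start = Math.max(0, ngram.size() - order); start < ngram.size(); start++) {
            List<String> current = ngram.subList(start, ngram.size());
            long numerator = counts.getOrDefault(String.join(" ", current), 0L);
            if (numerator > 0) {
                // Relative frequency: count(n-gram) / count(context), or / N for unigrams.
                long denominator = current.size() == 1
                        ? totalTokens
                        : counts.get(String.join(" ", current.subList(0, current.size() - 1)));
                return penalty + Math.log((double) numerator / denominator);
            }
            // Not seen: drop the earliest context word and pay the backoff penalty.
            penalty += Math.log(ALPHA);
        }
        return Double.NEGATIVE_INFINITY; // the final word was never seen
    }

    // Convert a natural-log score to log10 for comparison with log10-based models.
    static double toLog10(double naturalLog) {
        return naturalLog / Math.log(10.0);
    }

    public static void main(String[] args) {
        StupidBackoffSketch lm = new StupidBackoffSketch(2);
        lm.train(Arrays.asList("a", "b", "a", "b", "c"));

        // Summing the scores of every word after the context "a" gives more than 1,
        // which is why the scores are not probabilities.
        double sum = 0.0;
        for (String w : Arrays.asList("a", "b", "c")) {
            sum += Math.exp(lm.logScore(Arrays.asList("a", w)));
        }
        System.out.println("sum of scores after 'a': " + sum);
        System.out.println("log10 score of 'a c': " + toLog10(lm.logScore(Arrays.asList("a", "c"))));
    }
}
```

Because unseen n-grams receive a fixed multiplicative penalty rather than redistributed probability mass, the scores for a given context need not sum to one.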
- Author:
- adampauls
-
Nested Class Summary
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.ArrayEncodedNgramLanguageModel
ArrayEncodedNgramLanguageModel.DefaultImplementations
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
NgramLanguageModel.StaticMethods
-
Field Summary
Fields
Fields inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
lmOrder, oovWordLogProb
-
Constructor Summary
Constructors
- StupidBackoffLm(int lmOrder, WordIndexer<W> wordIndexer, NgramMap<LongRef> map, ConfigOptions opts)
-
Method Summary
- float getLogProb(int[] ngram): Equivalent to getLogProb(ngram, 0, ngram.length).
- float getLogProb(int[] ngram, int startPos, int endPos): Calculate language model score of an n-gram.
- float getLogProb(List<W> ngram): Scores an n-gram.
- long getRawCount(int[] ngram, int startPos, int endPos): Gets the raw count of an n-gram.
Methods inherited from class edu.berkeley.nlp.lm.AbstractArrayEncodedNgramLanguageModel
scoreSentence
Methods inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
getLmOrder, getWordIndexer, setOovWordLogProb
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
getLmOrder, getWordIndexer, scoreSentence, setOovWordLogProb
-
Field Details
-
map
-
-
Constructor Details
-
StupidBackoffLm
public StupidBackoffLm(int lmOrder, WordIndexer<W> wordIndexer, NgramMap<LongRef> map, ConfigOptions opts)
-
-
Method Details
-
getLogProb
public float getLogProb(int[] ngram, int startPos, int endPos)
Description copied from interface: ArrayEncodedNgramLanguageModel
Calculate language model score of an n-gram. Warning: if you pass in an n-gram of length greater than getLmOrder(), this call will silently ignore the extra words of context. In other words, if you pass in a 5-gram (endPos - startPos == 5) to a 3-gram model, it will only score the words from startPos + 2 to endPos.
- Specified by:
getLogProb in interface ArrayEncodedNgramLanguageModel<W>
- Specified by:
getLogProb in class AbstractArrayEncodedNgramLanguageModel<W>
- Parameters:
ngram - array of words in integer representation
startPos - start of the portion of the array to be read
endPos - end of the portion of the array to be read.
- Returns:
-
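The warning about over-long n-grams in getLogProb(int[], int, int) can be pictured as clamping the start of the scored window. The following is a sketch under that reading of the documentation, not the library's code; `ContextWindowSketch` and `effectiveStart` are hypothetical names.

```java
// Illustrative sketch only; mirrors the documented behavior, not the library's code.
class ContextWindowSketch {
    // First index that is actually scored when [startPos, endPos) is passed
    // to a model of order lmOrder: extra leading context is dropped.
    static int effectiveStart(int startPos, int endPos, int lmOrder) {
        return Math.max(startPos, endPos - lmOrder);
    }

    public static void main(String[] args) {
        // A 5-gram passed to a 3-gram model: scoring starts at startPos + 2.
        System.out.println(effectiveStart(0, 5, 3));
        // A 2-gram passed to a 3-gram model: nothing is dropped.
        System.out.println(effectiveStart(1, 3, 3));
    }
}
```

Callers who need to know whether context was discarded can compare endPos - startPos against getLmOrder() before scoring.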
getRawCount
public long getRawCount(int[] ngram, int startPos, int endPos)
Gets the raw count of an n-gram.
- Parameters:
ngram -
startPos -
endPos -
- Returns:
- count of n-gram, or -1 if n-gram is not in the map.
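Since getRawCount signals a missing n-gram with -1 rather than 0, callers that sum or divide counts may want to normalize the sentinel first. The sketch below assumes only the signature and return contract documented above; the `RawCounter` interface is a hypothetical stand-in for the model, and `countOrZero` is an illustrative helper, not part of the library.

```java
// Illustrative sketch only; RawCounter is a hypothetical stand-in for the model.
class RawCountSketch {
    // Mirrors the documented signature of getRawCount.
    interface RawCounter {
        long getRawCount(int[] ngram, int startPos, int endPos);
    }

    // Treat the documented -1 ("not in the map") as a count of zero.
    static long countOrZero(RawCounter lm, int[] ngram, int startPos, int endPos) {
        long count = lm.getRawCount(ngram, startPos, endPos);
        return count < 0 ? 0 : count;
    }

    public static void main(String[] args) {
        // Stub model: unigrams have count 7, anything longer is absent.
        RawCounter stub = (ngram, start, end) -> (end - start == 1) ? 7L : -1L;
        int[] words = {3, 5};
        System.out.println(countOrZero(stub, words, 0, 1));
        System.out.println(countOrZero(stub, words, 0, 2));
    }
}
```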
-
getLogProb
public float getLogProb(int[] ngram)
Description copied from interface: ArrayEncodedNgramLanguageModel
Equivalent to getLogProb(ngram, 0, ngram.length)
- Specified by:
getLogProb in interface ArrayEncodedNgramLanguageModel<W>
- Overrides:
getLogProb in class AbstractArrayEncodedNgramLanguageModel<W>
- See Also:
-
getLogProb
public float getLogProb(List<W> ngram)
Description copied from interface: NgramLanguageModel
Scores an n-gram. This is a convenience method and will generally be relatively inefficient. More efficient versions are available in ArrayEncodedNgramLanguageModel.getLogProb(int[], int, int) and ContextEncodedNgramLanguageModel.getLogProb(long, int, int, edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel.LmContextInfo).
- Specified by:
getLogProb in interface NgramLanguageModel<W>
- Overrides:
getLogProb in class AbstractArrayEncodedNgramLanguageModel<W>
-
getNgramMap
-