Package edu.berkeley.nlp.lm.cache
Class ContextEncodedCachingLmWrapper<T>
java.lang.Object
edu.berkeley.nlp.lm.AbstractNgramLanguageModel<T>
edu.berkeley.nlp.lm.AbstractContextEncodedNgramLanguageModel<T>
edu.berkeley.nlp.lm.cache.ContextEncodedCachingLmWrapper<T>
- Type Parameters:
  - T -
- All Implemented Interfaces:
ContextEncodedNgramLanguageModel<T>, NgramLanguageModel<T>, Serializable
This class wraps ContextEncodedNgramLanguageModel with a cache.

- Author:
  - adampauls
- See Also:
  -
Nested Class Summary
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel:
  ContextEncodedNgramLanguageModel.DefaultImplementations, ContextEncodedNgramLanguageModel.LmContextInfo

Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel:
  NgramLanguageModel.StaticMethods
Field Summary
Fields inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
lmOrder, oovWordLogProb
Method Summary
- float getLogProb(long contextOffset, int contextOrder, int word, ContextEncodedNgramLanguageModel.LmContextInfo contextOutput)
  Get the score for an n-gram, and also get the context offset of the n-gram's suffix.
- int[] getNgramForOffset(long contextOffset, int contextOrder, int word)
  Gets the n-gram referred to by a context-encoding.
- ContextEncodedNgramLanguageModel.LmContextInfo getOffsetForNgram(int[] ngram, int startPos, int endPos)
  Gets the offset which refers to an n-gram.
- WordIndexer<T> getWordIndexer()
  Each LM must have a WordIndexer which assigns integer IDs to each word W in the language.
- static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheNotThreadSafe(ContextEncodedNgramLanguageModel<T> lm)
  This type of caching is only threadsafe if you have one cache wrapper per thread.
- static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheNotThreadSafe(ContextEncodedNgramLanguageModel<T> lm, int cacheBits)
- static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheThreadSafe(ContextEncodedNgramLanguageModel<T> lm)
  This type of caching is threadsafe and (internally) maintains a separate cache for each thread that calls it.
- static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheThreadSafe(ContextEncodedNgramLanguageModel<T> lm, int cacheBits)

Methods inherited from class edu.berkeley.nlp.lm.AbstractContextEncodedNgramLanguageModel:
  getLogProb, scoreSentence

Methods inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel:
  getLmOrder, setOovWordLogProb

Methods inherited from class java.lang.Object:
  clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel:
  getLmOrder, setOovWordLogProb
-
Method Details
-
wrapWithCacheNotThreadSafe
public static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheNotThreadSafe(ContextEncodedNgramLanguageModel<T> lm)

This type of caching is only threadsafe if you have one cache wrapper per thread.

- Type Parameters:
  - T -
- Parameters:
  - lm -
- Returns:
-
wrapWithCacheNotThreadSafe
public static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheNotThreadSafe(ContextEncodedNgramLanguageModel<T> lm, int cacheBits)
-
wrapWithCacheThreadSafe
public static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheThreadSafe(ContextEncodedNgramLanguageModel<T> lm)

This type of caching is threadsafe and (internally) maintains a separate cache for each thread that calls it. Note each thread has its own cache, so if you have lots of threads, memory usage could be substantial.

- Type Parameters:
  - T -
- Parameters:
  - lm -
- Returns:
-
-
wrapWithCacheThreadSafe
public static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheThreadSafe(ContextEncodedNgramLanguageModel<T> lm, int cacheBits)
-
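As an illustrative sketch of how the two factory methods differ (not code from this page; how the underlying LM is loaded, and the exact meaning of `cacheBits`, are assumptions about the surrounding library):

```java
import edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel;
import edu.berkeley.nlp.lm.cache.ContextEncodedCachingLmWrapper;

public class CacheWrapperExample {

    // Hypothetical helper: building the underlying LM (e.g. via LmReaders)
    // is outside the scope of this class's documentation.
    static ContextEncodedNgramLanguageModel<String> loadLm() {
        throw new UnsupportedOperationException("load or build an LM here");
    }

    public static void main(String[] args) {
        ContextEncodedNgramLanguageModel<String> lm = loadLm();

        // Per-thread wrapping: each worker thread must construct (or be
        // handed) its own wrapper instance; a single instance is not safe
        // to share across threads.
        ContextEncodedCachingLmWrapper<String> perThread =
            ContextEncodedCachingLmWrapper.wrapWithCacheNotThreadSafe(lm);

        // Shared wrapping: one instance may be called from many threads,
        // but each calling thread gets its own internal cache, so memory
        // use grows with the thread count. cacheBits is assumed to control
        // the cache size.
        ContextEncodedCachingLmWrapper<String> shared =
            ContextEncodedCachingLmWrapper.wrapWithCacheThreadSafe(lm, 20);
    }
}
```

The trade-off is the one the Javadoc states: the not-thread-safe variant avoids per-thread bookkeeping but requires one wrapper per thread, while the thread-safe variant trades memory (one cache per calling thread) for a single shareable object.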
getWordIndexer
public WordIndexer<T> getWordIndexer()

Description copied from interface: NgramLanguageModel
Each LM must have a WordIndexer which assigns integer IDs to each word W in the language.

- Specified by:
  - getWordIndexer in interface NgramLanguageModel<T>
- Overrides:
  - getWordIndexer in class AbstractNgramLanguageModel<T>
- Returns:
-
getOffsetForNgram
public ContextEncodedNgramLanguageModel.LmContextInfo getOffsetForNgram(int[] ngram, int startPos, int endPos)

Description copied from interface: ContextEncodedNgramLanguageModel
Gets the offset which refers to an n-gram. If the n-gram is not in the model, then it returns the shortest suffix of the n-gram which is. This operation is not necessarily fast.

- Specified by:
  - getOffsetForNgram in interface ContextEncodedNgramLanguageModel<T>
- Specified by:
  - getOffsetForNgram in class AbstractContextEncodedNgramLanguageModel<T>
-
getNgramForOffset
public int[] getNgramForOffset(long contextOffset, int contextOrder, int word)

Description copied from interface: ContextEncodedNgramLanguageModel
Gets the n-gram referred to by a context-encoding. This operation is not necessarily fast.

- Specified by:
  - getNgramForOffset in interface ContextEncodedNgramLanguageModel<T>
- Specified by:
  - getNgramForOffset in class AbstractContextEncodedNgramLanguageModel<T>
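A hedged sketch of how the two conversion methods round-trip (assuming, as elsewhere in this package, that LmContextInfo exposes public `offset` and `order` fields). Since both operations are documented as "not necessarily fast", this pattern belongs in diagnostic code rather than inner loops:

```java
import edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel;
import edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel.LmContextInfo;

public class OffsetRoundTripExample {

    // Encode an n-gram's prefix as a context offset, then decode it back.
    // If the prefix is not in the model, getOffsetForNgram falls back to
    // its shortest in-model suffix, so the recovered n-gram may be shorter
    // than the input.
    static int[] roundTrip(ContextEncodedNgramLanguageModel<String> lm, int[] ngram) {
        // Context-encoding of the prefix ngram[0 .. length-2].
        LmContextInfo prefix = lm.getOffsetForNgram(ngram, 0, ngram.length - 1);
        // Decode (prefix offset, prefix order, last word) back into word IDs.
        return lm.getNgramForOffset(prefix.offset, prefix.order, ngram[ngram.length - 1]);
    }
}
```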
-
getLogProb
public float getLogProb(long contextOffset, int contextOrder, int word, ContextEncodedNgramLanguageModel.LmContextInfo contextOutput)

Description copied from interface: ContextEncodedNgramLanguageModel
Get the score for an n-gram, and also get the context offset of the n-gram's suffix.

- Specified by:
  - getLogProb in interface ContextEncodedNgramLanguageModel<T>
- Specified by:
  - getLogProb in class AbstractContextEncodedNgramLanguageModel<T>
- Parameters:
  - contextOffset - Offset of context (prefix) of an n-gram
  - contextOrder - The (0-based) length of context (i.e. order == 0 iff context refers to a unigram)
  - word - Last word of the n-gram
  - contextOutput - Offset of the suffix of the input n-gram. If the parameter is null, it will be ignored. This can be passed to future queries for efficient access.
- Returns:
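The contextOutput parameter is what makes sequential scoring cheap: the suffix context returned by one query can seed the next, which is exactly the access pattern the cache accelerates. A minimal sketch (not code from this page; it assumes LmContextInfo's public `offset` and `order` fields, and that the words have already been mapped to integer IDs via the LM's WordIndexer):

```java
import edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel;
import edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel.LmContextInfo;

public class ContextScoringExample {

    // Score a sentence left to right, threading the context offset through
    // successive getLogProb calls so each query reuses the suffix context
    // computed by the previous one instead of re-looking up the n-gram.
    static float scoreWords(ContextEncodedNgramLanguageModel<String> lm, int[] words) {
        // Seed the context with the first word (order 0 = unigram context).
        LmContextInfo ctx = lm.getOffsetForNgram(words, 0, 1);
        float total = 0.0f;
        for (int i = 1; i < words.length; i++) {
            LmContextInfo next = new LmContextInfo();
            // Scores P(words[i] | context) and writes the suffix context
            // of the queried n-gram into next.
            total += lm.getLogProb(ctx.offset, ctx.order, words[i], next);
            ctx = next; // this query's output context seeds the next query
        }
        return total;
    }
}
```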
-