Package edu.berkeley.nlp.lm.io
| Class | Description |
| --- | --- |
| ArpaLmReader<W> | A parser for ARPA LM files. |
| | Callback that is called for each n-gram in the collection. |
| | Computes the log probability of a list of files. |
| FirstPassCallback<V extends LongRepresentable<V>> | Reader callback which adds n-grams to an NgramMap. |
| | Reads in n-gram count collections in the format that the Google n-grams Web1T corpus comes in. |
| | Some IO utility functions. |
| | Class for producing a Kneser-Ney language model in ARPA format from raw text. |
| | Class for producing a Kneser-Ney language model in ARPA format from raw text. |
| LmReader<V, C extends LmReaderCallback<V>> | |
| | Callback that is called for each n-gram in the collection. |
| | This class contains a number of static methods for reading/writing/estimating n-gram language models. |
| | Estimates a Kneser-Ney language model from raw text, and writes the language model out in ARPA format. |
| | Given a language model in ARPA format, builds a binary representation of the language model and writes it to disk. |
| | Given a directory in Google n-grams format, builds a binary representation of a stupid-backoff language model and writes it to disk. |
| | Like MakeLmBinaryFromGoogle, except it only writes the NgramMap portion of the LM, meaning the binary does not contain the vocabulary. |
| | Reader callback which adds n-grams to an NgramMap. |
| | Callback that is called for each n-gram in the collection. |
| TextReader<W> | Class for reading raw text files. |
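
Several entries in the summary above pair a reader (for example, the ARPA parser) with a callback that is called for each n-gram. The sketch below illustrates that general pattern on a tiny ARPA-format string. It is not this package's implementation and does not use its API: `ArpaSketch`, `NgramCallback`, and `parse` are hypothetical names, and treating a missing backoff weight as 0.0 is an assumption made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; not part of edu.berkeley.nlp.lm.io.
class ArpaSketch {

    /** Hypothetical callback, invoked once per n-gram line. */
    interface NgramCallback {
        void call(List<String> words, double logProb, double backoff);
    }

    /** Reads ARPA-format text and hands every n-gram to the callback. */
    static void parse(BufferedReader in, NgramCallback callback) throws IOException {
        int order = 0; // stays 0 until the first "\N-grams:" header
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) continue;
            if (line.equals("\\end\\")) break;
            if (line.startsWith("\\")) {
                // Either "\data\" or a section header such as "\2-grams:".
                if (line.endsWith("-grams:")) {
                    String n = line.substring(1, line.length() - "-grams:".length());
                    order = Integer.parseInt(n);
                }
                continue;
            }
            if (order == 0) continue; // "ngram N=count" lines in the \data\ block

            // Layout: logProb, then `order` words, then an optional backoff.
            String[] tok = line.split("\\s+");
            if (tok.length < order + 1) {
                throw new IOException("Malformed n-gram line: " + line);
            }
            double logProb = Double.parseDouble(tok[0]);
            List<String> words = new ArrayList<>(Arrays.asList(tok).subList(1, order + 1));
            // Assumption: a missing backoff weight is treated as 0.0.
            double backoff = tok.length > order + 1 ? Double.parseDouble(tok[order + 1]) : 0.0;
            callback.call(words, logProb, backoff);
        }
    }

    public static void main(String[] args) throws IOException {
        String arpa = "\\data\\\nngram 1=2\nngram 2=1\n\n"
                + "\\1-grams:\n-1.0\ta\t-0.5\n-1.2\tb\n\n"
                + "\\2-grams:\n-0.3\ta b\n\n\\end\\\n";
        // Print each n-gram as it is encountered.
        parse(new BufferedReader(new StringReader(arpa)),
                (words, logProb, backoff) ->
                        System.out.println(words + " logProb=" + logProb + " backoff=" + backoff));
    }
}
```

Keeping the parsing loop separate from the callback means the same reader can feed different consumers, such as one that counts n-grams and another that stores them in a map.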