public class EnglishTokenizer extends Object implements Serializable, TextTokenizer
EnglishAnalyzer, PorterStemFilter and PorterStemmer to fix Issue LUCENE-3335 causing sigsegv!

| Modifier and Type | Field and Description |
|---|---|
| protected Set<String> | stemExclusionsSet |
| protected List<String> | stopWords |

| Constructor and Description |
|---|
| EnglishTokenizer() |
| EnglishTokenizer(int minNGram, int maxNGram) |
| EnglishTokenizer(List<String> stopWords) |
| EnglishTokenizer(List<String> stopWords, Set<String> stemExclusionsSet) |

| Modifier and Type | Method and Description |
|---|---|
| protected org.apache.lucene.analysis.TokenStream | createTokenStream(String text) |
| int | getMaxNGram() |
| int | getMinNGram() |
| Set<String> | getStemExclusionsSet() |
| List<String> | getStopWords() |
| boolean | isnGram() |
| void | setMaxNGram(int maxNGram) |
| void | setMinNGram(int minNGram) |
| void | setnGram(boolean nGram) |
| void | setNGram(int minNGram, int maxNGram) |
| void | setStemExclusionsSet(Set<String> stemExclusionsSet) |
| void | setStopWords(List<String> stopWords) |
| List<String> | tokenize(String text) |

public EnglishTokenizer()
public EnglishTokenizer(int minNGram,
int maxNGram)
public List<String> tokenize(String text)
tokenize in interface TextTokenizer
protected org.apache.lucene.analysis.TokenStream createTokenStream(String text)
public void setNGram(int minNGram,
int maxNGram)
public boolean isnGram()
public void setnGram(boolean nGram)
public int getMinNGram()
public void setMinNGram(int minNGram)
public int getMaxNGram()
public void setMaxNGram(int maxNGram)
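The signatures above show that the tokenizer takes a stop-word list and an n-gram range, but this page does not describe how the two interact. The following is a minimal, self-contained sketch of one plausible reading, not the implementation of EnglishTokenizer: it assumes n-grams are built over words (not characters), that stop words are removed before n-grams are formed, and it omits stemming and the stem exclusion set entirely. The class NGramSketch and its methods words and nGrams are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in, NOT part of the documented library.
public class NGramSketch {

    // Lowercases the text, splits on runs of non-letters, and drops stop words.
    // Assumption: stop words are compared in lowercase.
    public static List<String> words(String text, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!w.isEmpty() && !stopWords.contains(w)) {
                out.add(w);
            }
        }
        return out;
    }

    // Emits every run of n consecutive words for minNGram <= n <= maxNGram,
    // joined by single spaces. Assumption: n-grams are word-level.
    public static List<String> nGrams(List<String> words, int minNGram, int maxNGram) {
        if (minNGram < 1 || maxNGram < minNGram) {
            throw new IllegalArgumentException("require 1 <= minNGram <= maxNGram");
        }
        List<String> out = new ArrayList<>();
        for (int n = minNGram; n <= maxNGram; n++) {
            for (int i = 0; i + n <= words.size(); i++) {
                out.add(String.join(" ", words.subList(i, i + n)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "a", "of"));
        List<String> w = words("The quick brown fox", stop);
        // prints [quick, brown, fox, quick brown, brown fox]
        System.out.println(nGrams(w, 1, 2));
    }
}
```

With the real class, the corresponding calls would presumably be setStopWords, setNGram(1, 2) and tokenize, but the exact output format of tokenize is not specified on this page.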
Copyright © 2013. All rights reserved.