| Interface | Description |
|---|---|
| BaseTokenStreamTestCase.CheckClearAttributesAttribute | Attribute that records if it was cleared or not. |
| CannedBinaryTokenStream.BinaryTermAttribute | An attribute extending TermToBytesRefAttribute but exposing a setBytesRef(org.apache.lucene.util.BytesRef) method. |
| NumericTokenStream.NumericTermAttribute | Expert: Use this attribute to get the details of the currently generated token. |
| Class | Description |
|---|---|
| Analyzer | An Analyzer builds TokenStreams, which analyze text. |
| Analyzer.GlobalReuseStrategy | Deprecated. This implementation class will be hidden in Lucene 5.0. |
| Analyzer.PerFieldReuseStrategy | Deprecated. This implementation class will be hidden in Lucene 5.0. |
| Analyzer.ReuseStrategy | Strategy defining how TokenStreamComponents are reused per call to Analyzer.tokenStream(String, java.io.Reader). |
| Analyzer.TokenStreamComponents | This class encapsulates the outer components of a token stream. |
| AnalyzerWrapper | Extension to Analyzer suitable for Analyzers which wrap other Analyzers. |
| BaseTokenStreamTestCase | Base class for all Lucene unit tests that use TokenStreams. |
| BaseTokenStreamTestCase.CheckClearAttributesAttributeImpl | Attribute that records if it was cleared or not. |
| CachingTokenFilter | This class can be used if the token attributes of a TokenStream are intended to be consumed more than once. |
| CannedBinaryTokenStream | TokenStream from a canned list of binary (BytesRef-based) tokens. |
| CannedBinaryTokenStream.BinaryTermAttributeImpl | Implementation for CannedBinaryTokenStream.BinaryTermAttribute. |
| CannedBinaryTokenStream.BinaryToken | Represents a binary token. |
| CannedTokenStream | TokenStream from a canned list of Tokens. |
| CharFilter | Subclasses of CharFilter can be chained to filter a Reader; they can be used as a Reader with additional offset correction. |
| CollationTestBase | Base test class for testing Unicode collation. |
| CrankyTokenFilter | Throws IOException from random TokenStream methods. |
| DelegatingAnalyzerWrapper | An analyzer wrapper that does not allow wrapping of components or readers. |
| LookaheadTokenFilter<T extends LookaheadTokenFilter.Position> | An abstract TokenFilter to make it easier to build graph token filters requiring some lookahead. |
| LookaheadTokenFilter.Position | Holds all state for a single position; subclass this to record other state at each position. |
| MockAnalyzer | Analyzer for testing. |
| MockBytesAnalyzer | Analyzer for testing that encodes terms as UTF-16 bytes. |
| MockCharFilter | The purpose of this CharFilter is to send offsets out of bounds if the analyzer doesn't use correctOffset or does incorrect offset math. |
| MockFixedLengthPayloadFilter | TokenFilter that adds random fixed-length payloads. |
| MockGraphTokenFilter | Randomly inserts overlapped (posInc=0) tokens, with posLength sometimes > 1. |
| MockHoleInjectingTokenFilter | Randomly injects holes (similar to what a StopFilter would do). |
| MockPayloadAnalyzer | Wraps a whitespace tokenizer with a filter that sets the first token and odd tokens to posinc=1, and all others to 0, encoding the position as pos: XXX in the payload. |
| MockRandomLookaheadTokenFilter | Uses LookaheadTokenFilter to randomly peek at future tokens. |
| MockReaderWrapper | Wraps a Reader, and can throw random or fixed exceptions, and spoon-feed read chars. |
| MockTokenFilter | A TokenFilter for testing that removes terms accepted by a DFA. |
| MockTokenizer | Tokenizer for testing. |
| MockUTF16TermAttributeImpl | Extension of CharTermAttributeImpl that encodes the term text as UTF-16 bytes instead of as UTF-8 bytes. |
| MockVariableLengthPayloadFilter | TokenFilter that adds random variable-length payloads. |
| NumericTokenStream | Expert: This class provides a TokenStream for indexing numeric values that can be used by NumericRangeQuery or NumericRangeFilter. |
| NumericTokenStream.NumericTermAttributeImpl | Implementation of NumericTokenStream.NumericTermAttribute. |
| Token | Deprecated. This class is outdated and no longer used since Lucene 2.9. |
| TokenFilter | A TokenFilter is a TokenStream whose input is another TokenStream. |
| Tokenizer | A Tokenizer is a TokenStream whose input is a Reader. |
| TokenStream | |
| TokenStreamToAutomaton | Consumes a TokenStream and creates an Automaton where the transition labels are UTF8 bytes (or Unicode code points if unicodeArcs is true) from the TermToBytesRefAttribute. |
| TokenStreamToDot | Consumes a TokenStream and outputs the dot (graphviz) string (graph). |
| ValidatingTokenFilter | A TokenFilter that checks consistency of the tokens (e.g. offsets are consistent with one another). |
| VocabularyAssert | Utility class for doing vocabulary-based stemming tests. |
The main classes of interest are:

- BaseTokenStreamTestCase: It is highly recommended to use its helper methods (especially in conjunction with MockAnalyzer or MockTokenizer), as it contains many assertions and checks to catch bugs.
- MockTokenizer: Tokenizer for testing that serves as a replacement for WHITESPACE, SIMPLE, and KEYWORD tokenizers. If you are writing a component such as a TokenFilter, it's a great idea to test it wrapping this tokenizer instead, for extra checks.
- MockAnalyzer: Analyzer for testing that uses MockTokenizer for additional verification. If you are testing a custom component such as a query parser or analyzer wrapper that consumes analysis streams, it's a great idea to test it with this analyzer instead.

Copyright © 2000–2021 The Apache Software Foundation. All rights reserved.
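Taken together, a typical analysis test subclasses BaseTokenStreamTestCase and runs text through a MockTokenizer-backed Analyzer. The sketch below illustrates the pattern under two stated assumptions: it targets the Lucene 4.x test-framework API (createComponents(String, Reader), with the lucene-test-framework jar on the classpath), and MyLowerCaseFilter is a hypothetical TokenFilter under test, not a Lucene class.

```java
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.BaseTokenStreamTestCase;
import org.apache.lucene.analysis.MockTokenizer;
import org.apache.lucene.analysis.Tokenizer;

// Sketch only: MyLowerCaseFilter is a hypothetical TokenFilter under test.
public class MyLowerCaseFilterTest extends BaseTokenStreamTestCase {

  public void testSimpleTerms() throws Exception {
    Analyzer analyzer = new Analyzer() {
      @Override
      protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        // MockTokenizer.WHITESPACE splits on whitespace like WhitespaceTokenizer,
        // but additionally enforces the TokenStream contract
        // (reset/incrementToken/end/close call order).
        Tokenizer tokenizer = new MockTokenizer(reader, MockTokenizer.WHITESPACE, false);
        return new TokenStreamComponents(tokenizer, new MyLowerCaseFilter(tokenizer));
      }
    };
    // assertAnalyzesTo checks the produced terms (and, via other overloads,
    // offsets, types, and position increments) and that the stream is
    // properly ended and closed.
    assertAnalyzesTo(analyzer, "Foo BAR baz", new String[] { "foo", "bar", "baz" });
  }
}
```

The same pattern extends to randomized coverage: BaseTokenStreamTestCase also offers checkRandomData-style helpers that feed random text through the analyzer to shake out contract violations that fixed inputs miss.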