org.apache.solr.analysis
Class BufferedTokenStream

java.lang.Object
  extended by org.apache.lucene.analysis.TokenStream
      extended by org.apache.solr.analysis.BufferedTokenStream
Direct Known Subclasses:
RemoveDuplicatesTokenFilter

public abstract class BufferedTokenStream
extends TokenStream

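Handles input and output buffering of a TokenStream, so that subclasses can look ahead in the input and emit zero, one, or several tokens for each input token.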

 // Example of a class implementing the rule "A" "B" => "Q" "B"
 class MyTokenStream extends BufferedTokenStream {
   public MyTokenStream(TokenStream input) {super(input);}
   protected Token process(Token t) throws IOException {
     if ("A".equals(t.termText())) {
       Token t2 = read();
       if (t2!=null && "B".equals(t2.termText())) t.setTermText("Q");
       if (t2!=null) pushBack(t2);
     }
     return t;
   }
 }

 // Example of a class implementing the rule "A" "B" => "A" "A" "B"
 class MyTokenStream extends BufferedTokenStream {
   public MyTokenStream(TokenStream input) {super(input);}
   protected Token process(Token t) throws IOException {
     if ("A".equals(t.termText())) {
       Token t2 = peek(1);  // null check guards against peeking past the end of the stream
       if (t2!=null && "B".equals(t2.termText())) write(t);
     }
     return t;
   }
 }
 

Version:
$Id$
Author:
yonik

Constructor Summary
BufferedTokenStream(TokenStream input)
           
 
Method Summary
 Token next()
           
protected  Iterable<Token> output()
          Provides direct Iterator access to the buffered output stream.
protected  Token peek(int n)
          Peek n tokens ahead in the buffered input stream, without modifying the stream.
protected abstract  Token process(Token t)
          Process a token.
protected  void pushBack(Token t)
          Push a token back into the buffered input stream, so that it will be returned by a future call to read().
protected  Token read()
          Read a token from the buffered input stream.
protected  void write(Token t)
          Write a token to the buffered output stream.
 
Methods inherited from class org.apache.lucene.analysis.TokenStream
close, next, reset
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BufferedTokenStream

public BufferedTokenStream(TokenStream input)
Method Detail

process

protected abstract Token process(Token t)
                          throws IOException
Process a token. Subclasses may read more tokens from the input stream, write more tokens to the output stream, or simply return the next token to be output. Subclasses may return null if the token is to be dropped. If a subclass writes tokens to the output stream and returns a non-null Token, the returned Token is considered to be at the head of the token output stream.

Throws:
IOException
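
The contract above can be illustrated with a minimal, self-contained sketch. It is not the Solr source: ProcessSketch, its driver loop in next(), and the demo rule are hypothetical stand-ins, and plain Strings replace Token so the code runs without Lucene. It assumes that next() drains any written tokens first, then hands the next input token to process(), and that a non-null return value is emitted ahead of anything written during that call.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for the buffering contract; NOT the Solr source.
// Plain Strings replace Token so the sketch runs without Lucene.
abstract class ProcessSketch {
    private final Iterator<String> input;
    private final LinkedList<String> outQueue = new LinkedList<>();

    ProcessSketch(List<String> tokens) { this.input = tokens.iterator(); }

    // Same role as process(Token): return the token to emit first, or null to drop it.
    protected abstract String process(String t);

    protected String read() { return input.hasNext() ? input.next() : null; }

    protected void write(String t) { outQueue.addLast(t); }

    // Assumed driver loop: drain written tokens, then process the next input token.
    public final String next() {
        while (true) {
            if (!outQueue.isEmpty()) return outQueue.removeFirst();
            String t = read();
            if (t == null) return null;      // end of stream
            String head = process(t);
            if (head != null) return head;   // emitted ahead of anything written
            // null means "dropped": loop again, process() may still have written tokens
        }
    }

    // Demo rule: drop "x", and emit every "A" twice (write one copy, return the other).
    static List<String> demo(List<String> tokens) {
        ProcessSketch s = new ProcessSketch(tokens) {
            @Override protected String process(String t) {
                if ("x".equals(t)) return null;
                if ("A".equals(t)) write(t);
                return t;
            }
        };
        List<String> out = new ArrayList<>();
        for (String t = s.next(); t != null; t = s.next()) out.add(t);
        return out;
    }
}
```

Under these assumptions, the input A, x, B yields A, A, B: the returned "A" is output first, the written copy follows on the next call, and "x" is dropped.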

next

public final Token next()
                 throws IOException
Overrides:
next in class TokenStream
Throws:
IOException

read

protected Token read()
              throws IOException
Read a token from the buffered input stream.

Returns:
the next Token, or null at end of stream (EOS)
Throws:
IOException

pushBack

protected void pushBack(Token t)
Push a token back into the buffered input stream, so that it will be returned by a future call to read().
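
The read()/pushBack() pairing can be sketched as one-token lookahead. The class below is a hypothetical stand-in, not the Solr implementation: it uses Strings instead of Token, and it assumes a pushed-back token goes to the front of the input buffer, so the very next read() returns it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for read()/pushBack(); NOT the Solr source.
final class PushBackSketch {
    private final Iterator<String> input;
    private final LinkedList<String> inQueue = new LinkedList<>();

    PushBackSketch(List<String> tokens) { this.input = tokens.iterator(); }

    // Buffered tokens are served first; null signals end of stream.
    String read() {
        if (!inQueue.isEmpty()) return inQueue.removeFirst();
        return input.hasNext() ? input.next() : null;
    }

    // Assumption: the token is placed at the front, so the next read() returns it.
    void pushBack(String t) { inQueue.addFirst(t); }

    // Rule "A" "B" => "Q" "B": look one token ahead, then restore it.
    String process(String t) {
        if ("A".equals(t)) {
            String t2 = read();
            if (t2 != null) pushBack(t2);
            if ("B".equals(t2)) return "Q";
        }
        return t;
    }

    // Runs the rule over a whole token list.
    static List<String> run(List<String> tokens) {
        PushBackSketch s = new PushBackSketch(tokens);
        List<String> out = new ArrayList<>();
        for (String t = s.read(); t != null; t = s.read()) out.add(s.process(t));
        return out;
    }
}
```

In this sketch, run on A, B, A, C gives Q, B, A, C: the "B" consumed during lookahead is pushed back and is read again as the next token.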


peek

protected Token peek(int n)
              throws IOException
Peek n tokens ahead in the buffered input stream, without modifying the stream.

Parameters:
n - Number of tokens ahead in the input stream to peek; 1-based, so 1 is the next token and 0 is invalid
Returns:
a Token that exists in the input stream; any modifications made to this Token will be "real" if/when the Token is read() from the stream.
Throws:
IOException
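
The 1-based indexing and the shared-instance behaviour of peek can be shown with a small sketch. PeekSketch and its Tok class are hypothetical stand-ins, not Solr or Lucene classes. Two behaviours here are assumptions rather than documented facts: peeking past the end of the stream returns null, and n < 1 is rejected with an IllegalArgumentException.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for peek()/read(); NOT the Solr source.
final class PeekSketch {
    // Minimal mutable token, standing in for Lucene's Token.
    static final class Tok {
        String text;
        Tok(String text) { this.text = text; }
    }

    private final Iterator<Tok> input;
    private final List<Tok> inQueue = new ArrayList<>();

    PeekSketch(List<Tok> tokens) { this.input = tokens.iterator(); }

    // Returns the n-th upcoming token (1-based) without consuming it.
    Tok peek(int n) {
        if (n < 1) throw new IllegalArgumentException("n is 1-based"); // assumed handling
        while (inQueue.size() < n) {
            if (!input.hasNext()) return null;  // assumed: null past end of stream
            inQueue.add(input.next());
        }
        return inQueue.get(n - 1);
    }

    // Serves buffered tokens first; the same instances that peek() returned.
    Tok read() {
        if (!inQueue.isEmpty()) return inQueue.remove(0);
        return input.hasNext() ? input.next() : null;
    }
}
```

Because peek(2) hands back the buffered instance itself, changing its text before reading means the later read() sees the changed text.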

write

protected void write(Token t)
Write a token to the buffered output stream.


output

protected Iterable<Token> output()
Provides direct Iterator access to the buffered output stream. Modifying any token in this Iterator will affect the resulting stream.
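
As a rough illustration of that last point, the sketch below writes two tokens and then rewrites them in place through output(). OutputSketch and Tok are hypothetical stand-ins, not Solr or Lucene classes, and the sketch assumes output() exposes the live buffer rather than a copy.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for write()/output(); NOT the Solr source.
final class OutputSketch {
    // Minimal mutable token, standing in for Lucene's Token.
    static final class Tok {
        String text;
        Tok(String text) { this.text = text; }
    }

    private final LinkedList<Tok> outQueue = new LinkedList<>();

    void write(Tok t) { outQueue.addLast(t); }

    // Assumption: exposes the live output buffer, not a copy.
    Iterable<Tok> output() { return outQueue; }

    // Writes two tokens, upper-cases them via output(), then lists the buffer contents.
    static List<String> demo() {
        OutputSketch s = new OutputSketch();
        s.write(new Tok("a"));
        s.write(new Tok("b"));
        for (Tok t : s.output()) t.text = t.text.toUpperCase(Locale.ROOT);
        List<String> result = new ArrayList<>();
        for (Tok t : s.output()) result.add(t.text);
        return result;
    }
}
```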



Copyright © 2006 - 2009 The Apache Software Foundation