
Jakarta-Regexp 1.2 API: Class RECompiler

org.apache.regexp
Class RECompiler


java.lang.Object
  |
  +--org.apache.regexp.RECompiler

Direct Known Subclasses:
REDebugCompiler

public class RECompiler
extends java.lang.Object

A regular expression compiler class. This class compiles a pattern string into a regular expression program interpretable by the RE evaluator class. The 'recompile' command line tool uses this compiler to pre-compile regular expressions for use with RE. For a description of the syntax accepted by RECompiler and what you can do with regular expressions, see the documentation for the RE matcher class.

Version:
$Id: RECompiler.java,v 1.2 2000/05/14 21:04:17 jon Exp $
Author:
Jonathan Locke
See Also:
RE, recompile

Inner Class Summary
(package private)  class RECompiler.RERange
          Local, nested class for maintaining character ranges for character classes.
 
Field Summary
(package private) static int[] bracketEnd
           
(package private) static int bracketFinished
           
(package private) static int[] bracketMin
           
(package private) static int[] bracketOpt
           
(package private) static int brackets
           
(package private) static int[] bracketStart
           
(package private) static int bracketUnbounded
           
(package private) static char ESC_BACKREF
           
(package private) static char ESC_CLASS
           
(package private) static char ESC_COMPLEX
           
(package private) static char ESC_MASK
           
(package private) static java.util.Hashtable hashPOSIX
           
(package private)  int idx
           
(package private)  char[] instruction
           
(package private)  int len
           
(package private)  int lenInstruction
           
(package private) static int maxBrackets
           
(package private) static int NODE_NORMAL
           
(package private) static int NODE_NULLABLE
           
(package private) static int NODE_TOPLEVEL
           
(package private)  int parens
           
(package private)  java.lang.String pattern
           
 
Constructor Summary
RECompiler()
          Constructor.
 
Method Summary
(package private) static void <clinit>()
           
(package private)  void allocBrackets()
          Allocate storage for brackets only as needed
(package private)  int atom()
          Absorb an atomic character string.
(package private)  void bracket()
          Match a bracket {m,n} expression and put the results in the bracket member variables.
(package private)  int branch(int[] flags)
          Compile one branch of an or operator (implements concatenation)
(package private)  int characterClass()
          Compile a character class
(package private)  int closure(int[] flags)
          Compile a possibly closured terminal
 REProgram compile(java.lang.String pattern)
          Compiles a regular expression pattern into a program runnable by the pattern matcher class 'RE'.
(package private)  void emit(char c)
          Emit a single character into the program stream.
(package private)  void ensure(int n)
          Ensures that n more characters can fit in the program buffer.
(package private)  char escape()
          Match an escape sequence.
(package private)  int expr(int[] flags)
          Compile an expression with possible parens around it.
(package private)  void internalError()
          Throws a new internal error exception
(package private)  int node(char opcode, int opdata)
          Adds a new node
(package private)  void nodeInsert(char opcode, int opdata, int insertAt)
          Inserts a node with a given opcode and opdata at insertAt.
(package private)  void setNextOfEnd(int node, int pointTo)
          Appends a node to the end of a node chain
(package private)  void syntaxError(java.lang.String s)
          Throws a new syntax error exception
(package private)  int terminal(int[] flags)
          Match a terminal node.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait
 

Field Detail

instruction


char[] instruction

lenInstruction


int lenInstruction

pattern


java.lang.String pattern

len


int len

idx


int idx

parens


int parens

NODE_NORMAL


static final int NODE_NORMAL

NODE_NULLABLE


static final int NODE_NULLABLE

NODE_TOPLEVEL


static final int NODE_TOPLEVEL

ESC_MASK


static final char ESC_MASK

ESC_BACKREF


static final char ESC_BACKREF

ESC_COMPLEX


static final char ESC_COMPLEX

ESC_CLASS


static final char ESC_CLASS

maxBrackets


static final int maxBrackets

brackets


static int brackets

bracketStart


static int[] bracketStart

bracketEnd


static int[] bracketEnd

bracketMin


static int[] bracketMin

bracketOpt


static int[] bracketOpt

bracketUnbounded


static final int bracketUnbounded

bracketFinished


static final int bracketFinished

hashPOSIX


static java.util.Hashtable hashPOSIX
Constructor Detail

RECompiler


public RECompiler()
Constructor. Creates (initially empty) storage for a regular expression program.
Method Detail


static void <clinit>()

ensure


void ensure(int n)
Ensures that n more characters can fit in the program buffer. If n more characters do not fit, the buffer size is doubled until they do.
Parameters:
n - Number of additional characters to ensure will fit.

emit


void emit(char c)
Emit a single character into the program stream.
Parameters:
c - Character to add
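
The growth policy described for ensure and emit can be sketched as follows. This is a minimal illustration, not the library's source: the class name ProgramBufferSketch and the initial capacity of 4 are assumptions made for the demo, while the field names instruction and lenInstruction are borrowed from the Field Summary.

```java
// Hypothetical sketch of a doubling program buffer; not the library's source.
class ProgramBufferSketch {
    char[] instruction = new char[4]; // small initial capacity, chosen for the demo
    int lenInstruction = 0;           // number of characters currently in use

    // Make room for n more characters, doubling the capacity until they fit.
    void ensure(int n) {
        int capacity = instruction.length;
        while (lenInstruction + n > capacity) {
            capacity *= 2;
        }
        if (capacity != instruction.length) {
            char[] grown = new char[capacity];
            System.arraycopy(instruction, 0, grown, 0, lenInstruction);
            instruction = grown;
        }
    }

    // Append one character, growing the buffer first if necessary.
    void emit(char c) {
        ensure(1);
        instruction[lenInstruction++] = c;
    }

    public static void main(String[] args) {
        ProgramBufferSketch buf = new ProgramBufferSketch();
        for (char c : "hello".toCharArray()) {
            buf.emit(c);
        }
        System.out.println(buf.instruction.length);                             // 8
        System.out.println(new String(buf.instruction, 0, buf.lenInstruction)); // hello
    }
}
```

Doubling keeps the number of reallocations logarithmic in the final program size, so a long run of emit calls stays cheap on average.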

nodeInsert


void nodeInsert(char opcode,
                int opdata,
                int insertAt)
Inserts a node with a given opcode and opdata at insertAt. The node relative next pointer is initialized to 0.
Parameters:
opcode - Opcode for new node
opdata - Opdata for new node (only the low 16 bits are currently used)
insertAt - Index at which to insert the new node in the program

setNextOfEnd


void setNextOfEnd(int node,
                  int pointTo)
Appends a node to the end of a node chain
Parameters:
node - Start of node chain to traverse
pointTo - Node to have the tail of the chain point to
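
A chain walk of this kind might look like the sketch below. It assumes that each node stores a relative next pointer in a fixed slot (OFFSET_NEXT, a hypothetical constant here) and that a value of 0 marks the tail of the chain; neither detail is stated on this page.

```java
// Hypothetical sketch of appending to a node chain; the node layout is assumed.
class NodeChainSketch {
    static final int OFFSET_NEXT = 2; // assumed slot of the relative next pointer

    // Follow relative next pointers from node to the tail, then point the tail at pointTo.
    static void setNextOfEnd(char[] instruction, int node, int pointTo) {
        int next;
        while ((next = (short) instruction[node + OFFSET_NEXT]) != 0) {
            node += next; // the pointer is relative to the current node
        }
        instruction[node + OFFSET_NEXT] = (char) (short) (pointTo - node);
    }

    public static void main(String[] args) {
        // Three 3-slot nodes at indexes 0, 3 and 6; node 0 already points to node 3.
        char[] program = new char[9];
        program[0 + OFFSET_NEXT] = 3;
        setNextOfEnd(program, 0, 6);
        System.out.println((int) program[3 + OFFSET_NEXT]); // 3
    }
}
```

Storing offsets rather than absolute indexes means a chain stays valid when the whole chain is moved as a unit.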

node


int node(char opcode,
         int opdata)
Adds a new node
Parameters:
opcode - Opcode for node
opdata - Opdata for node (only the low 16 bits are currently used)
Returns:
Index of new node in program
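
The following sketch illustrates node and nodeInsert together. It assumes a three-slot node layout (opcode, opdata, relative next pointer), uses placeholder opcodes, and uses a fixed-size buffer to stay short; the class name NodeInsertSketch is hypothetical. It does not adjust existing relative pointers that span the insertion point, which a complete implementation would have to consider.

```java
// Hypothetical sketch of appending and inserting program nodes; layout is assumed.
class NodeInsertSketch {
    static final int NODE_SIZE = 3; // assumed layout: opcode, opdata, relative next

    char[] instruction = new char[64]; // fixed size keeps the demo short
    int lenInstruction = 0;

    // Append a node at the end of the program and return its index.
    int node(char opcode, int opdata) {
        int index = lenInstruction;
        instruction[index] = opcode;
        instruction[index + 1] = (char) opdata; // keeps only the low 16 bits
        instruction[index + 2] = 0;             // relative next pointer starts at 0
        lenInstruction += NODE_SIZE;
        return index;
    }

    // Shift everything from insertAt onward right by one node, then write the new node.
    void nodeInsert(char opcode, int opdata, int insertAt) {
        System.arraycopy(instruction, insertAt, instruction, insertAt + NODE_SIZE,
                         lenInstruction - insertAt);
        instruction[insertAt] = opcode;
        instruction[insertAt + 1] = (char) opdata;
        instruction[insertAt + 2] = 0;
        lenInstruction += NODE_SIZE;
    }

    public static void main(String[] args) {
        NodeInsertSketch p = new NodeInsertSketch();
        p.node('A', 1);
        p.node('B', 0x12345);     // opdata wider than 16 bits is truncated
        p.nodeInsert('X', 7, 0);  // insert in front of both nodes
        System.out.println(p.instruction[0]);                      // X
        System.out.println(p.instruction[3]);                      // A
        System.out.println(Integer.toHexString(p.instruction[7])); // 2345
    }
}
```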

internalError


void internalError()
             throws java.lang.Error
Throws a new internal error exception
Throws:
java.lang.Error - Thrown in the event of an internal error.

syntaxError


void syntaxError(java.lang.String s)
           throws RESyntaxException
Throws a new syntax error exception
Throws:
RESyntaxException - Thrown if the regular expression has invalid syntax.

allocBrackets


void allocBrackets()
Allocate storage for brackets only as needed

bracket


void bracket()
       throws RESyntaxException
Match a bracket {m,n} expression and put the results in the bracket member variables.
Throws:
RESyntaxException - Thrown if the regular expression has invalid syntax.
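
As an illustration of what matching a {m,n} expression involves, the sketch below parses the three forms {m}, {m,} and {m,n}. It is an assumption-laden sketch: the class BracketSketch is hypothetical, the sentinel value -1 for an unbounded range and the convention of storing the optional count as n minus m are guesses suggested by the field names bracketUnbounded, bracketMin and bracketOpt, and IllegalArgumentException stands in for RESyntaxException.

```java
// Hypothetical sketch of parsing a {m,n} quantifier; conventions are assumed.
class BracketSketch {
    static final int UNBOUNDED = -1; // assumed sentinel for "{m,}"

    // Parse "{m}", "{m,}" or "{m,n}" starting at idx; return {min, opt, nextIdx}.
    // opt is the number of optional repetitions beyond min, or UNBOUNDED.
    static int[] parse(String pattern, int idx) {
        if (idx >= pattern.length() || pattern.charAt(idx++) != '{') {
            throw new IllegalArgumentException("Expected '{'");
        }
        int[] first = number(pattern, idx);
        int min = first[0];
        idx = first[1];
        int opt = 0;
        if (idx < pattern.length() && pattern.charAt(idx) == ',') {
            idx++;
            if (idx < pattern.length() && pattern.charAt(idx) == '}') {
                opt = UNBOUNDED;
            } else {
                int[] max = number(pattern, idx);
                if (max[0] < min) {
                    throw new IllegalArgumentException("Bad range");
                }
                opt = max[0] - min;
                idx = max[1];
            }
        }
        if (idx >= pattern.length() || pattern.charAt(idx++) != '}') {
            throw new IllegalArgumentException("Missing '}'");
        }
        return new int[] { min, opt, idx };
    }

    // Read a run of ASCII digits; return {value, index after the digits}.
    static int[] number(String s, int idx) {
        int start = idx;
        int value = 0;
        while (idx < s.length() && s.charAt(idx) >= '0' && s.charAt(idx) <= '9') {
            value = value * 10 + (s.charAt(idx++) - '0');
        }
        if (idx == start) {
            throw new IllegalArgumentException("Expected a number");
        }
        return new int[] { value, idx };
    }

    public static void main(String[] args) {
        int[] r = parse("a{2,5}", 1);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // 2 3 6
    }
}
```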

escape


char escape()
      throws RESyntaxException
Match an escape sequence. Handles quoted characters and octal escapes as well as normal escape characters, and always advances idx past the whole sequence. This code distinguishes an octal escape from a backreference. When the return value is ESC_CLASS, ESC_COMPLEX or ESC_BACKREF, the specific escape type can be read from pattern.charAt(idx - 1).
Returns:
ESC_* code or character if simple escape
Throws:
RESyntaxException - Thrown if the regular expression has invalid syntax.
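
One plausible way to tell an octal escape from a backreference is sketched below. The rule used here (a leading 0, or an octal digit followed by another octal digit, means octal; a lone digit means backreference) is an assumption for illustration and is not confirmed by this page. The class name EscapeSketch and the value given to ESC_BACKREF are placeholders.

```java
// Hypothetical sketch of separating octal escapes from backreferences.
class EscapeSketch {
    static final char ESC_BACKREF = 0xFFFF; // placeholder value, not the library's constant

    static boolean isOctal(char c) {
        return c >= '0' && c <= '7';
    }

    // idx points at the first digit after the backslash (the caller has checked it is a digit).
    // Returns the decoded character for an octal escape, or ESC_BACKREF otherwise.
    static char digitEscape(String pattern, int idx) {
        char first = pattern.charAt(idx);
        boolean second = idx + 1 < pattern.length() && isOctal(pattern.charAt(idx + 1));
        if (first == '0' || (isOctal(first) && second)) {
            int val = first - '0';
            int end = Math.min(pattern.length(), idx + 3); // at most three octal digits
            for (int i = idx + 1; i < end && isOctal(pattern.charAt(i)); i++) {
                val = (val << 3) + (pattern.charAt(i) - '0');
            }
            return (char) val;
        }
        return ESC_BACKREF;
    }

    public static void main(String[] args) {
        System.out.println(digitEscape("\\101", 1));               // A
        System.out.println(digitEscape("\\1", 1) == ESC_BACKREF);  // true
    }
}
```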

characterClass


int characterClass()
             throws RESyntaxException
Compile a character class
Returns:
Index of class node
Throws:
RESyntaxException - Thrown if the regular expression has invalid syntax.

atom


int atom()
   throws RESyntaxException
Absorb an atomic character string. This method is a little tricky because it can un-include the last character of the string if a closure operator follows. This is correct because *, + and ? have higher precedence than concatenation (thus ABC* means AB(C*) and NOT (ABC)*).
Returns:
Index of new atom node
Throws:
RESyntaxException - Thrown if the regular expression has invalid syntax.
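
The "un-include" step can be illustrated with a small sketch. It is a simplification under stated assumptions: only letters and digits count as literal characters, only *, + and ? count as closure operators, and the class and method names are hypothetical.

```java
// Hypothetical sketch of how a literal run gives back its last character to a closure.
class AtomSketch {
    static boolean isClosure(char c) {
        return c == '*' || c == '+' || c == '?';
    }

    // Return the length of the literal run starting at idx that forms one atom.
    static int atomLength(String pattern, int idx) {
        int end = idx;
        while (end < pattern.length() && Character.isLetterOrDigit(pattern.charAt(end))) {
            end++;
        }
        // If a closure operator follows a run of two or more characters, give back
        // the last character so that the closure applies to it alone.
        if (end < pattern.length() && isClosure(pattern.charAt(end)) && end - idx > 1) {
            end--;
        }
        return end - idx;
    }

    public static void main(String[] args) {
        System.out.println(atomLength("ABC*", 0)); // 2, the atom is "AB" and C* is compiled next
        System.out.println(atomLength("ABC", 0));  // 3
    }
}
```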

terminal


int terminal(int[] flags)
       throws RESyntaxException
Match a terminal node.
Parameters:
flags - Flags
Returns:
Index of terminal node (closeable)
Throws:
RESyntaxException - Thrown if the regular expression has invalid syntax.

closure


int closure(int[] flags)
      throws RESyntaxException
Compile a possibly closured terminal
Parameters:
flags - Flags passed by reference
Returns:
Index of closured node
Throws:
RESyntaxException - Thrown if the regular expression has invalid syntax.

branch


int branch(int[] flags)
     throws RESyntaxException
Compile one branch of an or operator (implements concatenation)
Parameters:
flags - Flags passed by reference
Returns:
Pointer to branch node
Throws:
RESyntaxException - Thrown if the regular expression has invalid syntax.

expr


int expr(int[] flags)
   throws RESyntaxException
Compile an expression with possible parens around it. Paren matching is done at this level so we can tie the branch tails together.
Parameters:
flags - Flag value passed by reference
Returns:
Node index of expression in instruction array
Throws:
RESyntaxException - Thrown if the regular expression has invalid syntax.
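
The division of labour among expr, branch, closure and terminal resembles a recursive-descent parser. The sketch below mirrors that structure on a tiny grammar and prints a prefix-notation tree instead of emitting program nodes. Everything in it is illustrative: the class DescentSketch, the boolean topLevel parameter (standing in for the flags array) and the use of IllegalArgumentException in place of RESyntaxException are assumptions.

```java
// Hypothetical recursive-descent sketch mirroring expr/branch/closure/terminal.
class DescentSketch {
    private final String pattern;
    private int idx = 0;

    DescentSketch(String pattern) {
        this.pattern = pattern;
    }

    static String parse(String pattern) {
        DescentSketch p = new DescentSketch(pattern);
        String tree = p.expr(true);
        if (p.idx != pattern.length()) {
            throw new IllegalArgumentException("Unexpected '" + pattern.charAt(p.idx) + "'");
        }
        return tree;
    }

    // expr: branches separated by '|', wrapped in parens when nested.
    String expr(boolean topLevel) {
        boolean paren = !topLevel && peek() == '(';
        if (paren) {
            idx++;
        }
        String left = branch();
        while (peek() == '|') {
            idx++;
            left = "(or " + left + " " + branch() + ")";
        }
        if (paren) {
            if (peek() != ')') {
                throw new IllegalArgumentException("Missing ')'");
            }
            idx++;
        }
        return left;
    }

    // branch: one or more closures in sequence (concatenation).
    String branch() {
        String left = closure();
        while (idx < pattern.length() && peek() != '|' && peek() != ')') {
            left = "(cat " + left + " " + closure() + ")";
        }
        return left;
    }

    // closure: a terminal optionally followed by *, + or ?.
    String closure() {
        String t = terminal();
        char c = peek();
        if (c == '*' || c == '+' || c == '?') {
            idx++;
            return "(" + c + " " + t + ")";
        }
        return t;
    }

    // terminal: a single literal character, or a parenthesized expression.
    String terminal() {
        if (idx >= pattern.length()) {
            throw new IllegalArgumentException("Unexpected end of pattern");
        }
        char c = peek();
        if (c == '(') {
            return expr(false);
        }
        if (c == ')' || c == '|' || c == '*' || c == '+' || c == '?') {
            throw new IllegalArgumentException("Unexpected '" + c + "'");
        }
        idx++;
        return String.valueOf(c);
    }

    char peek() {
        return idx < pattern.length() ? pattern.charAt(idx) : '\0';
    }

    public static void main(String[] args) {
        System.out.println(parse("ab|c*"));  // (or (cat a b) (* c))
        System.out.println(parse("(a|b)c")); // (cat (or a b) c)
    }
}
```

Handling the parentheses inside expr, rather than in terminal, keeps the opening paren, the alternatives and the closing paren in one place, which is where the tails of all branches are known.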

compile


public REProgram compile(java.lang.String pattern)
                  throws RESyntaxException
Compiles a regular expression pattern into a program runnable by the pattern matcher class 'RE'.
Parameters:
pattern - Regular expression pattern to compile (see the RE class for a description of the accepted syntax).
Returns:
A compiled regular expression program.
Throws:
RESyntaxException - Thrown if the regular expression has invalid syntax.
See Also:
RECompiler, RE


Copyright © 2000 Apache Software Foundation. All Rights Reserved.