org.jacorb.idl
Class lexer

java.lang.Object
  extended by org.jacorb.idl.lexer

public class lexer
extends java.lang.Object

This class implements a scanner (aka lexical analyzer or lexer) for IDL. The scanner reads characters from a global input stream and returns integers corresponding to the terminal number of the next token. Once the end of input is reached the EOF token is returned on every subsequent call.

All symbol constants are defined in sym.java which is generated by JavaCup from parser.cup.

In addition to the scanner proper (initialized first via init(), then called through next_token() to get each token), this class provides simple error and warning routines and keeps a publicly accessible count of errors and warnings. It also provides basic preprocessing facilities: it handles preprocessor directives such as #define, #undef and #include, although it does not provide full C++ preprocessing. This class is "static" (i.e., it has only static fields and methods).

Version:
$Id: lexer.java,v 1.48 2004/05/06 12:39:59 nicolas Exp $
Author:
Gerald Brose

Field Summary
protected static java.util.Hashtable char_symbols
          Table of single character symbols.
protected static boolean conditionalCompilation
           
protected static int current_line
          Current line number for use in error messages.
protected static int current_position
          Character position in current line.
static java.lang.String currentFile
          current file name
static java.lang.String currentPragmaPrefix
          currently active pragma prefix
protected static java.util.Hashtable defines
          Defined symbols (preprocessor)
protected static int EOF_CHAR
          EOF constant.
protected static boolean in_string
          Have we already read a '"' ?
protected static java.util.Hashtable java_keywords
          Table of Java reserved names.
protected static java.util.Hashtable keywords
          Table of keywords.
protected static java.util.Hashtable keywords_lower_case
          Table of keywords, stored in lower case.
protected static java.lang.StringBuffer line
          Current line for use in error messages.
protected static int next_char
          First and second character of lookahead.
protected static int next_char2
           
static int warning_count
          Count of warnings issued so far
protected static boolean wide
          Are we processing a wide char or string ?
 
Constructor Summary
lexer()
           
 
Method Summary
protected static void advance()
          Advance the scanner one character in the input stream.
static java.lang.String checkIdentifier(java.lang.String str)
          Checks whether Identifier str is legal and returns it.
static int currentLine()
          record information about the last lexical scope so that it can be restored later
static void define(java.lang.String symbol, java.lang.String value)
           
static java.lang.String defined(java.lang.String symbol)
           
protected static token do_symbol()
          Process an identifier.
static void emit_error(java.lang.String message)
          Emit an error message.
static void emit_error(java.lang.String message, str_token t)
           
static void emit_warn(java.lang.String message)
          Emit a warning message.
static void emit_warn(java.lang.String message, str_token t)
           
protected static int find_single_char(int ch)
          Try to look up a single character symbol, returns -1 for not found.
static PositionInfo getPosition()
          return the current reading position
protected static boolean id_char(int ch)
          Determine if a character is ok for the middle of an id.
protected static boolean id_start_char(int ch)
          Determine if a character is ok to start an id.
static void init()
          Initialize the scanner.
static boolean needsJavaEscape(Module m)
           
static token next_token()
          Return one token.
protected static void preprocess()
          Preprocessor directives are handled here.
protected static token real_next_token()
          The actual routine to return one token.
static void reset()
          reset the scanner state
static void restorePosition(PositionInfo p)
           
static boolean strictJavaEscapeCheck(java.lang.String s)
          called during the parse phase to catch clashes with Java reserved words.
protected static void swallow_comment()
          Handle swallowing up a comment.
static void undefine(java.lang.String symbol)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

next_char

protected static int next_char
First and second character of lookahead.


next_char2

protected static int next_char2

EOF_CHAR

protected static final int EOF_CHAR
EOF constant.

See Also:
Constant Field Values

keywords

protected static java.util.Hashtable keywords
Table of keywords. Keywords are initially treated as identifiers. Just before they are returned we look them up in this table to see if they match one of the keywords. The string of the name is the key here, which indexes Integer objects holding the symbol number.


keywords_lower_case

protected static java.util.Hashtable keywords_lower_case
Table of keywords, stored in lower case. Keys are the lower case version of the keywords used as keys for the keywords hash above, and the values are the case sensitive versions of the keywords. This table is used for detecting collisions of identifiers with keywords.
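
The interplay of the two keyword tables can be illustrated with a minimal sketch. It is not the JacORB source: the class name KeywordTablesSketch, the helper methods, the four sample keywords and their symbol numbers are all hypothetical, and the real tables are filled in init() with constants from sym.java.

```java
import java.util.Hashtable;
import java.util.Locale;

/** Hypothetical stand-in for the two keyword tables; names and numbers are assumptions. */
final class KeywordTablesSketch {
    // keyword spelling -> symbol number (the numbers here are made up)
    static final Hashtable<String, Integer> keywords = new Hashtable<>();
    // lower-cased keyword -> keyword in its case-sensitive spelling
    static final Hashtable<String, String> keywordsLowerCase = new Hashtable<>();

    static {
        String[] sample = {"interface", "module", "Object", "TRUE"};
        for (int i = 0; i < sample.length; i++) {
            keywords.put(sample[i], i + 1);
            keywordsLowerCase.put(sample[i].toLowerCase(Locale.ROOT), sample[i]);
        }
    }

    /** Returns the symbol number if text is spelled exactly like a keyword, else -1. */
    static int keywordSymbol(String text) {
        Integer sym = keywords.get(text);
        return sym == null ? -1 : sym;
    }

    /** Returns the keyword that an identifier collides with, or null if there is none. */
    static String collidesWith(String identifier) {
        if (keywords.containsKey(identifier)) {
            return null; // an exact keyword is not a collision, it is the keyword itself
        }
        return keywordsLowerCase.get(identifier.toLowerCase(Locale.ROOT));
    }
}
```

In this sketch, "Module" is not a keyword but differs from the keyword "module" only in case, so collidesWith("Module") reports the collision.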


java_keywords

protected static java.util.Hashtable java_keywords
Table of Java reserved names.


char_symbols

protected static java.util.Hashtable char_symbols
Table of single character symbols. For ease of implementation, we store all unambiguous single character tokens in this table of Integer objects keyed by Integer objects with the numerical value of the appropriate char (currently Character objects have a bug which precludes their use in tables).
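
A table of this shape, together with a lookup in the style of find_single_char, might look like the following sketch. The class name, the four sample characters and their symbol numbers are assumptions for illustration; the real table is populated in init() with constants from sym.java.

```java
import java.util.Hashtable;

/** Hypothetical sketch of a single-character symbol table keyed by Integer. */
final class CharSymbolsSketch {
    // Made-up symbol numbers standing in for the generated sym constants.
    static final int SEMI = 1, COMMA = 2, LBRACE = 3, RBRACE = 4;

    // Integer char code -> Integer symbol number
    static final Hashtable<Integer, Integer> charSymbols = new Hashtable<>();

    static {
        charSymbols.put((int) ';', SEMI);
        charSymbols.put((int) ',', COMMA);
        charSymbols.put((int) '{', LBRACE);
        charSymbols.put((int) '}', RBRACE);
    }

    /** Looks up a single-character symbol; returns -1 if the character is not in the table. */
    static int findSingleChar(int ch) {
        Integer result = charSymbols.get(ch);
        return result == null ? -1 : result;
    }
}
```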


defines

protected static java.util.Hashtable defines
Defined symbols (preprocessor)


conditionalCompilation

protected static boolean conditionalCompilation

current_line

protected static int current_line
Current line number for use in error messages.


line

protected static java.lang.StringBuffer line
Current line for use in error messages.


current_position

protected static int current_position
Character position in current line.


in_string

protected static boolean in_string
Have we already read a '"' ?


wide

protected static boolean wide
Are we processing a wide char or string ?


warning_count

public static int warning_count
Count of warnings issued so far


currentPragmaPrefix

public static java.lang.String currentPragmaPrefix
currently active pragma prefix


currentFile

public static java.lang.String currentFile
current file name

Constructor Detail

lexer

public lexer()
Method Detail

reset

public static void reset()
reset the scanner state


init

public static void init()
                 throws java.io.IOException
Initialize the scanner. This sets up the keywords and char_symbols tables and reads the first two characters of lookahead. "Object" is listed as reserved in the OMG spec. "int" is not, but I reserved it to bar its usage as a legal integer type.

Throws:
java.io.IOException

define

public static void define(java.lang.String symbol,
                          java.lang.String value)

undefine

public static void undefine(java.lang.String symbol)

defined

public static java.lang.String defined(java.lang.String symbol)

currentLine

public static int currentLine()
record information about the last lexical scope so that it can be restored later


getPosition

public static PositionInfo getPosition()
return the current reading position


restorePosition

public static void restorePosition(PositionInfo p)

advance

protected static void advance()
                       throws java.io.IOException
Advance the scanner one character in the input stream. This moves next_char2 to next_char and then reads a new next_char2.

Throws:
java.io.IOException
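
The two-character lookahead described for advance() can be sketched as follows. This is an illustration under stated assumptions, not the JacORB implementation: the class LookaheadSketch, the init(Reader) signature and the value -1 for EOF_CHAR are hypothetical (the real scanner reads from a global input stream, and the actual constant value is listed under Constant Field Values).

```java
import java.io.IOException;
import java.io.Reader;

/** Hypothetical sketch of a scanner front end with two characters of lookahead. */
final class LookaheadSketch {
    static final int EOF_CHAR = -1; // assumed value; matches what Reader.read() returns at end of input

    static Reader in;
    static int nextChar;  // first character of lookahead
    static int nextChar2; // second character of lookahead

    /** Primes both lookahead characters from the given reader. */
    static void init(Reader reader) throws IOException {
        in = reader;
        nextChar = in.read();
        nextChar2 = (nextChar == EOF_CHAR) ? EOF_CHAR : in.read();
    }

    /** Shifts nextChar2 into nextChar, then reads a new nextChar2 unless input is exhausted. */
    static void advance() throws IOException {
        nextChar = nextChar2;
        nextChar2 = (nextChar2 == EOF_CHAR) ? EOF_CHAR : in.read();
    }
}
```

Once the input is exhausted, both lookahead characters stay at EOF_CHAR no matter how often advance() is called, which is what lets a scanner keep returning the EOF token on every subsequent call.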

emit_error

public static void emit_error(java.lang.String message)
Emit an error message. The message will be marked with both the current line number and the position in the line. Error messages are printed on standard error (System.err).

Parameters:
message - the message to print.

emit_error

public static void emit_error(java.lang.String message,
                              str_token t)

emit_warn

public static void emit_warn(java.lang.String message)
Emit a warning message. The message will be marked with both the current line number and the position in the line. Messages are printed on standard error (System.err).

Parameters:
message - the message to print.

emit_warn

public static void emit_warn(java.lang.String message,
                             str_token t)

id_start_char

protected static boolean id_start_char(int ch)
Determine if a character is ok to start an id.

Parameters:
ch - the character in question.

id_char

protected static boolean id_char(int ch)
Determine if a character is ok for the middle of an id.

Parameters:
ch - the character in question.

find_single_char

protected static int find_single_char(int ch)
Try to look up a single character symbol, returns -1 for not found.

Parameters:
ch - the character in question.

swallow_comment

protected static void swallow_comment()
                               throws java.io.IOException
Handle swallowing up a comment. Both old style C and new style C++ comments are handled.

Throws:
java.io.IOException
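
As a rough illustration of the two comment styles, the sketch below skips a comment in a String rather than in the scanner's input stream. The class and method names are hypothetical, and treating an unterminated block comment as running to the end of input is an assumption; how the real scanner reports that case is not documented here.

```java
/** Hypothetical sketch of skipping C and C++ style comments in a string. */
final class CommentSketch {
    /**
     * If a comment starts at offset, returns the index of the first character
     * after it; otherwise returns offset unchanged.
     */
    static int skipComment(String s, int offset) {
        if (offset + 1 >= s.length() || s.charAt(offset) != '/') {
            return offset;
        }
        char second = s.charAt(offset + 1);
        if (second == '/') {
            // C++ style: runs to the end of the line (or end of input)
            int newline = s.indexOf('\n', offset + 2);
            return newline < 0 ? s.length() : newline + 1;
        }
        if (second == '*') {
            // C style: runs to the closing delimiter (or end of input if unterminated)
            int close = s.indexOf("*/", offset + 2);
            return close < 0 ? s.length() : close + 2;
        }
        return offset; // a lone '/' is not a comment
    }
}
```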

preprocess

protected static void preprocess()
                          throws java.io.IOException
Preprocessor directives are handled here.

Throws:
java.io.IOException

do_symbol

protected static token do_symbol()
                          throws java.io.IOException
Process an identifier.

Identifiers begin with a letter, underscore, or dollar sign, which is followed by zero or more letters, digits, underscores or dollar signs. This routine returns a str_token suitable for return by the scanner, or null if the string that was read is a symbol that was #defined. In that case, the symbol is expanded in place.

Throws:
java.io.IOException
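
The identifier rule stated above, together with predicates in the style of id_start_char and id_char, can be sketched as follows. The class and method names are hypothetical, "letter" is assumed to mean an ASCII letter, and #define expansion is left out entirely.

```java
/** Hypothetical sketch of the identifier character rules; ASCII letters are an assumption. */
final class IdentifierSketch {
    /** True if ch may start an identifier: a letter, underscore or dollar sign. */
    static boolean idStartChar(int ch) {
        return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || ch == '_' || ch == '$';
    }

    /** True if ch may appear after the first character: additionally allows digits. */
    static boolean idChar(int ch) {
        return idStartChar(ch) || (ch >= '0' && ch <= '9');
    }

    /** Returns the identifier starting at offset, or null if none starts there. */
    static String scanIdentifier(String input, int offset) {
        if (offset >= input.length() || !idStartChar(input.charAt(offset))) {
            return null;
        }
        int end = offset + 1;
        while (end < input.length() && idChar(input.charAt(end))) {
            end++;
        }
        return input.substring(offset, end);
    }
}
```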

checkIdentifier

public static java.lang.String checkIdentifier(java.lang.String str)
Checks whether identifier str is legal and returns it. If the identifier is escaped with a leading underscore, that underscore is removed. If the legal IDL identifier clashes with a Java reserved word, an underscore is prepended.

Parameters:
str - the IDL identifier

Prints an error message if the identifier collides with an IDL keyword.
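
The two escaping rules can be sketched as below. This is only one reading of the description: the class name, the small sample of Java reserved words, and the order of the two steps (strip the IDL escape first, then test the result against the Java reserved words) are assumptions, and the IDL keyword collision check is omitted.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the underscore handling described for checkIdentifier. */
final class CheckIdentifierSketch {
    // A small sample only; the real lexer keeps its own java_keywords table.
    static final Set<String> JAVA_RESERVED =
            new HashSet<>(Arrays.asList("class", "package", "final", "int"));

    static String checkIdentifier(String str) {
        String result = str;
        if (result.startsWith("_")) {
            result = result.substring(1); // remove the IDL escape
        }
        if (JAVA_RESERVED.contains(result)) {
            result = "_" + result;        // avoid a clash with a Java reserved word
        }
        return result;
    }
}
```

Under these assumptions, "_foo" becomes "foo", "class" becomes "_class", and the escaped identifier "_class" is first stripped to "class" and then prefixed again, ending up as "_class".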

strictJavaEscapeCheck

public static boolean strictJavaEscapeCheck(java.lang.String s)
called during the parse phase to catch clashes with Java reserved words.


needsJavaEscape

public static boolean needsJavaEscape(Module m)

next_token

public static token next_token()
                        throws java.io.IOException
Return one token. This is the main external interface to the scanner. It consumes sufficient characters to determine the next input token and returns it.

Throws:
java.io.IOException

real_next_token

protected static token real_next_token()
                                throws java.io.IOException
The actual routine to return one token.

Returns:
token
Throws:
java.io.IOException