I recently asked about parsing IGES files with Xtext. The IGES specification 
includes Hollerith strings, which are written as an integer character count, 
followed by an 'H', followed by that many characters. To parse such tokens, 
you recommended that I use a custom lexer. I was able to get decent parsing 
working with this approach, but I am curious whether the way I am lexing is 
suboptimal or not recommended. 

To handle a token like 9Hmy String, for example, I added a terminal rule 
called HOLLERITH to my grammar:

    terminal HOLLERITH: INT 'H' . ;

I then created a new CustomIGESLexer that extends the generated 
InternalIGESLexer and overrode its mTokens() method to check for Hollerith 
strings first, before letting the internal lexer handle any other token. Is 
this a good approach? I do not want to write a completely separate lexer; I 
only want to provide custom lexing for the Hollerith strings. The code is 
something like this:

    @Override
    public void mTokens() throws RecognitionException {
        if (isHollerith()) {
            myRULE_HOLLERITH();
        } else {
            super.mTokens();
        }
    }

    void myRULE_HOLLERITH() throws RecognitionException {
        try {
            int _type = RULE_HOLLERITH;
            int _channel = DEFAULT_TOKEN_CHANNEL;
            //... get the token, match the characters with match()
            state.type = _type;
            state.channel = _channel;
        } finally {
        }
    }

I tried to follow the style of the internal lexer when writing the custom 
rule. isHollerith() just checks for an integer followed immediately by an 'H':

    private boolean isHollerith() {
        int index = 1;
        int cur = input.LA(index);
        // See if an int starts the string
        while (cur >= '0' && cur <= '9') {
            index++;
            cur = input.LA(index);
        }
        // Followed by an 'H'
        return index > 1 && cur == 'H';
    }
This might be a terrible way to customize the lexer rules, but it works for 
now. 
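
For reference, the step elided in the comment above (reading the count and 
then taking exactly that many characters) can be sketched outside of ANTLR as 
plain Java. This is only an illustration under my own assumptions: 
HollerithScanner, tokenLength() and body() are hypothetical names, not part of 
Xtext or the generated lexer, and the sketch works on a CharSequence rather 
than on the lexer's input stream.

```java
/** Hypothetical helper, not part of Xtext or the generated lexer. */
final class HollerithScanner {
    private HollerithScanner() {}

    /**
     * Returns the total length (digits + 'H' + body) of the Hollerith token
     * that starts at offset, or -1 if no complete token starts there.
     */
    static int tokenLength(CharSequence text, int offset) {
        int i = offset;
        long count = 0;
        // Read the decimal character count.
        while (i < text.length() && text.charAt(i) >= '0' && text.charAt(i) <= '9') {
            count = count * 10 + (text.charAt(i) - '0');
            if (count > Integer.MAX_VALUE) {
                return -1; // count too large to be a real token
            }
            i++;
        }
        // At least one digit is required, followed immediately by 'H'.
        if (i == offset || i >= text.length() || text.charAt(i) != 'H') {
            return -1;
        }
        // The body is exactly `count` characters, whatever they are.
        long end = (long) i + 1 + count;
        if (end > text.length()) {
            return -1; // input ends before the body is complete
        }
        return (int) (end - offset);
    }

    /** Returns the body of the token at offset, or null if there is none. */
    static String body(CharSequence text, int offset) {
        int length = tokenLength(text, offset);
        if (length < 0) {
            return null;
        }
        int h = offset;
        while (text.charAt(h) != 'H') {
            h++; // skip the digits of the count
        }
        return text.subSequence(h + 1, offset + length).toString();
    }
}
```

Inside the lexer, the same count could presumably drive a loop of 
input.consume() calls after the 'H' has been matched; I have not verified that 
against the generated code.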

Thank you,
Kasper Gammeltoft
Oak Ridge National Lab, 
Computer Science & Mathematics Division
Computer Science Research Group
_______________________________________________
xtext-dev mailing list