Hi Kasper,
I have some experience with this: I've customized an Xtext lexer for our
commercial product.
Overriding the mTokens method is the right way to do this, but I do it a
little differently:
    override mTokens() throws RecognitionException {
        if (!SpecialTokensHandler.handle(input, state)) {
            super.mTokens()
        }
    }
Detecting & emitting the 'HOLLERITH' token can be combined in a single step.
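As a rough illustration of combining detection and emission, here is a minimal standalone Java sketch. It works on a plain CharSequence rather than the ANTLR character stream, and HollerithScanner and tokenLength are hypothetical names, not part of Xtext or ANTLR. In a real handler you would do the same lookahead with input.LA(k), then consume that many characters and set state.type and state.channel.

```java
/** Standalone sketch: recognises a Hollerith token such as "9Hmy String". */
final class HollerithScanner {

    private HollerithScanner() {
    }

    /**
     * Returns the total length of the Hollerith token starting at offset
     * (digits + 'H' + payload), or -1 if no complete token starts there.
     */
    static int tokenLength(CharSequence text, int offset) {
        int pos = offset;
        long count = 0;
        // Read the leading integer: the number of payload characters.
        while (pos < text.length() && text.charAt(pos) >= '0' && text.charAt(pos) <= '9') {
            count = count * 10 + (text.charAt(pos) - '0');
            if (count > Integer.MAX_VALUE) {
                return -1; // count is too large to be a sensible length
            }
            pos++;
        }
        // At least one digit is required, immediately followed by 'H'.
        if (pos == offset || pos >= text.length() || text.charAt(pos) != 'H') {
            return -1;
        }
        pos++; // step over the 'H'
        // The payload must be fully present in the input.
        if (text.length() - pos < count) {
            return -1;
        }
        return (int) (pos - offset + count);
    }
}
```

A single call answers both "is this a Hollerith string?" (result is not -1) and "how many characters belong to the token?", so the check and the consumption do not have to scan the digits twice.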
We delegate to a shared handler because you also have to override the lexer
used for content assist. Look for the generated subclass
of org.eclipse.xtext.ui.editor.contentassist.antlr.internal.Lexer. In that
lexer you have to override the same method.
To make sure that the token types of the runtime lexer & the UI lexer are the
same, you also have to change your MWE2 file.
Use parser.antlr.ex.rt.AntlrGeneratorFragment
& parser.antlr.ex.ca.ContentAssistParserGeneratorFragment instead of the
usual generator fragments.
kind regards,
Lieven
2016-06-30 17:13 GMT+02:00 kaspergam <[email protected]>:
> I recently was asking about parsing IGES files using xtext, which included
> Hollerith strings in the specification. These strings are denoted by an int
> value, the number of characters, followed by a 'H' and then the string. To
> parse such tokens, you recommended I use a custom lexer. I was able to get
> decent parsing to work using this approach, but was curious if the way I am
> lexing is not optimal or recommended.
>
> To handle a token like 9Hmy String, for example, I added a terminal rule
> in my grammar called HOLLERITH with this definition:
>
> terminal HOLLERITH:
> INT 'H' . ;
>
> and then created a new CustomIGESLexer that extended the generated
> InternalIGESLexer. I then overrode the mTokens() method to check for these
> Hollerith strings first before allowing the internal lexer to work for any
> other token. I was wondering if this is a good approach, because I do not
> want to write a completely unique lexer, I just want to provide custom
> lexing for the Hollerith strings. The code is something like this:
>
> public void mTokens() throws RecognitionException {
>     if (isHollerith()) {
>         myRULE_HOLLERITH();
>     } else {
>         super.mTokens();
>     }
> }
>
> private void myRULE_HOLLERITH() throws RecognitionException {
>     try {
>         int _type = RULE_HOLLERITH;
>         int _channel = DEFAULT_TOKEN_CHANNEL;
>
>         //... get the token, match the characters with match()
>
>         state.type = _type;
>         state.channel = _channel;
>     } finally {
>     }
> }
>
> I tried to follow the style of the internal lexer when creating the
> custom rules. The isHollerith() method just checks for an int followed
> immediately by an 'H':
>
> private boolean isHollerith() {
>     int index = 1;
>     int cur = input.LA(index);
>     // See if an int starts the string
>     while (cur >= '0' && cur <= '9') {
>         index++;
>         cur = input.LA(index);
>     }
>     // Followed by an 'H'
>     return index > 1 && cur == 'H';
> }
>
> This might be a terrible way to customize the lexer rules, but it works
> for now.
>
> Thank you,
>
> Kasper Gammeltoft
> Oak Ridge National Lab,
> Computer Science & Mathematics Division
> Computer Science Research Group
>
> _______________________________________________
> xtext-dev mailing list
> [email protected]
> To change your delivery options, retrieve your password, or unsubscribe
> from this list, visit
> https://dev.eclipse.org/mailman/listinfo/xtext-dev
>