Re: [xtext-dev] Dealing with variable length tokens (Hollerith)

Lieven Lemiengre Thu, 30 Jun 2016 08:58:49 -0700

Hi Kasper,

I have some experience with this: I've customized an Xtext lexer for our
commercial product.


Overriding the mTokens method is the right way to do this, but I do it a
little differently:

    override mTokens() throws RecognitionException {
        if (!SpecialTokensHandler.handle(input, state)) {
            super.mTokens()
        }
    }

Detecting and emitting the 'HOLLERITH' token can be combined.
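
To illustrate what "combined" could look like, here is a minimal sketch in
plain Java with no ANTLR dependency. HollerithScanner and scan() are
hypothetical names chosen for this example; this is not the actual
SpecialTokensHandler. It assumes the count before the 'H' is the exact number
of characters in the string, as described in the original question. One pass
both decides whether a Hollerith token starts at the given offset and reports
how many characters it spans.

```java
/** Hypothetical helper for illustration; not part of Xtext or ANTLR. */
public final class HollerithScanner {

    private HollerithScanner() {
    }

    /**
     * Returns the total length (digits + 'H' + payload) of the Hollerith
     * token starting at offset, or -1 if no complete token starts there.
     */
    public static int scan(CharSequence text, int offset) {
        int i = offset;
        long count = 0;
        // Read the leading integer, which gives the payload length.
        while (i < text.length() && text.charAt(i) >= '0' && text.charAt(i) <= '9') {
            count = count * 10 + (text.charAt(i) - '0');
            if (count > Integer.MAX_VALUE) {
                return -1; // implausibly large count, treat as no match
            }
            i++;
        }
        // At least one digit is required, followed immediately by 'H'.
        if (i == offset || i >= text.length() || text.charAt(i) != 'H') {
            return -1;
        }
        // The payload must fit entirely in the remaining input.
        long end = (long) i + 1 + count;
        if (end > text.length()) {
            return -1;
        }
        return (int) (end - offset);
    }
}
```

In a lexer, a handler along these lines could do the same lookahead with
input.LA(i), consume the reported number of characters, set state.type and
state.channel, and return true; a return value of false would leave the input
untouched so that super.mTokens() runs as usual.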

We delegate to a shared handler because you also have to override the lexer
for content assist. Look for the generated subclass of
org.eclipse.xtext.ui.editor.contentassist.antlr.internal.Lexer. In that
lexer you have to override the same method.

To make sure that the token types from the runtime lexer and the UI lexer are
the same, you also have to change your mwe2 file.
Use parser.antlr.ex.rt.AntlrGeneratorFragment
and parser.antlr.ex.ca.ContentAssistParserGeneratorFragment instead of the
usual generators.


kind regards,
Lieven


2016-06-30 17:13 GMT+02:00 kaspergam <[email protected]>:

> I recently asked about parsing IGES files using xtext, which include
> Hollerith strings in the specification. These strings are denoted by an int
> value, the number of characters, followed by an 'H' and then the string. To
> parse such tokens, you recommended I use a custom lexer. I was able to get
> decent parsing to work using this approach, but I am curious whether the way
> I am lexing is optimal or recommended.
>
> To handle a token like 9Hmy String, for example, I added a terminal rule
> in my grammar called HOLLERITH with this definition:
>
> terminal HOLLERITH:
>     INT 'H' . ;
>
> and then created a new CustomIGESLexer that extended the generated
> InternalIGESLexer. I then overrode the mTokens() method to check for these
> Hollerith strings first before allowing the internal lexer to work for any
> other token. I was wondering if this is a good approach, because I do not
> want to write a completely unique lexer, I just want to provide custom
> lexing for the Hollerith strings. The code is something like this:
>
> public void mTokens() throws RecognitionException {
>     if (isHollerith()) {
>         myRULE_HOLLERITH();
>     } else {
>         super.mTokens();
>     }
> }
>
> private void myRULE_HOLLERITH() throws RecognitionException {
>     try {
>         int _type = RULE_HOLLERITH;
>         int _channel = DEFAULT_TOKEN_CHANNEL;
>
>         //... get the token, match the characters with match()
>
>         state.type = _type;
>         state.channel = _channel;
>     } finally {
>     }
> }
>
> I tried to follow the style of the internal lexer when creating the
> custom rules. The isHollerith() method just checks for an int followed
> immediately by an 'H':
>
>     private boolean isHollerith() {
>         int index = 1;
>         int cur = input.LA(index);
>         // See if an int starts the string
>         while (cur >= '0' && cur <= '9') {
>             index++;
>             cur = input.LA(index);
>         }
>         // Followed by an 'H'
>         return index > 1 && cur == 'H';
>     }
>
> This might be a terrible way to customize the lexer rules, but it works
> for now.
>
> Thank you,
>
> Kasper Gammeltoft
> Oak Ridge National Lab,
> Computer Science & Mathematics Division
> Computer Science Research Group
>
> _______________________________________________
> xtext-dev mailing list
> [email protected]
> To change your delivery options, retrieve your password, or unsubscribe
> from this list, visit
> https://dev.eclipse.org/mailman/listinfo/xtext-dev
>
