Robert Corbett wrote:
> > Transitions on declared, but unused tokens are not included into state
> > transitions. You normally would get $default transitions, but byacc
> > insists on specifying all tokens. This leads to wrong
> > results because a syntax-error is generated on a perfectly legal input.
> 
> No, you get a syntax error on illegal input.  I don't see a problem
> with that.

(Sorry, I should have changed the lookahead in my first post to make
this clearer.)

The correct output would be:
Success: Got tNL
yyerror: syntax error

Let me explain...
If you use 'yyclearin' in the rule "line: tTOK xpr ',' xpr" as the
action, then you will not get the syntax error. See below, the example
is adapted to reflect this. Now the example should never generate a
syntax error.

The problem lies in the fact that the parser stack contains reducible
content *before* the error token should be shifted (unless you specify
an extra rule to match the error token as well with the rest of the
rule). LALR must reduce if any rule is delimited (regardless of the
trailing context). The reduction must happen *before* the lookahead
token is used as the input token. After all, the lookahead is just a hint
and not input yet.
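
To illustrate the "hint, not input" point, here is a minimal sketch of a
lookahead slot that an action may empty before the parser consumes it.
It is not byacc's generated code: the names lookahead, peek and
rule3_action and the token codes are made up for this illustration, with
lookahead standing in for yychar and the assignment of T_EMPTY standing
in for yyclearin.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical token codes, chosen for illustration only. */
enum { T_EMPTY = -1, T_END = 0, T_NL = 259 };

static int lookahead = T_EMPTY;             /* plays the role of yychar */
static const int input[] = { T_NL, T_END }; /* input left after the second xpr */
static int pos = 0;

/* Fetch a token only when the lookahead slot is empty. */
static int peek(void)
{
    if (lookahead == T_EMPTY) {
        assert(pos < (int)(sizeof input / sizeof input[0]));
        lookahead = input[pos++];
    }
    return lookahead;
}

/* Stand-in for the action of rule (3): discard a pending tNL.
 * Returns 1 if the lookahead was cleared, 0 otherwise. */
static int rule3_action(void)
{
    if (lookahead == T_NL) {
        printf("Success: Got tNL\n");
        lookahead = T_EMPTY;                /* the effect of yyclearin */
        return 1;
    }
    return 0;
}
```

Calling peek(), then rule3_action(), then peek() again models a reduction
that runs before the lookahead is checked: the second peek() fetches a
fresh token, so the discarded tNL never reaches the parser as input.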

> > Try this source, once without the tNL rule, and once with. A syntax
> > error is generated if the rule is not implemented. When the rule is
> > there, then the proper result is printed.
> 
> Yes, when there is no rule for reading a "tNL" it is an error to have
> a "tNL" in the input.  I do not understand why that is a problem.

Wrong. All tokens in the %token declaration are valid input tokens,
regardless of whether they are used in rules. That's why they are
called terminals. You are also allowed to use character literals
like ',' *without* declaring them in a %token statement. How do you
explain those then?

Yacc should not care about which token it reads, just make the correct
decision.

Back to the problem...
The rule "xpr" is delimited by any lookahead that is not '+'. The state
should therefore look like this:
state 10
        line : tTOK xpr ',' xpr .  (3)
        xpr : xpr . '+' xpr  (4)

        '+'  shift 7
        .  reduce 3     (this keeps the lookahead!)

However, your byacc does not do this, and enumerates all possible
tokens:
state 10
        line : tTOK xpr ',' xpr .  (3)
        xpr : xpr . '+' xpr  (4)

        '+'  shift 7
        $end  reduce 3
        tTOK  reduce 3

This is simply wrong because the error is generated *before* the
reduction of rule (3). Remember, the action of rule (3) modifies the
lookahead, which is perfectly legal to do. Lookahead tokens are only
used to resolve simple shift/reduce conflicts. The action is to shift
if a token is seen that expands the expression, otherwise it must reduce.
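
The difference between the two versions of state 10 can be sketched as
two lookup functions. This is only an illustration under assumed names:
the token codes, the action enum and the function names are invented
here and do not reflect how byacc actually encodes its tables.

```c
#include <stddef.h>

/* Hypothetical token codes and action encoding, for illustration only. */
enum token  { T_END = 0, T_PLUS = '+', T_TOK = 257, T_NUM = 258, T_NL = 259 };
enum action { ACT_ERROR, ACT_SHIFT_7, ACT_REDUCE_3 };

struct entry { int tok; enum action act; };

/* State 10 with a default reduction: anything except '+' reduces,
 * and the lookahead is kept for the rule's action to inspect. */
static enum action state10_default(int lookahead)
{
    return lookahead == T_PLUS ? ACT_SHIFT_7 : ACT_REDUCE_3;
}

/* State 10 with every lookahead enumerated: a token that is not
 * listed is reported as an error before rule (3) is reduced. */
static enum action state10_enumerated(int lookahead)
{
    static const struct entry table[] = {
        { T_PLUS, ACT_SHIFT_7  },
        { T_END,  ACT_REDUCE_3 },
        { T_TOK,  ACT_REDUCE_3 },
    };
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (table[i].tok == lookahead)
            return table[i].act;
    return ACT_ERROR;
}
```

With tNL as the lookahead, state10_default returns the reduction, so the
action of rule (3) still runs and can clear the token, while
state10_enumerated returns the error first.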

Greetings Bertho

---- file byacc-bug.y ----
%{
#include <stdio.h>
#include <stdlib.h>
%}
%token tTOK tNUM tNL
%left '+'

%%
lines : line
      | lines line
      ;
line  : tTOK xpr ',' xpr {
              if(yychar == tNL)
              {
                      printf("Success: Got tNL\n");
                      /* could also write "yychar = -1;" */
                      yyclearin;
              }
      }
 /* Declare the terminal tNL as used */
 /*   | tNL   */
      ;
xpr   : xpr '+' xpr
      | tNUM
      ;
%%
int yylex(void)
{
      static int tok[] = {tTOK, tNUM, ',', tNUM, tNL, 0};
      static size_t idx = 0;  /* unsigned, to match the sizeof-based NTOK */
#define NTOK  (sizeof(tok)/sizeof(tok[0]))
      if(idx < NTOK)
              return tok[idx++];
      return 0;
}
int yyerror(char *s)
{
      printf("yyerror: %s\n", s);
      exit(1);
      return 0;
}
int main(void)
{
      return yyparse();
}
