ANTLR4: mismatched input
I am new to ANTLR. I want to write a grammar to parse the input below:
commit a1b2c3d4
The grammar is given below:
grammar Commit;

file : 'commit' COMMITHASH NEWLINE;

COMMITHASH : [a-z0-9]+;
DATE : ~[\r\n]+;
NEWLINE : '\r'? '\n';
When I try to parse the above input using this grammar, it throws the exception below:
line 1:0 mismatched input 'commit a1b2c3d4' expecting 'commit'
Note: I have intentionally added the DATE token. Without the DATE token, it works fine. I want to know what is happening when the DATE token is added.
I had referred to the link "ANTLR4: mismatched input", but it is still not clear to me what happened.
ANTLR lexers assign unambiguous token types before the parser is ever used. When one lexer rule can match more characters than another lexer rule, the rule matching more characters is preferred by ANTLR, regardless of the order in which the lexer rules appear in the grammar. When two or more rules match the same length of input symbols (and no other rule matches a greater number of input symbols), the token type is assigned to the rule that appears first in the grammar.
Your lexer contains a rule DATE that matches all characters except a newline character. Since it matches the entire text of a line, and none of your tokens span multiple lines, the result is the following:
- If the entire text of a single line matches 'commit', then an unnamed token corresponding to that input sequence is produced.
- If the entire text of a single line matches [a-z0-9]+, then a COMMITHASH token is created for the entire text of that line. DATE also matches this input, but COMMITHASH appears first, so it is used.
- Otherwise, if a single line contains at least one character, a DATE token is created for the entire text of that line. Even if the line starts with 'commit' or a COMMITHASH, the DATE rule is used because it matches a longer sequence of characters.
- Finally, a NEWLINE token is created for each newline.
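The two selection rules described above (longest match first, then grammar order on a tie) can be modelled in a few lines of Python. This is only a rough sketch of the behaviour, not ANTLR itself: the rule table and the tokenize helper are hypothetical names, and the implicit 'commit' literal is assumed to be ordered ahead of the named lexer rules.

```python
import re

# Lexer rules in assumed grammar order: the implicit 'commit' literal first,
# then the named rules COMMITHASH, DATE and NEWLINE.
RULES = [
    ("'commit'", re.compile(r"commit")),
    ("COMMITHASH", re.compile(r"[a-z0-9]+")),
    ("DATE", re.compile(r"[^\r\n]+")),
    ("NEWLINE", re.compile(r"\r?\n")),
]

def tokenize(text, rules=RULES):
    """Pick the longest match at each position; ties go to the earliest rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # A strictly greater length is required, so the first rule wins a tie.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

print(tokenize("commit a1b2c3d4\n"))
# [('DATE', 'commit a1b2c3d4'), ('NEWLINE', '\n')]
```

The whole line comes out as a single DATE token, which is why the parser reports the mismatched input 'commit a1b2c3d4' where it expected 'commit'.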
You need to do one of the following to resolve the problem. The exact strategy depends on the larger problem you are trying to solve.
- Remove the DATE rule, or rewrite it to match a more specific date format.
- Use semantic predicates and/or lexer modes to restrict the location(s) in the input where a DATE token might be produced.
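To see why a narrower DATE rule helps, here is a small Python model of the longest-match behaviour with DATE restricted to a specific format. It is a sketch under assumptions, not ANTLR output: the YYYY-MM-DD pattern and the skipped WS rule are hypothetical choices, since the question does not say what the dates look like or how spaces should be handled.

```python
import re

# Same model of the lexer, with two assumed changes:
#   * DATE only matches a hypothetical YYYY-MM-DD format
#   * a hypothetical WS rule matches spaces and tabs, and is skipped
RULES = [
    ("'commit'", re.compile(r"commit")),
    ("COMMITHASH", re.compile(r"[a-z0-9]+")),
    ("DATE", re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")),
    ("NEWLINE", re.compile(r"\r?\n")),
    ("WS", re.compile(r"[ \t]+")),
]
SKIPPED = {"WS"}

def tokenize(text):
    """Pick the longest match at each position; ties go to the earliest rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # A strictly greater length is required, so the first rule wins a tie.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name not in SKIPPED:
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

print(tokenize("commit a1b2c3d4\n"))
# [("'commit'", 'commit'), ('COMMITHASH', 'a1b2c3d4'), ('NEWLINE', '\n')]
```

With DATE no longer able to swallow a whole line, the 'commit' literal and COMMITHASH are tokenized separately, while a line such as 2014-01-31 still becomes a DATE token because DATE matches more characters there than COMMITHASH does.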