ANTLR4: mismatched input
I am a newbie to ANTLR. I want to write a grammar to parse the input below:
    commit a1b2c3d4
The grammar is given below:
    grammar commit;
    file : 'commit' COMMITHASH NEWLINE;
    COMMITHASH : [a-z0-9]+;
    DATE : ~[\r\n]+;
    NEWLINE : '\r'? '\n';
When I try parsing the above input using this grammar, it throws the exception below:
    line 1:0 mismatched input 'commit a1b2c3d4' expecting 'commit'
Note: I have intentionally added the DATE token. Without the DATE token, it works fine. I want to know what is happening when the DATE token is added.
I had referred to the link "ANTLR4: mismatched input", but it is still not clear to me what happened.
ANTLR lexers assign unambiguous token types before the parser is ever used. When one lexer rule can match more characters than another lexer rule, the rule matching more characters is preferred by ANTLR, regardless of the order in which the lexer rules appear in the grammar. When two or more rules match the same length of input symbols (and no other rule matches a larger number of input symbols), the token type is assigned to the rule that appears first in the grammar.
Your lexer contains a rule DATE which matches all characters except a newline character. Since it matches the entire text of a line, and none of your tokens span multiple lines, the result is the following:
- If the entire text of a single line matches 'commit', an unnamed token corresponding to that input sequence is produced.
- If the entire text of a single line matches [a-z0-9]+, a COMMITHASH token is created for the entire text of the line. DATE also matches this input, but COMMITHASH appears first, so it is used.
- Otherwise, if a single line contains at least one character, a DATE token is created for the entire text of the line. Even if the line starts with 'commit' or a COMMITHASH, the DATE rule is used because it matches a longer sequence of characters.
- Finally, a NEWLINE token is created for each newline.
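To make the cases above concrete, here is a rough Python simulation of the two selection rules (longest match wins, ties go to the rule listed first). It is not ANTLR itself: Python's re module stands in for the lexer, the rule list is transcribed by hand from the grammar in the question, and the tokenize helper is a name made up for this sketch. For these simple patterns a greedy regex match is the same as the longest match, which is not true of regexes in general.

```python
import re

# Rules in grammar order. The implicit token for the literal 'commit' comes
# first, because ANTLR defines literal tokens ahead of the named lexer rules.
RULES = [
    ("'commit'", re.compile(r"commit")),
    ("COMMITHASH", re.compile(r"[a-z0-9]+")),
    ("DATE", re.compile(r"[^\r\n]+")),
    ("NEWLINE", re.compile(r"\r?\n")),
]


def tokenize(text, rules=RULES):
    """Hypothetical helper: longest match wins, ties go to the earlier rule."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two lengths are equal.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


# The input from the question: DATE swallows the whole line.
print(tokenize("commit a1b2c3d4\n"))
# [('DATE', 'commit a1b2c3d4'), ('NEWLINE', '\n')]

# One item per line: the tie-break picks 'commit' and COMMITHASH over DATE.
print(tokenize("commit\na1b2c3d4\n"))
# [("'commit'", 'commit'), ('NEWLINE', '\n'), ('COMMITHASH', 'a1b2c3d4'), ('NEWLINE', '\n')]
```

The first result is why the parser reports mismatched input 'commit a1b2c3d4': it receives a single DATE token where it expected the 'commit' token.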
You will need to do one of the following to resolve the problem. The exact strategy depends on the larger problem you are trying to solve.
- Remove the DATE rule, or rewrite it to match a more specific date format.
- Use semantic predicates and/or lexer modes to restrict the location(s) in the input where a DATE token might be produced.
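As an illustration of the first option only, the sketch below simulates in Python what happens once DATE is narrowed to a specific format. Everything beyond the question's grammar is an assumption made for the example: the ISO-style YYYY-MM-DD date pattern, the skipped WS rule for the space between 'commit' and the hash, and the tokenize helper. It mimics ANTLR's selection rules with Python's re module and is not generated lexer code.

```python
import re

# Grammar order as in the question, with two hypothetical changes:
# DATE only matches YYYY-MM-DD, and a WS rule is added whose tokens are dropped.
RULES = [
    ("'commit'", re.compile(r"commit")),
    ("COMMITHASH", re.compile(r"[a-z0-9]+")),
    ("DATE", re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")),
    ("NEWLINE", re.compile(r"\r?\n")),
    ("WS", re.compile(r"[ \t]+")),
]
SKIPPED = {"WS"}  # plays the role of ANTLR's "-> skip" command


def tokenize(text):
    """Hypothetical helper: longest match wins, ties go to the earlier rule."""
    tokens, pos = [], 0
    while pos < len(text):
        candidates = []
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            if m:
                candidates.append((name, m.group()))
        if not candidates:
            raise ValueError(f"no rule matches at offset {pos}")
        # max() returns the first maximal item, so ties keep grammar order.
        name, lexeme = max(candidates, key=lambda c: len(c[1]))
        if name not in SKIPPED:
            tokens.append((name, lexeme))
        pos += len(lexeme)
    return tokens


print(tokenize("commit a1b2c3d4\n"))
# [("'commit'", 'commit'), ('COMMITHASH', 'a1b2c3d4'), ('NEWLINE', '\n')]

print(tokenize("2015-06-01\n"))
# [('DATE', '2015-06-01'), ('NEWLINE', '\n')]
```

With the narrower pattern, DATE no longer competes for the whole line, yet it still wins over COMMITHASH on an actual date because it matches more characters. If dates in the real input have a looser format, the second option (predicates or lexer modes) is probably the better fit.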