Calculator in Parsec
Spent some time on Parsec
recently, and would like to take some notes for personal reflection, which
hopefully could be of any use to others.
One simple calculator is a good example to illustrate how Parsec could be used, so we would ignore all kind of details, and only focus on the most essential part.
The intuitive way of writing expr
using BNF would be something like below, and most parser generators would accept it and generate the intended
parser given some extra rules for precedence and association.
expr = expr + expr
| expr - expr
| expr * expr
| expr / expr
| term
However, Parsec doesn’t support left recursion; the above rules would cause infinite loop in Parsec. Fortunately, this problem has gained enough attention so that dedicate solutions are provided in Parsec.
Before jumping to the final answer, it’s good to have a high level overview about how Parsec works from users’ perspective. Firstly, we need to
provide the definition of the language you want to parse, which includes but not limited to, reserved operations, keywords, comment style, etc. The
specific module for this task is Text.Parsec.Language
. Secondly, we build lexers from the language definition from first step, using module
Text.Parsec.Token
. The lexers constructed this way could take care of trailing spaces, which makes parsing much easier to work with. Last, we create
the expression parser using buildExpressionParser
from module Text.Parsec.Expr
. Here we could define all the operations supported, prefixed,
postfixed, and infixed, along with the precedence and association.
The complete code is shown below, less than 50 lines.
import Text.Parsec
import Text.Parsec.Language (emptyDef)
import qualified Text.Parsec.Token as P
import Text.Parsec.Expr
import Control.Applicative ((<$>))
lexer =
P.makeTokenParser emptyDef{
P.reservedOpNames = ["+", "-", "*", "/"]
}
reservedOp = P.reservedOp lexer
parens = P.parens lexer
natural = P.natural lexer
float = P.float lexer
expr = buildExpressionParser table term
term = parens expr
<|> fromIntegral <$> natural
<|> float
parse_expr = expr
table = [ [prefix "-" negate]
, [postfix "++" (+1), postfix "--" ((-)1)]
, [binary "*" (*) AssocLeft, binary "/" (/) AssocLeft]
, [binary "+" (+) AssocLeft, binary "-" (-) AssocLeft]
]
prefix name f = Prefix $ do {reservedOp name; return f}
postfix name f = Postfix $ do {reservedOp name; return f}
binary name f assoc = Infix (do {reservedOp name; return f}) assoc
p = print . (parse parse_expr "input")
inputs = [
"1" , "1.0" ,
"-1" , "-1.0" ,
"1++" , "1.0--" ,
"1*2" , "1/2" ,
"1+2" , "1-2" ,
"(1+2)*2" , "(1-2)/2"
]
main = do
mapM_ p inputs