-1. initial setup: two windows. LHS = source, RHS = repl goal of part 1: I want to type in "1 + 2*3 + 4" and have the answer, 11, calculated for me. which of course we can do at the repl and in various other calculator programs already. introduce trifecta: this is a parser combinator library. there are various ones around. Parsec is the one I was going to use and the code here looks basically the same, but I like the error messages better. lot of other stuff - encourage you to read through the haddocks, which I hope this tutorial will help you understand, even though I don't mention many of the combinators here. haddocks are in part "trifecta" proper, and part an accompanying package Text.Parser. 0. introduce `number` as a sequence of digits 0a - `digit` = oneOf "0123456789" - demonstrate this at the stack repl - we can parse symbols. we get failure on "A" and success on "3". can show what those failures look like at the repl. - ENVSKILL: can type things at repl - show type of digit at the repl - ENVSKILL: can look at type signatures and attempt to understand them 0b - `some digit` - we can pass parsers to functions to get new parsers 0c - apply a function to the result 0d - write this into the source file and reload - ENVSKILL: demonstrates ability to reload 1. addition - only an addition, not a number on its own number + number ** CAN NOW PARSE "1+2" but not "30" 2,3,4. addition or number - <|> committed-choice - but that doesn't work here addition <|> number commits incorrectly on addition, give a number number <|> addition commits incorrectly on number, given an addition - introduce `try` - (try addition) <|> number * CAN NOW PARSE "1+2", and "30" * CANNOT PARSE "1+2+3" 5,6. several additions - change number + number to a (recursive) addition and a number - CAUTION: we can only do right recursion here, not left recursion - luckily addition is usually (not always) associative so it doesn't matter which way we do it, numberswise - demonstrate at the command line that it hangs otherwise. - describe why - first thing we do is recurse into ourselves - forever - if we recurse on the right, we've definitely consumed a bit of the input before we - note that there's a combinator in trifecta that abstracts this - deliberately not using it so as to see the guts. - CAN NOW PARSE "1+2+3" - CANNOT PARSE "1 + 2 + 3" X whitespace: - some of these things in the input we can regard as tokens that should be allowed to be followed by whitespace - numbers and + signs - we can match whitespace with many . oneOf (seen before) - note 'many' is like 'some' but succeeds on no whitespace. - we can put `many . oneOf " \t"` after each token that we want to allow whitespace to follow - we can do that explicitly, or we can make a helper combinator to do that: - tok p :: Parser a -> Parser a tok act = act <* (many . oneOf "...") - demonstrate at commandline - add to source code as part of number and addition definitions. - CAN NOW PARSE "1 + 2 + 3" - CANNOT PARSE "1 + 2 * 3 + 4" Y multiplication - terminology here: "factor" - the things you multiply "term" - the things you add - they fit in a hierarchy - that is what operator precedence is - when you do multiplications "first" and then additions "afterwards". - so we do it the other way round: we figure out the additions, looking for terms as we go, and then as part of looking for terms, we figure out the factors. - side note - is there something to do all this in an expression language for us as a helper? I vaguely remember seeing something. Z. could look at step 8,9 error handling? to introduce and fail and some of that? potential rewrite of "addition" to look very different by always matching a number and then an optional + number afterwards. see `option` combinator - to always do an add but add 0 if no further term. Comment: it looks uglier in the code, but error messages look better for the user - more focused - we've been able to commit more. Note `try` as enemy of error messages - imagine a syntax error in your program that points at character 1 line 1 and says "well this isn't a program" End of part 1. Part 2. options for expanding parser: (make these as six cards) 2.0 `satisfy` - none of the basic parsers we combined here are primitive - they're built up out of `satisfy` and we can look at those definitions ourself quite straightforwardly. 2.1 Novel literals: 2.1.1 roman numerals (three steps: iiii, mmiii, mcm) 2.1.2 Pi - named constant 2.2 brackets - don't need a try as the first symbol distinguishes between a terminal and a recursive expression: either it's a ( or not. 2.3 negative numbers, and subtractions as an operator. - can discuss how '-' symbol is not ambiguous in the two contexts - can discuss implementing two operators with same precedence 2.4 hex 2.5 floats - a more complex example than one might expect. would be interesting to say "ok here are some examples, everyone spend 5 minutes trying themselves to get somewhere" - especially as we have integers already. and especially if need to implement 'e' notation. 2.6 Instead of direct evaluation, form a data structure representing the parse tree. (justify why: because we might want to do other things to the structure rather than evaluate it - for example, pretty print it / render it in some formula rendering system, run optimisations on it, compile it to assembly code for later execution - although most of these are a bit ridiculous for the level of complexity that we have)