I've made really good progress on my LR(1) parser combinators.
The next step (which I'm feeling a little stuck on, mostly because I need more energy to actually figure it out, and I don't have it) is implementing the selective applicative functor bit.
Basically it allows for a finite amount of context sensitivity: as the parse proceeds, it gives the user's logic (built from the parser combinators) the ability to reject parses mid-parse.
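To make the idea concrete, here's a minimal Python sketch of that kind of mid-parse rejection. This is not the project's actual API; `select` and the predicate are hypothetical names, and real selective applicative functors are richer than this, but the core move is the same: the user's predicate gets to veto a partial parse result before the parser commits to it.

```python
# Sketch of "selective" rejection: the user's logic sees a partial parse
# result and can refute it, signalling the parser to drop this path.
# All names here are illustrative.

def select(parsed_value, predicate):
    """Return the value if the predicate accepts it, else None
    (meaning: reject this parse path)."""
    return parsed_value if predicate(parsed_value) else None

# e.g. accept an integer literal only if it fits in 8 bits
assert select(200, lambda n: n < 256) == 200
assert select(300, lambda n: n < 256) is None
```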
There's one big problem: LR parsing is nondeterministic from the perspective of the combinators, so you don't know up front which combinator to apply to which rule. That means the rejection logic can be a little inaccurate.
For cases where there isn't any ambiguity (no shift/reduce or reduce/reduce conflicts), it doesn't make much sense to keep trying to refute the matches. Just let the LR algorithm do its thing.
It's pretty common to write LR grammars with tons of ambiguity and resolve it with token precedence – it's just a way to prioritize between shift/reduce and reduce/reduce conflicts. I think this is called precedence operators, idk why (not the same thing as operator precedence).
But if you have that going on plus selective parsing that can reject cases, you have to apply the selective rejection first, and only then the precedence operators.
This whole project has been an interesting adventure in different layers of logic and such like this: how they communicate and dialogue.
Anyways, if you're bored and want to follow along in the code:
okay, so what I need is a stack of association lists from state items to acceptance/rejection functions, updated as the parse progresses.
each state item can have multiple acceptance functions associated with it, representing different paths through the combinator tree to arrive at the same grammar rule.
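Here's a rough Python sketch of what that bookkeeping could look like; the shapes and names (`push_state`, `accepts_for`, `(rule_id, dot_pos)` items) are my own illustration, not the project's code. The point of an association list rather than a plain map is exactly the multiplicity described above: one item, several functions, one per path through the combinator tree.

```python
# Hypothetical bookkeeping: alongside the LR state stack, keep a parallel
# stack of association lists from state items to acceptance functions.
# A list of pairs (not a dict) because the same item can appear with
# several functions. All names here are illustrative.

accept_stack = []  # one association list per LR state on the stack

def push_state(assoc):
    """assoc: list of ((rule_id, dot_pos), predicate) pairs."""
    accept_stack.append(assoc)

def accepts_for(item):
    """All acceptance functions recorded for `item` in the top state."""
    return [fn for (it, fn) in accept_stack[-1] if it == item]

push_state([((0, 1), lambda v: True),
            ((0, 1), lambda v: v != "bad")])  # same item, two paths
assert len(accepts_for((0, 1))) == 2
assert accepts_for((0, 1))[1]("ok")
```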
this is basically why it can't be baked into the state table (although it could be baked into a separate expanded state table maybe? to avoid the expensive operations ...)
I'll probably implement it the slow way first, and then find some way to sneak it in as additional data on the LR table itself (using monoids of course xD)
it should be pretty obvious how to take the top-level rule (always accept; parse the main nonterminal followed by EOF), compute its closure (add the rules for the main nonterminal, along with their acceptance functions for context sensitivity), and then advance through it (drop rules that don't mention the new symbol, push it through the remaining rules, and compute the closure again).
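The closure/advance cycle just described can be sketched over a toy grammar like so. This omits the acceptance functions (they would travel alongside each item) and uses a made-up grammar encoding, so treat it as a concrete restatement of the steps above, not the real implementation.

```python
# Toy closure/advance for LR items. An item is (lhs, rhs, dot).
# Grammar encoding and names are illustrative only.

GRAMMAR = {
    "S": [["E", "$"]],                    # top-level: main nonterminal, then EOF
    "E": [["E", "+", "n"], ["n"]],
}

def closure(items):
    """Add the rules for any nonterminal appearing right after a dot
    (in the real thing, their acceptance functions come along too)."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for (lhs, rhs, dot) in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                for prod in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], tuple(prod), 0)
                    if item not in items:
                        items.add(item)
                        changed = True
    return items

def advance(items, sym):
    """Drop items that don't expect `sym`, push the dot through the
    rest, then take the closure again."""
    moved = {(l, r, d + 1) for (l, r, d) in items
             if d < len(r) and r[d] == sym}
    return closure(moved)

start = closure({("S", ("E", "$"), 0)})
after_n = advance(start, "n")
assert ("E", ("n",), 1) in after_n
```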
whenever there's ambiguity, the parser will stop, apply all of the acceptance functions to reject rules (both shifting and reducing), then resolve any remaining ambiguity with precedence operators, then prefer shifts, and finally either arbitrarily choose a reduction or give up
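One possible encoding of that resolution order, with everything (action tuples, the `precedence` map, `resolve`) being illustrative names I've made up for the sketch:

```python
# Conflict resolution order: acceptance functions first, then declared
# precedence, then prefer shifts, then pick an arbitrary reduction.
# All names are illustrative.

def resolve(actions, value, precedence):
    """actions: list of ("shift" | "reduce", rule_id, accept_fn)."""
    # 1. apply the acceptance functions to reject candidate actions
    live = [a for a in actions if a[2](value)]
    if not live:
        return None                       # every path was refuted: give up
    # 2. resolve remaining ambiguity by declared precedence
    best = max(precedence.get(a[1], 0) for a in live)
    live = [a for a in live if precedence.get(a[1], 0) == best]
    # 3. prefer shifts
    shifts = [a for a in live if a[0] == "shift"]
    if shifts:
        return shifts[0]
    # 4. arbitrarily choose a reduction
    return live[0]

acts = [("shift", 1, lambda v: v != "bad"),
        ("reduce", 2, lambda v: True)]
assert resolve(acts, "ok", {})[0] == "shift"    # shift preferred
assert resolve(acts, "bad", {})[0] == "reduce"  # shift path was refuted
```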
or I could add GLR ...
I guess it's hard to compile context sensitivity into a table per se, because acceptance functions can reject, and that seems nasty to keep track of (exponential blowup?)
but I guess you could probably get away with an actual map from state items to acceptance functions, since they form a boolean algebra with True as a distinguished element, and the algebra handles the filtering ... sleepy
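A sketch of how that map might exploit the boolean algebra, under my own assumptions: alternative paths to the same item get combined with "or", and a missing entry defaults to the distinguished always-true function (no constraint). The combining choice and all names here are guesses for illustration, not the project's design.

```python
# Map from state items to acceptance functions, leaning on the boolean
# algebra: OR together alternative paths to the same item; an absent key
# means the always-true function. Illustrative only.

TRUE = lambda v: True   # distinguished element: no constraint attached

def either(f, g):
    """Accept if either path's predicate accepts."""
    return lambda v: f(v) or g(v)

table = {}  # state item -> combined acceptance function

def add_path(item, fn):
    table[item] = either(table[item], fn) if item in table else fn

add_path("item0", lambda v: v > 0)
add_path("item0", lambda v: v < -5)       # second path, OR'd in
assert table["item0"](3)                  # first path accepts
assert table["item0"](-10)                # second path accepts
assert not table["item0"](-1)             # both paths refute: filter it out
assert table.get("item1", TRUE)("anything")  # unconstrained item
```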
