git @ Cat's Eye Technologies The-Glosscubator / master by-topic / Parsing / commentary / Chris Pressey.md
master

Tree @master (Download .tar.gz)

Chris Pressey.md @masterview markup · raw · history · blame

Commentary by Chris Pressey

This work is distributed under a CC-BY-ND-4.0 license, with the following explicit exception: the ratings may be freely used for any purpose with no limitations.

Parsing

Parsing as Deduction

  • rating: 2

See also: [Definite clause grammar][], [Chart parser][], [Earley Deduction][].

They say something interesting about the "proof procedure" in Prolog:

it has the same theoretical problems of the top-down backtrack parsing algorithms after which it is modeled.

Are they really saying Horn clause resolution was modelled after parsing procedures?

They note that Earley Deduction is a natural generalization of Earley parsing but that it's challenging to make it as efficient as Prolog's "proof procedure". In particular,

It is very costly to implement subsumption in its full generality.

When they say "definite clauses are a simple subset of first-order logic", how do they mean "subset"? Definite clauses (Horn clauses) may be syntactically a subset of what we usually think of as the language of first order logic, but they have the same expressive power. So, they must mean syntax.

A parser based on definite clauses is a subset in terms of expressive power because one of the arguments is monotonic -- the input is always consumed -- thus the deduction always terminates.

There is a good short description of chart parsing on page 3.

Definite clause grammar is to CFG as Earley deduction is to Earley parsing. (Is it? The point is that the parsing-function is a sub-function of the deduction-function.)

On complexity, they use the phrase "offline-parseable", which a grammar can have if it is not "infinitely ambiguous". I'm not sure what that means. But if a DCG is offline-parseable then it is decidable. This surely corresponds to (a) non-infinite amounts of ambiguity and (b) the input is finite and always consumed.

They also note that unrestricted DCGs are Turing-complete and LFGs (lexical-functional grammars, a unification-based formalism) can encode "exponentially hard" languages (probably they mean EXPTIME-complete, or something similar.)

Principles and Implementation of Deductive Parsing

  • rating: 1

.

Taming Context-Sensitive Languages with Principled Stateful Parsing

  • rating: 1

It's a slightly annoying paper in many ways; but some of the ideas are good.

[Context-sensitive features] are traditionally handled with ad-hoc code (e.g., custom lexers), outside of the scope of parsing theory.

True dat. I too would love to bring them into parsing theory (or rather, parsing-and-generating theory.)

Stateful parsing is a crude but effective way to implement a parser for a context-sensitive language. (Implementing a CSG straight-up is far too inefficient.) "Crude" is on the level of global variables though. The "principled" part is to make it less crude.

They don't like the idea of explicitly threading state through the grammar (and/or parser code); as this increases the dependencies; so they prefer this more implicit, global-variable-like setup.

This seems similar to how people like "implicit arguments" and monads in functional programming.

The "principled" part is largely about managing the state using transactional operations.

each parser invocation either succeeds or leaves the state untouched

But... isn't the transactional approach only required because the state is treated as updatable? Wouldn't a purely functional approach (using immutable data) simply discard the altered state and revert to the previous state? Or rather, it would be, "each parser invocation either succeeds and returns a new state, or fails and returns nothing", where it's understood the state is untouched.

So they don't appear to be thinking of functional programming (or "persistent" data structures) here, but charging ahead with mutable state.

Also: in my ideas for applications for this at least, wouldn't it sometimes to be valuable to send information back to the choice point on failure?

(Need to double-check this paper/impl: does it use ordered choice like PEG?)

research!rsc: Yacc is Not Dead

  • rating: 1

.

Yacc is dead: An update

  • rating: 2

.