aptk

Write documented grammars.

class aptk.Grammar(s=None, **kargs)

Default grammar with basic tokens and rules.

This is the grammar from which you will usually derive your own grammars.

It provides the most common tokens:

SP   = \x20
NL   = \r?\n
LF   = \n
CR   = \r
CRLF = \r\n
ws   = \s+
ws?  = \s*

It also provides a general action map, which lets you connect your grammar to basic ParseActions:

:parse-action-map
    "$" make_string
    "@" make_list
    "%" make_dict
    "#" make_number
    "<" make_inherit
    ">" make_name
    "~" make_quoted
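
The action map above can be read as a lookup from the character in front of `=` in a rule definition to an action name. The following is a minimal sketch of that idea in plain Python, not aptk's implementation: the helper `implicit_action` and the pattern `RULE_LINE` are hypothetical names, and treating a sigil outside the map (such as the `:` in `:=`) as "no implicit action" is an assumption.

```python
import re

# Sigil -> action name, copied from the :parse-action-map directive above.
PARSE_ACTION_MAP = {
    "$": "make_string",
    "@": "make_list",
    "%": "make_dict",
    "#": "make_number",
    "<": "make_inherit",
    ">": "make_name",
    "~": "make_quoted",
}

# Hypothetical helper pattern: split "name X= body" into its three parts.
RULE_LINE = re.compile(r"^(?P<name>\S+)\s+(?P<sigil>\S)=\s*(?P<body>.*)$")


def implicit_action(rule_line):
    """Return (rule name, action name or None) for one rule definition line."""
    match = RULE_LINE.match(rule_line)
    if match is None:
        raise ValueError("not a rule definition: %r" % rule_line)
    # Assumption: a sigil that is not in the map selects no implicit action.
    return match.group("name"), PARSE_ACTION_MAP.get(match.group("sigil"))
```

For example, `implicit_action(r"integer #= \d+")` pairs the rule `integer` with `make_number` in this sketch.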

And the most common rules:

ident   $= [A-Za-z_\-][\w\-]*
number  #= [+-]?\d+(\.\d+)?
integer #= \d+
ws      $= \b{ws}\b|{ws?}
line    $= [^\n]*\n
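
To see what the rule bodies above accept, they can be tried directly with Python's `re` module. This is only a sketch: it assumes the rule bodies are ordinary Python regular expressions, it leaves out the `ws` rule because that one depends on token substitution, and the helper `matches` is a hypothetical name.

```python
import re

# Rule bodies copied verbatim from the listing above.
IDENT = re.compile(r"[A-Za-z_\-][\w\-]*")
NUMBER = re.compile(r"[+-]?\d+(\.\d+)?")
INTEGER = re.compile(r"\d+")
LINE = re.compile(r"[^\n]*\n")


def matches(pattern, text):
    """True if the whole text is matched by the pattern."""
    return pattern.fullmatch(text) is not None
```

Note that `integer` has no sign, so `matches(INTEGER, "-42")` is false, while `matches(NUMBER, "-42")` is true.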

It also defines how the arguments of BRANCH and EXPR are parsed:

:args-of BRANCH string capturing non-capturing regex

:args-of  EXPR string capturing non-capturing raw
            => aptk.oprec.OperatorPrecedenceParser

BRANCH(P, args, s=None, start=None, end=None)

Look ahead and branch into a rule.

Example:

branched := <BRANCH{
             "a"    <a-rule>
             [bcd]  <bcd-rule>
             a|b    <a-or-b-rule>
             <default-rule>
            }>
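
The lookahead idea behind the example above can be sketched in plain Python. This is not aptk's implementation: the names `BRANCHES`, `DEFAULT_RULE`, and `choose_rule` are hypothetical, and the sketch assumes that branches are tried in the order written and that the first matching lookahead wins.

```python
import re

# (lookahead pattern, rule name) pairs, mirroring the BRANCH example above.
BRANCHES = [
    (re.compile(re.escape("a")), "a-rule"),   # "a" is a literal string
    (re.compile(r"[bcd]"), "bcd-rule"),       # character class
    (re.compile(r"a|b"), "a-or-b-rule"),      # plain regex
]
DEFAULT_RULE = "default-rule"


def choose_rule(s, start=0):
    """Return the name of the rule to branch into at position start."""
    for pattern, rule in BRANCHES:
        # match() only looks ahead; nothing is consumed here.
        if pattern.match(s, start):
            return rule
    return DEFAULT_RULE
```

Under the first-match assumption, the `a|b` branch is never reached with these particular patterns, because `"a"` and `[bcd]` already cover both inputs; aptk may resolve branches differently.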

aptk.parse(s, grammar, actions=None, rule=None)

Parse s with the given grammar and apply actions to the produced lexemes.

aptk.ast(s, grammar=None, actions=None, rule=None)

Return the AST of s if it already has one; otherwise parse s using grammar and actions and return the resulting AST.

class aptk.parser.Parser(grammar, actions=None)

Combines an abstract grammar and parse actions into a parser that produces an abstract syntax tree.

If no actions are given, a ParseActions object is used by default.

class aptk.grammar.BaseGrammar(s=None, **kargs)

Most basic grammar class.

A grammar class has the following attributes:

__metaclass__

GrammarType - the type of a grammar class

_TOKENS_

A dictionary of token-parsing regexes, which can be used with {name} for the smart value and {:name:} for the unchanged value.

Smart value means that if you specify a token like:

token = abcd

You can still quantify the token without strange effects:

a-rule := foo{token}+

Will be translated to:

a-rule := foo(?:abcd)+

The other form of access:

b-rule := foo{:token:}+

Will be translated to:

b-rule := fooabcd+

You can use the second form, for example, to define character classes:

word-chars = A-Za-z0-9_
dash       = \-
ident      = [{:word-chars:}{:dash:}]+

The tokens are evaluated directly after a rule-part is read.
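
The two substitution forms described above can be sketched in a few lines of Python. This is an illustration of the documented translation, not aptk's own code: `TOKEN_REF` and `expand_tokens` are hypothetical names, and the pattern used to recognise token names is an assumption chosen so that regex quantifiers such as `{2,3}` are left alone.

```python
import re

# {:name:} is the unchanged value, {name} the smart (grouped) value.
# Assumption: token names start with a letter or underscore, may contain
# word characters and dashes, and may end in "?" (as in "ws?").
TOKEN_REF = re.compile(
    r"\{:(?P<raw>[A-Za-z_][\w\-]*\??):\}"
    r"|\{(?P<smart>[A-Za-z_][\w\-]*\??)\}"
)


def expand_tokens(rule_part, tokens):
    """Replace token references in rule_part using the tokens dictionary."""
    def replace(match):
        if match.group("raw") is not None:
            # Unchanged value: paste the token text as it is.
            return tokens[match.group("raw")]
        # Smart value: wrap in a non-capturing group so quantifiers
        # apply to the whole token.
        return "(?:%s)" % tokens[match.group("smart")]

    return TOKEN_REF.sub(replace, rule_part)
```

With `tokens = {"token": "abcd"}`, `expand_tokens("foo{token}+", tokens)` gives `foo(?:abcd)+` and `expand_tokens("foo{:token:}+", tokens)` gives `fooabcd+`, matching the translations shown above.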

_ACTIONS_

This dictionary maps rule names to action names. The actions are methods of either the ParseActions object passed to the parser or of the grammar. The map is created from implicit parse-action directives. Parse actions run when a MatchObject is lexed, and they populate the ast attribute of the resulting Lexem.

Implicit parse-actions are specified by _PARSE_ACTION_MAP_.

_START_RULE_

Name of the start rule if no other is given.

class aptk.grammar.Grammar(s=None, **kargs)

Default grammar with basic tokens and rules.

The tokens, action map, rules, and BRANCH method of this class are documented above under aptk.Grammar; the description is the same.

aptk.grammar.compile(input, type=None, name=None, extends=None, grammar=None, filename=None)

Compile a grammar.

You can pass different kinds of input to this function; the kind of input determines the return value.

1. input is a grammar class:

   class MyGrammar(Grammar):
       r"""This is my grammar class

       .. highlight:: aptk

       My grammar has the following rule::

           <foo> = "bar"
       """

   This is how compile() is usually invoked with a grammar class, because compile() is invoked by GrammarType.

2. Append whatever is defined in input to grammar:

   class MyGrammar(Grammar):
       r"""Here are rules defined"""

   ...

   compile("here are more rules", grammar=MyGrammar)

   input may be either a file object (something having a read() method) or a string.

3. Create a new grammar named name, which extends the grammars passed in the iterable extends. If you do not pass extends, your grammar will extend Grammar. The rules are extracted from input.

4. Simply compile input to a list of grammars:

   list_of_grammars = compile("""
       :grammar first
       some := <rule>

       :grammar second
       another := <rule>
   """)

   input may be either a file object (something having a read() method) or a string.

Parameters

input
    A grammar class, a string, or any object that has a read() method, e.g. a file object.
type
    Type of input, “sphinx” or “native”.
name
    Name of the grammar to be created, which will hold the rules given in input.
extends
    If you pass a name, you may also pass extends as a list of names of grammars to extend.
grammar
    If you pass a grammar class, the input is added to this grammar class.
filename
    For informational purposes only.

Returns

    A grammar class or (if no specific grammar is given in some way) a list of grammar classes.
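
The statement that input may be a string or anything with a read() method can be sketched as follows. This is a hypothetical helper, not part of aptk; the name `read_grammar_source` is an assumption, and the grammar-class case is deliberately left out because it is handled through GrammarType.

```python
import io


def read_grammar_source(input):
    """Return grammar text from a string or from an object with read()."""
    if hasattr(input, "read"):
        # File-like object: anything that offers a read() method.
        return input.read()
    if isinstance(input, str):
        return input
    raise TypeError("expected a string or an object with a read() method")


# Usage: both calls yield the same text.
text_from_string = read_grammar_source("some := <rule>")
text_from_file = read_grammar_source(io.StringIO("some := <rule>"))
```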

Operator Precedence Parser

Operator precedence parsers are intended to parse expressions in which no two non-terminals appear in a row (at most one at a time). Usually you will use one to parse (mathematical) expressions.

You can bring OperatorPrecedenceParser into your grammar with:

:args-of OPTABLE string capturing non-capturing raw
         => aptk.oprec.OperatorPrecedenceParser

Then you can create rules like this:

my_rule_name1 := <OPTABLE{
                   :rule T <.term>
                   ...
                 }>

my_rule_name2 := <OPTABLE{
                   :rule T <.term2>
                   :rule W ""
                   :rule E ...
                 }>

Every OPTABLE invocation creates a new rule.

In any grammar derived from Grammar, operator precedence parsing is accessible via the EXPR rule:

:grammar arithmetic-1

expr  := <EXPR{
           :flags with-ops

           :op L E+E
           :op L E-E  = E+E
           :op L E*E  > E+E
           :op L E/E  = E*E
           :op L E**E > E*E
           :op L E++  > E**E
           :op L ++E  = E++
           :op L (E)  > E++
       }>

expr2 := <EXPR{
           :op L E+E
           :op L E-E  = E+E
           :op L E*E  > E+E
           :op L E/E  = E*E
           :op L E**E > E*E
           :op L E++  > E**E
           :op L ++E  = E++
           :op L (E)  > E++
       }>

term       := <number>

<expr> ~~ 5 + 5 -> expr( E+E( number( '5' ), op( '+' ), number( '5' ) ) )
<expr> ~~ 1 + 2 + 3 -> expr( E+E( 
                         E+E(
                           number( '1' ), 
                           op( '+' ),
                           number( '2' ) 
                         ),
                         op( '+' ),
                         number( '3' )
                       ) )

<expr2> ~~ 5 + 5 * 4 -> expr2( E+E( number( '5' ), E*E( number( '5' ), number( '4' ) ) ) )
<expr2> ~~ 5**2 + 4**2/3**1 * 2 + 1 
       ->  expr2( E+E(
               E+E(
                 E**E( number( '5' ), number( '2' ) ), 
                 E*E(
                   E/E(
                     E**E( number( '4' ), number( '2' ) ),
                     E**E( number( '3' ), number( '1' ) )
                   ),
                   number( '2' )
                 )
               ),
               number( '1' )
           ) )

<expr2> ~~ 1*3+++++1 -> expr2( E+E( E*E( number( '1' ), E++( E++( number( '3' ) ) ) ), number( '1' ) ) )

<expr2> ~~ 1*3++ + ++1 -> expr2( E+E(
            E*E( number( '1' ), E++( number( '3' ) ) ),
            ++E( number( '1' ) )
        ) )


<expr2> ~~ 1*3+++(++1) -> expr2( E+E( E*E( number( '1' ), E++( number( '3' ) ) ), (E)( ++E( number( '1' ) ) ) ) )
              
<expr2> ~~ (1*3)++ -> expr2( E++(
                   (E)(
                     E*E(
                       number( '1' ),
                       number( '3' )
                     )
                   )
                ) )

prepostest1 := <EXPR{
               :op L ++E
               :op L E-- > ++E
              }>

<prepostest1> ~~ ++1-- -> prepostest1( ++E( E--( number( '1' ) ) ) )

prepostest2 := <EXPR{
               :op L ++E
               :op L E-- < ++E
              }>

<prepostest2> ~~ ++1-- -> prepostest2( E--( ++E( number( '1' ) ) ) )
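
How precedence and left associativity shape trees like the expr2 results above can be sketched with precedence climbing in plain Python. This is a swapped-in textbook technique, not the algorithm used by aptk.oprec.OperatorPrecedenceParser. It covers only the binary operators, and `PRECEDENCE`, `tokenize`, `parse_expr`, and `parse` are hypothetical names.

```python
import re

# Higher number binds tighter. Mirrors the ordering in the expr2 rule:
# E+E = E-E  <  E*E = E/E  <  E**E, all declared left-associative (L).
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "**": 3}

# A number, or an operator; "**" is listed before the single-character set.
TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|\*\*|[-+*/])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError("unexpected input at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_expr(tokens, pos, min_prec):
    """Parse operands and operators binding at least as tight as min_prec."""
    if pos >= len(tokens) or tokens[pos] in PRECEDENCE:
        raise SyntaxError("expected a number")
    left = ("number", tokens[pos])
    pos += 1
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
        op = tokens[pos]
        # Left-associative: the right operand must bind strictly tighter.
        right, pos = parse_expr(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = ("E%sE" % op, left, right)
    return left, pos


def parse(text):
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens, 0, 1)
    if pos != len(tokens):
        raise SyntaxError("unexpected token %r" % tokens[pos])
    return tree
```

The nested tuples play the role of the AST nodes shown above; like the expr2 results, they carry no separate op nodes.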

class aptk.grammar_tester.GrammarTest(name, op, pos, input, actions, expected, skip=None)

Simple class for storing test data.

class aptk.grammar_tester.GrammarTestCase(name, grammar_test, grammar)

A TestCase for a grammar.

class aptk.grammar_tester.RuleTest(name, op, pos, input, actions, expected, skip=None)

name specifies a rule

class aptk.grammar_tester.TokenTest(name, op, pos, input, actions, expected, skip=None)

name specifies a token

aptk.grammar_tester.generate_testsuite(grammar, suite=None, patterns=None)

Takes a grammar class and, optionally, a suite.

exception aptk.grammar_compiler.GrammarError(grammar_compiler, msg, **kargs)

Exception in grammar compilation.

This exception is raised if there is an error in grammar compilation.
