plainbox.impl.xparsers – parsers for various plainbox formats

This module contains parsers for several formats that plainbox has to deal with. They are not full parsers (most of the formats can be handled with simple regular expressions) but rather simple top-down parsing snippets spread across a few classes.

What is interesting, though, is the set of classes, their relationships and their attributes, since knowing them helps when working with the code.

Node and Visitor

The base class for everything parsed is Node. It has two attributes, Node.lineno and Node.col_offset (mimicking the Python AST), and a visitor mechanism that is similar, but not identical, to the one in the Python AST. The precise way in which the visitor class operates is documented on Visitor. In general, application code can freely explore the AST but cannot modify it, as everything is strictly read-only.
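As a rough illustration of the read-only contract, the sketch below uses a frozen dataclass as a stand-in for Node. It does not import plainbox: SketchNode and try_modify are hypothetical names, and the real class is built on plainbox.impl.pod.POD rather than on dataclasses.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class SketchNode:
    """Hypothetical stand-in for Node: position info only, read-only."""
    lineno: int = 0      # line number of the parsed element
    col_offset: int = 0  # column offset (0-based)


def try_modify(node: SketchNode) -> bool:
    """Return True if assignment succeeded, False if the node refused it."""
    try:
        node.lineno = 99
    except FrozenInstanceError:
        return False
    return True


node = SketchNode(lineno=3, col_offset=4)
print(node.lineno, node.col_offset)
print(try_modify(node))
```

Reading the attributes works as usual, while any attempt to assign to them after construction is rejected.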

Regular expressions

We have to deal with regular expressions in many places, so there’s a dedicated AST node for handling them. The root class is Re, but it’s just a base for the three concrete sub-classes: ReErr, ReFixed and RePattern. ReErr is an error wrapper (used when the regular expression is incorrect and doesn’t work) and the other two (which also share a common base class, ReOk) can be used to do text matching. Since other parts of the code already contain optimizations for regular expressions that are just a plain string comparison, there is a special class, ReFixed, to highlight that fact.
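The three-way split can be sketched with the standard library alone. This is not plainbox's implementation: classify is a hypothetical helper, and the rule used here to detect a fixed string (no regular expression metacharacters in the text) is an assumption made for illustration.

```python
import re

# Assumption: any of these characters makes the text a "real" pattern.
_META = set(".^$*+?{}[]\\|()")


def classify(text: str) -> str:
    """Return 'error', 'fixed' or 'pattern' for a regular expression."""
    try:
        re.compile(text)
    except re.error:
        return "error"    # would correspond to ReErr
    if not (_META & set(text)):
        return "fixed"    # would correspond to ReFixed
    return "pattern"      # would correspond to RePattern


for sample in ("text", "pa[tT]ern", "+"):
    print(sample, "->", classify(sample))
```

The three samples mirror the ones used in the Re.parse() examples further down this page.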

White Lists

White lists are a poor man’s test plan: a list of regular expressions with optional comments. The root class is WhiteList, whose WhiteList.entries attribute contains a sequence of either Comment or a subclass of Re.

class plainbox.impl.xparsers.Comment(*args, **kwargs)

Bases: plainbox.impl.xparsers.Node

node representing a single comment

as_dict() → dict

Return the data in this POD as a dictionary.

Note

UNSET values are not added to the dictionary.

as_tuple() → tuple

Return the data in this POD as a tuple.

Order of elements in the tuple corresponds to the order of field declarations.
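The difference between the two conversions can be shown with a small stand-in. SketchPOD and this UNSET object are hypothetical and do not come from plainbox.impl.pod; in particular, defaulting missing fields to UNSET is an assumption of the sketch.

```python
class _Unset:
    """Sentinel type for 'no value was provided'."""

    def __repr__(self):
        return "UNSET"


UNSET = _Unset()


class SketchPOD:
    # Field names in declaration order (mirrors field_list on this page).
    field_names = ("lineno", "col_offset", "comment")

    def __init__(self, **kwargs):
        for name in self.field_names:
            setattr(self, name, kwargs.get(name, UNSET))

    def as_tuple(self):
        """All fields, in declaration order, UNSET included."""
        return tuple(getattr(self, name) for name in self.field_names)

    def as_dict(self):
        """Only the fields that carry a value."""
        return {
            name: getattr(self, name)
            for name in self.field_names
            if getattr(self, name) is not UNSET
        }


pod = SketchPOD(lineno=1, comment="# hi")
print(pod.as_tuple())
print(pod.as_dict())
```

In this sketch the tuple keeps a slot for every declared field, while the dictionary simply omits the ones that were never set.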

col_offset

Column offset (0-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
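The three assign filters listed above can be approximated with a plain Python descriptor. This is only a sketch: ConstNonNegativeInt, SketchNode and rejected are hypothetical names, and the exception types are assumptions, since this page does not say what plainbox raises.

```python
class ConstNonNegativeInt:
    """Hypothetical descriptor: int-only, non-negative, write-once."""

    def __set_name__(self, owner, name):
        self.slot = "_" + name  # per-instance storage key

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.slot)

    def __set__(self, obj, value):
        if self.slot in vars(obj):
            raise AttributeError("field is constant after initialization")
        if not isinstance(value, int):
            raise TypeError("value must be of type int")
        if value < 0:
            raise ValueError("value must not be negative")
        setattr(obj, self.slot, value)


class SketchNode:
    lineno = ConstNonNegativeInt()
    col_offset = ConstNonNegativeInt()

    def __init__(self, lineno, col_offset):
        self.lineno = lineno
        self.col_offset = col_offset


def rejected(action):
    """Run action() and return the name of the exception raised, or None."""
    try:
        action()
    except (AttributeError, TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


node = SketchNode(1, 0)
print(rejected(lambda: setattr(node, "lineno", 5)))
print(rejected(lambda: SketchNode("1", 0)))
print(rejected(lambda: SketchNode(1, -1)))
```

Each filter runs on assignment, so an invalid value never reaches the instance, and a second assignment is refused outright.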
comment

comment text, including any comment markers

Side effects of assign filters:
  • type-checked (value must be of type str)
  • constant (read-only after initialization)
enumerate_entries()
field_list = [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'comment'>]
lineno

Line number (1-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
namedtuple_cls

alias of Comment

visit(visitor: plainbox.impl.xparsers.Visitor)

Visit all of the sub-nodes reachable from this node

Parameters: visitor – Visitor object that gets to explore this and all the other nodes
Returns: The return value of the visitor’s Visitor.visit() method, if any. The default visitor doesn’t return anything.
class plainbox.impl.xparsers.Node(*args, **kwargs)

Bases: plainbox.impl.pod.POD

base node type

as_dict() → dict

Return the data in this POD as a dictionary.

Note

UNSET values are not added to the dictionary.

as_tuple() → tuple

Return the data in this POD as a tuple.

Order of elements in the tuple corresponds to the order of field declarations.

col_offset

Column offset (0-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
enumerate_entries()
field_list = [<Field name:'lineno'>, <Field name:'col_offset'>]
lineno

Line number (1-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
namedtuple_cls

alias of Node

visit(visitor: plainbox.impl.xparsers.Visitor)

Visit all of the sub-nodes reachable from this node

Parameters: visitor – Visitor object that gets to explore this and all the other nodes
Returns: The return value of the visitor’s Visitor.visit() method, if any. The default visitor doesn’t return anything.
class plainbox.impl.xparsers.Re(*args, **kwargs)

Bases: plainbox.impl.xparsers.Node

node representing a regular expression

as_dict() → dict

Return the data in this POD as a dictionary.

Note

UNSET values are not added to the dictionary.

as_tuple() → tuple

Return the data in this POD as a tuple.

Order of elements in the tuple corresponds to the order of field declarations.

col_offset

Column offset (0-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
enumerate_entries()
field_list = [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'text'>]
lineno

Line number (1-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
namedtuple_cls

alias of Re

static parse(text: str, lineno: int = 0, col_offset: int = 0) → plainbox.impl.xparsers.Re

Parse a bit of text and return a concrete subclass of Re

Parameters: text – The text to parse
Returns: If text is a correct regular expression then an instance of ReOk is returned. In practice exactly one of ReFixed or RePattern may be returned. If text is incorrect then an instance of ReErr is returned.

Examples:

>>> Re.parse("text")
ReFixed(text='text')
>>> Re.parse("pa[tT]ern")
RePattern(text='pa[tT]ern', re=re.compile('pa[tT]ern'))
>>> from sre_constants import error
>>> Re.parse("+")
ReErr(text='+', exc=error('nothing to repeat',))
text

Text of the regular expression (perhaps invalid)

Side effects of assign filters:
  • type-checked (value must be of type str)
  • constant (read-only after initialization)
visit(visitor: plainbox.impl.xparsers.Visitor)

Visit all of the sub-nodes reachable from this node

Parameters: visitor – Visitor object that gets to explore this and all the other nodes
Returns: The return value of the visitor’s Visitor.visit() method, if any. The default visitor doesn’t return anything.
class plainbox.impl.xparsers.ReErr(*args, **kwargs)

Bases: plainbox.impl.xparsers.Re

node representing an incorrect regular expression

as_dict() → dict

Return the data in this POD as a dictionary.

Note

UNSET values are not added to the dictionary.

as_tuple() → tuple

Return the data in this POD as a tuple.

Order of elements in the tuple corresponds to the order of field declarations.

col_offset

Column offset (0-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
enumerate_entries()
exc

exception describing the problem

Side effects of assign filters:
  • type-checked (value must be of type Exception)
  • constant (read-only after initialization)
field_list = [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'text'>, <Field name:'exc'>]
lineno

Line number (1-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
namedtuple_cls

alias of ReErr

parse(text: str, lineno: int = 0, col_offset: int = 0) → plainbox.impl.xparsers.Re

Parse a bit of text and return a concrete subclass of Re

Parameters: text – The text to parse
Returns: If text is a correct regular expression then an instance of ReOk is returned. In practice exactly one of ReFixed or RePattern may be returned. If text is incorrect then an instance of ReErr is returned.

Examples:

>>> Re.parse("text")
ReFixed(text='text')
>>> Re.parse("pa[tT]ern")
RePattern(text='pa[tT]ern', re=re.compile('pa[tT]ern'))
>>> from sre_constants import error
>>> Re.parse("+")
ReErr(text='+', exc=error('nothing to repeat',))
text

Text of the regular expression (perhaps invalid)

Side effects of assign filters:
  • type-checked (value must be of type str)
  • constant (read-only after initialization)
visit(visitor: plainbox.impl.xparsers.Visitor)

Visit all of the sub-nodes reachable from this node

Parameters: visitor – Visitor object that gets to explore this and all the other nodes
Returns: The return value of the visitor’s Visitor.visit() method, if any. The default visitor doesn’t return anything.
class plainbox.impl.xparsers.ReFixed(*args, **kwargs)

Bases: plainbox.impl.xparsers.ReOk

node representing a trivial regular expression (fixed string)

as_dict() → dict

Return the data in this POD as a dictionary.

Note

UNSET values are not added to the dictionary.

as_tuple() → tuple

Return the data in this POD as a tuple.

Order of elements in the tuple corresponds to the order of field declarations.

col_offset

Column offset (0-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
enumerate_entries()
field_list = [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'text'>]
lineno

Line number (1-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
match(text: str) → bool
namedtuple_cls

alias of ReFixed

parse(text: str, lineno: int = 0, col_offset: int = 0) → plainbox.impl.xparsers.Re

Parse a bit of text and return a concrete subclass of Re

Parameters: text – The text to parse
Returns: If text is a correct regular expression then an instance of ReOk is returned. In practice exactly one of ReFixed or RePattern may be returned. If text is incorrect then an instance of ReErr is returned.

Examples:

>>> Re.parse("text")
ReFixed(text='text')
>>> Re.parse("pa[tT]ern")
RePattern(text='pa[tT]ern', re=re.compile('pa[tT]ern'))
>>> from sre_constants import error
>>> Re.parse("+")
ReErr(text='+', exc=error('nothing to repeat',))
text

Text of the regular expression (perhaps invalid)

Side effects of assign filters:
  • type-checked (value must be of type str)
  • constant (read-only after initialization)
visit(visitor: plainbox.impl.xparsers.Visitor)

Visit all of the sub-nodes reachable from this node

Parameters: visitor – Visitor object that gets to explore this and all the other nodes
Returns: The return value of the visitor’s Visitor.visit() method, if any. The default visitor doesn’t return anything.
class plainbox.impl.xparsers.ReOk(*args, **kwargs)

Bases: plainbox.impl.xparsers.Re

node representing a correct regular expression

as_dict() → dict

Return the data in this POD as a dictionary.

Note

UNSET values are not added to the dictionary.

as_tuple() → tuple

Return the data in this POD as a tuple.

Order of elements in the tuple corresponds to the order of field declarations.

col_offset

Column offset (0-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
enumerate_entries()
field_list = [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'text'>]
lineno

Line number (1-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
match(text: str) → bool

check if the given text matches the expression

This method is provided by all of the subclasses of ReOk. Sometimes the implementation is faster than a naive regular expression match.

>>> Re.parse("foo").match("foo")
True
>>> Re.parse("foo").match("f")
False
>>> Re.parse("[fF]oo").match("foo")
True
>>> Re.parse("[fF]oo").match("Foo")
True
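The reason a subclass can be faster is that a fixed string needs no regular expression engine at all. The sketch below contrasts the two strategies using only the standard library; match_fixed and match_pattern are hypothetical helpers, and anchoring the pattern with re.match is an assumption rather than confirmed plainbox behaviour.

```python
import re


def match_fixed(expr: str, text: str) -> bool:
    """Fixed-string case: a plain equality test is enough."""
    return text == expr


def match_pattern(compiled: "re.Pattern", text: str) -> bool:
    """General case: delegate to the compiled regular expression."""
    return compiled.match(text) is not None


print(match_fixed("foo", "foo"))
print(match_fixed("foo", "f"))
print(match_pattern(re.compile("[fF]oo"), "Foo"))
```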
namedtuple_cls

alias of ReOk

parse(text: str, lineno: int = 0, col_offset: int = 0) → plainbox.impl.xparsers.Re

Parse a bit of text and return a concrete subclass of Re

Parameters: text – The text to parse
Returns: If text is a correct regular expression then an instance of ReOk is returned. In practice exactly one of ReFixed or RePattern may be returned. If text is incorrect then an instance of ReErr is returned.

Examples:

>>> Re.parse("text")
ReFixed(text='text')
>>> Re.parse("pa[tT]ern")
RePattern(text='pa[tT]ern', re=re.compile('pa[tT]ern'))
>>> from sre_constants import error
>>> Re.parse("+")
ReErr(text='+', exc=error('nothing to repeat',))
text

Text of the regular expression (perhaps invalid)

Side effects of assign filters:
  • type-checked (value must be of type str)
  • constant (read-only after initialization)
visit(visitor: plainbox.impl.xparsers.Visitor)

Visit all of the sub-nodes reachable from this node

Parameters: visitor – Visitor object that gets to explore this and all the other nodes
Returns: The return value of the visitor’s Visitor.visit() method, if any. The default visitor doesn’t return anything.
class plainbox.impl.xparsers.RePattern(*args, **kwargs)

Bases: plainbox.impl.xparsers.ReOk

node representing a regular expression pattern

as_dict() → dict

Return the data in this POD as a dictionary.

Note

UNSET values are not added to the dictionary.

as_tuple() → tuple

Return the data in this POD as a tuple.

Order of elements in the tuple corresponds to the order of field declarations.

col_offset

Column offset (0-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
enumerate_entries()
field_list = [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'text'>, <Field name:'re'>]
lineno

Line number (1-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
match(text: str) → bool
namedtuple_cls

alias of RePattern

parse(text: str, lineno: int = 0, col_offset: int = 0) → plainbox.impl.xparsers.Re

Parse a bit of text and return a concrete subclass of Re

Parameters: text – The text to parse
Returns: If text is a correct regular expression then an instance of ReOk is returned. In practice exactly one of ReFixed or RePattern may be returned. If text is incorrect then an instance of ReErr is returned.

Examples:

>>> Re.parse("text")
ReFixed(text='text')
>>> Re.parse("pa[tT]ern")
RePattern(text='pa[tT]ern', re=re.compile('pa[tT]ern'))
>>> from sre_constants import error
>>> Re.parse("+")
ReErr(text='+', exc=error('nothing to repeat',))
re

regular expression object

Side effects of assign filters:
  • type-checked (value must be of type SRE_Pattern)
  • constant (read-only after initialization)
text

Text of the regular expression (perhaps invalid)

Side effects of assign filters:
  • type-checked (value must be of type str)
  • constant (read-only after initialization)
visit(visitor: plainbox.impl.xparsers.Visitor)

Visit all of the sub-nodes reachable from this node

Parameters: visitor – Visitor object that gets to explore this and all the other nodes
Returns: The return value of the visitor’s Visitor.visit() method, if any. The default visitor doesn’t return anything.
class plainbox.impl.xparsers.Visitor

Bases: object

Class assisting in traversing Node trees.

This class can be used to explore the AST of any of the plainbox-parsed text formats. The way to use this class is to create a custom sub-class of Visitor and to define methods that correspond to the classes of nodes one is interested in.

Example:

>>> class Text(Node):
...     text = F("text", str)

>>> class Group(Node):
...     items = F("items", list)
>>> class demo_visitor(Visitor):
...     def visit_Text_node(self, node: Text):
...         print("visiting text node: {}".format(node.text))
...         return self.generic_visit(node)
...     def visit_Group_node(self, node: Group):
...         print("visiting list node")
...         return self.generic_visit(node)
>>> Group(items=[
...     Text(text="foo"), Text(text="bar")
... ]).visit(demo_visitor())
visiting list node
visiting text node: foo
visiting text node: bar
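The name-based dispatch that the example relies on can be sketched without plainbox. Everything below is a simplified stand-in: SketchNode, SketchVisitor and the children() helper are hypothetical, and only the visit_<ClassName>_node naming convention is taken from the example above.

```python
class SketchNode:
    def children(self):
        """Sub-nodes reachable from this node (none by default)."""
        return []

    def visit(self, visitor):
        return visitor.visit(self)


class Text(SketchNode):
    def __init__(self, text):
        self.text = text


class Group(SketchNode):
    def __init__(self, items):
        self.items = items

    def children(self):
        return list(self.items)


class SketchVisitor:
    def visit(self, node):
        # Look for visit_<ClassName>_node, fall back to generic_visit.
        name = "visit_{}_node".format(type(node).__name__)
        method = getattr(self, name, self.generic_visit)
        return method(node)

    def generic_visit(self, node):
        for child in node.children():
            self.visit(child)


class Collector(SketchVisitor):
    def __init__(self):
        self.seen = []

    def visit_Text_node(self, node):
        self.seen.append(node.text)
        return self.generic_visit(node)


collector = Collector()
Group([Text("foo"), Text("bar")]).visit(collector)
print(collector.seen)
```

Group has no dedicated method in Collector, so it falls through to generic_visit, which is what carries the traversal down to the Text nodes.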
generic_visit(node: plainbox.impl.xparsers.Node) → None

visit method called on nodes without a dedicated visit method

visit()

visit the specified node

class plainbox.impl.xparsers.WhiteList(*args, **kwargs)

Bases: plainbox.impl.xparsers.Node

node representing a whole plainbox whitelist

as_dict() → dict

Return the data in this POD as a dictionary.

Note

UNSET values are not added to the dictionary.

as_tuple() → tuple

Return the data in this POD as a tuple.

Order of elements in the tuple corresponds to the order of field declarations.

col_offset

Column offset (0-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
entries

a list of comments and patterns

Side effects of assign filters:
  • type-checked (value must be of type list)
  • type-checked sequence (items must be of type Node)
  • constant (read-only after initialization)
enumerate_entries()
field_list = [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'entries'>]
lineno

Line number (1-based)

Side effects of assign filters:
  • type-checked (value must be of type int)
  • not negative
  • constant (read-only after initialization)
namedtuple_cls

alias of WhiteList

static parse(text: str, lineno: int = 1, col_offset: int = 0) → plainbox.impl.xparsers.WhiteList

Parse a plainbox whitelist

An empty string is still a valid (though empty) whitelist:

>>> WhiteList.parse("")
WhiteList(entries=[])

White space is irrelevant and gets ignored if it’s not of any semantic value. Since whitespace was never a part of the de-facto allowed pattern syntax, one cannot create a job with " ".

>>> WhiteList.parse("   ")
WhiteList(entries=[])

As soon as there’s something interesting though, it starts to have meaning. Note that we differentiate the raw text ' a ' from the pattern object it represents, '^namespace::a$', but at this time, when we parse the text, this contextual, semantic information is not available and is not a part of the AST.

>>> WhiteList.parse(" data ")
WhiteList(entries=[ReFixed(text=' data ')])

Data gets separated into line-based records. Any number of lines may exist in a single whitelist.

>>> WhiteList.parse("line")
WhiteList(entries=[ReFixed(text='line')])
>>> WhiteList.parse("line 1\nline 2\n")
WhiteList(entries=[ReFixed(text='line 1'), ReFixed(text='line 2')])

Empty lines are just ignored. You can re-create them by observing gaps in the values of the lineno field.

>>> WhiteList.parse("line 1\n\nline 3\n")
WhiteList(entries=[ReFixed(text='line 1'), ReFixed(text='line 3')])
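Recovering the skipped lines from lineno gaps could look like the following. missing_lines is a hypothetical helper that takes the lineno values of consecutive entries; it does not depend on plainbox.

```python
def missing_lines(linenos):
    """Return the line numbers skipped between consecutive entries."""
    gaps = []
    for prev, cur in zip(linenos, linenos[1:]):
        # Entries on the same or adjacent lines contribute nothing.
        gaps.extend(range(prev + 1, cur))
    return gaps


# Entries found on lines 1 and 3, as in the example above.
print(missing_lines([1, 3]))
```

Blank lines before the first entry or after the last one leave no gap between entries, so this approach cannot detect them.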

Data can be mixed with comments. Note that col_offset is finally non-zero here, as the comment starts four characters into the line:

>>> WhiteList.parse("foo # pick foo")
... 
WhiteList(entries=[ReFixed(text='foo '),
                   Comment(comment='# pick foo')])

Comments can also exist without any data:

>>> WhiteList.parse("# this is a comment")
WhiteList(entries=[Comment(comment='# this is a comment')])

Lastly, no exceptions are ever raised at this stage. Broken patterns are represented as ReErr nodes instead:

>>> WhiteList.parse("[]")
... 
WhiteList(entries=[ReErr(text='[]', exc=error('un...',))])
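The line-based behaviour shown in these examples can be approximated with a few lines of standard-library code. This is a sketch under assumptions, not the plainbox parser: parse_whitelist is a hypothetical function, entries are plain tuples of (kind, lineno, col_offset, text), and every "#" is treated as the start of a comment, which may not hold for patterns that contain that character.

```python
def parse_whitelist(text):
    """Split whitelist text into pattern and comment entries."""
    entries = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        data, hash_mark, rest = line.partition("#")
        if data.strip():
            # Keep the raw text, surrounding whitespace included.
            entries.append(("pattern", lineno, 0, data))
        if hash_mark:
            # The comment keeps its marker and starts where the data ends.
            entries.append(("comment", lineno, len(data), hash_mark + rest))
    return entries


print(parse_whitelist("foo # pick foo"))
print(parse_whitelist("line 1\n\nline 3\n"))
```

Blank and whitespace-only lines produce no entry, but they still advance the line counter, which is why the second call reports lines 1 and 3.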
visit(visitor: plainbox.impl.xparsers.Visitor)

Visit all of the sub-nodes reachable from this node

Parameters: visitor – Visitor object that gets to explore this and all the other nodes
Returns: The return value of the visitor’s Visitor.visit() method, if any. The default visitor doesn’t return anything.