Parsing and lexing

Introduction

Warning

This extension is EXPERIMENTAL. The behaviour of this extension including the names of its functions and any other documentation surrounding this extension may change without notice in a future release of PHP. This extension should be used at your own risk.

The parle extension provides general-purpose lexing and parsing facilities. The implementation is based on the lexertl and parsertl libraries and requires a C++14 capable compiler. The lexer is based on regex matching; the parser is LALR(1). Lexers and parsers are generated on the fly and can be used as soon as they have been finalized. Parle handles only the lexing and parsing itself; representing and processing the resulting data structures is left to the implementer. Serialization and code generation are not yet supported by the extension.

Lexical analysis is the process of splitting a character sequence into a list of lexemes. The lexeme list can then be used for syntax analysis against a formal grammar. These operations are also known as lexing and parsing. This documentation does not aim to cover lexing and parsing exhaustively; good information on both is available from numerous resources on the net. Several usage examples are included to show the functionality. The extension is useful for PHP programmers who want to learn about parsing and lexing as well as for those who want to apply them. State machines and grammar parsers don't have to be implemented manually, because parle takes over these complex tasks. Development can therefore focus on solving the actual problem.
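To make the lexing step concrete, here is a minimal sketch that splits a comma-separated line of integers into tokens. It assumes the parle extension is loaded; the function name tokenizeCsvLine, the token ids and the token names are chosen for illustration only and are not part of the extension.

```php
<?php

use Parle\{Lexer, LexerException, Token};

/**
 * Hypothetical helper: split a line such as "1,22,333" into
 * [name, value] pairs using Parle\Lexer.
 */
function tokenizeCsvLine(string $input): array
{
    // Illustrative token ids; 0 is avoided because Token::EOI uses it.
    $names = [1 => "COMMA", 2 => "DECIMAL"];

    $lex = new Lexer;
    $lex->push('[,]', 1);  // a single comma
    $lex->push('\d+', 2);  // one or more digits
    $lex->build();         // finalize the rule set

    $lex->consume($input);

    $out = [];
    while (true) {
        $lex->advance();
        $tok = $lex->getToken();

        if (Token::EOI == $tok->id) {
            break; // end of input reached
        }
        if (Token::UNKNOWN == $tok->id) {
            throw new LexerException("Unknown token '{$tok->value}'.");
        }
        $out[] = [$names[$tok->id], $tok->value];
    }

    return $out;
}

// Usage, guarded so the script does not fail fatally without the extension.
if (extension_loaded('parle')) {
    foreach (tokenizeCsvLine("1,22,333") as [$name, $value]) {
        echo $name, ": ", $value, PHP_EOL;
    }
} else {
    echo "The parle extension is not loaded.", PHP_EOL;
}
```

The resulting list of pairs is the kind of lexeme list that would then be handed to a parser or processed directly.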

The common use case for parle is a data format that is too complex to be handled by regex matching with PCRE. The range of practical applications is accordingly wide: a specific data format, a behavior modification of existing functions, or even a custom programming language. Helper methods are available for debugging: Parle\Lexer::dump() inspects the generated state machine, and Parle\Parser::dump() inspects the generated grammar. The method Parle\Parser::trace() can also be used to track the parsing operation.
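As an illustration of combining a grammar with a lexer, the following sketch checks whether a string is a sum of integers such as "1 + 2 + 3". It assumes the parle extension is loaded; the function name isValidSum and the grammar itself are made up for this example and should be read as one possible setup, not a reference implementation.

```php
<?php

use Parle\{Lexer, Parser, Token};

/**
 * Hypothetical helper: return true if $input matches the
 * illustrative grammar  sum: sum '+' INTEGER | INTEGER
 */
function isValidSum(string $input): bool
{
    $parser = new Parser;
    $parser->token("INTEGER");  // declare a named token
    $parser->left("'+'");       // '+' is left-associative
    $parser->push("start", "sum");
    $parser->push("sum", "sum '+' INTEGER");
    $parser->push("sum", "INTEGER");
    $parser->build();           // finalize the grammar rules

    // Uncomment to inspect the generated grammar while debugging:
    // $parser->dump();

    $lex = new Lexer;
    $lex->push("[+]", $parser->tokenId("'+'"));
    $lex->push("\\d+", $parser->tokenId("INTEGER"));
    $lex->push("\\s+", Token::SKIP); // ignore whitespace
    $lex->build();

    return $parser->validate($input, $lex);
}

// Usage, guarded so the script does not fail fatally without the extension.
if (extension_loaded('parle')) {
    var_dump(isValidSum("1 + 2 + 3"));
    var_dump(isValidSum("1 + + 3"));
} else {
    echo "The parle extension is not loaded.", PHP_EOL;
}
```

Taking the lexer's token ids from Parle\Parser::tokenId() keeps the two components in agreement without hard-coded numbers.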

Installing/Configuring
- Requirements
- Installation
Predefined Constants
Pattern matching — Parle pattern matching
Examples
- Lexer examples
- Parser examples
Parle\Lexer — The Parle\Lexer class
- Parle\Lexer::advance — Process next lexer rule
- Parle\Lexer::build — Finalize the lexer rule set
- Parle\Lexer::callout — Define token callback
- Parle\Lexer::consume — Pass the data for processing
- Parle\Lexer::dump — Dump the state machine
- Parle\Lexer::getToken — Retrieve the current token
- Parle\Lexer::insertMacro — Insert regex macro
- Parle\Lexer::push — Add a lexer rule
- Parle\Lexer::reset — Reset lexer
Parle\RLexer — The Parle\RLexer class
- Parle\RLexer::advance — Process next lexer rule
- Parle\RLexer::build — Finalize the lexer rule set
- Parle\RLexer::callout — Define token callback
- Parle\RLexer::consume — Pass the data for processing
- Parle\RLexer::dump — Dump the state machine
- Parle\RLexer::getToken — Retrieve the current token
- Parle\RLexer::insertMacro — Insert regex macro
- Parle\RLexer::push — Add a lexer rule
- Parle\RLexer::pushState — Push a new start state
- Parle\RLexer::reset — Reset lexer
Parle\Parser — The Parle\Parser class
- Parle\Parser::advance — Process next parser rule
- Parle\Parser::build — Finalize the grammar rules
- Parle\Parser::consume — Consume the data for processing
- Parle\Parser::dump — Dump the grammar
- Parle\Parser::errorInfo — Retrieve the error information
- Parle\Parser::left — Declare a token with left-associativity
- Parle\Parser::nonassoc — Declare a token with no associativity
- Parle\Parser::precedence — Declare a precedence rule
- Parle\Parser::push — Add a grammar rule
- Parle\Parser::reset — Reset parser state
- Parle\Parser::right — Declare a token with right-associativity
- Parle\Parser::sigil — Retrieve a matching part of a rule
- Parle\Parser::sigilCount — Number of elements in matched rule
- Parle\Parser::sigilName — Retrieve a rule or token name
- Parle\Parser::token — Declare a token
- Parle\Parser::tokenId — Get token id
- Parle\Parser::trace — Trace the parser operation
- Parle\Parser::validate — Validate input
Parle\RParser — The Parle\RParser class
- Parle\RParser::advance — Process next parser rule
- Parle\RParser::build — Finalize the grammar rules
- Parle\RParser::consume — Consume the data for processing
- Parle\RParser::dump — Dump the grammar
- Parle\RParser::errorInfo — Retrieve the error information
- Parle\RParser::left — Declare a token with left-associativity
- Parle\RParser::nonassoc — Declare a token with no associativity
- Parle\RParser::precedence — Declare a precedence rule
- Parle\RParser::push — Add a grammar rule
- Parle\RParser::reset — Reset parser state
- Parle\RParser::right — Declare a token with right-associativity
- Parle\RParser::sigil — Retrieve a matching part of a rule
- Parle\RParser::sigilCount — Number of elements in matched rule
- Parle\RParser::sigilName — Retrieve a rule or token name
- Parle\RParser::token — Declare a token
- Parle\RParser::tokenId — Get token id
- Parle\RParser::trace — Trace the parser operation
- Parle\RParser::validate — Validate input
Parle\Stack — The Parle\Stack class
- Parle\Stack::pop — Pop an item from the stack
- Parle\Stack::push — Push an item into the stack
Parle\Token — The Parle\Token class
Parle\ErrorInfo — The Parle\ErrorInfo class
Parle\LexerException — The Parle\LexerException class
Parle\ParserException — The Parle\ParserException class