HumanQL
Description
Human Query Language for full text search engines. Provides a lenient parser and associated tools for a self-contained, search-engine-agnostic query language suitable for use by end users. It is lenient in that it will produce a parse tree for any input, given a default operator and by generally ignoring any unparsable syntax. It is suitable for end users in that it supports potentially several operator variants and a query language not unlike those of some major web search and other commercial search engines.
The query language supports the following features at a high level:
- Boolean operators: AND (infix), OR (infix), NOT (prefix) with an implied default operator and precedence rules, e.g. “boy or girl -infant”
- Optional parentheses for explicitly denoting precedence.
- Quoted phrases (for proximity matching)
- Declarable prefix scopes, e.g. “TITLE:(car or bike)”
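To make the idea of lenient parsing concrete, here is a rough, self-contained sketch covering only a small subset of these features: bare terms, quoted phrases, the `-` prefix form of NOT, and an implied default operator. It is not the HumanQL implementation. The method name `lenient_parse` and the nested-array tree shape are assumptions made for illustration, and infix operators, parentheses for precedence, and scopes are deliberately left out.

```ruby
# Toy lenient parser (hypothetical; NOT HumanQL::QueryParser).
# Anything it cannot interpret (stray quotes, parentheses, a lone "-")
# is silently dropped instead of raising an error.
def lenient_parse(input, default_op: :and)
  # A token is either a complete quoted phrase (optionally negated)
  # or a run of characters that are not whitespace, quotes or parentheses.
  tokens = input.scan(/-?"[^"]*"|[^\s"()]+/)

  nodes = tokens.map do |tok|
    negated = tok.start_with?('-')
    tok = tok[1..-1] if negated

    node =
      if tok.start_with?('"')
        words = tok.delete('"').split
        words.empty? ? nil : [:phrase, *words]
      else
        tok.empty? ? nil : tok
      end

    next nil if node.nil?
    negated ? [:not, node] : node
  end.compact

  # Join whatever survived with the default operator.
  case nodes.size
  when 0 then nil
  when 1 then nodes.first
  else [default_op, *nodes]
  end
end

p lenient_parse('boy "big dog" -cat )(')
# prints: [:and, "boy", [:phrase, "big", "dog"], [:not, "cat"]]
p lenient_parse('foo "bar')
# prints: [:and, "foo", "bar"]
```

Note that the unbalanced quote and the stray parentheses never cause a failure; they simply do not contribute to the tree.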
The main components are each highly customizable:
HumanQL::QueryParser parses any arbitrary input string and outputs an Abstract Syntax Tree (AST).
HumanQL::TreeNormalizer normalizes and imposes limits on an AST, e.g. to avoid pathological queries.
HumanQL::QueryGenerator generates a Human Query Language string from an AST.
HumanQL::PostgreSQLGenerator generates strings from an AST that are suitable for passing to PostgreSQL’s to_tsquery function.
Other generators are possible.
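As a hedged illustration of what a generator does, the following self-contained sketch walks a hand-built tree and emits a string in PostgreSQL's tsquery operator syntax (`&`, `|`, `!`, `<->`). It does not use or reproduce HumanQL::PostgreSQLGenerator. The method name `to_tsquery_string` and the nested-array AST shape are assumptions, and terms are not escaped here, which a real generator would need to do.

```ruby
# Hypothetical AST walker (NOT HumanQL::PostgreSQLGenerator).
# Assumed AST shape: a String is a term; an Array is [operator, *operands].
def to_tsquery_string(node)
  return node.to_s unless node.is_a?(Array) # bare term, emitted unescaped

  op, *args = node
  parts = args.map { |arg| to_tsquery_string(arg) }

  case op
  when :and    then "(#{parts.join(' & ')})"
  when :or     then "(#{parts.join(' | ')})"
  when :not    then "!#{parts.first}"
  when :phrase then "(#{parts.join(' <-> ')})" # adjacent words, in order
  else raise ArgumentError, "unknown operator: #{op.inspect}"
  end
end

# A hand-written tree, not the output of any real parser.
ast = [:and, [:or, 'boy', 'girl'], [:not, 'infant']]
puts to_tsquery_string(ast)
# prints: ((boy | girl) & !infant)
```

Keeping the tree independent of any one output syntax is what allows several generators to share a single parser: supporting another search engine only requires another walker of this kind.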
License
Copyright © 2016-2021 David Kellum
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at:
www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.