QL is a query language designed and developed at Semmle. This topic introduces QL to users with a background in general-purpose programming and in databases. For a general overview and getting-started information, see Semmle QL.
About query languages and databases
The purpose of a query language is to provide a programming platform where you can ask questions about information stored in a database. A database management system manages the storage and administration of data and provides the querying mechanism. A query typically refers to the relevant database entities and specifies various conditions (called predicates) that must be satisfied by the results. Query evaluation involves checking these predicates and generating the results. Some of the desirable properties of a good query language and its implementation include:
- Declarative specifications - A declarative specification describes properties that the result must satisfy, rather than providing the procedure to compute the result. In the context of database query languages, declarative specifications abstract away the details of the underlying database management system and query processing techniques. This greatly simplifies query writing.
- Expressiveness - An expressive query language lets you state complex queries directly, which makes the language applicable to a wide range of problems.
- Efficient execution - Queries can be complex and databases can be very large, so it is crucial for a query language implementation to process and execute queries efficiently.
A database system conforms to a database model that determines the logical structure of the data as well as how the data is stored, accessed, and manipulated. The relational model, which stores data in tables, is the most commonly used database model, and SQL (Structured Query Language) is the most commonly used query language for relational databases.
QL is a query language for relational databases. The syntax of QL is similar to that of SQL, but the semantics of QL are based on Datalog, a declarative logic programming language often used as a query language. This makes QL primarily a logic language, and all operations in QL are logical operations. Furthermore, QL inherits recursive predicates from Datalog, and adds support for aggregates, making even complex queries concise and simple.
For example, consider a database containing parent-child relationships for people. If we want to find the number of descendants of a person, intuitively the following process can be used:
- Find a descendant of the given person, that is, a child or a descendant of a child.
- Count the number of descendants found using the previous step.
When this query is written in QL, it closely resembles the above structure. Note that in the above process we use recursion to find a descendant of a person, and an aggregate to count the number of descendants. Translating these steps into the final query without adding any procedural details is possible due to the declarative nature of the language.
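The two steps above can be sketched in Python to make the roles of recursion and aggregation concrete. This is only an illustration of the idea, not QL syntax: the `PARENT_OF` facts and the function names are hypothetical, and Python has to spell out a procedure that a QL query would leave to the evaluator.

```python
# Hypothetical parent-child facts: (parent, child) pairs standing in for a database table.
PARENT_OF = {
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "dave"),
    ("dave", "erin"),
}


def children(person):
    """Return the set of direct children of `person`."""
    return {child for parent, child in PARENT_OF if parent == person}


def descendants(person):
    """Step 1: every descendant, that is, a child or a descendant of a child."""
    found = set()
    frontier = children(person)
    while frontier:
        found |= frontier
        # Move to the next generation, skipping anyone already seen so cycles terminate.
        frontier = {c for p in frontier for c in children(p)} - found
    return found


def descendant_count(person):
    """Step 2: the aggregate, counting the descendants found in step 1."""
    return len(descendants(person))
```

With these sample facts, `descendant_count("alice")` counts bob, carol, dave and erin. The loop and the `found` set are bookkeeping that only the procedural version needs; the declarative description consists of just the two steps.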
QL and object orientation
Object orientation is an important feature of QL. The benefits of object orientation are well known – it increases modularity, enables information hiding and allows code reuse. QL offers all these benefits without compromising on its logical foundation. This is achieved by defining a simple object model where classes are modeled as predicates and inheritance as implication. The libraries made available by Semmle for all supported languages make extensive use of classes and inheritance.
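One way to picture "classes as predicates and inheritance as implication" is the following Python sketch. It is an analogy under stated assumptions rather than QL code: the value set and the names `is_small` and `is_small_even` are made up for illustration.

```python
# Hypothetical database values; the "classes" below only classify values that already exist.
VALUES = set(range(1, 11))


def is_small(n):
    """'Class' Small: a predicate that holds for values below 6."""
    return n < 6


def is_small_even(n):
    """'Subclass' SmallEven: the Small predicate plus one extra condition."""
    return is_small(n) and n % 2 == 0


# The members of each "class" are the values that satisfy its predicate.
small = {n for n in VALUES if is_small(n)}
small_even = {n for n in VALUES if is_small_even(n)}

# Inheritance as implication: whenever is_small_even holds, is_small holds too.
inheritance_holds = all(is_small(n) for n in small_even)
```

Because the "subclass" predicate is defined as the "superclass" predicate plus a further condition, its members always form a subset of the superclass members.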
QL and general-purpose programming languages
Here are a few prominent conceptual and functional differences between general-purpose programming languages and QL:
- QL does not have any imperative features such as assignments to variables or file system operations.
- QL operates on sets of values and a query can be viewed as a complex sequence of set operations that defines the result of the query.
- QL's set-based semantics makes it very natural to process collections of values without having to worry about efficiently storing, indexing and traversing them.
- In object-oriented programming languages, instantiating a class involves creating an object by allocating physical memory to hold the state of that instance of the class. In QL, classes are just logical properties – they do not get instantiated to physical objects. In fact, QL only works with the values that are in the database, so no new values are ever created at runtime.
These topics are discussed in detail in the QL primer.
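As a rough illustration of the set-based view described in this section, the Python sketch below treats a query as a combination of sets selected from a relation. The `FUNCTIONS` relation and its columns are hypothetical, and Python set comprehensions only approximate the idea; they are not how QL is written or evaluated.

```python
# Hypothetical database relation: (function name, number of lines, has tests).
FUNCTIONS = {
    ("parse", 120, True),
    ("render", 45, False),
    ("load", 200, False),
    ("save", 30, True),
}

# Each condition selects a set of tuples from the relation.
long_functions = {f for f in FUNCTIONS if f[1] > 100}
untested_functions = {f for f in FUNCTIONS if not f[2]}

# The result is defined by a set operation on those sets.
# Every tuple in it already exists in the relation; nothing new is created.
long_and_untested = long_functions & untested_functions
result_names = {name for name, _, _ in long_and_untested}
```

Nothing here says how the relation is stored, indexed or traversed; the result is described purely as the intersection of two sets drawn from existing values.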
How do I get started?
The best way to learn QL is to choose a programming language that you know well and start running QL queries on code written in it. There are lots of worked examples for most languages, and also cookbooks of simple QL queries you can use to identify common programming elements.
See Learning QL for an overview of the resources available.
The following academic paper also provides an overview of QL and its semantics:
Other useful references on database query languages and Datalog:
- Database theory: Query languages
- Logic Programming and Databases book - Amazon page
- Foundations of Databases