Skip to content

SQL Query Pushdown

Audience: Data Users

Content Summary: This page describes predicate pushdown, the technique used to keep Immuta performant while keeping it flexible enough to connect to many different databases.

Overview

The Immuta Query Engine takes advantage of a performance optimization called pushdown, which means that parts of a query flow through the Immuta Query Engine and then are executed on the native database. One drawback of this process is that there are some parts of queries that are not pushed down to the native database, and then the work must happen in the Immuta layer as the data streams back through it. Depending on the operation, this may add latency.

To demonstrate the use of pushdown, let's consider two ways of executing the following query:

SELECT vendor_id
FROM taxi_trips
WHERE substr("vendor_id", 1, 1) = 'V'
LIMIT 10

If the query was not pushed down, Immuta might issue this statement to the backing database:

SELECT vendor_id
FROM DEFAULT.taxi_trips

Notice that the substr function is not passed to the native database at all. This means the substr "work" happens in the Immuta PostgreSQL layer as the rows stream back.

Immuta, however, actually does push down the substr function. The actual query that would be issued to the native database is

SELECT vendor_id
FROM default.taxi_trips
WHERE (substr(vendor_id,1,1) = 'V')
LIMIT 10

This allows Immuta to simply stream the results back to the user, which makes significant time difference when querying against a large data set.