Everything about Function Pushdown from TiDB to TiFlash

TiFlash is the columnar storage engine of TiDB, PingCAP’s flagship database product. It is the essential component of TiDB’s Hybrid Transactional and Analytical Processing (HTAP) architecture, and it enables users to access the value of their operational data in real time.

On April 5, we announced TiFlash as an open source project, and since then, we have received a lot of attention from the open source community. We have also launched many virtual code-reading meetups where we shared the stories and principles behind TiFlash’s design. I am so happy that many developers are interested in TiFlash and contributing to it.

In this post, I’ll show you how to contribute to TiFlash, using the pushdown function as an example. I’ve also selected a few TiFlash issues that you can use to practice contributing to the TiFlash community. Feel free to give them a try after you read this post.

TiDB architecture

Before we go into details why and how to push down functions from TiDB to TiFlash, let’s take a look at TiDB’s architecture.

TiDB is a distributed database with the following core components:

The TiDB server: a stateless SQL layer and TiDB’s computing engine. (Yes, it shares the name of the database system).
The Placement Driver (PD): the component that manages the metadata of the entire cluster.
The TiKV server: TiDB’s row storage engine.
The TiFlash server: TiDB’s columnar storage engine.

These components communicate with each other and form a complete TiDB system.

In this post, I’ll focus on how the TiDB server and TiFlash server work together to push down functions.

Function pushdown to TiFlash

The TiDB server pushes down operators to TiFlash which receives and executes them. Many operators, such as Projection and Selection, contain functions, so TiFlash must support and execute those functions as well. Otherwise, if a function fails, none of the related operators can be pushed down to TiFlash. In other words, TiFlash must support all functions pushed by the TiDB server.

Operators and functions pushdown from TiDB to TiFlash

How to push down functions

Now, let’s see how to push down functions from the TiDB server to TiFlash and how to make TiFlash support them.

Confirm the function execution logic

To let TiFlash support functions that are pushed down from the TiDB server, TiFlash’s execution logic must be consistent with TiDB’s. The sqrt function usually returns values in float64 format. Even if parameters are decimal, the evalReal function preprocesses them before they are converted into float64. The floor and ceil functions return either a decimal or integer value in the TiDB server according to parameters’ type and size.

These are typical value types, and it’s easy to keep them consistent in both TiFlash and the TiDB server. However, some functions involve special inputs and need special attention during implementation. For example, when the sqrt function calculates a negative number, should it return NaN, Null, or report an error?

So, before you start to contribute to TiFlash, it’s important to understand how TiDB implements those functions.

Make modifications on the TiDB side

Push down functions

Add functions that are to be pushed down to the /expression/expression.go::scalarExprSupportedByFlash directory in the TiDB repository. The TiDB planner decides whether or not the functions can be pushed down to TiFlash.

Unit test functions

Add the TestExprPushDownToFlash test function to the expression/expr_to_pb_test.go file in the TiDB repository. Execute the go test $BUILD/expression/expr_to_pb_test.go to run the unit tests locally.

Add the test function to the /planner/core/integration_test.go file in the TiDB repository. You can check the test function TestRightShiftPushDownToTiFlash in the same go file for reference. The name of the test case can use a format similar to Test${func_name}PushDownToTiFlash. Its test case is in the following form:

       store, clean := testkit.CreateMockStore(t)
        defer clean()
        tk := testkit.NewTestKit(t, store)
        tk.MustExec("use test")
        tk.MustExec("drop table if exists t")
        tk.MustExec("create table t (id int, value decimal(6,3), name char(128))")
        tk.MustExec("set @@tidb_allow_mpp=1; set @@tidb_enforce_mpp=1;")
        tk.MustExec("set @@tidb_isolation_read_engines = 'tiflash'")

        // Create virtual tiflash replica info.
        dom := domain.GetDomain(tk.Session())
        is := dom.InfoSchema()
        db, exists := is.SchemaByName(model.NewCIStr("test"))
        require.True(t, exists)
        for _, tblInfo := range db.Tables {
                if tblInfo.Name.L == "t" {
                        tblInfo.TiFlashReplica = &model.TiFlashReplicaInfo{
                                Count:     1,
                                Available: true,
                        }
                }
        }
        
        tk.MustQuery("explain select ${func}(a) from t;").Check(testkit.Rows(${plan}))

To verify whether the function ${func} in the ${plan} file is pushed down to TiFlash operators, run the unit test go test $BUILD/planner/core/integration_test.go locally.

Make modifications on the TiFlash side

Before we start to make a function pushdown, let’s learn some background information.

TiFlash vectorized computing

TiFlash is a vectorized analytical and computing engine. It not only stores and compresses data by column at the storage layer, but it also stores data in memory by column and calculates data by column at the computing layer.

As shown in figure below, TiFlash stores a batch of data in blocks in memory, and each block stores data in columns. When TiFlash computes, the column in the block is the computing unit and each column is processed and computed one by one.

How data is stored in TiFlash

IFunction interface

TiFlash’s function implementation codes are placed in the directory dbms/src/Functions. Let’s take the function FunctionLength in dbms/src/Functions/FunctionsString.cpp as an example to briefly introduce the workflow of a vectorized function. Vectorized functions inherit the IFunction interface in the dbms/src/Functions/IFunction.h directory. The interface is defined as follows. (Comments and some member functions are omitted).

class IFunction
{
public:
    virtual String getName() const = 0;
    
    virtual size_t getNumberOfArguments() const = 0;

    virtual DataTypePtr getReturnTypeImpl(const DataTypes & /*arguments*/) const;

    virtual void executeImpl(Block & block, const ColumnNumbers & arguments, size_t result) const;
};

Notes:

getName returns the function name, which is its unique identifier.
getNumberOfArguments records how many arguments the vectorized function has.
getReturnTypeImpl infers data types for vectorized functions. This is because changes in the input parameter data types may change the output data types.
FunctionLength::getReturnTypeImpl returns values in Int64 format.
executeImpl is one of the main parts of a vectorized function and is responsible for its execution logic. It also determines whether a TiFlash vectorized function is vectorized enough or fast enough.
The behavior of FunctionLength::executeImpl is shown in the figure below. The FunctionLength reads str_column from the left storage block and creates len_column of the same size in the right block. In the str_column in the left column, the FunctionLength gets the str of each row, calls str.length(), and then inserts the results into the correspondinglen_column. Finally, the FunctionLength inserts the len_column into the computing block to complete a single round of calculation.

The workflow of the function FunctionLength


   void executeImpl(Block & block, const ColumnNumbers & arguments, size_t result) const override
    {
        // 1.read str_column from block
        const IColumn * str_column = block.getByPosition(arguments[0]).column.get();
        
        // 2.create len_column
        int val_num = str_column->size();
        auto len_column = ColumnInt64::create();
        len_column->reserve(val_num);
        
        // 3.foreach str_column and compute
        Field str_field;
        for (int i = 0; i < val_num; ++i)
        {
            str_column->get(i, str_field);
            len_column->insert(static_cast<Int64>(str_field.get<String>().size()));
        }

        // 4.insert len_column to Block
        block.getByPosition(result).column = std::move(col_res);
    }

Data type

The code about TiFlash data types is placed in the directory dbms/src/DataTypes.

class IDataType : private boost::noncopyable
{
public:
    virtual String getName() const;

    virtual TypeIndex getTypeId();
    
    virtual MutableColumnPtr createColumn() const;
    
    ColumnPtr createColumnConst(size_t size, const Field & field) const;
}

DataType handles data type-related computing logic, such as type deduction and column creation. Each data type has a corresponding implementation in the formclass DataType${Type} final : public IDataType.

Note: DataType does not have a Nullable property. Instead, Nullable has a separate DataType implementation: DataTypeNullable in dbms/src/DataTypes/DataTypeNullable.h.

DataTypeNullable in dbms/src/DataTypes/DataTypeNullable.h

That’s why we get this implementation: DataTypeNullable(DataTypeString).isString() == false.

For DataTypeNullable, we usually use DataTypePtr data_type = removeNullable(nullable_data_type); to get its actual data types.

Columns

A column is a container that stores data during calculation. Column-related TiFlash code is placed in the dbms/src/Columns directory.

class IColumn : public COWPtr<IColumn>
{
public:
    virtual size_t size() const = 0;

    bool empty() const { return size() == 0; }

    virtual Field operator[](size_t n) const = 0;

    virtual void get(size_t n, Field & res) const = 0;
}

The logic to access the data stored in a column is:

for (size_t i = 0; i < column.size(); ++i)
    T data = column[i].get<T>();

There are two types of columns:

Constant columns: the ColumnConst in dbms/src/Columns/ColumnConst.h
Vector columns: the ColumnVector in dbms/src/Columns/ColumnVector.h

These two column types enable us to make special optimizations and implement specific functions more quickly. Let’s take the ModuloByConstantImpl in /dbms/src/Functions/modulo.cpp as an example. modulo(vector, const) converts a % b to a - a / b * b so that the function implementation can be speeded up. For detailed information, refer to this post.

The common way to use these two column types is shown below:

if (const ColumnVector * col = checkAndGetColumn<ColumnVector<Type>>(column.get()))
{
    // ...
}
else if (const ColumnConst * col = checkAndGetColumn<ColumnConst<Type>>(column.get()))
{
    // ...
}

We use DataType::CreateColumn and DataType::CreateColumnConst to build ColumnVector and ColumnConst respectively.

Type gymnastics with C++ templates

A vectorized function may contain many types of input parameters. Take the add function as an example: It has as many as 14 input data types including UInt8, UInt64, Int8, Int64, Float32, Float64, Decimal32 and Decimal256.

It can be very tedious to implement the execution logic for every data type. Using C++ templates for type gymnastics is a common way to develop functions more quickly. The following steps show you how to make the type gymnastic.

Abstract the execution logic of a vectorized function into a template function.

template<typename Type1, typename Type2>
void executeImpl(Column<Type1> arg1, Column<Type2> arg2, ...);

In IFunction::executeImpl, forward the parameters of different data types to the template function. There are at least two ways to forward parameters in TiFlash. You can choose whichever you prefer.

(1) Use DataType->getTypeId() to get the ID of each data type, and call the template function as a switch case, such as the PadImpl::executePad in dbms/src/Functions/FunctionsString.cpp.

       TypeIndex type_index = block.getByPosition(arguments[0]).type->getTypeId();
        switch (type_index)
        {
        case TypeIndex::UInt8:
            executeImpl<UInt8>(block, arguments);
            break;
        case TypeIndex::UInt16:
            executeImpl<UInt16>(block, arguments);
            break;
        case TypeIndex::UInt32:
            executeImpl<UInt32>(block, arguments);
            break;
        case TypeIndex::UInt64:
            executeImpl<UInt64>(block, arguments);
            break;
        case TypeIndex::Int8:
            executeImpl<Int8>(block, arguments);
            break;
        case TypeIndex::Int16:
            executeImpl<Int16>(block, arguments);
            break;
        case TypeIndex::Int32:
            executeImpl<Int32>(block, arguments);
            break;
        case TypeIndex::Int64:
            executeImpl<Int64>(block, arguments);
            break;
        default:
            throw Exception(fmt::format("the argument type of {} is invalid, expect integer, got {}", getName(), type_index), ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
        };

(2) Use castTypeToEither to get parameters’ data type and call the template function such as FormatImpl::executeImpl in dbms/src/Functions/FunctionsString.cpp.

   void executeImpl(Block & block, const ColumnNumbers & arguments, size_t result) const override
    {
        bool is_type_valid = getType(block.getByPosition(arguments[0]).type, [&](const auto & type, bool) {
            using Type = std::decay_t<decltype(type)>;
            using FieldType = typename Type::FieldType;
            executeImpl<FieldType>(block, arguments);
                        
            return true;
        });
        if (!is_type_valid)
            throw Exception(fmt::format("argument of function {} is invalid.", getName()));
    }
    template <typename F>
    static bool getType(DataTypePtr type, F && f)
    {
        return castTypeToEither<
            DataTypeDecimal32,
            DataTypeDecimal64,
            DataTypeDecimal128,
            DataTypeDecimal256,
            DataTypeFloat32,
            DataTypeFloat64,
            DataTypeInt8,
            DataTypeInt16,
            DataTypeInt32,
            DataTypeInt64,
            DataTypeUInt8,
            DataTypeUInt16,
            DataTypeUInt32,
            DataTypeUInt64>(type.get(), std::forward<F>(f));
    }

Of course, if you are an experienced C++ developer, please use your own way to make this happen.

Push down functions

After you have learned all the background information, you can start to push down functions from TiDB to TiFlash.

Map TiDB functions to TiFlash functions

The TiDB server identifies a function as tipb::ScalarFuncSig, while TiFlash identifies a function as func_name.

When coding TiFlash, we map tipb::ScalarFuncSig to func_name in the form of a mapping table. The first step of pushing down a function is to rename the function in the TiFlash func_name table, and then add a mapping from tipb::ScalarFuncSig to func_name in the corresponding mapping table.

TiFalsh mapping table

Generally speaking, there are four types of functions: window function, aggregate function, distinct aggregation function, and scalar function. TiFlash has a mapping table for each function, and they correspond to one another as follows:

TiFlash mapping table	Functions
`window_func_map`	`window function`
`agg_func_map`	`aggregate function`
`distinct_agg_func_map`	`distinct aggregation function`
`scalar_func_map`	`scalar function`

TiFlash function mapping table

Build TiFlash functions

After you map the tipb::ScalarFuncSig table to func_name table, the pushdown function can automatically locate the corresponding TiFlash function builder according to its func_name, and use the builder to build a TiFlash function. Then, the TiFlash function will execute the function logic in the actual execution flow.

To build a TiFlash function, you can either reuse the pushdown function and create a new one.

Reusing functions saves time. For example, you can reuse the function ifNull(arg1, arg2) -> if(isNull(arg1), arg2, arg1) in TiFlash because creating an ifNull function is too time-consuming.

Note: In TiFlash, DAGExpressionAnalyzerHelper::function_builder_map records which functions are reused and how we reuse them.

Then, we add a corresponding DAGExpressionAnalyzerHelper::FunctionBuilder in TiFlash, and add the corresponding mapping <func_name, FunctionBuilder> in the DAGExpressionAnalyzerHelper::function_builder_map.

For specific FunctionBuilder implementations, you can refer to the DAGExpressionAnalyzerHelper.

When a pushdown function cannot be reused, you need to create a new function for TiFlash.

You need to write code in the dbms/src/Functions directory to implement the newly-created function. Different types of functions are stored in different subdirectories of the dbms/src/Functions. For example, the String related functions are stored in the FunctionString table, and you can call factory.registerFunction to register the function implementation to FunctionFactory.

Note: There are many ready-made ClickHouse functions in the dbms/src/Functions directory. Although not all of them are compatible with TiDB and cannot be used directly, you can search for functions with similar semantics and modify them to make them compatible with TiDB before you incorporate it into the TiFlash function system. This saves you a lot of time.

Unit test the pushdown functions

TiFlash’s unit test files are in the dbms/src/Functions/test directory, and the names are in gtest_${func_name}.cpp format.

The unit test template is as follows:

#include <TestUtils/FunctionTestUtils.h>
#include <TestUtils/TiFlashTestBasic.h>

namespace DB::tests
{
class {gtest_name} : public DB::tests::FunctionTest
{};

TEST_F({gtest_name}, {gtest_unit_name})
try
{
    const String & func_name = {function_name};
    // case1
    ASSERT_COLUMN_EQ(
        {ouput_result},
        executeFunction(
            func_name,
            {input_1},
            {input_2},
            ...,
            {input_n},);
    // case2 ...
    // case3 ...
}
CATCH

TEST_F({gtest_name}, {gtest_unit_name2})...
TEST_F({gtest_name}, {gtest_unit_name3})...
// ...
} // namespace DB::tests

Unit testing a TiFlash function should include at least four parts: data types, column types, boundary value types, and return value types. I will explain the four parts in detail using the function function(arg_1, arg_2, arg_3, … arg_n) as an example.

Data types

You need to test all the data types of each arg_i in the function function(arg_1, arg_2, arg_3, … arg_n). In theory, all arg_i should support DataTypeNullable(DataTypeNothing), but TiDB rarely uses it. So if you meet with such bugs, you can note them down first and fix them later.

Column types

For each data type arg_i supports, do the following:
If it is not nullable, you need to test two kinds of columns: ColumnVector and ColumnConst.

If it is nullable, you need to test three kinds of columns: ColumnVector, ColumnConst(ColumnNullable(non-null value)) and ColumnConst(ColumnNullable(null value)).

If it is DataTypeNullable(DataTypeNothing), you need to test two kinds of columns: ColumnVector and ColumnConst(ColumnNullable(null value)).

Boundary values

You can test a few commonly-used boundary values:

(1) Numeric (including int, double, and decimal): max/min, 0, and null

(2) String: empty, non-ASCII such as Chinese characters, null, and with/without collation

(3) Date: zero, before January 1,1970, daylight saving time, and null

For a specific function, you can also set boundary values according to its specific implementation.

Return value types

Make sure that return value types of TiFlash functions are consistent with that of MySQL and TiDB according to the MySQL documentation.

Notes:

TiFlash uses four internal representations of the decimal data: Decimal32, Decimal64, Decimal128, and Decimal256. You must test all of them.
In theory, each function’s arg_i should support all the data types that TiDB pushes down. However, it’s difficult to obtain all the types that TiDB may push down, so you can write your test based on the data types that TiFlash currently supports.
Some pushdown functions contain the data type information in their function signatures. For example, the a = bfunction signature includes EQInt, EQReal, EQString, EQDecimal, EQTime, EQDuration, and EQJson. Although a and b can be int, real, string, decimal, time, duration or json, TiDB ensures that they are in the same category when it pushes down the function. To optimize your workloads, you only need to test the equal function with the same data types.
For functions with infinite input parameters such as case when, you need to unit test its smallest loop.
Bugs may occur during the testing process. You can fix the easy ones when they come up. For more complicated bugs, open an issue to the TiFlash repository, and then annotate the corresponding test.

Run the unit tests

As described in the README.md file in the TiFlash repository, to run unit tests on your local system, you need to build -DCMAKE_BUILD_TYPE=DEBUG:

cd $BUILD cmake $WORKSPACE/tiflash -GNinja -DCMAKE_BUILD_TYPE=DEBUG ninja gtests_dbms # Most TiFlash unit tests ninja gtests_libdaemon # Settings related tests ninja gtests_libcommon

The unit test executables are at $BUILD/dbms/gtests_dbms, $BUILD/libs/libdaemon/src/tests/gtests_libdaemon and $BUILD/libs/libcommon/src/tests/gtests_libcommon.

Verification

After you develop functions on the TiDB and TiFlash sides, you need to verify if the pushdown process works.

Deploy a local cluster

First, you need to build your local TiFlash and TiDB binaries, and then use TiUP to start a cluster for testing:

tiup playground nightly --db.binpath ${my_tidb} --tiflash.binpath ${my_tiflash}

By default, one Placement Driver cluster, one TiKV, one TiDB, and one TiFlash cluster start together, and the nightly is referred to as the daily build of the master branch.

Then, use db.binpath and tiflash.binpath to specify your locally built TiDB and TiFlash. For a related deployment guide, see Quickly Deploy a Local TiDB Cluster.

Verify the pushdown process

Use a query with a format similar to explain select sum(sqrt(x)) from test to test if a function has been pushed down to TiFlash.

Create a TiFlash replica

create table test.t (xxx);
-- In general, just one node is started locally, so the number of TiFlash copies can only be set to 1.
alter table test.t set tiflash replica 1;

Use SQL commands to test the pushdown process. An example is below:

-- MPP is recommended. 
set tidb_enforce_mpp=1;
-- TiFlash is the mandatory processing engine. 
set tidb_isolation_read_engines='tiflash';
explain select xxxfunc(a) from t;

If the function is successfully pushed down to TiFlash, the explain clause output displays the Projection operator that contains this function is located in TiFlash. It usually takes a while to create a TiFlash replica. So, to make sure the execution result is correct, execute the explain clause several times. If you do that and still cannot find the function in TiFlash, there is a significant problem.

After the explain clause executes, you can remove it from the SQL commands and check the result again.

Integrated testing

To test the function pushdown, we suggest you add an integrated testing process.

In the TiFlash repository, /tests stores the integrated testing files. Before the committer merges your pull requests to the TiFlash master branch, the integrated test starts automatically. The actual TiDB, TiFlash, PD, and TiKV clusters will also be started to run the integrated tests.

Write integrated tests

Create a new file func.test in the /tests/fullstack-test/expr directory for the new pushdown function. Then set your own testing by referring to other function testing files such as the /tests/fullstack-test/expr/substring_index.test file in the same directory.

Run the integrated tests

To run integrated tests on your local system, do the following. You can also refer to scripts in the /tests directory.

Use TiUP to start your self-built TiDB and TiFlash clusters.
Modify the TiFlash and TiDB port configurations in the /tests/_env.sh file.
Call the /tests/run-test.sh to run the integrated test.

How to contribute

If you’re interested in TiFlash, feel free to join us and contribute to our GitHub repository. Here is some information you need to know before you start contributing to us.

Review the TiFlash Call for Participation page and select a function you’re interested in. On the issue page, comment the issue with /assign to claim your contribution. This way, the same function won’t be claimed twice.
Develop your function and test it locally by following the guidance above.
After the function is successfully pushed down to TiFlash, submit a pull request (PR) to the TiFlash repository and TiDB repository separately to make the corresponding changes. Be sure to:
- Put the TiFlash and TiDB PR links in each other’s description area.
- Add release notes in your TiFlash and TiDB PRs.
After both your PRs are reviewed and approved by at least two reviewers, the repository committer will merge your changes into the master branch of TiFlash repository.

Thank you for taking the time to read this long post. I really appreciate your interest. Please don’t forget to give this process a try and practice on your own. Feel free to claim an issue and contribute to us!

Keep reading:

Book a Demo

TiDB TiFlash

Experience modern data infrastructure firsthand.

Try TiDB Serverless

PingCAP/TiDB

Have questions? Let us know how we can help.

TiDB Cloud Dedicated

A fully-managed cloud DBaaS for predictable workloads

TiDB Cloud Serverless

A fully-managed cloud DBaaS for auto-scaling workloads

Start for Free Learn More

Everything You Need to Know about Function Pushdown from TiDB to TiFlash