Introduction to Segment Trees (Range Minimum Query)


Range Minimum Query

Given an array A of length N , answer getMin(i, j) qeuries. getMin(i, j) should return the index m , where A[m] = min_{i<k<j}\{A[k]\} .

There are many ways to do this. In the case where A 's element values are allowed to change, and yet we have to maintain getMin(i, j) to work correctly, a segment tree is our best option. Follows is a brief explanation of segment trees for this specific example. I'll put a more generalized explanation in another post.

Segment Trees

The idea behind a segment tree is to build a binary tree where each node represents a segment of A . For Range Minimum Query, each tree node will contain the value of getMin(segment) (the index of the minimum value in the node’s segment of the array).

Now, how are nodes and segments assigned? For simplicity, let’s assume that A ‘s length is a power of two:

  • Starting at the lowest level of the tree, we’ll have N leaf nodes each representing an element of A. As shown in the following figure, if we label leaf nodes from 0 to N - 1 from left to right, each node i will contain the value i .
  • For each internal node node starting from the bottom, we compute node.value as follows: node.value = m, A[m] = min\{A[node.left.value], A[node.right.value]\} . Meaning, a parent node node will get the value of one of its children such that A[node.value] is minimum.
  • Since this is a binary tree with the last level having N nodes, the tree will have the height of log(N) + 1 . Level 0 will have one node, the root, which holds getMin(0, N - 1) . This means that each node on a level L will represent a segment of length 2^{log(N) + 1 - L} .

rmq

The implementation is going to be straight forward from here on. We’ll represent the tree as an array tree of length 2 << ceil(log2(N + 1)). Node i has a left child 2 * i and a right child 2 * i + 1. On an update(i), we’ll traverse the tree from top to bottom, change array value at when we reach the leaf i, and update the tree as we go back up. On a getMin(range), we’ll recurse from root until we visit all maximal sub-segments of the range and return the required value. Here’s an implementation in C++ with this tutorial as a reference:

#include <iostream>
#include <vector>
#include <cstring>
#include <cmath>
using namespace std;

class segTree {
    // O(n)
    int *array, *tree;
    int arrayLen, treeLen;

    // O(n)
    void initialize(int node, int b, int e) {
        if (b == e)
            tree[node] = b;
        else {
            // recurse
            initialize(2 * node, b, (b + e) / 2);
            initialize(2 * node + 1, (b + e) / 2 + 1, e);
            // update value
            if (array[tree[2 * node]] <= array[tree[2 * node + 1]])
                tree[node] = tree[2 * node];
            else
                tree[node] = tree[2 * node + 1];
        }
    }
public:
    segTree(int *array, int arrayLen) {
        this->arrayLen = arrayLen;
        this->array = array;
        this->treeLen = 2 << (int)ceil(log2(arrayLen));
        cout << "treeLen=" << treeLen << endl;
        this->tree = new int[treeLen];
        memset(tree, -1, sizeof(int) * treeLen);
        initialize(1, 0, arrayLen - 1);
    }

    // O(log n)
    void update(int i, int v, int node = 1, int b = 0, int e = 0) {
        e = arrayLen - 1 - e;
        if (b == e) {
            array[i] = v;
        } else {
            int mid = (b + e) / 2;
            if (i <= mid)
                update(i, v, 2 * node, b, arrayLen - 1 - mid);
            else
                update(i, v, 2 * node + 1, mid + 1, arrayLen - 1 - e);
            if (array[tree[2 * node]] <= array[tree[2 * node + 1]])
                tree[node] = tree[2 * node];
            else
                tree[node] = tree[2 * node + 1];
        }
    }

    // O(log n)
    int query(int i, int j, int node = 1, int b = 0, int e = 0) {
        e = arrayLen - 1 - e;
        // bad interval
        if (i > e || j < b)
            return -1;
        // good interval
        if (b >= i && e <= j)
            return tree[node];
        // partial interval
        int left = query(i, j, 2 * node, b, arrayLen - 1 - (b + e) / 2);
        int right = query(i, j, 2 * node + 1, (b + e) / 2 + 1, arrayLen - 1 - e);
        if (left == -1)
            return tree[node] = right;
        if (right == -1)
            return tree[node] = left;
        if (array[left] <= array[right])
            return tree[node] = left;
        return tree[node] = right;
    }
};

int main() {
    int A[10] = { 2, 4, 3, 1, 6, 7, 8, 9, 1, 7 };
    segTree t(A, 10);
    cout << "getMin(0, 4) = " << t.query(0, 4) << endl;
    t.update(1, 0);
    cout << "getMin(0, 4) = " << t.query(0, 4) << endl;
    t.update(0, -1);
    cout << "getMin(0, 4) = " << t.query(0, 4) << endl;
    return 0;
}