• 6.5210/18.415: Advanced Algorithms
  • PSets
  • Calendar
  • Materials
  • Information
  • Splay Trees

    David Karger

    Binary Search Trees

    Splay Trees

    Sleator and Tarjan, “Self Adjusting Binary Search Trees” JACM 32(3) 1985

    Key design principle: Be lazy! That is, don’t work until you must.

    Path shortening:

    Re-balancing (via Rotations):

    Key operation: Splaying a node \(x\):

    Figures of rotations

    Implementing \(Search(x)\) operation:

    (Will talk about \(Insert(x)\) and \(Delete(x)\) later.)

    Analysis

    Our potential function: \(\Phi=\sum r(x)\).

    Some intuition:

    The Access Lemma

    Key technical fact:

    Access Lemma: Amortized time to splay a node \(x\) given root \(t\) is at most \[3(r(t)-r(x))+1 = O(\log(s(t)/s(x))).\] (Recall: Amortized cost of an operation \(=\) real cost \(+ \ \Delta \Phi\). )

    Note: \(x\) is the root of the tree after the splay operation and the rank of the root is always the same \(\log W\), where \(W=\sum_x w_x\) is the total weight of the nodes.
    \(\Rightarrow\) So, rank \(r'(x)\) after the splay is equal to \(r(t)\).
    \(\Rightarrow\) \(r(t)-r(x)=r'(x)-r(x)\) – the amortized cost of a splay is within a constant of its change in rank.
    Intuition:

    One immediate consequence of Access Lemma:

    Important detail: To make the above analysis work also for unsuccessful Search(x) operations, we need to splay the last node we visited in that search.

    Proof of the Access Lemma

           r                r' 
    
           z                x
          / \              / \
         y   D            A   y
        / \       ====>      / \
       x   C                B   z
      / \                      / \   
     A   B                    C   D
    

    Observe:

    Intuition:

    Actual Math (requires more care):

    Bounding Overall Change of Potential

    Important Caveat: The Access Lemma bounds only the amortized cost of each operation.
    \(\Rightarrow\) If there is a sequence of \(m\) operations, its total real cost is at most \[O(m \log n) - \left(\Phi_m - \Phi_0\right)=O(m \log n) + \Phi_0 - \Phi_m,\] where \(\Phi_m\) is the potential of our data structure at the end of processing the whole sequence and \(\Phi_0\) is its initial potential.
    Key question: How large can \(\Phi_0-\Phi_m\) be?

    Note:

    Also: Note that potential of empty tree is \(0\).
    \(\Rightarrow\) If we start with empty tree, no problem (but for that we need inserts...)

    Applications of Access Lemma

    By applying the above analysis with different weights we can derive a lot of interesting facts about performance of splay trees. (Key fact to keep in mind: the behavior of the splay tree is completely independent of our choice of the weights!)

    Update Operations

    We have two remaining operations we need to implement: \(Insert(x)\) and \(Delete(x)\). There are two ways to implement them:

    1. Insert/delete the item as in a standard BST and then just splay the inserted element/parent of deleted element.

      • Splaying ensures that our work is “paid for” by the resulting potential change.

      • But showing that this is the case requires a little bit of calculations.

    2. Alternatively, consider the following operations:

      • \(Split(x)\) splits the BST into two BST, one containing all the elements smaller or equal to \(x\) and one containing all the elements larger than \(x\).
        \(\Rightarrow\) Can be achieved by simply splaying \(x\) and thus takes amortized \(O(\log n)\) time (corresponds to taking all weights be equal to \(1\) in the analysis).

      • \(Join\) takes two BSTs \(S\) and \(T\) such that all elements in \(S\) are not larger than all elements in \(T\) and creates a single BST contains elements of both \(S\) and \(T\).
        \(\Rightarrow\) Can be achieved by splaying first the largest element \(x\) of \(S\) (note that then this element becomes a root with no right child) and adding the root of \(T\) as the right child of \(x\).
        \(\Rightarrow\) Joining increases only the potential of the new root, and only by at most \(O(\log n)\).
        \(\Rightarrow\) Thus, again, the whole operation takes \(O(\log n)\) amortized time (if we take all \(w_x\) equal to \(1\) in the analysis).

      Note: For most balanced trees such \(Split(x)/Join\) operations do not take only \(O(\log n)\) time!

    3. To implement \(Insert(x)\):

      • Perform \(Split(x')\), where \(x'\) is the largest element not larger than \(x\). (Finding \(x'\) can be easily amortized into the subsequent splay.)

      • Create a new BSTs with \(x\) as a root and the two BSTs obtained from splitting be its subtrees.
        \(\Rightarrow\) Total potential increased by \(O(\log n)\), i.e., the increase of potential of the root. (Remember: we are performing analysis with all weights being \(1\) here.)
        \(\Rightarrow\) Total amortized cost is \(O(\log n)\).

    4. To implement \(Delete(x)\):

      • Perform \(Split(x)\)
        \(\Rightarrow\) \(x\) will be the root of one of the resulting BSTs and will have no right child.

      • Remove \(x\) and perform \(Join\) on the resulting two BSTs.

      • \(O(\log n)\) amortized cost, as desired.

    Possible Heuristics