van Emde Boas Tree

van Emde Boas Trees

  • Intro => Ordered Set
Given a universe $U=[u]$ where $u=2^w$ maintain subset $S\subseteq U, S =n$ under:

member(x, S)

insert(x, S)

delete(x, S)

empty(S)

min(S)

max(S)

predecessor(x, S)

successor(x, S)

  • Naive
If we are willing to spend $O( U )$ space
Store $S$ as a bit-array $L$ of length $ U $ such that $L[x]=[x\in S]$, and keep track of the min and max values explicitly.
用一组大小为 $u$ 的数列存 ${0, 1}$ 值, $0$ 表示该位置所表达的值不在 $S$ 中,若为 $1$, 则 $x\in S$。

image-20230117105359527

For predecessor and successor, the worst case is the predecessor (successor) is in the head (end) of the array. Delete takes $O( U )$ time for the worst case as it needs to maintain the min and max. Go through the array to find the min and max. While insert only takes $O(1)$. The step is to set the corresponding value to $1$, and compare it with the min and max in the data structure.
  • Bit-Trie

image-20230117110102123

  • Twolevel

image-20230117110410824

此时的高位是 Summary,低位是 Cluster 且每一个 Cluster 接收其对应的高位作为参数,如果对应的 h 没有的话,比如没有1101,则summary里只剩下${00, 01, 10}$,则对应的cluster则存空集。

The worst case of the running time

empty(S), min(S), max(S) => $O(1)$

member(x, S) => $O(1) + 1 \times naive=O(1)$. 根据 $h$ 找到对应的 $cluster$,在 $cluster$ 里调用 $member$ 找($naive$)。

predecessor(x, S), successor(x, S) => $O(1)+1\times naive=\Theta(2^{\lceil{w/2}\rceil})=\Theta(\sqrt{ U }))$.

image-20230117124507108

最差的情况一定会调用一次`predecessor`,而且`predecessor`是用在$[w/2]$上的

delete(x, S) => $O(1)+2\times naive=\Theta(\sqrt{ U })$. 因为得删两个 substructure 里的数据,所以是 $2\times naive=2\times \Theta(2^{\lceil{w/2}\rceil})$

insert(x, S) => $O(1)+ 2\times naive=O(1)$. 同样是要插入两次,所以也是$2\times naive$

  • Recursive

Instead of using the naive structure for summary and clusters[h], recursively use the same type of structure (stop recursion when w = 1).

image-20230117130550922

From my perspective, we recursively divide the original part, including the original high part and low part, by 2 until the number of the bit equals to 1. In the original case, the summary has converted into a substructure with high part and low part.

image-20230117132404936

And this new structure is called “proto-vEB”.

  • Recursive - Theorem => Theorem 1
**_The recursion depth of this structure, when used on the universe $U = [2^w]$ is $\lceil{\log_2 w}\rceil = O(\log\log U )$._**

image-20230117132812812

Running Time

empty(S), min(S) and max(S) => $O(1)$.

member(x, S) => $O(1)+1\times recursion=O(d(w))=O(\log\log U )$. 如果该数存在,那么一定要遍历整个结构找到最底层所存的数。
predecessor(x, S), and successor(x, S) => $O(1)+1\times recursion=O(d(w))=O(\log\log U )$.
insert(x, S) and delete(x, S) => $O(1)+2\times recursion=\Theta(2^{d(w)})=\Theta(w)=\Theta(\log U )$.
If w is even, the depth of the summary and the depth of each cluster is the same. 对于这两个方法,一次迭代是为了在cluster的最底层中做更新,另一次迭代是为了在summary里更新min和max。
  • **_vEB: worst case $O(\log\log U )$ time_**

image-20230117140445661

*The basic idea is that each substructure do not have the redundent data. For example, 0001 as the min in the set, the substructure of the summary has already stores the minimum 0001, therefore, there is no need to store this value in the other substructures. The reason that the first cluster’s substructure is empty (max > min) is because the corresponding summary 00 related value 0001 already stored in the top level structure.*

Running Time

PREDECESSORw(x, S)

image-20230117142641535

  1. First, exclude the case that $x>S.max$.

  2. Then, exclude the case that $w=1$.

  3. Calculate the high part and low part. Otherwise, we are in a case where we may have to do recursion.

    After line 7, $S.min < x \le S.max$.

  4. If the corresponding cluster is not empty and the cluster.min < s, then return the value by doing a recursive call to the PREDECESSOR.

  5. If the summary is empty or $p\le S.summary$ which means this value is the second minimum in the set. Return $S.min$.

  6. Otherwise, call the PREDECESSOR(p, S.summary) to find the previous $p$ and return the corresponding cluster’s maximum.

Theorem - 2

PREDECESSORw(x,S) takes worst case $O(d(w)) = O(\log\log U )$ time.

Proof

By observing the pseudocode, we know that the recursion depth is a $d(w)$ which is $\log\log U $ and this function does at most one recursive call and everything else takes constant time.

INSERTw(x, S)

image-20230117145744580

  • RS-vEB

image-20230117150411835

  • R^2^S-vEB

image-20230117150504410




Enjoy Reading This Article?

Here are some more articles you might like to read next:

  • Line Segment Intersection
  • Graduation
  • Review of SwAV
  • Review of SimSiam
  • Extracts from BYOL