bzdww

Get answers and suggestions for various questions from here

Data Structures and Algorithms (11): Red Black Trees

cms

table of Contents

1. Characteristics of red-black trees

2, red-black tree self-correction

1, change the node color

2, right-handed

3, left-handed

3, left-handed and right-handed code

4, insert operation

5, delete the operation

6, the efficiency of red and black trees

In the previous blog, we introduced the binary search tree. For a node, the key value of the node of the left subtree is smaller than the key value of the node. The key values ​​of all nodes in the right subtree are larger than the node. Key value. The binary search tree is a data structure whose time complexity for finding, inserting, and deleting operations is O(logn) and the base is 2. But we say that this time complexity is reflected in the balanced binary search tree, that is, if the inserted data is random, it is very efficient, but if the inserted data is ordered, such as from small to large [10, 20, 30, 40, 50] Insert into the binary search tree:

From big to small, it is all on the right side, which is no different from the linked list. In this case, the time complexity of the search is O(N) instead of O(logN). Of course, this is under the most unbalanced conditions. In reality, the efficiency of the binary search tree should be between O(N) and O(logN), depending on the degree of imbalance of the tree.

So in order to be able to search for a tree with a faster time O(logN), we need to ensure that the tree is always balanced (or mostly balanced), that is, the number of left subtree nodes and the right of each node. The number of subtree nodes is as equal as possible. The red-black tree is such a balanced tree. For a data item to be inserted (delete as well), the insertion routine checks whether the characteristics of the tree will be destroyed. If it is destroyed, the program will correct it as needed. Change the structure of the tree to maintain the balance of the tree.

1. Characteristics of red-black trees

There are two characteristics as follows:

1, the nodes have colors;

2. In the process of inserting and deleting, follow the different arrangement rules for maintaining these colors.

The first one is well understood. In red-black trees, the color of each node is either black or red. Of course, it can be any other two colors. The color here is used for marking. We can add a boolean variable isRed to the node class Node to represent the color information.

Second, the rules that must be observed when inserting or deleting a node are called red-black rules:

1. Each node is not red or black;

2. The root node is always black;

3. If the node is red, its children must be black (and not necessarily vice versa), (that is, there must be no two consecutive red nodes on all paths from each leaf to the root);

4. Each path from the root node to the leaf node or empty child node must contain the same number of black nodes (ie the same black height).

The number of black nodes on the path from the root node to the leaf node is called the black height, and rule 4 indicates that the black height from the root to the leaf node path must be the same.

Note: The newly inserted node color is always red , because inserting a red node is less likely than inserting a black node against the red-black rule, because inserting a black node always changes the black height (in violation of rule 4). However, only half of the chances of inserting a red node will violate rule 3 (because the parent node is black and the parent node is red, it violates rule 3). In addition, violation of Rule 3 is easier to correct than Rule 4. When inserting a new node, this balance may be broken, so how is the red-black tree corrected?

2, red-black tree self-correction

The red-black tree mainly corrects the balance in three ways, changing the node color, left-handedness and right-handedness.

1, change the node color

The newly inserted node is 15, and the new newly inserted color is red. Then we find that direct insertion violates rule 3, and it is changed to black but it is found to violate rule 4. At this time we change the color of its parent node to black, and the color of the sibling node of the parent node is also changed to black. Usually its grandparent node 50 color will change from black to red, but since 50 is the root node, we can't change the root node color here.

2, right-handed

The first thing to note is that the node itself will not rotate. The rotation changes the relationship between the nodes. Select a node as the top of the rotation. If you do a right-hand rotation, the top node will move to the right and to the right. The position of the child node whose left child node is moved up to its original position. The top node of the right hand must have a left child.

3, left-handed

The top node of the left hand must have a right child.

Note : We change the color to help us determine when to perform the rotation, and the rotation is to ensure the balance of the tree. Light change node color can't play any role. Rotation is the key operation. After adding a node or deleting a node, it may break the balance of the binary tree. Then when to perform the rotation and what rotation to perform, this is the key we need to focus on. Concerned.

3, left-handed and right-handed code

1, node class

The node class is similar to the node class of the binary tree, except that a variable of type boolean is added to represent the color of the node.

public class RBNode<T extends Comparable<T>> {
	boolean color;// 颜色
	T key;// 关键值
	RBNode<T> left;// 左子节点
	RBNode<T> right;// 右子节点
	RBNode<T> parent;// 父节点

	public RBNode(boolean color, T key, RBNode<T> parent, RBNode<T> left,
			RBNode<T> right) {
		this.color = color;
		this.key = key;
		this.parent = parent;
		this.left = left;
		this.right = right;
	}

	// 获得节点的关键值
	public T getKey() {
		return key;
	}

	// 打印节点的关键值和颜色信息
	public String toString() {
		return "" + key + (this.color == RED ? "R" : "B");
	}
}

2, the specific implementation of left-handed

/*************对红黑树节点x进行左旋操作 ******************/
/* 
 * 左旋示意图:对节点x进行左旋 
 *     p                       p 
 *    /                       / 
 *   x                       y 
 *  / \                     / \ 
 * lx  y      ----->       x  ry 
 *    / \                 / \ 
 *   ly ry               lx ly 
 * 左旋做了三件事: 
 * 1. 将y的左子节点赋给x的右子节点,并将x赋给y左子节点的父节点(y左子节点非空时) 
 * 2. 将x的父节点p(非空时)赋给y的父节点,同时更新p的子节点为y(左或右) 
 * 3. 将y的左子节点设为x,将x的父节点设为y 
 */

private void leftRotate(RBNode<T> x) {
	// 1. 将y的左子节点赋给x的右子节点,并将x赋给y左子节点的父节点(y左子节点非空时)
	RBNode<T> y = x.right;
	x.right = y.left;
	if (y.left != null) {
		y.left.parent = x;
	}

	// 2. 将x的父节点p(非空时)赋给y的父节点,同时更新p的子节点为y(左或右)
	y.parent = x.parent;
	if (x.parent == null) {
		this.root = y;// 如果x的父节点为空(即x为根节点),则将y设为根节点
	} else {
		if (x == x.parent.left) {// 如果x是左子节点
			x.parent.left = y;// 则也将y设为左子节点
		} else {
			x.parent.right = y;// 否则将y设为右子节点
		}
	}

	// 3. 将y的左子节点设为x,将x的父节点设为y
	y.left = x;
	x.parent = y;
}

3, the specific implementation of right-handed

/*************对红黑树节点y进行右旋操作 ******************/ 
/*
 * 左旋示意图:对节点y进行右旋
 *        p                   p
 *       /                   /
 *      y                   x
 *     / \                 / \
 *    x  ry   ----->      lx  y
 *   / \                     / \
 * lx  rx                   rx ry
 * 右旋做了三件事:
 * 1. 将x的右子节点赋给y的左子节点,并将y赋给x右子节点的父节点(x右子节点非空时)
 * 2. 将y的父节点p(非空时)赋给x的父节点,同时更新p的子节点为x(左或右)
 * 3. 将x的右子节点设为y,将y的父节点设为x
 */
private void rightRotate(RBNode<T> y) {
	// 1. 将y的左子节点赋给x的右子节点,并将x赋给y左子节点的父节点(y左子节点非空时)
	RBNode<T> x = y.left;
	y.left = x.right;
	if (x.right != null) {
		x.right.parent = y;
	}

	// 2. 将x的父节点p(非空时)赋给y的父节点,同时更新p的子节点为y(左或右)
	x.parent = y.parent;
	if (y.parent == null) {
		this.root = x;// 如果y的父节点为空(即y为根节点),则旋转后将x设为根节点
	} else {
		if (y == y.parent.left) {// 如果y是左子节点
			y.parent.left = x;// 则将x也设置为左子节点
		} else {
			y.parent.right = x;// 否则将x设置为右子节点
		}
	}

	// 3. 将x的左子节点设为y,将y的父节点设为y
	x.right = y;
	y.parent = x;
}

4, insert operation

As with the binary tree insertion, you must first find the location of the insertion and then insert the node. First look at the inserted front code:

/*********************** 向红黑树中插入节点 **********************/
public void insert(T key) {
	RBNode<T> node = new RBNode<T>(RED, key, null, null, null);
	if (node != null) {
		insert(node);
	}
}

public void insert(RBNode<T> node) {
	RBNode<T> current = null;// 表示最后node的父节点
	RBNode<T> x = this.root;// 用来向下搜索

	// 1.找到插入位置
	while (x != null) {
		current = x;
		int cmp = node.key.compareTo(x.key);
		if (cmp < 0) {
			x = x.left;
		} else {
			x = x.right;
		}
	}
	node.parent = current;// 找到了插入的位置,将当前current作为node的父节点

	// 2.接下来判断node是左子节点还是右子节点
	if (current != null) {
		int cmp = node.key.compareTo(current.key);
		if (cmp < 0) {
			current.left = node;
		} else {
			current.right = node;
		}
	} else {
		this.root = node;
	}

	// 3.利用旋转操作将其修正为一颗红黑树
	insertFixUp(node);
}

This is the same as the idea implemented in the binary search tree. I won't go into details here. Let's take a look at the last step of the insertFixUp(node) operation in the method. Because the insertion may cause the tree to be unbalanced, the insertFixUp(node) method mainly discusses the situation, analyzes when to change color, when to turn left, and when to rotate right. Let's first analyze the specific situation theoretically, and then look at the specific implementation of insertFixUp(node).

If it is the first insertion, since the original tree is empty, it will only violate the rule 2 of the red-black tree, so you only need to black out the root node; if the parent node of the inserted node is black, it will not violate the red. - Black tree rules, nothing to do; but in the following three cases, we will start to change color and rotate:

1. The parent node of the inserted node and its uncle node (the other child node of the grandparent node) are all red.

2. The parent node of the inserted node is red, the uncle node is black, and the inserted node is the right child of its parent.

3. The parent node of the inserted node is red, the uncle node is black, and the inserted node is the left child of its parent.

Let's take a look at how to do these three situations, and then give the implementation code.

In the following discussion, N, P, G, U are used to represent the associated nodes. N (now) represents the current node, P (parent) represents the parent node of N, U (uncle) represents the uncle node of N, and G (grandfather) represents the grandparent node of N, that is, the parent node of P and U.

For case 1: the parent node of the inserted node and its uncle node (another child of the grandparent node) are both red. At this point, there must be a grandparent node, but I don't know if the parent node is its left or right child, but because of the symmetry, we only need to discuss one side, and the other naturally corresponds to it. Consider here the case where the parent node is the left child of its grandparent, as shown in the left image below:

In this case, we have to do the following: black out the parent node (5) and the uncle node (8) of the current node (4), redeem the grandparent node (7), and become the image shown above. Case. Then point the current node to its grandparent node and start the algorithm again from the new current node (see the steps below). In this way, the picture on the right becomes situation 2.

For case 2: the parent node of the inserted node is red, the uncle node is black, and the inserted node is the right child of its parent . The operations we need to do are: take the parent node (2) of the current node (7) as a new node, and do a left-hand operation with the new current node as a fulcrum. After completion, as shown in the lower left picture, the lower left picture becomes situation 3.

For case 3: the parent node of the inserted node is red, the uncle node is black, and the inserted node is the left child of its parent . The operations we have to do are: black out the parent node of the current node (7), redeem the grandparent node (11), and do a right-hand operation on the grandfather node as a pivot point. Finally, the root node is blackened and the entire red-black tree is restored to equilibrium, as shown in the upper right picture. At this point, the insert operation is complete!

We can see that if it happens from situation 1, it will inevitably go through situations 2 and 3, which means that this is an entire process. Of course, in practice, it may not necessarily happen from case 1, if from situation 2 Start to happen, then take a situation 3 to complete the adjustment, if you just adjust the situation 3 directly, then the first two situations do not need to be adjusted. Therefore, the relationship between discoloration and rotation can be expressed as: discoloration -> left-handed -> right-handed.

At this point, we have completed all the insertions. Let's take a look at the concrete implementation of the insertFixUp method (which can be combined with the above analysis diagram, which is more beneficial and understandable):

private void insertFixUp(RBNode<T> node) {
	RBNode<T> parent, gparent;// 定义父节点和祖父节点

	// 需要修正的条件:父节点存在,且父节点的颜色是红色
	while (((parent = parentOf(node)) != null) && isRed(parent)) {
		gparent = parentOf(parent);// 获得祖父节点

		// 若父节点是祖父节点的左子节点,下面的else相反
		if (parent == gparent.left) {
			RBNode<T> uncle = gparent.right;// 获得叔叔节点

			// case1:叔叔节点也是红色
			if (uncle != null && isRed(uncle)) {
				setBlack(parent);// 把父节点和叔叔节点涂黑
				setBlack(gparent);
				setRed(gparent);// 把祖父节点涂红
				node = gparent;// 把位置放到祖父节点处
				continue;// 继续while循环,重新判断
			}

			// case2:叔叔节点是黑色,且当前节点是右子节点
			if (node == parent.right) {
				leftRotate(parent);// 从父节点出左旋
				RBNode<T> tmp = parent;// 然后将父节点和自己调换一下,为下面右旋做准备
				parent = node;
				node = tmp;
			}

			// case3:叔叔节点是黑色,且当前节点是左子节点
			setBlack(parent);
			setRed(gparent);
			rightRotate(gparent);
		} else {// 若父节点是祖父节点的右子节点,与上面的情况完全相反,本质是一样的
			RBNode<T> uncle = gparent.left;

			// case1:叔叔节点也是红色的
			if (uncle != null && isRed(uncle)) {
				setBlack(parent);
				setBlack(uncle);
				setRed(gparent);
				node = gparent;
				continue;
			}

			// case2:叔叔节点是黑色的,且当前节点是左子节点
			if (node == parent.left) {
				rightRotate(parent);
				RBNode<T> tmp = parent;
				parent = node;
				node = tmp;
			}

			// case3:叔叔节点是黑色的,且当前节点是右子节点
			setBlack(parent);
			setRed(gparent);
			leftRotate(gparent);
		}
	}
	setBlack(root);// 将根节点设置为黑色
} 

5, delete the operation

The above discussion of the red-black tree insertion operation, the next discussion of deletion, the red-black tree deletion and the binary search tree deletion is the same, but after the deletion, there is a balance of repair. Let's first recall the deletion of the binary search tree:

1. If the node to be deleted has no child nodes, delete it directly.

2. If the node to be deleted has only one child node, delete it directly and replace it with its child nodes.

3. If the node to be deleted has two child nodes, this situation is more complicated: first find out its successor node, then process the relationship between the "successor node" and the "parent node of the deleted node", and finally process the "successor" The relationship between the child nodes of a node and the child nodes of a deleted node. There will be different situations in each step.

In fact, the deletion process is too complicated. In many cases, a delete flag is added to the node class, which is not a true delete node. Detailed deletions are not discussed here.

6, the efficiency of red and black trees

The red-black tree has a search, insert, and delete time complexity of O(log2N). The additional overhead is that the storage space of each node is slightly increased because of a color variable that stores the red-black tree nodes. Insertion and deletion time is increased by a constant factor, because to rotate, the average insertion requires about one rotation, so the time complexity of insertion is O(log2N), (the calculation of time complexity is to omit the constant), but actually It is slower than an ordinary binary tree.

In most applications, the number of lookups is more than the number of inserts and deletes, so applying a red-black tree instead of a normal binary search tree does not have much time overhead overall. Moreover, the advantage of red-black trees is that the operation of ordered data is not slow to the time complexity of O(N).

Next article:

Zhang Xiaokang: data structures and algorithms (XII): 2-3-4 tree zhuanlan.zhihu.com


Data Structure and Algorithm Series:

Zhang Xiaokang: Data Structures and Algorithms (a): Introduction zhuanlan.zhihu.com Zhang Xiaokang: data structures and algorithms (ii): Array zhuanlan.zhihu.com Zhang Xiaokang: data structures and algorithms (III): bubbling, selection, insertion sort zhuanlan.zhihu.com Zhang Xiaokang: algorithms and data structures (IV): Stack zhuanlan.zhihu.com Zhang Xiaokang: algorithms and data structures (V): queue zhuanlan.zhihu.com Zhang Xiaokang: algorithms and data structures (VI): a prefix, infix, postfix zhuanlan.zhihu.com Zhang Xiaokang: data structures and algorithms (VII): list zhuanlan.zhihu.com Zhang Xiaokang: data structures and algorithms (VIII): Recursive zhuanlan.zhihu.com Zhang Xiaokang: data structures and algorithms (IX): advanced sorting zhuanlan.zhihu.com Zhang Xiaokang: data structures and algorithms (j): binary zhuanlan.zhihu.com Zhang Xiaokang: data structures and algorithms (XI): red-black tree zhuanlan.zhihu.com Zhang Xiaokang: data structures and algorithms (XII): 2-3-4 tree zhuanlan.zhihu.com Zhang Xiaokang: data structures and algorithms (XIII): hash table zhuanlan.zhihu.com Zhang Xiaokang: data structures and algorithms (XIV): Heap zhuanlan.zhihu.com Zhang Xiaokang: data structures and algorithms (xv): powerless to map zhuanlan.zhihu.com