bzdww

Get answers and suggestions for various questions from here

# Data Structure - Binary Search Tree (BST)

We have long known that data structures are the disciplines that study how to store and extract data. There are many similar operations for storing and extracting data in life, such as:

When you go to the supermarket, put your own bag in the cupboard, then use a small note to sweep it when you leave, and the cabinet with your bag opens; assign a dictionary and store enough words in the dictionary. Find the specified content when needed; the library has tens of thousands of books, and if necessary, you need to find the location of the book directly by the title.

We found that the three examples just have one thing in common, that is, the stored data not only contains the data itself, but also contains a "key". When searching, we use our key to compare with the key in the database. If it matches, the data corresponding to this "key" is the one we want.

So in a searchable data structure, the data is stored in the form of "Key - Value". For a dictionary, the key is the word, and the value is the interpretation of the word; for the library The key is the title of the book, and the value is the book itself.

Very understandable, but how to achieve it?

The most straightforward idea is to make a one-way linked list. When inserting the "key-value", insert it directly into the front of the header as a new header. Then, when searching, compare them from start to finish. If you find Key The same, the value of this element is returned, and it is easy to implement if it is deleted.

But in the worst case, the time complexity may be O(n). You may not feel big, the sorting is still O(n^2), but this is actually relative, we know that O(log(n)) is a first-order complexity lower than O(n), say white For a large n, log(n) is much smaller than n, so if we can find a complexity O(log(n)), then for massive data, it is better than O(n). ) It’s much faster.

The idea is very simple. Instead of making a line, it is better to make a tree.

As mentioned in the last lesson, the biggest difference between a tree structure and a singly linked list is that when the search process goes from the root to the leaf, it will encounter a fork in the road. It is because of the existence of the fork, so that we can find us earlier. That Value.

For the binary tree, what is the choice of this fork? Or, based on what is best?

There seems to be something in the high school mathematics called dichotomy. That is, we need to find the root of a function. First take an interval, prove that the root must be in this interval, then take the midpoint, judge whether the midpoint is the root, and then continue iterating...

So the same for binary trees! We let each key act as the "Chuhe Hanjie", go to the right, you will encounter all the other keys larger than the key, go left, you will encounter a smaller key.

and so:

The Binary Search Tree has a binary tree structure, each node has a comparable Key, and for any node, the key of all its left nodes is smaller than its Key, and all right The node's Key is larger than its Key. (hereinafter referred to as BST)

for example:

The picture shows a binary search tree. First of all, it has a binary tree structure. Needless to say? Each ball is a node, each node is a "key-value", the number written on the ball is Key, we find that the data type of Key is a positive integer, obviously it can be compared (String type, etc.) Then we find that each element is perfectly squeezed to the left, smaller than it, to the right.

Ok, let's take a look at what a full search tree needs to do:

It’s so tired, I’m finally finished, so trouble~

I chose a few of them that may have problems.

What type of Key and Value is this? The answer is Generic Type! What is a generic? To put it plainly, it is abstract, formalized. When BST is initialized, we need to pass two types like parameters, and let BST become a specific search tree. For example, key can be an integer, value can be a string. Etc., generics are an important basic knowledge of Java. If you don't understand, readers will understand it.

Looking at the penultimate and third, what is this iterator (Iterable)?

To put it bluntly, all the keys are extracted and made into an array-like thing. Then we can use the for-each statement to traverse this array-like thing without affecting the contents of the BST.

For example, there are three key-value pairs in your BST:

"A": "Arab" | "B": "Banana" | "C": "China"

You can do this:

BST<String, String> bst = new BST()<String, String>

For (String s: bst.keys()) {

System.out.println(s);

}

The contents of the for statement parentheses indicate "for each string in bst.keys()", the statement will automatically traverse all the strings returned by bst.keys, because bst.keys itself is an iterator that inherits the Iterable interface.

Ok, let's take a look at the implementation of the important methods. First of all, warn everyone that the method will use a large number of Recursive methods. It is not difficult, don't panic.

Let's first look at the implementation of the get() and put() methods:

Suppose BST is as above, we call get(9), what happens? In fact, it is:

∵ 9 > 8 Go right

∵ 9 < 10 Go left

∵ 9 = 9 Returns the value corresponding to 9.

Suppose now that put(5, "Love") is called, then we:

∵ 5 < 9 Go left

∵ 5 < 6 Go left

∵ 5 > 6 Go right

∵ The current element is null, change the element to (5, “Love”)

In fact, it is quite simple, then we look at the floor() method. The purpose of floor(x) is that the key closest to x and smaller than x is similar to the floor, lower than you, but closest to you.

Then if we call floor(6.5), we will:

∵ 6.5 < 8 So the floor must be on the left side of 8 (because the floor is to be made smaller than 6.5, but all elements on the right side of 8 and 8 are strictly greater than 6.5, so it is impossible!)

∵ 6.5 > 6 So the floor may be on the right side of 6 (because 6 is smaller than 6.5, it satisfies the necessary conditions for making the floor, but there may be elements smaller than 6 and smaller than 6.5 on the right side)

∵ 6.5 < 7 So the floor must be on the left side of 7 and go left.

The left side of ∵ 7 is null, so the logic is: floor must be on the left side of 7, the floor must be 6 or on the right side of 6, take an intersection, the only possible floor is the original 6~

Since floor and ceiling are bilaterally symmetric, they will not be described again.

Next, let's talk about the delete() method. This method is the most complicated, that is, delete a node from the existing BST. One thing to note is that deleting an element here is different from the binary heap of the previous lesson. That is Priority. Queue, we just operate on leaf and root, here we are going to delete the elements of a specific location.

Let us see, if I want to delete 4, is it very simple? It's good to let the node where 4 is equal to null. The difficulty is to delete 6, which is the node that connects the two sides. You can't directly remove it, like this:

In this case, is 8 connected to three elements? Is it wrong?

what is it now? In fact, it is very simple. After deleting 6, I will let 7 be the original 6 location.

I let the left side of 7 point to 4, let the right side of 7 point to the right branch of the original 6, and then delete the smallest element of the right branch (that is, 7), and then it will do.

I suggest that when you think about this problem, you must learn to flatten the positions of these elements, that is, project them all onto a number axis, so that their size relationship is very clear. Before deleting, 6 and 7 are phases. Neighboring elements, 6 is gone, 7 takes over its position, and it is reasonable.

There are so many basic ideas, and the specific implementation still needs code code.

``````package BST;

import edu.princeton.cs.algs4.Queue;

public class MyBST<Key extends Comparable<Key>, Value> {
private Node root;

/**
*
* @param key
* @param val
*/
public MyBST() {

}

/**
* put a key-value pair in the BST/
*
* @param key
* @param val
*/
public void put(Key key, Value val) {
if (key == null) {
throw new IllegalArgumentException("can't pass null key");
}
if (val == null) {
delete(key);
return;
}
root = put(root, key, val);
}

public Node put(Node current, Key key, Value val) {
if (current == null) { return new Node(key, val, 1); }
int cmp = key.compareTo(current.key);
if (cmp < 0) {
current.left = put(current.left, key, val);
} else if (cmp > 0) {
current.right = put(current.right, key, val);
} else {
current.value = val;
}
current.size = 1 + size(current.left) + size(current.right);
return current;
}

/**
* get a value with a specified key.
*
* @throws Exception
*/
public Value get(Key key) {
if (key == null) {
throw new IllegalArgumentException("cannot pass null key");
}
return get(root, key);

}

private Value get(Node current, Key key) {
if (current == null) {
return null;
}
int cmp = key.compareTo(current.key);
if (cmp < 0) {
return get(current.left, key);
} else if (cmp > 0) {
return get(current.right, key);
} else {
return current.value;
}
}
/**
* delete a key-value pair with the specified key.
*
* @throws Exception
*/
public void delete(Key key) {
if (key == null) {
throw new IllegalArgumentException("cannot pass null key");
}
root = delete(root, key);
}

private Node delete(Node current, Key key) {
if (current == null) {

}
int cmp = key.compareTo(current.key);
if (cmp < 0) {
current.left = delete(current.left, key);
} else if (cmp > 0) {
current.right = delete(current.right, key);
} else {
if (current.left == null) {
return current.right;
} else if (current.right == null){
return current.left;
} else {
Node tmp = current;
current = min(tmp.right);
current.right = deleteMin(tmp.right);
current.left = tmp.left;
}
}
current.size = size(current.left) + size(current.right) + 1;
return current;
}

/**
* check if the BST contains the specified key.
*
* @param key
* @return true if exists, and vice versa.
* @throws Exception
*/
public boolean contains(Key key) {
if (key == null) {
throw new IllegalArgumentException("cannot pass null key");
}
return get(key) != null;
}

/*private boolean contains(Node current, Key key) {
if (current == null) {
return false;
}
int cmp = key.compareTo(current.key);
if (cmp < 0) {
return contains(current.left, key);
} else if (cmp > 0) {
return contains(current.right, key);
} else {
return true;
}
}
*/
/**
* check if the table is empty.
*
* @return true if empty, and vice versa.
*/
public boolean isEmpty() {
return size() == 0;
}

/**
* return the number of key-value pairs.
*
* @return
*/
public int size() {
if (root == null) {
return 0;
}
return root.size;
}
public int size(Node node) {
if (node == null) {
return 0;
}
return node.size;
}

/**
* return the smallest key.
*
* @return the smallest key.
*/
public Key min() {
return min(root).key;
}

private Node min(Node current) {
if (current.left == null) {
return current;
}
return min(current.left);
}

/**
* return the largest key.
*
* @return the largest key.
*/
public Key max() {
return max(root).key;
}

private Node max(Node current) {
if (current.right == null) {
return current;
}
return max(current.right);
}

/**
* return the largest key smaller than the specified key.
*
* @param key
* @return the largest key smaller than the specified key.
*/
public Key floor(Key key) {
return floor(root, key);
}

private Key floor(Node current, Key key) {
if (current == null) { return null; }
if (key.compareTo(current.key) < 0) { return floor(current.left, key); }
// if key is smaller than the current one.
if (key.compareTo(current.key) == 0) { return current.key; }
Key tmp = floor(current.right, key);
if (tmp != null) {
return tmp;
} else return current.key;
}
/**
* return the smallest key larger than the specified key.
*
* @param key
* @return the smallest key larger than the specified key.
*/
public Key ceiling(Key key) {
return ceiling(root, key);
}

private Key ceiling(Node current, Key key) {
if (current == null) { return null; }
if (key.compareTo(current.key) > 0) { return ceiling(current.right, key); }
// if key is larger than the current one.
if (key.compareTo(current.key) == 0) { return current.key; }
Key tmp = ceiling(current.left, key);
if (tmp != null) { return tmp; }
return current.key;
}
/**
* return the rank of the specified key.
*
* @param key
* @return the rank of the specified key.
*/
public int rank(Key key) {
if (key == null) {
throw new IllegalArgumentException("cannot pass null key");
}
return rank(root, key);
}

private int rank(Node current, Key key) {
if (current == null) {
return 0;
}
int cmp = key.compareTo(current.key);
if (cmp < 0) {
return rank(current.left, key);
} else if (cmp > 0) {
return 1 + size(current.left) + rank(current.right, key);
} else {
return size(current.left);
}
}

/**
* return the key with rank k.
*
* @param k
* @return key of rank k.
*/
public Key select(int k) {
if (k < 0 || k > size() - 1) {
throw new IllegalArgumentException("rank out of bound");
}
return select(root, k).key;
}

private Node select(Node current, int k) {
if (current == null) {
return null;
}
int cmp = k - size(current.left);
// 'cmp < 0' indicates node to be selected is in the left tree.
// 'cmp > 0' indicates node to be selected is in the right tree.
// 'cmp = 0' indicates node to be selected is the current node.
if (cmp < 0) {
return select(current.left, k);
} else if (cmp > 0) {
return select(current.right, k - size(current.left) - 1);
} else {
return current;
}
}
/**
* delete the smallest key.
*/
public void deleteMin() {
root = deleteMin(root);
}

private Node deleteMin(Node current) {
if (current.left == null) {
return current.right;
}
current.left = deleteMin(current.left);
current.size = size(current.left) + size(current.right) + 1;
return current;
}

/**
* delete the largest key.
*/
public void deleteMax() {
root = deleteMax(root);
}

private Node deleteMax(Node current) {
if (current.right == null) {
return current.left;
}
current.right = deleteMax(current.right);
current.size = size(current.right) + size(current.left) + 1;
return current;
}
/**
* return the number of keys in [lo, hi].
*
* @param lo minimum endpoint.
* @param hi maximum endpoint.
* @return the number of keys in [lo, hi].
*/
public int size(Key lo, Key hi) {
if (lo == null || hi == null) {
throw new IllegalArgumentException("cannot pass null key");
}
if (lo.compareTo(hi) > 0) return 0;
if (contains(hi)) {
return rank(hi) - rank(lo) + 1;
}
return rank(hi) - rank(lo);
}

/**
* keys in [lo, hi] in sorted order.
*
* @param lo
* @param hi
* @return keys in [lo, hi] in sorted order.
*/
public Iterable<Key> keys(Key lo, Key hi) {
if (lo == null || hi == null) {
throw new IllegalArgumentException("cannot pass null key");
}
Queue<Key> queue = new Queue<Key>();
keys(root, lo, hi, queue);
return queue;
}

private void keys(Node current, Key lo, Key hi, Queue<Key> queue) {
if (current == null) {
return;
}
int cmpLo = lo.compareTo(current.key);
int cmpHi = hi.compareTo(current.key);
if (cmpLo < 0) {
keys(current.left, lo, hi, queue);
}
if (cmpLo <= 0 && cmpHi >= 0) {
queue.enqueue(current.key);
}
if (cmpHi > 0) {
keys(current.right, lo, hi, queue);
}
}

/**
* all keys in the table in sorted order.
*/
public Iterable<Key> keys() {
if (isEmpty()) return new Queue<Key>();
return keys(min(), max());
}

private class Node {
private Value value;
private Key key;
private Node left, right;
int size;
public Node(Key key, Value value, int size) {
this.value = value;
this.key = key;
this.size = size;
}
}

public static void main(String[] args) {
MyBST<String, String> bst = new MyBST<String, String>();
bst.put("C", "Ci");
bst.put("B", "Bi");
bst.put("E", "Ei");
bst.put("D", "Di");
bst.deleteMax();
for (String s: bst.keys()) {
System.out.println(s);
}

}
}
``````

The basic idea has been explained, please read the comments to understand the meaning of the code.