C-Language-Series-#173-Binary-Search-Tree-in-C
Welcome back to our C Language Series! In this installment, we're diving deep into one of the most fundamental and efficient data structures for organizing and retrieving data: the Binary Search Tree (BST). Understanding BSTs is crucial for any serious programmer, offering a robust solution for maintaining sorted data and performing operations like search, insertion, and deletion with impressive average-case efficiency.
A Binary Search Tree is a hierarchical data structure that mimics the concept of a tree, but with specific rules that make it highly effective for searching. Let's break down its properties, implementation in C, and common operations.
What is a Binary Search Tree?
A Binary Search Tree is a node-based binary tree data structure which has the following properties:
- The left subtree of a node contains only nodes with keys less than the node's key.
- The right subtree of a node contains only nodes with keys greater than the node's key.
- Both the left and right subtrees must also be binary search trees.
- There must be no duplicate keys.
These properties ensure that an in-order traversal of a BST will always yield a sorted list of its elements.
Why Use Binary Search Trees?
- Efficient Searching: On average, searching for an element takes O(log N) time, where N is the number of nodes.
- Efficient Insertion and Deletion: Like searching, these operations also average O(log N) time.
- Ordered Data: BSTs naturally keep data in a sorted order, making operations like finding minimum/maximum elements or performing range queries very fast.
BST Node Structure in C
Before we implement any operations, we need to define the structure of a single node in our BST. Each node will contain a piece of data (an integer key in our example) and pointers to its left and right children.
#include <stdio.h>
#include <stdlib.h>
// Definition of a BST node
struct Node {
int key;
struct Node *left;
struct Node *right;
};
// Function to create a new node
struct Node* createNode(int key) {
struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));
if (newNode == NULL) {
perror("Failed to allocate memory for new node");
exit(EXIT_FAILURE);
}
newNode->key = key;
newNode->left = NULL;
newNode->right = NULL;
return newNode;
}
Implementing Core BST Operations
1. Insertion into a BST
To insert a new node, we start at the root and traverse the tree. If the new key is less than the current node's key, we go left; otherwise, we go right. We continue this until we find an empty spot (a NULL pointer), where we attach the new node.
// Function to insert a new key into the BST
struct Node* insert(struct Node* root, int key) {
// If the tree is empty, return a new node
if (root == NULL) {
return createNode(key);
}
// Otherwise, recur down the tree
if (key < root->key) {
root->left = insert(root->left, key);
} else if (key > root->key) { // Disallow duplicates (or handle as per requirement)
root->right = insert(root->right, key);
}
// If key is equal to root->key, it's a duplicate, do nothing (or handle error)
return root;
}
2. Searching in a BST
Searching is similar to insertion. We compare the target key with the current node's key. If they match, we found it. If the target is smaller, we search in the left subtree; if larger, in the right subtree. If we reach a NULL pointer, the key is not in the tree.
// Function to search for a key in the BST
struct Node* search(struct Node* root, int key) {
// Base Cases: root is null or key is present at root
if (root == NULL || root->key == key) {
return root;
}
// Key is greater than root's key
if (key > root->key) {
return search(root->right, key);
}
// Key is smaller than root's key
return search(root->left, key);
}
3. Traversal: In-Order
In-order traversal visits the left subtree, then the current node, then the right subtree. For a BST, this prints the elements in sorted ascending order.
// Function to perform in-order traversal of the BST
void inorderTraversal(struct Node* root) {
if (root != NULL) {
inorderTraversal(root->left);
printf("%d ", root->key);
inorderTraversal(root->right);
}
}
4. Finding the Minimum Value Node
This helper function is crucial for deletion. In a BST, the minimum value in any subtree is always the leftmost node in that subtree.
// Function to find the node with the minimum key value in a BST
struct Node* minValueNode(struct Node* node) {
struct Node* current = node;
// Loop down to find the leftmost leaf
while (current && current->left != NULL) {
current = current->left;
}
return current;
}
5. Deletion from a BST
Deletion is the most complex operation, as it has three main cases:
- Node with no children (Leaf Node): Simply remove the node.
- Node with one child: Replace the node with its child.
- Node with two children: Find the in-order successor (the smallest node in the right subtree), copy its content to the node to be deleted, and then delete the in-order successor (which will either be a leaf or have one child).
// Function to delete a key from the BST
struct Node* deleteNode(struct Node* root, int key) {
// Base case: If the tree is empty
if (root == NULL) {
return root;
}
// Recur down the tree
if (key < root->key) {
root->left = deleteNode(root->left, key);
} else if (key > root->key) {
root->right = deleteNode(root->right, key);
} else { // Node with key to be deleted found
// Case 1: Node with only one child or no child
if (root->left == NULL) {
struct Node* temp = root->right;
free(root);
return temp;
} else if (root->right == NULL) {
struct Node* temp = root->left;
free(root);
return temp;
}
// Case 2: Node with two children: Get the in-order successor
// (smallest in the right subtree)
struct Node* temp = minValueNode(root->right);
// Copy the in-order successor's content to this node
root->key = temp->key;
// Delete the in-order successor
root->right = deleteNode(root->right, temp->key);
}
return root;
}
Putting It All Together: A Main Function Example
Here's a simple main function to demonstrate the usage of the BST operations we've implemented.
int main() {
struct Node* root = NULL;
/* Let's build a BST
50
/ \
30 70
/ \ / \
20 40 60 80
*/
root = insert(root, 50);
insert(root, 30);
insert(root, 20);
insert(root, 40);
insert(root, 70);
insert(root, 60);
insert(root, 80);
printf("In-order traversal of the given tree: \n");
inorderTraversal(root); // Expected: 20 30 40 50 60 70 80
printf("\n");
printf("\nSearching for 40: ");
if (search(root, 40)) {
printf("Found.\n");
} else {
printf("Not Found.\n");
}
printf("Searching for 90: ");
if (search(root, 90)) {
printf("Found.\n");
} else {
printf("Not Found.\n");
}
printf("\nDelete 20\n");
root = deleteNode(root, 20);
printf("In-order traversal of the modified tree: \n");
inorderTraversal(root); // Expected: 30 40 50 60 70 80
printf("\n");
printf("\nDelete 70\n");
root = deleteNode(root, 70);
printf("In-order traversal of the modified tree: \n");
inorderTraversal(root); // Expected: 30 40 50 60 80
printf("\n");
printf("\nDelete 50 (root node)\n");
root = deleteNode(root, 50);
printf("In-order traversal of the modified tree: \n");
inorderTraversal(root); // Expected: 30 40 60 80
printf("\n");
// Important: Free all allocated memory to prevent memory leaks
// (A full BST destruction function would be needed for larger applications)
// For simplicity, we skip a full recursive free for this example.
return 0;
}
Time Complexity Summary
The efficiency of BST operations depends on the height of the tree.
- Average Case: If the tree is balanced, the height is log N. Thus, operations like search, insert, and delete take O(log N) time.
- Worst Case: If the tree becomes skewed (like a linked list, e.g., inserting elements in strictly increasing order), the height can be N. In this scenario, operations can degrade to O(N) time.
To avoid the worst-case scenario, techniques like self-balancing BSTs (e.g., AVL trees, Red-Black trees) are used in practice.
Conclusion
Binary Search Trees are a powerful and essential data structure in computer science. They provide an excellent foundation for understanding more complex tree structures and algorithms. While basic BSTs offer good average-case performance, being aware of their worst-case scenarios and knowing about self-balancing variants is key to building robust applications.
Practice implementing these operations, experiment with different insertion orders, and observe how the tree structure changes. This hands-on experience will solidify your understanding. Stay tuned for more topics in our C Language Series!