Unlocking Efficiency: Counting Sort in C
In our ongoing C Language Series, we've explored a multitude of sorting algorithms, from the comparison-based giants like Merge Sort and Quick Sort to simpler ones like Bubble Sort. Today, we delve into a fascinating non-comparison-based algorithm: Counting Sort. This algorithm offers exceptional performance under specific conditions, making it a valuable tool in your programming arsenal.
What is Counting Sort?
Counting Sort is an integer sorting algorithm that operates by counting the number of occurrences of each distinct element in the input array. It then uses these counts to determine the positions of each element in the sorted output array. Unlike comparison sorts (which have a lower bound of O(n log n)), Counting Sort can achieve linear time complexity, O(n + k), where 'n' is the number of elements and 'k' is the range of input values.
Key characteristics of Counting Sort:
- Non-comparison-based: It does not rely on comparing elements to sort them.
- Integer Sort: Primarily designed for sorting non-negative integers.
- Stable: It preserves the relative order of elements with equal values. This property is crucial when Counting Sort is used as a subroutine in other algorithms, such as Radix Sort.
How Counting Sort Works: The Algorithm Explained
Let's break down the Counting Sort algorithm into its fundamental steps. Imagine you want to sort an array of non-negative integers:
-
Determine Range and Maximum Element:
First, iterate through the input array to find the largest element (let's call it
max). This value is critical because it dictates the size of our auxiliarycountarray, which will store frequencies for numbers from 0 up tomax. -
Initialize Count Array:
Create a
countarray of sizemax + 1. Initialize all elements of this array to zero. Each indexiin thecountarray will eventually store the frequency of the numberifrom the input array. -
Populate Count Array (Frequency Count):
Iterate through the original input array. For each element
xyou encounter, incrementcount[x]. After this step,count[i]will hold the exact number of timesiappears in the input array. -
Modify Count Array (Cumulative Sum):
Iterate through the
countarray from the second element (index 1) up tomax. For eachcount[i], add the value ofcount[i-1]to it. After this modification,count[i]will store the total number of elements less than or equal toi. Essentially, it tells us the correct sorted position (or rather, the index *after* which all elements less than or equal toihave been placed) for each number.Example: If
count[5]becomes 10, it means there are 10 elements in the input array that are less than or equal to 5. The last occurrence of 5 in the sorted array will be at index 9 (0-indexed). -
Build Output Array:
Create an
outputarray of the same size as the input array. Now, iterate through the input array from right to left (fromsize-1down to 0). This specific iteration direction is crucial for maintaining the algorithm's stability.For each element
xfrom the input array:- Decrement
count[x]. This gives you the precise 0-indexed position for the currentxin theoutputarray. - Place
xatoutput[count[x]].
- Decrement
-
Copy Back to Original Array:
Finally, copy the elements from the sorted
outputarray back into the original input array. The original array is now sorted.
When to Use Counting Sort? Advantages and Limitations
Counting Sort shines in particular scenarios:
- Small Range of Integers: It's incredibly efficient when the range of input values (
k) is not significantly larger than the number of elements (n). Ideally,k ≈ n. - Subroutine for Radix Sort: Counting Sort is often used as a stable sorting subroutine in the more general Radix Sort algorithm, which can handle larger integer ranges by sorting digit by digit.
- Stability Required: If the relative order of equal elements must be preserved, Counting Sort is a great choice.
However, it has its limitations:
- Non-negative Integers Only: Directly, it works only for non-negative integers. Negative numbers require a slight modification (e.g., shifting all numbers by adding the absolute value of the minimum element).
- Space Complexity: The need for a
countarray of sizek+1and anoutputarray of sizenmeans its space complexity can be a drawback ifkis very large. - Not for Arbitrary Data Types: It cannot directly sort floating-point numbers, strings, or other complex data structures without a mapping function to convert them into suitable integer keys.
Time and Space Complexity Analysis
-
Time Complexity:
The time complexity of Counting Sort is
O(n + k), where:nis the number of elements in the input array.kis the range of input values (max_element - min_element + 1).
Each step (finding max, populating count array, modifying count array, building output array, copying back) takes linear time with respect to
nork. This makes it a very fast algorithm whenkis small (i.e., whenkis proportional ton). -
Space Complexity:
The space complexity is
O(n + k). This comes from:- The auxiliary
countarray of sizek + 1. - The auxiliary
outputarray of sizen.
- The auxiliary
C Language Implementation of Counting Sort
Let's put theory into practice with a C implementation. We'll include helper functions for finding the maximum element and printing the array for clarity, along with robust memory allocation.
#include <stdio.h>
#include <stdlib.h> // For malloc, calloc, and free
// Function to print an array
void printArray(int array[], int size) {
for (int i = 0; i < size; ++i) {
printf("%d ", array[i]);
}
printf("\n");
}
// Function to find the maximum element in an array
// Assumes array is not empty
int findMax(int array[], int size) {
int max = array[0];
for (int i = 1; i < size; ++i) {
if (array[i] > max) {
max = array[i];
}
}
return max;
}
// Counting Sort function
void countingSort(int array[], int size) {
if (size <= 1) return; // Already sorted or empty
// 1. Find the maximum element to determine the range
int max = findMax(array, size);
// Create an output array dynamically
int *output = (int *)malloc(size * sizeof(int));
if (output == NULL) {
fprintf(stderr, "Memory allocation for output array failed!\n");
return;
}
// Create a count array and initialize all elements to 0
// Size of count array is max + 1 because elements can be from 0 to max
int *count = (int *)calloc(max + 1, sizeof(int)); // calloc initializes to zero
if (count == NULL) {
fprintf(stderr, "Memory allocation for count array failed!\n");
free(output); // Clean up previously allocated memory
return;
}
// 2. Store the count of each element
for (int i = 0; i < size; i++) {
count[array[i]]++;
}
// 3. Store cumulative count (modified count array)
// count[i] now contains the actual position of this element in output array
// (specifically, the index + 1 of where the element i should go)
for (int i = 1; i <= max; i++) {
count[i] += count[i - 1];
}
// 4. Place elements from input array to output array
// Iterate from right to left to maintain stability
for (int i = size - 1; i >= 0; i--) {
// Correct position is count[array[i]] - 1 because count stores cumulative
// values, indicating the position *after* which elements are placed.
// E.g., if count[5] = 3, means there are 3 numbers <= 5.
// The last 5 would go at index 2 (0-indexed).
output[count[array[i]] - 1] = array[i];
count[array[i]]--; // Decrement count for current element to place duplicates correctly
}
// 5. Copy the sorted elements back to the original array
// Alternatively, memcpy(array, output, size * sizeof(int)); can be used.
for (int i = 0; i < size; i++) {
array[i] = output[i];
}
// Free dynamically allocated memory
free(output);
free(count);
}
// Main function to test the Counting Sort
int main() {
int array[] = {4, 2, 2, 8, 3, 3, 1};
int n = sizeof(array) / sizeof(array[0]);
printf("Original array: ");
printArray(array, n);
countingSort(array, n);
printf("Sorted array: ");
printArray(array, n);
// Another example
int array2[] = {9, 5, 1, 8, 3, 7, 2, 4, 6};
int n2 = sizeof(array2) / sizeof(array2[0]);
printf("\nOriginal array 2: ");
printArray(array2, n2);
countingSort(array2, n2);
printf("Sorted array 2: ");
printArray(array2, n2);
// Example with duplicate max value and single element
int array3[] = {1, 1, 1, 1, 1};
int n3 = sizeof(array3) / sizeof(array3[0]);
printf("\nOriginal array 3: ");
printArray(array3, n3);
countingSort(array3, n3);
printf("Sorted array 3: ");
printArray(array3, n3);
// Example with a single element
int array4[] = {7};
int n4 = sizeof(array4) / sizeof(array4[0]);
printf("\nOriginal array 4: ");
printArray(array4, n4);
countingSort(array4, n4);
printf("Sorted array 4: ");
printArray(array4, n4);
// Example with 0s
int array5[] = {0, 5, 0, 2, 1};
int n5 = sizeof(array5) / sizeof(array5[0]);
printf("\nOriginal array 5: ");
printArray(array5, n5);
countingSort(array5, n5);
printf("Sorted array 5: ");
printArray(array5, n5);
return 0;
}
Conclusion
Counting Sort is a powerful and efficient sorting algorithm for specific use cases. Its linear time complexity, O(n + k), makes it a superior choice over comparison-based sorts when dealing with integers within a limited range. While it requires additional space proportional to the range of input values and is primarily restricted to non-negative integers, its stability and speed make it an indispensable algorithm, especially as a building block for more advanced sorting techniques like Radix Sort. Understanding when and how to apply Counting Sort effectively will significantly enhance your problem-solving capabilities in C programming.