经典的线性时间排序问题,题目标签是hard,主要是要求了线性时间内解决问题,那么很自然想到基数排序和桶排序了。除此之外还有一种常用的线性时间排序,计数排序可以看做间隔为1的桶排序,在数字比较集中且可转化为整数时比较有效,因此计数排序不适合本题目。
题目:
题目是说要在线性时间和空间内找出排序后的数组相邻数字间的最大间隔。
Given an unsorted array, find the maximum difference between the successive elements in its sorted form.
Try to solve it in linear time/space.
Return 0 if the array contains less than 2 elements.
You may assume all elements in the array are non-negative integers and fit in the 32-bit signed integer range.
思路:
说道线性时间空间排序,那么只有计数排序、桶排序、基数排序可以使用了,事实上计数排序就是间隔为1的桶排序,这道题桶排序和基数排序都可以解,这篇博客给出桶排序和基数排序的答案。
假设有n个数,根据数组最大值和最小值来划分n+1个桶(buckets),在分好之后,必然有一个桶是空的, 这里也可以是n个桶结果是一样的,并且第一个桶和最后一个桶都有数字存储。那么间隔必然大于一个桶的空间,所以只需要考虑桶间的间隔就可以了。于是第一遍计算每个桶内的最大值和最小值,前一个非空桶的最大值与后一个非空桶最小值之间的间隔可能是最大值,然后进行搜索就可以了。
代码(桶排序法):
注意一些细节,比如最大值最小值相等,数字个数小于2等。
class Solution {
public:
int maximumGap(vector<int>& nums) {
const int len = nums.size();
if (len < 2){
return 0;
}
vector<vector<int>> buckets(len+1);
// can be optimized
int n_max = nums[0], n_min = nums[0];
for (auto num : nums){
if (num > n_max){
n_max = num;
}
else if(num < n_min){
n_min = num;
}
}
if (n_max == n_min){
return 0;
}
// build buckets
const float gap = (n_max - n_min) / (float)len;
for (auto num : nums){
int idx = (num - n_min) / gap;
if (buckets[idx].empty()){
buckets[idx].reserve(2);
buckets[idx].push_back(num);
buckets[idx].push_back(num);
}
else{
if (num > buckets[idx][1]){
buckets[idx][1] = num;
} else if (num < buckets[idx][0]){
buckets[idx][0] = num;
}
}
}
// search maxgap
int maxgap = 0;
int prev = 0;
for (int i = 0; i < buckets.size(); ++i){
if (buckets[i].empty()){
continue;
}
maxgap = max(maxgap, buckets[i][0] - buckets[prev][1]);
prev = i;
}
return maxgap;
}
};
这里还有更快的写法,当然思路是一样的,应用了一些编程上的优化。还有就是注意到最大值一定出现在空桶两边。上面的代码是8ms,下面的代码是6ms
class Solution {
public:
int maximumGapO(vector<int>& nums) {
// the straight-forward solution would be bucket sort
// the maximum gap is garantteed to be in different buckets
// O(N) in terms of both time and space
int n = nums.size();
if (n<2)
return 0;
int maxElement = nums[0];
int minElement = nums[0];
for (auto num : nums){
if (num > maxElement){
maxElement = num;
}
else if (num < minElement){
minElement = num;
}
}
//int minElement = *min_element(nums.begin(), nums.end());
//int maxElement = *max_element(nums.begin(), nums.end());
int length = maxElement - minElement;
if (length <= 1)
return length;
vector<int> bucket_min(n, INT_MAX); // minimum value in each bucket
vector<int> bucket_max(n, INT_MIN); // maximum value in each bucket
int i, index;
for (i = 0; i<n; i++)
{
index = 1.0*(nums[i] - minElement) / length*(n - 1);
bucket_min[index] = min(bucket_min[index], nums[i]);
bucket_max[index] = max(bucket_max[index], nums[i]);
}
int result = 0;
int premax = bucket_max[0];
for (i = 1; i<n; i++)
{
if (bucket_max[i] > INT_MIN) // there are elements in this bucket
{
result = max(result, bucket_min[i] - premax);
premax = bucket_max[i];
}
}
return result;
}
};
代码(基数排序法):
基数排数,从低位到高位以此进行稳定排序,对排序后的数组相邻做差求得最大间隔。基数排序虽然看着是线性时间的,但是实际开销的常数因子是比较大的,代码摆这里做个参考好了。
class Solution {
public:
int maximumGap(vector<int> &num) {
if (num.size() < 2) return 0;
int imax = num[0];
for (int x : num) {
if (x > imax) imax = x;
}
// LSD (least significant digit) radix sort
int base = 10;
vector<list<int>> buckets(base);
vector<int> numbers = num;
for (long long i = 1; i <= imax; i *= base) {
// distribute numbers
for (int x : numbers) {
int index = x / i % base;
buckets[index].push_back(x);
}
// put back numbers
for (int j = 0, k = 0; j < buckets.size(); j++) {
for (int x : buckets[j])
numbers[k++] = x;
buckets[j].clear();
}
}
int result = 0;
for (int i = 1; i < numbers.size(); i++) {
int gap = numbers[i] - numbers[i-1];
if (gap > result) result = gap;
}
return result;
}
};