深入理解 Sort Comparator

2024-12-30 (Mon.)

在不同编程语言的标准库中,排序对于自定义比较函数的处理逻辑有所不同。为避免混淆,在此稍作整理。

C

在这里先回顾 strcmp 的返回值:

strcmp() returns an integer indicating the result of the comparison, as follows:

  • 0, if the s1 and s2 are equal;
  • a negative value if s1 is less than s2;
  • a positive value if s1 is greater than s2.

几乎所有编程语言的自定义排序函数(如果用整数作为返回值)都遵循 strcmp 的约定。

具体而言,qsort 使用一个函数指针作为比较函数:

int comparison_fn_t (const void *, const void *);
void qsort (void *array, size_t count, size_t size, comparison_fn_t compare)
  • 返回值 < 0,排序结果中 ab 之前(可以理解为排序后和入参顺序相同)
  • 返回值 > 0,排序结果中 ba 之前
  • 返回值 = 0,由于 qsort 不稳定,不保证相等时的顺序。
#include <stdio.h>
#include <stdlib.h>
int cmp(const void *a, const void *b) {
return *(int *)a - *(int *)b;
}
int main() {
int arr[] = {3, 1, 4, 1, 5, 9, 6};
qsort(arr, sizeof(arr) / sizeof(int), sizeof(int), cmp);
return 0;
}

C++

C++ 的 std::sort (不稳定,O(Nlog(N)O(N \cdot \log(N))和 std::stable_sort(稳定,O(Nlog(N))O(N \cdot \log(N)) 或内存受限时 O(Nlog2(N))O(N \cdot \log^2(N)))都支持可选的 cmp 参数,定义如下:

cmp: comparison function object (i.e. an object that satisfies the requirements of Compare) which returns ​true if the first argument is less than (i.e. is ordered before) the second.

cmp 返回值为 booltrue 时保证排序结果中 ab 前面,false 时表示 a 不小于 b,这时分为两种情况:

  1. ab 不等:std::sortstd::stable_sort 会交换 ab 的位置,使得 ba 前;
  2. ab 相等:std::stable_sort 会维持数组中的原有顺序,std::sort 由于是不稳定排序,不保证顺序。
std::vector<int> arr {3, 1, 4, 1, 5, 9, 6};
// Ascending order (default)
std::sort(arr.begin(), arr.end());
std::sort(arr.begin(), arr.end(), std::less<int>());
std::sort(arr.begin(), arr.end(), [](int a, int b) { return a < b; });
// Descending order
std::sort(arr.begin(), arr.end(), std::greater<int>());
std::sort(arr.begin(), arr.end(), [](int a, int b) { return a > b; });

未提供 cmp 时,默认使用 std::less<T>,即 a < b。因此也可重载 operator<

// Ascending by x, then by y
struct Point {
int x, y;
bool operator<(const Point& rhs) const {
return x == rhs.x ? y < rhs.y : x < rhs.x;
}
}
std::vector<Point> arr {{3, 1}, {4, 1}, {5, 9}, {6, 2}};
std::sort(arr.begin(), arr.end());

JavaScript

JavaScript 的 Array.prototype.sort(in-place,返回原数组引用)和 Array.prototype.toSorted(原数组不变,返回排序的新数组)均支持可选的 compareFn,定义如下(和 strcmp 一致):

Reference

compareFn: A function that determines the order of the elements. The function is called with the following arguments:

a: The first element for comparison. Will never be undefined.
b: The second element for comparison. Will never be undefined.
It should return a number where:

A negative value indicates that a should come before b. A positive value indicates that a should come after b. Zero or NaN indicates that a and b are considered equal.

To memorize this, remember that (a, b) => a - b sorts numbers in ascending order.

ECMAScript 10 以上的 sort稳定的。当返回值为 0 时,ab 的相对顺序不变。

const arr = [3, 1, 4, 1, 5, 9, 6];
// Ascending order (default)
arr.sort();
arr.sort((a, b) => a - b);
arr.sort((a, b) => a < b ? -1 : 1);
arr.sort((a, b) => a > b ? 1 : -1);
// Descending order
arr.sort((a, b) => b - a);
arr.sort((a, b) => a > b ? -1 : 1);
arr.sort((a, b) => a < b ? 1 : -1);

Python

Python 的 list.sortsorted 方法通过 key 传入 lambda function 控制从 iterable item 中获取可比较的键值,reverse 指定结构体升 / 降序。

需要自定义排序时,要使用 functools.cmp_to_key 将 comparison function 转换为 key,定义如下(和 strcmp 一致):

A comparison function is any callable that accepts two arguments, compares them, and returns a negative number for less-than, zero for equality, or a positive number for greater-than. A key function is a callable that accepts one argument and returns another value to be used as the sort key.

class Point:
def __init__(self, x, y):
self.x = x
self.y = y
def __repr__(self):
return f"({self.x}, {self.y})"
arr = [Point(3, 1), Point(4, 1), Point(5, 9), Point(6, 2)]
# Ascending by x, then ascending by y
arr.sort(key = lambda p: (p.x, p.y))
arr.sort(key = cmp_to_key(lambda a, b: a.x - b.x if a.x != b.x else a.y - b.y))

Python 的 sortsorted 保证稳定

© 2019 - 2025Juntong Chen || Dark Mode || RSS