1. Case-insensitive comparison implemented by a Comparator
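The comparator iterates over the shorter of the two strings. For each pair of characters, the comparison proceeds as follows:
(1) Check whether the two characters are equal as they are.
(2) Convert both characters to uppercase and check whether they are equal.
(3) Convert both characters to lowercase and check whether they are equal.
Note that when the uppercase forms differ, the code goes on to check whether the lowercase forms are equal. If every character in the common prefix matches, the result is the difference between the two lengths.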
public static final Comparator<String> CASE_INSENSITIVE_ORDER
                                     = new CaseInsensitiveComparator();
private static class CaseInsensitiveComparator
        implements Comparator<String>, java.io.Serializable {
    // use serialVersionUID from JDK 1.2.2 for interoperability
    private static final long serialVersionUID = 8575799808933029326L;

    public int compare(String s1, String s2) {
        int n1 = s1.length();
        int n2 = s2.length();
        int min = Math.min(n1, n2);
        for (int i = 0; i < min; i++) {
            char c1 = s1.charAt(i);
            char c2 = s2.charAt(i);
            if (c1 != c2) { // compare character by character
                c1 = Character.toUpperCase(c1); // convert to uppercase
                c2 = Character.toUpperCase(c2);
                if (c1 != c2) {
                    c1 = Character.toLowerCase(c1); // convert to lowercase
                    c2 = Character.toLowerCase(c2);
                    if (c1 != c2) {
                        // No overflow because of numeric promotion
                        return c1 - c2;
                    }
                }
            }
        }
        return n1 - n2;
    }

    /** Replaces the de-serialized object. */
    private Object readResolve() { return CASE_INSENSITIVE_ORDER; }
}
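As a usage illustration for the comparator above, here is a minimal sketch. The class name `CaseInsensitiveOrderDemo` and the helper `sortIgnoringCase` are hypothetical names chosen for this example; only `String.CASE_INSENSITIVE_ORDER` comes from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CaseInsensitiveOrderDemo {
    // Hypothetical helper: returns a sorted copy, ordering strings without regard to case.
    static List<String> sortIgnoringCase(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        // Strings that differ only in case compare as equal (result 0).
        System.out.println(String.CASE_INSENSITIVE_ORDER.compare("Hello", "hELLO"));

        // When one string is a case-insensitive prefix of the other,
        // the shorter one sorts first (negative result).
        System.out.println(String.CASE_INSENSITIVE_ORDER.compare("abc", "ABCD") < 0);

        System.out.println(sortIgnoringCase(Arrays.asList("banana", "Apple", "cherry")));
    }
}
```

With the default `String` ordering, "Apple" would also sort first here, but only because uppercase letters precede lowercase ones; a list such as "apple", "Banana" is where the two orderings differ.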
2. The equalsIgnoreCase method
Here too, when the toUpperCase comparison fails, toLowerCase is called as a further check.
public boolean equalsIgnoreCase(String anotherString) {
    return (this == anotherString) ? true
            : (anotherString != null)
            && (anotherString.value.length == value.length)
            && regionMatches(true, 0, anotherString, 0, value.length);
}

public boolean regionMatches(boolean ignoreCase, int toffset,
        String other, int ooffset, int len) {
    char ta[] = value;
    int to = toffset;
    char pa[] = other.value;
    int po = ooffset;
    // Note: toffset, ooffset, or len might be near -1>>>1.
    if ((ooffset < 0) || (toffset < 0)
            || (toffset > (long)value.length - len)
            || (ooffset > (long)other.value.length - len)) {
        return false;
    }
    while (len-- > 0) {
        char c1 = ta[to++];
        char c2 = pa[po++];
        if (c1 == c2) { // 1. equal as they are?
            continue;
        }
        if (ignoreCase) { // case-insensitive mode
            // If characters don't match but case may be ignored,
            // try converting both characters to uppercase.
            // If the results match, then the comparison scan should
            // continue.
            char u1 = Character.toUpperCase(c1);
            char u2 = Character.toUpperCase(c2);
            if (u1 == u2) { // 2. equal after toUpperCase?
                continue;
            }
            // Unfortunately, conversion to uppercase does not work properly
            // for the Georgian alphabet, which has strange rules about case
            // conversion.  So we need to make one last check before
            // exiting.
            // 3. equal after toLowerCase?
            if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
                continue;
            }
        }
        return false;
    }
    return true;
}
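The two methods above can be exercised as in the following sketch. The class `EqualsIgnoreCaseDemo` and the helper `startsWithIgnoreCase` are hypothetical names introduced for illustration; they are not part of `java.lang.String`.

```java
class EqualsIgnoreCaseDemo {
    // Hypothetical helper: case-insensitive prefix test built on regionMatches.
    // If prefix is longer than text, regionMatches fails its bounds check and returns false.
    static boolean startsWithIgnoreCase(String text, String prefix) {
        return text.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    public static void main(String[] args) {
        // Same length and every character matches after case folding.
        System.out.println("Content-Type".equalsIgnoreCase("content-type"));

        // A null argument is handled by the null check and yields false.
        System.out.println("abc".equalsIgnoreCase(null));

        // Different lengths fail before any character is compared.
        System.out.println("abc".equalsIgnoreCase("ABCD"));

        System.out.println(startsWithIgnoreCase("Content-Type: text/html", "CONTENT-TYPE"));
        System.out.println(startsWithIgnoreCase("ab", "abc"));
    }
}
```

Calling `regionMatches` with `ignoreCase` set to `true` avoids allocating lowercased copies of both strings, which is what a `toLowerCase().startsWith(...)` approach would do.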
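3. Why is a match after either toUpperCase or toLowerCase enough to treat two characters as equal?
The code comment in section 2 gives the reason: the Georgian alphabet.
For some characters, such as those in the Georgian alphabet, case conversion is not symmetric. Two such characters may differ when compared directly, and may still differ after uppercase conversion, yet map to the same character (possibly an ASCII one) after lowercase conversion. The comparison therefore accepts a match from either conversion.
[Figure: characters of the Georgian alphabet]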
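The need for the second check can be shown with a small sketch. As an assumption for illustration, it uses the Latin capital letter I with dot above (U+0130) rather than a Georgian character, because its `Character` case mappings are well known: it has no separate uppercase form, but its lowercase form is the ASCII letter `i`. The class `CaseMappingDemo` and both helper methods are hypothetical names; `sameIgnoringCase` mirrors the per-character logic quoted in section 2 and is not JDK code.

```java
class CaseMappingDemo {
    // Mirrors the three-step per-character check: as-is, uppercase, then lowercase.
    static boolean sameIgnoringCase(char c1, char c2) {
        if (c1 == c2) {
            return true;
        }
        char u1 = Character.toUpperCase(c1);
        char u2 = Character.toUpperCase(c2);
        if (u1 == u2) {
            return true;
        }
        return Character.toLowerCase(u1) == Character.toLowerCase(u2);
    }

    // A naive variant that stops after the uppercase comparison.
    static boolean sameByUpperCaseOnly(char c1, char c2) {
        return Character.toUpperCase(c1) == Character.toUpperCase(c2);
    }

    public static void main(String[] args) {
        char dottedCapitalI = '\u0130';

        // Uppercase forms are U+0130 and 'I', so the uppercase-only check fails.
        System.out.println(sameByUpperCaseOnly(dottedCapitalI, 'i'));

        // Lowercasing both uppercase forms gives 'i' and 'i', so the full check succeeds.
        System.out.println(sameIgnoringCase(dottedCapitalI, 'i'));

        // String.equalsIgnoreCase agrees with the full check.
        System.out.println("\u0130".equalsIgnoreCase("i"));
    }
}
```

Without the lowercase step, the first and last comparisons would disagree, which is the kind of asymmetry the JDK comment describes.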






