Reading the Source of String's Case-Insensitive Comparison


Author: 坠叶飘香 | Published: 2019-04-18 17:34

1. Case-insensitive comparison implemented as a Comparator

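  • Iterate over the first min(n1, n2) characters, that is, the length of the shorter string:
    (1) check whether the two characters are equal as they are;
    (2) if not, convert both to uppercase and compare again;
    (3) if still not equal, convert both to lowercase and compare again.
    Note that after the uppercase comparison fails, the code additionally checks whether the lowercase forms are equal.
  • If every one of those characters matches, the result is the difference between the two lengths.

    Per-character comparison flow (JDK source):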
public static final Comparator<String> CASE_INSENSITIVE_ORDER
                                         = new CaseInsensitiveComparator();
    private static class CaseInsensitiveComparator
            implements Comparator<String>, java.io.Serializable {
        // use serialVersionUID from JDK 1.2.2 for interoperability
        private static final long serialVersionUID = 8575799808933029326L;

        public int compare(String s1, String s2) {
            int n1 = s1.length();
            int n2 = s2.length();
            int min = Math.min(n1, n2);
            for (int i = 0; i < min; i++) {
                char c1 = s1.charAt(i);
                char c2 = s2.charAt(i);
                if (c1 != c2) { // compare the characters one by one
                    c1 = Character.toUpperCase(c1); // convert to uppercase
                    c2 = Character.toUpperCase(c2);
                    if (c1 != c2) {
                        c1 = Character.toLowerCase(c1); // convert to lowercase
                        c2 = Character.toLowerCase(c2);
                        if (c1 != c2) {
                            // No overflow because of numeric promotion
                            return c1 - c2;
                        }
                    }
                }
            }
            return n1 - n2;
        }

        /** Replaces the de-serialized object. */
        private Object readResolve() { return CASE_INSENSITIVE_ORDER; }
    }
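
Because CaseInsensitiveComparator is a private nested class, the loop above cannot be called directly. The following is a minimal sketch that restates it as a plain static method and compares the sign of its result with String.CASE_INSENSITIVE_ORDER. The class name, method name, and sample strings are assumptions made for illustration and do not come from the JDK.

```java
// Hypothetical demo class; names and sample strings are illustrative only.
class CaseInsensitiveOrderSketch {

    // Restatement of the comparator loop: raw compare, then uppercase, then lowercase.
    static int compareIgnoreCase(String s1, String s2) {
        int n1 = s1.length();
        int n2 = s2.length();
        int min = Math.min(n1, n2);
        for (int i = 0; i < min; i++) {
            char c1 = s1.charAt(i);
            char c2 = s2.charAt(i);
            if (c1 == c2) continue;              // step 1: equal as-is
            c1 = Character.toUpperCase(c1);
            c2 = Character.toUpperCase(c2);
            if (c1 == c2) continue;              // step 2: equal after uppercasing
            c1 = Character.toLowerCase(c1);
            c2 = Character.toLowerCase(c2);
            if (c1 != c2) {
                return c1 - c2;                  // step 3: chars are promoted to int, no overflow
            }
        }
        return n1 - n2;                          // common prefix matched: shorter string sorts first
    }

    public static void main(String[] args) {
        String[][] pairs = { {"Hello", "hELLO"}, {"apple", "Banana"}, {"abc", "ABCD"} };
        for (String[] p : pairs) {
            int sketch = Integer.signum(compareIgnoreCase(p[0], p[1]));
            int jdk = Integer.signum(String.CASE_INSENSITIVE_ORDER.compare(p[0], p[1]));
            System.out.println(p[0] + " vs " + p[1] + ": sketch=" + sketch + ", JDK=" + jdk);
        }
    }
}
```

Only the sign is compared, since a Comparator's contract is defined by the sign of its result rather than by the exact value.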

2. The equalsIgnoreCase method

Here too, when the comparison after toUpperCase fails, toLowerCase is called as a further check.
public boolean equalsIgnoreCase(String anotherString) {
        return (this == anotherString) ? true
                : (anotherString != null)
                && (anotherString.value.length == value.length)
                && regionMatches(true, 0, anotherString, 0, value.length);
    }
public boolean regionMatches(boolean ignoreCase, int toffset,
            String other, int ooffset, int len) {
        char ta[] = value;
        int to = toffset;
        char pa[] = other.value;
        int po = ooffset;
        // Note: toffset, ooffset, or len might be near -1>>>1.
        if ((ooffset < 0) || (toffset < 0)
                || (toffset > (long)value.length - len)
                || (ooffset > (long)other.value.length - len)) {
            return false;
        }
        while (len-- > 0) {
            char c1 = ta[to++];
            char c2 = pa[po++];
            if (c1 == c2) { // 1. are the characters equal as they are?
                continue;
            }
            if (ignoreCase) { // only when case is to be ignored
                // If characters don't match but case may be ignored,
                // try converting both characters to uppercase.
                // If the results match, then the comparison scan should
                // continue.
                char u1 = Character.toUpperCase(c1);
                char u2 = Character.toUpperCase(c2);
                if (u1 == u2) {  // are they equal after toUpperCase?
                    continue;
                }
                // Unfortunately, conversion to uppercase does not work properly
                // for the Georgian alphabet, which has strange rules about case
                // conversion.  So we need to make one last check before
                // exiting.
                // are they equal after toLowerCase?
                if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
                    continue;
                }
            }
            return false;
        }
        return true;
    }
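
As a usage sketch of the two methods above: equalsIgnoreCase rejects null and strings of different length before any character is examined, and the public regionMatches overload returns false, without throwing, when an offset is out of range. The helper endsWithIgnoreCase below is a hypothetical name introduced only for this example and is not part of the JDK.

```java
// Hypothetical demo class; the helper name and sample strings are illustrative only.
class RegionMatchesUsage {

    // Assumed helper (not a JDK method): case-insensitive suffix test built on
    // the public regionMatches(boolean, int, String, int, int) overload.
    static boolean endsWithIgnoreCase(String text, String suffix) {
        int start = text.length() - suffix.length();
        // A negative start is rejected by the bounds check inside regionMatches,
        // so no explicit length guard is needed here.
        return text.regionMatches(true, start, suffix, 0, suffix.length());
    }

    public static void main(String[] args) {
        System.out.println("Hello".equalsIgnoreCase("hELLO"));         // true: differs only in case
        System.out.println("abc".equalsIgnoreCase("ABCD"));            // false: lengths differ
        System.out.println("abc".equalsIgnoreCase(null));              // false: null never matches
        System.out.println(endsWithIgnoreCase("Report.PDF", ".pdf"));  // true
        System.out.println(endsWithIgnoreCase("a", ".pdf"));           // false: offset would be negative
    }
}
```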

3. Why is a match after either toUpperCase or toLowerCase enough to treat the characters as equal?

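The comment in the code of section 2 gives the reason: the Georgian alphabet.
There are characters, such as the Georgian letters that comment refers to, that are not equal to another character when compared directly, yet may become equal to it (possibly to an ASCII character) once both are converted to uppercase or to lowercase.
[Figure: Georgian alphabet characters]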
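
One way to probe this claim is to search for characters that fail both the direct and the uppercase comparison against an ASCII letter but pass the final lowercase comparison. The sketch below does that for every char value. The class and method names are assumptions made for this example, and what it prints depends on the Unicode tables of the JDK it runs on, so no output is listed here.

```java
// Hypothetical demo class; names are illustrative only.
class LowerCaseFallbackScan {

    // True when c1 and c2 differ as-is, still differ after toUpperCase,
    // but become equal once the uppercased values are lowercased.
    static boolean needsLowerCaseStep(char c1, char c2) {
        if (c1 == c2) {
            return false;
        }
        char u1 = Character.toUpperCase(c1);
        char u2 = Character.toUpperCase(c2);
        if (u1 == u2) {
            return false;
        }
        return Character.toLowerCase(u1) == Character.toLowerCase(u2);
    }

    public static void main(String[] args) {
        String ascii = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
        // Use an int counter so the loop can reach Character.MAX_VALUE without wrapping.
        for (int i = 0; i <= Character.MAX_VALUE; i++) {
            char c = (char) i;
            for (int j = 0; j < ascii.length(); j++) {
                char a = ascii.charAt(j);
                if (needsLowerCaseStep(c, a)) {
                    System.out.printf("U+%04X matches '%c' only via toLowerCase%n", (int) c, a);
                }
            }
        }
    }
}
```

Any pair this scan reports is a pair for which equalsIgnoreCase would return false if the final toLowerCase check were removed.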

4. That being so, shouldn't such characters compare as not equal? Why return "equal" for them?
