LeetCode.028.Implement strStr()

题目描述

028 Implement strStr()
https://leetcode.com/problems/implement-strstr/description/

Return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

Example 1:

Input: haystack = "hello", needle = "ll"
Output: 2

Example 2:

Input: haystack = "aaaaa", needle = "bba"
Output: -1

解题过程

经典的子串匹配问题，如果用暴力搜索，时间复杂度为**O(m×n)，用KMP算法可把时间复杂度降为O(m+n)**，但是KMP算法很难理解，实现起来也不算简单。
先用暴力法写了一个，发现是可以通过的，不会超时，看来这道题的目的并不是让实现KMP算法，否则题目难度也不会是Easy了。

package _028_Implement_strStr;

class Solution {
    public int strStr(String haystack, String needle) {
        if(needle==null || needle.length()==0)
            return 0;
        char[] s = haystack.toCharArray();
        char[] p = needle.toCharArray();
        for(int i=0; i<=s.length-p.length;){
            for(int j=0; j<p.length;){
                if(s[i]==p[j]){
                    i++;
                    j++;
                }else{
                    i = i-j+1;
                    break;
                }
                if(j==p.length)
                    return i-j;
            }
        }
        return -1;
    }
}

public class Implement_strStr {
    public static void main(String[] args){
        Solution solution = new Solution();
        System.out.println(solution.strStr("", ""));
    }
}

但提交了好多次才AC，这道题的边界条件很难掌握，比如strStr(“”, “”)结果是0，strStr(“a”, “”)结果也是0，也就是空串是任意串的子串。strStr(“”, “a”)结果是-1，即如果主串为空，只要模式串不为空，结果肯定是-1

Java String类indexOf()方法的内部实现

我看最快的代码是直接调用java String类的indexOf()方法：

class Solution {
    public int strStr(String haystack, String needle) {
        return haystack.indexOf(needle);
    }
}

就查了下String类的indexOf()方法的内部实现，我以为肯定是KMP算法，没想到竟然是暴力搜索：

public int indexOf(String str, int fromIndex) {
    return indexOf(value, 0, value.length,
            str.value, 0, str.value.length, fromIndex);
}

内部搜索时使用char[]数组结构：

/**
* Code shared by String and StringBuffer to do searches. The
* source is the character array being searched, and the target
* is the string being searched for.
*
* @param  source      the characters being searched.
* @param  sourceOffset offset of the source string.
* @param  sourceCount  count of the source string.
* @param  target      the characters being searched for.
* @param  targetOffset offset of the target string.
* @param  targetCount  count of the target string.
* @param  fromIndex    the index to begin searching from.
*/
static int indexOf(char[] source, int sourceOffset, int sourceCount,
        char[] target, int targetOffset, int targetCount,
        int fromIndex) {
    if (fromIndex >= sourceCount) {
        return (targetCount == 0 ? sourceCount : -1);
    }
    if (fromIndex < 0) {
        fromIndex = 0;
    }
    if (targetCount == 0) {
        return fromIndex;
    }

    char first = target[targetOffset];
    int max = sourceOffset + (sourceCount - targetCount);

    for (int i = sourceOffset + fromIndex; i <= max; i++) {
        /* Look for first character. */
        if (source[i] != first) {
            while (++i <= max && source[i] != first);
        }

        /* Found first character, now look at the rest of v2 */
        if (i <= max) {
            int j = i + 1;
            int end = j + targetCount - 1;
            for (int k = targetOffset + 1; j < end && source[j]
                    == target[k]; j++, k++);

            if (j == end) {
                /* Found whole string. */
                return i - sourceOffset;
            }
        }
    }
    return -1;
}

先找源串中是否有和目标串（模式串）第一个字符相同的字符，从源的第一个字符开始，如果没找到，则判断源的后一个字符是否和目标第一个字符相同（相当于下标+1），以此循环，直到找到相同。判断的条件是下标偏移小于源的长度减去目标的长度，原因很好理解。
如果找到了和目标第一个字符相同的字符，判断目标字符后一个和源字符后一个是否相同，不同直接return false，相同则继续循环判断下一个字符是否相同，判断的条件是在目标字符长度范围内。

为什么String类的indexOf()方法没使用KMP算法

KMP对特殊的字符串比较好用，就是自身带有很多重复子串的那种，而在字符串不长的情况下，KMP比较耗时。

原来JDK的编写者们认为大多数情况下，字符串都不长，使用原始实现可能代价更低。
因为KMP和Boyer-Moore算法都需要预先计算处理来获得辅助数组，需要一定的时间和空间，这可能在短字符串查找中相比较原始实现耗费更大的代价。而且一般大字符串查找时，程序员们也会使用其它特定的数据结构，查找起来更简单。这有点类似于排除特定情况下的快速排序了。不同环境选择不同算法。

GitHub代码

leetcode-java / problems / _028_Implement_strStr /
https://github.com/masikkk/algorithms/tree/master/leetcode/leetcode/_028_Implement_strStr

当前位置 : 首页 » 文章分类 : » LeetCode.028.Implement strStr()