Learn

We now have all the tools in place to complete the full implementation of the Rabin-Karp algorithm! All that is left is to compute the hash of the pattern and compare it with substrings of the same length in the text. If they hash to the same value, it is safe to assume a match has been found. Here is what the pseudocode might look like:

  • Find the first substring in text equal in length to pattern.
  • Compare its polynomial hash with that of pattern.
  • If they are equal, increase occurrences by 1.
  • Iterate through remaining substrings in order
  • If the polynomial hash of any substring matches that of pattern, increase occurrences by 1.
  • Return occurrences.

Instructions

1.

Complete the function rabin_karp_algorithm() so that it returns the number of occurrences of pattern in text.

The polynomial hash of pattern and the polynomial hash of the first substring in text equal in length to pattern, i.e. polynomial_hash(text[: pattern_length]) Have been computed for you: substring_hash and pattern_hash. If their hash values match, increase the number of occurrences by 1.

Just like how you did in the last exercise, use a for loop to iterate through the remaining subsequent substrings in text equal in length to pattern.

Compute the polynomial rolling hash of each of these substrings inside the for loop, and compare them individually with the polynomial hash of pattern. Each time the hash values match, increase the number of occurrences by 1.

Click the hint for a guided walkthrough of the Rabin-Karp algorithm code.

2.

Great job with getting rabin_karp_algorithm() to work! Now we can come full circle and put it to the test against the same pattern and text that took naive_pattern_matching() a really long time to figure out back in Exercise 2.

To see if rabin_karp_algorithm() runs faster on the same input, create a string having 1000 repetitions of the character A and store it in a variable pattern. Then, create another string having 100000 repetitions of the character A and store it in a variable text.

Print rabin_karp_algorithm() by passing pattern and text as arguments to the function and notice how long it takes to count the number of occurrences of pattern in text. Did it run significantly faster?

3.

However, if you keep increasing the length of pattern and text, even rabin_karp_algorithm() will start to reach its limit. To see this for yourself, leave the value of pattern as it is but add another 0 to the number of characters A in text i.e. 1000000 to make it a million characters long.

Print rabin_karp_algorithm() by passing pattern and text as arguments to the function and notice how long it takes to count the number of occurrences of pattern in text!

That felt like an eternity and it seems like it’s the end of the road when it comes to dealing with gigantic strings. Quite the opposite, however!

Thanks to breakthroughs in computer science algorithms, there is a more advanced string-search algorithm called the Knuth-Morris-Pratt Algorithm that can outperform Rabin-Karp in counting occurrences of a single pattern in a large text. You will learn about it in a future lesson.

Sign up to start coding

Mini Info Outline Icon
By signing up for Codecademy, you agree to Codecademy's Terms of Service & Privacy Policy.

Or sign up using:

Already have an account?