Learn

In trying to compute the hash values of all substrings, we can still bypass some of the calculations. That’s right! We don’t have to perform multiplication after multiplication to compute the hash of every substring.

You might have noticed that after calculating the hash of “ABCD” by multiplying the prime numbers, we don’t necessarily have to calculate the hash of “BCDE” by yet again multiplying individual prime numbers. Instead, to recover the hash of “BCDE” from that of “ABCD”, we can do something as follows:

  • remove the hash of “A” from the hash of “ABCD” by dividing it out, which gives us the hash of “BCD”
  • multiply the hash of “E” into the hash of “BCD” which gives us the hash of “BCDE”!
prime_hash_ABCD = prime_hash('ABCD') # prime hash of 'ABCD' rolling_hash_BCD = prime_hash_ABCD // prime_hash('A') #divide out the prime hash of 'A' prime_hash_BCDE = prime_hash('E') * prime_hash_BCD #multiply in the prime hash of 'E'

Notice how we performed only a single division operation and a single multiplication operation to calculate the hash of 'BCDE' instead of performing three multiplication operations as follows:

prime_hash_BCDE = prime_hash('B') * prime_hash('C') * prime_hash('D') * prime_hash('E') #three multiplication operations

We saved some computation time in the process! It might not seem like much, but imagine searching for a much larger substring, and we would be saving computation time in orders of magnitude.

This is the idea of rolling hashing.

Let’s calculate the hash values of all substrings of length 4 in uppercase but this time using the idea of rolling hashing. We will now recover the hash value of each subsequent substring of length 4 from the previous one using the divide + multiply trick we showed above.

But to begin this domino effect, we still have to start with the hash value of the first substring of length 4 that was calculated using multiplication via prime_hash(). We already have access to this value via prime_hash('ABCD').

Instructions

1.

Calculate the prime hash of 'ABCD' using the prime_hash() function and store it in a variable prime_hash_ABCD.

Calculate the prime hash of 'A' and divide it out from prime_hash_ABCD. Similarly, calculate the prime hash of 'E' and multiply it into prime_hash_ABCD. As claimed above, this should give the rolling hash of 'BCDE'. Store it in a variable rolling_hash_BCDE.

Calculate the prime hash of 'BCDE' directly using prime_hash() and store it in a variable prime_hash_BCDE. Print both rolling_hash_BCDE and prime_hash_BCDE and see the results. Is this as expected?

2.

Create a function called prime_rolling_hash() that takes the following parameters:

  • the substring s
  • the prime hash p of s
  • the next character c

This function should calculate the prime rolling hash of s. Remember the steps to find the prime rolling hash of the next substring:

  • divide out prime_hash(s[0]) from p
  • multiply in prime_hash(c) to p
3.

Use the prime_rolling_hash() function to calculate the prime rolling hash of all substrings of length 4 in uppercase starting from 'BCDE'.

Start by calculating the prime hash of 'ABCD' as p = prime_hash('ABCD') and setting the next character c = 'E'. Pass 'ABCD', p, c as arguments to prime_rolling_hash() to get the prime rolling hash of 'BCDE' and update the value of p.

Then, pass 'BCDE', p, 'F' as arguments to prime_rolling_hash() to get the prime rolling hash of 'CDEF', and so on… To automate this process, use a for loop.

Finally, create a dictionary prime_rolling_hash_values and store the mapping of each substring of length 4 to its corresponding hash computed using the rolling hashing method in this dictionary. Also include the hash of 'ABCD'. Print out prime_rolling_hash_values to see the results for yourself!

Click the hint for guided steps of this code.

Take this course for free

Mini Info Outline Icon
By signing up for Codecademy, you agree to Codecademy's Terms of Service & Privacy Policy.

Or sign up using:

Already have an account?