Learn

You are probably wondering why we didn’t make use of any “rolling” property of the polynomial hash function and just calculated the hash values for all substrings by adding up the individual contributions of all the characters each time. And you are correct! As we saw in an earlier exercise, we can exploit this “rolling” property here as well to reduce the number of operations and speed up our computation time!

Instead of calculating the hash values of all substrings using the polynomial hash function definition every time, let’s try to recover the hash values of subsequent substrings just like we did with prime rolling hashing.

For example, we could recover the same polynomial hash of 'BCDE' using fewer computations from the polynomial hash of 'ABCD' as follows:

polynomial_hash_ABCD = polynomial_hash('ABCD') rolling_hash_BCDE = hash_ABCD - polynomial_hash('A') + polynomial_hash('E')

This seems plausible. After all, we used a similar divide + multiply trick for prime rolling hashing and are now using a subtract + add trick for polynomial rolling hashing. What could go wrong? Let’s find out!

Instructions

1.

Calculate the polynomial hash of 'ABCD' using the polynomial_hash() function and store it in a variable polynomial_hash_ABCD.

Calculate the polynomial hash of 'A' and remove it from hash_ABCD. Similarly, calculate the polynomial hash of 'E' and add it to hash_ABCD. As claimed above, this should give the rolling hash of 'BCDE'. Store it in a variable rolling_hash_BCDE.

Calculate the polynomial hash of 'BCDE' directly using polynomial_hash() and store it in a variable polynomial_hash_BCDE. Print both rolling_hash_BCDE and polynomial_hash_BCDE and see the results. Are you surprised?

2.

Indeed! The values do not match so something is not quite right!

This is because you are dealing with different powers of 26 for each character depending on the location we are at in the substring. There are three key subtleties that you need to understand to correctly recover the hash of the subsequent substrings:

  • Removing the hash of 'A' from the hash of 'ABCD' is not the same as removing the contribution of 'A' from the hash of 'ABCD'.
  • However, the hash of 'E' is the same as the contribution of 'E' to the hash of 'BCDE', so this part was correctly added.
  • What is not correctly added are the powers of 26 corresponding to characters 'B', 'C', 'D' before you added the hash of 'E', so you need to “shift” the corresponding terms to the left i.e. 'B' takes the place of 'A', 'C' takes the place of 'B', and so on …

Let’s make a quick detour and explore these ideas more concretely to get a better handle on “rolling” the hashes.

While calculating the polynomial hash value of 'ABCD', the computation was as follows:

ord('A') * 26**3 + ord('B') * 26**2 + ord('C') * 26**1 + ord('D') * 26**0

What you actually need to do to recover the hash of 'BCDE' is:

  • Remove ord('A') * 26^3 (NOT just ord(A) * 26**0)
  • Multiply the result by 26 to “shift” the result to the left i.e. obtain ord('B') * 26**3 + ord('C') * 26**2 + ord('D') * 26**2
  • Finally, add ord('E') * 26**0 which gives the same result as calculating the hash of 'BCDE' as polynomial_hash('BCDE')!

Re-calculate the rolling hash of 'BCDE' from the polynomial hash of 'ABCD' using this updated approach. Print the updated value of rolling_hash_BCDE and check to see if it matches the value of polynomial_hash_BCDE (print this out as well)!

3.

Complete the function polynomial_rolling_hash() that takes the following parameters:

  • the substring s
  • the polynomial hash H of s
  • the next character c

Remember the steps to find the polynomial rolling hash of the next substring:

  • remove ord(s[0]) * 26**(len(s) - 1) from H
  • multiply by 26 to shift the result to the left
  • add ord(c)
4.

Use the polynomial_rolling_hash() function to calculate the polynomial rolling hash of all substrings of length 4 in uppercase starting from 'BCDE'.

Start by calculating the polynomial hash of 'ABCD' as H = polynomial_hash('ABCD') and setting the next character c = 'E'. Pass 'ABCD', H, c as arguments to polynomial_rolling_hash() to get the polynomial rolling hash of 'BCDE' and update the value of H.

Then, pass 'BCDE', H, 'F' as arguments to polynomial_rolling_hash() to get the polynomial rolling hash of 'CDEF', and so on… To automate this process, use a for loop.

Finally, create a dictionary polynomial_rolling_hash_values and store the mapping of each substring of length 4 to its corresponding polynomial rolling hash in the dictionary. Also include the hash of 'ABCD'. Print out polynomial_rolling_hash_values to see the results for yourself!

Click the hint for guided steps.

Take this course for free

Mini Info Outline Icon
By signing up for Codecademy, you agree to Codecademy's Terms of Service & Privacy Policy.

Or sign up using:

Already have an account?