To get a handle on a more advanced string-search algorithm like Rabin-Karp, it is a good idea to be comfortable with naive pattern-matching on a given string. In this exercise, we will refresh our understanding of the straightforward brute-force approach so that we can really appreciate what motivates the improvements in speed-up of the Rabin-Karp algorithm!
To quickly recap, this is how naive pattern-matching on a given string works:
- Iterate over the entire string to find all substrings equal in length to the pattern that we are trying to match.
- Iterate over each substring, and check to see if this matches the pattern.
Instructions
Complete the naive_pattern_matching()
function using nested for
loops.
There are text_length - pattern_length + 1
substrings in text
equal in length to pattern
. To see if any of these substrings match with pattern
, start by creating a for
loop to iterate over character indexes in text
ranging from index 0
up to, but not including, index text_length - pattern_length + 1
.
Inside this for
loop, assume that a substring already matches pattern
by creating a boolean variable pattern_match
and assigning it a value of True
.
Leave an empty line after this where you will fill the inner for
loop in the next checkpoint.
In the next line after the empty line, use an if
condition to increase the value of occurrences
by 1
if pattern_match
is True
.
Inside the for
loop, create another for
loop in the empty line from earlier that iterates over the length of substrings equal in length to pattern
i.e. pattern_length
.
Inside this inner loop, if any character in pattern
does NOT equal the character at the corresponding index in the substring being iterated over, set pattern_match
to False
and break
out of the inner loop.
Great job with getting naive_pattern_matching()
to work! Unfortunately, the brute-force algorithm is only good for short-length patterns and texts with few repetitions. It quickly begins to struggle with longer patterns and texts involving many repetitions, which is fairly common in many computer science applications.
To see this for yourself, create a string having 1000
repetitions of the character A
and store it in a variable pattern
. Then, create another string having 100000
repetitions of the character A
and store it in a variable text
.
Print naive_pattern_matching()
by passing pattern
and text
as arguments to the function and notice how long it takes to count the number of occurrences of pattern
in text
!