We’ve mentioned data structures before. In this case, it’s about how we should build our index so that it is quick to search for a particular keyword and retrieve the location of that keyword (in the form of a URL). A new method is introduced to us here, split():
find = 'This is a sentence!'
print(find.split())  # will give us ['This', 'is', 'a', 'sentence!']
This method is useful for splitting up text on whitespace, but as you can see it’s not very good at separating words from punctuation. In the example above, ‘sentence!’ is treated as a single word. If we base our keyword search on the list returned by this method, a search for the keyword ‘sentence’ will find nothing, because the entry in the list still has the exclamation mark attached, even though the writer clearly meant the word ‘sentence’.
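One way around the punctuation problem is to strip punctuation from each piece after splitting. The sketch below is only an illustration, not the course's own solution: the helper names split_words and add_page_to_index, the use of a dictionary for the index, and the example URL are all assumptions made for this example.

```python
import string

def split_words(text):
    """Split on whitespace, then strip punctuation from both ends of each piece."""
    words = []
    for piece in text.split():
        word = piece.strip(string.punctuation)
        if word:  # skip pieces that were nothing but punctuation
            words.append(word)
    return words

def add_page_to_index(index, url, content):
    """Record url under every keyword in content (index maps keyword -> list of URLs)."""
    for word in split_words(content):
        urls = index.setdefault(word, [])
        if url not in urls:  # avoid listing the same page twice
            urls.append(url)

index = {}  # hypothetical index: keyword -> list of URLs
add_page_to_index(index, 'http://example.com/a', 'This is a sentence!')
print(split_words('This is a sentence!'))  # ['This', 'is', 'a', 'sentence']
print(index.get('sentence'))               # ['http://example.com/a']
```

Stripping only the ends of each piece keeps words like "it's" intact, but it will not separate words joined by punctuation with no space, such as "end.Start".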
Also, a new construct: triple quotes ("""). They allow you to enter strings that span multiple lines:
longText = """
This is a really
long string which is spread out
over a few lines
"""
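A triple-quoted string keeps its line breaks as newline characters, and split() treats those newlines as whitespace just like spaces. A small sketch reusing the example text (the variable name longText is taken from the example; the rest is illustrative):

```python
longText = """
This is a really
long string which is spread out
over a few lines
"""

# The line breaks are stored in the string as '\n' characters.
print(longText.count('\n'))   # 4

# split() with no argument splits on any whitespace, including newlines.
print(longText.split()[:4])   # ['This', 'is', 'a', 'really']
```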
The next few chapters are about the Internet. It’s really funny how we use the Internet every day, and yet we have so little idea how it came about and how it works. So pay attention to those valuable lessons and appreciate the really smart people who invented this technology.