If you work with strings in Python, you probably know the built-in replace() method. It takes two required arguments:
- The substring to be replaced.
- The replacement string.
This method works fine, but its capabilities are limited: it matches only literal text, and the match is always case sensitive. This is where the sub() function from the regular expression module comes to the rescue.
replace() method in Python
First let us see how to replace using the built-in replace() method.
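Here is a minimal sketch of such a call. The sample sentence and the variable names are assumptions chosen for illustration, not taken from the original article.

```python
# Sample sentence chosen for illustration (an assumption).
text = "Python is easy to learn"

# str.replace(old, new) returns a new string; the original is left unchanged.
starred = text.replace(" ", "*")

print(starred)  # Python*is*easy*to*learn
```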
In the above example, I have replaced all occurrences of whitespace with an asterisk (*). If you do not want to replace all instances, you can also pass a count as the third argument. At most that many occurrences are replaced, starting from the left.
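A call with a count might look like the following sketch. It uses an illustrative sample sentence (an assumption) and passes 1 as the third argument.

```python
# Sample sentence chosen for illustration (an assumption).
text = "Python is easy to learn"

# The optional third argument limits how many occurrences are replaced.
first_only = text.replace(" ", "*", 1)

print(first_only)  # Python*is easy to learn
```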
In the above example, I have passed one more value to the replace() method: the integer 1, indicating that only the first instance of that substring should be replaced.
Replacing using the sub() function from the re module
Python has a built-in regular expression module called re. This module provides various functions for finding and manipulating patterns in text. One of them, sub(), does the same job as the replace() method but is more flexible.
import re
re.sub(pattern, repl, string, count=0, flags=0)
What makes sub() better than replace()? The sub() function matches regular expression patterns and can match case-insensitively when you ask it to, whereas replace() only matches literal text and is always case sensitive. Note that sub() is also case sensitive by default; you have to pass a flag to change that.
Let us look at this with an example. Consider the following string.
This is HIBISCUS
The substring "is" appears in it three times: in "This", as the word "is", and in uppercase inside "HIBISCUS".
If we replace the 'is' in this string with the replace() method, it will miss the IS in HIBISCUS, since that one is in uppercase and the substring we are searching for is in lowercase. This clearly shows that the replace() method does not match text that differs in case.
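The behaviour described above can be checked with a short sketch. The replacement text "**" is an arbitrary choice for illustration.

```python
word = "This is HIBISCUS"

# replace() is case sensitive: only the lowercase "is" occurrences match.
result = word.replace("is", "**")

print(result)  # Th** ** HIBISCUS
```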
But this can be avoided with the sub() function from the re module. Just pass flags=re.IGNORECASE (or its short form, re.I) to sub() and we get case-insensitive pattern matching.
Let us try the above example with the sub() function from the re module.
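A sketch of that call, again using "**" as an arbitrary replacement chosen for illustration:

```python
import re

word = "This is HIBISCUS"

# flags=re.IGNORECASE makes the pattern match "is" and "IS" alike.
result = re.sub("is", "**", word, flags=re.IGNORECASE)

print(result)  # Th** ** HIB**CUS
```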
As you can see, this has replaced the IS in HIBISCUS as well. We can also give a count here.
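For instance, a count can be combined with a whitespace pattern as in the sketch below. The pattern `\s` and the reuse of the HIBISCUS string are assumptions made for illustration.

```python
import re

text = "This is HIBISCUS"

# count=1 stops after the first match; \s matches any whitespace character.
result = re.sub(r"\s", "*", text, count=1)

print(result)  # This*is HIBISCUS
```

Passing count and flags as keyword arguments avoids mixing up their positions.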
Since the count is 1, only the first occurrence of whitespace is replaced with *.
Hope this article is helpful.