In this article, I will show you how to use the Python regular expression module (the re module) to parse a string and return either the first matched string or all matched strings.
1. Steps To Parse A String With The Python Regular Expression Module.
- Import the re module.
# import the Python regular expression module.
import re
- Create a re.Pattern object by calling the re.compile function. Pass a pattern format string to re.compile as shown below.
# create the pattern format string: 3 digits, a hyphen, then 8 digits.
pattern_format_string = r'\d\d\d-\d\d\d\d\d\d\d\d'
# create the re.Pattern object with the re.compile function.
pattern_object = re.compile(pattern_format_string)
- To get the first matched string, call the re.Pattern object’s search method as shown below.
# call the re.Pattern object's search method and pass it the string to parse.
search_result = pattern_object.search(string)
# search returns None when nothing matches, so check before calling group.
if search_result is not None:
    # group() returns the first matched string.
    print('search result: ' + search_result.group())
- To get all matched strings, call the re.Pattern object’s findall method as shown below.
# call the re.Pattern object's findall method to get every match in a list.
find_all_result = pattern_object.findall(string)
print('find all result: ' + str(find_all_result))
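The steps above can be combined into one small helper. The sketch below is only an illustration: the names PHONE_PATTERN and first_and_all_matches and the sample text are hypothetical, and the pattern is written with quantifiers (\d{3}-\d{8}), which matches the same strings as the longer form used in the steps.

```python
import re

# Hypothetical name; \d{3}-\d{8} is equivalent to \d\d\d-\d\d\d\d\d\d\d\d.
PHONE_PATTERN = re.compile(r'\d{3}-\d{8}')


def first_and_all_matches(pattern, text):
    """Return (first match or None, list of all matches) for text."""
    search_result = pattern.search(text)
    # search returns None when nothing matches, so guard before calling group().
    first_match = search_result.group() if search_result is not None else None
    # findall returns an empty list when nothing matches.
    all_matches = pattern.findall(text)
    return first_match, all_matches


if __name__ == '__main__':
    # Illustrative sample text, not taken from the article.
    sample_text = 'office: 010-88888889; home: 012-89877987'
    print(first_and_all_matches(PHONE_PATTERN, sample_text))
    print(first_and_all_matches(PHONE_PATTERN, 'no phone number here'))
```

Returning the values instead of printing them makes the no-match case explicit: the first element is None and the list is empty.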
2. Python Regular Expression Examples.
This example has 2 functions: one calls re.Pattern’s search method, the other calls re.Pattern’s findall method. See the code comments for a detailed explanation.
'''
Created on Sep 22, 2020

@author: songzhao
'''

# import the Python regular expression module.
import re

''' This function calls the Pattern object's search method to get the first matched string. '''
def regexp_search_function(pattern, string):
    searched_result = pattern.search(string)
    print('search phone number result: ' + searched_result.group())

''' This function calls the Pattern object's findall method to get all the matched strings in a list. '''
def regexp_find_all_function(pattern, string):
    find_all_result = pattern.findall(string)
    print('find all phone number result: ' + str(find_all_result))

if __name__ == '__main__':
    # create a regexp pattern to match a phone number.
    phone_number_format = r'\d\d\d-\d\d\d\d\d\d\d\d'
    phone_number_pattern = re.compile(phone_number_format)

    # this string contains 3 phone numbers, but only the first 2 match the above pattern.
    phone_number_string = 'phone_number_1: 010-88888889;phone_number_2:012-89877987; phone_number_3: 0893-898998'

    regexp_search_function(phone_number_pattern, phone_number_string)
    regexp_find_all_function(phone_number_pattern, phone_number_string)

    # create a regexp pattern with groups; notice the parentheses in the pattern format string.
    phone_number_format_use_group = r'(\d\d\d)-(\d\d\d\d\d\d\d\d)'
    phone_number_pattern = re.compile(phone_number_format_use_group)

    # with groups, findall returns a list of tuples, one tuple of group values per match.
    regexp_find_all_function(phone_number_pattern, phone_number_string)
Below is the execution result of the above example.
search phone number result: 010-88888889
find all phone number result: ['010-88888889', '012-89877987']
find all phone number result: [('010', '88888889'), ('012', '89877987')]
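The last line of the result shows that when the pattern contains groups, findall returns only the group values and drops the full matched string. If you need both, one option is the finditer method of re.Pattern, which yields a match object for each match. The sketch below is a minimal illustration; the function name full_matches_with_groups is hypothetical, and the pattern uses quantifiers as a shorter equivalent of the grouped pattern in the example.

```python
import re

# Grouped pattern, equivalent to r'(\d\d\d)-(\d\d\d\d\d\d\d\d)'.
grouped_pattern = re.compile(r'(\d{3})-(\d{8})')


def full_matches_with_groups(pattern, text):
    """Return a list of (full match, tuple of groups) pairs, one per match."""
    # group() gives the whole matched string, groups() gives the captured parts.
    return [(match.group(), match.groups()) for match in pattern.finditer(text)]


if __name__ == '__main__':
    phone_number_string = 'phone_number_1: 010-88888889;phone_number_2:012-89877987; phone_number_3: 0893-898998'
    for full_match, groups in full_matches_with_groups(grouped_pattern, phone_number_string):
        print(full_match, groups)
```

As in the article's example, the third number does not match because it has only 6 digits after the hyphen.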