\w – Word character: any letter, digit, or the underscore character.
\W – Anything that is not a word character.
\s – Whitespace (spaces, tabs, newlines).
\S – Anything that is not whitespace.
\d – Digits from 0 to 9.
\D – Anything that is not a digit.
\b – Word boundary: the position between a word character and a non-word character, or the edge of the string.
\B – Any position that is not a word boundary.
\w{3} – Exactly 3 word characters.
\w{,3} – 0 to 3 word characters.
\w{3,} – 3 or more word characters. No upper limit.
\w{3,5} – 3 to 5 word characters in a row. Write the range without a space: Python reads {3, 5} as literal text, not as a quantifier.
\w? – 0 or 1 word characters.
\w* – 0 or more word characters.
\w+ – 1 or more word characters. Must occur at least once!
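To make the list above concrete, here is a minimal sketch that tries a few of the character classes and quantifiers on a made-up sentence. The sample string and variable names are illustrative assumptions, not part of the original notes.

```python
import re

# Hypothetical sample text, chosen only for illustration.
sample = "Room_42 opens at 9, not 10!"

# \w+ : runs of letters, digits, and underscores.
words = re.findall(r'\w+', sample)

# \d+ : runs of digits.
digits = re.findall(r'\d+', sample)

# \b\w{3}\b : whole words that are exactly 3 characters long.
three_letter = re.findall(r'\b\w{3}\b', sample)

# \b\w{3,5}\b : whole words that are 3 to 5 characters long.
three_to_five = re.findall(r'\b\w{3,5}\b', sample)

# A space inside the braces turns the quantifier into literal text,
# so this pattern finds nothing in ordinary words.
spaced = re.findall(r'\w{3, 5}', "abcde")

print(words)          # ['Room_42', 'opens', 'at', '9', 'not', '10']
print(digits)         # ['42', '9', '10']
print(three_letter)   # ['not']
print(three_to_five)  # ['opens', 'not']
print(spaced)         # []
```

Without the \b anchors, \w{3} would also match the first three characters of longer words such as "Room_42".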
import re

# Read the whole file into one string; the with block closes the file for us.
with open("names.txt", encoding="utf-8") as names_file:
    data = names_file.read()

# Search for phone numbers with the following form: (555) 555-5555
print(re.search(r'\(\d\d\d\) \d\d\d-\d\d\d\d', data))

# Search for phone numbers: (555) 555-5555, 555-555-5555, 555 555-5555
print(re.findall(r'\(?\d{3}\)?-?\s?\d{3}-\d{4}', data))

# Search for names in the form 'Last, First', such as 'Bashar, Ghadanfar'.
# The last name is optional (\w*); the first name is required (\w+).
print(re.findall(r'\w*, \w+', data))
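Since names.txt is not shown here, the following sketch runs the same three patterns against an inline string. The contents of sample_data are invented for illustration and are only an assumption about what the file might look like.

```python
import re

# Hypothetical stand-in for the contents of names.txt.
sample_data = (
    "Bashar, Ghadanfar\t(555) 555-5555\n"
    "Doe, Jane\t555-555-5554\n"
    "Smith, Alex\t555 555-5553\n"
)

# First phone number written exactly as (555) 555-5555.
first_phone = re.search(r'\(\d\d\d\) \d\d\d-\d\d\d\d', sample_data)

# All three phone formats: optional parentheses, optional hyphen or space.
phones = re.findall(r'\(?\d{3}\)?-?\s?\d{3}-\d{4}', sample_data)

# Names in the form 'Last, First'.
names = re.findall(r'\w*, \w+', sample_data)

print(first_phone.group())  # (555) 555-5555
print(phones)  # ['(555) 555-5555', '555-555-5554', '555 555-5553']
print(names)   # ['Bashar, Ghadanfar', 'Doe, Jane', 'Smith, Alex']
```

Note that re.search returns a match object (or None when nothing matches), so .group() is needed to get the matched text, while re.findall returns plain strings.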
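re.findall()
Finds all non-overlapping occurrences of the pattern in the text and returns them as a list of strings. If the pattern contains groups, the list holds the group contents instead of the full matches.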
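A small sketch of what "non-overlapping" means in practice, and of how a group in the pattern changes what the list contains. The input strings are made up for illustration.

```python
import re

# Matches are taken left to right and never share characters:
# 'aaaaa' yields two 'aa' matches, and the last 'a' is left over.
pairs = re.findall(r'aa', 'aaaaa')

text = "555-1234 and 666-9876"

# No groups in the pattern: each item is the full match.
full = re.findall(r'\d{3}-\d{4}', text)

# One group in the pattern: each item is only the group's text.
prefixes = re.findall(r'(\d{3})-\d{4}', text)

print(pairs)     # ['aa', 'aa']
print(full)      # ['555-1234', '666-9876']
print(prefixes)  # ['555', '666']
```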
Exercise
Write a function named first_number that takes a string as an argument. The function should use a regular expression to search for the first number in the string and return the match object.
Then write a function named numbers that takes two arguments: a count as an integer and a string. Return the result of an re.search for exactly count numbers in a row in the string.
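import re

def first_number(my_str):
    # Return the match object for the first digit in the string.
    return re.search(r'\d', my_str)

def numbers(integer, string):
    # Repeat \d `integer` times to build a pattern such as \d\d\d.
    result = re.search(r'\d' * integer, string)
    return result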
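As a possible variant, numbers could build the pattern with the {n} quantifier instead of repeating \d. The sketch below is one way to write it, not the official solution, and the test strings are invented. Both versions should behave the same.

```python
import re

def first_number(my_str):
    # Match object for the first digit, or None if there is no digit.
    return re.search(r'\d', my_str)

def numbers(count, string):
    # Build a pattern such as \d{3} from the requested count.
    pattern = r'\d{%d}' % count
    return re.search(pattern, string)

print(first_number("abc 42 def").group())   # 4
print(numbers(3, "call 5551234").group())   # 555
print(numbers(4, "only 123"))               # None
```

Because re.search stops at the first place the pattern fits, numbers(3, ...) also matches the first three digits of a longer run. Wrapping the pattern in \b anchors would be one way to require a run of exactly that length.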