re.finditer() returns an iterator of match objects for the pattern's matches in the string, while re.findall() returns a list of the matched strings (or a list of tuples when the pattern contains several capturing groups).
Refer to the snippets below to understand the difference between re.finditer() and re.findall().
Code Snippet 1 : Extracting domain name from text
re.finditer
import re
text = ''' Extract the domain from the urls www.gcptutorials.com,
www.wikipedia.org, www.google.com'''
# Dots are escaped so they match a literal '.' rather than any character
pattern = r'(www\.([A-Za-z_0-9-]+)(\.\w+))'
find_iter_result = re.finditer(pattern, text)
print(type(find_iter_result))
print(find_iter_result)
for i in find_iter_result:
    print(i.group(2))
Output
<class 'callable_iterator'>
<callable_iterator object at 0x7f0c5cc24e48>
gcptutorials
wikipedia
google
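Because re.finditer() returns an iterator, it can be consumed only once; a second loop over the same object yields nothing. The sketch below illustrates this with a shortened version of the text above and a simplified pattern (the variable names are illustrative), and wraps the call in list() for the case where the matches are needed more than once.

```python
import re

text = 'www.gcptutorials.com, www.wikipedia.org'
pattern = r'www\.([A-Za-z_0-9-]+)\.\w+'

matches = re.finditer(pattern, text)
first_pass = [m.group(1) for m in matches]
second_pass = [m.group(1) for m in matches]  # iterator is already exhausted

print(first_pass)   # ['gcptutorials', 'wikipedia']
print(second_pass)  # []

# Materialize the match objects when they must be reused.
stored = list(re.finditer(pattern, text))
print(len(stored))  # 2
```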
re.findall
import re
text = ''' Extract the domain from the urls www.gcptutorials.com,
www.wikipedia.org, www.google.com'''
# Dots are escaped so they match a literal '.' rather than any character
pattern = r'(www\.([A-Za-z_0-9-]+)(\.\w+))'
find_all_result = re.findall(pattern, text)
print(type(find_all_result))
print(find_all_result)
for i in find_all_result:
    print(i[1])
Output
<class 'list'>
[('www.gcptutorials.com', 'gcptutorials', '.com'), ('www.wikipedia.org', 'wikipedia', '.org'), ('www.google.com', 'google', '.com')]
gcptutorials
wikipedia
google
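The list of tuples above comes from the three capturing groups in the pattern. As a rough illustration, using a shortened sample string and simplified patterns that are not part of the original snippet, re.findall() returns whole matches when the pattern has no groups, the group's text when it has exactly one group, and tuples when it has several:

```python
import re

text = 'www.wikipedia.org, www.google.com'

# No capturing groups: each item is the whole match.
no_groups = re.findall(r'www\.[\w-]+\.\w+', text)
print(no_groups)    # ['www.wikipedia.org', 'www.google.com']

# One capturing group: each item is that group's text.
one_group = re.findall(r'www\.([\w-]+)\.\w+', text)
print(one_group)    # ['wikipedia', 'google']

# Several capturing groups: each item is a tuple of the groups.
many_groups = re.findall(r'www\.([\w-]+)(\.\w+)', text)
print(many_groups)  # [('wikipedia', '.org'), ('google', '.com')]
```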
Code Snippet 2 : Extracting emails from text
re.finditer
import re
sample_str = 'dummy email foo@test.com, one more dummy email bar@gmail.com testing'
emails = re.finditer(r'[\w\.-]+@[\w\.-]+', sample_str)
for email in emails:
    print(email)
    print(email.group(0))
Example Output
<re.Match object; span=(12, 24), match='foo@test.com'>
foo@test.com
<re.Match object; span=(47, 60), match='bar@gmail.com'>
bar@gmail.com
re.findall
import re
sample_str = 'dummy email foo@test.com, one more dummy email bar@gmail.com testing'
emails = re.findall(r'[\w\.-]+@[\w\.-]+', sample_str)
for email in emails:
    print(email)
Example Output
foo@test.com
bar@gmail.com
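One practical reason to prefer re.finditer() is that each match object carries its position in the string, as the span values in the finditer output show. A minimal sketch, assuming the goal is to record where each email sits in the string (the `positions` name is illustrative):

```python
import re

sample_str = 'dummy email foo@test.com, one more dummy email bar@gmail.com testing'

# Collect the matched text together with its start and end offsets.
positions = [(m.group(0), m.start(), m.end())
             for m in re.finditer(r'[\w.-]+@[\w.-]+', sample_str)]
print(positions)
# [('foo@test.com', 12, 24), ('bar@gmail.com', 47, 60)]

# The offsets slice the original string back to the same text.
for email, start, end in positions:
    print(sample_str[start:end] == email)  # True
```

re.findall() returns only the strings, so the offsets would have to be recovered separately.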