Extracting BTC addresses from emails
I was asked if I had a tip to automatically extract Bitcoin addresses from emails (cf. Retrieving and processing JSON data (BTC example)). I do.
My tool, re-search.py, comes with a regular expression to match Bitcoin addresses, and also with the Bitcoin address checksum validation algorithm.
Bitcoin addresses are base58check encoded integers with a checksum. The following regular expression will match a Bitcoin address:
\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b
Of course, regular expressions cannot perform checksum calculations, so this regular expression will also match strings that are not valid Bitcoin addresses (e.g. correct syntax, but invalid checksum).
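To see what the regular expression does on its own, here is a minimal Python sketch that applies it to some text and keeps each candidate once. The function name find_candidates and the sample text are mine, for illustration only; this is not code taken from re-search.py.

```python
import re

# Same pattern as above: a leading 1 or 3, followed by 25 to 34 base58
# characters (no 0, O, I or l), delimited by word boundaries.
BTC_REGEX = re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b")


def find_candidates(text):
    """Return candidate addresses, unique, in order of first appearance."""
    seen = []
    for match in BTC_REGEX.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


if __name__ == "__main__":
    sample = (
        "Send payment to 1AWKTr1vq3946tyuxG7Q1mLcJum4rjnmro today. "
        "Again: 1AWKTr1vq3946tyuxG7Q1mLcJum4rjnmro"
    )
    print(find_candidates(sample))
```

The matches are only candidates: the pattern checks syntax, not the checksum.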
My re-search.py tool contains a function to validate Bitcoin addresses (BTCValidate) by checking the checksum. It is used like this:
(?#extra=P:BTCValidate)\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b
(?# ... ) is a comment for regular expressions, and is thus ignored by regular expression engines, but re-search interprets this comment to take extra actions, like in this case, calling BTCValidate.
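For readers who want to see what such a checksum validation involves, here is a minimal Python sketch of the standard Base58Check check for addresses of this form: decode the base58 string to 25 bytes, then compare the last 4 bytes with the first 4 bytes of a double SHA-256 over the other 21. I have not copied this from re-search.py; the function names are my own, and BTCValidate may be implemented differently.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_decode(text):
    """Decode a base58 string to bytes (raises ValueError on a bad character)."""
    value = 0
    for char in text:
        value = value * 58 + BASE58_ALPHABET.index(char)
    # Each leading '1' in the text stands for one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    body = value.to_bytes((value.bit_length() + 7) // 8, "big")
    return b"\x00" * leading_zeros + body


def is_valid_btc_address(address):
    """Return True if the Base58Check checksum of the address is correct."""
    try:
        decoded = base58_decode(address)
    except ValueError:
        return False
    if len(decoded) != 25:  # 1 version byte + 20 hash bytes + 4 checksum bytes
        return False
    payload, checksum = decoded[:-4], decoded[-4:]
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return checksum == expected


if __name__ == "__main__":
    print(is_valid_btc_address("1AWKTr1vq3946tyuxG7Q1mLcJum4rjnmro"))
```

Running a check like this on every regular expression match filters out strings that only look like addresses.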
This is the command I use to extract Bitcoin addresses from emails:
re-search.py -n btc -c -u -e *.vir
Option -n with argument btc directs re-search.py to look up and use the regular expression with name btc from its library. That's the regular expression for Bitcoin addresses.
Option -c directs re-search.py to perform case-sensitive matches (Bitcoin addresses can contain an uppercase letter L but not a lowercase letter l).
Option -u directs re-search.py to produce a list of unique Bitcoin addresses, i.e. to remove duplicate entries.
And finally, option -e directs re-search.py to extract strings from the files it processes (*.vir files). That's because the extortion emails that I have come in various formats: MIME files, RTF files, MSG files (i.e. OLE files). OLE files are a binary format, and by default re-search.py reads text files. Option -e extracts ASCII and Unicode strings from binary files (and text files too) before processing.
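The idea behind extracting strings first can be sketched in a few lines of Python: look for runs of printable ASCII bytes, and for the same characters stored as UTF-16LE (each byte followed by a zero byte), which is how Unicode text typically appears inside OLE files. This illustrates the technique and is not how re-search.py implements option -e; the minimum run length of 4 is an assumption of mine.

```python
import re

# Runs of at least 4 printable ASCII bytes (the minimum length is an assumption).
ASCII_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# The same characters encoded as UTF-16LE: printable byte followed by 0x00.
UTF16_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")


def extract_strings(data):
    """Return the ASCII strings, then the UTF-16LE strings, found in binary data."""
    strings = [run.decode("ascii") for run in ASCII_RUN.findall(data)]
    strings += [run.decode("utf-16-le") for run in UTF16_RUN.findall(data)]
    return strings


if __name__ == "__main__":
    blob = b"\x00\x01pay 1Abc\xff\xfe" + "wallet".encode("utf-16-le") + b"\x00\x02"
    print(extract_strings(blob))
```

The extracted strings can then be searched with the regular expression, whatever container format the email was stored in.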
Didier Stevens
Senior handler
Microsoft MVP
blog.DidierStevens.com DidierStevensLabs.com
Video: Retrieving and processing JSON data (BTC example)
I produced a video showing step-by-step how to retrieve and process JSON data, like I used in my diary entry Retrieving and processing JSON data (BTC example).
curl -s "https://blockchain.info/multiaddr?active=1AWKTr1vq3946tyuxG7Q1mLcJum4rjnmro%7C1Dvd7Wb72JBTbAcfTrxSJCZZuf4tsT8V72" | jq -r ".addresses | .[] | [.address,.final_balance/100000000] | @csv"
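For those who prefer Python over jq, the processing step of this pipeline can be sketched as follows. The sketch works on JSON that has already been retrieved; the layout (an addresses list whose entries have address and final_balance fields) is inferred from the jq filter above, and the balances in the sample are made up for illustration.

```python
import csv
import io
import json

SATOSHI_PER_BTC = 100000000  # final_balance is divided by this, as in the jq filter


def balances_to_csv(document):
    """Turn the parsed JSON into CSV lines: address, balance in BTC."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for entry in document["addresses"]:
        writer.writerow([entry["address"], entry["final_balance"] / SATOSHI_PER_BTC])
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample data in the assumed layout, not real balances.
    sample = json.loads(
        '{"addresses": ['
        '{"address": "1AWKTr1vq3946tyuxG7Q1mLcJum4rjnmro", "final_balance": 150000000},'
        '{"address": "1Dvd7Wb72JBTbAcfTrxSJCZZuf4tsT8V72", "final_balance": 0}'
        "]}"
    )
    print(balances_to_csv(sample), end="")
```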