Your boss walks up to your coworker and tasks her with finding every single United States phone number lying within several million files.
Upon hearing this, you can’t help but feel relieved that you weren’t the one asked, fully confident that your coworker will spend the next year of her life tracking this information down.
So when you see her back to hacking on one of her pet projects later that day, you can’t help but stop and ask how the Where’s Waldo hunt for phone numbers is going.
“Oh, that?” she asks, surprised. “I finished getting that information hours ago.”
How’d she do it?
Simple: she was using Regular Expressions. Here are five tools that will help you on your way to becoming a RegEx Jedi Knight.
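To make the coworker's trick concrete, here is a minimal sketch in Python of what such a search might look like. The pattern and the `find_phone_numbers` name are my own assumptions for illustration; it only covers a few common ways of writing a US number (`555-123-4567`, `(555) 123-4567`, `555.123.4567`, with an optional leading `+1`), and a real job would need to be tuned to whatever formats actually show up in the files.

```python
import re

# Assumed formats only: 555-123-4567, (555) 123-4567, 555.123.4567,
# each optionally preceded by a +1 / 1 country code.
PHONE_RE = re.compile(
    r"""
    (?<!\d)                        # do not start in the middle of a longer number
    (?:\+?1[\s.-]?)?               # optional country code
    (?:\(\d{3}\)\s?|\d{3}[\s.-]?)  # area code, with or without parentheses
    \d{3}[\s.-]?                   # exchange
    \d{4}                          # line number
    (?!\d)                         # do not stop in the middle of a longer number
    """,
    re.VERBOSE,
)


def find_phone_numbers(text):
    """Return every substring of text that looks like a US phone number."""
    return [match.group(0) for match in PHONE_RE.finditer(text)]


if __name__ == "__main__":
    sample = "Call (555) 123-4567 or 555.987.6543, not 1234567890123."
    for number in find_phone_numbers(sample):
        print(number)
```

Run the same function over the contents of each file and "several million files" turns into a loop rather than a year of someone's life.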
Mastering Regular Expressions
Mastering Regular Expressions is probably the de facto standard book if you truly want to learn about Regular Expressions (RegEx to all youz hax0rs).
Four years ago I wrote a program that scraped data (legally, of course) from a county website. All the critical data was jumbled within hundreds of thousands of flat files, so it was insanely cumbersome. Knowing that I would have to write some pretty crazy regular expressions, I purchased this book and never looked back.
My copy is dog-eared, battered, and beat to a pulp, but it has been the single greatest asset in appearing to Sherlock datasets out of thin air.
Grep
Ah yes, here we have old faithful. The *gasp* command line tool we so lovingly know as Grep. Nearly every *NIX environment nowadays has this beast installed by default and I use it every day to bend measly files to my will.
Grep is the most handy for me personally when I take over the development of a web application. I will use Grep to search through directories recursively for a certain method or class that I need to identify so that I am better able to see how the application works.
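With Grep itself, that kind of recursive hunt is typically a one-liner along the lines of `grep -rn "pattern" some_directory`. For anyone curious what is going on underneath, here is a rough sketch of the same idea in Python. The `grep_recursive` name and its parameters are hypothetical, and this is nowhere near as fast or as featureful as the real thing.

```python
import re
from pathlib import Path


def grep_recursive(pattern, root, glob="*"):
    """Yield (path, line_number, line) for every matching line under root.

    Hypothetical helper for illustration; not a replacement for grep.
    """
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        try:
            # errors="replace" keeps binary-ish files from raising decode errors
            with path.open(encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, number, line.rstrip("\n")
        except OSError:
            continue  # unreadable file: skip it and keep going


if __name__ == "__main__":
    # Look for a class definition in all Python files under the current directory.
    for path, number, line in grep_recursive(r"class\s+\w+", ".", glob="*.py"):
        print(f"{path}:{number}:{line}")
```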
Combined with command line syntax highlighting, Grep is a tool that every code master should know about.
RegExr
RegExr is an Adobe Flex application for helping you write Regular Expressions. The great thing is that it visually gives you instant feedback and makes for one heck of a testing tool.
While not a very robust application, I choose to use it when I need to bust out some quick regex. It also allows you to save and tweet your creations into the wild. Head over and give it a shot; there are plenty of examples to help you get started.
RegexBuddy
RegexBuddy has many a time been a lifesaver and behaves similarly to RegExr. Though it has many more features, I like this tool primarily because it handles different programming languages and their various Regex nuances seamlessly.
I can take a piece of Regex code that I wrote for PHP using the preg_match() function, pull it into RegexBuddy and then make sure it will work effectively in a Python script that I may be hacking out.
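To give a feel for the kind of nuance involved, here is a small sketch of moving a PHP-style pattern into Python by hand. PHP's preg_match() takes the pattern wrapped in delimiters with modifier letters on the end (`/.../i`), while Python's `re` module takes the bare pattern and a separate flags argument. The `compile_php_style` helper is hypothetical, and I am assuming `/` delimiters and only the `i`, `m`, `s`, and `x` modifiers.

```python
import re

# The PCRE modifier letters this sketch understands, mapped to Python flags.
_FLAGS = {
    "i": re.IGNORECASE,
    "m": re.MULTILINE,
    "s": re.DOTALL,
    "x": re.VERBOSE,
}


def compile_php_style(delimited):
    """Compile a PHP-style '/pattern/flags' string as a Python regex.

    Hypothetical helper: handles only '/' delimiters and the i, m, s, x
    modifiers, and does not translate syntax that differs between engines.
    """
    if not delimited.startswith("/"):
        raise ValueError("expected a pattern delimited by '/'")
    body, closing, modifiers = delimited[1:].rpartition("/")
    if not closing:
        raise ValueError("missing closing '/' delimiter")
    flags = 0
    for letter in modifiers:
        if letter not in _FLAGS:
            raise ValueError(f"unsupported modifier: {letter!r}")
        flags |= _FLAGS[letter]
    return re.compile(body, flags)


if __name__ == "__main__":
    # PHP: preg_match('/^(\d{3})-(\d{4})$/i', $subject, $matches)
    pattern = compile_php_style(r"/^(\d{3})-(\d{4})$/i")
    # preg_match() searches anywhere in the subject, so use search(), not match().
    found = pattern.search("555-1234")
    print(found.groups() if found else None)
```

This only reshuffles delimiters and flags. It makes no attempt to check that the pattern itself means the same thing in both engines, which is exactly the tedious part I lean on RegexBuddy for.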
Also, the graphical interface is great for newbs, as you can click a UI button and watch as your regular expression is instantly modified.
PowerGrep
PowerGrep is the 800-pound gorilla in the room. If you are about to embark on, say, a Last Starfighter type of endeavor where your Regular Expression work is the difference between life and death, then I suggest you pony up the $160 and see what PowerGrep can do for you.
The documentation is rich, the application is intuitive, and you will walk out of meetings having your very own coworkers asking “How’d he do that?”
Feel free to add your own tools of choice in the comments below!
Ben Miller says
Here’s another must-have regular expressions tool, from the creator of my favorite regular expression comic.
jared.folkins says
Nice Ben!
Yeah I love the comic and I both the #LinuxCheatSheat and #RegularExpressionsShirt are totally awesome 🙂
jared.folkins says
*…I love both the #LinuxCheatSheat and #RegularExpressionsShirt are totally awesome
Whoops.