Using regex comparision in bash and BASH_REMATCH

Bash supports regular expressions in comparisons via the =~ operator.  But what is rarely used or documented is that you can use the ${BASH_REMATCH[n]}  array to access successful matches (back-references to capture groups). So if you use parentheses for grouping ()  in your regex, you can access the content of that group.

Here is an example where I am parsing date placeholders in a text with an optional offset (e.g. |YYYY.MM.DD|+2 ). Storing the format and offset in separate groups:




Multiply floats by 10,100, … in bash

A short one today. Bash can only handle integer numbers and not floats, so when someone searches the internet on how to use math on floats in bash the solution they find is usually “use bc” and looks something like this:

Or if they want the result to be an integer:

It’s a fine solution, and readable (which can mean a lot for people maintaining scripts). But if all you want to do is multiply by 10,100,1000, … you can achieve this faster with a bit of string manipulation:

It just splits the number into two strings, and assembles it again with the decimal shifted. Have a look at substring_removal and substring_expansion for more examples on how to modify strings in bash. I’d highly suggest either sticking this in a separate function, or commenting the code since it isn’t necessarily obvious what is going on

Since it is all pure bash and doesn’t need to spawn external commands, it quicker (not that bc  is slow, but if you are doing a lot of calculations, it can add up). I know what you are thinking “if your goal is speed, you shouldn’t be using bash”, that doesn’t mean we can’t write efficient code.

How to fetch IP ranges/entries from SPF records in bash

Recently I needed to fetch IP ranges from SPF records. After looking at different python/ruby/perl modules I came to the conclusion that a fancy module (sometimes with wonky dependencies) was overkill just to parse a simple SPF record. So I threw together a simple bash script that is mainly just fetching the SPF record with dig and grep:

It iterates through the options (it currently recognizes a, mx, ip4, ip6, include, and redirect), and then sorts the output by ipv4, then ipv6.

Download URL:

How to compare package version strings in bash

This is a little function I use to compare package version strings. Sometimes they can get complex with multiple different delimiters or strings in them. I cheated a bit by using sort –version-sort for the actual comparison. If you are looking for a pure bash version to compare simpler strings (e.g. compare 1.2.4 with 1.10.2), I’d suggest this stackoverflow posting.

The function takes three parameters (the version strings and the comparison you want to apply) and uses the return code to signal if the result was valid or not. This gives the function a somewhat natural feel, for example compare_version 3.2.0-113.155 “<” 3.2.0-130.145 would return true. Aside from < and > you can also use a few words like bigger/smaller, older/newer or higher/lower for comparing the strings.

List of return codes and meanings:



Convert configuration files to ansible templates

I’ve been playing around with ansible a lot lately, and I noticed that while changing stuff from “installed and configured manually” to “installed and configured by ansible” I was running into quite a few configuration files that needed to be manually turned into templates. It can be quite tedious to replace values in a configuration file with placeholders and put all those placeholders in a .yml file with default values.
Automating this is something I would have typically done in perl, but since I wanted to learn more about using regex in bash I decided to have a go at it in bash using regex and ${BASH_REMATCH}

The script takes a configuration file and spits out an ansible template, as well as the variable definitions you will need to add to your defaults/main.yml or vars/main.yml

The whole script is a bit to long to post here, but the interesting part is:

(You can download the full script here

You can use regular expressions in a [[ ]] with =~ (e.g. if [[ “boot” =~ ^b ]]), and you can access the result of the regular expression by using ( ) to mark what parts of the result to store and access them via $BASH_REMATCH (comparable to how you would do it for other languages). Here I am parsing out anything that looks like a key=value from the configfile (with multiple possible separators) and storing the results in BASH_REMATCH[1] and BASH_REMATCH[2]

Usage of the script is pretty straightforward. you give it a prefix for the variable names (so you don’t end up with multiple roles all using a common variable name like “port”), and either a local or remote file to work with, and it spits out something like this:

There a tons of different configuration file formats out there so this script won’t work perfectly 100% of the time, but it does do quite well and reduces the manually copy&pasting to a minimum.

How to install the latest Nmap for Debian/Ubuntu

A quick & dirty script to download the latest version of nmap (sourcecode) and generate a deb and install it (so that it’s correctly in the package management). Yes, I know this is not much more than a glorified configure && make && checkinstall

Bash snippet, verify ctrl+c

Lately I’ve been working on a pair of more elaborate scripts using ncat and openssl to transfer data between hosts. I’ll get around to posting it eventually, but until then a few small snippets that people may find useful.

Today we will catch ctrl+c and ask the user if he really want’s to terminate the script.

The initialize() and cleanup() are my usual function names I have in every script, making sure general settings and variables are defined and that on exit any tempfiles get deleted.
What has been added was a trap for the INT signal (ctrl+c) which calls the verify_quit() function, giving the user 10 seconds to press ctrl+c again to exit (via cleanup()) or return back to wherever we were in the code. There is one unavoidable caveat, the first ctrl+c will kill whatever the script was doing before it jumps into the verify_quit() function.

Simple “try” function for bash

Made a nice little try() function today for simplifying checking/dealing with  return codes from commands. It uses the function text() I posted earlier to colorfy output: How to easily add colored text output in bash scripts. The function accepts 2 parameters, how it should behave if a problem occurs and the command to be executed: try <silent|warn|fatal> command

silent: save the return status in the global variable command_status
warn: if the command has a return code > 0, print a warning and save the return status in the global variable command_status
fatal: if the command has a return code > 0, print an error and exit

Obviously not as versatile as a python try/except, bu streamlines verifying the command return codes.
Example Usage:

Warning: ‘false‘ failed with return code –1
ls: cannot access doesnotexist: No such file or directory
Error: ‘ls -al doesnotexist‘ failed with return code –2


How to break down CIDR subnets in Bash

I was playing around with subnets in bash recently and needed an elegant/easy way to split up a subnet into smaller subnets. First I used 2 functions I found on to convert an IP addresse to and from an integer. After that it was just a bit of math in bash to split up any networks too big.
Any network larger than $maxSubnet gets split up.
Here the useful code:

Output of script:


How to get the intersecting area of two polygons in MySQL

I was playing around with spatial features of MySQL this weekend and stumbled into a problem where I was looking for the area of two rectangles that overlap.  MySQL provides a function to check if they overlap, but no function to extract the region that overlaps.

I’ve never written a stored routine in MySQL before, so I decided it would be a good exercise to try making one. As you can see the function is pretty straightforward and it assumes you are working with rectangles, but other than that it does what it is supposed to.
You pass the function 2 polygons (e.g. Intersection(a.poly,b.poly)), and it returns the intersecting area as a new polygon.

Example comparing some rectangles in 2 tables using the function:



How to check if a IP (ipv4) address is valid in pure Bash

Here is a small bash function to check if a IP is valid (4 octets, each octet < 256). I find it somewhat elegant since instead of using a lot of case/if/then constructs or a crazy long regex it splits the IP into each octet (and stores them in an array, and then uses a combination of regex and bit shifting to check each octet.

The function will return 0 if the IP is valid, and 1 or higher if it encountered an error (you can check with the $? variable directly after calling the function)

How to add locking to a shell script (the easy way)

I haven’t posted anything with bash here for a while, so today I’ll throw in a little snippet to use flock to make sure a script is only running once.  This is very handy in cron jobs that you want to run often, but there shouldn’t be multiple instances of the script running at the same time.
Since it is small and easy I’d recommend adding it to any code you don’t want running multiple times since “that script” you just wrote, that runs 10 minutes now, might turn into a monster in 6 months and run 45 minutes when things change (data grows, more stuff to do).  Better safe than sorry.

Basically we rely on flock to do the heavy lifting and we just add some logic around it:

man flock will show you more details to the parameters used and even some examples. Basically it will use trap to make sure the lock is released if the script ends in any way. 200 is a random file descriptor I chose for this example, it can be anything numeric. flock -xn means it will attempt to acquire an exclusive lock, and if that fails it will exit with an error.

Putting this somewhere at the top of your script will simply end the script if it finds an existing lock. flock has a few other options like -wait or nor using -n that allow you to not exit but wait for the lock to end (with wait a variable amount of seconds). And thus with a bit of creativity enabling you to only lock specific parts of the code (e.g. database calls, file changes, …) and handling failed lock attempts more gracefully than an exit.

How to increase Fraps performance with a ramdisk

I recently started playing Battlefield 3 and remembered that I have a Fraps license so I installed it and started recording some stuff. Unsurprisingly the performance made a big dip when I recorded. A glance at my PC told me the harddrive was at fault, probably bringing the whole system down due to IO.

Since my PC has more than enough RAM I decided to set up a 5Gb Ramdisk to see if that helped. It did, when writing the video files to the ramdisk I hardly had any performance hit. Unfortunately 5GB isn’t going to last long while recording 1920×1080 @ 40FPS (a few minutes footage at most).

Here is my little cmd file to create a 5GB ramdisk as drive J: and format it for usage:

So my next thought was to see if I could write a script to move files off the ramdisk when they were done being written to by Fraps. This obviously was going to cause IO load … the reason we were having performance issues in the first place, so I was skeptical about if this was going to help any. Especially since I also had to move the files away quick enough so that the drive wouldn’t fill up completely with the next file Fraps was writing. I wrote a little powershell script for this (yeah, a *nix Sysadmin writing scripts in powershell …)

Here is my little powershell script to copy the finished files from my ramdisk to a normal HDD (please excuse  possible ugliness, I’m a powershell noob):

The last little problem I noticed is that the 5GB ramdrive wasn’t big enough (Fraps seems to create some dummy files and fills them up). Forcing Fraps to make smaller files by toggeling the recording fixed that though -> pressing F9 twice fast will drop a few frames though. I used my Logitech G13 for that, just had a key mapped to press F9 quickly every 60 seconds. The shortest gap I could get working reliably is 50ms.

This all probably sounds awfully complicated, but it works and solves my problem. Fraps is great software, but it would be immensly helpful if you could set the file size in the settings (instead of it defaulting to 4GB). Or, even better, if Fraps could rework their IO system to work more efficiently.

So to sum everything up:
– create ramdrive
– start script that copies files from the ramdrive to a normal HDD
– set fraps to store videos on the ramdrive
– start game, press F9 to start recording and then press the G13 key to toggle the F9 periodically

How to build an efficient GeoIP SQL table

This here is a very handy little script I threw together to generate a geoip.sql table for quickly determining which country a IP is from. I already hear you saying “Just convert the IP to an INT and use BETWEEN, how hard can it be”. And you are right, that works. And it may even be your easiest solution, but it just isn’t fast. And if you are planning on hammering the table with thousands of queries you are going to end up looking for something fast.

A while back I found a very interesting posting at that described how to use Spacial Indexes together with MySQL’s GIS to speed up the queries. The posting has been online for a while and both it and the replies are worth reading.

All I did was make a small bash script to download the current “lite” version of GeoIP CSV file from, use the information from the posting to throw/transform it into a local database table and dump out a .sql file that can be easily imported into any other database. The script isn’t failproof though, it expects your user to be able to use mysql and have permission to create databases/tables and “load data local infile”.

How to easily add colored text output in bash scripts

Here is small snippet that can give your shell scripts some nice output: As with the script, just download it to the same directory as your own script and add it with

It contains one simple function called text with the syntax text “text to be output”. Color can be red, green, yellow, blue or grey. The function does not automatically add a linebreak to the putput, so pop a \n in there if you need it. I prefer using it together with printf for clean and easy color output.

Here are some examples of how the function can be used, and below the corresponding output:


normal text
blue text, yellow text
Status of script: [ERROR]
Status of script: [OK]

How to add debugging to shellscripts

Debugging bash scripts is pretty straightforward, throwing around a couple echo and set -x quickly gives you what you need. But what if you want to add a nice breakpoint,  debugging to lots of paces in the code or turn all debugging on or off at once? Then this little script I wrote is the right thing for you: just download it to the same directory as your script and include it with the following line:

It contains 4 simple functions that will make your bash coding easier.
debug and breakpoint both print the argument with a timestamp to STDERR
You can turn off all the functions by adding a DEBUG=false into your code




Talk about weird words … ok, according to Wikipedia disemvoweling is the term for replacing or removing vowels from words. Commonly used as a tool for moderating.  I’m pretty sure everyone has run across  certain disemvoweled  words on the internet like f*ck or sh*t. Anyway I went and made a pure html/javascript page that does just that, removes any vowels from an inputted text. The usefullness can certainly be argued, it was more for me to brush up on my javascript and css skills.

captcha cracking

This is a pretty old posting from 2009 I just recently discovered in my “drafts” directory. Nowadays there are probably easier and more elegant ways of defeating a captcha, but for old times sake, here is my simple approach.

Eclectic and Marko were so kind as to “provide” me a captcha to play around with. Took me a few days of poking around and googling but in the end it was easier than I had thought. As long as there aren’t and logic errors in the code (e.g. bad or no session handling) you probably won’t get around some kind of OCR. As OCR software I decided to use gocr because it is free, runs under linux, and it is fairly easy to train to specific needs. Because I knew which libraries were being used to create the captcha images, it was possible for me to build a testing area. This just speeds things up a bit, the process would have worked just as well off the original website. First off: the spambot in action ->, and the website it accesses:

Now I’ll describe the steps I took to defeat the captcha. Look at what happens on failed and successful inputs, first write a script that works if you enter the solution manually. I used the following 2 php functions for getting and posting stuff (and keeping the session intact)

Now train a gocr database for the images. Obviously it get’s better the more you train it.
Since curl is taking care of  session handling, we can use the get_url() function for downloading the captcha image. I pipe it through this shell command to make it easier for gocr to read:

It turnes this:

into this:

Since the valid captcha result is always the same length, we can check if gocr matched all the chars. If it looks good we can use post_url() to continue our session and throw all the fields at the form and submit it. See, wasn’t that hard. Most of the time is spent training gocr and converting the image into something easier to read. It doesn’t solve 100% of the images, more like 80-90%, but still better than nothing ;-).

Wireshark remote capturing

yeah, this is real simple stuff, not really worth writing a script for it. but on the other hand it saves me from remembering how to do it every time I need it (which isn’t often). So here is a little script to setup remote capturing with wireshark.
All it basically does is ssh to the remote host and tcpdump sucking the output via stdout through the ssh connection to a local pipe, that is then used by wireshark to display the stream. Because of this you may want to make sure you aren’t capturing your own ssh data when doing this 😉

MySQL selecting IPs via CIDR

Quick little snippet here for selecting IPs from a database based off a CIDR subnet. First off a table structure with some test data:

Now let’s say we want all IPs from the subnet, using a simple 173.192.175.% would provide false results since you don’t want the whole /24.

If your IP is stored as an unsigned int (good for you) than you can use this snippet to search for matching IPs:

If your IP is stored as a varchar (for whatever reason), the only difference is a inet_aton() around the IP field.

No matter which one you use, the result will be: