Posted in Internet Stuff, Programming

captcha cracking

This is a pretty old posting from 2009 I just recently discovered in my “drafts” directory. Nowadays there are probably easier and more elegant ways of defeating a captcha, but for old times sake, here is my simple approach.

Eclectic and Marko were so kind as to “provide” me a captcha to play around with. Took me a few days of poking around and googling but in the end it was easier than I had thought. As long as there aren’t and logic errors in the code (e.g. bad or no session handling) you probably won’t get around some kind of OCR. As OCR software I decided to use gocr because it is free, runs under linux, and it is fairly easy to train to specific needs. Because I knew which libraries were being used to create the captcha images, it was possible for me to build a testing area. This just speeds things up a bit, the process would have worked just as well off the original website. First off: the spambot in action ->, and the website it accesses:

Now I’ll describe the steps I took to defeat the captcha. Look at what happens on failed and successful inputs, first write a script that works if you enter the solution manually. I used the following 2 php functions for getting and posting stuff (and keeping the session intact)

Now train a gocr database for the images. Obviously it get’s better the more you train it.
Since curl is taking care of  session handling, we can use the get_url() function for downloading the captcha image. I pipe it through this shell command to make it easier for gocr to read:

It turnes this:

into this:

Since the valid captcha result is always the same length, we can check if gocr matched all the chars. If it looks good we can use post_url() to continue our session and throw all the fields at the form and submit it. See, wasn’t that hard. Most of the time is spent training gocr and converting the image into something easier to read. It doesn’t solve 100% of the images, more like 80-90%, but still better than nothing ;-).

Posted in Internet Stuff, Tech

Wireless bridge & dd-wrt

I recently bought the WL-330gE_M from Asus. It is a pair of access points pre-configured to bridge 2 LAN networks via wireless, all you have to do is take them out of the box and plug them in, straightforward and simple, no configuration needed. They are intended to enable hooking up devices to the internet that don’t have wireless and without pulling cables through the house (e.g. dvd player, TV, cable box, …).

The package arrived last week and it was a matter of minutes plugging the devices in and having everything working.  Everything worked without any setup, took me longer to get them out of the box than to hook them up.


Unfortunately our network storage (NAS) is also on the other end of this wireless bridge, and I noticed that when I move large files around (>2GB) or while streaming video/audio off the NAS the connection was dropping out. I don’t mean “ups and downs in the speeed” that is to be expected over wireless, I mean “connections resetting, copy actions aborting with error messages”. Not fun. Unfortunately since the devices are geared toward the “no configuration necessary, just unpack and hook up” crowd, there is no webinterface to see a syslog of what is happening or changing settings. Nada.

After this happening a few times it got really frustrating. I can live with slow, but connections dropping is out of the question. My original plan was to just reset the devices, flash them with a WL-330gE firmware and reconfigure the bridging (the only difference I could find was that the WL-330gE_M is black and not white, and comes preconfigured, and probably has a slightly different firmware without management capabilities).  While I was looking at different options and possibilities I went over to dd-wrt and happily saw that the WL-330gE was supported in the router database. So I decided if I was going to mess around with firmware, I could just as well throw dd-wrt on it.

Even though I am a system administrator, I don’t have the urge to have every device in the house running on Linux with a shell I can ssh in to. I’m perfectly fine with a simple interface that does what I want it to. But the wireless settings I can fine tune in dd-wrt are priceless (especially since I wanted to debug and fix the connection dropouts), normally you only get these options with cisco grade hardware.

The firmware upgrade process of the devices is simple and straightforward. Pull and reapply power with the reset button pressed until the power LED starts flashing, then shove the new firmware onto the device via tftp. Either with the “Firmware Restoration” tool from asus, or with a normal tftp client. I used later. Since this is so straightforward I guess I could also switch over to the official firmware if I wanted to, making two WL-330gE out of the WL-330gE_M pair (saves money since the pair is cheaper that buying two separately).

When in recovery mode (waiting for someone to tftp a new firmware onto it), the device has the IP by default. This is just a rough summary of the steps, anyone wanting to do this should really read through the whole process of deploying dd-wrt with asus, there is important information there (even if the example is a WL500, the WL330 is similar). Just because it worked for my hardware,firmware,setup doesn’t mean you have the same hardware or are deploying the same version I did. Read the dd-wrt documentation before you brick your device.

Clear current settings from the nvram:

Wait 5 min, reboot into recovery, throw a dd-wrt firmware on the device ( I used DD-WRT v24-sp2 (08/12/10) mini – build 14929, standard works fine too).

Wait 5 mins, reboot and open To be on the safe side feel free to navigate to Administration -> Factory Defaults to make sure no junk was left behind.  To get bridging configured there are multiple possibilites depending on your needs. For plain LAN bridging you will probably want WDS or one device setup as a AP and the second as a Client Bridge (I used the latter option). One thing you will want to do is go to Setup -> Networking and set the WAN port to “disabled” since the device only has LAN and Wireless.

The rest is fairly ease, set up one device as an AP, chose WPA2 with a good long strong PSK. After testing if the AP works with e.g. a laptop, you can set up the 2nd device as a Client Bridge, just make sure you are on the same channel, same SSID, same security settings.  After everything is up and running now would be a good time to pull backups from the configuration. Might as well tweak around in the wireless advanced settings. If you mess up anything badly enough that it won’t connect again … well that is why you made the configuration backups 😉

As you probably guessed by now, the connection drops are gone, connection is smooth and stable. Peak speed is not quite as fast as before because I throttled some things and tweaked settings for stability, but still good. Turning the TX antenna output power from 71 down to 65 helped a lot and got the maximum out of the connection (probably less crap pulling my SNR down). And now I can see what the access point is doing and where problems are when they arise 😉

Posted in Internet Stuff

WordPress & code formatting

I’ve been using the WordPress plugin “developer formatter” for years and it worked pretty good … for a while. Unfortunately it stopped being developed sometime in 2008, which was OK since it did everything I wanted and worked fine. Unfortunately months later I noticed that the plugin broke the visual editor for new posting in my SVN version of WordPress, and unsusprisingly when the WordPress changes wenn to a live version it broke my editor there. But I liked the plugin so much, that I just started using the html editor to make postings here (and have been doing so for over a year). It works, but it isn’t the easies way to write up postings.

I finally bit in the sour apple and searched for alternatives that work without breaking a current WordPress. Turns out there are a few, and none really do exactly I want 😉
Right now I’ve narrowed down the selection to either My Syntax or WP-Syntax I’m going to play around with both, and as soon as I’ve decided which one fits my needs better, I’ll start fixing all the code tags in the blog (ugh). SO bear with me the next few days without code formating.

Posted in Internet Stuff, Programming, Server

Wireshark remote capturing

yeah, this is real simple stuff, not really worth writing a script for it. but on the other hand it saves me from remembering how to do it every time I need it (which isn’t often). So here is a little script to setup remote capturing with wireshark.
All it basically does is ssh to the remote host and tcpdump sucking the output via stdout through the ssh connection to a local pipe, that is then used by wireshark to display the stream. Because of this you may want to make sure you aren’t capturing your own ssh data when doing this 😉

Posted in Internet Stuff, Programming

MySQL selecting IPs via CIDR

Quick little snippet here for selecting IPs from a database based off a CIDR subnet. First off a table structure with some test data:

Now let’s say we want all IPs from the subnet, using a simple 173.192.175.% would provide false results since you don’t want the whole /24.

If your IP is stored as an unsigned int (good for you) than you can use this snippet to search for matching IPs:

If your IP is stored as a varchar (for whatever reason), the only difference is a inet_aton() around the IP field.

No matter which one you use, the result will be:

Posted in Me

Moving to the USA

I haven’t posted here for a while. One of the reasons is that I started posting more day-to-day stuff on facebook, but the main reason was that the last few months my wife and I have been busy organizing stuff. She got a job offer over here, so December and January we were busy getting everything organized to move from Germany to USA. In December we flew over for a week to get an apartment rented, buy a car, set up a bank account, insurance, … all the fun stuff you go through when moving to a new place. Let it be said, that christmas time is not the advised time to organize such stuff :-p
Steffi flew in on the 15th, and I came over a week later. And now our new home is in (the cold) Twin Cities 🙂 Our internet conenction isn’t going to be installed till Feb 1st (along with phone and TV), so I don’t have much to do except for cleaning the appartment, assembling furniture (IKEA is just around the corner) and shopping … and of course keeping an eye out for interresting jobs here. I’ll keep you all posted on how things are going here, If the sun ever comes out I’ll go out to shoot some photos.

Posted in Internet Stuff, Photography

Windows reinstall and Adobe fun

I never got around to posting it, but a few weeks back the hard drive of my PC with Windows on it died … a little bit. Technically a large chunk of the harddrive is simply unaccessable. after poking and pushing I at least got windows to boot up again, but a large part of the software was dead. I bough a new drive and went through the fun process of installing a fresh windows, patching it, and then installing all the software again.

I didn’t get around to installing my video and picture software on the new windows until this morning, and it turned out to be lots of fun. Due to pure luck I found the license key for sony vegas (it is shown in the splash screen when starting up, shortly before it crashes due to my harddisk malfunction). Any Photomatix was where I keep most licenses stored. But my Photoshop license was more of a challenge. Adobe only allows 2 activated copies of the software per license, activated copies are bound to hardware … you probably see where this is going. I couldn’t deactivate the old installation since the harddrive was kinda dead, and the new installation says “different hardware (new harddrives), must be a different computer”. Yay, fun. The bright side was that the support was easily contacted and they could reset the activation counter (after lecturing me about using it on “2 computers” and deactivating, bla bla bla). I learned one thing: the more expensive the software, the more problems you have with licenses. A shame I never liked Gimp for photo editing.

Posted in Programming, Server

Controlling SSH identities

SSH has a few strange undocumented “features”. One of which is the way it handles identities via agent and command line. It is possible to specify an identity file to use for ssh via the -i parameter (ssh -i identity_file $host). What the manpage doesn’t mention, is that the specified identity isn’t forced for the connection, it is just added to the list of possible identities.
To make matters worse, ssh tries the identities from the agent first. So if agent forwarding is enabled and valid for the destination the ssh command will never use the identity specified with -i. Why is this “bad”? Because the identity specified may be used for specific tasks with commands linked to them on the destination (e.g. automatic restarts, backups, …)

Sooo, as a solution I whipped up the following function as a workaround in my scripts, I add a function called “xssh”:

I know it looks ugly, if it finds a key in the agent it makes a ssh connection to the current host with agent forwarding deactivated and then executes the ssh parameters passed. If no key is found in the ssh agent it does everything as normal.

Posted in Me

a bit of Baking

Wife did some baking tonight, so I whipped out the camera and took some shots. Unfortunately I noticed that the flash hot shoe seems to have gone unresponsive after a fall a few weeks ago, but using the built-in flash as a commander to control an off-camera flash still works fine.


Posted in Internet Stuff, Server

XEN 3.4 with ipv6 routing

Yes, there are a few postings out there about getting ipv6 routing running with XEN. But I’ll throw this online anyway since there are a few changes I had to make for it to work on my server. This text is intended for people who know their way around Linux and XEN so it will be a bit technical and won’t spell out every single step you have to make.

Most of the changes are based off scripts and information from BenV and wnagele (latter is interesting for me since I am also running XEN on a hetzner server). Have a look at the two links if anything is unclear. Now let’s start the fun 🙂

First of all we need IPv6 up and running on the host (dom0). Add the IP and gateway to your /etc/network/interfaces
This is what mine looks like:
iface eth0 inet6 static
address 2a01:4f8:100:1123::2
netmask 64
gateway 2a01:4f8:100:1120::1
pre-up ip -6 route add 2a01:4f8:100:1120::1 dev eth0

Check if the IP address is responding to the outside world (e.g. with, if everything looks ok, proceed …
Now we need to enable a few things to get routing and neighbor discovery running on the host (dom0). Edit your /etc/sysctl.conf and add/change these 2 entries (and while you are at it, set them with “sysctl -w” too):

So, your host should by now be online with ipv6 and soon be able to route packets to it’s guests. By default XEN will only take care of IPv4 when a guest is created, so here is a small patchfile that adds support for IPv6: xen-ipv6-vif-route.patch. The patch changes vif-route and, while these files may be in different places depending on your distribution, /etc/xen/scripts/ is where they can commonly be found. Download the patch to the directory with the scripts to be changed and execute a “patch -p0 < xen-ipv6-vif-route.patch” ( gets a few new IPv6 functions, and iptables now won’t try to change stuff for IPv6 IPs. vif-route changes are: ndp is enabled for the vif device and the route/neighbor IPv6 settings are set)

So, now that the scripts know how to setup all our IPv6 needs, we need to add the IPv6 IP to our guest settings (.cfg file typically found in /etc/xen/). What we want to change is the “vif” setting. Add the IPv6 IP of the guest to the IPv4 IP (just the IP without the trailing /network, space separated form the IPv4 IP):
vif = [ 'mac=B1:A3:3F:25:11:B8, ip=2a01:4f8:100:1123::5' ]

Now you can create the guest(domU) and add the IPv6 IP to the /etc/network/interfaces of the guest if you haven’t so already (it uses the host (dom0) as the gateway).

iface eth0 inet6 static
address 2a01:4f8:100:1123::5
netmask 64
gateway 2a01:4f8:100:1123::2

Restart the networking on the guest (or reboot it) and you should now be able to ping the guest from the internet. See, easy wasn’t it 🙂

Posted in Programming, Server

Script of the day – clean up stale .ssh/known_hosts

This little script takes an IP or hostname as a parameter, and if there is an offending key in the .ssh/known_hosts it removes it and replaces it with the current valid one useful if you are moving/reinstalling a large amount of servers …

Posted in Tech

Checking a list of IPs against RBL

This is more a reminder to myself than anything else … this is small snippet that takes a list of IPs and does a whois on all that aren’t in a RBL

Lets say we have al list of IPs in a file “iplist.txt”:

Snippet that checks the IPs (can of course be easily changed to check IPs that are IN a RBL)

Posted in Programming

bash: using the content of a variable as variable name

Since the implementation of Arrays in Bash is somewhat lacking compared to higher level programming languages (only one-dimensional), and hash lists require a bit of work to set up, you may run into a situation where you have a small list of key/value pairs that are both variable and you need to store.
There are various solutions for the problem, e.g. creating two arrays (one for the keys, one for the values, and combining them by using the same index values for the entries), or using the functions from the link above to build a hash list. For me the easiest way to solve the problem, if I only have a few variables and don’t want to bloat the code, is to (mis)use declare. declare is intended for setting the type of a variable (constant, array, integer,…), but has the nice side affect that you can use variables in the key name, and you can set the value of the variable.

declare ${Key}=${Value}

$File_Config is variable holding the name of a configfile, the content of the file could look like this:


after the snippet has read the configfile, you can use $Configuration_foo, $Configuration_bar and $Configuration_foobar in your script. The keynames could also have came from a mysql query, array, command line args, …

Posted in Server

back online

The hard drive crash threw me offline a few days due to strange problems with software raids, Xen and acpi. Turns out that using the latest Xen kernel from debian testing branch on a software raid only works of you don’t set “acpi=off” as a kernel parameter. If acpi is turned off, the script “scripts/local-top/mdadm” in the initrd can’t find the devices needed to mount the software raid … causing the whole boot process to come to a grinding halt.

If I find some time I’ll do some more tests, untill then my server will be running with acpi turned on

btw. the hard disk replacement was easy. after the new drive was popped in it was just a copy the partition table and add the partitions of the new disk to the raid

Posted in Tech

dead hard drive

It seems that one of the hard drives in my server died last night. Thanks to the raid no data is lost, but the server will be offline shortly this week (tuesday morning probably) to replace the faulty drive with a fresh new one.

Posted in Me, Photography

Pictures from the Mediterranean cruise

We shot over 600 photos on our cruise. Anyone who missed it on Facebook: we visited Savona (Italy), Barcelona, Palma de Mallorca (both Spain), Tunis (Tunisia), La Valletta (Malta), Cantania and Rom (both Italy). Anyway, here is a selection of some of the photos. Took me the better of the day to sort out which photos were worth while posting, and then post-processing them.






La Valletta

Posted in Photography, Tech

Adobe Lightroom 3

I always shoot in RAW + JPEG. For normal point-and-shoot vacation stuff I’m generally satisfied with the JPEG the camera spits out. But I would never shoot only JPEG. The additional information of RAW shouldn’t be underestimated, and to be honest I often tweak around. It makes a difference if you are working on the original RAW data, or if you are working on the JPEG copy the camera has already processed.
Up till now I’ve been post-processing my images with Adobe Bridge and Photoshop CS3. Since I was planning on post-processing a whole load of pictures I decided to see what software there is out there to streamline the work flow a bit (Bridge and CS3 do the job, and the raw converter in CS3 does offer a wide variety of options, but it is still tedious switching to be switching between both programs and working on multiple RAW images at the same time). While I was away, Adobe released Lightroom 3, so I checked out the reviews and it sounded good. I downloaded the 30 day trial version and to sum it up my experience so far … I’m impressed.
I haven’t worked with Lightroom previously, so I can’t say how much has changed in this version. But I really like the details that make life easier when handling collections of images. Being an Adobe product it also offers interfaces to various Photoshop functions (I only own CS3, I could imagine it offers more options if you have the current version CS5 installed). I could go on and on with things I like about it, but I’ll just sum it up and say: It really streamlines the work flow of post-processing photographs from import to print/upload/web/presentation and if you are shooting RAW it has a whole lot of fun stuff to play around with directly built in.
Since I shoot with a Nikon D80 that tends to produce a fair amount of image noise if I go past ISO 400 I liked the noise reduction features of Lightroom, both color and luminance noise can be reduced greatly with sliders for fine tuning.

It’s a good piece of software, and when the 30 day trial ends I’ll probably go buy it.