Over the past few months, something has been whacking my server’s apache log files. I did some tracking down, and what I found was disappointing. Someone is clearly attempting to exploit apache (or web servers in general) by sending invalid requests: data designed to overflow the server process and gain some permissions.

The log entries would look like this:
xx.yy.zz.aa - - [08/Feb/2005:07:11:45 -0700] "SEARCH /\x90\x02\xb1\x02\xb1\x02\xb1\x02\xb1\x02\xb1\x02\xb1\x02\xb1\x02
... and so on, ad nauseam

In and of itself, this was a mere nuisance. The apache process is smarter than that: it just sends back a 414 (Request-URI Too Long) and drops the request. The problem, however, occurs when I later try to parse the logfile myself, say, when I generate usage data for a site. The first step is to run ‘logresolve’ (supplied by apache) against the file and have it spit out the same file with all the IP addresses resolved to hostnames. Problem is: logresolve has NO CLUE how to deal with entries like the one above. Ah well, so whatever… I have a few months’ worth of bad and missing logfile data because of it. I’m testing my new script now to make sure it doesn’t happen again.
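
For anyone hitting the same thing, here is a rough Python sketch of one way to pre-filter a log before it goes to logresolve. It is not the script I mentioned, just an illustration. It assumes the bad entries can be recognized by a run of \xNN escapes in the request, or by sheer line length, and that throwing those lines away is acceptable. The names (is_suspect, filter_log, main) and the 1024-character cutoff are made up for the example.

```python
import re
import sys

# Assumption: the junk requests show up in the log as long runs of
# \xNN escapes, like the entry quoted above. Four in a row is an
# arbitrary threshold chosen for this sketch.
ESCAPED_BYTES = re.compile(r"(?:\\x[0-9a-fA-F]{2}){4,}")

# Hypothetical cutoff: ordinary entries are far shorter than this.
MAX_LINE_LENGTH = 1024


def is_suspect(line, max_length=MAX_LINE_LENGTH):
    """Return True if the line looks like one of the overflow attempts."""
    too_long = len(line) > max_length
    has_escapes = ESCAPED_BYTES.search(line) is not None
    return too_long or has_escapes


def filter_log(lines, max_length=MAX_LINE_LENGTH):
    """Yield only the lines that do not look suspect."""
    for line in lines:
        if not is_suspect(line, max_length):
            yield line


def main():
    # Read the raw log on stdin, write the cleaned log to stdout.
    sys.stdout.writelines(filter_log(sys.stdin))
```

Saved as a small script that calls main() (say filter_log.py, a hypothetical name), it could sit in front of logresolve, which reads the log on stdin: `python filter_log.py < access_log | logresolve > access_log.resolved`. Dropping the lines outright loses the record of the attack, so writing the rejected lines to a separate file might be the better choice if you want to keep them.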