Posted on Tuesday the 14th of April 2009 at 9:55 PM
The other day I was looking at the bandwidth usage for the sites I host. Comparing my bandwidth usage to the number of page views I receive a day, it is clear that most of my bandwidth is being used by people who choose to leech things off my site. I'm not really concerned about my bandwidth usage. The shared hosting service I use oversells its bandwidth to try to impress new customers, while in reality you would probably exceed your CPU quota before you got anywhere near your bandwidth quota. However, being the sadistic person I sometimes am, I thought it'd be funny to mess with the people who were leeching images and MP3 files off of me.

Upon investigating exactly where all my bandwidth was going, it became clear that some wise ass spider found my files folder and some wise ass streaming music site is leeching MP3 files directly off my server, probably so they can provide free music without actually doing anything illegal other than indexing thousands of music files that are illegally hosted on other web servers. It sounds like a great service and, judging from the ease with which it found and utilized my music files, it sounds like it works great as well.

So I set off to make an .htaccess file to redirect all of the incoming requests for mp3, wav, midi, and similar file types to a PHP file that'd determine whether I should deliver what they're looking for and, if not, what I should deliver instead. The horrible truth about me is that I am absolutely terrible at regular expressions. I can read and understand them, but when it comes to writing them my pre-existing knowledge doesn't seem to matter. So after a failed search for a pre-existing regular expression I suddenly remembered that I had already made one, for almost the exact same purpose, one year ago.

This is what I now have:

RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/.*$ [NC]
RewriteRule [\w\d_-]+\.(jpe?g|gif|bmp|png|mp3|wav|midi?|zip|rar)$ test.php [L]

The first condition makes the rule skip any request that arrives without a referrer header, so those requests get the file directly. This is usually the case with mp3 players and such. The second condition keeps requests from whatever website is listed there from being run through the PHP file; they are delivered the file without any interference. In my case it'd be "saurdo.com", and you can list as many of those conditions as you want. The regular expression part is the part that makes me feel stupid. It seems so obvious when reading it, but creating it from nothing was somehow a struggle. The "test.php" file is the file that all requests for files with those extensions are run through.
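To convince myself of what those three lines actually do, the same decision can be modeled in a few lines of Python. This is only a sketch: the function name is made up, "mydomain.com" is the same placeholder as in the rules, and it assumes the extension list is wrapped in a group so that the filename part and the `$` anchor apply to every extension rather than only the first and last.

```python
import re

# Assumption: extensions are grouped, so "name." and "$" apply to each one.
LEECHABLE = re.compile(r"[\w\d_-]+\.(jpe?g|gif|bmp|png|mp3|wav|midi?|zip|rar)$")

# Placeholder for the site's own domain; [NC] becomes re.IGNORECASE.
OWN_SITE = re.compile(r"^http://(www\.)?mydomain\.com/.*$", re.IGNORECASE)


def goes_through_script(referer: str, path: str) -> bool:
    """Return True when a request would be rewritten to test.php."""
    # RewriteCond %{HTTP_REFERER} !^$  -> an empty referrer skips the rule
    if referer == "":
        return False
    # RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain...  -> own pages skip it
    if OWN_SITE.search(referer):
        return False
    # RewriteRule patterns are unanchored unless they say otherwise, hence search()
    return LEECHABLE.search(path) is not None
```

With this model, a request for "files/song.mp3" referred by some other site goes through the script, while the same request with no referrer, or one coming from a page on mydomain.com, is served directly.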

One year ago, when I did this originally, I gave up because, after doing a quick search for a similar script, I saw that someone had made exactly what I was trying to make. Unfortunately, a script doesn't yet exist for filtering any file type of your choosing, so that's what I have set off to do. My ultimate goal for this script is that when some unknown website tries to leech something off of your website, the script logs the attempt and returns either the file that was requested or a file that you have selected to display whenever some unknown website tries to access that file type. Later, you can view these requests from the unknown site in a control panel, and from there you can select whether you want to allow the site to leech your files or block it. When you choose to block, you are given the choice to display a special file for that website for that file type.
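Nothing is written yet, but the decision logic I have in mind could look roughly like the Python sketch below. Every name in it (LeechFilter, resolve, allow, block) is hypothetical, the state lives in memory instead of wherever the real script would store it, and the order in which the rules are checked is my own assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple
from urllib.parse import urlparse


@dataclass
class LeechFilter:
    # Sites explicitly allowed to hotlink.
    allowed: Set[str] = field(default_factory=set)
    # (site, file type) -> special file shown to a blocked site.
    blocked: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # file type -> file shown to unknown sites.
    defaults: Dict[str, str] = field(default_factory=dict)
    # (site, requested path) pairs to review later in a control panel.
    log: List[Tuple[str, str]] = field(default_factory=list)

    def allow(self, site: str) -> None:
        self.allowed.add(site)

    def block(self, site: str, filetype: str, special_file: str) -> None:
        self.blocked[(site, filetype)] = special_file

    def resolve(self, referer: str, path: str) -> str:
        """Return the path of the file that should actually be served."""
        site = urlparse(referer).hostname or ""
        filetype = path.rsplit(".", 1)[-1].lower()
        if site in self.allowed:
            return path
        if (site, filetype) in self.blocked:
            return self.blocked[(site, filetype)]
        # Unknown site: log the attempt, then serve the default for this
        # file type if one was selected, otherwise the requested file.
        self.log.append((site, path))
        return self.defaults.get(filetype, path)
```

An unknown site asking for an mp3 would get whatever file is set as the mp3 default and leave an entry in the log; once that site is allowed, the same request returns the real file.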

I haven't completely thought this idea through yet. Something like this would be really entertaining for me and probably for other site owners. I plan to pack as many features into this as I possibly can before I even contemplate releasing it somewhere, so nobody should expect anything for several months. I have already run into my first problem: when passing an mp3 file through PHP, it will not stream properly in Firefox, but it will in IE8, IE7, IE6, Opera, and Chrome. I guess it's finally Firefox's turn to give me some trouble.