Ad blocking with /etc/hosts

This is an old trick, but generating an /etc/hosts file with dummy entries for common ad / spam / "marketing" sites is a really efficient way to block them in every browser on the machine at once.

http://someonewhocares.org/hosts/ has a nicely maintained list mapping thousands of sites to 127.0.0.1 (there's a 0.0.0.0 version too). To use it, replace your /etc/hosts with the list, after first checking your current file for entries you need to keep!
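
If you'd rather not check by hand, the merge can be scripted. What follows is only a rough sketch of one way to do it, not something from the list's maintainer: the function names parse_entries and merge_hosts are made up for illustration, and it assumes you have already read both files into strings. It keeps your existing file untouched and appends only blocklist entries for hostnames you haven't already mapped.

```python
def parse_entries(text):
    """Yield (address, hostname) pairs from hosts-file text, ignoring comments."""
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        for name in names:  # one line may list several hostnames
            yield address, name


def merge_hosts(existing, blocklist):
    """Return the existing text plus blocklist entries for unmapped hostnames."""
    known = {name for _, name in parse_entries(existing)}
    merged = existing.rstrip("\n").splitlines()
    merged.append("# --- blocklist entries ---")
    for address, name in parse_entries(blocklist):
        if name not in known:  # never override an existing mapping
            merged.append(f"{address} {name}")
            known.add(name)
    return "\n".join(merged) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data; in practice read /etc/hosts and the downloaded list.
    existing = "127.0.0.1 localhost\n192.168.1.10 nas.local  # my NAS\n"
    blocklist = "# sample\n127.0.0.1 localhost\n127.0.0.1 stats.g.doubleclick.net\n"
    print(merge_hosts(existing, blocklist), end="")
```

Write the result to a temporary file and look it over before copying it to /etc/hosts, since a bad hosts file can break name resolution for the whole machine.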

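I've combined this with Apache, which ships with OS X and is a package install away on most Linux desktops. Because the blocked hostnames now resolve to 127.0.0.1, plain HTTP requests for them land on the local web server once it's running, so with a custom combined log format I can see which site requested the ad, which ad service it used and the request parameters.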

My format:

LogFormat "%h %l %u %t %>s %b \"%{Referer}i\" -> %V \"%r\" \"%{User-Agent}i\"" combined

This gives me nice entries like:

127.0.0.1 - - [15/May/2014:12:21:25 +0100] 404 203 "http://someonewhocares.org/hosts/zero/" \
-> stats.g.doubleclick.net "GET /dc.js HTTP/1.1" \
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/537.75.14"

127.0.0.1 - - [15/May/2014:12:23:31 +0100] 404 203 "http://someonewhocares.org/hosts/" \
-> stats.g.doubleclick.net "GET /dc.js HTTP/1.1" \
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/537.75.14"

:D
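
Once a few of these pile up, it's handy to tally which sites are calling which ad hosts. Here's a minimal sketch in Python, assuming log lines in exactly the format above (each entry is a single line in the real log; the backslashes above are just wrapping for display). The names LINE_RE, parse_line and count_pairs are my own for illustration, and the regex doesn't handle values containing escaped quotes, so treat it as a starting point.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Mirrors: %h %l %u %t %>s %b "%{Referer}i" -> %V "%r" "%{User-Agent}i"
LINE_RE = re.compile(
    r'(?P<client>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" -> '
    r'(?P<vhost>\S+) "(?P<request>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def count_pairs(lines):
    """Count (referring site, requested ad host) pairs across log lines."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip lines in some other format
        site = urlsplit(entry["referer"]).hostname or "-"
        counts[(site, entry["vhost"])] += 1
    return counts


if __name__ == "__main__":
    sample = (
        '127.0.0.1 - - [15/May/2014:12:21:25 +0100] 404 203 '
        '"http://someonewhocares.org/hosts/zero/" -> stats.g.doubleclick.net '
        '"GET /dc.js HTTP/1.1" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2)"'
    )
    for (site, ad_host), n in count_pairs([sample]).most_common():
        print(n, site, "->", ad_host)
```

To run it against a real log, pass an open file object to count_pairs in place of the sample list; the log's path depends on your Apache setup.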

The main reason I've switched to this is that my browser ad blockers were themselves eating a lot of resources (which are scarce on my home iMac with its 4GB of RAM), and they don't always work (or their definition of an ad is different to mine).

It's sad that it has come to this. I was quite happy with the old style of Google text ads or thoughtful ad networks like Carbon. The single worst offender for littering their site with tracking goo has to be the Guardian, which takes a noticeably long time to load on my machines.

Kudos to PageFair for trying to bring some sanity to online advertising.