Science Defense Force
A collection of IP ranges used by (potentially) anti-science copyright enforcement services that you may want to ban.
copyrightinfringementportal.com / ScanSafe London
iptables -I INPUT -s 80.254.145.0/255.255.255.0 -j DROP
iptables -I INPUT -s 80.254.146.0/255.255.255.0 -j DROP
iptables -I INPUT -s 80.254.147.0/255.255.255.0 -j DROP
iptables -I INPUT -s 80.254.158.0/255.255.255.0 -j DROP
Lightspeed Systems
iptables -I INPUT -s 69.84.207.128/255.255.255.128 -j DROP
fairshare.cc
iptables -I INPUT -s 64.41.145.0/255.255.255.0 -j DROP
iptables -I INPUT -s 209.249.53.0/255.255.255.0 -j DROP
paper.li (?, not sure if static)
# Broad Amazon EC2-6 (204.236.128.0 - 204.236.255.255)
#iptables -I INPUT -s 204.236.128.0/17 -j DROP
# Narrow alternative: a single IP, but is it static?
iptables -I INPUT -s 204.236.139.131 -j DROP
Intellectual property bot (?, unknown)
iptables -I INPUT -s 50.31.96.0/28 -j DROP
Intellectual property bot (?, unknown)
iptables -I INPUT -s 208.75.57.30 -j DROP
CHINANET Beijing province network (?, political docs)
# Probably too broad. Note that FORWARD only filters routed traffic; use INPUT if the server runs on this host.
iptables -I FORWARD -s 220.181.0.0/255.255.0.0 -j DROP
iptables -I FORWARD -s 123.112.0.0/255.255.0.0 -j DROP
BrandDimensions
# FORWARD only filters routed traffic; use INPUT if the server runs on this host.
iptables -I FORWARD -s 72.14.164.147 -j DROP
Chinese spider looking for political docs
iptables -I INPUT -s 180.166.96.142 -j DROP
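Running the rules above twice inserts every rule twice. A minimal sketch of one way around that, assuming a POSIX shell and an iptables new enough to support -C (rule check); the block_source function and the DRY_RUN variable are hypothetical names, not part of iptables. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical switch: 1 (default) prints commands, 0 applies them (needs root).
DRY_RUN="${DRY_RUN:-1}"

# block_source is a hypothetical helper: add one DROP rule for a source range.
block_source() {
    src="$1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "iptables -I INPUT -s $src -j DROP"
    # -C succeeds if an identical rule already exists, so the insert is skipped.
    elif ! iptables -C INPUT -s "$src" -j DROP 2>/dev/null; then
        iptables -I INPUT -s "$src" -j DROP
    fi
}

# A few of the ranges listed above, written in CIDR notation.
for src in 80.254.145.0/24 69.84.207.128/25 204.236.139.131; do
    block_source "$src"
done
```

Run it as is to review the generated commands, then with DRY_RUN=0 as root to apply them.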
Additionally, a modified nginx configuration entry like the one below might be useful; the 403 response is only one possible choice.
if ($http_user_agent ~* (PaperLiBot|Lightspeedsystems|Slurp|Googlebot|bingbot|Baiduspider|YandexBot|msnbot|facebookexternalhit|FairShare|Gee.Bit|yacybot|SISTRIX|Mediapartners-Google|aboundex|WBSearchBot|360spider)) { return 403; }
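To see which user agents the pattern above would catch before deploying it, something like the following sketch may help. It assumes grep -E with -i is a close enough stand-in for nginx's case-insensitive ~* match, which holds for this simple alternation but not for PCRE in general. The ua_is_blocked function, the UA_PATTERN variable and the sample user-agent strings are made up for illustration.

```shell
#!/bin/sh
# Pattern copied from the nginx entry; grep -E only approximates nginx's PCRE matching.
UA_PATTERN='(PaperLiBot|Lightspeedsystems|Slurp|Googlebot|bingbot|Baiduspider|YandexBot|msnbot|facebookexternalhit|FairShare|Gee.Bit|yacybot|SISTRIX|Mediapartners-Google|aboundex|WBSearchBot|360spider)'

# ua_is_blocked is a hypothetical helper: exit status 0 if the string matches the pattern.
ua_is_blocked() {
    printf '%s\n' "$1" | grep -Eiq "$UA_PATTERN"
}

# Illustrative user-agent strings, not taken from real logs.
for ua in "Mozilla/5.0 (compatible; Baiduspider/2.0)" \
          "Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0"; do
    if ua_is_blocked "$ua"; then
        echo "blocked: $ua"
    else
        echo "allowed: $ua"
    fi
done
```

Note that the list includes general search engine crawlers such as Googlebot and bingbot, so the check will report those as blocked too.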