A potential reader contacted me about the blog being unavailable from his home or office. He was able to ping blog.jdpfu.com, but HTTP requests never returned, ever.
I have been forced to block large areas of the internet from accessing the blog website due to blog spammers and other nefarious access attempts. If you are affected, I am truly sorry.
While blocking by IP is terrible, I did not feel there was any choice. The connectivity, servers, power, and my time to maintain software and create content are not free. I have been doing this on my dime the last … er … 10+ yrs. I hate blog spam and copyright infringement, so vicious techniques are used to fight them.
IP Blocking
I try to avoid blocking residential ISPs, but sometimes there is not any choice.
China, Russia, and Romania IP lists are used to block all access. Sorry.
There are about 200 other subnets that are also blocked – most are inside the USA from hosting providers or specific businesses – AFTER abusive use is seen. For example, a corporate Microsoft subnet was blog spamming here, so half of Microsoft's public IPs were blocked. Many providers in Northern Virginia and Maryland are also blocked due to other considerations. I apologize to the guys at BlueCoat for blocking their public IPs.
If you can ping, but not connect to the blog, it is likely you are trying to access the blog from one of those blocked subnets. Use TOR or a VPN with an exit node in a different subnet or country.
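As an illustration of how subnet blocking works in general, here is a minimal Python sketch. It is not the actual mechanism used on this blog. The subnets are reserved documentation ranges chosen as placeholders, and `is_blocked` is a hypothetical helper name. A firewall doing the same check and silently dropping HTTP packets would match the "ping works, HTTP never returns" symptom described above.

```python
import ipaddress

# Placeholder blocklist: reserved documentation ranges, NOT the blog's real list.
BLOCKED_SUBNETS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked subnet."""
    addr = ipaddress.ip_address(client_ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed lists are safe to scan with a single loop.
    return any(addr in net for net in BLOCKED_SUBNETS)


if __name__ == "__main__":
    for ip in ("192.0.2.55", "203.0.113.9"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

Because whole subnets are matched, one abusive host is enough to take its neighbours down with it, which is why innocent readers sometimes get caught.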
Referral Blocking
Referrals from a small number of domains are blocked. It seems they were stealing bandwidth. Since I do not have much bandwidth to start with, it was necessary to block them.
403 error is returned.
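A referral block boils down to checking the host in the `Referer` header against a list. The sketch below assumes that approach. The domain names and the `referrer_status` function are made up for illustration and are not the domains blocked here.

```python
from urllib.parse import urlsplit

# Hypothetical domains; the real blocked list is not published.
BLOCKED_REFERRER_DOMAINS = {"hotlinker.example", "scraper.example"}


def referrer_status(referer_header):
    """Return the HTTP status a request would get based on its Referer."""
    if not referer_header:
        return 200  # no Referer sent, nothing to match against
    host = urlsplit(referer_header).hostname or ""
    for domain in BLOCKED_REFERRER_DOMAINS:
        # Match the domain itself and any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return 403
    return 200


if __name__ == "__main__":
    print(referrer_status("http://img.hotlinker.example/gallery"))
    print(referrer_status("https://example.org/"))
```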
Client User-Agent Blocking
Certain clients are blocked because they are abusive or only used to grab the entire website. If your client is abusive, it may be added to the blocked list too. Older clients that nobody should be using anymore also get blocked from time to time. I always figure these are old content-grabbing tools. Popular network modules from PHP, Perl, Python, Java, and Ruby with easy-to-understand user-agents are also blocked.
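User-agent blocking is usually a simple substring match. The following sketch assumes that, using the default identifiers that a few common HTTP libraries send. The token list and the `user_agent_status` name are illustrative assumptions, not the list in use on this site.

```python
# Illustrative tokens based on common library defaults; an assumption,
# not the blog's actual blocked list.
BLOCKED_UA_TOKENS = (
    "python-requests",
    "python-urllib",
    "libwww-perl",
    "java/",
    "ruby",
)


def user_agent_status(user_agent):
    """Return 403 if the User-Agent contains a blocked token, else 200."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in BLOCKED_UA_TOKENS):
        return 403
    # A missing User-Agent is let through in this sketch; a real setup
    # might treat it differently.
    return 200


if __name__ == "__main__":
    print(user_agent_status("python-requests/2.31.0"))
    print(user_agent_status("Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0"))
```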
RSS readers that hit the feeds here more often than once an hour are abusive in my book. It is unusual for more than 1 article to be posted a week, so hitting the RSS feed every 5 mins is crazy. Once a day would be ideal, if I were asked. A number of RSS readers have been blocked. Sorry if you are impacted.
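To make the once-an-hour rule concrete, here is a rough sketch of how polling frequency could be tracked per reader. The `FeedPollTracker` class and the idea of keying on a client identifier are assumptions for illustration. Nothing here describes how feeds are actually monitored on this blog.

```python
MIN_INTERVAL_SECONDS = 3600  # "more often than once an hour" counts as abusive


class FeedPollTracker:
    """Remember the last feed request per client and flag rapid re-polls."""

    def __init__(self, min_interval=MIN_INTERVAL_SECONDS):
        self.min_interval = min_interval
        self._last_seen = {}  # client id -> timestamp of previous request

    def is_too_frequent(self, client_id, now):
        """Record a request at time `now` (seconds); True if it came too soon."""
        previous = self._last_seen.get(client_id)
        self._last_seen[client_id] = now
        return previous is not None and (now - previous) < self.min_interval


if __name__ == "__main__":
    tracker = FeedPollTracker()
    print(tracker.is_too_frequent("reader-a", 0))     # first request
    print(tracker.is_too_frequent("reader-a", 300))   # 5 minutes later
```

A reader that polls every 5 minutes trips the check on its second request, while one polling daily never does.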
403 error is returned.
Windows 10
403 error is returned. Get a different OS.
Check the workarounds below.
Search Engine Blocking
Certain search engines do not honor the /robots.txt file here. Some were downright abusive like Baidu. At one point Baidu was using 80% of the total bandwidth, but not sending any (0%) referrals. Huh? Why would I allow a search engine to continue doing that? Well, I would not, sorry to Baidu users – that search engine is blocked. The blocking of Baidu is not perfect, but works well enough.
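For reference, this is roughly what honoring robots.txt looks like from the crawler's side, using Python's standard `urllib.robotparser`. The rules below are a made-up example and not the contents of the /robots.txt file here. `Baiduspider` is the name Baidu's crawler identifies itself with.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow: /admin/
"""


def may_fetch(user_agent, url, robots_txt=EXAMPLE_ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(may_fetch("Baiduspider", "http://blog.jdpfu.com/"))
    print(may_fetch("SomeOtherBot", "http://blog.jdpfu.com/"))
```

A well-behaved crawler runs a check like this before every request. robots.txt is only advisory, so a crawler that skips the check has to be stopped by other means.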
Limits of Blocking
I realize that it is trivial to get around most of these blocking methods. I also realize that my little blog is not worth any manual effort by the large trackers/spiders to figure out. If someone cannot be bothered to change the user-agent, I do not think they would get much useful information from this blog anyway.
Workarounds
If you are receiving a 403 return, either change the user-agent or use a different client.
If you are dropping into a black hole, use TOR or some VPN with exit nodes in a different country.
Try using the Google cache or the Wayback Machine to see the page you want. That will work.
I will NOT whitelist any specific IPs.
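For the first workaround above, changing the user-agent in a script is a one-liner. This sketch uses Python's standard `urllib.request`. The `build_request` helper and the user-agent string are only examples, and no particular string is guaranteed to be accepted.

```python
from urllib.request import Request


def build_request(url, user_agent):
    """Build (but do not send) a request carrying a custom User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    # Example value only; identify your client honestly.
    req = build_request("http://blog.jdpfu.com/", "MyFeedReader/1.0")
    # urllib stores header names capitalized as "User-agent".
    print(req.get_header("User-agent"))
```

Pass the result to `urllib.request.urlopen(req)` to actually fetch the page, and keep the polling interval reasonable.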