search signals | SEO the /ha(rd|cker)/ way

Checking Status Codes From the Command Line

Using cURL to Check HTTP Headers and Status Codes

This tutorial is still in draft and is incomplete. Please use with caution.

Checking for a 301 or 302 redirect is a pretty common SEO task.

There are a lot of different tools you can use, from Chrome Extensions and Firefox Plugins to full websites.

However, the simplest and purest way of checking a status code is from the Command Line.

If you consider the fact that Googlebot and Bingbot are non-browser-centric applications, then the closer to the metal you are, the closer to reality you are.

cURL Is Your Best Friend

If you’re not familiar with cURL, you really should be. It’s a super powerful command line tool, and the best thing is that it’s nearly everywhere. If you shell into a Linux-based server, you’ll almost always have all the features of cURL available to you.

Changing Your User Agent

Using the -A (--user-agent) flag, you can easily change the user agent to anything you want.

This is particularly useful when auditing a site to make sure it’s not cloaking by user agent for Googlebot or Bingbot.

curl --user-agent 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)' http://searchsignals.com/

Additionally, you can create a unique user agent so you can easily filter out your own requests from the server logs, especially if you’re using an IP outside of your office network.
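To make the cloaking check concrete, here is a minimal sketch that fetches the same URL with two user agents and prints the status code for each. The function names and the seo-audit-example/1.0 string are made up for illustration; swap in whatever is easy to grep for in your own logs.

```shell
# Print the status code a URL returns for a given user agent.
# -s hides the progress meter, -o /dev/null discards the body,
# and -w '%{http_code}\n' prints only the status code.
status_for_ua() {
  curl -s -o /dev/null -A "$1" -w '%{http_code}\n' "$2"
}

# Compare the Googlebot user agent against a custom one.
# "seo-audit-example/1.0" is a hypothetical string, not a real crawler.
compare_user_agents() {
  googlebot='Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
  echo "googlebot=$(status_for_ua "$googlebot" "$1")"
  echo "custom=$(status_for_ua 'seo-audit-example/1.0' "$1")"
}
```

Calling compare_user_agents http://searchsignals.com/ prints one line per user agent. Matching status codes don’t rule out cloaking on their own, since the body can still differ, but a mismatch is worth a closer look.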

HEAD Requests vs GET Requests

It’s common for cURL users to reach for the -I (--head) flag to fetch only the HTTP headers, so the output doesn’t scroll off the screen.

However, I strongly advise against this. On rare occasions, CDNs or other server configurations respond to a HEAD request with a different status code or different HTTP headers.

Given that you’ll want to replicate the search crawler’s experience as closely as possible, you should always make GET requests.

Since GET requests also grab the body of the response, you might end up having to scroll quite a bit just to get back to the top so you can read the status code and HTTP headers.
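If you want to see whether a particular server treats the two methods differently, a rough sketch like the one below requests the same URL both ways and prints the status codes side by side. The head_vs_get name is just an illustrative helper, not part of cURL.

```shell
# Request the same URL with HEAD (-I) and with GET, then print both codes.
# -o /dev/null throws away the output; -w '%{http_code}' prints the status.
head_vs_get() {
  head_code=$(curl -s -o /dev/null -I -w '%{http_code}' "$1")
  get_code=$(curl -s -o /dev/null -w '%{http_code}' "$1")
  echo "HEAD=$head_code GET=$get_code"
}
```

Run it as head_vs_get http://searchsignals.com/ and compare the two values. If they differ, trust the GET.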

Here’s a little helper modification to make your life a bit easier. It writes the headers to a temporary file, discards the body, and then prints the headers.

curl -D /tmp/headers.txt -s http://searchsignals.com/tutorials/ > /dev/null && cat /tmp/headers.txt

You can also alias it so you don’t have to type it every single time.
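As a sketch of the aliasing idea, the helper can be wrapped in a small shell function in your ~/.bashrc or similar. This variation uses -D - to dump the headers straight to standard output instead of a temp file; the headers name is only a suggestion.

```shell
# Make a GET request, print only the response headers, discard the body.
# -D - writes the headers to stdout; -o /dev/null drops the body.
headers() {
  curl -s -D - -o /dev/null "$1"
}
```

After reloading your shell, headers http://searchsignals.com/tutorials/ prints the status line and headers without the page body.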

Bulk Checking with Parallel Requests

There are a variety of ways to make multiple requests at a time with cURL.
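One of the simpler options, sketched here under a few assumptions, is to hand a list of URLs to xargs and let it run several cURL processes at once. This assumes a hypothetical urls.txt file with one URL per line and an xargs that supports the -P flag (the GNU and BSD versions do); check_urls is an illustrative name.

```shell
# Check every URL listed in a file, up to 4 requests at a time.
# -n 1 passes one URL to each curl process; -P 4 caps the parallelism.
# Each output line is "<status code> <url>"; the order is not guaranteed.
check_urls() {
  xargs -n 1 -P 4 curl -s -o /dev/null -w '%{http_code} %{url_effective}\n' < "$1"
}
```

Run it as check_urls urls.txt, then pipe the output through sort or grep to pull out the 301s and 302s.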