Bringing Firefox Back!

Firefox has had no shortage of controversial security and privacy changes in recent memory. Frankly, the sum is enough to make you worry about the future of the browser itself. A few of Mozilla’s more interesting choices that come to mind are:

  1. Integrating Pocket (which it now owns)
  2. Selling advertising on new tab pages
  3. Choosing Yahoo over Google as its default search engine
  4. Marketing itself with connections to the Mr. Robot TV show

I’ve fielded more than one malware-removal question after telling someone to “just search for and download 7-Zip”, thanks to Firefox’s multi-year default to Yahoo search. Try finding the legitimate download in Yahoo’s search results compared to Google’s!

With those fiascos now behind it, only a few transgressions keep Firefox from being, if not as good as we’d fondly like to remember it, at least the only real competitor to Chrome. Short of traveling back to the simpler times when Firefox 3 and the Firebug extension brought web development to life for me, I’d say Firefox has never been better. All that’s left is ditching Pocket and cleaning up the new tab page.

Disabling Pocket

  • Visit about:config, search for “extensions.pocket.enabled”, and set it to “false”.
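
If you’d rather set it once per profile instead of flipping it by hand, the same pref can go in a user.js file in your Firefox profile folder. This is just a sketch, using the same pref name from about:config above:

user_pref("extensions.pocket.enabled", false);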

Cleaning up the New Tab Page

  • Open a new tab and click on the Settings button. Uncheck all the options.

That’s it. Restart Firefox and rebel, rebel, rebel!

Spidering / Link Checking With wget

I use XENU for link-checking sites and finding missing assets, but I couldn’t figure out how to make sure it was following the redirects it encountered. For example, if an inline image source is “/images/sitelogo.jpg” but that 301-redirects to “/images/sitelogo-new.jpg”, XENU will report the redirect (as an error if you prefer), but what I really want to know is whether the destination of that redirect returned a 200 OK (or a 404, or something else unintended). It wasn’t clear to me that XENU was checking the file actually existed after the redirect.

I tried out a few other free tools, but none seemed even as good as XENU. Then I stumbled upon the “spider” option in wget. You can set it loose on a URL like so:

wget --spider -l 2 -r -p -o wgetOutput.log http://somesite.net

This will spider the URL up to two levels deep (“-r” with “-l 2”). The “-p” option ensures that inline assets like images and CSS are also checked on each page, even when the maximum depth set by “-l” has been reached. The output is logged to wgetOutput.log.

At the very end of wgetOutput.log you’ll find a list of any broken links it found. You’ll also get a ton of useful information about every request it made – so you know exactly what it’s doing! For example:

Spider mode enabled. Check if remote file exists.
--2013-08-06 20:10:40--  http://somesite.net/images/sitelogo-new.png
Reusing existing connection to somesite.net:80.
HTTP request sent, awaiting response... 200 OK
Length: 4153 (4.1K) [image/png]
Remote file exists but does not contain any link -- not retrieving.
 
Removing somesite.net/images/sitelogo-new.png.
unlink: No such file or directory
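
If you just want to pull the failures out of that log, a simple grep does the trick. A sketch, assuming GNU grep and that 404s are what you’re hunting for (bump the -B value if the URL line doesn’t show up in the context):

grep -B 2 "404 Not Found" wgetOutput.log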

Other Useful Options

Specify a user agent:

-U "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"

Spider a site that forces you to log in:

  1. Get the Cookie Exporter Add-on for Firefox.
  2. Log into the site you want to spider.
  3. From Firefox, run Tools -> Export Cookies -> cookiesFile.txt
  4. Use the “--load-cookies” option:
    --load-cookies cookiesFile.txt
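
If the site uses a simple form login, you can also skip the add-on and let wget fetch the cookie itself. This is only a sketch: the /login URL and the username/password field names are assumptions, so check what the site’s login form actually posts.

wget --save-cookies cookiesFile.txt --keep-session-cookies --post-data "username=me&password=secret" http://somesite.net/login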

Complete Example:

wget --spider -l 2 -r -p -o wgetOutput.log -U "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)" --load-cookies cookiesFile.txt http://somesite.net
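
One caveat: in recursive mode wget honors robots.txt, so if a site’s robots rules keep the spider from crawling everything you expect, you can tell it to ignore them (on sites you own, of course):

wget --spider -l 2 -r -p -e robots=off -o wgetOutput.log http://somesite.net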