New Program Suggestion
SSA monitors your website and notifies you of any changes
Brought to you by: nickthegeek, terryheff
As a web host/designer for the last 15 years I've had a problem that it would seem could be easily fixed with a little utility program. I've tried some that seem like they should accomplish what I need, but none have worked with any kind of reliability.
The problem is that websites fill up over time with outdated or unneeded files, primarily image files. As company sites change product lines, revise images, and run short-term special events, they leave images behind, and I often can't tell whether an image is still being used by a page on the site. Other times I'm asked to step into a site that was previously managed by someone else, or a whole line of other someones, and it's a complete mess of files.
So this program idea is similar to a "Link Checker", but I'm really more interested in having it do a comparison of the site. First, create an index of all the files on the site and their locations, with the ability to exclude folders of course. Then spider the site and gather every referenced filename, with path information, into a second index. Finally, compare the two indexes and show the differences. It should have the flexibility to run the second step on just one HTML file, one folder of HTML files, or the entire domain.
The key result is that you can see which files are not being accessed by any of your existing web pages. This should include web pages that nothing links to when the site is spidered, so you can identify pages that either got dropped from the site's navigation or were originally put up as temporary, directly accessed pages, and that it's now perhaps time to get rid of along with their associated image files.
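The index-spider-compare idea above can be sketched roughly as follows. This is only a minimal illustration, assuming the site is available as a local copy on disk (for example an FTP mirror) rather than being crawled over HTTP. The names `index_files`, `spider`, `find_orphans`, and the `index.html` start page are hypothetical, and directory-style links such as `/products/` are not mapped to an index file.

```python
import os
import posixpath
from html.parser import HTMLParser
from urllib.parse import urlsplit, unquote


class RefCollector(HTMLParser):
    """Collects every href/src attribute value from one HTML document."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.refs.append(value)


def index_files(root, exclude=()):
    """First index: every file under root as a site-relative path."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root).replace(os.sep, "/")
        if rel_dir == ".":
            rel_dir = ""
        # Prune excluded folders in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames
                       if posixpath.join(rel_dir, d) not in exclude]
        for filename in filenames:
            found.add(posixpath.join(rel_dir, filename))
    return found


def resolve(ref, page):
    """Map a reference found on `page` to a site-relative path (None if external)."""
    parts = urlsplit(ref)
    if parts.scheme or parts.netloc or not parts.path:
        return None
    path = unquote(parts.path)
    if path.startswith("/"):
        joined = path.lstrip("/")
    else:
        joined = posixpath.join(posixpath.dirname(page), path)
    norm = posixpath.normpath(joined)
    return None if norm.startswith("..") else norm


def spider(root, start_pages, on_disk):
    """Second index: every local file reachable from the start pages."""
    seen = set()
    queue = [p for p in start_pages if p in on_disk]
    while queue:
        page = queue.pop()
        if page in seen:
            continue
        seen.add(page)
        if not page.lower().endswith((".html", ".htm")):
            continue  # images, CSS, etc. are recorded but not parsed
        parser = RefCollector()
        full_path = os.path.join(root, *page.split("/"))
        with open(full_path, encoding="utf-8", errors="replace") as fh:
            parser.feed(fh.read())
        for ref in parser.refs:
            target = resolve(ref, page)
            if target in on_disk and target not in seen:
                queue.append(target)
    return seen


def find_orphans(root, start_pages=("index.html",), exclude=()):
    """Compare the two indexes: files on disk that the spider never reached."""
    on_disk = index_files(root, set(exclude))
    return sorted(on_disk - spider(root, start_pages, on_disk))
```

Calling `find_orphans("/path/to/site/copy", exclude=("private",))` would list both unused images and HTML pages that nothing links to. Passing a single page or a folder's pages as `start_pages` narrows the second step, as described above. References built by scripts or buried in CSS are not seen by this sketch, so its output is a list of candidates to review, not a list of files that are safe to delete.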
The link checker programs I've tried are geared more for broken "a href" links and don't report broken "img src" tags. The one or two I found that were supposed to also report image tags failed to do so with any reliability.
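For the "img src" side specifically, a check along these lines is what I have in mind. It is a rough sketch only, again assuming a local copy of the site; the function name `broken_images` is made up for illustration, and images hosted on other domains are skipped, not fetched.

```python
import os
import posixpath
from html.parser import HTMLParser
from urllib.parse import urlsplit, unquote


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def broken_images(root):
    """Return sorted (page, src) pairs whose local image file is missing."""
    broken = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.lower().endswith((".html", ".htm")):
                continue
            full = os.path.join(dirpath, name)
            page = os.path.relpath(full, root).replace(os.sep, "/")
            collector = ImgSrcCollector()
            with open(full, encoding="utf-8", errors="replace") as fh:
                collector.feed(fh.read())
            for src in collector.sources:
                parts = urlsplit(src)
                if parts.scheme or parts.netloc or not parts.path:
                    continue  # external or empty reference; not checked here
                path = unquote(parts.path)
                if path.startswith("/"):
                    rel = path.lstrip("/")
                else:
                    rel = posixpath.join(posixpath.dirname(page), path)
                rel = posixpath.normpath(rel)
                if not os.path.isfile(os.path.join(root, *rel.split("/"))):
                    broken.append((page, src))
    return broken if not broken else sorted(broken)
```

Each result pairs the page with the exact `src` value as written in the HTML, which makes the offending tag easy to find in an editor.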
Whether this is a standalone program run on a PC or a PHP script run on the website doesn't much matter to me. I just can't believe someone hasn't created something like this already.
A further enhancement down the road would be to have it also check some search engines to see whether unlinked image or HTML files are showing up there at all, and maybe give a rating of some kind for how valuable each file is. That way the manager has some sense of whether deleting a rogue image or HTML page is going to have any effect on the site's popularity or traffic.
There you go... I think that would be a very useful utility for webmasters everywhere.
Hi BZFriend
SSA v1.4 is not designed to incorporate your suggestions. However, I have also developed SSA Multisite (now on version 1.5.3), which does a lot more and supersedes this version. You can download and find all the info at http://simplesiteaudit.terryheffernan.net
Please use the site forum for anything you wish to discuss.
Cheers
I have a script that will search the contents of all text-based files in a website for the existence of a list of file names. This, in effect, will find any orphaned files that exist on the site.
It's still in its raw state as a simple script with no front end, but it will find orphaned files. Let me know if you're interested.
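To give an idea of the general approach (this is not the script itself, just a minimal sketch of the same technique), something like the following searches every text-based file for each file name on the site. The function name `unreferenced_names` and the list of text extensions are assumptions made for the example.

```python
import os

# Assumed set of "text-based" file types; adjust to suit the site.
TEXT_EXTENSIONS = (".html", ".htm", ".php", ".css", ".js", ".txt", ".xml")


def unreferenced_names(root, text_extensions=TEXT_EXTENSIONS):
    """Return site-relative paths whose file name appears in no other text file."""
    paths = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            paths.append(os.path.join(dirpath, name))

    # Read every text-based file once and keep its contents in memory.
    texts = {}
    for path in paths:
        if path.lower().endswith(text_extensions):
            with open(path, encoding="utf-8", errors="replace") as fh:
                texts[path] = fh.read()

    orphans = []
    for path in paths:
        name = os.path.basename(path)
        # A file that mentions its own name does not count as a reference.
        mentioned = any(name in text
                        for other, text in texts.items() if other != path)
        if not mentioned:
            orphans.append(os.path.relpath(path, root).replace(os.sep, "/"))
    return sorted(orphans)
```

Because this matches bare file names as plain substrings, `a.png` would count as referenced wherever `banana.png` appears, and two files with the same name in different folders cannot be told apart. An entry page such as `index.html` is also reported unless some other file mentions it, so the results need a manual check before anything is deleted.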
Last edit: Terry Heffernan 2012-12-14