#303 check the apache logs for frequent 404s

GREEN
closed
1(low)
2014-09-09
2011-05-09
No

Finding broken links (and adding redirects) significantly improves the usability of a website. Assuming the logs are in the standard place and you have permission to read them, the command:

zcat /var/log/apache2/access.log.*.gz | grep ' 404 ' | awk '{print $7,$11}' | sort | uniq -c | sort -n

will give you a list of missing URLs on the website, the referring URL(s) pointing to them, and a count for each pair, sorted from least to most frequent.
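
A possible refinement, offered only as a sketch: matching on ' 404 ' anywhere in the line can also catch requests whose response size happens to be 404 bytes, so the variant below tests the status field directly. It assumes the Apache "combined" log format (request path in field 7, status in field 9, referrer in field 11); count_404s is a hypothetical helper name, not something already on the server.

```shell
# Hypothetical helper: read combined-format access log lines on stdin
# and print "count path referrer" for every 404, least frequent first.
count_404s() {
  # Field 9 is the HTTP status in the combined log format (an assumption
  # about how the server is configured); fields 7 and 11 are the request
  # path and the referrer.
  awk '$9 == 404 { print $7, $11 }' | sort | uniq -c | sort -n
}
```

It might be used as zcat -f /var/log/apache2/access.log* | count_404s, where the -f option lets GNU zcat pass the current, uncompressed log through alongside the rotated .gz files. The path is the one from the command above and may differ on a given server.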

Discussion

  • James Cummings

    James Cummings - 2012-06-29
    • assigned_to: nobody --> dsew
     
  • James Cummings

    James Cummings - 2013-11-10
    • Group: --> AMBER
    • Priority: 5 --> 1(low)
     
  • James Cummings

    James Cummings - 2013-11-10

    Categorising as AMBER

     
  • James Cummings

    James Cummings - 2013-11-19

    At the Council face-to-face: assigned to JC, GREEN. JC to ask DS to make the TEI-C webserver logs available somewhere we can see them; then action on MH to write a script to generate lists of bad links of various types. JC also to check Google Analytics/Webmaster Tools. Deadline: report back by the next conference call.

     
  • James Cummings

    James Cummings - 2013-11-19
    • assigned_to: David Sewell --> James Cummings
    • Group: AMBER --> GREEN
     
  • David Sewell

    David Sewell - 2013-11-19

    Currently, the log directory for HTTP files on tei-c.org is readable by root only. I'll see whether Shayne is willing to loosen up the permissions, or whether he has a better suggestion for monitoring the errors.

     
  • David Sewell

    David Sewell - 2013-11-20

    Logs are now world-readable under /var/log/httpd

     
  • Martin Holmes

    Martin Holmes - 2013-11-20

    I'll still need to be able to log into the server, though, won't I? Could they be copied to an external location or into the web space?

     
  • Martin Holmes

    Martin Holmes - 2014-01-24

    Having looked at the logs, there's very little that we need to worry about here. I think someone should run linkchecker against the TEI site once in a while, but anyone can do that any time. I suggest closing this ticket.

     
    Last edit: Martin Holmes 2014-01-24
  • James Cummings

    James Cummings - 2014-09-09
    • status: open --> closed
     
  • James Cummings

    James Cummings - 2014-09-09

    Ticket closed as per council recommendation; do this occasionally.

     
  • Martin Holmes

    Martin Holmes - 2014-09-09

    Actually we have a built-in checker that runs once a week on the Jenkins servers, so all we have to do is monitor the results from that.

     
