#303 check the apache logs for frequent 404s

Group: GREEN
Status: closed
Priority: 1(low)
Updated: 2014-09-09
Created: 2011-05-09
No

Finding broken links (and adding redirects for them) significantly improves the usability of a website. Assuming the logs are in the standard place and you have permission to read them, the command:

zcat /var/log/apache2/access.log.*.gz | grep ' 404 ' | awk '{print $7,$11}' | sort | uniq -c | sort -n

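will give you a list of missing URLs on the website, each with the referring URL that pointed to it and a count of requests, sorted from least to most frequent. Note that it reads only the rotated, compressed logs, not the current access.log.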
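
One caveat with the pipeline above: grep ' 404 ' also matches lines where some other field happens to be 404, for example a successful response whose size is 404 bytes. A minimal sketch of a stricter variant follows. It assumes the logs use Apache's combined log format, in which whitespace-separated field 9 is the status code; the function name count_404s is only illustrative.

```shell
# Count 404 responses per (requested path, referrer) pair.
# Reads access-log lines on stdin; assumes the combined log format,
# where field 7 is the request path, field 9 the status code and
# field 11 the referrer.
count_404s() {
  awk '$9 == 404 {print $7, $11}' | sort | uniq -c | sort -n
}
```

It can be fed the same way as the original command: zcat /var/log/apache2/access.log.*.gz | count_404s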

Discussion

  • James Cummings

    James Cummings - 2012-06-29
    • assigned_to: nobody --> dsew
     
  • James Cummings

    James Cummings - 2013-11-10
    • Group: --> AMBER
    • Priority: 5 --> 1(low)
     
  • James Cummings

    James Cummings - 2013-11-10

    Categorising as AMBER

     
  • James Cummings

    James Cummings - 2013-11-19

    At the Council face-to-face: assigned to JC and marked GREEN. JC to ask DS to make the TEI-C webserver logs available somewhere we can see them; then an action on MH to write a script to generate lists of bad links of various types. JC also to check Google Analytics/Webmaster Tools. Deadline: report back by the next conference call.

     
  • James Cummings

    James Cummings - 2013-11-19
    • assigned_to: David Sewell --> James Cummings
    • Group: AMBER --> GREEN
     
  • David Sewell

    David Sewell - 2013-11-19

    Currently, the log directory for HTTP files on tei-c.org is readable by root only. I'll see whether Shayne is willing to loosen up the permissions, or whether he has a better suggestion for monitoring the errors.

     
  • David Sewell

    David Sewell - 2013-11-20

    Logs are now world-readable under /var/log/httpd

     
  • Martin Holmes

    Martin Holmes - 2013-11-20

    I'll still need to be able to log into the server, though, won't I? Could they be copied to an external location or into the web space?

     
  • Martin Holmes

    Martin Holmes - 2014-01-24

    Having looked at the logs, there's very little that we need to worry about here. I think someone should run linkchecker against the TEI site once in a while, but anyone can do that any time. I suggest closing this ticket.

     

    Last edit: Martin Holmes 2014-01-24
  • James Cummings

    James Cummings - 2014-09-09
    • status: open --> closed
     
  • James Cummings

    James Cummings - 2014-09-09

    Ticket closed as per council recommendation; do this occasionally.

     
  • Martin Holmes

    Martin Holmes - 2014-09-09

    Actually we have a built-in checker that runs once a week on the Jenkins servers, so all we have to do is monitor the results from that.