#1065 html tags and internet links in notes

Milestone: phpGedView
Status: open
Owner: nobody
Priority: 5
Updated: 2008-12-04
Created: 2008-12-04
Creator: tales
Private: No

I am not sure whether this is a bug or a feature request. I exported my GEDCOM so that it would show HTML tags. When I looked at some notes, I found that bold text was not shown in bold; instead, the HTML codes themselves (<b>, for example) appeared in the notes. Italics had the same problem. I consulted my son, who managed a few quick fixes so that the notes display correctly, but he is not completely satisfied with the code and is not sure how to fix it the "correct" way. He did ask me to pass along some information that he found.

The expand_urls function has a line that says:

preg_replace("/<(?!br)/i", "&lt;", $text) // no html except br

This is at least part of what causes the < bracket to be rewritten as &lt; (he couldn't find where the > bracket is handled; apparently not in that function). The replacement happens unless the < is followed by br, in which case it is left alone.
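
A minimal sketch of that behavior, using only the preg_replace call quoted above. The wrapper name escape_lt_except_br is hypothetical and is not part of phpGedView; it exists only to make the line runnable on its own.

```php
<?php
// Hypothetical wrapper around the line quoted from expand_urls().
// The negative lookahead (?!br) means: match "<" only when it is
// NOT immediately followed by "br" (case-insensitive).
function escape_lt_except_br($text) {
    return preg_replace("/<(?!br)/i", "&lt;", $text);
}

// "<br/>" survives; the "<" of "<b>" and "</b>" is escaped.
// The ">" characters are untouched by this pattern.
echo escape_lt_except_br("one<br/>two <b>bold</b>"), "\n";
// prints: one<br/>two &lt;b>bold&lt;/b>
```

Only the opening bracket is rewritten here, which matches the observation that the > bracket must be handled somewhere else.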

It would help if you could support some basic formatting (i.e. bold and italics, and maybe even underlining) by modifying the preg_replace to allow the additional tags. If all you allow is BR, I, and B, then there is no security risk at all and it should be a simple fix.

I hope I have explained this situation correctly.

Discussion

  • tales

    tales - 2008-12-04

    My son just sent me a better fix which he and a friend found.

    Here's the code that he used to fix it. You would just need to do a simple cut and paste to fix the issue:

    preg_replace("/<(?!(b|i|br|\/b|\/i))/i", "&lt;", $text) // no html except br, b, and I

    It's in the expand_urls($text) function in the functions_print_facts.php file. The updated function in its entirety is below.

    function expand_urls($text) {
        // Some versions of RFC3987 have an appendix B which gives the following regex
        // (([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
        // This matches far too much while a "precise" regex is several pages long.
        // This is a compromise.
        // (A stray "]" after "ftp" has been removed so that ftp: URLs can match.)
        $URL_REGEX='((https?|ftp):)(//([^\s/?#<>]*))?([^\s?#<>]*)(\?([^\s#<>]*))?(#[^\s?#<>]+)?';

        return preg_replace_callback(
            '/'.addcslashes("(?!>)$URL_REGEX(?!</a>)", '/').'/i',
            create_function( // Insert <wbr/> codes into the replaced string
                '$m',
                'return "<a href=\"".$m[0]."\" target=\"blank\">".preg_replace("/\b/", "<wbr/>", $m[0])."</a>";'
            ),
            preg_replace("/<(?!(b|i|br|\/b|\/i))/i", "&lt;", $text) // no html except br, b, and i
        );
    }

     
  • Greg Roach

    Greg Roach - 2008-12-04

    FYI, an exception was made for <br/> simply because PGV creates them itself (i.e. it converts \n into <br/>).

    BTW, here's an example of how a bold tag can be a security risk.

    <b onclick="alert('evil javascript');">Click Here!</b>
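
A small check of this point against the modified pattern posted earlier in this ticket. The wrapper name escape_lt_proposed is hypothetical. Because the lookahead only tests the characters directly after the bracket, any tag that merely starts with b or i is let through, with or without attributes.

```php
<?php
// Hypothetical wrapper around the proposed replacement line.
function escape_lt_proposed($text) {
    return preg_replace("/<(?!(b|i|br|\/b|\/i))/i", "&lt;", $text);
}

$evil = "<b onclick=\"alert('evil javascript');\">Click Here!</b>";

// The tag with its onclick attribute passes through untouched.
echo escape_lt_proposed($evil) === $evil ? "unchanged\n" : "escaped\n";
// prints: unchanged

// Tags that only begin with "b" or "i" also pass, e.g. <img>.
echo escape_lt_proposed("<img src=x>"), "\n";
// prints: <img src=x>

// A tag starting with another letter is still escaped.
echo escape_lt_proposed("<script>"), "\n";
// prints: &lt;script>
```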

     
  • tales

    tales - 2008-12-05

    I don't let users edit the GEDCOM on my site, so this is not a real issue in my particular situation. It reaffirms my decision never to let them edit it.

    Agreed that your example of malicious code would not be caught by the sample code I sent earlier. To address that concern, it is easy enough to expand the code snippet to explicitly allow only <b>, </b>, <i>, and </i>, so that the example you give could never happen. That should be just a modification to the search string. I am not sure whether there are other risks I'm not thinking of, but with some tight checks of the tag, it seems like it would be easy to implement.
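
One way such a tightened search string might look, as a sketch only. The function name escape_lt_strict and the exact pattern are assumptions for illustration, not code from phpGedView. The lookahead now requires the closing > immediately after the tag name, so a tag carrying attributes no longer qualifies.

```php
<?php
// Hypothetical stricter variant: leave "<" alone only when it opens
// exactly <b>, </b>, <i>, </i>, <br>, <br/> or <br /> (any case).
function escape_lt_strict($text) {
    return preg_replace('/<(?!\/?(?:b|i)>|br\s*\/?>)/i', '&lt;', $text);
}

// Plain formatting tags are kept as they are.
echo escape_lt_strict("<b>x</b> <I>y</I><br/>"), "\n";
// prints: <b>x</b> <I>y</I><br/>

// The opening tag with an attribute is escaped; the bare </b> remains.
echo escape_lt_strict("<b onclick=\"alert('evil javascript');\">Click Here!</b>"), "\n";
// prints: &lt;b onclick="alert('evil javascript');">Click Here!</b>
```

This only covers the opening bracket, as in the original function, and says nothing about other risks that may exist elsewhere in the output path.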

    This would quickly and easily give you a feature that would enhance the look and feel of the Notes and other sections for all the users of PHPGedview.

     
  • Stew Stronski

    Stew Stronski - 2008-12-12

    I don't really have any preference one way or another on the issue. But I did have the thought that allowing more html codes of whatever sort will also mean that those codes have to be stripped out again whenever the gedcom is downloaded for use in a desktop program. Also there are things like the reports which would possibly have to be modified to handle these codes.

    So it's probably not quite as simple as just filtering what is allowed in.
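
As a rough illustration of the extra step described here, stripping the formatting tags again before export might look like the sketch below. The helper name strip_basic_tags is hypothetical, and where such a call would belong in the download or report code is not something this ticket establishes.

```php
<?php
// Hypothetical helper: remove only bare <b>, </b>, <i>, </i> tags
// (any case) so that a desktop program receives plain text.
function strip_basic_tags($text) {
    return preg_replace('/<\/?(?:b|i)>/i', '', $text);
}

echo strip_basic_tags("A <b>bold</b> and <I>italic</I> note"), "\n";
// prints: A bold and italic note
```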

     
