Welcome to Help

Help
2003-07-11
2013-04-23
  • Nobody/Anonymous

    Welcome to Help

     
    • mukokawa wong

      mukokawa wong - 2003-08-05

      I used the sample "HtmlCharsetDetector.java", but I cannot get the correct detection result.

      The URL is "http://www.google.com".

      The result is: CHARSET = ASCII
      but I found that this page's charset is UTF-8.

      I hope you can help me.
      Thanks!

       
    • vdoss

      vdoss - 2005-08-22

      This tool samples the characters to guess the charset. Since all the characters on google.com are English, it narrowed the result down to ASCII. As you might know, ASCII and UTF-8 use the same byte values for English characters, so such a page looks identical in both. The tool is meant for cases where the meta tag is missing from the HTML page. If you are writing a spider, first check the HTML meta tag to get the correct charset. If it is missing, pass the data to this tool to guess the charset.
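The order described above (meta tag first, byte-based guessing only as a fallback) might be sketched roughly as follows. This is a minimal illustration and not the library's own code: the class name `CharsetSniffer`, its methods, and the regular expression are all assumptions made for the example, and the fallback passed to `resolve` is a hypothetical stand-in for whatever detector you actually call.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Optional;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any detector library.
final class CharsetSniffer {
    // Matches charset=... inside a <meta> tag, covering both
    // <meta charset="utf-8"> and the http-equiv Content-Type form.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta[^>]*?charset\\s*=\\s*[\"']?([A-Za-z0-9._:-]+)",
            Pattern.CASE_INSENSITIVE);

    // Looks for a declared charset in the first 4096 bytes of the page.
    static Optional<String> fromMetaTag(byte[] page) {
        int len = Math.min(page.length, 4096);
        // ISO-8859-1 maps every byte to one char, so decoding never fails,
        // and the markup itself is plain ASCII.
        String head = new String(page, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(head);
        return m.find()
                ? Optional.of(m.group(1).toUpperCase(Locale.ROOT))
                : Optional.empty();
    }

    // Uses the declared charset if present, otherwise asks the detector.
    static String resolve(byte[] page, Function<byte[], String> detector) {
        return fromMetaTag(page).orElseGet(() -> detector.apply(page));
    }

    // True when no byte has the high bit set; such data is byte-for-byte
    // identical in ASCII and UTF-8, so a detector cannot tell them apart.
    static boolean isPureAscii(byte[] data) {
        for (byte b : data) {
            if (b < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in detector (assumption): a real one would sample the bytes.
        Function<byte[], String> fallback =
                bytes -> isPureAscii(bytes) ? "ASCII" : "UNKNOWN";

        byte[] declared = "<meta charset=\"utf-8\"><p>hello</p>"
                .getBytes(StandardCharsets.UTF_8);
        byte[] undeclared = "<p>hello</p>".getBytes(StandardCharsets.UTF_8);

        System.out.println(resolve(declared, fallback));   // declared charset wins
        System.out.println(resolve(undeclared, fallback)); // falls back to the guess
    }
}
```

With a page like google.com that contains only English text, the fallback alone can only ever report ASCII, which is why the meta tag (or the HTTP Content-Type header, if your spider has access to it) is worth checking first.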

       
