|
From: Ronald B. <rb...@rb...> - 2017-05-18 18:54:32
|
Hi Rafa,

I have done some tests, but the only thing I can see is that the EOFException is thrown once for every parser run. Some measurements show no difference if we replace this exception with some other code.

I will have another look if you can provide more details and maybe a case that shows the problem.

BTW: I did all this with the html-unit branch of the project. We made at least some small Java 6 adjustments there.

RBRi

On Thu, 18 May 2017 10:03:47 +0200 Rafa Guerrero wrote:

> Hi Ronald,
> I'm attaching a screenshot with the exceptions that I'm seeing in the profiler:
>
> [image: Inline images 1]
>
> As you can see, the number of EOFExceptions that we are seeing is much higher than expected.
>
> About the info, we are doing the parse in order to do some operations on HTML that we are getting from a JSP. One thing to consider is that the HTML is from our clients, not ours, so we don't have control over it.
>
> Let me know if there's anything else.
>
> Best,
> Rafa
>
> 2017-05-17 21:34 GMT+02:00 Ronald Brill <rb...@rb...>:
>
>> > I'm a developer in an application with high concurrency and I've seen,
>> > profiling with Yourkit, that there's a huge amount of exceptions
>> > EOFExceptions from the scan function in HTMLScanner class.
>>
>> Hi Rafa,
>>
>> can you please provide some more details - will be great to have an
>> example or maybe some profiler output that visualizes the problem.
>>
>> RBRi
|
|
From: Ronald B. <rb...@rb...> - 2017-05-17 18:35:01
|
> I'm a developer in an application with high concurrency and I've seen,
> profiling with Yourkit, that there's a huge amount of exceptions
> EOFExceptions from the scan function in HTMLScanner class.

Hi Rafa,

can you please provide some more details - it would be great to have an example or maybe some profiler output that visualizes the problem.

RBRi
|
|
From: Rafa G. <raf...@ma...> - 2017-05-10 18:14:57
|
Hello,

I'm a developer on an application with high concurrency, and while profiling with YourKit I've seen a huge number of EOFExceptions thrown from the scan function in the HTMLScanner class.

I'm concerned about the performance cost of all these exceptions. Is there any other way to do this? As you know, generating a huge number of exceptions can penalize the performance of the JVM.

Thanks for the help.

--
Rafa Guerrero
Marfeel Solutions S.L.
Avda. Josep Tarradellas 20-30, 6th Floor, 08029 Barcelona, Spain
ES: (+34) 93 178 59 50 ext. 106
US: (+1) 917-341-2540 ext. 106
UK: (+44) 207-048-37-28 ext. 106
|
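Archive note: on the general question of exception cost, a large part of the price of a JVM exception is usually the stack trace captured at construction time. The sketch below shows one common way to avoid that capture, by overriding `fillInStackTrace()` in a subclass. It is not NekoHTML code: `CheapEOFException`, `readOrThrow` and `countChars` are hypothetical names invented for this illustration, and whether HTMLScanner would benefit from such a change is not established anywhere in this thread.

```java
import java.io.EOFException;

final class StacklessEof {

    /** Hypothetical EOFException variant that skips stack-trace capture. */
    static final class CheapEOFException extends EOFException {
        private static final long serialVersionUID = 1L;

        @Override
        public synchronized Throwable fillInStackTrace() {
            // Returning 'this' without calling super avoids walking the stack.
            return this;
        }
    }

    /** Hypothetical reader: signals end of input with an exception, as a scanner might. */
    static char readOrThrow(String input, int pos) throws EOFException {
        if (pos >= input.length()) {
            throw new CheapEOFException();
        }
        return input.charAt(pos);
    }

    /** Reads until the end-of-input exception arrives and returns the number of chars read. */
    static int countChars(String input) {
        int count = 0;
        try {
            while (true) {
                readOrThrow(input, count);
                count++;
            }
        } catch (EOFException endOfInput) {
            return count;
        }
    }

    public static void main(String[] args) {
        System.out.println(countChars("<p>hi</p>"));                          // prints 9
        System.out.println(new CheapEOFException().getStackTrace().length);  // prints 0
    }
}
```

The trade-off is that such an exception carries no stack trace, so it only suits control-flow signals that are always caught and never logged.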
|
From: <mgu...@us...> - 2015-04-17 12:40:41
|
Revision: 349
http://sourceforge.net/p/nekohtml/code/349
Author: mguillem
Date: 2015-04-17 12:40:39 +0000 (Fri, 17 Apr 2015)
Log Message:
-----------
release 1.9.22
Added Paths:
-----------
branches/nekohtml-1.9.22/
This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
|
|
From: <mgu...@us...> - 2015-04-17 12:26:34
|
Revision: 348
http://sourceforge.net/p/nekohtml/code/348
Author: mguillem
Date: 2015-04-17 12:26:32 +0000 (Fri, 17 Apr 2015)
Log Message:
-----------
preparing release 1.9.22
Modified Paths:
--------------
trunk/build.xml
trunk/doc/changes.html
trunk/doc/index.html
trunk/pom.xml
Modified: trunk/build.xml
===================================================================
--- trunk/build.xml 2015-04-16 15:53:16 UTC (rev 347)
+++ trunk/build.xml 2015-04-17 12:26:32 UTC (rev 348)
@@ -4,7 +4,7 @@
<!-- PROPERTIES -->
<property file='build-custom.properties' />
- <property name='version' value='1.9.22-SNAPSHOT'/>
+ <property name='version' value='1.9.22'/>
<property name='name' value='nekohtml'/>
<property name='fullname' value='${name}-${version}'/>
<property name='Title' value='NekoHTML'/>
Modified: trunk/doc/changes.html
===================================================================
--- trunk/doc/changes.html 2015-04-16 15:53:16 UTC (rev 347)
+++ trunk/doc/changes.html 2015-04-17 12:26:32 UTC (rev 348)
@@ -28,7 +28,7 @@
<h2>Releases</h2>
<dl>Elements
- <dt>Version 1.9.22 (to be released)</dt>
+ <dt>Version 1.9.22 (17 Apr 2015)</dt>
<dd>Element <code>NOBR</code> closes <code>NOBR</code>, <code>BUTTON</code> closes <code>BUTTON</code> (patch from Ronald Brill),
element <code>EMBED</code> has no body (patch from Ronald Brill),
element <code>A</code> shouldn't be inline (patch from Ahmed Ashour),
Modified: trunk/doc/index.html
===================================================================
--- trunk/doc/index.html 2015-04-16 15:53:16 UTC (rev 347)
+++ trunk/doc/index.html 2015-04-17 12:26:32 UTC (rev 348)
@@ -1,7 +1,7 @@
<title>NekoHTML</title>
<link rel=stylesheet type=text/css href=style.css>
-<h1>CyberNeko HTML Parser <sub>1.9.21</sub></h1>
+<h1>CyberNeko HTML Parser <sub>1.9.22</sub></h1>
<div style="right: 10; top: 10; position: absolute">
<a href="http://sourceforge.net/projects/nekohtml"><img src="http://sflogo.sourceforge.net/sflogo.php?group_id=195122&type=12" width="120" height="30" border="0" alt="Get NekoHTML at SourceForge.net. Fast, secure and Free Open Source software downloads" /></a>
</div>
@@ -57,8 +57,8 @@
following location:
<ul>
<li>NekoHTML
- [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.21.zip'>zip</a>]
- [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.21.tar.gz'>tgz</a>]
+ [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.22.zip'>zip</a>]
+ [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.22.tar.gz'>tgz</a>]
</ul>
<h2>Requirements and Limitations</h2>
Modified: trunk/pom.xml
===================================================================
--- trunk/pom.xml 2015-04-16 15:53:16 UTC (rev 347)
+++ trunk/pom.xml 2015-04-17 12:26:32 UTC (rev 348)
@@ -4,7 +4,7 @@
<artifactId>nekohtml</artifactId>
<name>Neko HTML</name>
<description>An HTML parser and tag balancer.</description>
- <version>1.9.22-SNAPSHOT</version>
+ <version>1.9.22</version>
<url>http://nekohtml.sourceforge.net/</url>
<licenses>
<license>
|
|
From: <mgu...@us...> - 2015-04-16 15:53:18
|
Revision: 347
http://sourceforge.net/p/nekohtml/code/347
Author: mguillem
Date: 2015-04-16 15:53:16 +0000 (Thu, 16 Apr 2015)
Log Message:
-----------
improved detection of compatible encodings from meta charset when only decode is supported (patch from Steve McKay)
Issue #20
Modified Paths:
--------------
trunk/doc/changes.html
trunk/src/org/cyberneko/html/HTMLScanner.java
Added Paths:
-----------
trunk/data/meta/test-meta-encoding3.html
trunk/data/meta/test-meta-encoding3.html.canonical
trunk/data/meta/test-meta-encoding3.html.settings
Added: trunk/data/meta/test-meta-encoding3.html
===================================================================
--- trunk/data/meta/test-meta-encoding3.html (rev 0)
+++ trunk/data/meta/test-meta-encoding3.html 2015-04-16 15:53:16 UTC (rev 347)
@@ -0,0 +1,14 @@
+<head>
+<meta http-equiv="Content-Type" content="text/html;charset=iso-2022-cn">
+</head>
+$)AKNLe
+
+PB
+
+$)G\XM||U
+
+IzN"~
+
+d;\XM||U
+
+$)A#?
Added: trunk/data/meta/test-meta-encoding3.html.canonical
===================================================================
--- trunk/data/meta/test-meta-encoding3.html.canonical (rev 0)
+++ trunk/data/meta/test-meta-encoding3.html.canonical 2015-04-16 15:53:16 UTC (rev 347)
@@ -0,0 +1,13 @@
+(HTML
+(HEAD
+"\n
+(META
+Acontent text/html;charset=iso-2022-cn
+Ahttp-equiv Content-Type
+)META
+"\n
+)HEAD
+(BODY
+"\n宋体\n\n新\n\n細明體\n\n宋体\n\n浠茇忘�\n\n?\n\n
+)BODY
+)HTML
Added: trunk/data/meta/test-meta-encoding3.html.settings
===================================================================
--- trunk/data/meta/test-meta-encoding3.html.settings (rev 0)
+++ trunk/data/meta/test-meta-encoding3.html.settings 2015-04-16 15:53:16 UTC (rev 347)
@@ -0,0 +1 @@
+property http://cyberneko.org/html/properties/default-encoding UTF-8
Modified: trunk/doc/changes.html
===================================================================
--- trunk/doc/changes.html 2015-04-16 15:42:57 UTC (rev 346)
+++ trunk/doc/changes.html 2015-04-16 15:53:16 UTC (rev 347)
@@ -31,7 +31,8 @@
<dt>Version 1.9.22 (to be released)</dt>
<dd>Element <code>NOBR</code> closes <code>NOBR</code>, <code>BUTTON</code> closes <code>BUTTON</code> (patch from Ronald Brill),
element <code>EMBED</code> has no body (patch from Ronald Brill),
- element <code>A</code> shouldn't be inline (patch from Ahmed Ashour).
+ element <code>A</code> shouldn't be inline (patch from Ahmed Ashour),
+ improved detection of compatible encodings from meta charset when only decode is supported (patch from Steve McKay).
</dd>
<dt>Version 1.9.21 (2 Jun 2014)</dt>
Modified: trunk/src/org/cyberneko/html/HTMLScanner.java
===================================================================
--- trunk/src/org/cyberneko/html/HTMLScanner.java 2015-04-16 15:42:57 UTC (rev 346)
+++ trunk/src/org/cyberneko/html/HTMLScanner.java 2015-04-16 15:53:16 UTC (rev 347)
@@ -3709,17 +3709,33 @@
* be the same in both encodings
*/
boolean isEncodingCompatible(final String encoding1, final String encoding2) {
- final String reference = "<html><head><meta http-equiv=\"Content-Type\" content=\"text/html;charset=";
try {
- final byte[] bytesEncoding1 = reference.getBytes(encoding1);
- final String referenceWithEncoding2 = new String(bytesEncoding1, encoding2);
- return reference.equals(referenceWithEncoding2);
+ try {
+ return canRoundtrip(encoding1, encoding2);
+ }
+ catch (final UnsupportedOperationException e) {
+ // if encoding1 only supports decode, we can test it the other way to only decode with it
+ try {
+ return canRoundtrip(encoding2, encoding1);
+ }
+ catch (final UnsupportedOperationException e1) {
+ // encoding2 only supports decode too. Time to give up.
+ return false;
+ }
+ }
}
catch (final UnsupportedEncodingException e) {
return false;
}
}
+ private boolean canRoundtrip(final String encodeCharset, final String decodeCharset) throws UnsupportedEncodingException {
+ final String reference = "<html><head><meta http-equiv=\"Content-Type\" content=\"text/html;charset=";
+ final byte[] bytesEncoding1 = reference.getBytes(encodeCharset);
+ final String referenceWithEncoding2 = new String(bytesEncoding1, decodeCharset);
+ return reference.equals(referenceWithEncoding2);
+ }
+
private boolean endsWith(final XMLStringBuffer buffer, final String string) {
final int l = string.length();
if (buffer.length < l) {
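Archive note: the round-trip check introduced in this revision can be tried outside the parser. The following is a minimal standalone sketch that mirrors the logic of the diff above; the wrapper class `EncodingRoundTrip` and its `main` method are assumptions added for illustration, and the comment about decode-only charsets reflects the commit message rather than anything verified here.

```java
import java.io.UnsupportedEncodingException;

final class EncodingRoundTrip {

    // Same reference string as in the diff: pure ASCII markup preceding the charset name.
    private static final String REFERENCE =
            "<html><head><meta http-equiv=\"Content-Type\" content=\"text/html;charset=";

    /** Encodes the reference with one charset and decodes it with the other. */
    static boolean canRoundtrip(String encodeCharset, String decodeCharset)
            throws UnsupportedEncodingException {
        byte[] bytes = REFERENCE.getBytes(encodeCharset);
        return REFERENCE.equals(new String(bytes, decodeCharset));
    }

    /** Tries the round trip in both directions before giving up. */
    static boolean isEncodingCompatible(String encoding1, String encoding2) {
        try {
            try {
                return canRoundtrip(encoding1, encoding2);
            } catch (UnsupportedOperationException e) {
                // A charset that only supports decoding cannot encode; swap the roles.
                try {
                    return canRoundtrip(encoding2, encoding1);
                } catch (UnsupportedOperationException e1) {
                    return false; // neither charset can encode
                }
            }
        } catch (UnsupportedEncodingException e) {
            return false; // unknown charset name
        }
    }

    public static void main(String[] args) {
        System.out.println(isEncodingCompatible("UTF-8", "ISO-8859-1"));      // true: ASCII bytes match
        System.out.println(isEncodingCompatible("UTF-8", "UTF-16"));          // false: different byte layout
        System.out.println(isEncodingCompatible("UTF-8", "no-such-charset")); // false: unsupported name
    }
}
```

Because the reference string is plain ASCII, the check effectively asks whether both charsets agree on the bytes that precede the `charset=` value in a meta tag.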
|
|
From: <mgu...@us...> - 2015-04-16 15:43:06
|
Revision: 346
http://sourceforge.net/p/nekohtml/code/346
Author: mguillem
Date: 2015-04-16 15:42:57 +0000 (Thu, 16 Apr 2015)
Log Message:
-----------
Element A shouldn't be inline (patch from Ahmed Ashour)
Issue #21
Modified Paths:
--------------
trunk/doc/changes.html
trunk/src/org/cyberneko/html/HTMLElements.java
Added Paths:
-----------
trunk/data/a/test-a-around-applet.html
trunk/data/a/test-a-around-applet.html.canonical
trunk/data/a/test-a-around-dd.html
trunk/data/a/test-a-around-dd.html.canonical
trunk/data/abbr/
trunk/data/abbr/test-abbr-around-applet.html
trunk/data/abbr/test-abbr-around-applet.html.canonical
trunk/data/abbr/test-abbr-around-center.html
trunk/data/abbr/test-abbr-around-center.html.canonical
trunk/data/abbr/test-abbr-around-del.html
trunk/data/abbr/test-abbr-around-del.html.canonical
trunk/data/abbr/test-abbr-around-dir.html
trunk/data/abbr/test-abbr-around-dir.html.canonical
trunk/data/abbr/test-abbr-around-dt.html
trunk/data/abbr/test-abbr-around-dt.html.canonical
trunk/data/abbr/test-abbr-around-fieldset.html
trunk/data/abbr/test-abbr-around-fieldset.html.canonical
trunk/data/abbr/test-abbr-around-isindex.html
trunk/data/abbr/test-abbr-around-isindex.html.canonical
trunk/data/abbr/test-abbr-around-keygen.html
trunk/data/abbr/test-abbr-around-keygen.html.canonical
trunk/data/abbr/test-abbr-around-listing.html
trunk/data/abbr/test-abbr-around-listing.html.canonical
trunk/data/abbr/test-abbr-around-marquee.html
trunk/data/abbr/test-abbr-around-marquee.html.canonical
trunk/data/abbr/test-abbr-around-menu.html
trunk/data/abbr/test-abbr-around-menu.html.canonical
trunk/data/abbr/test-abbr-around-multicol.html
trunk/data/abbr/test-abbr-around-multicol.html.canonical
trunk/data/abbr/test-abbr-around-noembed.html
trunk/data/abbr/test-abbr-around-noembed.html.canonical
trunk/data/abbr/test-abbr-around-noframes.html
trunk/data/abbr/test-abbr-around-noframes.html.canonical
trunk/data/abbr/test-abbr-around-nolayer.html
trunk/data/abbr/test-abbr-around-nolayer.html.canonical
trunk/data/abbr/test-abbr-around-noscript.html
trunk/data/abbr/test-abbr-around-noscript.html.canonical
trunk/data/abbr/test-abbr-around-object.html
trunk/data/abbr/test-abbr-around-object.html.canonical
trunk/data/abbr/test-abbr-around-pre.html
trunk/data/abbr/test-abbr-around-pre.html.canonical
trunk/data/abbr/test-abbr-around-ruby.html
trunk/data/abbr/test-abbr-around-ruby.html.canonical
trunk/data/abbr/test-abbr-around-s.html
trunk/data/abbr/test-abbr-around-s.html.canonical
Added: trunk/data/a/test-a-around-applet.html
===================================================================
--- trunk/data/a/test-a-around-applet.html (rev 0)
+++ trunk/data/a/test-a-around-applet.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<a><applet></applet></a>
\ No newline at end of file
Added: trunk/data/a/test-a-around-applet.html.canonical
===================================================================
--- trunk/data/a/test-a-around-applet.html.canonical (rev 0)
+++ trunk/data/a/test-a-around-applet.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(A
+(APPLET
+)APPLET
+)A
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/a/test-a-around-dd.html
===================================================================
--- trunk/data/a/test-a-around-dd.html (rev 0)
+++ trunk/data/a/test-a-around-dd.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<a><dd></dd></a>
\ No newline at end of file
Added: trunk/data/a/test-a-around-dd.html.canonical
===================================================================
--- trunk/data/a/test-a-around-dd.html.canonical (rev 0)
+++ trunk/data/a/test-a-around-dd.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(A
+(DD
+)DD
+)A
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-applet.html
===================================================================
--- trunk/data/abbr/test-abbr-around-applet.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-applet.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><applet></applet></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-applet.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-applet.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-applet.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(APPLET
+)APPLET
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-center.html
===================================================================
--- trunk/data/abbr/test-abbr-around-center.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-center.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><center></center></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-center.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-center.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-center.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(CENTER
+)CENTER
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-del.html
===================================================================
--- trunk/data/abbr/test-abbr-around-del.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-del.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><del></del></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-del.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-del.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-del.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(DEL
+)DEL
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-dir.html
===================================================================
--- trunk/data/abbr/test-abbr-around-dir.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-dir.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><dir></dir></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-dir.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-dir.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-dir.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(DIR
+)DIR
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-dt.html
===================================================================
--- trunk/data/abbr/test-abbr-around-dt.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-dt.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><dt></dt></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-dt.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-dt.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-dt.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(DT
+)DT
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-fieldset.html
===================================================================
--- trunk/data/abbr/test-abbr-around-fieldset.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-fieldset.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><fieldset></fieldset></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-fieldset.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-fieldset.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-fieldset.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(FIELDSET
+)FIELDSET
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-isindex.html
===================================================================
--- trunk/data/abbr/test-abbr-around-isindex.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-isindex.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><isindex></isindex></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-isindex.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-isindex.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-isindex.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(ISINDEX
+)ISINDEX
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-keygen.html
===================================================================
--- trunk/data/abbr/test-abbr-around-keygen.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-keygen.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><keygen></keygen></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-keygen.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-keygen.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-keygen.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(KEYGEN
+)KEYGEN
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-listing.html
===================================================================
--- trunk/data/abbr/test-abbr-around-listing.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-listing.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><listing></listing></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-listing.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-listing.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-listing.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(LISTING
+)LISTING
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-marquee.html
===================================================================
--- trunk/data/abbr/test-abbr-around-marquee.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-marquee.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><marquee></marquee></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-marquee.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-marquee.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-marquee.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(MARQUEE
+)MARQUEE
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-menu.html
===================================================================
--- trunk/data/abbr/test-abbr-around-menu.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-menu.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><menu></menu></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-menu.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-menu.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-menu.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(MENU
+)MENU
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-multicol.html
===================================================================
--- trunk/data/abbr/test-abbr-around-multicol.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-multicol.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><multicol></multicol></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-multicol.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-multicol.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-multicol.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(MULTICOL
+)MULTICOL
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-noembed.html
===================================================================
--- trunk/data/abbr/test-abbr-around-noembed.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-noembed.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><noembed></noembed></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-noembed.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-noembed.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-noembed.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(NOEMBED
+)NOEMBED
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-noframes.html
===================================================================
--- trunk/data/abbr/test-abbr-around-noframes.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-noframes.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><noframes></noframes></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-noframes.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-noframes.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-noframes.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(NOFRAMES
+)NOFRAMES
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-nolayer.html
===================================================================
--- trunk/data/abbr/test-abbr-around-nolayer.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-nolayer.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><nolayer></nolayer></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-nolayer.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-nolayer.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-nolayer.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(NOLAYER
+)NOLAYER
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-noscript.html
===================================================================
--- trunk/data/abbr/test-abbr-around-noscript.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-noscript.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><noscript></noscript></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-noscript.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-noscript.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-noscript.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(NOSCRIPT
+)NOSCRIPT
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-object.html
===================================================================
--- trunk/data/abbr/test-abbr-around-object.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-object.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><object></object></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-object.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-object.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-object.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(OBJECT
+)OBJECT
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-pre.html
===================================================================
--- trunk/data/abbr/test-abbr-around-pre.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-pre.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><pre></pre></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-pre.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-pre.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-pre.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(PRE
+)PRE
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-ruby.html
===================================================================
--- trunk/data/abbr/test-abbr-around-ruby.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-ruby.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><ruby></ruby></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-ruby.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-ruby.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-ruby.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(RUBY
+)RUBY
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-s.html
===================================================================
--- trunk/data/abbr/test-abbr-around-s.html (rev 0)
+++ trunk/data/abbr/test-abbr-around-s.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1 @@
+<abbr><s></s></abbr>
\ No newline at end of file
Added: trunk/data/abbr/test-abbr-around-s.html.canonical
===================================================================
--- trunk/data/abbr/test-abbr-around-s.html.canonical (rev 0)
+++ trunk/data/abbr/test-abbr-around-s.html.canonical 2015-04-16 15:42:57 UTC (rev 346)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(ABBR
+(S
+)S
+)ABBR
+)BODY
+)HTML
\ No newline at end of file
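Archive note: the `.canonical` fixtures above use a line-oriented event dump. Judging only from the files shown in this thread, `(NAME` opens an element, `)NAME` closes it, `Aname value` lists an attribute (apparently in sorted order), and `"text` carries character data with newlines escaped. The sketch below emits that shape for a hand-built event sequence. `CanonicalDump` is a hypothetical helper written for this note, not the writer used by the NekoHTML test suite, and it ends every line with a newline even though the fixtures above have none after the last line.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class CanonicalDump {

    private final StringBuilder out = new StringBuilder();

    /** Start tag: "(NAME" followed by one "Aname value" line per attribute, sorted by name. */
    CanonicalDump start(String name, Map<String, String> attributes) {
        out.append('(').append(name.toUpperCase(Locale.ROOT)).append('\n');
        for (Map.Entry<String, String> attr : new TreeMap<>(attributes).entrySet()) {
            out.append('A').append(attr.getKey()).append(' ').append(attr.getValue()).append('\n');
        }
        return this;
    }

    /** Start tag without attributes. */
    CanonicalDump start(String name) {
        return start(name, Collections.<String, String>emptyMap());
    }

    /** Character data: a leading double quote, with real newlines written as backslash-n. */
    CanonicalDump text(String content) {
        out.append('"').append(content.replace("\n", "\\n")).append('\n');
        return this;
    }

    /** End tag: ")NAME". */
    CanonicalDump end(String name) {
        out.append(')').append(name.toUpperCase(Locale.ROOT)).append('\n');
        return this;
    }

    @Override
    public String toString() {
        return out.toString();
    }

    public static void main(String[] args) {
        // Events matching the expected output for test-a-around-applet.html.
        String dump = new CanonicalDump()
                .start("html").start("head").end("head")
                .start("body").start("a").start("applet").end("applet").end("a")
                .end("body").end("html")
                .toString();
        System.out.print(dump);
    }
}
```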
Modified: trunk/doc/changes.html
===================================================================
--- trunk/doc/changes.html 2014-11-27 11:21:06 UTC (rev 345)
+++ trunk/doc/changes.html 2015-04-16 15:42:57 UTC (rev 346)
@@ -30,7 +30,8 @@
<dt>Version 1.9.22 (to be released)</dt>
<dd>Element <code>NOBR</code> closes <code>NOBR</code>, <code>BUTTON</code> closes <code>BUTTON</code> (patch from Ronald Brill),
- element <code>EMBED</code> has no body (patch from Ronald Brill).
+ element <code>EMBED</code> has no body (patch from Ronald Brill),
+ element <code>A</code> shouldn't be inline (patch from Ahmed Ashour).
</dd>
<dt>Version 1.9.21 (2 Jun 2014)</dt>
Modified: trunk/src/org/cyberneko/html/HTMLElements.java
===================================================================
--- trunk/src/org/cyberneko/html/HTMLElements.java 2014-11-27 11:21:06 UTC (rev 345)
+++ trunk/src/org/cyberneko/html/HTMLElements.java 2015-04-16 15:42:57 UTC (rev 346)
@@ -193,7 +193,7 @@
// initialize array of element information
ELEMENTS_ARRAY['A'-'A'] = new Element[] {
// A - - (%inline;)* -(A)
- new Element(A, "A", Element.INLINE, BODY, new short[] {A}),
+ new Element(A, "A", Element.CONTAINER, BODY, new short[] {A}),
// ABBR - - (%inline;)*
new Element(ABBR, "ABBR", Element.INLINE, BODY, null),
// ACRONYM - - (%inline;)*
@@ -201,7 +201,7 @@
// ADDRESS - - (%inline;)*
new Element(ADDRESS, "ADDRESS", Element.BLOCK, BODY, new short[] {P}),
// APPLET
- new Element(APPLET, "APPLET", 0, BODY, null),
+ new Element(APPLET, "APPLET", Element.CONTAINER, BODY, null),
// AREA - O EMPTY
new Element(AREA, "AREA", Element.EMPTY, MAP, null),
};
@@ -211,7 +211,7 @@
// BASE - O EMPTY
new Element(BASE, "BASE", Element.EMPTY, HEAD, null),
// BASEFONT
- new Element(BASEFONT, "BASEFONT", 0, HEAD, null),
+ new Element(BASEFONT, "BASEFONT", Element.EMPTY, HEAD, null),
// BDO - - (%inline;)*
new Element(BDO, "BDO", Element.INLINE, BODY, null),
// BGSOUND
@@ -233,7 +233,7 @@
// CAPTION - - (%inline;)*
new Element(CAPTION, "CAPTION", Element.INLINE, TABLE, null),
// CENTER,
- new Element(CENTER, "CENTER", 0, BODY, new short[] {P}),
+ new Element(CENTER, "CENTER", Element.CONTAINER, BODY, new short[] {P}),
// CITE - - (%inline;)*
new Element(CITE, "CITE", Element.INLINE, BODY, null),
// CODE - - (%inline;)*
@@ -241,25 +241,25 @@
// COL - O EMPTY
new Element(COL, "COL", Element.EMPTY, TABLE, null),
// COLGROUP - O (COL)*
- new Element(COLGROUP, "COLGROUP", 0, TABLE, new short[]{COL,COLGROUP}),
+ new Element(COLGROUP, "COLGROUP", Element.CONTAINER, TABLE, new short[]{COL,COLGROUP}),
// COMMENT
new Element(COMMENT, "COMMENT", Element.SPECIAL, HTML, null),
};
ELEMENTS_ARRAY['D'-'A'] = new Element[] {
// DEL - - (%flow;)*
- new Element(DEL, "DEL", 0, BODY, null),
+ new Element(DEL, "DEL", Element.INLINE, BODY, null),
// DFN - - (%inline;)*
new Element(DFN, "DFN", Element.INLINE, BODY, null),
// DIR
- new Element(DIR, "DIR", 0, BODY, new short[] {P}),
+ new Element(DIR, "DIR", Element.CONTAINER, BODY, new short[] {P}),
// DIV - - (%flow;)*
new Element(DIV, "DIV", Element.CONTAINER, BODY, new short[]{P}),
// DD - O (%flow;)*
- new Element(DD, "DD", 0, BODY, new short[]{DT,DD,P}),
+ new Element(DD, "DD", Element.BLOCK, BODY, new short[]{DT,DD,P}),
// DL - - (DT|DD)+
new Element(DL, "DL", Element.BLOCK, BODY, new short[] {P}),
// DT - O (%inline;)*
- new Element(DT, "DT", 0, BODY, new short[]{DT,DD,P}),
+ new Element(DT, "DT", Element.BLOCK, BODY, new short[]{DT,DD,P}),
};
ELEMENTS_ARRAY['E'-'A'] = new Element[] {
// EM - - (%inline;)*
@@ -269,7 +269,7 @@
};
ELEMENTS_ARRAY['F'-'A'] = new Element[] {
// FIELDSET - - (#PCDATA,LEGEND,(%flow;)*)
- new Element(FIELDSET, "FIELDSET", 0, BODY, new short[] {P}),
+ new Element(FIELDSET, "FIELDSET", Element.CONTAINER, BODY, new short[] {P}),
// FONT
new Element(FONT, "FONT", Element.CONTAINER, BODY, null),
// FORM - - (%block;|SCRIPT)+ -(FORM)
@@ -277,7 +277,7 @@
// FRAME - O EMPTY
new Element(FRAME, "FRAME", Element.EMPTY, FRAMESET, null),
// FRAMESET - - ((FRAMESET|FRAME)+ & NOFRAMES?)
- new Element(FRAMESET, "FRAMESET", 0, HTML, null),
+ new Element(FRAMESET, "FRAMESET", Element.CONTAINER, HTML, null),
};
ELEMENTS_ARRAY['H'-'A'] = new Element[] {
// (H1|H2|H3|H4|H5|H6) - - (%inline;)*
@@ -308,13 +308,13 @@
// INS - - (%flow;)*
new Element(INS, "INS", Element.INLINE, BODY, null),
// ISINDEX
- new Element(ISINDEX, "ISINDEX", 0, HEAD, null),
+ new Element(ISINDEX, "ISINDEX", Element.INLINE, HEAD, null),
};
ELEMENTS_ARRAY['K'-'A'] = new Element[] {
// KBD - - (%inline;)*
new Element(KBD, "KBD", Element.INLINE, BODY, null),
// KEYGEN
- new Element(KEYGEN, "KEYGEN", 0, BODY, null),
+ new Element(KEYGEN, "KEYGEN", Element.EMPTY, BODY, null),
};
ELEMENTS_ARRAY['L'-'A'] = new Element[] {
// LABEL - - (%inline;)* -(LABEL)
@@ -328,19 +328,19 @@
// LINK - O EMPTY
new Element(LINK, "LINK", Element.EMPTY, HEAD, null),
// LISTING
- new Element(LISTING, "LISTING", 0, BODY, new short[] {P}),
+ new Element(LISTING, "LISTING", Element.BLOCK, BODY, new short[] {P}),
};
ELEMENTS_ARRAY['M'-'A'] = new Element[] {
// MAP - - ((%block;) | AREA)+
new Element(MAP, "MAP", Element.INLINE, BODY, null),
// MARQUEE
- new Element(MARQUEE, "MARQUEE", 0, BODY, null),
+ new Element(MARQUEE, "MARQUEE", Element.CONTAINER, BODY, null),
// MENU
- new Element(MENU, "MENU", 0, BODY, new short[] {P}),
+ new Element(MENU, "MENU", Element.CONTAINER, BODY, new short[] {P}),
// META - O EMPTY
new Element(META, "META", Element.EMPTY, HEAD, new short[]{STYLE,TITLE}),
// MULTICOL
- new Element(MULTICOL, "MULTICOL", 0, BODY, null),
+ new Element(MULTICOL, "MULTICOL", Element.CONTAINER, BODY, null),
};
ELEMENTS_ARRAY['N'-'A'] = new Element[] {
// NEXTID
@@ -348,17 +348,17 @@
// NOBR
new Element(NOBR, "NOBR", Element.INLINE, BODY, new short[]{NOBR}),
// NOEMBED
- new Element(NOEMBED, "NOEMBED", 0, BODY, null),
+ new Element(NOEMBED, "NOEMBED", Element.CONTAINER, BODY, null),
// NOFRAMES - - (BODY) -(NOFRAMES)
- new Element(NOFRAMES, "NOFRAMES", 0, null, null),
+ new Element(NOFRAMES, "NOFRAMES", Element.CONTAINER, null, null),
// NOLAYER
- new Element(NOLAYER, "NOLAYER", 0, BODY, null),
+ new Element(NOLAYER, "NOLAYER", Element.CONTAINER, BODY, null),
// NOSCRIPT - - (%block;)+
- new Element(NOSCRIPT, "NOSCRIPT", 0, new short[]{BODY}, null),
+ new Element(NOSCRIPT, "NOSCRIPT", Element.CONTAINER, new short[]{BODY}, null),
};
ELEMENTS_ARRAY['O'-'A'] = new Element[] {
// OBJECT - - (PARAM | %flow;)*
- new Element(OBJECT, "OBJECT", 0, BODY, null),
+ new Element(OBJECT, "OBJECT", Element.CONTAINER, BODY, null),
// OL - - (LI)+
new Element(OL, "OL", Element.BLOCK, BODY, new short[] {P}),
// OPTGROUP - - (OPTION)+
@@ -374,7 +374,7 @@
// PLAINTEXT
new Element(PLAINTEXT, "PLAINTEXT", Element.SPECIAL, BODY, null),
// PRE - - (%inline;)* -(%pre.exclusion;)
- new Element(PRE, "PRE", 0, BODY, new short[] {P}),
+ new Element(PRE, "PRE", Element.BLOCK, BODY, new short[] {P}),
};
ELEMENTS_ARRAY['Q'-'A'] = new Element[] {
// Q - - (%inline;)*
@@ -392,11 +392,11 @@
// RTC
new Element(RTC, "RTC", 0, RUBY, new short[]{RBC}),
// RUBY
- new Element(RUBY, "RUBY", 0, BODY, new short[]{RUBY}),
+ new Element(RUBY, "RUBY", Element.CONTAINER, BODY, new short[]{RUBY}),
};
ELEMENTS_ARRAY['S'-'A'] = new Element[] {
// S
- new Element(S, "S", 0, BODY, null),
+ new Element(S, "S", Element.INLINE, BODY, null),
// SAMP - - (%inline;)*
new Element(SAMP, "SAMP", Element.INLINE, BODY, null),
// SCRIPT - - %Script;
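Note on the flag changes in the diff above: many elements move from a flags value of 0 to a named category. As a minimal sketch of why that matters, assume the categories are distinct bits (the `Element.INLINE | Element.BLOCK` combination used for BUTTON elsewhere in this archive suggests so). The constant values and the helper class below are illustrative assumptions, not NekoHTML's actual definitions.

```java
// Hypothetical sketch: flag values and class name are assumptions for illustration.
final class FlagSketch {
    static final int INLINE = 0x01;
    static final int BLOCK = 0x02;
    static final int EMPTY = 0x04;
    static final int CONTAINER = 0x08;

    /** True if the given category bit is set in flags. */
    static boolean has(int flags, int flag) {
        return (flags & flag) != 0;
    }

    /** Lists the categories set in flags; 0 belongs to no category at all. */
    static String describe(int flags) {
        if (flags == 0) {
            return "uncategorized";
        }
        StringBuilder sb = new StringBuilder();
        if (has(flags, INLINE)) sb.append("inline ");
        if (has(flags, BLOCK)) sb.append("block ");
        if (has(flags, EMPTY)) sb.append("empty ");
        if (has(flags, CONTAINER)) sb.append("container ");
        return sb.toString().trim();
    }
}
```

With flags of 0, every category test fails, so any balancing rule keyed on "is inline" or "is container" silently skips the element. That is presumably what the commit addresses by assigning explicit categories.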
This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
|
|
From: Ahmed A. <asa...@ya...> - 2015-03-12 07:41:22
|
Hi all,

As part of HtmlUnit development, I would like to ask about the points below regarding NekoHTML, so that I can submit patches accordingly.

1 - HTMLElements is currently fixed, and it seems it needs to be configurable according to the browser used (e.g. IE/FF/Chrome).
a) Should HTMLElements be an interface with a default implementation that client code can override?
b) Or should NekoHTML provide implementations for specific browser versions (FF31 ESR/IE8/IE11/Chrome), with an option for client code to customize them further?
2 - Wouldn't it be better to depend on a specific Xerces version (e.g. the latest, 2.11) instead of staying compatible with 2.0/2.1?
3 - If #2 is OK, how about moving the build to Maven?
4 - What about removing the comments at the end of each method, e.g. "} // putItem(String, Object):Object"? This is not a standard Java coding convention.

Thanks a lot,
Ahmed |
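Option (a) in the message above could look roughly like the following. This is a minimal sketch under stated assumptions: `ElementProvider`, `DefaultElementProvider`, `LegacyBrowserElementProvider` and the flag values are all hypothetical names invented for illustration, and nothing here exists in NekoHTML.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch only: none of these types exist in NekoHTML.
interface ElementProvider {
    /** Returns the category flags for an element name, or 0 if it is unknown. */
    int flagsFor(String name);
}

// Default table; a real one would cover every HTML element.
class DefaultElementProvider implements ElementProvider {
    static final int INLINE = 0x01;     // illustrative flag values
    static final int CONTAINER = 0x08;

    protected final Map<String, Integer> flags = new HashMap<String, Integer>();

    DefaultElementProvider() {
        flags.put("A", CONTAINER);
        flags.put("ABBR", INLINE);
    }

    @Override
    public int flagsFor(String name) {
        Integer f = flags.get(name.toUpperCase(Locale.ROOT));
        return f == null ? 0 : f.intValue();
    }
}

// A browser-specific variant overrides only the entries that differ.
class LegacyBrowserElementProvider extends DefaultElementProvider {
    LegacyBrowserElementProvider() {
        flags.put("A", INLINE);
    }
}
```

Under this design, client code such as HtmlUnit would pick a provider per simulated browser, while everyone else keeps the default table. Option (b) would amount to shipping several such subclasses with the library itself.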
|
From: <mgu...@us...> - 2014-11-27 11:21:10
|
Revision: 345
http://sourceforge.net/p/nekohtml/code/345
Author: mguillem
Date: 2014-11-27 11:21:06 +0000 (Thu, 27 Nov 2014)
Log Message:
-----------
run tests with xerces-2.11.0 as well
define xerces-2.11.0 as maven dependency
Modified Paths:
--------------
trunk/build.xml
trunk/pom.xml
Added Paths:
-----------
trunk/lib/xerces-2.11.0/
trunk/lib/xerces-2.11.0/xercesImpl-2.11.0.jar
trunk/lib/xerces-2.11.0/xml-apis.jar
Modified: trunk/build.xml
===================================================================
--- trunk/build.xml 2014-11-27 11:11:10 UTC (rev 344)
+++ trunk/build.xml 2014-11-27 11:21:06 UTC (rev 345)
@@ -308,6 +308,7 @@
<delete dir="${build.dir}/junit"/>
<mkdir dir="${build.dir}/junit"/>
+ <testWith xercesVersion="2.11.0"/>
<testWith xercesVersion="2.10.0"/>
<testWith xercesVersion="2.9.1"/>
<testWith xercesVersion="2.8.1"/>
Added: trunk/lib/xerces-2.11.0/xercesImpl-2.11.0.jar
===================================================================
(Binary files differ)
Index: trunk/lib/xerces-2.11.0/xercesImpl-2.11.0.jar
===================================================================
--- trunk/lib/xerces-2.11.0/xercesImpl-2.11.0.jar 2014-11-27 11:11:10 UTC (rev 344)
+++ trunk/lib/xerces-2.11.0/xercesImpl-2.11.0.jar 2014-11-27 11:21:06 UTC (rev 345)
Property changes on: trunk/lib/xerces-2.11.0/xercesImpl-2.11.0.jar
___________________________________________________________________
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Added: trunk/lib/xerces-2.11.0/xml-apis.jar
===================================================================
(Binary files differ)
Index: trunk/lib/xerces-2.11.0/xml-apis.jar
===================================================================
--- trunk/lib/xerces-2.11.0/xml-apis.jar 2014-11-27 11:11:10 UTC (rev 344)
+++ trunk/lib/xerces-2.11.0/xml-apis.jar 2014-11-27 11:21:06 UTC (rev 345)
Property changes on: trunk/lib/xerces-2.11.0/xml-apis.jar
___________________________________________________________________
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Modified: trunk/pom.xml
===================================================================
--- trunk/pom.xml 2014-11-27 11:11:10 UTC (rev 344)
+++ trunk/pom.xml 2014-11-27 11:21:06 UTC (rev 345)
@@ -20,7 +20,7 @@
<dependency>
<groupId>xerces</groupId>
<artifactId>xercesImpl</artifactId>
- <version>2.10.0</version>
+ <version>2.11.0</version>
</dependency>
</dependencies>
<developers>
|
|
From: <mgu...@us...> - 2014-11-27 11:11:18
|
Revision: 344
http://sourceforge.net/p/nekohtml/code/344
Author: mguillem
Date: 2014-11-27 11:11:10 +0000 (Thu, 27 Nov 2014)
Log Message:
-----------
EMBED has no body, BUTTON closes BUTTON (patch from Ronald Brill)
Modified Paths:
--------------
trunk/data/canonical/test014.html
trunk/doc/changes.html
trunk/src/org/cyberneko/html/HTMLElements.java
Added Paths:
-----------
trunk/data/button/
trunk/data/button/test-button_closes_button.html
trunk/data/button/test-button_closes_button.html.canonical
trunk/data/embed/
trunk/data/embed/test-embed_closes_embed.html
trunk/data/embed/test-embed_closes_embed.html.canonical
Added: trunk/data/button/test-button_closes_button.html
===================================================================
--- trunk/data/button/test-button_closes_button.html (rev 0)
+++ trunk/data/button/test-button_closes_button.html 2014-11-27 11:11:10 UTC (rev 344)
@@ -0,0 +1 @@
+<button>hello<button>world</button></button>
\ No newline at end of file
Added: trunk/data/button/test-button_closes_button.html.canonical
===================================================================
--- trunk/data/button/test-button_closes_button.html.canonical (rev 0)
+++ trunk/data/button/test-button_closes_button.html.canonical 2014-11-27 11:11:10 UTC (rev 344)
@@ -0,0 +1,12 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(BUTTON
+"hello
+)BUTTON
+(BUTTON
+"world
+)BUTTON
+)BODY
+)HTML
Modified: trunk/data/canonical/test014.html
===================================================================
--- trunk/data/canonical/test014.html 2014-11-27 10:53:06 UTC (rev 343)
+++ trunk/data/canonical/test014.html 2014-11-27 11:11:10 UTC (rev 344)
@@ -15,9 +15,8 @@
)PARAM
"\n
(EMBED
-"\n
)EMBED
-"\n
+"\n \n
(NOEMBED
"\n
)NOEMBED
Added: trunk/data/embed/test-embed_closes_embed.html
===================================================================
--- trunk/data/embed/test-embed_closes_embed.html (rev 0)
+++ trunk/data/embed/test-embed_closes_embed.html 2014-11-27 11:11:10 UTC (rev 344)
@@ -0,0 +1 @@
+<embed><embed></embed></embed>
\ No newline at end of file
Added: trunk/data/embed/test-embed_closes_embed.html.canonical
===================================================================
--- trunk/data/embed/test-embed_closes_embed.html.canonical (rev 0)
+++ trunk/data/embed/test-embed_closes_embed.html.canonical 2014-11-27 11:11:10 UTC (rev 344)
@@ -0,0 +1,10 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(EMBED
+)EMBED
+(EMBED
+)EMBED
+)BODY
+)HTML
Modified: trunk/doc/changes.html
===================================================================
--- trunk/doc/changes.html 2014-11-27 10:53:06 UTC (rev 343)
+++ trunk/doc/changes.html 2014-11-27 11:11:10 UTC (rev 344)
@@ -29,7 +29,8 @@
<dl>Elements
<dt>Version 1.9.22 (to be released)</dt>
- <dd>Element <code>NOBR</code> closes <code>NOBR</code> (patch from Ronald Brill).
+ <dd>Element <code>NOBR</code> closes <code>NOBR</code>, <code>BUTTON</code> closes <code>BUTTON</code> (patch from Ronald Brill),
+ element <code>EMBED</code> has no body (patch from Ronald Brill).
</dd>
<dt>Version 1.9.21 (2 Jun 2014)</dt>
Modified: trunk/src/org/cyberneko/html/HTMLElements.java
===================================================================
--- trunk/src/org/cyberneko/html/HTMLElements.java 2014-11-27 10:53:06 UTC (rev 343)
+++ trunk/src/org/cyberneko/html/HTMLElements.java 2014-11-27 11:11:10 UTC (rev 344)
@@ -227,7 +227,7 @@
// BR - O EMPTY
new Element(BR, "BR", Element.EMPTY, BODY, null),
// BUTTON - - (%flow;)* -(A|%formctrl;|FORM|FIELDSET)
- new Element(BUTTON, "BUTTON", Element.INLINE | Element.BLOCK, BODY, null),
+ new Element(BUTTON, "BUTTON", Element.INLINE | Element.BLOCK, BODY, new short[]{BUTTON}),
};
ELEMENTS_ARRAY['C'-'A'] = new Element[] {
// CAPTION - - (%inline;)*
@@ -265,7 +265,7 @@
// EM - - (%inline;)*
new Element(EM, "EM", Element.INLINE, BODY, null),
// EMBED
- new Element(EMBED, "EMBED", 0, BODY, null),
+ new Element(EMBED, "EMBED", Element.EMPTY, BODY, null),
};
ELEMENTS_ARRAY['F'-'A'] = new Element[] {
// FIELDSET - - (#PCDATA,LEGEND,(%flow;)*)
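Note on the "closes" array: the last constructor argument changed for BUTTON above lists the elements that a start tag implicitly closes. The sketch below is a deliberately simplified model of that idea, not the HTMLTagBalancer implementation: it only inspects the innermost open element, and the class name, the event encoding and the table are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of "element X closes element Y".
final class ClosesSketch {
    // Element name -> names of open elements its start tag implicitly closes.
    static final Map<String, List<String>> CLOSES = new HashMap<String, List<String>>();
    static {
        CLOSES.put("BUTTON", Arrays.asList("BUTTON"));
        CLOSES.put("NOBR", Arrays.asList("NOBR"));
    }

    /** Events: "(NAME" is a start tag, ")NAME" an end tag, anything else is text. */
    static String balance(String... events) {
        Deque<String> open = new ArrayDeque<String>();
        StringBuilder out = new StringBuilder();
        for (String e : events) {
            if (e.startsWith("(")) {
                String name = e.substring(1);
                List<String> closes = CLOSES.get(name);
                // Close the innermost open element while it is in the closes list.
                while (closes != null && !open.isEmpty() && closes.contains(open.peek())) {
                    out.append(')').append(open.pop()).append('\n');
                }
                open.push(name);
                out.append('(').append(name).append('\n');
            } else if (e.startsWith(")")) {
                String name = e.substring(1);
                // End tags that match no open element are ignored.
                if (open.contains(name)) {
                    String popped;
                    do {
                        popped = open.pop();
                        out.append(')').append(popped).append('\n');
                    } while (!popped.equals(name));
                }
            } else {
                out.append('"').append(e).append('\n');
            }
        }
        return out.toString();
    }
}
```

Feeding it the events of test-button_closes_button.html, that is `balance("(BUTTON", "hello", "(BUTTON", "world", ")BUTTON", ")BUTTON")`, yields the BUTTON lines of the canonical file: the second start tag closes the first button, and the surplus end tag is dropped.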
|
|
From: <mgu...@us...> - 2014-11-27 10:53:11
|
Revision: 343
http://sourceforge.net/p/nekohtml/code/343
Author: mguillem
Date: 2014-11-27 10:53:06 +0000 (Thu, 27 Nov 2014)
Log Message:
-----------
NOBR closes NOBR (patch from Ronald Brill)
Modified Paths:
--------------
trunk/doc/changes.html
trunk/src/org/cyberneko/html/HTMLElements.java
Added Paths:
-----------
trunk/data/nobr/
trunk/data/nobr/test-nobr_closes_nobr.html
trunk/data/nobr/test-nobr_closes_nobr.html.canonical
Added: trunk/data/nobr/test-nobr_closes_nobr.html
===================================================================
--- trunk/data/nobr/test-nobr_closes_nobr.html (rev 0)
+++ trunk/data/nobr/test-nobr_closes_nobr.html 2014-11-27 10:53:06 UTC (rev 343)
@@ -0,0 +1 @@
+<nobr>hello<nobr>world</nobr></nobr>
\ No newline at end of file
Added: trunk/data/nobr/test-nobr_closes_nobr.html.canonical
===================================================================
--- trunk/data/nobr/test-nobr_closes_nobr.html.canonical (rev 0)
+++ trunk/data/nobr/test-nobr_closes_nobr.html.canonical 2014-11-27 10:53:06 UTC (rev 343)
@@ -0,0 +1,12 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(NOBR
+"hello
+)NOBR
+(NOBR
+"world
+)NOBR
+)BODY
+)HTML
Modified: trunk/doc/changes.html
===================================================================
--- trunk/doc/changes.html 2014-11-27 10:51:35 UTC (rev 342)
+++ trunk/doc/changes.html 2014-11-27 10:53:06 UTC (rev 343)
@@ -28,6 +28,10 @@
<h2>Releases</h2>
<dl>Elements
+ <dt>Version 1.9.22 (to be released)</dt>
+ <dd>Element <code>NOBR</code> closes <code>NOBR</code> (patch from Ronald Brill).
+ </dd>
+
<dt>Version 1.9.21 (2 Jun 2014)</dt>
<dd>Ensure that closing unknown element only closes matching unknown element and not any unknown element,
added definition for HTML5 tag <code>SECTION</code>.
Modified: trunk/src/org/cyberneko/html/HTMLElements.java
===================================================================
--- trunk/src/org/cyberneko/html/HTMLElements.java 2014-11-27 10:51:35 UTC (rev 342)
+++ trunk/src/org/cyberneko/html/HTMLElements.java 2014-11-27 10:53:06 UTC (rev 343)
@@ -346,7 +346,7 @@
// NEXTID
new Element(NEXTID, "NEXTID", Element.EMPTY, BODY, null),
// NOBR
- new Element(NOBR, "NOBR", Element.INLINE, BODY, null),
+ new Element(NOBR, "NOBR", Element.INLINE, BODY, new short[]{NOBR}),
// NOEMBED
new Element(NOEMBED, "NOEMBED", 0, BODY, null),
// NOFRAMES - - (BODY) -(NOFRAMES)
|
|
From: <mgu...@us...> - 2014-11-27 10:51:43
|
Revision: 342
http://sourceforge.net/p/nekohtml/code/342
Author: mguillem
Date: 2014-11-27 10:51:35 +0000 (Thu, 27 Nov 2014)
Log Message:
-----------
upgrade version to 1.9.22-SNAPSHOT
Modified Paths:
--------------
trunk/build.xml
trunk/pom.xml
Modified: trunk/build.xml
===================================================================
--- trunk/build.xml 2014-06-02 10:27:33 UTC (rev 341)
+++ trunk/build.xml 2014-11-27 10:51:35 UTC (rev 342)
@@ -4,7 +4,7 @@
<!-- PROPERTIES -->
<property file='build-custom.properties' />
- <property name='version' value='1.9.21'/>
+ <property name='version' value='1.9.22-SNAPSHOT'/>
<property name='name' value='nekohtml'/>
<property name='fullname' value='${name}-${version}'/>
<property name='Title' value='NekoHTML'/>
Modified: trunk/pom.xml
===================================================================
--- trunk/pom.xml 2014-06-02 10:27:33 UTC (rev 341)
+++ trunk/pom.xml 2014-11-27 10:51:35 UTC (rev 342)
@@ -4,7 +4,7 @@
<artifactId>nekohtml</artifactId>
<name>Neko HTML</name>
<description>An HTML parser and tag balancer.</description>
- <version>1.9.21</version>
+ <version>1.9.22-SNAPSHOT</version>
<url>http://nekohtml.sourceforge.net/</url>
<licenses>
<license>
|
|
From: <mgu...@us...> - 2014-06-02 10:27:36
|
Revision: 341
http://sourceforge.net/p/nekohtml/code/341
Author: mguillem
Date: 2014-06-02 10:27:33 +0000 (Mon, 02 Jun 2014)
Log Message:
-----------
Release 1.9.21
Added Paths:
-----------
branches/nekohtml-1.9.21/
|
|
From: <mgu...@us...> - 2014-06-02 09:48:28
|
Revision: 340
http://sourceforge.net/p/nekohtml/code/340
Author: mguillem
Date: 2014-06-02 09:48:24 +0000 (Mon, 02 Jun 2014)
Log Message:
-----------
preparing release 1.9.21
Modified Paths:
--------------
trunk/build.xml
trunk/doc/changes.html
trunk/doc/index.html
trunk/pom.xml
Modified: trunk/build.xml
===================================================================
--- trunk/build.xml 2014-03-11 12:07:31 UTC (rev 339)
+++ trunk/build.xml 2014-06-02 09:48:24 UTC (rev 340)
@@ -4,7 +4,7 @@
<!-- PROPERTIES -->
<property file='build-custom.properties' />
- <property name='version' value='1.9.21-SNAPSHOT'/>
+ <property name='version' value='1.9.21'/>
<property name='name' value='nekohtml'/>
<property name='fullname' value='${name}-${version}'/>
<property name='Title' value='NekoHTML'/>
Modified: trunk/doc/changes.html
===================================================================
--- trunk/doc/changes.html 2014-03-11 12:07:31 UTC (rev 339)
+++ trunk/doc/changes.html 2014-06-02 09:48:24 UTC (rev 340)
@@ -28,7 +28,7 @@
<h2>Releases</h2>
<dl>Elements
- <dt>Version 1.9.21 (to be released)</dt>
+ <dt>Version 1.9.21 (2 Jun 2014)</dt>
<dd>Ensure that closing unknown element only closes matching unknown element and not any unknown element,
added definition for HTML5 tag <code>SECTION</code>.
</dd>
Modified: trunk/doc/index.html
===================================================================
--- trunk/doc/index.html 2014-03-11 12:07:31 UTC (rev 339)
+++ trunk/doc/index.html 2014-06-02 09:48:24 UTC (rev 340)
@@ -1,7 +1,7 @@
<title>NekoHTML</title>
<link rel=stylesheet type=text/css href=style.css>
-<h1>CyberNeko HTML Parser <sub>1.9.20</sub></h1>
+<h1>CyberNeko HTML Parser <sub>1.9.21</sub></h1>
<div style="right: 10; top: 10; position: absolute">
<a href="http://sourceforge.net/projects/nekohtml"><img src="http://sflogo.sourceforge.net/sflogo.php?group_id=195122&type=12" width="120" height="30" border="0" alt="Get NekoHTML at SourceForge.net. Fast, secure and Free Open Source software downloads" /></a>
</div>
@@ -57,8 +57,8 @@
following location:
<ul>
<li>NekoHTML
- [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.20.zip'>zip</a>]
- [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.20.tar.gz'>tgz</a>]
+ [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.21.zip'>zip</a>]
+ [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.21.tar.gz'>tgz</a>]
</ul>
<h2>Requirements and Limitations</h2>
Modified: trunk/pom.xml
===================================================================
--- trunk/pom.xml 2014-03-11 12:07:31 UTC (rev 339)
+++ trunk/pom.xml 2014-06-02 09:48:24 UTC (rev 340)
@@ -4,7 +4,7 @@
<artifactId>nekohtml</artifactId>
<name>Neko HTML</name>
<description>An HTML parser and tag balancer.</description>
- <version>1.9.21-SNAPSHOT</version>
+ <version>1.9.21</version>
<url>http://nekohtml.sourceforge.net/</url>
<licenses>
<license>
|
|
From: <mgu...@us...> - 2014-03-11 12:07:35
|
Revision: 339
http://sourceforge.net/p/nekohtml/code/339
Author: mguillem
Date: 2014-03-11 12:07:31 +0000 (Tue, 11 Mar 2014)
Log Message:
-----------
- ensure that closing unknown element only closes matching unknown element and not any unknown element,
- added definition for HTML5 tag SECTION
Modified Paths:
--------------
trunk/build.xml
trunk/doc/changes.html
trunk/pom.xml
trunk/src/org/cyberneko/html/HTMLElements.java
trunk/src/org/cyberneko/html/HTMLTagBalancer.java
Added Paths:
-----------
trunk/data/section/
trunk/data/section/test-section-unknown.html
trunk/data/section/test-section-unknown.html.canonical
trunk/data/unknown/test-non-html-ns.html
trunk/data/unknown/test-non-html-ns.html.canonical
trunk/data/unknown/test-unknown-multiple.html
trunk/data/unknown/test-unknown-multiple.html.canonical
Removed Paths:
-------------
trunk/data/canonical/test-non-html-ns.html
trunk/data/test-non-html-ns.html
Modified: trunk/build.xml
===================================================================
--- trunk/build.xml 2014-02-13 14:19:00 UTC (rev 338)
+++ trunk/build.xml 2014-03-11 12:07:31 UTC (rev 339)
@@ -4,7 +4,7 @@
<!-- PROPERTIES -->
<property file='build-custom.properties' />
- <property name='version' value='1.9.20'/>
+ <property name='version' value='1.9.21-SNAPSHOT'/>
<property name='name' value='nekohtml'/>
<property name='fullname' value='${name}-${version}'/>
<property name='Title' value='NekoHTML'/>
Deleted: trunk/data/canonical/test-non-html-ns.html
===================================================================
--- trunk/data/canonical/test-non-html-ns.html 2014-02-13 14:19:00 UTC (rev 338)
+++ trunk/data/canonical/test-non-html-ns.html 2014-03-11 12:07:31 UTC (rev 339)
@@ -1,9 +0,0 @@
-(HTML
-(HEAD
-)HEAD
-(BODY
-(H:BODY
-A{http://www.w3.org/2000/xmlns/}xmlns:H urn:not-a-html-ns
-)H:BODY
-)BODY
-)HTML
\ No newline at end of file
Added: trunk/data/section/test-section-unknown.html
===================================================================
--- trunk/data/section/test-section-unknown.html (rev 0)
+++ trunk/data/section/test-section-unknown.html 2014-03-11 12:07:31 UTC (rev 339)
@@ -0,0 +1,7 @@
+<section>
+<form>
+Hello
+</isslot>
+World!
+</form>
+</section>
Added: trunk/data/section/test-section-unknown.html.canonical
===================================================================
--- trunk/data/section/test-section-unknown.html.canonical (rev 0)
+++ trunk/data/section/test-section-unknown.html.canonical 2014-03-11 12:07:31 UTC (rev 339)
@@ -0,0 +1,14 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(SECTION
+"\n
+(FORM
+"\nHello\n\nWorld!\n
+)FORM
+"\n
+)SECTION
+"\n
+)BODY
+)HTML
\ No newline at end of file
Deleted: trunk/data/test-non-html-ns.html
===================================================================
--- trunk/data/test-non-html-ns.html 2014-02-13 14:19:00 UTC (rev 338)
+++ trunk/data/test-non-html-ns.html 2014-03-11 12:07:31 UTC (rev 339)
@@ -1 +0,0 @@
-<html><head></head><h:body xmlns:h='urn:not-a-html-ns'>
\ No newline at end of file
Copied: trunk/data/unknown/test-non-html-ns.html (from rev 328, trunk/data/test-non-html-ns.html)
===================================================================
--- trunk/data/unknown/test-non-html-ns.html (rev 0)
+++ trunk/data/unknown/test-non-html-ns.html 2014-03-11 12:07:31 UTC (rev 339)
@@ -0,0 +1 @@
+<html><head></head><h:body xmlns:h='urn:not-a-html-ns'>
\ No newline at end of file
Copied: trunk/data/unknown/test-non-html-ns.html.canonical (from rev 328, trunk/data/canonical/test-non-html-ns.html)
===================================================================
--- trunk/data/unknown/test-non-html-ns.html.canonical (rev 0)
+++ trunk/data/unknown/test-non-html-ns.html.canonical 2014-03-11 12:07:31 UTC (rev 339)
@@ -0,0 +1,9 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(H:BODY
+A{http://www.w3.org/2000/xmlns/}xmlns:H urn:not-a-html-ns
+)H:BODY
+)BODY
+)HTML
\ No newline at end of file
Added: trunk/data/unknown/test-unknown-multiple.html
===================================================================
--- trunk/data/unknown/test-unknown-multiple.html (rev 0)
+++ trunk/data/unknown/test-unknown-multiple.html 2014-03-11 12:07:31 UTC (rev 339)
@@ -0,0 +1 @@
+<toto><div></foo></div><span></span></toto>
\ No newline at end of file
Added: trunk/data/unknown/test-unknown-multiple.html.canonical
===================================================================
--- trunk/data/unknown/test-unknown-multiple.html.canonical (rev 0)
+++ trunk/data/unknown/test-unknown-multiple.html.canonical 2014-03-11 12:07:31 UTC (rev 339)
@@ -0,0 +1,12 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(TOTO
+(DIV
+)DIV
+(SPAN
+)SPAN
+)TOTO
+)BODY
+)HTML
\ No newline at end of file
Modified: trunk/doc/changes.html
===================================================================
--- trunk/doc/changes.html 2014-02-13 14:19:00 UTC (rev 338)
+++ trunk/doc/changes.html 2014-03-11 12:07:31 UTC (rev 339)
@@ -27,6 +27,12 @@
<h2>Releases</h2>
<dl>Elements
+
+ <dt>Version 1.9.21 (to be released)</dt>
+ <dd>Ensure that closing unknown element only closes matching unknown element and not any unknown element,
+ added definition for HTML5 tag <code>SECTION</code>.
+ </dd>
+
<dt>Version 1.9.20 (13 Feb 2014)</dt>
<dd>Fix IllegalArgumentException occurring with entities having invalid UTF-16 code and use replacement character (�) instead,
fix ArrayIndexOutOfBoundsException occurring when stream ends with a \r in attribute's value (#154).
Modified: trunk/pom.xml
===================================================================
--- trunk/pom.xml 2014-02-13 14:19:00 UTC (rev 338)
+++ trunk/pom.xml 2014-03-11 12:07:31 UTC (rev 339)
@@ -4,7 +4,7 @@
<artifactId>nekohtml</artifactId>
<name>Neko HTML</name>
<description>An HTML parser and tag balancer.</description>
- <version>1.9.20</version>
+ <version>1.9.21-SNAPSHOT</version>
<url>http://nekohtml.sourceforge.net/</url>
<licenses>
<license>
Modified: trunk/src/org/cyberneko/html/HTMLElements.java
===================================================================
--- trunk/src/org/cyberneko/html/HTMLElements.java 2014-02-13 14:19:00 UTC (rev 338)
+++ trunk/src/org/cyberneko/html/HTMLElements.java 2014-03-11 12:07:31 UTC (rev 339)
@@ -128,7 +128,8 @@
public static final short S = RUBY+1;
public static final short SAMP = S+1;
public static final short SCRIPT = SAMP+1;
- public static final short SELECT = SCRIPT+1;
+ public static final short SECTION = SCRIPT+1;
+ public static final short SELECT = SECTION+1;
public static final short SMALL = SELECT+1;
public static final short SOUND = SMALL+1;
public static final short SPACER = SOUND+1;
@@ -400,6 +401,8 @@
new Element(SAMP, "SAMP", Element.INLINE, BODY, null),
// SCRIPT - - %Script;
new Element(SCRIPT, "SCRIPT", Element.SPECIAL, new short[]{HEAD,BODY}, null),
+
+ new Element(SECTION, "SECTION", Element.CONTAINER, BODY, new short[]{SELECT}),
// SELECT - - (OPTGROUP|OPTION)+
new Element(SELECT, "SELECT", Element.CONTAINER, BODY, new short[]{SELECT}),
// SMALL - - (%inline;)*
@@ -509,7 +512,13 @@
* @param ename The element name.
*/
public static final Element getElement(final String ename) {
- return getElement(ename, NO_SUCH_ELEMENT);
+ Element element = getElement(ename, NO_SUCH_ELEMENT);
+ if (element == NO_SUCH_ELEMENT) {
+ element = new Element(UNKNOWN, ename.toUpperCase(), Element.CONTAINER, new short[]{BODY,HEAD}/*HTML*/, null);
+ element.parent = NO_SUCH_ELEMENT.parent;
+ element.parentCodes = NO_SUCH_ELEMENT.parentCodes;
+ }
+ return element;
} // getElement(String):Element
/**
Modified: trunk/src/org/cyberneko/html/HTMLTagBalancer.java
===================================================================
--- trunk/src/org/cyberneko/html/HTMLTagBalancer.java 2014-02-13 14:19:00 UTC (rev 338)
+++ trunk/src/org/cyberneko/html/HTMLTagBalancer.java 2014-03-11 12:07:31 UTC (rev 339)
@@ -1068,7 +1068,7 @@
fErrorReporter.reportWarning("HTML2007", new Object[]{ename,iname});
}
if (fDocumentHandler != null) {
- // PATCH: Marc-Andr\xE9 Morissette
+ // PATCH: Marc-André Morissette
callEndElement(info.qname, i < depth - 1 ? synthesizedAugs() : augs);
}
}
@@ -1184,7 +1184,8 @@
int depth = -1;
for (int i = fElementStack.top - 1; i >=fragmentContextStackSize_; i--) {
Info info = fElementStack.data[i];
- if (info.element.code == element.code) {
+ if (info.element.code == element.code
+ && (elementCode != HTMLElements.UNKNOWN || (elementCode == HTMLElements.UNKNOWN && element.name.equals(info.element.name)))) {
depth = fElementStack.top - i;
break;
}
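Note on the HTMLTagBalancer change above: since all unknown elements share the single UNKNOWN code, comparing codes alone lets any unknown end tag close any open unknown element. The sketch below illustrates the stricter match in isolation. The `Open` class, the code values and the method name are assumptions for illustration; only the comparison mirrors the condition in the diff.

```java
import java.util.List;

// Hypothetical sketch of the end-tag lookup; not the HTMLTagBalancer code itself.
final class UnknownMatchSketch {
    static final short DIV = 1;       // illustrative codes, not NekoHTML's values
    static final short UNKNOWN = 99;

    /** One entry of the open-element stack. */
    static final class Open {
        final short code;
        final String name;

        Open(short code, String name) {
            this.code = code;
            this.name = name;
        }
    }

    /**
     * Returns how many open elements an end tag would close, counted from the
     * top of the stack, or -1 if no open element matches.
     */
    static int depthToClose(List<Open> stack, short code, String name) {
        for (int i = stack.size() - 1; i >= 0; i--) {
            Open o = stack.get(i);
            // Known elements match by code; unknown ones must also match by name.
            boolean nameOk = code != UNKNOWN || name.equals(o.name);
            if (o.code == code && nameOk) {
                return stack.size() - i;
            }
        }
        return -1;
    }
}
```

For test-unknown-multiple.html, the stack at `</foo>` is TOTO (unknown) then DIV. With the name check the lookup returns -1 and the stray end tag is ignored, which matches the canonical output where TOTO stays open until its own end tag.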
|
|
From: <mgu...@us...> - 2014-02-13 14:19:04
|
Revision: 338
http://sourceforge.net/p/nekohtml/code/338
Author: mguillem
Date: 2014-02-13 14:19:00 +0000 (Thu, 13 Feb 2014)
Log Message:
-----------
prepare release 1.9.20
Modified Paths:
--------------
trunk/build.xml
trunk/doc/changes.html
trunk/doc/index.html
trunk/pom.xml
Modified: trunk/build.xml
===================================================================
--- trunk/build.xml 2014-02-13 14:00:48 UTC (rev 337)
+++ trunk/build.xml 2014-02-13 14:19:00 UTC (rev 338)
@@ -4,7 +4,7 @@
<!-- PROPERTIES -->
<property file='build-custom.properties' />
- <property name='version' value='1.9.20-SNAPSHOT'/>
+ <property name='version' value='1.9.20'/>
<property name='name' value='nekohtml'/>
<property name='fullname' value='${name}-${version}'/>
<property name='Title' value='NekoHTML'/>
Modified: trunk/doc/changes.html
===================================================================
--- trunk/doc/changes.html 2014-02-13 14:00:48 UTC (rev 337)
+++ trunk/doc/changes.html 2014-02-13 14:19:00 UTC (rev 338)
@@ -27,9 +27,10 @@
<h2>Releases</h2>
<dl>Elements
- <dt>Version 1.9.20 (to be released)</dt>
+ <dt>Version 1.9.20 (13 Feb 2014)</dt>
<dd>Fix IllegalArgumentException occurring with entities having invalid UTF-16 code and use replacement character (�) instead,
fix ArrayIndexOutOfBoundsException occurring when stream ends with a \r in attribute's value (#154).
+ </dd>
<dt>Version 1.9.19 (9 Oct 2013)</dt>
<dd>Element <code>LI</code> closes <code>DIV</code>,
Modified: trunk/doc/index.html
===================================================================
--- trunk/doc/index.html 2014-02-13 14:00:48 UTC (rev 337)
+++ trunk/doc/index.html 2014-02-13 14:19:00 UTC (rev 338)
@@ -1,7 +1,7 @@
<title>NekoHTML</title>
<link rel=stylesheet type=text/css href=style.css>
-<h1>CyberNeko HTML Parser <sub>1.9.19</sub></h1>
+<h1>CyberNeko HTML Parser <sub>1.9.20</sub></h1>
<div style="right: 10; top: 10; position: absolute">
<a href="http://sourceforge.net/projects/nekohtml"><img src="http://sflogo.sourceforge.net/sflogo.php?group_id=195122&type=12" width="120" height="30" border="0" alt="Get NekoHTML at SourceForge.net. Fast, secure and Free Open Source software downloads" /></a>
</div>
@@ -57,8 +57,8 @@
following location:
<ul>
<li>NekoHTML
- [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.19.zip'>zip</a>]
- [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.19.tar.gz'>tgz</a>]
+ [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.20.zip'>zip</a>]
+ [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.20.tar.gz'>tgz</a>]
</ul>
<h2>Requirements and Limitations</h2>
Modified: trunk/pom.xml
===================================================================
--- trunk/pom.xml 2014-02-13 14:00:48 UTC (rev 337)
+++ trunk/pom.xml 2014-02-13 14:19:00 UTC (rev 338)
@@ -4,7 +4,7 @@
<artifactId>nekohtml</artifactId>
<name>Neko HTML</name>
<description>An HTML parser and tag balancer.</description>
- <version>1.9.20-SNAPSHOT</version>
+ <version>1.9.20</version>
<url>http://nekohtml.sourceforge.net/</url>
<licenses>
<license>
|
|
From: <mgu...@us...> - 2014-02-13 14:00:53
|
Revision: 337
http://sourceforge.net/p/nekohtml/code/337
Author: mguillem
Date: 2014-02-13 14:00:48 +0000 (Thu, 13 Feb 2014)
Log Message:
-----------
fix ArrayIndexOutOfBoundsException occurring when stream ends with a \r in attribute's value (#154).
Modified Paths:
--------------
trunk/doc/changes.html
trunk/src/org/cyberneko/html/HTMLScanner.java
trunk/test/java/org/cyberneko/html/DOMFragmentParserTest.java
Modified: trunk/doc/changes.html
===================================================================
--- trunk/doc/changes.html 2014-01-23 09:55:24 UTC (rev 336)
+++ trunk/doc/changes.html 2014-02-13 14:00:48 UTC (rev 337)
@@ -28,7 +28,8 @@
<h2>Releases</h2>
<dl>Elements
<dt>Version 1.9.20 (to be released)</dt>
- <dd>Fix IllegalArgumentException occurring with entities having invalid UTF-16 code and use replacement character (�) instead.
+ <dd>Fix IllegalArgumentException occurring with entities having invalid UTF-16 code and use replacement character (�) instead,
+ fix ArrayIndexOutOfBoundsException occurring when stream ends with a \r in attribute's value (#154).
<dt>Version 1.9.19 (9 Oct 2013)</dt>
<dd>Element <code>LI</code> closes <code>DIV</code>,
Modified: trunk/src/org/cyberneko/html/HTMLScanner.java
===================================================================
--- trunk/src/org/cyberneko/html/HTMLScanner.java 2014-01-23 09:55:24 UTC (rev 336)
+++ trunk/src/org/cyberneko/html/HTMLScanner.java 2014-02-13 14:00:48 UTC (rev 337)
@@ -3040,13 +3040,13 @@
else if (c == '\r' || c == '\n') {
if (c == '\r') {
int c2 = fCurrentEntity.read();
- if (c2 != '\n') {
- fCurrentEntity.rewind();
- }
- else {
+ if (c2 == '\n') {
fNonNormAttr.append('\r');
c = c2;
}
+ else if (c2 != -1) {
+ fCurrentEntity.rewind();
+ }
}
if (acceptSpace) {
fStringBuffer.append(fNormalizeAttributes ? ' ' : '\n');
Modified: trunk/test/java/org/cyberneko/html/DOMFragmentParserTest.java
===================================================================
--- trunk/test/java/org/cyberneko/html/DOMFragmentParserTest.java 2014-01-23 09:55:24 UTC (rev 336)
+++ trunk/test/java/org/cyberneko/html/DOMFragmentParserTest.java 2014-02-13 14:00:48 UTC (rev 337)
@@ -21,6 +21,13 @@
*/
public class DOMFragmentParserTest extends TestCase {
/**
+ * See <a href="https://sourceforge.net/p/nekohtml/bugs/154/">Bug 154</a>.
+ */
+ public void testAttrEndingWithCRAtEndOfStream() throws Exception {
+ doTest("<a href=\"\r", "<A href=\"&#10;\"/>");
+ }
+
+ /**
* See <a href="http://sourceforge.net/support/tracker.php?aid=2828553">Bug 2828553</a>.
*/
public void testInvalidProcessingInstruction() throws Exception {
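Note on revision 337: the change above only rewinds the entity when the character read after a '\r' was a real character, not the end-of-stream marker (-1). A minimal stand-alone sketch of that look-ahead pattern follows. The class CrAtEndOfStreamSketch and its read(), rewind() and normalize() members are hypothetical stand-ins written for illustration; they are not the NekoHTML scanner code, which works on fCurrentEntity.

```java
/** Illustrative only: a tiny reader with one-character look-ahead. */
final class CrAtEndOfStreamSketch {
    private final String data;
    private int offset;

    CrAtEndOfStreamSketch(String data) {
        this.data = data;
    }

    /** Returns the next character, or -1 at end of input (nothing is consumed then). */
    int read() {
        return offset < data.length() ? data.charAt(offset++) : -1;
    }

    /** Un-reads the last character; only valid if read() really consumed one. */
    void rewind() {
        offset--;
    }

    /** Collapses "\r\n" and a lone "\r" to "\n", tolerating a trailing "\r". */
    static String normalize(String input) {
        CrAtEndOfStreamSketch in = new CrAtEndOfStreamSketch(input);
        StringBuilder out = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\r') {
                int c2 = in.read();
                if (c2 != '\n' && c2 != -1) {
                    // A real character other than '\n' was consumed: give it back.
                    in.rewind();
                }
                // At end of input (c2 == -1) nothing was consumed, so no rewind.
                out.append('\n');
            } else {
                out.append((char) c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("a\r").equals("a\n"));
    }
}
```

In this sketch, rewinding after an end-of-stream read would move the offset back onto the '\r' itself, which is the kind of bookkeeping error the extra `c2 != -1` test avoids.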
|
|
From: <mgu...@us...> - 2014-01-23 09:55:32
|
Revision: 336
http://sourceforge.net/p/nekohtml/code/336
Author: mguillem
Date: 2014-01-23 09:55:24 +0000 (Thu, 23 Jan 2014)
Log Message:
-----------
changed version number to 1.9.20-SNAPSHOT
Modified Paths:
--------------
trunk/build.xml
trunk/pom.xml
Modified: trunk/build.xml
===================================================================
--- trunk/build.xml 2014-01-23 09:43:41 UTC (rev 335)
+++ trunk/build.xml 2014-01-23 09:55:24 UTC (rev 336)
@@ -4,14 +4,14 @@
<!-- PROPERTIES -->
<property file='build-custom.properties' />
- <property name='version' value='1.9.19'/>
+ <property name='version' value='1.9.20-SNAPSHOT'/>
<property name='name' value='nekohtml'/>
<property name='fullname' value='${name}-${version}'/>
<property name='Title' value='NekoHTML'/>
<property name='FullTitle' value='CyberNeko HTML Parser'/>
<property name='Name' value='${Title} ${version}'/>
<property name='author' value='Andy Clark, Marc Guillemot'/>
- <property name='copyright' value='(C) Copyright 2002-2013, ${author}. All rights reserved.'/>
+ <property name='copyright' value='(C) Copyright 2002-2014, ${author}. All rights reserved.'/>
<property name='URL' value='http://nekohtml.sourceforge.net/index.html'/>
<property name='compile.source' value='1.3' />
Modified: trunk/pom.xml
===================================================================
--- trunk/pom.xml 2014-01-23 09:43:41 UTC (rev 335)
+++ trunk/pom.xml 2014-01-23 09:55:24 UTC (rev 336)
@@ -4,7 +4,7 @@
<artifactId>nekohtml</artifactId>
<name>Neko HTML</name>
<description>An HTML parser and tag balancer.</description>
- <version>1.9.19</version>
+ <version>1.9.20-SNAPSHOT</version>
<url>http://nekohtml.sourceforge.net/</url>
<licenses>
<license>
|
|
From: <mgu...@us...> - 2014-01-23 09:43:45
|
Revision: 335
http://sourceforge.net/p/nekohtml/code/335
Author: mguillem
Date: 2014-01-23 09:43:41 +0000 (Thu, 23 Jan 2014)
Log Message:
-----------
Fix IllegalArgumentException occurring with entities having invalid UTF-16 code and use replacement character (�) instead.
Modified Paths:
--------------
trunk/doc/changes.html
trunk/src/org/cyberneko/html/HTMLScanner.java
Added Paths:
-----------
trunk/data/entities/test-entity-bad-UTF16-code.html
trunk/data/entities/test-entity-bad-UTF16-code.html.canonical
Added: trunk/data/entities/test-entity-bad-UTF16-code.html
===================================================================
--- trunk/data/entities/test-entity-bad-UTF16-code.html (rev 0)
+++ trunk/data/entities/test-entity-bad-UTF16-code.html 2014-01-23 09:43:41 UTC (rev 335)
@@ -0,0 +1 @@
+�
\ No newline at end of file
Added: trunk/data/entities/test-entity-bad-UTF16-code.html.canonical
===================================================================
--- trunk/data/entities/test-entity-bad-UTF16-code.html.canonical (rev 0)
+++ trunk/data/entities/test-entity-bad-UTF16-code.html.canonical 2014-01-23 09:43:41 UTC (rev 335)
@@ -0,0 +1,7 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+"�
+)BODY
+)HTML
Modified: trunk/doc/changes.html
===================================================================
--- trunk/doc/changes.html 2014-01-23 09:31:11 UTC (rev 334)
+++ trunk/doc/changes.html 2014-01-23 09:43:41 UTC (rev 335)
@@ -27,6 +27,9 @@
<h2>Releases</h2>
<dl>Elements
+ <dt>Version 1.9.20 (to be released)</dt>
+ <dd>Fix IllegalArgumentException occurring with entities having invalid UTF-16 code and use replacement character (�) instead.
+
<dt>Version 1.9.19 (9 Oct 2013)</dt>
<dd>Element <code>LI</code> closes <code>DIV</code>,
handle Unicode supplementary character (#3609978, patch from Dan Rabe),
Modified: trunk/src/org/cyberneko/html/HTMLScanner.java
===================================================================
--- trunk/src/org/cyberneko/html/HTMLScanner.java 2014-01-23 09:31:11 UTC (rev 334)
+++ trunk/src/org/cyberneko/html/HTMLScanner.java 2014-01-23 09:43:41 UTC (rev 335)
@@ -533,6 +533,8 @@
/** Resource identifier. */
private final XMLResourceIdentifierImpl fResourceId = new XMLResourceIdentifierImpl();
+ private final char REPLACEMENT_CHARACTER = '\uFFFD'; // the � character
+
//
// Public methods
//
@@ -1381,7 +1383,15 @@
fDocumentHandler.startGeneralEntity(name, id, encoding, locationAugs());
}
str.clear();
- appendChar(str, value);
+ try {
+ appendChar(str, value);
+ }
+ catch (final IllegalArgumentException e) { // when value is not valid as UTF-16
+ if (fReportErrors) {
+ fErrorReporter.reportError("HTML1005", new Object[]{name});
+ }
+ str.append(REPLACEMENT_CHARACTER);
+ }
fDocumentHandler.characters(str, locationAugs());
if (fNotifyCharRefs) {
fDocumentHandler.endGeneralEntity(name, locationAugs());
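Note on revision 335: the change above catches the IllegalArgumentException raised for a numeric character reference that cannot be represented and emits U+FFFD instead. The sketch below shows the same fallback using only the JDK. It assumes the failure comes from an out-of-range code point passed to Character.toChars; the class ReplacementCharacterSketch and its decode() method are hypothetical, and NekoHTML's own appendChar is not reproduced here.

```java
/** Illustrative only: decode a numeric character reference value with a fallback. */
final class ReplacementCharacterSketch {
    static final char REPLACEMENT_CHARACTER = '\uFFFD';

    /** Returns the text for the given code point, or U+FFFD if it is not valid. */
    static String decode(int codePoint) {
        StringBuilder sb = new StringBuilder();
        try {
            // Character.toChars throws IllegalArgumentException for values
            // outside the Unicode range 0..0x10FFFF.
            sb.append(Character.toChars(codePoint));
        } catch (IllegalArgumentException e) {
            sb.append(REPLACEMENT_CHARACTER);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode(0x41));                        // a BMP character
        System.out.println(decode(0x1F600).length());            // a surrogate pair
        System.out.println(decode(0x110000).equals("\uFFFD"));   // out of range
    }
}
```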
|
|
From: <mgu...@us...> - 2014-01-23 09:31:14
|
Revision: 334
http://sourceforge.net/p/nekohtml/code/334
Author: mguillem
Date: 2014-01-23 09:31:11 +0000 (Thu, 23 Jan 2014)
Log Message:
-----------
moved entity tests together
Added Paths:
-----------
trunk/data/entities/
trunk/data/entities/test-entities-not-complete.html
trunk/data/entities/test-entities-not-complete.html.canonical
trunk/data/entities/test-entities.html
trunk/data/entities/test-entities.html.canonical
trunk/data/entities/test022.html
trunk/data/entities/test022.html.canonical
trunk/data/entities/test029.html
trunk/data/entities/test029.html.canonical
trunk/data/entities/test085.html
trunk/data/entities/test085.html.canonical
trunk/data/entities/test086.html
trunk/data/entities/test086.html.canonical
trunk/data/entities/test089.html
trunk/data/entities/test089.html.canonical
Removed Paths:
-------------
trunk/data/canonical/test-entities-not-complete.html
trunk/data/canonical/test-entities.html
trunk/data/canonical/test022.html
trunk/data/canonical/test029.html
trunk/data/canonical/test085.html
trunk/data/canonical/test086.html
trunk/data/canonical/test089.html
trunk/data/test-entities-not-complete.html
trunk/data/test-entities.html
trunk/data/test022.html
trunk/data/test029.html
trunk/data/test085.html
trunk/data/test086.html
trunk/data/test089.html
Deleted: trunk/data/canonical/test-entities-not-complete.html
===================================================================
--- trunk/data/canonical/test-entities-not-complete.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/canonical/test-entities-not-complete.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1,16 +0,0 @@
-(HTML
-(HEAD
-)HEAD
-(BODY
-"Some entities with missing ; should be recognized but not all. FF and IE behave differently!\n
-(A
-Ahref foo?a=1&prod=sd
-"link 1
-)A
-"\n
-(A
-Ahref foo?a=1©=sd
-"link 2
-)A
-)BODY
-)HTML
Deleted: trunk/data/canonical/test-entities.html
===================================================================
--- trunk/data/canonical/test-entities.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/canonical/test-entities.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1,13 +0,0 @@
-(HTML
-(HEAD
-)HEAD
-(BODY
-"&unknown1; & &unknown2; &unknown3;
-(BR
-)BR
-"\n&##32;
-(BR
-)BR
-"\n&unknown1 & &unknown2 &unknown3&
-)BODY
-)HTML
\ No newline at end of file
Deleted: trunk/data/canonical/test022.html
===================================================================
--- trunk/data/canonical/test022.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/canonical/test022.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1,7 +0,0 @@
-(HTML
-(HEAD
-)HEAD
-(BODY
-"&foo;
-)BODY
-)HTML
\ No newline at end of file
Deleted: trunk/data/canonical/test029.html
===================================================================
--- trunk/data/canonical/test029.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/canonical/test029.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1,7 +0,0 @@
-(HTML
-(HEAD
-)HEAD
-(BODY
-"&#foo;
-)BODY
-)HTML
\ No newline at end of file
Deleted: trunk/data/canonical/test085.html
===================================================================
--- trunk/data/canonical/test085.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/canonical/test085.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1,7 +0,0 @@
-(HTML
-(HEAD
-)HEAD
-(BODY
-"&
-)BODY
-)HTML
\ No newline at end of file
Deleted: trunk/data/canonical/test086.html
===================================================================
--- trunk/data/canonical/test086.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/canonical/test086.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1,7 +0,0 @@
-(HTML
-(HEAD
-)HEAD
-(BODY
-"&#x
-)BODY
-)HTML
\ No newline at end of file
Deleted: trunk/data/canonical/test089.html
===================================================================
--- trunk/data/canonical/test089.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/canonical/test089.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1,7 +0,0 @@
-(HTML
-(HEAD
-)HEAD
-(BODY
-"&
-)BODY
-)HTML
\ No newline at end of file
Copied: trunk/data/entities/test-entities-not-complete.html (from rev 328, trunk/data/test-entities-not-complete.html)
===================================================================
--- trunk/data/entities/test-entities-not-complete.html (rev 0)
+++ trunk/data/entities/test-entities-not-complete.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1,3 @@
+Some entities with missing ; should be recognized but not all. FF and IE behave differently!
+<a href="foo?a=1&prod=sd">link 1</a>
+<a href="foo?a=1&copy=sd">link 2</a>
\ No newline at end of file
Copied: trunk/data/entities/test-entities-not-complete.html.canonical (from rev 328, trunk/data/canonical/test-entities-not-complete.html)
===================================================================
--- trunk/data/entities/test-entities-not-complete.html.canonical (rev 0)
+++ trunk/data/entities/test-entities-not-complete.html.canonical 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1,16 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+"Some entities with missing ; should be recognized but not all. FF and IE behave differently!\n
+(A
+Ahref foo?a=1&prod=sd
+"link 1
+)A
+"\n
+(A
+Ahref foo?a=1©=sd
+"link 2
+)A
+)BODY
+)HTML
Copied: trunk/data/entities/test-entities.html (from rev 328, trunk/data/test-entities.html)
===================================================================
--- trunk/data/entities/test-entities.html (rev 0)
+++ trunk/data/entities/test-entities.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1,3 @@
+&unknown1; & &unknown2; &unknown3;<br/>
+&##32;<br/>
+&unknown1 & &unknown2 &unknown3&
\ No newline at end of file
Copied: trunk/data/entities/test-entities.html.canonical (from rev 328, trunk/data/canonical/test-entities.html)
===================================================================
--- trunk/data/entities/test-entities.html.canonical (rev 0)
+++ trunk/data/entities/test-entities.html.canonical 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1,13 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+"&unknown1; & &unknown2; &unknown3;
+(BR
+)BR
+"\n&##32;
+(BR
+)BR
+"\n&unknown1 & &unknown2 &unknown3&
+)BODY
+)HTML
\ No newline at end of file
Copied: trunk/data/entities/test022.html (from rev 328, trunk/data/test022.html)
===================================================================
--- trunk/data/entities/test022.html (rev 0)
+++ trunk/data/entities/test022.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1 @@
+&foo;
\ No newline at end of file
Copied: trunk/data/entities/test022.html.canonical (from rev 328, trunk/data/canonical/test022.html)
===================================================================
--- trunk/data/entities/test022.html.canonical (rev 0)
+++ trunk/data/entities/test022.html.canonical 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1,7 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+"&foo;
+)BODY
+)HTML
\ No newline at end of file
Copied: trunk/data/entities/test029.html (from rev 328, trunk/data/test029.html)
===================================================================
--- trunk/data/entities/test029.html (rev 0)
+++ trunk/data/entities/test029.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1 @@
+&#foo;
\ No newline at end of file
Copied: trunk/data/entities/test029.html.canonical (from rev 328, trunk/data/canonical/test029.html)
===================================================================
--- trunk/data/entities/test029.html.canonical (rev 0)
+++ trunk/data/entities/test029.html.canonical 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1,7 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+"&#foo;
+)BODY
+)HTML
\ No newline at end of file
Copied: trunk/data/entities/test085.html (from rev 328, trunk/data/test085.html)
===================================================================
--- trunk/data/entities/test085.html (rev 0)
+++ trunk/data/entities/test085.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1 @@
+&
\ No newline at end of file
Copied: trunk/data/entities/test085.html.canonical (from rev 328, trunk/data/canonical/test085.html)
===================================================================
--- trunk/data/entities/test085.html.canonical (rev 0)
+++ trunk/data/entities/test085.html.canonical 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1,7 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+"&
+)BODY
+)HTML
\ No newline at end of file
Copied: trunk/data/entities/test086.html (from rev 328, trunk/data/test086.html)
===================================================================
--- trunk/data/entities/test086.html (rev 0)
+++ trunk/data/entities/test086.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1 @@
+&#x
\ No newline at end of file
Copied: trunk/data/entities/test086.html.canonical (from rev 328, trunk/data/canonical/test086.html)
===================================================================
--- trunk/data/entities/test086.html.canonical (rev 0)
+++ trunk/data/entities/test086.html.canonical 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1,7 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+"&#x
+)BODY
+)HTML
\ No newline at end of file
Copied: trunk/data/entities/test089.html (from rev 328, trunk/data/test089.html)
===================================================================
--- trunk/data/entities/test089.html (rev 0)
+++ trunk/data/entities/test089.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1 @@
+&
\ No newline at end of file
Copied: trunk/data/entities/test089.html.canonical (from rev 328, trunk/data/canonical/test089.html)
===================================================================
--- trunk/data/entities/test089.html.canonical (rev 0)
+++ trunk/data/entities/test089.html.canonical 2014-01-23 09:31:11 UTC (rev 334)
@@ -0,0 +1,7 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+"&
+)BODY
+)HTML
\ No newline at end of file
Deleted: trunk/data/test-entities-not-complete.html
===================================================================
--- trunk/data/test-entities-not-complete.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/test-entities-not-complete.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1,3 +0,0 @@
-Some entities with missing ; should be recognized but not all. FF and IE behave differently!
-<a href="foo?a=1&prod=sd">link 1</a>
-<a href="foo?a=1&copy=sd">link 2</a>
\ No newline at end of file
Deleted: trunk/data/test-entities.html
===================================================================
--- trunk/data/test-entities.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/test-entities.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1,3 +0,0 @@
-&unknown1; & &unknown2; &unknown3;<br/>
-&##32;<br/>
-&unknown1 & &unknown2 &unknown3&
\ No newline at end of file
Deleted: trunk/data/test022.html
===================================================================
--- trunk/data/test022.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/test022.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1 +0,0 @@
-&foo;
\ No newline at end of file
Deleted: trunk/data/test029.html
===================================================================
--- trunk/data/test029.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/test029.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1 +0,0 @@
-&#foo;
\ No newline at end of file
Deleted: trunk/data/test085.html
===================================================================
--- trunk/data/test085.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/test085.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1 +0,0 @@
-&
\ No newline at end of file
Deleted: trunk/data/test086.html
===================================================================
--- trunk/data/test086.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/test086.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1 +0,0 @@
-&#x
\ No newline at end of file
Deleted: trunk/data/test089.html
===================================================================
--- trunk/data/test089.html 2013-10-09 14:12:54 UTC (rev 333)
+++ trunk/data/test089.html 2014-01-23 09:31:11 UTC (rev 334)
@@ -1 +0,0 @@
-&
\ No newline at end of file
|
|
From: <mgu...@us...> - 2013-10-09 14:12:57
|
Revision: 333
http://sourceforge.net/p/nekohtml/code/333
Author: mguillem
Date: 2013-10-09 14:12:54 +0000 (Wed, 09 Oct 2013)
Log Message:
-----------
fixed version number in title
Modified Paths:
--------------
trunk/doc/index.html
Modified: trunk/doc/index.html
===================================================================
--- trunk/doc/index.html 2013-10-09 13:53:01 UTC (rev 332)
+++ trunk/doc/index.html 2013-10-09 14:12:54 UTC (rev 333)
@@ -1,7 +1,7 @@
<title>NekoHTML</title>
<link rel=stylesheet type=text/css href=style.css>
-<h1>CyberNeko HTML Parser <sub>1.9.18</sub></h1>
+<h1>CyberNeko HTML Parser <sub>1.9.19</sub></h1>
<div style="right: 10; top: 10; position: absolute">
<a href="http://sourceforge.net/projects/nekohtml"><img src="http://sflogo.sourceforge.net/sflogo.php?group_id=195122&type=12" width="120" height="30" border="0" alt="Get NekoHTML at SourceForge.net. Fast, secure and Free Open Source software downloads" /></a>
</div>
|
|
From: <mgu...@us...> - 2013-10-09 13:53:04
|
Revision: 332
http://sourceforge.net/p/nekohtml/code/332
Author: mguillem
Date: 2013-10-09 13:53:01 +0000 (Wed, 09 Oct 2013)
Log Message:
-----------
release 1.9.19
Added Paths:
-----------
branches/nekohtml-1.9.19/
|
|
From: <mgu...@us...> - 2013-10-09 13:42:29
|
Revision: 331
http://sourceforge.net/p/nekohtml/code/331
Author: mguillem
Date: 2013-10-09 13:42:25 +0000 (Wed, 09 Oct 2013)
Log Message:
-----------
preparing release 1.9.19
Modified Paths:
--------------
trunk/build.xml
trunk/doc/changes.html
trunk/doc/index.html
trunk/pom.xml
Modified: trunk/build.xml
===================================================================
--- trunk/build.xml 2013-10-09 07:37:53 UTC (rev 330)
+++ trunk/build.xml 2013-10-09 13:42:25 UTC (rev 331)
@@ -4,7 +4,7 @@
<!-- PROPERTIES -->
<property file='build-custom.properties' />
- <property name='version' value='1.9.19-SNAPSHOT'/>
+ <property name='version' value='1.9.19'/>
<property name='name' value='nekohtml'/>
<property name='fullname' value='${name}-${version}'/>
<property name='Title' value='NekoHTML'/>
Modified: trunk/doc/changes.html
===================================================================
--- trunk/doc/changes.html 2013-10-09 07:37:53 UTC (rev 330)
+++ trunk/doc/changes.html 2013-10-09 13:42:25 UTC (rev 331)
@@ -27,10 +27,12 @@
<h2>Releases</h2>
<dl>Elements
- <dt>Version 1.9.19 (to be realeased)</dt>
+ <dt>Version 1.9.19 (9 Oct 2013)</dt>
<dd>Element <code>LI</code> closes <code>DIV</code>,
handle Unicode supplementary character (#3609978, patch from Dan Rabe),
- change <code>LABEL</code> to inline element (#152).
+ change <code>LABEL</code> to inline element (#152),
+ fixed resource leak (#151).
+ </dd>
<dt>Version 1.9.18 (27 Feb 2013)</dt>
<dd>Elements <code>ADDRESS</code>, <code>CENTER</code>, <code>DD</code>, <code>DIR</code>, <code>DL</code>, <code>DT</code>, <code>FIELDSET</code>, <code>LISTING</code>, <code>LI</code>, <code>MENU</code>, <code>OL</code>, <code>PRE</code>, <code>UL</code>, and <code>XMP</code> close <code>P</code> (#3595486, patch from Ahmed Ashour),
Modified: trunk/doc/index.html
===================================================================
--- trunk/doc/index.html 2013-10-09 07:37:53 UTC (rev 330)
+++ trunk/doc/index.html 2013-10-09 13:42:25 UTC (rev 331)
@@ -57,8 +57,8 @@
following location:
<ul>
<li>NekoHTML
- [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.18.zip'>zip</a>]
- [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.18.tar.gz'>tgz</a>]
+ [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.19.zip'>zip</a>]
+ [<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.19.tar.gz'>tgz</a>]
</ul>
<h2>Requirements and Limitations</h2>
@@ -110,9 +110,9 @@
<td><a href='mailto:nek...@li...'>post</a>
</table>
If you find a problem with NekoHTML, please
-<a href='http://sourceforge.net/tracker/?func=add&group_id=195122&atid=952178'>file
+<a href='http://sourceforge.net/p/nekohtml/bugs/'>file
a bug</a>.
<div class='copyright'>
-(C) Copyright 2002-2009, Andy Clark, Marc Guillemot. All rights reserved.
+(C) Copyright 2002-2013, Andy Clark, Marc Guillemot. All rights reserved.
</div>
Modified: trunk/pom.xml
===================================================================
--- trunk/pom.xml 2013-10-09 07:37:53 UTC (rev 330)
+++ trunk/pom.xml 2013-10-09 13:42:25 UTC (rev 331)
@@ -4,7 +4,7 @@
<artifactId>nekohtml</artifactId>
<name>Neko HTML</name>
<description>An HTML parser and tag balancer.</description>
- <version>1.9.19-SNAPSHOT</version>
+ <version>1.9.19</version>
<url>http://nekohtml.sourceforge.net/</url>
<licenses>
<license>
|
|
From: <mgu...@us...> - 2013-10-09 07:37:55
|
Revision: 330
http://sourceforge.net/p/nekohtml/code/330
Author: mguillem
Date: 2013-10-09 07:37:53 +0000 (Wed, 09 Oct 2013)
Log Message:
-----------
fixed resource leak (#151)
Modified Paths:
--------------
trunk/src/org/cyberneko/html/HTMLEntities.java
Modified: trunk/src/org/cyberneko/html/HTMLEntities.java
===================================================================
--- trunk/src/org/cyberneko/html/HTMLEntities.java 2013-09-30 08:04:25 UTC (rev 329)
+++ trunk/src/org/cyberneko/html/HTMLEntities.java 2013-10-09 07:37:53 UTC (rev 330)
@@ -17,6 +17,7 @@
package org.cyberneko.html;
import java.io.IOException;
+import java.io.InputStream;
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashMap;
@@ -96,7 +97,9 @@
/** Loads the entity values in the specified resource. */
private static void load0(final Properties props, final String filename) {
try {
- props.load(HTMLEntities.class.getResourceAsStream(filename));
+ final InputStream stream = HTMLEntities.class.getResourceAsStream(filename);
+ props.load(stream);
+ stream.close();
}
catch (final IOException e) {
System.err.println("error: unable to load resource \""+filename+"\"");
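Note on revision 330: the change above closes the stream after a successful props.load(stream); if load throws, close() is not reached. On Java 7 or later, a try-with-resources block closes the stream on both paths. The sketch below is an alternative under that assumption, not the committed code (build.xml in this archive sets compile.source to 1.3, where this syntax is unavailable). LoadPropertiesSketch and its load() method are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Illustrative only: load properties and always close the source stream. */
final class LoadPropertiesSketch {
    static Properties load(InputStream source) {
        Properties props = new Properties();
        // try-with-resources closes 'in' whether load() returns or throws.
        try (InputStream in = source) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        byte[] bytes = "amp=38\nlt=60\n".getBytes(StandardCharsets.ISO_8859_1);
        Properties props = load(new ByteArrayInputStream(bytes));
        System.out.println(props.getProperty("amp"));
    }
}
```

Properties.load itself leaves the stream open after it returns, which is why the caller has to close it in either version.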
|
|
From: <mgu...@us...> - 2013-09-30 08:04:30
|
Revision: 329
http://sourceforge.net/p/nekohtml/code/329
Author: mguillem
Date: 2013-09-30 08:04:25 +0000 (Mon, 30 Sep 2013)
Log Message:
-----------
changed LABEL to inline element (#152).
Modified Paths:
--------------
trunk/doc/changes.html
trunk/src/org/cyberneko/html/HTMLElements.java
Added Paths:
-----------
trunk/data/a/test-a_around-label.html
trunk/data/a/test-a_around-label.html.canonical
Added: trunk/data/a/test-a_around-label.html
===================================================================
--- trunk/data/a/test-a_around-label.html (rev 0)
+++ trunk/data/a/test-a_around-label.html 2013-09-30 08:04:25 UTC (rev 329)
@@ -0,0 +1 @@
+<a href=foo"><label>hello</label></a>
\ No newline at end of file
Added: trunk/data/a/test-a_around-label.html.canonical
===================================================================
--- trunk/data/a/test-a_around-label.html.canonical (rev 0)
+++ trunk/data/a/test-a_around-label.html.canonical 2013-09-30 08:04:25 UTC (rev 329)
@@ -0,0 +1,12 @@
+(HTML
+(HEAD
+)HEAD
+(BODY
+(A
+Ahref foo"
+(LABEL
+"hello
+)LABEL
+)A
+)BODY
+)HTML
Modified: trunk/doc/changes.html
===================================================================
--- trunk/doc/changes.html 2013-05-13 07:17:56 UTC (rev 328)
+++ trunk/doc/changes.html 2013-09-30 08:04:25 UTC (rev 329)
@@ -29,7 +29,8 @@
<dl>Elements
<dt>Version 1.9.19 (to be realeased)</dt>
<dd>Element <code>LI</code> closes <code>DIV</code>,
- handle Unicode supplementary character (#3609978, patch from Dan Rabe).
+ handle Unicode supplementary character (#3609978, patch from Dan Rabe),
+ change <code>LABEL</code> to inline element (#152).
<dt>Version 1.9.18 (27 Feb 2013)</dt>
<dd>Elements <code>ADDRESS</code>, <code>CENTER</code>, <code>DD</code>, <code>DIR</code>, <code>DL</code>, <code>DT</code>, <code>FIELDSET</code>, <code>LISTING</code>, <code>LI</code>, <code>MENU</code>, <code>OL</code>, <code>PRE</code>, <code>UL</code>, and <code>XMP</code> close <code>P</code> (#3595486, patch from Ahmed Ashour),
@@ -99,7 +100,7 @@
<dt>Version 1.9.12 (20 Apr 2009)
[<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.12.zip'>zip</a>]
[<a href='http://downloads.sourceforge.net/nekohtml/nekohtml-1.9.12.tar.gz'>tgz</a>]
- <dd>fixed NPE when parsing from a Character stream (patch provided by Ludger B\xFCnger, #2503982),
+ <dd>fixed NPE when parsing from a Character stream (patch provided by Ludger B�nger, #2503982),
when closing comment --> is missing, comment ends with > (patch provided by Tatsuhiko Miyabe, #2552096),
don't treat tags with non-HTML namespace like HTML tags (patch provided by Tatsuhiko Miyabe, #2551958),
force creation of BODY rather than of HEAD for unknown tags,
@@ -218,9 +219,9 @@
added features to strip CDATA delimiters (i.e. "<![CDATA[" and
"]]>") from <script> and <style> elements suggested by Dan Sojka;
fixed tag-balancing problem reported by Egor Samarkhanov;
- applied augmentations patches donated by Marc-Andr\xE9 Morissette;
+ applied augmentations patches donated by Marc-Andr� Morissette;
implemented augmentation performance enhancements inspired by
- Marc-Andr\xE9 Morissette;
+ Marc-Andr� Morissette;
fixed ignore-outside-content bug reported by Chris Erskine;
and
updated link to Xerces download site.
Modified: trunk/src/org/cyberneko/html/HTMLElements.java
===================================================================
--- trunk/src/org/cyberneko/html/HTMLElements.java 2013-05-13 07:17:56 UTC (rev 328)
+++ trunk/src/org/cyberneko/html/HTMLElements.java 2013-09-30 08:04:25 UTC (rev 329)
@@ -252,7 +252,7 @@
// DIR
new Element(DIR, "DIR", 0, BODY, new short[] {P}),
// DIV - - (%flow;)*
- new Element(DIV, "DIV", Element.BLOCK, BODY, new short[]{P}),
+ new Element(DIV, "DIV", Element.CONTAINER, BODY, new short[]{P}),
// DD - O (%flow;)*
new Element(DD, "DD", 0, BODY, new short[]{DT,DD,P}),
// DL - - (DT|DD)+
@@ -317,13 +317,13 @@
};
ELEMENTS_ARRAY['L'-'A'] = new Element[] {
// LABEL - - (%inline;)* -(LABEL)
- new Element(LABEL, "LABEL", 0, BODY, null),
+ new Element(LABEL, "LABEL", Element.INLINE, BODY, null),
// LAYER
new Element(LAYER, "LAYER", Element.BLOCK, BODY, null),
// LEGEND - - (%inline;)*
new Element(LEGEND, "LEGEND", Element.INLINE, FIELDSET, null),
// LI - O (%flow;)*
- new Element(LI, "LI", 0, new short[]{BODY,UL,OL}, new short[]{LI,P,DIV}),
+ new Element(LI, "LI", Element.CONTAINER, new short[]{BODY,UL,OL}, new short[]{LI,P}),
// LINK - O EMPTY
new Element(LINK, "LINK", Element.EMPTY, HEAD, null),
// LISTING
|