ekhtml-devel Mailing List for El-Kabong - HTML Parser
Brought to you by:
jick
| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2002 | | | | | | | | | 9 | 2 | 16 | 7 |
| 2003 | 1 | 2 | 5 | 1 | 4 | 1 | 2 | | | | | |
| 2004 | | 1 | | | | 1 | | | | | | |
| 2006 | | | | | | 6 | | | | | | |
| 2007 | | | | | | | | | | | | 2 |
| 2010 | | 1 | | | | | | | | | | |
| 2011 | | | | | | | 1 | | | | | |
|
From: Dave D. <dd...@re...> - 2011-07-29 20:27:29
|
I'm using ekhtml to parse the text from a JavaScript HTML editor. My problem is that when processing the parsed text, if something interesting occurs (for example, an overset line or an illegal attribute value), there's no good way to get back to the raw HTML that corresponds to the problem. I can do a sort of "dead reckoning" by counting the characters in all the tags that are presented through the callbacks, but this is both cumbersome and not completely accurate. What I would like to see is a third field in every "ekhtml_string_t" table: the index in the original text of the tag being presented. |
|
From: 李仁林 <lir...@gm...> - 2010-02-28 13:05:33
|
Hello everyone, I am now using libekhtml, but I have found a problem. As we know, a tag may contain non-letter characters, for example: <div class="body">. The ekhtml library can't recognize this when I pass "div class=\"body\"" as the tag parameter to the ekhtml_parser_startcb_add or ekhtml_parser_endcb_add function. Is there any solution? Thanks for any instructions and help!
-- Heaven moves with vigor; the noble person strives ceaselessly for self-improvement! |
|
From: John K. S. <jo...@st...> - 2007-12-23 22:34:07
|
Hi Simon -
The mktables project generates that file. Have you run that?
> Hi,
>
> I followed your tips in the post related to fixing Win32 compilation.
> However, I still can't manage to make it compile. I'm using Visual
> Studio 6.0, and it is looking for a header file called "ekhtml_tables.h"
> (included from ekhtml.c, and possibly other files), which doesn't exist
> in the distribution I have.
>
> Any tips?
>
> Thanks,
>
> Simon Ouellette |
|
From: Simon O. <sim...@vi...> - 2007-12-23 20:56:48
|
Hi, I followed your tips in the post related to fixing Win32 compilation. However, I still can't manage to make it compile. I'm using Visual Studio 6.0, and it is looking for a header file called "ekhtml_tables.h" (included from ekhtml.c, and possibly other files), which doesn't exist in the distribution I have. Any tips? Thanks, Simon Ouellette |
|
From: Jon T. <jt...@p0...> - 2006-06-21 16:06:37
|
Your users should be entering standards-compliant HTML. Ekhtml offers some help for poorly formed HTML, but this is particularly bad. It won't be fixed in a newer version.
-- Jon
On Jun 21, 2006, at 12:20 AM, mingqiang.lee wrote:
> Yes, I know that's the reason, but it is user input; I can't restrict what users do.
> I think it's a bug in ekhtml. I don't know whether the authors have noticed this situation, or whether they plan to fix it in the next version. |
|
From: mingqiang.lee <min...@16...> - 2006-06-21 07:20:32
|
Yes, I know that's the reason, but it is user input; I can't restrict what users do.
I think it's a bug in ekhtml. I don't know whether the authors have noticed this situation, or whether they plan to fix it in the next version.
----- Original Message -----
From: "Jon Travis"
To: "mingqiang.lee"
Cc: "john sterling", ekh...@li...
Date: Wed, 21 Jun 2006 10:33:09 +0800 (CST)
Subject: Re: [ekhtml-devel] ekhtml lib coredump!
Because your onclick argument isn't quoted?
-- Jon |
|
From: Jon T. <jt...@p0...> - 2006-06-21 02:33:21
|
Because your onclick argument isn't quoted?
-- Jon
On Jun 18, 2006, at 7:51 PM, mingqiang.lee wrote:
> Thank you for your attention. I have resolved the coredump problem. I think it was a "\r\n" problem: I downloaded the code on Windows and copied it to a Linux environment, so a "\r\n" ended up at the end of every line. It compiled, but it ran incorrectly. The second time I stripped those "\r\n" sequences, and it no longer dumps core.
> But I found that ekhtml still has some bugs. For example, if the tag is
>
> <IMG onmousewheel="return bbimg(this)" style="CURSOR: hand" onclick=window.open(this.src); alt=点击查看原图 src="http://bbsimg.qq.com/2005/01/12/006/501.jpg" onload="javascript:if(this.width>screen.width*0.7)this.style.width=screen.width*0.7;" border=0>
>
> ekhtml can only find the first three attributes: "onmousewheel", "style", "onclick".
> Does anyone have an idea? |
|
From: mingqiang.lee <min...@16...> - 2006-06-19 02:51:32
|
Thank you for your attention. I have resolved the coredump problem. I think it was a "\r\n" problem: I downloaded the code on Windows and copied it to a Linux environment, so a "\r\n" ended up at the end of every line. It compiled, but it ran incorrectly. The second time I stripped those "\r\n" sequences, and it no longer dumps core.
But I found that ekhtml still has some bugs. For example, if the tag is
<IMG onmousewheel="return bbimg(this)" style="CURSOR: hand" onclick=window.open(this.src); alt=点击查看原图 src="http://bbsimg.qq.com/2005/01/12/006/501.jpg" onload="javascript:if(this.width>screen.width*0.7)this.style.width=screen.width*0.7;" border=0>
ekhtml can only find the first three attributes: "onmousewheel", "style", "onclick".
Does anyone have an idea?
----- Original Message -----
From: "John Sterling"
Sent: 2006-06-16 09:29:59
To: "mingqiang.lee"
Cc: ekh...@li...
Subject: Re: [ekhtml-devel] ekhtml lib coredump!
Can you provide more context? For example, what does AttrInTable do? What does the rest of the loop look like? Do you know what is corrupted or null? Does it happen the first time it enters the loop? Or after you reassign P?
jks |
|
From: John S. <jo...@st...> - 2006-06-16 01:30:16
|
Can you provide more context? For example, what does AttrInTable do?
What does the rest of the loop look like? Do you know what is
corrupted or null? Does it happen the first time it enters the loop?
Or after you reassign P?
jks
On Jun 13, 2006, at 10:41 PM, mingqiang.lee wrote:
> Hello all:
> I found that the newest version of ekhtml dumps core! I used to use
> an old version of ekhtml, and it worked normally, though it had some
> bugs. But after I replaced the old version with the newest one several
> days ago, my program can't run any more.
>
> My code:
> void HtmlFilter::Handle_StartTag(void *Data, ekhtml_string_t *Tag,
>                                  ekhtml_attr_t *Attrs)
> {
>     ekhtml_attr_t *P = (ekhtml_attr_t *)Attrs;
>     while (P)
>     {
>         if (AttrInTable(P->name.str, P->name.len)) // core dump here!!!
>
>
> Is there a problem with it? Does the usage of the newest ekhtml library
> differ from the old version?
> _______________________________________________
> ekhtml-devel mailing list
> ekh...@li...
> https://lists.sourceforge.net/lists/listinfo/ekhtml-devel
|
|
From: mingqiang.lee <min...@16...> - 2006-06-14 02:41:52
|
Hello all:
I found that the newest version of ekhtml dumps core! I used to use an old version of ekhtml, and it worked normally, though it had some bugs. But after I replaced the old version with the newest one several days ago, my program can't run any more.
My code:
void HtmlFilter::Handle_StartTag(void *Data, ekhtml_string_t *Tag, ekhtml_attr_t *Attrs)
{
    ekhtml_attr_t *P = (ekhtml_attr_t *)Attrs;
    while (P)
    {
        if (AttrInTable(P->name.str, P->name.len)) // core dump here!!!
Is there a problem with it? Does the usage of the newest ekhtml library differ from the old version? |
|
From: SourceForge.net <no...@so...> - 2004-06-09 12:21:42
|
Bugs item #969551, was opened at 2004-06-09 15:21
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=500165&aid=969551&group_id=62314
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Osnat (osnat)
Assigned to: Nobody/Anonymous (nobody)
Summary: memory leak in ekhtml_parser_destroy()
Initial Comment:
There is a memory leak in the function ekhtml_parser_destroy(): each hnode_t is deleted from the hash table without being freed. Replacing the call to hash_scan_delete() with a call to hash_scan_delfree() fixes the leak. Here is the fixed function:
void ekhtml_parser_destroy(ekhtml_parser_t *ekparser)
{
    hnode_t *hn;
    hscan_t hs;
    hash_scan_begin(&hs, ekparser->startendcb);
    while((hn = hash_scan_next(&hs))) {
        ekhtml_string_t *key = (ekhtml_string_t *)hnode_getkey(hn);
        ekhtml_tag_container *cont = hnode_get(hn);
        hash_scan_delfree(ekparser->startendcb, hn); // mem leak fix
        free((char *)key->str);
        free(key);
        free(cont);
    }
    hash_destroy(ekparser->startendcb);
    ekhtml_parser_starttag_cleanup(ekparser);
    free(ekparser->buf);
    free(ekparser);
} |
|
From: SourceForge.net <no...@so...> - 2004-02-19 23:37:10
|
Bugs item #900755, was opened at 2004-02-19 18:29
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=500165&aid=900755&group_id=62314
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Josh (jmmankoff)
Assigned to: Nobody/Anonymous (nobody)
Summary: unquoted attribute value with comma handled poorly
Initial Comment:
In ekhtml_mktables.c, the function that specifies which characters are valid in attribute values (valid_attrvalue) does not include the comma (,), the semicolon (;), or the vertical bar (|). If any of these characters shows up in an unquoted attribute value, then the rest of the element is not parsed. An example that seems to show up pretty commonly is:
<AREA SHAPE=rectangle COORDS=40,20,30,40 HREF="http://..."/>
In this case, the HREF is never found. As another example, if an unquoted HREF contains any of these characters, then the value will only include characters up to, but not including, the first offending character.
I propose adding these characters to the list of characters in valid_attrvalue. Here is the diff of ekhtml_mktables.c for this change:
72c72,73
< in == '$' || in == '_')
---
> in == '$' || in == '_' || in == ',' || in == ';' ||
> in == '|')
Attached is the modified file |
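The effect described in this report can be illustrated with a small standalone sketch in C. It is not the code from ekhtml_mktables.c: only the tests for '$', '_', ',', ';' and '|' come from the diff in the report, while the alphanumeric test, the other punctuation, and both function names are assumptions made so that the example is complete.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical predicate for characters allowed in an unquoted attribute
 * value. Only '$', '_', ',', ';' and '|' are taken from the diff in the
 * report; everything else here is an assumption. */
static int valid_attrvalue_sketch(int in)
{
    return isalnum((unsigned char)in) ||
           in == '-' || in == '.' || in == ':' || in == '/' || /* assumed */
           in == '$' || in == '_' ||            /* present before the patch */
           in == ',' || in == ';' || in == '|'; /* proposed additions */
}

/* Length of the unquoted value starting at s: scan until the first
 * character that the predicate rejects (or the end of the string). */
static size_t unquoted_value_len(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != '\0' && valid_attrvalue_sketch(s[n]))
        n++;
    return n;
}
```

With the comma accepted, scanning "40,20,30,40 HREF=..." stops at the space after the fourth number, so the whole COORDS value is kept and the scan can continue with HREF. Without it, the scan would stop after "40".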
|
From: Jon T. <jt...@p0...> - 2003-07-23 20:34:32
|
That is actually the correct behaviour. The parser does not buffer up the entire data before calling the callback; it calls that callback at various times (for example, when the buffer needs to be flushed so it can be refilled). If the buffer kept growing so that you could get a single full data callback, it could take a lot of memory.
You'll need to do your own buffering in the callback if you want to deal with the data as a whole.
-- Jon
On Wednesday, July 23, 2003, at 12:49 PM, Unknown Unknown wrote:
> Hello,
>
> I just started using ekhtml a few weeks ago to parse some ebay html
> pages and I am having a problem with the data callback returning
> incomplete strings of data. The problem seems to stem from the
> feedsize. Some values of the feedsize work fine and others don't, of
> course this varies for each file, causing inconsistent parsing
> results.
>
> I have attached a small sample of the program to this e-mail,
> hopefully you guys won't mind (it's only 7kb), with an html file that
> exhibits the problem. Inside the source file there are two feedsizes,
> one exhibiting the problem and one that works fine. Let me know if
> this is a bug or if I am doing something incorrectly because I was
> under the impression that any feedsize would still result in the same
> data returned by the callback.
>
> The program is run with the command:
> ./buffbug < testhtml/1534097125.html
>
> Thanks for any and all help,
> Jacob Abrams
>
> <buffbug.tgz> |
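The "do your own buffering" advice can be sketched roughly as follows. This is a standalone illustration, not code from ekhtml: chunk_t is a hypothetical stand-in for the pointer-and-length string a data callback receives, and textbuf_t, textbuf_append and on_data are names invented for the example. The idea is that the callback appends each piece to a growable buffer owned by the application, and the application looks at the whole text only once it knows the run is complete (for instance in an end-tag callback).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the (pointer, length) string handed to a
 * data callback; declared here only so the sketch compiles on its own. */
typedef struct {
    const char *str;
    size_t len;
} chunk_t;

/* Growable buffer owned by the application, not by the parser. */
typedef struct {
    char *data;
    size_t len;
    size_t cap;
} textbuf_t;

/* Append one chunk, keeping the buffer NUL-terminated.
 * Returns 0 on success, -1 if memory could not be allocated. */
static int textbuf_append(textbuf_t *tb, const chunk_t *chunk)
{
    assert(tb != NULL && chunk != NULL);
    if (tb->len + chunk->len + 1 > tb->cap) {
        size_t ncap = tb->cap ? tb->cap : 64;
        char *ndata;

        while (ncap < tb->len + chunk->len + 1)
            ncap *= 2;
        ndata = realloc(tb->data, ncap);
        if (ndata == NULL)
            return -1;
        tb->data = ndata;
        tb->cap = ncap;
    }
    memcpy(tb->data + tb->len, chunk->str, chunk->len);
    tb->len += chunk->len;
    tb->data[tb->len] = '\0';
    return 0;
}

/* Shape of a data callback: the application's buffer travels in cbdata,
 * and every partial delivery is appended to it. */
static void on_data(void *cbdata, const chunk_t *chunk)
{
    (void)textbuf_append((textbuf_t *)cbdata, chunk);
}
```

If the parser happens to deliver "Current bid: " and "$12.50" in two separate calls, the buffer ends up holding the full string regardless of where the feed size split it. The caller frees tb.data when it is done with the text.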
|
From: Unknown U. <sat...@ho...> - 2003-07-23 19:50:09
|
Hello,
I just started using ekhtml a few weeks ago to parse some ebay html pages and I am having a problem with the data callback returning incomplete strings of data. The problem seems to stem from the feedsize. Some values of the feedsize work fine and others don't; of course this varies for each file, causing inconsistent parsing results.
I have attached a small sample of the program to this e-mail, hopefully you guys won't mind (it's only 7kb), with an html file that exhibits the problem. Inside the source file there are two feedsizes, one exhibiting the problem and one that works fine. Let me know if this is a bug or if I am doing something incorrectly, because I was under the impression that any feedsize would still result in the same data returned by the callback.
The program is run with the command:
./buffbug < testhtml/1534097125.html
Thanks for any and all help,
Jacob Abrams |
|
From: SourceForge.net <no...@so...> - 2003-06-28 22:20:45
|
Bugs item #762526, was opened at 2003-06-28 22:20
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=500165&aid=762526&group_id=62314
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: weltraumkuh (weltraumhund)
Assigned to: Nobody/Anonymous (nobody)
Summary: > doesn't work in attributes
Initial Comment:
A > doesn't work in attributes. Example:
<a href="#" name="x->y" >
or
<a test=">" > |
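To make the behaviour the reporter expects concrete, here is a minimal standalone sketch in C of finding the end of a tag while ignoring any '>' inside a quoted attribute value. It is an illustration only, not ekhtml's scanner, and find_tag_end is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the '>' that closes the tag beginning at s[0],
 * skipping any '>' found inside single- or double-quoted attribute
 * values. Returns -1 if the tag is not closed within the string. */
static long find_tag_end(const char *s)
{
    char quote = '\0'; /* the currently open quote character, if any */
    size_t i;

    assert(s != NULL);
    for (i = 0; s[i] != '\0'; i++) {
        if (quote != '\0') {
            if (s[i] == quote)
                quote = '\0';      /* closing quote */
        } else if (s[i] == '"' || s[i] == '\'') {
            quote = s[i];          /* opening quote */
        } else if (s[i] == '>') {
            return (long)i;        /* first '>' outside quotes */
        }
    }
    return -1;
}
```

For both examples in the report, this scan returns the position of the final '>' rather than the one inside the quoted value.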
|
From: Todd F. <ta...@le...> - 2003-05-01 18:56:31
|
Just wanted to send a follow-up. It turns out I was doing something wrong on my part and it had nothing to do with ekhtml, sorry :)*
-todd
ekh...@li... wrote:
> Message: 1
> Date: Wed, 30 Apr 2003 02:44:48 -0400
> From: Todd Fisher <ta...@le...>
> Organization: Lehigh University
> To: ekh...@li...
> Subject: [ekhtml-devel] Status
>
> Hi,
> I recently downloaded the release version 0.32 of ekhtml. In using it I'm having a little difficulty... The example application runs just fine and if I use the library to do a single parse, so streaming it works...
>
> However, when I try to use it for streaming I get a core dump, and it would appear that the library is returning a bad string pointer in one of my event handlers. Anyway, here's how I'm using the library and here's the stack trace.
>
> I also have a question about initializing callbacks: is the library case insensitive when I set tag handlers like in the following init method? Obviously, I can figure this out with a few quick tests, but since I'm asking I figured it should be easy enough to answer...
>
> Thanks in advance,
> -todd
>
> void PageProc::init()
> {
>     ekhtml_parser_datacb_set ( ekparser, handle_clear_text );
>     ekhtml_parser_startcb_add ( ekparser, "title", handle_title_start );
>     ekhtml_parser_endcb_add ( ekparser, "title", handle_title_end );
>     ekhtml_parser_startcb_add ( ekparser, "TITLE", handle_title_start );
>     ekhtml_parser_endcb_add ( ekparser, "TITLE", handle_title_end );
>     ekhtml_parser_startcb_add ( ekparser, "a", handle_a_tag_start );
>     ekhtml_parser_startcb_add ( ekparser, "A", handle_a_tag_start );
>     ekhtml_parser_startcb_add ( ekparser, "meta", handle_meta_tag_start );
>     ekhtml_parser_startcb_add ( ekparser, "META", handle_meta_tag_start );
> }
>
> void PageProc::
> parse_init( Page *p, HostIndex *hindex )
> {
>     parse_data = new ParseData;
>     parse_data->check_is_html = true;
>     parse_data->is_html = true;
>     parse_data->noindex = false;
>     parse_data->nofollow = false;
>     parse_data->add_to_title = false;
>     parse_data->page = p;
>     parse_data->host_index = hindex;
>     this->ekparser = ekhtml_parser_new( NULL );
>     init();
>     ekhtml_parser_cbdata_set( this->ekparser, p );
> }
>
> void PageProc::
> parse_feed( const char *new_bytes, size_t bytes )
> {
>     if( parse_data->check_is_html ){ // first call check if the doc is html
>         parse_data->check_is_html = false; // only check once
>         parse_data->is_html = html_check( new_bytes, bytes );
>         parse_data->page->type = "HTML";
>     }
>     else{
>         parse_data->page->type = "UNKOWN";
>     }
>     if( parse_data->is_html ){
>         ekhtml_string_t str;
>         str.str = new_bytes;
>         str.len = bytes;
>         fprintf( stderr, "parsing document\n" );
>         ekhtml_parser_feed( this->ekparser, &str );
>         fprintf( stderr, "flushing\n" );
>         ekhtml_parser_flush( this->ekparser, 0 );
>     }
>     fprintf( stderr, "Appending to content buffer\n" );
>     // if its html or not always store the new
>     // document because we can later do a more
>     // extensive analysis of the doc type
>     parse_data->page->content.append( new_bytes, bytes );
>     fprintf( stderr, "done\n" );
> }
>
> void PageProc::
> parse_close()
> {
>     ekhtml_parser_flush( ekparser, 1 );
>     ekhtml_parser_destroy( ekparser );
>     // set the mod date for the page
>     time_t rawtime;
>     time ( &rawtime );
>     parse_data->page->mod_date = asctime( localtime( &rawtime ) );
>
>     // free our parse data
>     delete parse_data;
> }
>
> #0 0x4207c45c in memcpy () from /lib/tls/libc.so.6
> #1 0x4029b40e in std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned) () from /usr/lib/libstdc++.so.5
> #2 0x40299146 in std::string::reserve(unsigned) () from /usr/lib/libstdc++.so.5
> #3 0x40299642 in std::string::append(char const*, unsigned) () from /usr/lib/libstdc++.so.5
> #4 0x0804ed1a in handle_clear_text (cbdata=0x806c000, str=0x402c939c) at PageProc.cc:195
> #5 0x40218fc2 in ekhtml_parse_special (parser=0x4214abdc, state_data=0x806716c, curp=0x200fa64 <Address 0x200fa64 out of bounds>, endp=0x42134014 "", baddata=0xbfffdc64) at ekhtml_special.c:64
> #6 0x402185ec in ekhtml_parser_flush (parser=0x8067120, flushall=0) at ekhtml.c:173
> #7 0x0804e4fb in PageProc::parse_feed(char const*, unsigned) (this=0xbfffdfc0, new_bytes=0x805dc91 "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">\n<html lang=\"en\">\n<head>\n\t<meta http-equiv=\"content-type\" content=\"text/html; charset=iso-8859-1\">\n\t<meta http-equiv=\"refresh\" content=\"1"..., bytes=517) at PageProc.cc:54
> #8 0x0804ee16 in retrieve_doc(void*, unsigned, unsigned, PageProc*) (ptr=0x805dc91, size=1, nmemb=517, parser=0xbfffdfc0) at PageProc.cc:213
> #9 0x400227c4 in Curl_client_write (data=0x805d610, type=1, ptr=0x805dc91 "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">\n<html lang=\"en\">\n<head>\n\t<meta http-equiv=\"content-type\" content=\"text/html; charset=iso-8859-1\">\n\t<meta http-equiv=\"refresh\" content=\"1"..., len=517) at sendf.c:309
> #10 0x400322bf in Curl_httpchunk_read (conn=0x8068fe0, datap=0x805dc91 "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">\n<html lang=\"en\">\n<head>\n\t<meta http-equiv=\"content-type\" content=\"text/html; charset=iso-8859-1\">\n\t<meta http-equiv=\"refresh\" content=\"1"..., length=1407, wrote=0xbfffddb4) at http_chunks.c:183
> #11 0x4003052c in Curl_readwrite (conn=0x8068fe0, done=0xbfffdf0f "") at transfer.c:841
> #12 0x4003155d in Transfer (conn=0x8068fe0) at transfer.c:1318
> #13 0x40031b3a in Curl_perform (data=0x805d610) at transfer.c:1657
> #14 0x40031ebc in curl_easy_perform (curl=0x805d610) at easy.c:247
> #15 0x0804a5df in main () at page_proc_test.cc:22
> #16 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6 |
|
From: Dmitri T. <dm...@ne...> - 2003-05-01 18:23:02
|
OK, thanks, it's easier than I thought :) Maybe that should be an option to the parser? - Dmitri. |
|
From: Jon T. <jt...@p0...> - 2003-05-01 17:30:24
|
If the tags are converted to upper case, it makes it easier for the callback functions to verify which tag they are being called for. I.e. they don't have to do case insensitive comparisons. -- Jon On Thursday, May 1, 2003, at 07:54 AM, Dmitri Tikhonov wrote: > Is there an internal reason why tag names are converted to upper case? > I've > removed the code that does it and the change does not seem to break > anything. > I'm very curious to find out why it's done. > > - Dmitri. > > > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. > http://thinkgeek.com/sf > _______________________________________________ > ekhtml-devel mailing list > ekh...@li... > https://lists.sourceforge.net/lists/listinfo/ekhtml-devel > |
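To illustrate the point about case normalization, here is a minimal sketch in plain C. It does not use the ekhtml API: `struct tag_name`, `tag_upcase`, and `tag_is` are hypothetical names made up for this example, and the struct only mimics a pointer-plus-length string. Assuming the parser upper-cases each tag name once, a callback can identify its tag with an exact byte comparison.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a length-delimited tag name; this is not
 * the ekhtml type, only an illustration. */
struct tag_name {
    const char *str;
    size_t      len;
};

/* Upper-case a tag name in place, as a parser might do once per tag. */
static void tag_upcase(char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] = (char)toupper((unsigned char)buf[i]);
}

/* With normalized names, a callback can use an exact byte comparison
 * instead of a case-insensitive one. upper_name must be upper case. */
static int tag_is(const struct tag_name *tag, const char *upper_name)
{
    size_t n = strlen(upper_name);
    return tag->len == n && memcmp(tag->str, upper_name, n) == 0;
}
```

Under that assumption, a handler written against "TITLE" matches `<title>`, `<Title>`, and `<TITLE>` alike, so each tag needs only one comparison string.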
|
From: Dmitri T. <dm...@ne...> - 2003-05-01 14:54:49
|
Is there an internal reason why tag names are converted to upper case? I've removed the code that does it and the change does not seem to break anything. I'm very curious to find out why it's done. - Dmitri. |
|
From: Todd F. <ta...@le...> - 2003-04-30 06:46:49
|
Hi,
I recently downloaded release version 0.32 of ekhtml, and I'm having a
little difficulty using it.
The example application runs just fine, and if I use the library to do a
single parse, with no streaming, it works.
However, when I try to use it for streaming I get a core dump, and it
appears that the library is passing a bad string pointer to one of my
event handlers. Below are how I'm using the library and the stack trace.
I also have a question about initializing callbacks: is the library case
insensitive when I set tag handlers, as in the following init method?
Obviously I can figure this out with a few quick tests, but since I'm
asking anyway I figured it should be easy enough to answer.
Thanks in advance,
-todd
void PageProc::init()
{
    ekhtml_parser_datacb_set ( ekparser, handle_clear_text );
    ekhtml_parser_startcb_add ( ekparser, "title", handle_title_start );
    ekhtml_parser_endcb_add ( ekparser, "title", handle_title_end );
    ekhtml_parser_startcb_add ( ekparser, "TITLE", handle_title_start );
    ekhtml_parser_endcb_add ( ekparser, "TITLE", handle_title_end );
    ekhtml_parser_startcb_add ( ekparser, "a", handle_a_tag_start );
    ekhtml_parser_startcb_add ( ekparser, "A", handle_a_tag_start );
    ekhtml_parser_startcb_add ( ekparser, "meta", handle_meta_tag_start );
    ekhtml_parser_startcb_add ( ekparser, "META", handle_meta_tag_start );
}

void PageProc::
parse_init( Page *p, HostIndex *hindex )
{
    parse_data = new ParseData;
    parse_data->check_is_html = true;
    parse_data->is_html = true;
    parse_data->noindex = false;
    parse_data->nofollow = false;
    parse_data->add_to_title = false;
    parse_data->page = p;
    parse_data->host_index = hindex;
    this->ekparser = ekhtml_parser_new( NULL );
    init();
    ekhtml_parser_cbdata_set( this->ekparser, p );
}

void PageProc::
parse_feed( const char *new_bytes, size_t bytes )
{
    if( parse_data->check_is_html ){ // first call check if the doc is html
        parse_data->check_is_html = false; // only check once
        parse_data->is_html = html_check( new_bytes, bytes );
        parse_data->page->type = "HTML";
    }
    else{
        parse_data->page->type = "UNKOWN";
    }
    if( parse_data->is_html ){
        ekhtml_string_t str;
        str.str = new_bytes;
        str.len = bytes;
        fprintf( stderr, "parsing document\n" );
        ekhtml_parser_feed( this->ekparser, &str );
        fprintf( stderr, "flushing\n" );
        ekhtml_parser_flush( this->ekparser, 0 );
    }
    fprintf( stderr, "Appending to content buffer\n" );
    // if its html or not always store the new
    // document because we can later do a more
    // extensive analysis of the doc type
    parse_data->page->content.append( new_bytes, bytes );
    fprintf( stderr, "done\n" );
}

void PageProc::
parse_close()
{
    ekhtml_parser_flush( ekparser, 1 );
    ekhtml_parser_destroy( ekparser );
    // set the mod date for the page
    time_t rawtime;
    time ( &rawtime );
    parse_data->page->mod_date = asctime( localtime( &rawtime ) );
    // free our parse data
    delete parse_data;
}

#0 0x4207c45c in memcpy () from /lib/tls/libc.so.6
#1 0x4029b40e in std::string::_Rep::_M_clone(std::allocator<char>
const&, unsigned) () from /usr/lib/libstdc++.so.5
#2 0x40299146 in std::string::reserve(unsigned) () from
/usr/lib/libstdc++.so.5
#3 0x40299642 in std::string::append(char const*, unsigned) () from
/usr/lib/libstdc++.so.5
#4 0x0804ed1a in handle_clear_text (cbdata=0x806c000, str=0x402c939c)
at PageProc.cc:195
#5 0x40218fc2 in ekhtml_parse_special (parser=0x4214abdc,
state_data=0x806716c,
curp=0x200fa64 <Address 0x200fa64 out of bounds>, endp=0x42134014
"", baddata=0xbfffdc64) at ekhtml_special.c:64
#6 0x402185ec in ekhtml_parser_flush (parser=0x8067120, flushall=0) at
ekhtml.c:173
#7 0x0804e4fb in PageProc::parse_feed(char const*, unsigned)
(this=0xbfffdfc0,
new_bytes=0x805dc91 "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01
Transitional//EN\">\n<html lang=\"en\">\n<head>\n\t<meta
http-equiv=\"content-type\" content=\"text/html;
charset=iso-8859-1\">\n\t<meta http-equiv=\"refresh\" content=\"1"...,
bytes=517) at PageProc.cc:54
#8 0x0804ee16 in retrieve_doc(void*, unsigned, unsigned, PageProc*)
(ptr=0x805dc91, size=1, nmemb=517, parser=0xbfffdfc0)
at PageProc.cc:213
#9 0x400227c4 in Curl_client_write (data=0x805d610, type=1,
ptr=0x805dc91 "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01
Transitional//EN\">\n<html lang=\"en\">\n<head>\n\t<meta
http-equiv=\"content-type\" content=\"text/html;
charset=iso-8859-1\">\n\t<meta http-equiv=\"refresh\" content=\"1"...,
len=517) at sendf.c:309
#10 0x400322bf in Curl_httpchunk_read (conn=0x8068fe0,
datap=0x805dc91 "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01
Transitional//EN\">\n<html lang=\"en\">\n<head>\n\t<meta
http-equiv=\"content-type\" content=\"text/html;
charset=iso-8859-1\">\n\t<meta http-equiv=\"refresh\" content=\"1"...,
length=1407, wrote=0xbfffddb4) at http_chunks.c:183
#11 0x4003052c in Curl_readwrite (conn=0x8068fe0, done=0xbfffdf0f "") at
transfer.c:841
#12 0x4003155d in Transfer (conn=0x8068fe0) at transfer.c:1318
#13 0x40031b3a in Curl_perform (data=0x805d610) at transfer.c:1657
#14 0x40031ebc in curl_easy_perform (curl=0x805d610) at easy.c:247
#15 0x0804a5df in main () at page_proc_test.cc:22
#16 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6
|
|
From: Mladen T. <mt...@ap...> - 2003-03-06 19:42:45
|
> -----Original Message----- > From: Jon Travis > Excellent, thanks for the patches. > > Ideally I'd like to centralize these defines, so I don't > have to change them in multiple locations when a release > occurs. Any way that these #s can be gleaned from the > configure.in and this file autogenerated? > > -- Jon You'll need some kind of parser to do that, like awk or VBScript. Those EKHTML_VER_* defines are never used, though, so it doesn't matter for now ;). MT. |
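As a rough sketch of the "some kind of parser" idea, the version defines could be generated from a single version string. Everything here is an assumption made for illustration: that the version is available as a dotted string such as the 0.3.3 mentioned on this list, that its fields map in order to MAJOR, MINOR, and BUGFIX, and that the helper names `parse_version` and `format_version_defines` are hypothetical. How the string is extracted from configure.in is left out.

```c
#include <stdio.h>

/* Hypothetical helper: split "MAJOR.MINOR.BUGFIX" into three numbers.
 * Returns 1 on success, 0 if the string does not have that shape. */
static int parse_version(const char *s, int *major, int *minor, int *bugfix)
{
    char trailing;
    /* The extra %c catches trailing junk such as "0.3.3x": a clean
     * string converts exactly three items. */
    return sscanf(s, "%d.%d.%d%c", major, minor, bugfix, &trailing) == 3;
}

/* Write the three defines into out; returns the snprintf() result. */
static int format_version_defines(char *out, size_t outlen,
                                  int major, int minor, int bugfix)
{
    return snprintf(out, outlen,
                    "#define EKHTML_VER_MAJOR %d\n"
                    "#define EKHTML_VER_MINOR %d\n"
                    "#define EKHTML_VER_BUGFIX %d\n",
                    major, minor, bugfix);
}
```

A small generator built from these two functions could run at build time and write the result into the config header, so the numbers would live in one place only.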
|
From: Jon T. <jt...@p0...> - 2003-03-06 19:08:17
|
Excellent, thanks for the patches.
Ideally I'd like to centralize these defines, so I don't
have to change them in multiple locations when a release
occurs. Any way that these #s can be gleaned from the
configure.in and this file autogenerated?
-- Jon
On Thursday, March 6, 2003, at 05:25 AM, Mladen Turk wrote:
> Hi all,
>
> Here are some changes that enable compiling on WIN32.
>
> 1. Use the ekhtml_config.hw (like APR's apr.hw, etc...)
>
> Since there isn't autoconf on WIN32 here is a file that can be copied
> to
> ekhtml_config.h during building of mktables. I'm using VC7 so didn't
> supply
> the patch for .dsp's. So here it is:
>
> /* ekhtml_config.hw
> This file is in the public domain.
>
> Descriptive text for the C preprocessor macros that
> the distributed Autoconf macros can define.
> No software package will use all of them; autoheader copies the ones
> your configure.in uses into your configuration header file
> templates.
>
> The entries are in sort -df order: alphabetical, case insensitive,
> ignoring punctuation (such as underscores). Although this order
> can split up related entries, it makes it easier to check whether
> a given entry is in the file.
>
> Leave the following blank line there!! Autoheader needs it. */
>
> #define EKHTML_HASH_BITS 32
> #define EKHTML_VER_BUGFIX 0
> #define EKHTML_VER_MAJOR 3
> #define EKHTML_VER_MINOR 3
>
> #ifdef _MSC_VER
> #define inline __inline
> #endif
>
>
> 2. Runtime fixes for mktables debug builds:
>
> Assertion in msvcrt caused by the fact that all the
> characters larger than 127 are transformed to negative integer values.
> So mktables fails.
> Simple casting to unsigned char resolves that.
>
>
> RCS file: /cvsroot/ekhtml/ekhtml/src/ekhtml_mktables.c,v
> retrieving revision 1.2
> diff -u -3 -r1.2 ekhtml_mktables.c
> --- ekhtml_mktables.c 22 Sep 2002 04:49:57 -0000 1.2
> +++ ekhtml_mktables.c 6 Mar 2003 13:15:09 -0000
> @@ -46,13 +46,13 @@
>
> /* valid_tagname: Character map for a tagname AFTER the first letter
> */
> static EKHTML_CHARMAP_TYPE valid_tagname(char in){
> - if(in == '-' || in == '.' || isdigit(in) || isalpha(in))
> + if(in == '-' || in == '.' || isdigit((unsigned char)in) ||
> isalpha((unsigned char)in))
> return 1;
> return 0;
> }
>
> static EKHTML_CHARMAP_TYPE valid_whitespace(char in){
> - return isspace(in) ? 1 : 0;
> + return isspace((unsigned char)in) ? 1 : 0;
> }
>
> /* attribute name AFTER the first character */
> @@ -75,13 +75,13 @@
> }
>
> static EKHTML_CHARMAP_TYPE valid_begattrname(char in){
> - return (isalpha(in) || in == '_') ? 1 : 0;
> + return (isalpha((unsigned char)in) || in == '_') ? 1 : 0;
> }
>
> static EKHTML_CHARMAP_TYPE ekhtml_state(char in){
> if(in == '/')
> return EKHTML_STATE_ENDTAG;
> - if(isalpha(in))
> + if(isalpha((unsigned char)in))
> return EKHTML_STATE_STARTTAG;
> if(in == '!')
> return EKHTML_STATE_NONE; /* Must be determined by caller */
>
>
>
> MT.
|
|
From: Mladen T. <mt...@ap...> - 2003-03-06 13:28:54
|
Hi all,
Here are some changes that enable compiling on WIN32.
1. Use ekhtml_config.hw (like APR's apr.hw, etc.)
Since there is no autoconf on WIN32, here is a file that can be copied to
ekhtml_config.h when building mktables. I'm using VC7, so I didn't supply
a patch for the .dsp files. Here it is:
/* ekhtml_config.hw
This file is in the public domain.
Descriptive text for the C preprocessor macros that
the distributed Autoconf macros can define.
No software package will use all of them; autoheader copies the ones
your configure.in uses into your configuration header file templates.
The entries are in sort -df order: alphabetical, case insensitive,
ignoring punctuation (such as underscores). Although this order
can split up related entries, it makes it easier to check whether
a given entry is in the file.
Leave the following blank line there!! Autoheader needs it. */
#define EKHTML_HASH_BITS 32
#define EKHTML_VER_BUGFIX 0
#define EKHTML_VER_MAJOR 3
#define EKHTML_VER_MINOR 3
#ifdef _MSC_VER
#define inline __inline
#endif
2. Runtime fixes for mktables debug builds:
Debug builds hit an assertion in msvcrt because all characters larger
than 127 are converted to negative integer values when passed as plain
char, so mktables fails.
Simply casting to unsigned char resolves that.
RCS file: /cvsroot/ekhtml/ekhtml/src/ekhtml_mktables.c,v
retrieving revision 1.2
diff -u -3 -r1.2 ekhtml_mktables.c
--- ekhtml_mktables.c 22 Sep 2002 04:49:57 -0000 1.2
+++ ekhtml_mktables.c 6 Mar 2003 13:15:09 -0000
@@ -46,13 +46,13 @@

 /* valid_tagname: Character map for a tagname AFTER the first letter */
 static EKHTML_CHARMAP_TYPE valid_tagname(char in){
-    if(in == '-' || in == '.' || isdigit(in) || isalpha(in))
+    if(in == '-' || in == '.' || isdigit((unsigned char)in) || isalpha((unsigned char)in))
         return 1;
     return 0;
 }

 static EKHTML_CHARMAP_TYPE valid_whitespace(char in){
-    return isspace(in) ? 1 : 0;
+    return isspace((unsigned char)in) ? 1 : 0;
 }

 /* attribute name AFTER the first character */
@@ -75,13 +75,13 @@
 }

 static EKHTML_CHARMAP_TYPE valid_begattrname(char in){
-    return (isalpha(in) || in == '_') ? 1 : 0;
+    return (isalpha((unsigned char)in) || in == '_') ? 1 : 0;
 }

 static EKHTML_CHARMAP_TYPE ekhtml_state(char in){
     if(in == '/')
         return EKHTML_STATE_ENDTAG;
-    if(isalpha(in))
+    if(isalpha((unsigned char)in))
         return EKHTML_STATE_STARTTAG;
     if(in == '!')
         return EKHTML_STATE_NONE; /* Must be determined by caller */
MT.
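For readers wondering why the cast matters, here is a small standalone sketch. The helper names `is_alpha_byte` and `as_plain_int` are made up for this example. The `<ctype.h>` functions take an int that must be either EOF or a value representable as unsigned char. On platforms where plain char is signed, a byte above 127 becomes a negative int, which is outside that range and can trip the assertion in a debug C runtime.

```c
#include <ctype.h>
#include <limits.h>

/* Safe wrapper: convert to unsigned char before calling a <ctype.h>
 * function, so bytes above 127 stay in the 0..UCHAR_MAX range. */
static int is_alpha_byte(char in)
{
    return isalpha((unsigned char)in) ? 1 : 0;
}

/* Shows the value a <ctype.h> function would receive without the cast.
 * Where char is signed, bytes above 127 come out negative. */
static int as_plain_int(char in)
{
    return (int)in;
}
```

The cast changes only the value handed to the classification function. The character maps built from the results are unaffected for bytes below 128.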
|
|
From: Jon T. <jt...@p0...> - 2003-03-03 05:21:25
|
Hey Dmitri, sorry for the delay on your patch. I seem to have dropped off on that one. Lemme take a look at it again tomorrow and we'll either commit it or discuss it. -- Jon On Sunday, March 2, 2003, at 08:47 PM, Dmitri Tikhonov wrote: > Hi, > > any idea when 0.3.3 is coming out? I saw that configure.in was > modified > two months ago -- the version was upped to 0.3.3. However, there's no > tag. Also, there's an outstanding issue: > > http://sourceforge.net/tracker/ > index.php?func=detail&aid=658740&group_id=62314&atid=500165 > > Will my patch be accepted? > > The development of ekhtml seems to have stopped for some reason. I > volunteer to help out. > > - Dmitri. > > > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. > http://thinkgeek.com/sf > _______________________________________________ > ekhtml-devel mailing list > ekh...@li... > https://lists.sourceforge.net/lists/listinfo/ekhtml-devel > |
|
From: Dmitri T. <dm...@ne...> - 2003-03-03 04:51:53
|
Hi, any idea when 0.3.3 is coming out? I saw that configure.in was modified two months ago -- the version was upped to 0.3.3. However, there's no tag. Also, there's an outstanding issue: http://sourceforge.net/tracker/index.php?func=detail&aid=658740&group_id=62314&atid=500165 Will my patch be accepted? The development of ekhtml seems to have stopped for some reason. I volunteer to help out. - Dmitri. |