From: Frazier, J. J. <Joe...@Pe...> - 2002-03-08 14:02:32
Well, for converting HTML to RTF, I believe Johan meant that you should be using an HTML parser and an RTF generator to:

    read the HTML file, watching for events
    when an event happens, check the event data, such as which tag fired the event
    then pass that info, along with the (tag) data, off to the RTF generator object

This would be very similar to how the XML::SAX* modules work. I have only worked with the XML::SAX* modules a few times, but basically you write your own package object and use XML::SAX* to capture events in the source file. It passes these to your package subs, where you can do conditional processing based on which event is sent. You then pass this data wherever you need it, usually to an XML writer. Basically, this is a way to transform one XML document into something else: XML, HTML, CSV, or whatever format you can write up.

While going with this approach would take a little longer:

1) its main advantage is that it is easier to package into a real module to share (hint)
2) it's extensible
3) with the events already defined in HTML and the events already defined in RTF output, it will be far less work to change the parsing rules than with the roll-your-own approach taken in the sub below (you don't have to worry about ...)

I can "agree" with you on your point about RTF::Parser's lack of documentation, but it is still a decent prebuilt package. Generally, "we" end up missing something when trying to do manually what a module has already been built to do.

I tried RTF::Parser's rtf2html.bat and found it did a very good job. Now, granted, I did not pass anything odd into it, but it created very nice HTML output.

Hope this helps.

Joe Frazier, Jr.
Technical Support Engineer
Peopleclick Service Support
Tel: +1-800-841-2365
E-Mail: mailto:su...@pe...

> -----Original Message-----
> From: Ultimate Red Dragon [mailto:scc...@ho...]
> Sent: Thursday, March 07, 2002 6:22 PM
> To: per...@li...
> Subject: [perl-win32-gui-users] Re: Re: RTF 2 HTML
>
>
> Well, in reply to Johan: I'll admit that I kind of knew those were
> there, but the documentation on them is either horrible or non-existent
> (depending on which RTF modules you look at). As for the HTML2RTF, I
> know of no already existing interpreter, but I plan on using
> HTML::Parser to make it simpler.
>
> Anyway, I managed to get it to properly translate '<', '>' and '&' into
> their HTML counterparts. Please point out any bugs or suggestions you
> have.
>
> sub rtf2html{
>     my $re = $main->reDesc; #Just set this to the RichEdit object
>     my $oldtext = $re->Text();
>
>     # Record the position of every character that needs an HTML entity.
>     my @escapes;
>     {
>         my $temp = -1;
>         while(($temp = index($oldtext,'<',$temp+1)) != -1){
>             push(@escapes,[$temp,'&lt;']);
>         }
>         $temp = -1;
>         while(($temp = index($oldtext,'>',$temp+1)) != -1){
>             push(@escapes,[$temp,'&gt;']);
>         }
>         $temp = -1;
>         while(($temp = index($oldtext,'&',$temp+1)) != -1){
>             push(@escapes,[$temp,'&amp;']);
>         }
>     }
>
>     # Sort by position so the entries can be consumed in order below.
>     @escapes = sort({ $a->[0] <=> $b->[0] } @escapes);
>     foreach (@escapes){
>         print $_->[0]." = ".$_->[1]."\n";   # debug output
>     }
>
>     my $i = 0;
>     my $b = 0;
>     my $u = 0;
>     my $text = '';
>
>     my $offset = 0;
>     # Walk the text one character at a time; the last valid index is
>     # length - 1.
>     foreach my $x (0..length($oldtext)-1){
>         $re->Select($x,$x+1);
>         my %att = $re->GetCharFormat();
>         # Emit an opening or closing tag whenever a format flag flips.
>         if(($i && !exists($att{-italic})) || (!$i && exists($att{-italic}))){
>             $i = $att{-italic};
>             $text .= ($i ? '<I>' : '</I>');
>         }
>         if(($b && !exists($att{-bold})) || (!$b && exists($att{-bold}))){
>             $b = $att{-bold};
>             $text .= ($b ? '<B>' : '</B>');
>         }
>         if(($u && !exists($att{-underline})) || (!$u && exists($att{-underline}))){
>             $u = $att{-underline};
>             $text .= ($u ? '<U>' : '</U>');
>         }
>         # Checking @escapes first avoids autovivifying $escapes[0].
>         if(@escapes && $x == $escapes[0]->[0]){
>             my $temp = shift(@escapes);
>             $text .= $temp->[1];
>         }else{
>             $text .= substr($oldtext,$x,1);
>         }
>     }
>     $text =~ s/\r//g;
>     $text =~ s/\n/<BR>/g;
>     return $text;
> }
>
>
>
> Date: Thu, 07 Mar 2002 09:47:52 +0100
> To: per...@li...
> From: Johan Lindstrom <jo...@ba...>
> Subject: Re: [perl-win32-gui-users] RTF 2 HTML
>
> At 23:37 2002-03-06 -0500, Ultimate Red Dragon wrote:
> >It's not that great, I don't claim it's efficient, just that it works.
> >
> >Currently, it supports new lines, bold, italics and underline.
>
> This seems to be similar to what you want:
> http://search.cpan.org/search?dist=RTF-Parser
>
>
> >I'm working on converting < and > correctly, as well as a HTML 2 RTF sub
> >(or is there already one?)
>
> There are HTML parsers and RTF generators on CPAN.
>
> Here is the search for module names with RTF:
> http://search.cpan.org/search?mode=module&query=rtf
> (but note that you often can get a lot more results by searching the
> documentation rather than the module name)
>
>
> /J
>
> -------- ------ ---- --- -- -- -- - - - - -
> Johan Lindström    Sourcerer @ Boss Casinos    jo...@ba...
>
> Latest bookmark: "(GUI) Windows Programming FAQ"
> http://www.perlmonks.org/index.pl?node_id=108708
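
For illustration, the event-driven HTML-to-RTF idea described at the top of this message (parse the HTML, react to tag events, hand the result to something that writes RTF) could look roughly like the sketch below. It is not code from anyone in this thread. It assumes HTML::Parser (version 3 handler API) is installed, handles only <b>, <i>, <u> and <br>, and writes RTF control words by hand instead of going through an RTF generator module. The names html2rtf, rtf_escape and %RTF_FOR are made up for this example.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical table: lowercase HTML tag name => [RTF "on" code, RTF "off" code].
my %RTF_FOR = (
    b => [ '\b ',  '\b0 '     ],
    i => [ '\i ',  '\i0 '     ],
    u => [ '\ul ', '\ulnone ' ],
);

# Backslash-escape the three characters that are special in RTF text.
sub rtf_escape {
    my ($text) = @_;
    $text =~ s/([\\{}])/\\$1/g;
    return $text;
}

# Convert a small subset of HTML to an RTF string (assumed, simplified output).
sub html2rtf {
    my ($html) = @_;
    my $rtf = '{\rtf1\ansi ';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Start-tag event: check which tag fired and emit the matching code.
        start_h => [ sub {
            my ($tag) = @_;
            if    ($tag eq 'br')   { $rtf .= '\line ' }
            elsif ($RTF_FOR{$tag}) { $rtf .= $RTF_FOR{$tag}[0] }
        }, 'tagname' ],
        # End-tag event: switch the formatting back off.
        end_h => [ sub {
            my ($tag) = @_;
            $rtf .= $RTF_FOR{$tag}[1] if $RTF_FOR{$tag};
        }, 'tagname' ],
        # Text event: 'dtext' hands over text with entities already decoded.
        text_h => [ sub { $rtf .= rtf_escape($_[0]) }, 'dtext' ],
    );

    $parser->parse($html);
    $parser->eof;
    return $rtf . '}';
}

print html2rtf('<B>bold</B> &amp; <i>italic</i><br>a {brace}'), "\n";
```

Adding support for another tag means adding one entry to %RTF_FOR (or one branch in the start handler), which is the "less work to change the parsing rules" advantage mentioned above. Swapping the hand-written control words for a real RTF generator module would only touch the three handlers.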