From: Budd, S. <s....@ic...> - 2002-06-10 09:04:40
|
Star Office can save directly to good HTML format or to MS Office 2000/XP format. The choice is made within the "Save As" option. You should be able to index directly from the saved document if you choose the "Web page" format.

-----Original Message-----
From: Gilles Detillieux [mailto:gr...@sc...]
Sent: Saturday, June 08, 2002 4:31 AM
To: cba...@eu...
Cc: htd...@li...
Subject: Re: [htdig-dev] openoffice parser

According to EuropeanServers - Christophe BAEGERT:
> we use htdig on word documents, but now we've switched to OpenOffice.org, and
> we haven't any parser. Does it exist or is it planned ?

I haven't heard of or found leads to an OpenOffice.org to HTML document converter. However, the OpenOffice.org web site states that these documents are XML, so it should be pretty easy for someone familiar with basic HTML, and with Perl, awk or sed scripting, to whip up a rudimentary XML to HTML converter specific to these documents, so it could pick out the elements you want and surround them with appropriate HTML tags so htdig indexes them using the word types you want (i.e. titles, meta keywords & description, hyperlinks and their descriptions, plain text). Even simpler, you could probably feed the XML straight into htdig's HTML parser and it would at least index most/all the text as plain text. Not having actually seen any OpenOffice.org documents, though, it's hard for me to speculate on exactly how easy or difficult the task might be.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada) |
|
From: Neal R. <ne...@ri...> - 2002-06-09 19:53:26
|
Hey, is there a method somewhere to remove a document from an index? I thought I found one but can't locate it again. This will hopefully be a method added to the libhtdig API, and is useful in the context of large indexes, where rebuilding a huge index to remove one document is inefficient.

As a note, I will try to submit a patch and set of makefiles to do a native WIN32 port of HtDig & libhtdig this week. The way this will work is that, using a WIN32 system with cygwin and MSVC installed, a separate set of makefiles is used to build a fully native WIN32 set of binaries (no cygwin DLL needed). Cygwin is needed since these makefiles use GNU make instead of Windows make, but MSVC's command-line compiler is used. A ./configure is not necessary, but a setup script will replace/modify the existing db/db_config.h & include/htconfig.h. Any feedback on how you would like this to work better, let me know!

There are free native Windows compilers out there. Borland & Watcom have them for download. Watcom's is Open Source as well: http://www.openwatcom.org

Thanks!

--
Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485 |
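The deletion Neal is after can be illustrated in miniature. This is a hypothetical sketch using Python's `dbm` module against a toy key-value file; it is not htdig's actual doc_db record format nor the libhtdig API (which did not exist yet at the time of writing), only the general idea that a Berkeley-DB-style file supports deleting one record in place rather than rebuilding the whole index.

```python
import dbm
import os
import tempfile

# Toy stand-in for an index database. htdig's real doc_db is a Berkeley DB
# file holding binary DocumentRef records -- this layout is purely illustrative.
path = os.path.join(tempfile.mkdtemp(), "toy_index")

with dbm.open(path, "n") as db:
    db["doc1"] = "contents of first document"
    db["doc2"] = "contents of second document"
    db["doc3"] = "contents of third document"

# Removing one document is a single key deletion against the existing file,
# rather than re-crawling everything and rebuilding the database from scratch.
with dbm.open(path, "w") as db:
    del db["doc2"]

with dbm.open(path, "r") as db:
    print(sorted(db.keys()))  # -> [b'doc1', b'doc3']
```

A real implementation would also have to purge the document's words from the word database, which is what makes single-document removal in htdig more involved than this sketch suggests.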
|
From: Geoff H. <ghu...@ws...> - 2002-06-09 17:26:33
|
On Sunday, June 9, 2002, at 03:19 AM, Scott Gifford wrote:
>> * Handle noindex_start & noindex_end as string lists.
> I posted a patch to allow up to 11 noindex_start/noindex_end pairs
> back in March. It's a bit kludgey, but I find it useful.

Yes, but this isn't really what people are looking for. When we say "string lists," we mean something like this:

noindex_start: <style <noindex <htdig-noindex <foo <bar ...

While your patch is fine and I'm sure others get some use out of it too, I don't think we're going to accept it. At least, that's my $0.02.

-Geoff |
|
From: Geoff H. <ghu...@us...> - 2002-06-09 07:14:19
|
STATUS of ht://Dig branch 3-2-x
RELEASES:
3.2.0b4: In progress
3.2.0b3: Released: 22 Feb 2001.
3.2.0b2: Released: 11 Apr 2000.
3.2.0b1: Released: 4 Feb 2000.
SHOWSTOPPERS:
KNOWN BUGS:
* Odd behavior with $(MODIFIED) and scores not working with
wordlist_compress set but work fine without wordlist_compress.
(the date is definitely stored correctly, even with compression on
so this must be some sort of weird htsearch bug)
* Not all htsearch input parameters are handled properly: PR#648. Use a
consistent mapping of input -> config -> template for all inputs where
it makes sense to do so (everything but "config" and "words"?).
* If exact isn't specified in the search_algorithms, $(WORDS) is not set
correctly: PR#650. (The documentation for 3.2.0b1 is updated, but can
we fix this?)
* META descriptions are somehow added to the database as FLAG_TITLE,
not FLAG_DESCRIPTION. (PR#859)
PENDING PATCHES (available but need work):
* Additional support for Win32.
* Memory improvements to htmerge. (Backed out b/c htword API changed.)
* MySQL patches to 3.1.x to be forward-ported and cleaned up.
(Should really only attempt to use SQL for doc_db and related, not word_db)
NEEDED FEATURES:
* Field-restricted searching.
* Return all URLs.
* Handle noindex_start & noindex_end as string lists.
* Handle local_urls through file:// handler, for mime.types support.
* Handle directory redirects in RetrieveLocal.
* Merge with mifluz
TESTING:
* httools programs:
(htload a test file, check a few characteristics, htdump and compare)
* Turn on URL parser test as part of test suite.
* htsearch phrase support tests
* Tests for new config file parser
* Duplicate document detection while indexing
* Major revisions to ExternalParser.cc, including fork/exec instead of popen,
argument handling for parser/converter, allowing binary output from an
external converter.
* ExternalTransport needs testing of changes similar to ExternalParser.
DOCUMENTATION:
* List of supported platforms/compilers is ancient.
* Add thorough documentation on htsearch restrict/exclude behavior
(including '|' and regex).
* Document all of htsearch's mappings of input parameters to config attributes
to template variables. (Relates to PR#648.) Also make sure these config
attributes are all documented in defaults.cc, even if they're only set by
input parameters and never in the config file.
* Split attrs.html into categories for faster loading.
* require.html is not updated to list new features and disk space
requirements of 3.2.x (e.g. phrase searching, regex matching,
external parsers and transport methods, database compression.)
* TODO.html has not been updated for current TODO list and completions.
OTHER ISSUES:
* Can htsearch actually search while an index is being created?
(Does Loic's new database code make this work?)
* The code needs a security audit, esp. htsearch
* URL.cc tries to parse malformed URLs (which causes further problems)
(It should probably just set everything to empty) This relates to
PR#348.
|
|
From: Gilles D. <gr...@sc...> - 2002-06-08 03:30:47
|
According to EuropeanServers - Christophe BAEGERT:
> we use htdig on word documents, but now we've switched to OpenOffice.org, and
> we haven't any parser. Does it exist or is it planned ?

I haven't heard of or found leads to an OpenOffice.org to HTML document converter. However, the OpenOffice.org web site states that these documents are XML, so it should be pretty easy for someone familiar with basic HTML, and with Perl, awk or sed scripting, to whip up a rudimentary XML to HTML converter specific to these documents, so it could pick out the elements you want and surround them with appropriate HTML tags so htdig indexes them using the word types you want (i.e. titles, meta keywords & description, hyperlinks and their descriptions, plain text). Even simpler, you could probably feed the XML straight into htdig's HTML parser and it would at least index most/all the text as plain text. Not having actually seen any OpenOffice.org documents, though, it's hard for me to speculate on exactly how easy or difficult the task might be.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada) |
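The rudimentary converter Gilles describes could be sketched roughly like this. The element names and namespace URIs below are invented stand-ins for OpenOffice.org's actual XML vocabulary (which, as the message notes, neither poster had examined), and `oo_xml_to_html` is a hypothetical name:

```python
import xml.etree.ElementTree as ET

def oo_xml_to_html(xml_text):
    """Very rough XML-to-HTML converter sketch for indexing purposes.

    Picks out heading and paragraph elements and wraps their text in
    HTML tags so an indexer like htdig can weight titles and body text
    differently. The tag names matched here are assumptions.
    """
    root = ET.fromstring(xml_text)
    parts = ["<html><body>"]
    for elem in root.iter():
        tag = elem.tag.split("}")[-1]      # drop any XML namespace prefix
        text = (elem.text or "").strip()
        if not text:
            continue
        if tag == "h":                      # heading -> treat as a title
            parts.append("<h1>%s</h1>" % text)
        elif tag == "p":                    # paragraph -> plain body text
            parts.append("<p>%s</p>" % text)
    parts.append("</body></html>")
    return "\n".join(parts)

# Toy document loosely mimicking an XML word-processor format.
doc = """<office:document xmlns:office="urn:example-oo" xmlns:text="urn:example-oo-text">
  <office:body>
    <text:h>Quarterly Report</text:h>
    <text:p>Sales were up this quarter.</text:p>
  </office:body>
</office:document>"""

print(oo_xml_to_html(doc))
```

A real converter would also need to handle the ZIP packaging of OpenOffice.org files and emit meta keywords/description tags, but the skeleton above matches the approach the message proposes.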
|
From: Martin V. <mv...@PD...> - 2002-06-07 05:40:53
|
Geoff Hutchison wrote:
> > *what are the webserver requirements?
>
> Hardware? Software? For indexing, ht://Dig can index any webserver that
> understands HTTP (i.e. all of them), though there have been reports of
> strange quirks with Lotus Notes webservers. For running results, you
> simply need the htsearch CGI and a CGI-webserver (i.e. just about
> everything). UNIX-based servers are preferred, but there are
> users who run ht://Dig on Windows as well--though it's flakier.

There also is a VMS port of ht://Dig 3.1.6 (by yours truly) which I know is run by some DEC^3^HCompaq^6^HHP Customer Support Center to index the VMS documentation and source listings. The overall feedback I got was very positive, although some SWISH-E fan criticized the slowness of indexing - well, ht://Dig offers a lot more than SWISH, or does it? It works perfectly under the Apache port, and (using GET queries [1]) with Purveyor (an old but rock-solid commercial web server).

cu,
Martin

[1] no POST because Purveyor doesn't support stdin

--
So long, and thanks       | Martin Vorlaender  | VMS & WNT programmer
for all the books...      | work: mv...@pd...
In Memoriam Douglas Adams | http://www.pdv-systeme.de/users/martinv/
1952-2001                 | home: ma...@ra... |
|
From: Geoff H. <ghu...@ws...> - 2002-06-06 20:49:53
|
> *what is the maximum indexing speed?
> *what's the peak query rate?

What's your hardware? How large are the documents that you're indexing? How fast is your network, or will you be indexing the local server rather than across the network? For queries, will you be running other services on the server while queries are performed? How many documents are likely to be returned? Are you talking about queries on a single server or a server farm? No offense, but I can't give you any reasonable number here. Suffice to say, ht://Dig is quite fast and well within the realm of commercial products. Personally, I'd be wary of someone giving you an actual number from your query alone.

> *what are the rough disk space requirements for the software (not the
> index, I mean for gcc/g++, ht://dig, and any other extra software
> downloads I may need)?

For the software? Depends on what sort of server you're running. Most UNIX servers have gcc/g++ already installed. The size of the ht://Dig binaries varies a bit by platform, but is probably in the realm of 2-3MB.

> *what are the webserver requirements?

Hardware? Software? For indexing, ht://Dig can index any webserver that understands HTTP (i.e. all of them), though there have been reports of strange quirks with Lotus Notes webservers. For running results, you simply need the htsearch CGI and a CGI-webserver (i.e. just about everything). UNIX-based servers are preferred, but there are users who run ht://Dig on Windows as well--though it's flakier.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/ |
|
From: <fi...@de...> - 2002-06-06 17:38:11
|
I am rather embarrassed to report that my email was not functioning, and the logs show that you emailed me with the data I requested earlier in the week. I would like to thank you and apologize profusely. Please be so good as to send another copy of the following data:

*what is the maximum indexing speed?
*what's the peak query rate?
*what are the rough disk space requirements for the software (not the index, I mean for gcc/g++, ht://dig, and any other extra software downloads I may need)?
*what are the webserver requirements?

Thank you again very much for your time and help, it is greatly appreciated.

Andy Fischer
Fi...@de...
Maxim Incorporated Products
120 San Gabriel Drive
Sunnyvale, CA, 94086 |
|
From: <fi...@de...> - 2002-06-05 21:16:59
|
I'm doing research on search engines for my company website, and I have a few questions regarding ht://dig. I cannot complete my report without these bits of data, so I would greatly appreciate it if anyone could answer them quickly.

*what is the maximum indexing speed?
*what's the peak query rate?
*what are the rough disk space requirements for the software (not the index, I mean for gcc/g++, ht://dig, and any other extra software downloads I may need)?
*what are the webserver requirements?

Thank you very much for your help.

Andy Fischer
Fi...@de...
Maxim Incorporated Products
120 San Gabriel Drive
Sunnyvale, CA, 94086 |
|
From: Deal, D. <DD...@aa...> - 2002-06-05 17:29:43
|
Dear htdig devs, Currently, AARP uses ht://dig on its Web site. According to our lead tech guy, htdig does NOT allow for phrase searches without using the Boolean search - for example, placing quotations around "EAP alive in Congress" and finding that phrase in articles on the AARP web site. However, some of the sites that you have listed as htdig users DO allow phrase searches. How do they do it? How could we put phrase searches on the AARP web site without using the Boolean search? Please help, David Deal 202-434-3479 |
|
From: Gilles D. <gr...@sc...> - 2002-06-04 21:50:55
|
According to Paul Lyon:
> On Tue, 4 Jun 2002, Gilles Detillieux wrote:
> > According to Paul Lyon:
> > > Folks,
> > >
> > > I recently reported a bug about getting an error message from
> > > htsearch about the file size of the words.db file not being
> > > a multiple of the page size. This turns out to have been caused by a
> > > version mismatch between htsearch and htmerge, in turn caused by my ...
> > Did these happen to be bug numbers 551984 and 551990? If so, I'll annotate
> > and close them.
> Correct. I didn't give the bug numbers, partly as I had forgotten them,
> and partly because the bug reports had disappeared from the sourceforge
> bug list (otherwise I would have added something to same.)

By default, the SourceForge bug tracker only displays Open bug reports, although you can select other report statuses. These reports are frequently marked Pending or Closed after responding to them, unless there are compelling reasons to leave them open. I had closed these already, when I thought I had given the appropriate resolution to them. When you responded to my response, you left it closed (which I think is all you can do unless you have a SourceForge account) and I never got around to reopening it. So, I'll just leave it closed but add the note.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada) |
|
From: Gilles D. <gr...@sc...> - 2002-06-04 20:48:44
|
According to Paul Lyon:
> Folks,
>
> I recently reported a bug about getting an error message from
> htsearch about the file size of the words.db file not being
> a multiple of the page size. This turns out to have been caused by a
> version mismatch between htsearch and htmerge, in turn caused by my
> failing to specify the "prefix" properly (to "/usr/local" instead of the
> default "/opt/www") when running configure. I have recently built the
> latest 3.2.0b4 snapshot with --prefix=/usr/local to configure, and then
> rebuilt two search databases. Now htsearch seems to work fine with
> these.
>
> Sorry to file a bogus bug report. Please disregard same.
>
> Ciao,
>
> Paul

Did these happen to be bug numbers 551984 and 551990? If so, I'll annotate and close them.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada) |
|
From: Paul L. <pd...@la...> - 2002-05-31 15:47:46
|
Folks, I recently reported a bug about getting an error message from htsearch about the file size of the words.db file not being a multiple of the page size. This turns out to have been caused by a version mismatch between htsearch and htmerge, in turn caused by my failing to specify the "prefix" properly (to "/usr/local" instead of the default "/opt/www") when running configure. I have recently built the latest 3.2.0b4 snapshot with --prefix=/usr/local to configure, and then rebuilt two search databases. Now htsearch seems to work fine with these. Sorry to file a bogus bug report. Please disregard same. Ciao, Paul -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Paul Lyon | "Without true justice Liberal Arts Computer Lab | there can be no peace." University of Texas at Austin | Lucretia Coffin Mott email: pd...@la... | 'phone: 512-232-7272 *** fax: 512-471-1061 | =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= |
|
From: EuropeanServers - C. B. <cba...@eu...> - 2002-05-30 11:47:16
|
Hi, we use htdig on word documents, but now we've switched to OpenOffice.org, and we haven't any parser. Does it exist or is it planned ? Regards, -- Christophe BAEGERT cba...@eu... >>>>>>>>>>>>> http://www.europeanservers.net <<<<<<<<<<<<< --------------- Ultra fast internet servers -------------- |
|
From: Torsten N. <tn...@in...> - 2002-05-30 10:52:56
|
Geoff Hutchison wrote:
>
> On Tue, 28 May 2002, Elaine Fortin wrote:
>
> > We want to store faxes and be able to search them with htdig.
> ...
> > In order to be able to search the content, would we have to run them
> > through an OCR program, or is there something else that can translate them?
>
> You'd have to have some sort of OCR in there.
>
> A fax TIFF file is pure graphic--there's very little text content. (TIFF
> files in general can have some useful text info, but I think you're
> looking for the text in the fax, not text that the fax program may or may
> not store in the TIFF.)

OCR software programs need to be "trained" to successfully recognize any textual contents in a graphic. Textual graphics need to be properly aligned in order for the OCR software to successfully recognize the text content as such.

That means: the OCR program needs to know about the font used in the graphics files *plus* there should be little (less than 3°) alignment offset, or else any (even the best commercially available) OCR software program will produce nearly completely unreadable output.

In the case of facsimiles to be indexed by ht://Dig via translating graphics to text content with any given OCR software, one has to take into account that

(a) facsimiles are in most cases *not* correctly enough aligned to be
    analyzed by an OCR program (hand-faxed sheets will normally have
    offsets of 3°+)
(b) facsimiles cannot be controlled with regards to character-based
    training of the OCR software (they will especially never be sent
    using specially designed OCR fonts)
(c) facsimiles (especially hand-transmitted ones) will in many cases
    contain valuable information added in handwriting
(d) facsimiles will contain "useless" information that cannot be skipped
    by text-indexing software like ht://Dig (since there is no way of
    inserting the respective control statements for the analyzing
    software)

All this makes facsimiles (and most scanned texts) nearly unfit for automatic processing with OCR and indexing programs.

cheers,
Torsten

--
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14            Tel: +49-4101-403605
D-25474 Ellerbek            Fax: +49-4101-403606
E-Mail: in...@in...         Internet: http://www.inwise.de
|
|
From: Budd, S. <s....@ic...> - 2002-05-30 09:51:24
|
Hello. Question: what is the cause of the slowing down of a dig? When the dig starts I get more than 200 pages/minute and nearly 90% CPU usage, but near the end I am seeing 20 pages/minute and at most 10% CPU usage. If I fetch the pages from the site being dug at the end, the page comes in very fast (100Mb line), so access to the system is OK. I have set the db cache to 1.3 GByte but htdig seems to only use about 260 MByte. I am putting the databases into /tmp, i.e. into memory; still plenty of space in memory.

du -k /tmp
8 /tmp/.pcmcia
8 /tmp/.X11-unix
8 /tmp/.X11-pipe
8 /tmp/root
8 /tmp/kshroot
8 /tmp/ppp
1351528 /tmp/college
1351624 /tmp

Here is a top:

last pid: 3903; load averages: 0.05, 0.06, 0.07   10:43:05
45 processes: 44 sleeping, 1 on cpu
CPU states: 99.2% idle, 0.2% user, 0.6% kernel, 0.0% iowait, 0.0% swap
Memory: 2048M real, 351M free, 1613M swap in use, 2029M swap free
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
3607 ppp 1 18 4 265M 263M sleep 762:28 5.17% htdig
3900 ppp 1 58 0 2584K 1704K cpu 0:05 0.15% top
306 root 12 49 0 3472K 2456K sleep 1:54 0.00% mibiisa

A -v log:
...............
185822:170323:11:http://www.cc.ic.ac.uk/college/onlinedocs/SASOnlineDocV8/sasdoc/sashtml/gref/zeration.htm: ******* size = 3599
185823:172065:11:http://www.cc.ic.ac.uk/college/onlinedocs/SASOnlineDocV8/sasdoc/sashtml/accdb/z1258182.htm: ******* size = 3228
185824:172066:11:http://www.cc.ic.ac.uk/college/onlinedocs/SASOnlineDocV8/sasdoc/sashtml/accdb/z-dbnull.htm: ******* size = 4636 |
|
From: Gilles D. <gr...@sc...> - 2002-05-29 20:53:52
|
According to Geoff Hutchison:
> On 29 May 2002, Chris Albone wrote:
> > I'm using a somewhat old snapshot of 3.2b4 (from november last yr
> > actually).
>
> I'd obviously suggest grabbing a new snapshot, for a number of reasons.

Well, none of those reasons relate to the problem reported, as there haven't been any htsearch changes since November. They do affect htdig reliability, though.

> > We're having hassles with getting htsearch to use the
> > wrapper.html file. We've added the relevant entry into the config file,
>
> If you read the documentation:
> <http://www.htdig.org/attrs.html#search_results_wrapper>
>
> you'll see that there are a few things to check which you haven't
> mentioned. Is the filename readable/accessible by the webserver user? Does
> it have the $(HTSEARCH_RESULTS) variable?

That part is critical. If the wrapper file doesn't have this pseudo-variable in it, or if the file isn't readable by the user ID under which htsearch runs, then htsearch will fall back on the search_results_header and search_results_footer.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada) |
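A minimal wrapper file along the lines Gilles describes might look like the sketch below. The only thing taken from the thread is that htsearch substitutes its rendered result list for the $(HTSEARCH_RESULTS) pseudo-variable; all surrounding markup here is invented for illustration:

```html
<html>
<head><title>Search results</title></head>
<body>
<!-- htsearch replaces the pseudo-variable below with the result list.
     If it is missing, or the file is unreadable by the htsearch user,
     the wrapper is ignored and the header/footer templates are used. -->
$(HTSEARCH_RESULTS)
</body>
</html>
```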
|
From: Geoff H. <ghu...@ws...> - 2002-05-29 20:10:27
|
On Wed, 29 May 2002, RAHARD Matthieu wrote: > It works fine. Thanks for this excellent software. Thanks for the report! > For your information, I use the version below: > ht://dig: 3.1.6 > AIX 4.3.3 ML09 > gcc/g++: 2.9 I'm not quite sure what you mean by gcc "2.9" since there is no such version. There's 2.8.x and 2.95.x and the various egcs 2.91, 2.92, etc. versions. Could you give us the result of "gcc -v"? Thanks, -- -Geoff Hutchison Williams Students Online http://wso.williams.edu/ |
|
From: Geoff H. <ghu...@ws...> - 2002-05-29 20:08:52
|
On 29 May 2002, Chris Albone wrote: > I'm using a somewhat old snapshot of 3.2b4 (from november last yr > actually). I'd obviously suggest grabbing a new snapshot, for a number of reasons. > We're having hassles with getting htsearch to use the > wrapper.html file. We've added the relevant entry into the config file, If you read the documentation: <http://www.htdig.org/attrs.html#search_results_wrapper> you'll see that there are a few things to check which you haven't mentioned. Is the filename readable/accessible by the webserver user? Does it have the $(HTSEARCH_RESULTS) variable? -- -Geoff Hutchison Williams Students Online http://wso.williams.edu/ |
|
From: RAHARD M. <mr...@br...> - 2002-05-29 11:25:42
|
Hello, I have installed ht://Dig on an IBM RS/6000 under AIX 4.3.3 running an Apache server. It was compiled using gcc/g++. I did not have to change any lines in the code; it compiled and it works. It works fine. Thanks for this excellent software. For your information, I use the versions below:

ht://dig: 3.1.6
AIX 4.3.3 ML09
gcc/g++: 2.9 |
|
From: Chris A. <yv...@gm...> - 2002-05-29 09:54:59
|
Hi All, I'm using a somewhat old snapshot of 3.2b4 (from november last yr actually). We're having hassles with getting htsearch to use the wrapper.html file. We've added the relevant entry into the config file, but we seem to have no joy. Is this a known problem? Should I upgrade? (Going back will kill the phrase searching that i need..) peace chris -- Chris Albone wf: +61 2 9351 7774 Systems Manager, mf: 0414 597 571 Faculty of Medicine fx: +61 2 9351 7778 University of Sydney #include <witty_comment.h> |
|
From: Geoff H. <ghu...@ws...> - 2002-05-28 22:23:33
|
On Tue, 28 May 2002, Elaine Fortin wrote: > We want to store faxes and be able to search them with htdig. ... > In order to be able to search the content, would we have to run them > through an OCR program, or is there something else that can translate them? You'd have to have some sort of OCR in there. A fax TIFF file is pure graphic--there's very little text content. (TIFF files in general can have some useful text info, but I think you're looking for the text in the fax, not text that the fax program may or may not store in the TIFF.) -- -Geoff Hutchison Williams Students Online http://wso.williams.edu/ |
|
From: Elaine F. <ela...@ha...> - 2002-05-28 18:38:17
|
Hi, We want to store faxes and be able to search them with htdig. We are investigating using a fax2email service for this, which stores them as .tif files, and emails them to us as attachments. In order to be able to search the content, would we have to run them through an OCR program, or is there something else that can translate them? Thanks, Elaine Fortin |
|
From: Sankaranarayanan M.G. <san...@ce...> - 2002-05-27 04:14:37
|
Hi All, Greetings. I am wondering whether anyone has tried integrating Optical Character Recognition with htdig. I tried a couple of Optical Character Recognition programs but I couldn't get satisfactory results. If anyone has used any OCRs before and found them to be satisfactory, please let me know about it. Thanks, Sankaranarayanan M.G. |
|
From: Geoff H. <ghu...@ws...> - 2002-05-26 15:38:15
|
On Sat, 25 May 2002, David Gibbs wrote: > I just updated cvs and ran 'make install', and got the following error ... > Ideas? Sorry, please disregard. Moderator error. :-( -- -Geoff Hutchison Williams Students Online http://wso.williams.edu/ |