Galateia HTML Extractor is an HTML scraper that uses machine learning frameworks to extract labelled fields from raw HTML. The project also involves developing a tool to display the semi-structured data generated by the scraper component.

License

GNU General Public License version 2.0 (GPLv2)

Galateia HTML Extractor Web Site

User Ratings

5 stars: 1
4 stars: 0
3 stars: 0
2 stars: 0
1 star: 0

Ease: 0 / 5
Features: 0 / 5
Design: 0 / 5
Support: 0 / 5

User Reviews

  • Galateia works perfect.
    1 user found this review helpful.

Additional Project Details

Intended Audience

Science/Research

Programming Language

Python

Related Categories

Python XML Software, Python HTML XHTML, Python Search Engines, Python Information Analysis Software

Registered

2008-06-27