Galateia HTML Extractor is an HTML scraper that uses machine learning frameworks to extract labelled fields from raw HTML. The project also includes a tool for displaying the semi-structured data that the scraper generates.

License

GNU General Public License version 2.0 (GPLv2)

User Ratings

★★★★★ 1
★★★★ 0
★★★ 0
★★ 0
★ 0

ease 0 / 5
features 0 / 5
design 0 / 5
support 0 / 5

User Reviews

  • Galateia works perfect.
    1 user found this review helpful.

Additional Project Details

Intended Audience

Science/Research

Programming Language

Python

Related Categories

Python XML Software, Python HTML XHTML, Python Search Engines, Python Information Analysis Software

Registered

2008-06-27