DCTFinder



Extracts the title and creation time from a web page.

Download dct-finder-2015-01-22.jar


Web pages do not offer reliable metadata about their creation date and time. Yet the document creation time is required before temporal normalization systems can be applied to web pages. DCTFinder parses a web page and extracts its title and creation date from the page content. It combines heuristic title detection, supervised learning with Conditional Random Fields (CRFs) for document date extraction, and rule-based creation time recognition.
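To give a feel for the rule-based part of such a pipeline, here is a minimal Java sketch that finds the first valid "day month year" date in plain text. It is not DCTFinder's code or API: the class name `CreationDateSketch`, the method `firstDate`, and the single English date pattern are all assumptions made for illustration, and the real system additionally relies on a CRF model and title heuristics that are not shown here.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of DCTFinder.
final class CreationDateSketch {

    // One hand-written rule: "22 January 2015" (day, English month name, 4-digit year),
    // with the three parts separated by single spaces.
    private static final Pattern DAY_MONTH_YEAR = Pattern.compile(
        "\\b\\d{1,2} (?:January|February|March|April|May|June|July|August"
        + "|September|October|November|December) \\d{4}\\b");

    // STRICT resolution rejects impossible dates such as "31 February 2015"
    // instead of silently adjusting them.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter
        .ofPattern("d MMMM uuuu", Locale.ENGLISH)
        .withResolverStyle(ResolverStyle.STRICT);

    // Returns the first candidate in the text that is a real calendar date.
    static Optional<LocalDate> firstDate(String text) {
        Matcher matcher = DAY_MONTH_YEAR.matcher(text);
        while (matcher.find()) {
            try {
                return Optional.of(LocalDate.parse(matcher.group(), FORMAT));
            } catch (DateTimeParseException e) {
                // The candidate looked like a date but is not valid; try the next match.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstDate("Published on 22 January 2015 by the news desk."));
    }
}
```

A single pattern like this would miss most real pages (numeric formats, other languages, relative dates), which is presumably why the description above pairs rules with a learned model.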

DCTFinder is released under the CeCILL free software license agreement.

The system is described in the following paper (see 'Files' section):
Xavier Tannier. "Extracting News Web Page Creation Time with DCTFinder". Proceedings of the 9th Language Resources and Evaluation Conference. Reykjavik, Iceland.




Additional Project Details

Intended Audience

Information Technology, Science/Research

Programming Language



