v2.4

This release brings major improvements to the Threat Watcher module: word reliability scoring, NER-based entity detection with BERT-base-NER, fewer false positives, a trending algorithm restricted to recent headlines, and several bug fixes and optimizations.

Update Procedure for Docker

Please follow this process:

[WARNING] Manual Deletion Step:

This operation will permanently delete all existing data in the Source, BannedWord, and TrendyWord tables. If you have custom sources, banned words, or other critical data, make sure to back them up or export them before proceeding.

Before anything else, clean the existing data to avoid conflicts. Run the following command, which opens the Django shell and empties the three tables in this order:

```bash
python manage.py shell -c "from threats_watcher.models import Source, BannedWord, TrendyWord; Source.objects.all().delete(); BannedWord.objects.all().delete(); TrendyWord.objects.all().delete()"
```

Then continue with the update procedure:

  1. Pull the latest Docker image from the repository.

  2. Rebuild the Docker image (important for the new dependencies):

     ```bash
     docker compose build
     ```

  3. Stop all containers:

     ```bash
     docker compose down
     ```

  4. Apply database migrations and repopulate the database with the new blocklist and sources (new fields added):

     ```bash
     docker compose run watcher bash
     python manage.py migrate
     python manage.py populate_db
     ```

  5. Restart the containers:

     ```bash
     docker compose up
     ```

If you run Watcher without Docker

### 1. Install all system dependencies

```bash
sudo apt update && sudo apt install -y \
    build-essential \
    libsasl2-dev \
    libldap2-dev \
    libssl-dev \
    curl \
    git
```

### 2. Install Rust (required for tokenizers/transformers)

```bash
curl https://sh.rustup.rs -sSf | sh -s -- -y
source $HOME/.cargo/env
```

### 3. (Re)install Python dependencies

```bash
pip install --upgrade pip
pip install --no-cache-dir -r requirements.txt
```

### 4. Install torch, torchvision, torchaudio with CPU support

```bash
pip install --extra-index-url https://download.pytorch.org/whl/cpu torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0
```

### 5. Install NLTK dependencies

```bash
python ./nltk_dependencies.py
```

What’s Changed

ThreatWatcher – Major Improvements

  • Reliability scoring for each trending word:
      • Each source in sources.csv now has a confidence score (1 = 100%, 2 = 50%, 3 = 20%).
      • The reliability of each word is the average confidence of the sources in which it appeared.
      • The new field is shown in the UI (“Reliability %” column).

  • Entity extraction now uses BERT-base-NER:
      • Improved word/entity detection in news titles.
      • The blocklist needed is about 10× smaller, and the blocklist file has been reduced accordingly.
      • Far fewer false positives.
      • For more information on BERT-base-NER: https://huggingface.co/dslim/bert-base-NER

  • Trending algorithm refactor:
      • Only the last 30 days of news headlines are now used for the trending word calculation.
          • Old: Words could “dominate” from historic surges (e.g. 200 hits a year ago + 1 this month = trending).
          • New: Words must truly be trending this month to rank.
      • Minimum occurrences for trend detection reduced from 7 to 5.

  • Improved test coverage:
      • Three new unit tests added in the backend to validate recent changes.
      • Existing frontend tests adjusted to reflect UI updates (e.g. Reliability column).

  • Improved Entity Detection, Reliability Scoring, and Trending Algorithm by @ygalnezri and @LeonNadot in https://github.com/thalesgroup-cert/Watcher/pull/224

  • v2.4 by @ygalnezri and @LeonNadot in https://github.com/thalesgroup-cert/Watcher/pull/225
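
The reliability rule described under "Reliability scoring" comes down to a small piece of arithmetic. The sketch below is only an illustration of that rule, not Watcher's actual code: the function name `word_reliability`, the `CONFIDENCE_PERCENT` table, and the input shape (one confidence level per source in which the word appeared) are assumptions made for the example.

```python
# Mapping stated in the release notes: confidence level -> confidence in percent.
CONFIDENCE_PERCENT = {1: 100, 2: 50, 3: 20}


def word_reliability(source_levels):
    """Average confidence (in %) of the sources in which a word appeared.

    `source_levels` is a hypothetical input: one confidence level (1, 2 or 3)
    per source whose headlines contained the word.
    """
    if not source_levels:
        raise ValueError("a word must appear in at least one source")
    percents = [CONFIDENCE_PERCENT[level] for level in source_levels]
    return sum(percents) / len(percents)


# A word seen in one level-1 source and one level-2 source: (100 + 50) / 2
print(word_reliability([1, 2]))  # 75.0
```

Whether Watcher weights a source once or once per headline is not stated in these notes; the sketch counts each source once.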

Breaking changes & warnings

  • If you use custom code for word parsing/blocklist:
      • Review your blocklist (now much smaller).
      • Word detection logic has changed (BERT, NER).
  • sources.csv structure:
      • Now requires a confident column.
      • Ensure your source feeds are updated to match the new format.
  • Database migration required (new fields).
  • Minimum word occurrence is now 5 (was 7) and can be changed in settings.py.
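
To make the effect of the 30-day window and the new minimum of 5 occurrences concrete, here is a minimal sketch of the rule as these notes describe it. It is not the implementation shipped in Watcher: the function `trending_words`, the constants `WINDOW_DAYS` and `MIN_OCCURRENCES`, and the `(word, published_at)` input shape are hypothetical names chosen for the example, and the real setting name in settings.py may differ.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW_DAYS = 30      # only recent headlines count (value from the release notes)
MIN_OCCURRENCES = 5   # new threshold, previously 7 (hypothetical constant name)


def trending_words(mentions, now):
    """Return {word: count} for words that reach the threshold inside the window.

    `mentions` is a hypothetical iterable of (word, published_at) pairs,
    one pair per headline in which the word was detected.
    """
    cutoff = now - timedelta(days=WINDOW_DAYS)
    counts = Counter(word for word, published_at in mentions if published_at >= cutoff)
    return {word: n for word, n in counts.items() if n >= MIN_OCCURRENCES}


now = datetime(2025, 7, 31)
# 200 hits a year ago plus 1 hit this month: no longer enough to trend.
legacy = [("legacy", now - timedelta(days=365))] * 200 + [("legacy", now - timedelta(days=3))]
# 5 hits within the last 5 days: meets the new threshold.
fresh = [("fresh", now - timedelta(days=d)) for d in range(5)]

print(trending_words(legacy + fresh, now))  # {'fresh': 5}
```

Under the old behaviour described above, the "legacy" word could have ranked on the strength of its historic surge; with the window applied first, only its single recent hit is counted.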

Full Changelog: https://github.com/thalesgroup-cert/Watcher/compare/v2.3...v2.4

Source: README.md, updated 2025-07-31