Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.md | 2025-07-31 | 4.6 kB | |
v2.4 source code.tar.gz | 2025-07-31 | 14.2 MB | |
v2.4 source code.zip | 2025-07-31 | 14.4 MB | |
Totals: 3 Items | 28.5 MB | 4 |
v2.4
This release brings major improvements to the Threat Watcher module: per-word reliability scoring, entity detection based on BERT-base-NER, fewer false positives, a trending algorithm that only considers recent headlines, and several bug fixes and optimizations.
Update Procedure for Docker
Please follow this process:
[WARNING] Manual Deletion Step:
This operation will permanently delete all existing data in the Source, BannedWord, and TrendyWord tables. If you have custom sources, banned words, or other critical data, make sure to back them up or export them before proceeding.
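One possible way to take that backup is Django's built-in `dumpdata` management command. The sketch below only builds the command line; it assumes the app label is `threats_watcher` (inferred from the import path used in these notes), that it is run from the directory containing `manage.py`, and the output file name is hypothetical. In a Docker setup it would need to run inside the Watcher container.

```python
import sys

# Assumption: the Django app label matches the module name "threats_watcher".
APP_LABEL = "threats_watcher"
# The three tables that the manual deletion step wipes.
MODELS = ("Source", "BannedWord", "TrendyWord")


def build_backup_command(output_path):
    """Return the argv for a `manage.py dumpdata` export of the three tables."""
    labels = [f"{APP_LABEL}.{model}" for model in MODELS]
    return [
        sys.executable, "manage.py", "dumpdata", *labels,
        "--indent", "2",          # pretty-print the JSON so it can be reviewed
        "--output", output_path,  # write to a file instead of stdout
    ]


# To execute it (hypothetical file name):
#   import subprocess
#   subprocess.run(build_backup_command("threats_watcher_backup.json"), check=True)
```

The resulting JSON file can be inspected by hand, which helps when re-adding custom sources or banned words after the update.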
Before anything else, clear the existing data to avoid conflicts. Run the following command, which performs the deletions through the Django shell in this order:
```bash
python manage.py shell -c "from threats_watcher.models import Source, BannedWord, TrendyWord; Source.objects.all().delete(); BannedWord.objects.all().delete(); TrendyWord.objects.all().delete()"
```
Then continue with the update procedure:
- Pull the latest Docker image from the repository.

- Rebuild the Docker image (important for the new dependencies):

  ```bash
  docker compose build
  ```

- Stop all containers:

  ```bash
  docker compose down
  ```

- Apply database migrations and repopulate the database with the new blocklist and sources (new fields added):

  ```bash
  docker compose run watcher bash
  python manage.py migrate
  python manage.py populate_db
  ```

- Restart the containers:

  ```bash
  docker compose up
  ```

If you run Watcher without Docker
### 1. Install all system dependencies

```bash
sudo apt update && sudo apt install -y \
  build-essential \
  libsasl2-dev \
  libldap2-dev \
  libssl-dev \
  curl \
  git
```

### 2. Install Rust (required for tokenizers/transformers)

```bash
curl https://sh.rustup.rs -sSf | sh -s -- -y
source $HOME/.cargo/env
```

### 3. (Re)install Python dependencies

```bash
pip install --upgrade pip
pip install --no-cache-dir -r requirements.txt
```

### 4. Install torch, torchvision, torchaudio with CPU support

```bash
pip install --extra-index-url https://download.pytorch.org/whl/cpu torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0
```

### 5. Install NLTK dependencies

```bash
python ./nltk_dependencies.py
```

What’s Changed
ThreatWatcher – Major Improvements
- Reliability scoring for each trending word:
  - Each source in `sources.csv` now has a `confident` score (1 = 100%, 2 = 50%, 3 = 20%).
  - The reliability of each word is the average confidence of the sources where it appeared.
  - New field shown in the UI (“Reliability %” column).
- Entity extraction now uses BERT-base-NER:
  - Improved word/entity detection in news titles.
  - 10× smaller blocklist needed; blocklist file reduced.
  - Far fewer false positives.
  - More information on BERT-base-NER: https://huggingface.co/dslim/bert-base-NER
- Trending algorithm refactor:
  - Only the last 30 days of news headlines are now used for trending word calculation.
    - Old: Words could “dominate” from historic surges (e.g. 200 hits a year ago + 1 this month = trending).
    - New: Words must truly be trending this month to rank.
  - Minimum occurrences for trend detection reduced from 7 to 5.
- Improved testing coverage:
  - Three new unit tests added in the backend to validate recent changes.
  - Existing frontend tests adjusted to reflect UI updates (e.g. Reliability column).
- Improved Entity Detection, Reliability Scoring, and Trending Algorithm by @ygalnezri and @LeonNadot in https://github.com/thalesgroup-cert/Watcher/pull/224
- v2.4 by @ygalnezri and @LeonNadot in https://github.com/thalesgroup-cert/Watcher/pull/225
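
The reliability rule described above (map each source's `confident` level to a percentage, then average over the sources where the word appeared) can be illustrated with a minimal sketch. This is not Watcher's actual code: the function name and input shape are hypothetical, and it assumes one entry per source in which the word was seen.

```python
from statistics import mean

# Mapping stated in these release notes: confident 1 = 100%, 2 = 50%, 3 = 20%.
CONFIDENCE_PERCENT = {1: 100.0, 2: 50.0, 3: 20.0}


def word_reliability(source_confident_levels):
    """Average the confidence percentages of the sources where a word appeared.

    `source_confident_levels` holds one `confident` value (1, 2 or 3)
    per source; the result is a percentage between 20 and 100.
    """
    levels = list(source_confident_levels)
    if not levels:
        raise ValueError("a word needs at least one source")
    return mean(CONFIDENCE_PERCENT[level] for level in levels)


# A word seen in one fully trusted source and one half-trusted source.
print(word_reliability([1, 2]))  # 75.0
```

A word reported only by level 3 sources therefore stays at 20%, while a single level 1 source is enough to pull the average up noticeably.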
Breaking changes & warnings
- If you use custom code for word parsing/blocklist:
  - Review your blocklist (now much smaller).
  - Word detection logic has changed (BERT, NER).
- `sources.csv` structure:
  - Now requires a `confident` column.
  - Ensure your source feeds are updated to match the new format.
- Database migration required (new fields).
- Minimum word occurrence is now 5 (was 7); it can be changed in `settings.py`.
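
To picture how the 30-day window and the minimum-occurrence threshold interact, here is a small sketch. It is an illustration under assumptions rather than Watcher's implementation: the function name, the `(word, seen_at)` input shape, and the constant names are hypothetical, and it assumes a word qualifies once it reaches the threshold (count >= 5).

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW_DAYS = 30      # only headlines from the last 30 days count
MIN_OCCURRENCES = 5   # new default in v2.4 (was 7)


def trending_words(mentions, now, window_days=WINDOW_DAYS,
                   min_occurrences=MIN_OCCURRENCES):
    """Count words seen inside the window and keep those meeting the threshold.

    `mentions` is an iterable of (word, seen_at) pairs, where `seen_at`
    is the datetime of the headline in which the word was detected.
    """
    cutoff = now - timedelta(days=window_days)
    counts = Counter(word for word, seen_at in mentions if seen_at >= cutoff)
    return {word: n for word, n in counts.items() if n >= min_occurrences}


now = datetime(2025, 7, 31)
mentions = (
    [("legacy", now - timedelta(days=365))] * 200   # historic surge, ignored
    + [("legacy", now - timedelta(days=3))]         # a single recent hit
    + [("fresh", now - timedelta(days=2))] * 5      # five recent hits
)
print(trending_words(mentions, now))  # {'fresh': 5}
```

With the old behaviour described in these notes, `legacy` would have ranked on the strength of its year-old surge; with the window applied, only `fresh` qualifies.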
Full Changelog: https://github.com/thalesgroup-cert/Watcher/compare/v2.3...v2.4