From: Alex O. <no...@gi...> - 2025-10-21 13:12:04
Branch: refs/heads/master
Home:   https://github.com/internetarchive/heritrix3

Commit: 8d09cc4e5005b0a1c8bf41918058d2c21c2f773e
    https://github.com/internetarchive/heritrix3/commit/8d09cc4e5005b0a1c8bf41918058d2c21c2f773e
Author: Adam Miller <ad...@ar...>
Date:   2025-10-07 (Tue, 07 Oct 2025)

Changed paths:
  M modules/src/main/java/org/archive/modules/extractor/ConfigurableExtractorJS.java
  M modules/src/main/java/org/archive/modules/extractor/ExtractorHTML.java
  M modules/src/main/java/org/archive/modules/extractor/ExtractorJS.java

Log Message:
-----------
feat: Pass along script tag attributes to JS extractor for configurable rejection.


Commit: 26bb59621ba8ef5d841f3be221bd5e11f124b99f
    https://github.com/internetarchive/heritrix3/commit/26bb59621ba8ef5d841f3be221bd5e11f124b99f
Author: Alex Osborne <aos...@nl...>
Date:   2025-10-21 (Tue, 21 Oct 2025)

Changed paths:
  M modules/src/main/java/org/archive/modules/extractor/ConfigurableExtractorJS.java
  M modules/src/main/java/org/archive/modules/extractor/ExtractorHTML.java
  M modules/src/main/java/org/archive/modules/extractor/ExtractorJS.java

Log Message:
-----------
Merge pull request #672 from internetarchive/adam/add_configurable_extractor_js_script_attribute_rules

feat: Add configurable regex rules to block extraction of script tags based on the attributes of the tag.


Compare: https://github.com/internetarchive/heritrix3/compare/29d2d7919bbb...26bb59621ba8

To unsubscribe from these emails, change your notification settings at https://github.com/internetarchive/heritrix3/settings/notifications