From: Alex O. <no...@gi...> - 2024-11-20 08:12:28
Branch: refs/heads/master
Home:   https://github.com/internetarchive/heritrix3

Commit: 5ed6147c3e3f69441b3771c95b4dbd56645e1f06
    https://github.com/internetarchive/heritrix3/commit/5ed6147c3e3f69441b3771c95b4dbd56645e1f06
Author: Kristinn Sigurðsson <kri...@la...>
Date:   2024-10-29 (Tue, 29 Oct 2024)

Changed paths:
  M modules/src/main/java/org/archive/modules/CrawlURI.java

Log Message:
-----------
Treat manifest hops same as navlink hops

Links from manifests (e.g. sitemaps) should not receive the preferential treatment sometimes accorded to "transitive" hops. Most commonly this is about giving priority to discovered (probable) embeds. Manifests should be regarded as more analogous with a directory page.


Commit: 29cc045c221a2b2747d5f4b884cdeb375ccf2cbd
    https://github.com/internetarchive/heritrix3/commit/29cc045c221a2b2747d5f4b884cdeb375ccf2cbd
Author: Alex Osborne <aos...@nl...>
Date:   2024-11-20 (Wed, 20 Nov 2024)

Changed paths:
  M modules/src/main/java/org/archive/modules/CrawlURI.java

Log Message:
-----------
Merge pull request #623 from kris-sigur/manfest-links

Treat manifest hops same as navlink hops


Compare: https://github.com/internetarchive/heritrix3/compare/25c73da7e334...29cc045c221a

To unsubscribe from these emails, change your notification settings at https://github.com/internetarchive/heritrix3/settings/notifications