From: Alex O. <no...@gi...> - 2025-06-04 02:04:39
|
Branch: refs/heads/web-auth-basic Home: https://github.com/internetarchive/heritrix3 To unsubscribe from these emails, change your notification settings at https://github.com/internetarchive/heritrix3/settings/notifications |
From: Alex O. <no...@gi...> - 2025-06-04 02:04:30
|
Branch: refs/heads/master Home: https://github.com/internetarchive/heritrix3 Commit: eaf72b5ce299c3b9dd4f03dcb00af0ddd366e644 https://github.com/internetarchive/heritrix3/commit/eaf72b5ce299c3b9dd4f03dcb00af0ddd366e644 Author: Alex Osborne <aos...@nl...> Date: 2025-05-25 (Sun, 25 May 2025) Changed paths: M CHANGELOG.md M docs/operating.rst M engine/src/main/java/org/archive/crawler/Heritrix.java M engine/src/main/java/org/archive/crawler/restlet/RateLimitGuard.java Log Message: ----------- Add `--web-auth basic` command-line option This option enables HTTP Basic authentication for the web interface instead of the default Digest authentication. This is useful when running Heritrix behind a reverse proxy that adds external authentication as typically they don't support Digest auth for the upstream server. #641 Commit: 497d79500575ee769fe320cce4c9fb11f5811df5 https://github.com/internetarchive/heritrix3/commit/497d79500575ee769fe320cce4c9fb11f5811df5 Author: Alex Osborne <aos...@nl...> Date: 2025-06-04 (Wed, 04 Jun 2025) Changed paths: M CHANGELOG.md M docs/operating.rst M engine/src/main/java/org/archive/crawler/Heritrix.java M engine/src/main/java/org/archive/crawler/restlet/RateLimitGuard.java Log Message: ----------- Merge pull request #654 from internetarchive/web-auth-basic Add `--web-auth basic` command-line option Compare: https://github.com/internetarchive/heritrix3/compare/721c78ec8b20...497d79500575 |
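For readers unfamiliar with why Basic auth is easier for a reverse proxy to pass upstream than Digest: a Basic `Authorization` header is a fixed value derived only from the username and password (RFC 7617), whereas Digest needs a per-request challenge/response with a server nonce. The following is a minimal sketch of how such a header value is built in general. It is not taken from the Heritrix source; the class name, method name and credentials are illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeader {
    /**
     * Builds the value of an HTTP Basic Authorization header:
     * the literal "Basic " followed by base64("user:password").
     */
    static String basicAuth(String user, String password) {
        String credentials = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Hypothetical credentials, for illustration only.
        System.out.println("Authorization: " + basicAuth("admin", "secret"));
    }
}
```

Because the header is static, a proxy can inject the same value on every upstream request. The credentials are only base64-encoded, not encrypted, so this scheme should be used over TLS.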
From: Alex O. <no...@gi...> - 2025-05-27 01:15:10
|
Branch: refs/heads/master Home: https://github.com/internetarchive/heritrix3 Commit: 4dc990e1ffffc91146011f25e190ce2238e0afa9 https://github.com/internetarchive/heritrix3/commit/4dc990e1ffffc91146011f25e190ce2238e0afa9 Author: Valadaress <578...@us...> Date: 2025-05-26 (Mon, 26 May 2025) Changed paths: M docker/Dockerfile M docker/Dockerfile.contrib M docker/docker-compose.yml Log Message: ----------- Update Docker compose to newest version Commit: c73cb60dbabf07e930d36e62a51eed9d16999d48 https://github.com/internetarchive/heritrix3/commit/c73cb60dbabf07e930d36e62a51eed9d16999d48 Author: Valadaress <578...@us...> Date: 2025-05-26 (Mon, 26 May 2025) Changed paths: M docker/Dockerfile M docker/Dockerfile.contrib Log Message: ----------- Adjust dependencies apt-get for Dockerfile and Dockerfile.contrib Commit: 721c78ec8b20ef21e8060dfaee9d156cc4e38e2d https://github.com/internetarchive/heritrix3/commit/721c78ec8b20ef21e8060dfaee9d156cc4e38e2d Author: Alex Osborne <aos...@nl...> Date: 2025-05-27 (Tue, 27 May 2025) Changed paths: M docker/Dockerfile M docker/Dockerfile.contrib M docker/docker-compose.yml Log Message: ----------- Merge pull request #655 from Valadaress/master Updated Dockerfile and Docker-compose.yml to heritrix v3.9.0 Compare: https://github.com/internetarchive/heritrix3/compare/0f0db370ee19...721c78ec8b20 |
From: Alex O. <no...@gi...> - 2025-05-24 15:30:54
|
Branch: refs/heads/web-auth-basic Home: https://github.com/internetarchive/heritrix3 Commit: eaf72b5ce299c3b9dd4f03dcb00af0ddd366e644 https://github.com/internetarchive/heritrix3/commit/eaf72b5ce299c3b9dd4f03dcb00af0ddd366e644 Author: Alex Osborne <aos...@nl...> Date: 2025-05-25 (Sun, 25 May 2025) Changed paths: M CHANGELOG.md M docs/operating.rst M engine/src/main/java/org/archive/crawler/Heritrix.java M engine/src/main/java/org/archive/crawler/restlet/RateLimitGuard.java Log Message: ----------- Add `--web-auth basic` command-line option This option enables HTTP Basic authentication for the web interface instead of the default Digest authentication. This is useful when running Heritrix behind a reverse proxy that adds external authentication as typically they don't support Digest auth for the upstream server. #641 |
From: Alex O. <no...@gi...> - 2025-05-24 13:47:19
|
Branch: refs/heads/master Home: https://github.com/internetarchive/heritrix3 Commit: 0f0db370ee196a382bf754f4362bc29a54283719 https://github.com/internetarchive/heritrix3/commit/0f0db370ee196a382bf754f4362bc29a54283719 Author: Alex Osborne <aos...@nl...> Date: 2025-05-24 (Sat, 24 May 2025) Changed paths: M engine/src/main/java/org/archive/crawler/Heritrix.java Log Message: ----------- UI: Disable Jetty graceful shutdown for faster restarts Graceful shutdown would be useful if you could deploy the UI in a high-availability configuration and direct new requests to a different instance while the current instance finished its outstanding ones. But as you can't, it's just making restarting Heritrix slow for little benefit. |
From: Alex O. <no...@gi...> - 2025-05-23 08:45:14
|
Branch: refs/heads/bidi Home: https://github.com/internetarchive/heritrix3 Commit: 396f7a06c6cb596aeca3b943f523603ad503a05b https://github.com/internetarchive/heritrix3/commit/396f7a06c6cb596aeca3b943f523603ad503a05b Author: Alex Osborne <aos...@nl...> Date: 2025-05-23 (Fri, 23 May 2025) Changed paths: M commons/pom.xml A commons/src/main/java/org/archive/net/MitmProxy.java A commons/src/main/java/org/archive/net/webdriver/BiDiEvent.java A commons/src/main/java/org/archive/net/webdriver/BiDiJson.java A commons/src/main/java/org/archive/net/webdriver/BiDiModule.java A commons/src/main/java/org/archive/net/webdriver/Browser.java A commons/src/main/java/org/archive/net/webdriver/BrowsingContext.java A commons/src/main/java/org/archive/net/webdriver/LocalWebDriverBiDi.java A commons/src/main/java/org/archive/net/webdriver/Network.java A commons/src/main/java/org/archive/net/webdriver/Script.java A commons/src/main/java/org/archive/net/webdriver/Session.java A commons/src/main/java/org/archive/net/webdriver/WebDriverBiDi.java A commons/src/main/java/org/archive/net/webdriver/WebDriverException.java A commons/src/main/java/org/archive/util/IdleBarrier.java M engine/pom.xml A engine/src/main/java/org/archive/crawler/processor/Browser.java M engine/src/main/resources/org/archive/crawler/restlet/profile-crawler-beans.cxml A engine/src/test/java/org/archive/crawler/processor/BrowserTest.java A engine/src/test/resources/logging.properties M modules/pom.xml A modules/src/main/java/org/archive/modules/behaviors/Behavior.java A modules/src/main/java/org/archive/modules/behaviors/ExtractLinks.java A modules/src/main/java/org/archive/modules/behaviors/Page.java A modules/src/main/java/org/archive/modules/behaviors/ScrollDown.java M modules/src/main/java/org/archive/modules/fetcher/FetchHTTP2.java Log Message: ----------- Add Browser processor using WebDriver BiDi The Browser processor can load a fetched page in a local web browser, record any requests the browser makes and run 
behaviors that interact with the page such as scrolling down and extracting links. This differs from my previous attempt (ExtractorChrome) in a few ways: - Uses the new WebDriver BiDi standard instead of the Chrome DevTools Protocol. The new protocol is mostly browser-agnostic, more consistent and hopefully more stable. - Uses a MITM proxy instead of CDP request interception for recording sub-resources. That's partly because BiDi is still missing some key interception APIs. Even so, in practice I found the proxy method loads pages faster and more reliably, likely because responses can be streamed incrementally, which helps a lot for large resources or server-sent events. - Even when HTTP/2 is unavailable, the new FetchHTTP2 module does connection pooling which makes loading browser requests a lot faster. The original FetchHTTP opened a new connection for every request. - The Browser processor can be configured with a list of behavior beans making it more customizable and extensible. Obvious areas for future development: - More Behavior beans: take screenshots, save the rendered DOM, run Browsertrix-compatible behavior scripts - Support for remote WebDrivers (e.g. Selenium Server or cloud services) |
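To make the WebDriver BiDi part of the message above more concrete: in the BiDi protocol, a client sends JSON command messages over a WebSocket, each carrying an `id`, a `method` and a `params` object, and the browser answers with a message bearing the same `id`. The sketch below only builds the text of a `browsingContext.navigate` command. It is an illustration of the wire format, not the code added in this commit; the class and method names and the context ID are hypothetical, and the WebSocket transport is left out.

```java
public class BiDiCommandSketch {
    /** Minimal JSON string quoting: escapes quotes, backslashes and control characters. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"') {
                sb.append("\\\"");
            } else if (c == '\\') {
                sb.append("\\\\");
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /**
     * Builds a browsingContext.navigate command that asks the browser to load
     * a URL in the given browsing context and reply once loading completes.
     */
    static String navigate(long id, String contextId, String url) {
        return "{\"id\":" + id
                + ",\"method\":\"browsingContext.navigate\""
                + ",\"params\":{\"context\":" + quote(contextId)
                + ",\"url\":" + quote(url)
                + ",\"wait\":\"complete\"}}";
    }

    public static void main(String[] args) {
        // "ctx-1" stands in for a context ID previously returned by the browser.
        System.out.println(navigate(1, "ctx-1", "https://example.org/"));
    }
}
```

A real client would use a JSON library and match responses and events to commands by `id`; string concatenation is used here only to keep the sketch dependency-free.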
From: Alex O. <no...@gi...> - 2025-05-23 08:33:57
|
Branch: refs/heads/bidi Home: https://github.com/internetarchive/heritrix3 Commit: e4e62ed737d6616ba5e86da3e008aaaadcfed819 https://github.com/internetarchive/heritrix3/commit/e4e62ed737d6616ba5e86da3e008aaaadcfed819 Author: Alex Osborne <aos...@nl...> Date: 2025-05-23 (Fri, 23 May 2025) Changed paths: M commons/pom.xml A commons/src/main/java/org/archive/net/MitmProxy.java A commons/src/main/java/org/archive/net/webdriver/BiDiEvent.java A commons/src/main/java/org/archive/net/webdriver/BiDiJson.java A commons/src/main/java/org/archive/net/webdriver/BiDiModule.java A commons/src/main/java/org/archive/net/webdriver/Browser.java A commons/src/main/java/org/archive/net/webdriver/BrowsingContext.java A commons/src/main/java/org/archive/net/webdriver/LocalWebDriverBiDi.java A commons/src/main/java/org/archive/net/webdriver/Network.java A commons/src/main/java/org/archive/net/webdriver/Script.java A commons/src/main/java/org/archive/net/webdriver/Session.java A commons/src/main/java/org/archive/net/webdriver/WebDriverBiDi.java A commons/src/main/java/org/archive/net/webdriver/WebDriverException.java A commons/src/main/java/org/archive/util/IdleBarrier.java M engine/pom.xml A engine/src/main/java/org/archive/crawler/processor/Browser.java M engine/src/main/resources/org/archive/crawler/restlet/profile-crawler-beans.cxml A engine/src/test/java/org/archive/crawler/processor/BrowserTest.java A engine/src/test/resources/logging.properties M modules/pom.xml A modules/src/main/java/org/archive/modules/behaviors/Behavior.java A modules/src/main/java/org/archive/modules/behaviors/ExtractLinks.java A modules/src/main/java/org/archive/modules/behaviors/Page.java A modules/src/main/java/org/archive/modules/behaviors/ScrollDown.java M modules/src/main/java/org/archive/modules/fetcher/FetchHTTP2.java Log Message: ----------- Add Browser processor using WebDriver BiDi The Browser processor can load a fetched page in a local web browser, record any requests the browser makes and run 
behaviors that interact with the page such as scrolling down and extracting links. This differs from my previous attempt (ExtractorChrome) in a few ways: - Uses the new WebDriver BiDi standard instead of the Chrome DevTools Protocol. The new protocol is mostly browser-agnostic, more consistent and hopefully more stable. - Uses a MITM proxy instead of CDP request interception for recording sub-resources. That's partly because BiDi is still missing some key interception APIs. Even so, in practice I found the proxy method loads pages faster and more reliably, likely because responses can be streamed incrementally, which helps a lot for large resources or server-sent events. - Even when HTTP/2 is unavailable, the new FetchHTTP2 module does connection pooling which makes loading browser requests a lot faster. The original FetchHTTP opened a new connection for every request. - The Browser processor can be configured with a list of behavior beans making it more customizable and extensible. Obvious areas for future development: - More Behavior beans: take screenshots, save the rendered DOM, run Browsertrix-compatible behavior scripts - Support for remote WebDrivers (e.g. Selenium Server or cloud services) |
From: Alex O. <no...@gi...> - 2025-05-22 06:39:17
|
Branch: refs/heads/master Home: https://github.com/internetarchive/heritrix3 Commit: 679fde42c6b189f866dd6a21f0a37201b46e0eeb https://github.com/internetarchive/heritrix3/commit/679fde42c6b189f866dd6a21f0a37201b46e0eeb Author: Alex Osborne <aos...@nl...> Date: 2025-05-22 (Thu, 22 May 2025) Changed paths: M engine/src/test/java/org/archive/crawler/selftest/SelfTestBase.java Log Message: ----------- SelfTestBase: Use a dynamic port (port 0) for Heritrix web port This stops the tests from failing when you happen to be running Heritrix or something else on port 8443. |
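The "port 0" technique mentioned in the commit above can be sketched with the standard library alone: binding a listening socket to port 0 asks the operating system to pick any free port, which the program then reads back. This is a generic illustration, not the code in `SelfTestBase`; the class and method names are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

public class DynamicPort {
    /** Opens a listening socket on an OS-assigned port; the caller must close it. */
    static ServerSocket listenOnAnyFreePort() {
        try {
            return new ServerSocket(0); // 0 means "let the OS choose a free port"
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Two listeners opened at the same time cannot collide with each other
        // or with anything already bound to a fixed port such as 8443.
        try (ServerSocket first = listenOnAnyFreePort();
             ServerSocket second = listenOnAnyFreePort()) {
            System.out.println("first listener on port " + first.getLocalPort());
            System.out.println("second listener on port " + second.getLocalPort());
        }
    }
}
```

Passing 0 directly to the server that will do the listening is generally preferable to probing for a free port, closing it and rebinding, since another process could claim the port in between.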
From: Alex O. <no...@gi...> - 2025-05-22 06:20:24
|
Branch: refs/heads/webarchive-commons-2.0.1 Home: https://github.com/internetarchive/heritrix3 |
From: Alex O. <no...@gi...> - 2025-05-22 06:20:22
|
Branch: refs/heads/master Home: https://github.com/internetarchive/heritrix3 Commit: 567e27181d4e26ce138633ab405b4785f4e19a88 https://github.com/internetarchive/heritrix3/commit/567e27181d4e26ce138633ab405b4785f4e19a88 Author: Alex Osborne <aos...@nl...> Date: 2025-05-21 (Wed, 21 May 2025) Changed paths: M CHANGELOG.md M commons/pom.xml M commons/src/main/java/org/archive/io/Arc2Warc.java M commons/src/main/java/org/archive/io/Warc2Arc.java M commons/src/main/java/org/archive/net/UURI.java M commons/src/main/java/org/archive/net/UURIFactory.java M commons/src/main/java/org/archive/surt/SURTTokenizer.java M commons/src/main/java/org/archive/util/UriUtils.java M commons/src/test/java/org/archive/surt/SURTTokenizerTest.java M contrib/src/main/java/org/archive/crawler/frontier/AMQPUrlReceiver.java M contrib/src/main/java/org/archive/modules/AMQPPublishProcessor.java M contrib/src/main/java/org/archive/modules/extractor/ExtractorYoutubeDL.java M contrib/src/main/java/org/archive/modules/extractor/KnowledgableExtractorJS.java M contrib/src/test/java/org/archive/modules/extractor/ExtractorPDFContentTest.java M contrib/src/test/java/org/archive/modules/extractor/ExtractorYoutubeFormatStreamTest.java M engine/src/main/java/org/archive/crawler/frontier/AbstractFrontier.java M engine/src/main/java/org/archive/crawler/frontier/FrontierJournal.java M engine/src/main/java/org/archive/crawler/frontier/HostnameQueueAssignmentPolicy.java M engine/src/main/java/org/archive/crawler/postprocessor/CandidatesProcessor.java M engine/src/main/java/org/archive/crawler/postprocessor/DispositionProcessor.java M engine/src/main/java/org/archive/crawler/prefetch/PreconditionEnforcer.java M engine/src/main/java/org/archive/crawler/reporting/CrawlerLoggerModule.java M engine/src/test/java/org/archive/crawler/datamodel/CrawlURITest.java M engine/src/test/java/org/archive/crawler/frontier/BdbMultipleWorkQueuesTest.java M engine/src/test/java/org/archive/crawler/frontier/FrontierJournalTest.java M 
engine/src/test/java/org/archive/crawler/prefetch/QuotaEnforcerTest.java M engine/src/test/java/org/archive/crawler/util/BdbUriUniqFilterTest.java M engine/src/test/java/org/archive/crawler/util/BloomUriUniqFilterTest.java M engine/src/test/java/org/archive/crawler/util/FPUriUniqFilterTest.java M engine/src/test/java/org/archive/crawler/util/TopNSetTest.java M engine/src/test/java/org/archive/modules/fetcher/FormAuthTest.java M modules/src/main/java/org/archive/modules/CrawlURI.java M modules/src/main/java/org/archive/modules/Processor.java M modules/src/main/java/org/archive/modules/credential/HtmlFormCredential.java M modules/src/main/java/org/archive/modules/deciderules/AddRedirectFromRootServerToScope.java M modules/src/main/java/org/archive/modules/deciderules/ExternalGeoLocationDecideRule.java M modules/src/main/java/org/archive/modules/deciderules/ResourceNoLongerThanDecideRule.java M modules/src/main/java/org/archive/modules/extractor/Extractor.java M modules/src/main/java/org/archive/modules/extractor/ExtractorCSS.java M modules/src/main/java/org/archive/modules/extractor/ExtractorDOC.java M modules/src/main/java/org/archive/modules/extractor/ExtractorHTML.java M modules/src/main/java/org/archive/modules/extractor/ExtractorHTTP.java M modules/src/main/java/org/archive/modules/extractor/ExtractorImpliedURI.java M modules/src/main/java/org/archive/modules/extractor/ExtractorJS.java M modules/src/main/java/org/archive/modules/extractor/ExtractorMultipleRegex.java M modules/src/main/java/org/archive/modules/extractor/ExtractorPDF.java M modules/src/main/java/org/archive/modules/extractor/ExtractorRobotsTxt.java M modules/src/main/java/org/archive/modules/extractor/ExtractorSitemap.java M modules/src/main/java/org/archive/modules/extractor/ExtractorURI.java M modules/src/main/java/org/archive/modules/extractor/ExtractorXML.java M modules/src/main/java/org/archive/modules/extractor/JerichoExtractorHTML.java M 
modules/src/main/java/org/archive/modules/extractor/UriErrorLoggerModule.java M modules/src/main/java/org/archive/modules/fetcher/AbstractCookieStore.java M modules/src/main/java/org/archive/modules/fetcher/FetchDNS.java M modules/src/main/java/org/archive/modules/fetcher/FetchFTP.java M modules/src/main/java/org/archive/modules/fetcher/FetchHTTP.java M modules/src/main/java/org/archive/modules/fetcher/FetchHTTP2.java M modules/src/main/java/org/archive/modules/fetcher/FetchHTTPCookieStore.java M modules/src/main/java/org/archive/modules/fetcher/FetchHTTPRequest.java M modules/src/main/java/org/archive/modules/fetcher/FetchSFTP.java M modules/src/main/java/org/archive/modules/fetcher/FetchWhois.java M modules/src/main/java/org/archive/modules/forms/FormLoginProcessor.java M modules/src/main/java/org/archive/modules/net/CrawlServer.java M modules/src/main/java/org/archive/modules/net/RobotsPolicy.java M modules/src/main/java/org/archive/modules/net/ServerCache.java M modules/src/main/java/org/archive/modules/seeds/TextSeedModule.java M modules/src/main/java/org/archive/modules/writer/ARCWriterProcessor.java M modules/src/main/java/org/archive/state/ModuleTestBase.java M modules/src/test/java/org/archive/modules/canonicalize/FixupQueryStringTest.java M modules/src/test/java/org/archive/modules/canonicalize/RegexRuleTest.java M modules/src/test/java/org/archive/modules/canonicalize/RulesCanonicalizationPolicyTest.java M modules/src/test/java/org/archive/modules/canonicalize/StripSessionCFIDsTest.java M modules/src/test/java/org/archive/modules/canonicalize/StripSessionIDsTest.java M modules/src/test/java/org/archive/modules/canonicalize/StripUserinfoRuleTest.java M modules/src/test/java/org/archive/modules/canonicalize/StripWWWNRuleTest.java M modules/src/test/java/org/archive/modules/canonicalize/StripWWWRuleTest.java M modules/src/test/java/org/archive/modules/deciderules/MatchesListRegexDecideRuleTest.java M 
modules/src/test/java/org/archive/modules/deciderules/MatchesStatusCodeDecideRuleTest.java M modules/src/test/java/org/archive/modules/deciderules/NotMatchesStatusCodeDecideRuleTest.java M modules/src/test/java/org/archive/modules/deciderules/ViaSurtPrefixedDecideRuleTest.java M modules/src/test/java/org/archive/modules/extractor/ExtractorHTMLTest.java M modules/src/test/java/org/archive/modules/extractor/JerichoExtractorHTMLTest.java M modules/src/test/java/org/archive/modules/extractor/UnitTestUriLoggerModule.java M modules/src/test/java/org/archive/modules/fetcher/CookieFetchHTTPIntegrationTest.java M modules/src/test/java/org/archive/modules/fetcher/FetchHTTPTest.java M modules/src/test/java/org/archive/modules/forms/FormLoginProcessorTest.java M modules/src/test/java/org/archive/modules/net/ServerCacheTest.java M modules/src/test/java/org/archive/modules/recrawl/ContentDigestHistoryTest.java Log Message: ----------- Upgrade webarchive-commons from 1.3.0 to 2.0.1 (removes httpclient 3) Commit: 4695f8a1537c4e674396196ccc67eecef522d21d https://github.com/internetarchive/heritrix3/commit/4695f8a1537c4e674396196ccc67eecef522d21d Author: Alex Osborne <aos...@nl...> Date: 2025-05-22 (Thu, 22 May 2025) Changed paths: M CHANGELOG.md M commons/pom.xml M commons/src/main/java/org/archive/io/Arc2Warc.java M commons/src/main/java/org/archive/io/Warc2Arc.java M commons/src/main/java/org/archive/net/UURI.java M commons/src/main/java/org/archive/net/UURIFactory.java M commons/src/main/java/org/archive/surt/SURTTokenizer.java M commons/src/main/java/org/archive/util/UriUtils.java M commons/src/test/java/org/archive/surt/SURTTokenizerTest.java M contrib/src/main/java/org/archive/crawler/frontier/AMQPUrlReceiver.java M contrib/src/main/java/org/archive/modules/AMQPPublishProcessor.java M contrib/src/main/java/org/archive/modules/extractor/ExtractorYoutubeDL.java M contrib/src/main/java/org/archive/modules/extractor/KnowledgableExtractorJS.java M 
contrib/src/test/java/org/archive/modules/extractor/ExtractorPDFContentTest.java M contrib/src/test/java/org/archive/modules/extractor/ExtractorYoutubeFormatStreamTest.java M engine/src/main/java/org/archive/crawler/frontier/AbstractFrontier.java M engine/src/main/java/org/archive/crawler/frontier/FrontierJournal.java M engine/src/main/java/org/archive/crawler/frontier/HostnameQueueAssignmentPolicy.java M engine/src/main/java/org/archive/crawler/postprocessor/CandidatesProcessor.java M engine/src/main/java/org/archive/crawler/postprocessor/DispositionProcessor.java M engine/src/main/java/org/archive/crawler/prefetch/PreconditionEnforcer.java M engine/src/main/java/org/archive/crawler/reporting/CrawlerLoggerModule.java M engine/src/test/java/org/archive/crawler/datamodel/CrawlURITest.java M engine/src/test/java/org/archive/crawler/frontier/BdbMultipleWorkQueuesTest.java M engine/src/test/java/org/archive/crawler/frontier/FrontierJournalTest.java M engine/src/test/java/org/archive/crawler/prefetch/QuotaEnforcerTest.java M engine/src/test/java/org/archive/crawler/util/BdbUriUniqFilterTest.java M engine/src/test/java/org/archive/crawler/util/BloomUriUniqFilterTest.java M engine/src/test/java/org/archive/crawler/util/FPUriUniqFilterTest.java M engine/src/test/java/org/archive/crawler/util/TopNSetTest.java M engine/src/test/java/org/archive/modules/fetcher/FormAuthTest.java M modules/src/main/java/org/archive/modules/CrawlURI.java M modules/src/main/java/org/archive/modules/Processor.java M modules/src/main/java/org/archive/modules/credential/HtmlFormCredential.java M modules/src/main/java/org/archive/modules/deciderules/AddRedirectFromRootServerToScope.java M modules/src/main/java/org/archive/modules/deciderules/ExternalGeoLocationDecideRule.java M modules/src/main/java/org/archive/modules/deciderules/ResourceNoLongerThanDecideRule.java M modules/src/main/java/org/archive/modules/extractor/Extractor.java M 
modules/src/main/java/org/archive/modules/extractor/ExtractorCSS.java M modules/src/main/java/org/archive/modules/extractor/ExtractorDOC.java M modules/src/main/java/org/archive/modules/extractor/ExtractorHTML.java M modules/src/main/java/org/archive/modules/extractor/ExtractorHTTP.java M modules/src/main/java/org/archive/modules/extractor/ExtractorImpliedURI.java M modules/src/main/java/org/archive/modules/extractor/ExtractorJS.java M modules/src/main/java/org/archive/modules/extractor/ExtractorMultipleRegex.java M modules/src/main/java/org/archive/modules/extractor/ExtractorPDF.java M modules/src/main/java/org/archive/modules/extractor/ExtractorRobotsTxt.java M modules/src/main/java/org/archive/modules/extractor/ExtractorSitemap.java M modules/src/main/java/org/archive/modules/extractor/ExtractorURI.java M modules/src/main/java/org/archive/modules/extractor/ExtractorXML.java M modules/src/main/java/org/archive/modules/extractor/JerichoExtractorHTML.java M modules/src/main/java/org/archive/modules/extractor/UriErrorLoggerModule.java M modules/src/main/java/org/archive/modules/fetcher/AbstractCookieStore.java M modules/src/main/java/org/archive/modules/fetcher/FetchDNS.java M modules/src/main/java/org/archive/modules/fetcher/FetchFTP.java M modules/src/main/java/org/archive/modules/fetcher/FetchHTTP.java M modules/src/main/java/org/archive/modules/fetcher/FetchHTTP2.java M modules/src/main/java/org/archive/modules/fetcher/FetchHTTPCookieStore.java M modules/src/main/java/org/archive/modules/fetcher/FetchHTTPRequest.java M modules/src/main/java/org/archive/modules/fetcher/FetchSFTP.java M modules/src/main/java/org/archive/modules/fetcher/FetchWhois.java M modules/src/main/java/org/archive/modules/forms/FormLoginProcessor.java M modules/src/main/java/org/archive/modules/net/CrawlServer.java M modules/src/main/java/org/archive/modules/net/RobotsPolicy.java M modules/src/main/java/org/archive/modules/net/ServerCache.java M 
modules/src/main/java/org/archive/modules/seeds/TextSeedModule.java M modules/src/main/java/org/archive/modules/writer/ARCWriterProcessor.java M modules/src/main/java/org/archive/state/ModuleTestBase.java M modules/src/test/java/org/archive/modules/canonicalize/FixupQueryStringTest.java M modules/src/test/java/org/archive/modules/canonicalize/RegexRuleTest.java M modules/src/test/java/org/archive/modules/canonicalize/RulesCanonicalizationPolicyTest.java M modules/src/test/java/org/archive/modules/canonicalize/StripSessionCFIDsTest.java M modules/src/test/java/org/archive/modules/canonicalize/StripSessionIDsTest.java M modules/src/test/java/org/archive/modules/canonicalize/StripUserinfoRuleTest.java M modules/src/test/java/org/archive/modules/canonicalize/StripWWWNRuleTest.java M modules/src/test/java/org/archive/modules/canonicalize/StripWWWRuleTest.java M modules/src/test/java/org/archive/modules/deciderules/MatchesListRegexDecideRuleTest.java M modules/src/test/java/org/archive/modules/deciderules/MatchesStatusCodeDecideRuleTest.java M modules/src/test/java/org/archive/modules/deciderules/NotMatchesStatusCodeDecideRuleTest.java M modules/src/test/java/org/archive/modules/deciderules/ViaSurtPrefixedDecideRuleTest.java M modules/src/test/java/org/archive/modules/extractor/ExtractorHTMLTest.java M modules/src/test/java/org/archive/modules/extractor/JerichoExtractorHTMLTest.java M modules/src/test/java/org/archive/modules/extractor/UnitTestUriLoggerModule.java M modules/src/test/java/org/archive/modules/fetcher/CookieFetchHTTPIntegrationTest.java M modules/src/test/java/org/archive/modules/fetcher/FetchHTTPTest.java M modules/src/test/java/org/archive/modules/forms/FormLoginProcessorTest.java M modules/src/test/java/org/archive/modules/net/ServerCacheTest.java M modules/src/test/java/org/archive/modules/recrawl/ContentDigestHistoryTest.java Log Message: ----------- Merge pull request #652 from internetarchive/webarchive-commons-2.0.1 Upgrade webarchive-commons from 
1.3.0 to 2.0.1 (removes httpclient 3) Compare: https://github.com/internetarchive/heritrix3/compare/58e1ac529b39...4695f8a1537c |
From: Alex O. <no...@gi...> - 2025-05-21 08:39:59
|
Branch: refs/heads/webarchive-commons-2.0.1
Home: https://github.com/internetarchive/heritrix3
Commit: 567e27181d4e26ce138633ab405b4785f4e19a88
https://github.com/internetarchive/heritrix3/commit/567e27181d4e26ce138633ab405b4785f4e19a88
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-21 (Wed, 21 May 2025)

Changed paths:
M CHANGELOG.md
M commons/pom.xml
M commons/src/main/java/org/archive/io/Arc2Warc.java
M commons/src/main/java/org/archive/io/Warc2Arc.java
M commons/src/main/java/org/archive/net/UURI.java
M commons/src/main/java/org/archive/net/UURIFactory.java
M commons/src/main/java/org/archive/surt/SURTTokenizer.java
M commons/src/main/java/org/archive/util/UriUtils.java
M commons/src/test/java/org/archive/surt/SURTTokenizerTest.java
M contrib/src/main/java/org/archive/crawler/frontier/AMQPUrlReceiver.java
M contrib/src/main/java/org/archive/modules/AMQPPublishProcessor.java
M contrib/src/main/java/org/archive/modules/extractor/ExtractorYoutubeDL.java
M contrib/src/main/java/org/archive/modules/extractor/KnowledgableExtractorJS.java
M contrib/src/test/java/org/archive/modules/extractor/ExtractorPDFContentTest.java
M contrib/src/test/java/org/archive/modules/extractor/ExtractorYoutubeFormatStreamTest.java
M engine/src/main/java/org/archive/crawler/frontier/AbstractFrontier.java
M engine/src/main/java/org/archive/crawler/frontier/FrontierJournal.java
M engine/src/main/java/org/archive/crawler/frontier/HostnameQueueAssignmentPolicy.java
M engine/src/main/java/org/archive/crawler/postprocessor/CandidatesProcessor.java
M engine/src/main/java/org/archive/crawler/postprocessor/DispositionProcessor.java
M engine/src/main/java/org/archive/crawler/prefetch/PreconditionEnforcer.java
M engine/src/main/java/org/archive/crawler/reporting/CrawlerLoggerModule.java
M engine/src/test/java/org/archive/crawler/datamodel/CrawlURITest.java
M engine/src/test/java/org/archive/crawler/frontier/BdbMultipleWorkQueuesTest.java
M engine/src/test/java/org/archive/crawler/frontier/FrontierJournalTest.java
M engine/src/test/java/org/archive/crawler/prefetch/QuotaEnforcerTest.java
M engine/src/test/java/org/archive/crawler/util/BdbUriUniqFilterTest.java
M engine/src/test/java/org/archive/crawler/util/BloomUriUniqFilterTest.java
M engine/src/test/java/org/archive/crawler/util/FPUriUniqFilterTest.java
M engine/src/test/java/org/archive/crawler/util/TopNSetTest.java
M engine/src/test/java/org/archive/modules/fetcher/FormAuthTest.java
M modules/src/main/java/org/archive/modules/CrawlURI.java
M modules/src/main/java/org/archive/modules/Processor.java
M modules/src/main/java/org/archive/modules/credential/HtmlFormCredential.java
M modules/src/main/java/org/archive/modules/deciderules/AddRedirectFromRootServerToScope.java
M modules/src/main/java/org/archive/modules/deciderules/ExternalGeoLocationDecideRule.java
M modules/src/main/java/org/archive/modules/deciderules/ResourceNoLongerThanDecideRule.java
M modules/src/main/java/org/archive/modules/extractor/Extractor.java
M modules/src/main/java/org/archive/modules/extractor/ExtractorCSS.java
M modules/src/main/java/org/archive/modules/extractor/ExtractorDOC.java
M modules/src/main/java/org/archive/modules/extractor/ExtractorHTML.java
M modules/src/main/java/org/archive/modules/extractor/ExtractorHTTP.java
M modules/src/main/java/org/archive/modules/extractor/ExtractorImpliedURI.java
M modules/src/main/java/org/archive/modules/extractor/ExtractorJS.java
M modules/src/main/java/org/archive/modules/extractor/ExtractorMultipleRegex.java
M modules/src/main/java/org/archive/modules/extractor/ExtractorPDF.java
M modules/src/main/java/org/archive/modules/extractor/ExtractorRobotsTxt.java
M modules/src/main/java/org/archive/modules/extractor/ExtractorSitemap.java
M modules/src/main/java/org/archive/modules/extractor/ExtractorURI.java
M modules/src/main/java/org/archive/modules/extractor/ExtractorXML.java
M modules/src/main/java/org/archive/modules/extractor/JerichoExtractorHTML.java
M modules/src/main/java/org/archive/modules/extractor/UriErrorLoggerModule.java
M modules/src/main/java/org/archive/modules/fetcher/AbstractCookieStore.java
M modules/src/main/java/org/archive/modules/fetcher/FetchDNS.java
M modules/src/main/java/org/archive/modules/fetcher/FetchFTP.java
M modules/src/main/java/org/archive/modules/fetcher/FetchHTTP.java
M modules/src/main/java/org/archive/modules/fetcher/FetchHTTP2.java
M modules/src/main/java/org/archive/modules/fetcher/FetchHTTPCookieStore.java
M modules/src/main/java/org/archive/modules/fetcher/FetchHTTPRequest.java
M modules/src/main/java/org/archive/modules/fetcher/FetchSFTP.java
M modules/src/main/java/org/archive/modules/fetcher/FetchWhois.java
M modules/src/main/java/org/archive/modules/forms/FormLoginProcessor.java
M modules/src/main/java/org/archive/modules/net/CrawlServer.java
M modules/src/main/java/org/archive/modules/net/RobotsPolicy.java
M modules/src/main/java/org/archive/modules/net/ServerCache.java
M modules/src/main/java/org/archive/modules/seeds/TextSeedModule.java
M modules/src/main/java/org/archive/modules/writer/ARCWriterProcessor.java
M modules/src/main/java/org/archive/state/ModuleTestBase.java
M modules/src/test/java/org/archive/modules/canonicalize/FixupQueryStringTest.java
M modules/src/test/java/org/archive/modules/canonicalize/RegexRuleTest.java
M modules/src/test/java/org/archive/modules/canonicalize/RulesCanonicalizationPolicyTest.java
M modules/src/test/java/org/archive/modules/canonicalize/StripSessionCFIDsTest.java
M modules/src/test/java/org/archive/modules/canonicalize/StripSessionIDsTest.java
M modules/src/test/java/org/archive/modules/canonicalize/StripUserinfoRuleTest.java
M modules/src/test/java/org/archive/modules/canonicalize/StripWWWNRuleTest.java
M modules/src/test/java/org/archive/modules/canonicalize/StripWWWRuleTest.java
M modules/src/test/java/org/archive/modules/deciderules/MatchesListRegexDecideRuleTest.java
M modules/src/test/java/org/archive/modules/deciderules/MatchesStatusCodeDecideRuleTest.java
M modules/src/test/java/org/archive/modules/deciderules/NotMatchesStatusCodeDecideRuleTest.java
M modules/src/test/java/org/archive/modules/deciderules/ViaSurtPrefixedDecideRuleTest.java
M modules/src/test/java/org/archive/modules/extractor/ExtractorHTMLTest.java
M modules/src/test/java/org/archive/modules/extractor/JerichoExtractorHTMLTest.java
M modules/src/test/java/org/archive/modules/extractor/UnitTestUriLoggerModule.java
M modules/src/test/java/org/archive/modules/fetcher/CookieFetchHTTPIntegrationTest.java
M modules/src/test/java/org/archive/modules/fetcher/FetchHTTPTest.java
M modules/src/test/java/org/archive/modules/forms/FormLoginProcessorTest.java
M modules/src/test/java/org/archive/modules/net/ServerCacheTest.java
M modules/src/test/java/org/archive/modules/recrawl/ContentDigestHistoryTest.java

Log Message:
-----------
Upgrade webarchive-commons from 1.3.0 to 2.0.1 (removes httpclient 3)
|
From: Alex O. <no...@gi...> - 2025-05-21 07:23:05
|
Branch: refs/heads/bidi
Home: https://github.com/internetarchive/heritrix3
Commit: e3f231406a597f805d177d3daa982ff6b78404fb
https://github.com/internetarchive/heritrix3/commit/e3f231406a597f805d177d3daa982ff6b78404fb
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-21 (Wed, 21 May 2025)

Changed paths:
M commons/src/main/java/org/archive/net/webdriver/LocalWebDriverBiDi.java
A commons/src/main/java/org/archive/net/webdriver/WebDriverException.java
M engine/src/main/java/org/archive/crawler/browser/BrowserProcessor.java

Log Message:
-----------
Improve bidi error handling
|
From: Alex O. <no...@gi...> - 2025-05-20 10:06:13
|
Branch: refs/heads/codemirror6
Home: https://github.com/internetarchive/heritrix3
|
From: Alex O. <no...@gi...> - 2025-05-20 10:06:10
|
Branch: refs/heads/master
Home: https://github.com/internetarchive/heritrix3

Commit: 33446c7a2bbe48d9c6d2c3472e70d39d5300660a
https://github.com/internetarchive/heritrix3/commit/33446c7a2bbe48d9c6d2c3472e70d39d5300660a
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-16 (Fri, 16 May 2025)

Changed paths:
M CHANGELOG.md
M engine/pom.xml
A engine/src/main/java/freemarker_implicit.ftl
M engine/src/main/java/org/archive/crawler/restlet/EditRepresentation.java
M engine/src/main/java/org/archive/crawler/restlet/EngineApplication.java
A engine/src/main/java/org/archive/crawler/restlet/WebJars.java
A engine/src/main/resources/org/archive/crawler/restlet/Edit.ftl
M engine/src/main/resources/org/archive/crawler/restlet/Script.ftl
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/LICENSE
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/README
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/codemirror.css
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/codemirror.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/mode/clike.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/mode/groovy.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/mode/javascript.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/mode/xmlpure.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/util/dialog.css
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/util/dialog.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/util/search.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/util/searchcursor.js

Log Message:
-----------
Upgrade to CodeMirror 6

This resolves some browser incompatibilities, allowing CodeMirror’s own find function to be re-enabled for reliable text search of content far outside the viewport.

Commit: 58e1ac529b39343d9bfc95dd770e9e5c3fcc3804
https://github.com/internetarchive/heritrix3/commit/58e1ac529b39343d9bfc95dd770e9e5c3fcc3804
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-20 (Tue, 20 May 2025)

Changed paths:
M CHANGELOG.md
M engine/pom.xml
A engine/src/main/java/freemarker_implicit.ftl
M engine/src/main/java/org/archive/crawler/restlet/EditRepresentation.java
M engine/src/main/java/org/archive/crawler/restlet/EngineApplication.java
A engine/src/main/java/org/archive/crawler/restlet/WebJars.java
A engine/src/main/resources/org/archive/crawler/restlet/Edit.ftl
M engine/src/main/resources/org/archive/crawler/restlet/Script.ftl
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/LICENSE
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/README
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/codemirror.css
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/codemirror.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/mode/clike.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/mode/groovy.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/mode/javascript.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/mode/xmlpure.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/util/dialog.css
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/util/dialog.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/util/search.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/util/searchcursor.js

Log Message:
-----------
Merge pull request #651 from internetarchive/codemirror6

Upgrade to CodeMirror 6

Compare: https://github.com/internetarchive/heritrix3/compare/51347f769820...58e1ac529b39
|
From: Alex O. <no...@gi...> - 2025-05-16 09:35:07
|
Branch: refs/heads/codemirror6
Home: https://github.com/internetarchive/heritrix3
Commit: 33446c7a2bbe48d9c6d2c3472e70d39d5300660a
https://github.com/internetarchive/heritrix3/commit/33446c7a2bbe48d9c6d2c3472e70d39d5300660a
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-16 (Fri, 16 May 2025)

Changed paths:
M CHANGELOG.md
M engine/pom.xml
A engine/src/main/java/freemarker_implicit.ftl
M engine/src/main/java/org/archive/crawler/restlet/EditRepresentation.java
M engine/src/main/java/org/archive/crawler/restlet/EngineApplication.java
A engine/src/main/java/org/archive/crawler/restlet/WebJars.java
A engine/src/main/resources/org/archive/crawler/restlet/Edit.ftl
M engine/src/main/resources/org/archive/crawler/restlet/Script.ftl
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/LICENSE
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/README
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/codemirror.css
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/codemirror.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/mode/clike.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/mode/groovy.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/mode/javascript.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/mode/xmlpure.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/util/dialog.css
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/util/dialog.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/util/search.js
R engine/src/main/resources/org/archive/crawler/restlet/codemirror/util/searchcursor.js

Log Message:
-----------
Upgrade to CodeMirror 6

This resolves some browser incompatibilities, allowing CodeMirror’s own find function to be re-enabled for reliable text search of content far outside the viewport.
|
From: Alex O. <no...@gi...> - 2025-05-13 05:24:05
|
Branch: refs/heads/master
Home: https://github.com/internetarchive/heritrix3
Commit: 51347f769820397032806c48565679890a6bef2e
https://github.com/internetarchive/heritrix3/commit/51347f769820397032806c48565679890a6bef2e
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-13 (Tue, 13 May 2025)

Changed paths:
M docs/requirements.txt

Log Message:
-----------
docs: Use javalang17 fork

Fixes some parse errors but not all of them
|
From: Alex O. <no...@gi...> - 2025-05-13 05:20:49
|
Branch: refs/heads/master
Home: https://github.com/internetarchive/heritrix3
Commit: 7db91b53c679124acbbf364889956b20433b3ce5
https://github.com/internetarchive/heritrix3/commit/7db91b53c679124acbbf364889956b20433b3ce5
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-13 (Tue, 13 May 2025)

Changed paths:
M docs/configuring-jobs.rst

Log Message:
-----------
docs: HTTP/2: correct code quotes instead of italics
|
From: Alex O. <no...@gi...> - 2025-05-13 05:17:17
|
Branch: refs/heads/master
Home: https://github.com/internetarchive/heritrix3
Commit: e1ac68f533ca9544e078a569c4ec9fa12d6c4733
https://github.com/internetarchive/heritrix3/commit/e1ac68f533ca9544e078a569c4ec9fa12d6c4733
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-13 (Tue, 13 May 2025)

Changed paths:
M docs/configuring-jobs.rst

Log Message:
-----------
docs: Correct bean-example path for FetchHTTP2
|
From: Alex O. <no...@gi...> - 2025-05-13 05:16:09
|
Branch: refs/heads/master
Home: https://github.com/internetarchive/heritrix3
Commit: fee534c3dbdb9cb0a1a1c9fc94df4cf33d845afb
https://github.com/internetarchive/heritrix3/commit/fee534c3dbdb9cb0a1a1c9fc94df4cf33d845afb
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-13 (Tue, 13 May 2025)

Changed paths:
M docs/conf.py

Log Message:
-----------
docs: Get version from pom.xml
|
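For readers unfamiliar with the idea behind the commit above, reading a version string out of a Maven POM can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pom_version`, the `EXAMPLE_POM` text, and the fallback to the parent version are all hypothetical, and nothing here is taken from the actual `docs/conf.py` change.

```python
import xml.etree.ElementTree as ET

# Maven POM files declare this default XML namespace.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def pom_version(pom_xml: str) -> str:
    """Return the project version, falling back to the parent's version.

    Hypothetical helper for illustration; not the code from docs/conf.py.
    """
    root = ET.fromstring(pom_xml)
    for path in ("m:version", "m:parent/m:version"):
        node = root.find(path, POM_NS)
        if node is not None and node.text:
            return node.text.strip()
    raise ValueError("no <version> element found in POM")


# Made-up POM text used only to exercise the helper.
EXAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <artifactId>example</artifactId>
  <version>1.2.3-SNAPSHOT</version>
</project>"""
```

In a Sphinx `conf.py`, a value obtained this way could be assigned to the `release` setting so the documentation version follows the build file instead of being edited by hand.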
From: Alex O. <no...@gi...> - 2025-05-13 05:10:52
|
Branch: refs/heads/master
Home: https://github.com/internetarchive/heritrix3
Commit: 81ac46be4557e07bbb346223fe957f22e220ad11
https://github.com/internetarchive/heritrix3/commit/81ac46be4557e07bbb346223fe957f22e220ad11
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-13 (Tue, 13 May 2025)

Changed paths:
M docs/_ext/beandoc.py

Log Message:
-----------
docs: Catch parse errors and continue generating

Looks like the javalang parser doesn't handle some newer syntax. For now just catch the errors. Long term we probably need to switch to a different parser.
|
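The "catch the errors and keep generating" approach described in the log message above can be sketched generically in Python. This is a minimal illustration, not the contents of `docs/_ext/beandoc.py`: `parse_all` and `toy_parse` are hypothetical names, and `toy_parse` is a stand-in for a real Java parser that rejects syntax it does not understand.

```python
import logging
from typing import Callable, Dict, Iterable, Tuple

log = logging.getLogger("beandoc_sketch")


def parse_all(
    sources: Iterable[Tuple[str, str]],
    parse: Callable[[str], object],
) -> Tuple[Dict[str, object], Dict[str, Exception]]:
    """Parse every (name, text) pair, skipping sources the parser rejects."""
    parsed: Dict[str, object] = {}
    failed: Dict[str, Exception] = {}
    for name, text in sources:
        try:
            parsed[name] = parse(text)
        except Exception as exc:  # broad on purpose: parser error types vary
            log.warning("skipping %s: %s", name, exc)
            failed[name] = exc
    return parsed, failed


def toy_parse(text: str) -> str:
    """Stand-in parser (hypothetical): rejects text containing 'record'."""
    if "record" in text:
        raise SyntaxError("unsupported syntax")
    return text.upper()
```

Returning the failures alongside the successes keeps the run going while still leaving a record of which files were skipped, which is useful if the parser is replaced later.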
From: Alex O. <no...@gi...> - 2025-05-13 03:25:55
|
Branch: refs/tags/3.9.0
Home: https://github.com/internetarchive/heritrix3
|
From: Alex O. <no...@gi...> - 2025-05-13 03:25:53
|
Branch: refs/heads/master
Home: https://github.com/internetarchive/heritrix3
Commit: 1827d93fe616c4f17c2a3658d56ae14c6da5a5e4
https://github.com/internetarchive/heritrix3/commit/1827d93fe616c4f17c2a3658d56ae14c6da5a5e4
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-13 (Tue, 13 May 2025)

Changed paths:
M commons/pom.xml
M contrib/pom.xml
M dist/pom.xml
M engine/pom.xml
M modules/pom.xml
M pom.xml

Log Message:
-----------
[maven-release-plugin] prepare for next development iteration
|
From: Alex O. <no...@gi...> - 2025-05-13 03:25:49
|
Branch: refs/heads/master
Home: https://github.com/internetarchive/heritrix3
Commit: dd40970f15f34254cae7f288846946149b5ab448
https://github.com/internetarchive/heritrix3/commit/dd40970f15f34254cae7f288846946149b5ab448
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-13 (Tue, 13 May 2025)

Changed paths:
M commons/pom.xml
M contrib/pom.xml
M dist/pom.xml
M engine/pom.xml
M modules/pom.xml
M pom.xml

Log Message:
-----------
[maven-release-plugin] prepare release 3.9.0
|
From: Alex O. <no...@gi...> - 2025-05-12 01:19:06
|
Branch: refs/heads/bidi
Home: https://github.com/internetarchive/heritrix3

Commit: 28ca857a0a775fded823d0ce311cea8173f3562b
https://github.com/internetarchive/heritrix3/commit/28ca857a0a775fded823d0ce311cea8173f3562b
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-12 (Mon, 12 May 2025)

Changed paths:
A commons/src/main/java/org/archive/net/WebdriverBiDi.java
M modules/pom.xml
A modules/src/main/java/org/archive/modules/extractor/ExtractorWebdriverBiDi.java
M modules/src/main/java/org/archive/modules/fetcher/FetchHTTP2.java

Log Message:
-----------
wip

Commit: 44013a9cd22d7a6f644047d2f76e583415ece128
https://github.com/internetarchive/heritrix3/commit/44013a9cd22d7a6f644047d2f76e583415ece128
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-12 (Mon, 12 May 2025)

Changed paths:
M commons/src/main/java/org/archive/net/WebdriverBiDi.java
A modules/src/main/java/org/archive/modules/extractor/ExtractorBrowser.java
R modules/src/main/java/org/archive/modules/extractor/ExtractorWebdriverBiDi.java

Log Message:
-----------
wip

Commit: 23810a8c71b4129dc2cedea75d8e38cc137346dd
https://github.com/internetarchive/heritrix3/commit/23810a8c71b4129dc2cedea75d8e38cc137346dd
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-12 (Mon, 12 May 2025)

Changed paths:
R commons/src/main/java/org/archive/net/WebdriverBiDi.java
A commons/src/main/java/org/archive/net/webdriver/BiDiEvent.java
A commons/src/main/java/org/archive/net/webdriver/BiDiJson.java
A commons/src/main/java/org/archive/net/webdriver/BiDiModule.java
A commons/src/main/java/org/archive/net/webdriver/BrowsingContext.java
A commons/src/main/java/org/archive/net/webdriver/LocalWebDriverBiDi.java
A commons/src/main/java/org/archive/net/webdriver/Network.java
A commons/src/main/java/org/archive/net/webdriver/Script.java
A commons/src/main/java/org/archive/net/webdriver/Session.java
A commons/src/main/java/org/archive/net/webdriver/WebDriverBiDi.java
A commons/src/main/java/org/archive/util/IdleBarrier.java
A engine/src/main/java/org/archive/crawler/browser/BrowserPage.java
A engine/src/main/java/org/archive/crawler/browser/BrowserProcessor.java
A modules/src/main/java/org/archive/modules/behaviors/Behavior.java
A modules/src/main/java/org/archive/modules/behaviors/ExtractLinksBehavior.java
A modules/src/main/java/org/archive/modules/behaviors/Page.java
A modules/src/main/java/org/archive/modules/behaviors/ScrollDownBehavior.java
R modules/src/main/java/org/archive/modules/extractor/ExtractorBrowser.java
M modules/src/main/java/org/archive/modules/fetcher/FetchHTTP2.java

Log Message:
-----------
wip

Commit: d9159c022be159337f123354527a211a9b7f3306
https://github.com/internetarchive/heritrix3/commit/d9159c022be159337f123354527a211a9b7f3306
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-12 (Mon, 12 May 2025)

Changed paths:
M commons/pom.xml
A commons/src/main/java/org/archive/net/MitmProxy.java
M engine/src/main/java/org/archive/crawler/browser/BrowserPage.java
M engine/src/main/java/org/archive/crawler/browser/BrowserProcessor.java
M modules/pom.xml

Log Message:
-----------
wip

Compare: https://github.com/internetarchive/heritrix3/compare/e2765615aad0...d9159c022be1
|
From: Alex O. <no...@gi...> - 2025-05-12 01:18:34
|
Branch: refs/heads/bidi
Home: https://github.com/internetarchive/heritrix3

Commit: 776951d0c4c66d9148546f240f7b77d47d748c83
https://github.com/internetarchive/heritrix3/commit/776951d0c4c66d9148546f240f7b77d47d748c83
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-12 (Mon, 12 May 2025)

Changed paths:
A commons/src/main/java/org/archive/net/WebdriverBiDi.java
M modules/pom.xml
A modules/src/main/java/org/archive/modules/extractor/ExtractorWebdriverBiDi.java
M modules/src/main/java/org/archive/modules/fetcher/FetchHTTP2.java

Log Message:
-----------
wip

Commit: 362823489a19b73217414bcfcbbaf0a952fc186f
https://github.com/internetarchive/heritrix3/commit/362823489a19b73217414bcfcbbaf0a952fc186f
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-12 (Mon, 12 May 2025)

Changed paths:
M commons/src/main/java/org/archive/net/WebdriverBiDi.java
A modules/src/main/java/org/archive/modules/extractor/ExtractorBrowser.java
R modules/src/main/java/org/archive/modules/extractor/ExtractorWebdriverBiDi.java

Log Message:
-----------
wip

Commit: aa46ce6499d23186b95368d1d8aaa36265a40e72
https://github.com/internetarchive/heritrix3/commit/aa46ce6499d23186b95368d1d8aaa36265a40e72
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-12 (Mon, 12 May 2025)

Changed paths:
R commons/src/main/java/org/archive/net/WebdriverBiDi.java
A commons/src/main/java/org/archive/net/webdriver/BiDiEvent.java
A commons/src/main/java/org/archive/net/webdriver/BiDiJson.java
A commons/src/main/java/org/archive/net/webdriver/BiDiModule.java
A commons/src/main/java/org/archive/net/webdriver/BrowsingContext.java
A commons/src/main/java/org/archive/net/webdriver/LocalWebDriverBiDi.java
A commons/src/main/java/org/archive/net/webdriver/Network.java
A commons/src/main/java/org/archive/net/webdriver/Script.java
A commons/src/main/java/org/archive/net/webdriver/Session.java
A commons/src/main/java/org/archive/net/webdriver/WebDriverBiDi.java
A commons/src/main/java/org/archive/util/IdleBarrier.java
A engine/src/main/java/org/archive/crawler/browser/BrowserPage.java
A engine/src/main/java/org/archive/crawler/browser/BrowserProcessor.java
A modules/src/main/java/org/archive/modules/behaviors/Behavior.java
A modules/src/main/java/org/archive/modules/behaviors/ExtractLinksBehavior.java
A modules/src/main/java/org/archive/modules/behaviors/Page.java
A modules/src/main/java/org/archive/modules/behaviors/ScrollDownBehavior.java
R modules/src/main/java/org/archive/modules/extractor/ExtractorBrowser.java
M modules/src/main/java/org/archive/modules/fetcher/FetchHTTP2.java

Log Message:
-----------
wip

Commit: e2765615aad0f8d46ca22f66f7d0d1265a5f622d
https://github.com/internetarchive/heritrix3/commit/e2765615aad0f8d46ca22f66f7d0d1265a5f622d
Author: Alex Osborne <aos...@nl...>
Date: 2025-05-12 (Mon, 12 May 2025)

Changed paths:
M commons/pom.xml
A commons/src/main/java/org/archive/net/MitmProxy.java
M engine/src/main/java/org/archive/crawler/browser/BrowserPage.java
M engine/src/main/java/org/archive/crawler/browser/BrowserProcessor.java
M modules/pom.xml

Log Message:
-----------
wip

Compare: https://github.com/internetarchive/heritrix3/compare/23acbfef65ea...e2765615aad0
|