sponge - A website crawler and links downloader command line tool

How to build and run it?

You will need a Java JDK (version 8 or above) and Maven 3.3.9 or above.

mvn clean package assembly:single

cd target && unzip sponge-X.X-SNAPSHOT.zip && cd sponge-X.X-SNAPSHOT

./sponge [OPTIONS]

How to use it?

Usage: sponge [OPTIONS]

Options:
  -u, --uri VALUE                 URI (example: https://www.google.com)
  -o, --output PATH               Output directory where files are downloaded
  -t, --mime-type TEXT            Mime types to download (example: text/plain)
  -e, --file-extension TEXT       Extensions to download (example: png)
  -d, --depth INT                 Search depth (default: 1)
  -m, --max-uris INT              Maximum uris to visit (default: 1000000)
  -s, --include-subdomains        Include subdomains
  -R, --concurrent-requests INT   Concurrent requests (default: 1)
  -D, --concurrent-downloads INT  Concurrent downloads (default: 1)
  -r, --referrer TEXT             Referrer (default: https://www.google.com)
  -U, --user-agent TEXT           User agent (default: Mozilla/5.0 (X11; Linux
                                  x86_64) AppleWebKit/537.36 (KHTML, like
                                  Gecko) Chrome/80.0.3987.132 Safari/537.36)
  -O, --overwrite                 Overwrite existing files
  -v, --version                   Show the version and exit
  -h, --help                      Show this message and exit
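
The crawl-limiting options (`-d`, `-m`, `-s`) can be pictured as a small breadth-first traversal. The following is only a sketch under stated assumptions, not sponge's actual implementation: `links` is a hypothetical in-memory dict standing in for fetched pages, the function names `same_site` and `crawl` are invented for illustration, and both the depth semantics (start page at depth 0) and the subdomain rule are guesses rather than behavior taken from the source code.

```python
from collections import deque
from urllib.parse import urlparse


def same_site(start_host, host, include_subdomains):
    """Assumed host rule: exact match, or a subdomain when the flag is set."""
    if host is None:
        return False
    if host == start_host:
        return True
    return include_subdomains and host.endswith("." + start_host)


def crawl(start_uri, links, depth=1, max_uris=1_000_000, include_subdomains=False):
    """Breadth-first visit of `links` (a dict mapping a URI to the URIs it links to).

    Assumptions: the start page is at depth 0, `depth` is the number of link
    hops followed from it, and `max_uris` caps the number of visited URIs.
    """
    start_host = urlparse(start_uri).hostname
    visited = []
    seen = {start_uri}
    queue = deque([(start_uri, 0)])
    while queue and len(visited) < max_uris:
        uri, level = queue.popleft()
        visited.append(uri)
        if level == depth:
            continue  # depth limit reached: do not follow this page's links
        for child in links.get(uri, []):
            host = urlparse(child).hostname
            if child not in seen and same_site(start_host, host, include_subdomains):
                seen.add(child)
                queue.append((child, level + 1))
    return visited
```

Under these assumptions, raising `depth` widens the set of pages reached from the start URI, while `max_uris` stops the traversal early regardless of depth.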

Examples

./sponge -u https://freemusicarchive.org/genre/Blues \
         -o output \
         -e mp3 \
         -d 2 \
         -R 5 \
         -s
...
↓ /home/spypunk/output/files.freemusicarchive.org/storage-freemusicarchive-org/music/no_curator/Checkie_Brown/hey/Checkie_Brown_-_09_-_Mary_Roose_CB_36.mp3 [8 MB] [4145.71 kB/s] [1/19]
↓ /home/spypunk/output/files.freemusicarchive.org/storage-freemusicarchive-org/music/no_curator/Lobo_Loco/Not_my_Brain/Lobo_Loco_-_01_-_Brain_ID_1270.mp3 [8 MB] [4065.18 kB/s] [2/42]
↓ /home/spypunk/output/files.freemusicarchive.org/storage-freemusicarchive-org/music/no_curator/Lobo_Loco/Not_my_Brain/Lobo_Loco_-_02_-_Brain_-_Instrumental_Retro_ID_1271.mp3 [8 MB] [4216.95 kB/s] [3/56]
↓ /home/spypunk/output/files.freemusicarchive.org/storage-freemusicarchive-org/music/no_curator/Lobo_Loco/Salad_Mixed/Lobo_Loco_-_12_-_Madness_is_Everywhere_ID_1228.mp3 [9 MB] [4178.34 kB/s] [4/65]
↓ /home/spypunk/output/files.freemusicarchive.org/storage-freemusicarchive-org/music/no_curator/Lobo_Loco/Salad_Mixed/Lobo_Loco_-_01_-_Allright_in_Lousiana_ID_1234.mp3 [8 MB] [3843.75 kB/s] [5/84]
↓ /home/spypunk/output/files.freemusicarchive.org/storage-freemusicarchive-org/music/no_curator/Lobo_Loco/Salad_Mixed/Lobo_Loco_-_04_-_Peaceful_Morning_ID_1229.mp3 [9 MB] [2889.31 kB/s] [6/85]
↓ /home/spypunk/output/files.freemusicarchive.org/storage-freemusicarchive-org/music/no_curator/Lobo_Loco/Salad_Mixed/Lobo_Loco_-_03_-_Spencer_-_Bluegrass_ID_1230.mp3 [9 MB] [3951.36 kB/s] [7/94]
↓ /home/spypunk/output/files.freemusicarchive.org/storage-freemusicarchive-org/music/no_curator/My_Yearnings/You_get_the_Blues_ID_1201.mp3 [10 MB] [3990.58 kB/s] [8/101]
↓ /home/spypunk/output/files.freemusicarchive.org/storage-freemusicarchive-org/music/no_curator/My_Yearnings/Tropic_Island_-_Clearmix_ID_1172.mp3 [8 MB] [4146.20 kB/s] [9/101]
↓ /home/spypunk/output/files.freemusicarchive.org/storage-freemusicarchive-org/music/no_curator/My_Yearnings/Traveling_Horse_ID_1207.mp3 [6 MB] [3411.93 kB/s] [10/101]
...
./sponge -u https://www.gutenberg.org/ebooks/search/?sort_order=release_date \
         -o output \
         -t text/plain \
         -d 2 \
         -R 5 \
         -D 5
...
↓ /home/spypunk/output/www.gutenberg.org/files/61671/61671-0.txt [34 KB] [202.73 kB/s] [1/1]
↓ /home/spypunk/output/www.gutenberg.org/files/61673/61673-0.txt [363 KB] [778.25 kB/s] [2/3]
↓ /home/spypunk/output/www.gutenberg.org/files/61667/61667-0.txt [280 KB] [359.20 kB/s] [3/4]
↓ /home/spypunk/output/www.gutenberg.org/files/61672/61672-0.txt [953 KB] [149.74 kB/s] [4/4]
↓ /home/spypunk/output/www.gutenberg.org/files/61666/61666-0.txt [866 KB] [438.76 kB/s] [5/6]
↓ /home/spypunk/output/www.gutenberg.org/files/61662/61662-0.txt [556 KB] [625.44 kB/s] [6/6]
↓ /home/spypunk/output/www.gutenberg.org/files/61670/61670-0.txt [140 KB] [397.26 kB/s] [7/7]
↓ /home/spypunk/output/www.gutenberg.org/files/61665/61665-0.txt [74 KB] [277.85 kB/s] [8/8]
↓ /home/spypunk/output/www.gutenberg.org/ebooks/61664.txt.utf-8 [388 KB] [801.17 kB/s] [9/9]
↓ /home/spypunk/output/www.gutenberg.org/files/61661/61661-0.txt [142 KB] [397.52 kB/s] [10/10]
...
./sponge -u https://free-images.com/ \
         -o output \
         -e jpeg \
         -e jpg \
         -e png \
         -d 2 \
         -R 5 \
         -D 5 \
         -s
...
↓ /home/spypunk/output/free-images.com/sm/de66/bees_in_hive.jpg [24 KB] [11229.43 kB/s] [283/300]
↓ /home/spypunk/output/free-images.com/sm/602e/bees_on_flowers_collecting.jpg [15 KB] [70732.48 kB/s] [284/300]
↓ /home/spypunk/output/free-images.com/sm/7491/bees_on_yellow_flowers.jpg [14 KB] [54516.06 kB/s] [285/300]
↓ /home/spypunk/output/free-images.com/sm/e081/bees_pollenating_basil.jpg [15 KB] [57209.33 kB/s] [286/300]
↓ /home/spypunk/output/free-images.com/sm/7e8b/bees_pollenating_insects_bugs.jpg [13 KB] [52990.93 kB/s] [287/300]
↓ /home/spypunk/output/free-images.com/sm/a8ad/bees_pollen_insects_wings.jpg [7 KB] [17312.71 kB/s] [288/300]
↓ /home/spypunk/output/free-images.com/sm/9b0c/bees_really_like_pollinating_0.jpg [11 KB] [66770.38 kB/s] [289/300]
↓ /home/spypunk/output/free-images.com/sm/2d32/delicate_arch_utah_arches.jpg [8 KB] [44408.98 kB/s] [290/300]
↓ /home/spypunk/output/free-images.com/sm/bf50/landscape_arch.jpg [16 KB] [73487.99 kB/s] [291/300]
↓ /home/spypunk/output/free-images.com/sm/0f6a/golden_arches_omaha.jpg [10 KB] [65179.10 kB/s] [292/300]
...
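
The progress lines shown above follow a regular shape, so they can be post-processed, for example to count downloads per host. The sketch below is an assumption-laden illustration rather than anything shipped with sponge: the line format is inferred only from the sample output, the final `[n/m]` pair is treated as opaque counters whose exact meaning is not documented here, and `parse_progress_line` is a hypothetical helper name.

```python
import re
from pathlib import PurePosixPath

# Format inferred from the sample output:
#   ↓ <path> [<size> <unit>] [<speed> kB/s] [<done>/<total>]
LINE = re.compile(
    r"^↓ (?P<path>.+) \[(?P<size>\d+) (?P<unit>[A-Za-z]+)\] "
    r"\[(?P<speed>\d+(?:\.\d+)?) kB/s\] \[(?P<done>\d+)/(?P<total>\d+)\]$"
)


def parse_progress_line(line, output_dir):
    """Return a dict describing one progress line, or None if it does not match."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    path = PurePosixPath(match["path"])
    try:
        # In the samples, the first directory under the output directory is the host.
        host = path.relative_to(output_dir).parts[0]
    except (ValueError, IndexError):
        host = None  # the path is not located under output_dir
    return {
        "path": str(path),
        "host": host,
        "size": int(match["size"]),
        "unit": match["unit"],
        "speed_kb_s": float(match["speed"]),
        "done": int(match["done"]),
        "total": int(match["total"]),
    }
```

Feeding each line of a captured run through `parse_progress_line(line, "/home/spypunk/output")` and skipping the `None` results (such as the `...` separators) yields records that can be grouped by `host` or summed by `size`.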

What about the license?

This project is licensed under the WTFPL (Do What The Fuck You Want To Public License, Version 2).

Copyright © 2019-2020 spypunk <spypunk@gmail.com>

This work is free. You can redistribute it and/or modify it under the terms of the Do What The Fuck You Want To Public License, Version 2, as published by Sam Hocevar. See the COPYING file for more details.

Source: README.md, updated 2020-03-25