
WSPR Analytics Source Data

This project is a 7z recompression of the original spot archives from WSPRnet. As the project progresses, analytic applications, scripts, and utilities will be added for general public use.

All original *.csv.gz archives were decompressed and imported into MongoDB and PostgreSQL to check for CSV import errors (a sketch of the import step follows the list). Two months had issues:

  1. wsprspots-2013-01.csv.gz
  2. wsprspots-2013-02.csv.gz
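
The exact import commands are not recorded here; a minimal sketch of the MongoDB side, assuming the standard WSPRnet column order and hypothetical database names, might look like:

# Decompress each month and import it for a CSV error check
# (field names are assumptions; the WSPR CSVs have no header row)
for f in ./wsprspots-*.csv.gz; do
    gunzip -k "$f"
    mongoimport --db wspr --collection spots --type csv \
        --fields "spot_id,timestamp,reporter,reporter_grid,snr,frequency,callsign,grid,power,drift,distance,azimuth,band,version,code" \
        --file "${f%.gz}"
done
# A similar check can be run in PostgreSQL, e.g. with \copy into a staging table.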

Both files had the same anomaly: IZ"WMD is a malformed callsign that fails to import properly. The affected lines were removed and the files recompressed with the following commands:

# Remove errored spots
sed -i '/IZ\"WMD/d' ./wsprspots-2013-01.csv
sed -i '/IZ\"WMD/d' ./wsprspots-2013-02.csv

# 7z compression command:
7z a -mx=9 $file.7z ./$file.csv
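
A quick sanity check, not part of the original process, confirms the malformed callsign is gone and shows the cleaned line counts:

# Verify the cleanup (optional)
grep -c 'IZ"WMD' ./wsprspots-2013-01.csv ./wsprspots-2013-02.csv
wc -l ./wsprspots-2013-01.csv ./wsprspots-2013-02.csv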

Compression and Stats

The following tests were run to see which options proved most beneficial. On the files tested, 7z reduces the archive size by 45% to 50% relative to the original .gz files.
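
For any single month, that figure can be reproduced by comparing the .gz and .7z sizes directly. A minimal sketch, assuming GNU coreutils and an example month name:

# Percentage saved by 7z relative to the original .gz (example month assumed)
gz=$(stat -c%s wsprspots-2014-06.csv.gz)
sz=$(stat -c%s wsprspots-2014-06.7z)
echo "scale=1; 100 * ($gz - $sz) / $gz" | bc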

Per-file import statistics can be reviewed in the fileinfo folder.

Stats JSON Fields

The structure of the stats file is as follows:

  • _id is the ObjectId key for the document
  • fileName is the name of the WSPR CSV file
  • lineCount is the number of decodes in the archive
  • csvSize is the on-disk CSV file size in bytes
  • archiveSize is the on-disk 7z file size in bytes after compression
  • processDate is the ISODate when the file was processed

NOTE: The structure could change over time, but for now this is all that is being tracked. The key element is lineCount, as it will be used for a number of metrics without having to query the database or re-count lines from the source files.
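
As a sketch (assuming the stats are kept as one JSON document per month in the fileinfo folder), the total number of decodes across all archives could be summed with jq, without touching the database or the CSV files:

# Hedged sketch: sum lineCount across the exported stats documents
jq -s 'map(.lineCount) | add' ./fileinfo/*.json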

Compression Tests

7z Compression Tests

1) 7z a -mx=9 -mfb=273 -ms=on $file.7z ./*.csv
   Results :    raw csv = 270.1 MB (270,072,472 bytes)
                gz      =  49.7 MB (49,707,897 bytes)
                7z      =  26.1 MB (26,099,881 bytes)

2) 7z a -t7z -m0=lzma2 -mx=9 -mfb=64 -md=1024m -ms=on $file.7z ./*.csv
   Results :    raw csv = 270.1 MB (270,072,472 bytes)
                gz      =  49.7 MB (49,707,897 bytes)
                7z      =  26.1 MB (26,098,713 bytes)

3) 7z a -mx=9 $file.7z ./*.csv
   Results :    raw csv = 270.1 MB (270,072,472 bytes)
                gz      =  49.7 MB (49,707,897 bytes)
                7z      =  26.1 MB (26,099,881 bytes)

4) 7z a -t7z -mx=9 -mfb=273 -ms -md=31 -myx=9 -mtm=- -mmt -mmtf -md=1536m -mmf=bt3 -mmc=10000 -mpb=0 -mlc=0 $file.7z ./*.csv
   Results :    raw csv = 270.1 MB (270,072,472 bytes)
                gz      =  49.7 MB (49,707,897 bytes)
                7z      =  24.6 MB (24,551,235 bytes)

While method (4) creates a smaller archive, it is very time consuming. Method (3) seems to be the best all-around solution in terms of speed and post-compression archive size.
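
One way to quantify that trade-off, as a sketch rather than part of the original tests, is to time both switch sets on the same extracted month:

# Rough wall-clock comparison on one extracted month (example name assumed)
file=wsprspots-2014-06
time 7z a -mx=9 "$file-m3.7z" "./$file.csv"    # method (3)
# repeat with the full method (4) switch set above and compare elapsed times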
