On Oct 21, 2015, the United States Federal Communications Commission (FCC) announced they would be releasing data to support robocall blocking technologies. On May 23rd, 2016, the FCC announced a new method for obtaining the information. It appears to be either updated in real time or daily.
As of NCID 1.7, a get-fcc-list script has been added to take a digest of the FCC data and make it available as a blacklist. The list can either be appended to the ncidd.blacklist file or used with the hangup-fcc extension. It is based on Mike Stember's original FCC2ncid script.
In NCID 1.7, documentation is available from man get-fcc-list and from the NCID User Manual.
Interesting note: The data made available on 2015-11-02 has exactly the same number of bad phone numbers with 3 or more reports, 404, as the previous week's data from the FCC.
My apologies. I already broke my word about keeping the link stable.
The data that would be appended to your ncidd.alias file is in the ncidd-FCC.alias file. The name is more intuitive. This also sets up for the future version of NCID that will essentially allow aliases in the blacklist file. There will be a file ncidd-FCC.blacklist that will be appended to your ncidd.blacklist.
See attached script for the up-to-date version.
Last edit: Mike 2016-01-09
Here is an improved version of your code, without your comments to reduce post size. It eliminates the bad lines, makes getting the list optional, and creates either an alias or a blacklist file. Consider it a suggestion; it can be improved.
If the FCC file location changes every week, then the -g option needs an argument that is the new location of the FCC file.
No reason to do a sudo as the script does not replace either ncidd.alias or ncidd.blacklist. Note that sed now gets rid of non-numeric and blank lines.
Each time you run the script, it either creates the wanted file or replaces it. Using #= in the blacklist file takes advantage of a new feature in NCID release 1.3 while remaining compatible with current releases.
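The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the attached script: the function name make_blacklist, the input of one number per line, the wording of the start and end comments, and the placement of the #= tag on each entry are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not from the attached script): reads candidate phone
# numbers on stdin, one per line, and writes a blacklist fragment on stdout.
# $1 is the date stamp to record, e.g. 2016-05-23.
make_blacklist() {
    stamp=$1
    # Keep only lines made up entirely of digits, so blank and
    # non-numeric lines are dropped.
    sed -n '/^[0-9][0-9]*$/p' |
    awk -v stamp="$stamp" '
        # Start comment with the date (assumed wording).
        BEGIN { print "# FCC bad list start " stamp }
        # One entry per line, tagged with #= (assumed layout).
              { print $0 " #= FCC bad list " stamp }
        # End comment (assumed wording).
        END   { print "# FCC bad list end" }
    '
}
```

For example, `make_blacklist 2016-05-23 < numbers.txt > ncidd-FCC.blacklist` would recreate the output file on every run, where numbers.txt is a placeholder for the extracted number list.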
Edit 1: removed the awk -v option, added awk BEGIN section to print start comment with date, and added awk END section to print end comment.
Edit 2: removed the "-g" option and the "get" variable, changed wget -S to wget -N
Last edit: John L. Chmielewski 2015-11-26
I took your script, thought about it, and decided the script could just output both files every time it is run. Since the script only needs to run once a week, optimizing it for best performance is far from critical. After the data is downloaded from the FCC the processing only takes a few seconds on a Raspberry Pi B (the original, not B2).
wget has an option (-N) that checks the timestamp of the remote file and skips the download if the local copy has the same or a newer date. It appears the FCC uploads the data at about 8:30 AM Eastern time, based on the timestamp of the file.
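A minimal sketch of that timestamp check, assuming a wrapper named fetch_if_newer and a WGET override variable (both hypothetical, added here only so the command can be dry-run without network access):

```shell
#!/bin/sh
# WGET can be pointed at another command for dry runs (hypothetical knob).
WGET=${WGET:-wget}

# Hypothetical wrapper: download $1 into directory $2 only when the
# remote copy is newer than the local one.
fetch_if_newer() {
    url=$1
    dir=$2
    # Subshell so the caller's working directory is left unchanged.
    (
        cd "$dir" || exit 1
        # -N (--timestamping) skips the download when the local file
        # has the same or a newer date than the remote file.
        "$WGET" -N "$url"
    )
}
```

Because -N compares against the file already in the current directory, the download has to run in the same directory every time for the check to have any effect.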
The current version of the script is attached. It is certainly not complete yet. Can you let me know how you know where the ncidd.alias and ncidd.blacklist files are stored?
Mike, is there a possibility of a "ncidFCCimportLocal" for use with FCC2ncid?
William, I'm assuming you either have generated the ncid-FCC.blacklist locally or you just have another source of data you want to import into the blacklist. If that is true, what you want is a substantial simplification of the ncidFCCimport.
The part where it tries to download a new ncid-FCC.blacklist and retries 12 times once per hour would be deleted. It would just use an existing local file rather than trying to download a remote file until it becomes a local file.
Please share any details you have. I'm sure something could be whomped up pretty quickly. In fact, what I believe you want is what I could do since I generate the ncid-FCC blacklist file and upload it to sourceforge. To make sure everything is working properly for everyone else, I don't take the shortcut and download the file from sourceforge.
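The retry behaviour mentioned above (12 attempts, one per hour) could be sketched along these lines. The function name retry and its argument order are assumptions for illustration, not taken from ncidFCCimport.

```shell
#!/bin/sh
# Hypothetical helper: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS tries have been used,
# sleeping DELAY_SECONDS between tries.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # No point sleeping after the final attempt.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}
```

A call such as `retry 12 3600 fetch_new_list` would then match the schedule described, where fetch_new_list is a placeholder for whatever command exits non-zero until a newer list is available. A local-only variant would simply skip this step and read the existing file.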
Sorry I wasn't clearer. I use FCC2ncid to download the file. Just want to fully automate integrating the final file, via cron.
Hi William. Please try the new FCC2ncidimport script. I could not test it thoroughly simply because it does a date check against the FCC file and waits for it to be newer than the local one. That won't happen until tomorrow sometime. Your comments and error reports are welcome to help debug any issues.
Initial testing:
with old CSV:
With no CSV:
Backup names (from the description I thought they'd have a timestamp in the name):
ncidd.blacklist.
ncidd.alias.
It did append the entries on the no CSV run. Possibly add a blank line or two before them for "neatness", as NCIDpop adds its entries at the end.
William
In my cut and paste of the two scripts there was an error in the bit that backed up the old ncidd.whitelist and ncidd.blacklist. It has now been corrected so the backups are timestamped.
Fixing this also pointed out an issue with the FCC2ncid script where it would go ahead and reprocess the FCC file even though there was no new one to download. It no longer does this. Human-readable dates are now displayed to show the dates of the local file and the remote file.
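A timestamped backup of the kind described can be sketched as below. The function name backup_with_timestamp is hypothetical, and the date format is chosen to produce names such as ncidd.blacklist.2016-01-25_15:15:12; the actual script may do this differently.

```shell
#!/bin/sh
# Hypothetical helper: copy FILE to FILE.YYYY-MM-DD_HH:MM:SS and print
# the name of the backup that was created.
backup_with_timestamp() {
    file=$1
    stamp=$(date +%Y-%m-%d_%H:%M:%S)
    # -p keeps the original modification time and permissions on the copy.
    cp -p "$file" "$file.$stamp" && printf '%s\n' "$file.$stamp"
}
```

Usage would be along the lines of `backup_with_timestamp /etc/ncid/ncidd.blacklist` before anything is appended to the file; the /etc/ncid path is only an example location.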
Tried the newest. It seems to work as expected. There wasn't a new file today, so I can only guess that that logic works. I did delete the CSV and run it, and that part works. I'd guess, though, that this error, which appears on the first run when the CSV doesn't exist, might scare some people:
New names:
ncidd.alias.2016-01-25_15:15:12
ncidd.blacklist.2016-01-25_15:15:12
Thanks for the script and the work. This really helps in making a nearly set it and forget it blocker on a RPi. On my RPi, with a 25MB modem download/Wireless G connection, it only took 5.2s to download the 5MB file.
Last edit: William Jacoby 2016-01-25
Okay, one suggestion comes to mind. Move finding $etc_ncid to before wget so you can change to that directory. That leaves all the files in the ncid directory; right now Telemarketing_RoboCall_Weekly_Data.csv will most likely be saved in the home directory of the cron user:
cd $etc_ncid
I'm working on your suggestion as it's a good one:
1) If the FCC file does not exist locally, work around the worrying message that appears because the date of the new file is compared with the null string that results from asking for the date of a non-existent file.
2) Download to and modify only files in the etc/ncid directory. It does not make sense to allow the files to end up spread across different users' directories. The issue, as usual, is how to let the user run a complicated pipe that processes the information under sudo. It's a syntax thing.
The easiest would be to test whether the file ($etc_ncid/$FCC_csv) exists; if it doesn't and the user has sudo (i.e. something like if [[ $(id -u) -ne 0 ]]), assume it's a first run.
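The existence check suggested above might look like this in outline. Both function names and the message wording are hypothetical; the point is only that the file is tested before any date comparison is attempted.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the local CSV does not exist yet.
is_first_run() {
    [ ! -f "$1" ]
}

# Hypothetical helper: report on the local copy without ever asking
# for the date of a file that is not there.
describe_local_copy() {
    if is_first_run "$1"; then
        echo "no local copy yet, treating this as a first run"
    else
        echo "local copy found: $1"
    fi
}
```

In the script this would be called with "$etc_ncid/$FCC_csv", and the date comparison would only be reached in the second branch.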
Ok, tried new version. Once I uncommented and fixed the change directory line it works.
You currently have:
It should be:
Again, thanks for taking suggestions into consideration.
Mike, I had an idea for log pruning. I've been running the FCC2ncidimport file for the last two months. I just did:
sudo rm ncidd.*.2016-01*
So the logic, I think, would be to check the month and, if backups from the PREVIOUS month exist, delete that month's set of backups.
I really hate to do any 'rm' and accidentally delete needed files. I'd feel really bad if that happened. I'm thinking of adding a check that just tells the user how many back up files there are and suggesting a manual clean up with a description of which files to purge.
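A report-only check of that kind, which counts the backups and leaves deletion to the user, could be sketched as follows. The function name report_backups and the message text are assumptions, and the glob patterns assume backup names that start with a YYYY-MM- date stamp.

```shell
#!/bin/sh
# Hypothetical helper: count timestamped backups in directory $1 and
# suggest a manual cleanup. Nothing is ever deleted here.
report_backups() {
    dir=$1
    count=0
    for f in "$dir"/ncidd.alias.[0-9][0-9][0-9][0-9]-[0-9][0-9]-* \
             "$dir"/ncidd.blacklist.[0-9][0-9][0-9][0-9]-[0-9][0-9]-*; do
        # An unmatched pattern is left as literal text, so skip it.
        [ -e "$f" ] || continue
        count=$((count + 1))
    done
    echo "$count backup file(s) in $dir"
    if [ "$count" -gt 0 ]; then
        echo "Old backups can be removed by hand, e.g. everything dated before the current month."
    fi
}
```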
It looks like the FCC has changed the way they publish data. 5/23/2016 may be the last list we see published at the same link https://consumercomplaints.fcc.gov/hc/en-us/articles/205239443-Data-on-Unwanted-Calls
"The May 23, 2016, update to data on unwanted calls is available for download below. Subsequent updates to data on telemarketing will be available via the FCC's Consumer Complaint Data Center, on the Charts and Graphs page."
It seems they are going to daily updates in a different format/location. I am hoping FCC2ncidimport will be updated soon to work with the new changes.
Last edit: Jeff 2016-05-27
Jeff, Thanks for pointing out the change. I'll get to work on it. It will also light a fire under me to tune FCC2ncidimport to work more widely.
Mike
Thanks Mike, I appreciate it. I am not seeing a precompiled CSV anymore, it could be a big undertaking/change.
Last edit: Jeff 2016-05-27
It looks like someone at the FCC has anticipated this need and made up a filtered version of the enormous database. Here is the link:
https://opendata.fcc.gov/api/views/vakf-fz8e/rows.csv?accessType=DOWNLOAD
I am waiting to see if the 'vakf-fz8e' is stable day-to-day, week-to-week, etc.
https://opendata.fcc.gov/api/views/vakf-fz8e/rows.csv?accessType=DOWNLOAD looks stable so far.
Unfortunately, the size increased from 20M to 50M so the download time is significantly worse.
On the other hand, the FCC site lets you make an account and save your own view. I made an account 'NCID-FCC' and made an 'NCID-01' view. You can download it yourself with this URL:
https://opendata.fcc.gov/api/views/tiux-yc7f/rows.csv?accessType=DOWNLOAD
It is a much more svelte 12MB. It could be much smaller except it requires the ISSUE field used to filter in Robocallers and Telemarketers to be included. Otherwise it would only be 3MB.
I had to trick it into not including reports without a phone number by having the filter make sure a '-' was in the phone number field.
I'd imagine we cannot simply replace:
wget -N http://consumercomplaints.fcc.gov/hc/theme_assets/513073/200051444/$FCC_csv
with
wget -N https://opendata.fcc.gov/api/views/tiux-yc7f/rows.csv?accessType=DOWNLOAD
and have FCC2ncidimport continue to work as in the past?
It will be nice to have daily updates. One thing I liked about the last version is that the revision date was included in ncidd.blacklist (#= FCC bad list 2016-05-23); it was an easy way to know for sure when the last successful cron update was. If possible, I hope something similar can be included.
Last edit: Jeff 2016-06-01
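One possible starting point for the download step with the new URL, offered as an untested sketch rather than an updated FCC2ncidimport. The default file name in FCC_csv, the WGET override, and the function name fetch_fcc_view are hypothetical, and nothing here addresses whether the rest of the script can parse the new CSV layout.

```shell
#!/bin/sh
# URL quoted in the thread for the saved 'NCID-01' view.
FCC_url='https://opendata.fcc.gov/api/views/tiux-yc7f/rows.csv?accessType=DOWNLOAD'
# Hypothetical local name; the real script sets its own $FCC_csv.
FCC_csv=${FCC_csv:-fcc-unwanted-calls.csv}
# WGET can be pointed at another command for dry runs (hypothetical knob).
WGET=${WGET:-wget}

# Hypothetical helper: fetch the saved view into $FCC_csv.
fetch_fcc_view() {
    # The URL is quoted so the shell does not treat '?' as a glob
    # character, and -O gives the download a fixed local name instead
    # of one derived from the URL and its query string.
    "$WGET" -O "$FCC_csv" "$FCC_url"
}
```

Note that this version downloads the file on every run: it does not use -N, since wget's timestamp check is not meant to be combined with -O. A date stamp for the blacklist comment would then have to come from somewhere else, for example the date of the run.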