Name | Modified | Size | Downloads / Week |
---|---|---|---|
music_galaxy.py | 2020-07-14 | 6.5 kB | |
_gitignore | 2020-07-11 | 1.8 kB | |
LICENSE | 2020-07-11 | 1.1 kB | |
README.md | 2020-07-11 | 1.5 kB | |
requirements.txt | 2020-07-11 | 80 Bytes | |
raw_data.txt | 2020-07-11 | 741.6 kB | |
data_scraper.py | 2020-07-11 | 2.8 kB | |
Totals: 7 Items | 755.3 kB | 0 |
MusicianFamilyTree
A galaxy of 6491 musicians and their pupils, showing the complicated web of the history of music.
Overview
- The code scrapes data from Wikipedia, specifically the articles listing music students by teacher (e.g., the A to B article), using the BeautifulSoup and Requests modules.
- The data was originally scraped into a CSV-style file, with each teacher stored as a group of lines in the following layout:
Line | | | | |
---|---|---|---|---|
1 | teacher_name | ... | | |
2 | teacher_href | ... | | |
3 | pupil_name_0 | pupil_name_1 | pupil_name_2 | ... |
4 | pupil_href_0 | pupil_href_1 | pupil_href_2 | ... |
5 | | | | |
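
The layout above could be read back with a few lines of Python. This is only a minimal sketch, not the project's own code: it assumes the fields are comma-separated, that line 5 of each group is a blank separator, and that the first field on lines 1 and 2 holds the teacher's name and link. The function name `parse_records` and the sample names are invented for illustration.

```python
import csv
import io


def parse_records(text):
    """Parse four-line teacher records; blank separator lines are skipped.

    Assumed layout (not confirmed by the README): comma-separated fields,
    one group of four data lines per teacher.
    """
    # csv.reader yields an empty list for a blank line, so filter those out.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    records = []
    # Walk the remaining rows four at a time; an incomplete trailing group is ignored.
    for i in range(0, len(rows) - 3, 4):
        teacher_row, href_row, pupil_names, pupil_hrefs = rows[i:i + 4]
        records.append({
            "teacher": teacher_row[0].strip(),
            "href": href_row[0].strip(),
            "pupils": [(n.strip(), h.strip()) for n, h in zip(pupil_names, pupil_hrefs)],
        })
    return records


# Made-up sample data in the assumed layout.
SAMPLE = (
    "Teacher A\n"
    "/wiki/Teacher_A\n"
    "Pupil X,Pupil Y\n"
    "/wiki/Pupil_X,/wiki/Pupil_Y\n"
    "\n"
    "Pupil X\n"
    "/wiki/Pupil_X\n"
    "Pupil Z\n"
    "/wiki/Pupil_Z\n"
)

records = parse_records(SAMPLE)
for record in records:
    print(record["teacher"], "->", [name for name, _ in record["pupils"]])
```

Pairing each pupil name with its link via `zip` keeps the two lists aligned, at the cost of silently dropping extras if one line is longer than the other.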
- Then, the wonderful NetworkX module is used to create a graph object, and Matplotlib is used to style and render it.
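
As a rough illustration of the graph step, the sketch below builds a NetworkX graph from teacher-to-pupil relations and renders it with Matplotlib. It is not taken from `music_galaxy.py`: the choice of a directed graph, the spring layout, the colours, and the names `build_graph`, `draw_graph`, and `LINEAGE` are all assumptions made for the example.

```python
import networkx as nx


def build_graph(lineage):
    """Build a directed graph with one edge per teacher -> pupil relation."""
    graph = nx.DiGraph()
    for teacher, pupils in lineage.items():
        for pupil in pupils:
            graph.add_edge(teacher, pupil)
    return graph


def draw_graph(graph, out_path="galaxy.png"):
    """Render the graph to an image file (all styling values are arbitrary)."""
    import matplotlib
    matplotlib.use("Agg")  # render off-screen so no display is needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 8), facecolor="black")
    positions = nx.spring_layout(graph, seed=0)  # fixed seed for a repeatable layout
    nx.draw_networkx_edges(graph, positions, ax=ax, edge_color="gray",
                           alpha=0.4, arrows=False)
    nx.draw_networkx_nodes(graph, positions, ax=ax, node_size=10,
                           node_color="white")
    ax.set_axis_off()
    fig.savefig(out_path, facecolor=fig.get_facecolor())
    plt.close(fig)


# Made-up relations: each key is a teacher, each value a list of pupils.
LINEAGE = {
    "Teacher A": ["Pupil X", "Pupil Y"],
    "Pupil X": ["Pupil Z"],
}

graph = build_graph(LINEAGE)
print(graph.number_of_nodes(), "musicians,", graph.number_of_edges(), "teacher-pupil links")
```

Calling `draw_graph(graph)` would then write the image to `galaxy.png`. A directed graph keeps the teacher-to-pupil direction, so a musician who appears both as a pupil and as a teacher becomes a single node linking two generations.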