MusicianFamilyTree
A galaxy of 6491 musicians and their pupils, showing the complicated web of the history of music.
Overview
- The code scrapes data from Wikipedia, specifically from the articles listing music students by teacher (e.g., from A to B), using the Requests module to fetch the pages and BeautifulSoup to parse them.
- The data was originally scraped into a CSV format, laid out as follows:
| Line | | | | |
|---|---|---|---|---|
| 1 | teacher_name | ... | | |
| 2 | teacher_href | ... | | |
| 3 | pupil_name_0 | pupil_name_1 | pupil_name_2 | ... |
| 4 | pupil_href_0 | pupil_href_1 | pupil_href_2 | ... |
| 5 | | | | |
- Then, the wonderful NetworkX module is used to create a graph object from the teacher-pupil relationships, and Matplotlib is used to style and draw it.
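
As a rough illustration of the layout above, the sketch below reads such a file into teacher-pupil pairs using only the standard library. It is not the project's actual code. It assumes that records repeat in blocks of five lines (the fifth being blank), that fields are comma-separated, and that the first field of lines 1 and 2 holds the teacher's name and link. The function names `parse_records` and `teacher_pupil_edges` are hypothetical.

```python
import csv
from typing import Dict, List, Tuple


def _split(line: str) -> List[str]:
    """Split one comma-separated line into stripped, non-empty fields."""
    row = next(csv.reader([line]), [])
    return [cell.strip() for cell in row if cell.strip()]


def parse_records(text: str) -> List[Dict[str, object]]:
    """Parse 5-line records: name, href, pupil names, pupil hrefs, blank.

    Assumption: the file is a plain repetition of this 5-line block.
    """
    lines = text.splitlines()
    records = []
    for start in range(0, len(lines), 5):
        chunk = lines[start:start + 4]  # the fifth line is ignored
        if len(chunk) < 4:
            break  # incomplete trailing record
        name_row, href_row, pupil_names, pupil_hrefs = (_split(l) for l in chunk)
        if not name_row:
            continue  # skip records without a teacher name
        records.append({
            "teacher": name_row[0],
            "teacher_href": href_row[0] if href_row else "",
            "pupils": pupil_names,
            "pupil_hrefs": pupil_hrefs,
        })
    return records


def teacher_pupil_edges(records: List[Dict[str, object]]) -> List[Tuple[str, str]]:
    """Flatten records into (teacher, pupil) pairs, one per graph edge."""
    return [(r["teacher"], pupil) for r in records for pupil in r["pupils"]]


# Made-up sample data, purely for illustration.
sample = (
    "Teacher A\n/wiki/Teacher_A\n"
    "Pupil X,Pupil Y\n/wiki/Pupil_X,/wiki/Pupil_Y\n"
    "\n"
    "Teacher B\n/wiki/Teacher_B\n"
    "Pupil Z\n/wiki/Pupil_Z\n"
)
edges = teacher_pupil_edges(parse_records(sample))
```

A list of pairs like `edges` could then be loaded into NetworkX, for example with `networkx.DiGraph().add_edges_from(edges)`. Whether the project uses a directed or an undirected graph is not stated here, so that choice is also an assumption.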
