On 7/16, Slashdot Media sites (including Slashdot and SourceForge) experienced a storage fault. Work has continued 24×7 on service restoration. Updates have been provided as each key service component was restored. We’ve provided two prior updates (7/18 and 7/22) summarizing our infrastructure and service restoration status. This is our third large update.
- Slashdotmedia.com – online
- Slashdot.org – online
- Slashdot Engineering infrastructure – online
- Slashdot Media’s WordPress sites – online
- SourceForge Engineering infrastructure – online
- Slashdot Media operations infrastructure – online
- SourceForge databases – online
- SourceForge download service – online
- SourceForge Directory services (project summary page, download pages, search, front page, directory) – online
- SourceForge Developer Services – partially restored (see detailed status below)
- SourceForge site’s Developer pages backed by Apache Allura (tickets, wikis, forums) – online
- SourceForge Mailing List services (email, web archives, archiving) – online as of 7/22, archiving restored 7/23
- SourceForge Project Database (MySQL) service – online
- SourceForge Project Web service – online as of 7/22, except k* projects (restore in progress); session store corrected 7/23
- SourceForge User Web service – online as of 7/22
- SourceForge Project Web file management – online as of 7/23
- SourceForge Allura Git service – online as of 7/22
- SourceForge Allura Mercurial (Hg) service – online as of 7/23
- SourceForge File Upload service – offline, filesystem checks complete, cryptographic summing projected to complete 7/24. Prep of data for service resumption is in progress. ETA to follow once I/O performance is calculated during mount reconstruction.
- SourceForge Allura Subversion (SVN) service – offline, filesystem checks complete, data restoration has completed 22 letters (4 remain). This is our current restore priority. We project restore of data to complete by 7/25, to be followed by data validation and restore of service. ETA to follow once I/O performance calculated during data validation.
- SourceForge CVS service – offline, filesystem checks and data restoration to commence after Allura-backed SVN service is restored. ETA to follow once SVN restore is completed. The CVS data is 20% of the size of the SVN data, but requires a higher degree of manual validation; this data point will be used to estimate the restoration timetable.
- SourceForge non-Allura SCM platforms – offline, filesystem checks and data restoration to occur once CVS restoration is under way. ETA to follow once SVN restore is completed. This service will be restored last. The non-Allura SCM data set is substantially smaller than the SVN data set; this data point will be used to estimate the restoration timetable.
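For readers unfamiliar with the "cryptographic summing" step mentioned for the File Upload service, the general idea can be sketched as follows. This is an illustration only, not the tooling used in this restoration: the hash algorithm (SHA-256), the function names, and the manifest layout are all assumptions made for the example. It hashes every file under a directory tree and compares the result against a manifest built from a reference copy.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash one file in fixed-size chunks so large files need little memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_manifests(expected: dict[str, str], actual: dict[str, str]) -> dict[str, list[str]]:
    """Report files that are missing, unexpected, or changed in the restored copy."""
    shared = expected.keys() & actual.keys()
    return {
        "missing": sorted(expected.keys() - actual.keys()),
        "unexpected": sorted(actual.keys() - expected.keys()),
        "changed": sorted(name for name in shared if expected[name] != actual[name]),
    }
```

In this sketch, a restored tree would be considered validated when all three lists returned by `diff_manifests` are empty; any entry points to a file that needs a closer look before service resumes.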
Engagement with our storage platform vendor will continue, including review of captured data. Post-mortem activity is anticipated after data restoration is completed. The team continues to split its work between data restoration and service restoration so as to expedite the return to full service.
Knowledge capture has been continuous throughout this outage and will drive continuous improvement. A few key points resulting from this process to date:
- Transition of two SourceForge databases from centralized storage platform SSD to local storage SSD (Intel P3600s) was completed 7/24. Function and performance have been validated.
- Review of I/O workloads is ongoing to further expedite service restoration.
- Users on “Classic” non-Allura-backed SCM services should anticipate an upcoming pre-announced migration to Allura-backed service (which was restored first).
- Additional storage is being onboarded at this time. In some cases we currently have three copies of production data to maintain during restoration.
We intend to continue our existing communications approach: incremental updates will be provided on individual service restoration, and large updates (like this one) will be provided with additional metrics and technical details as work progresses.
Work continues 24×7 on restoration of SourceForge file upload and yet-unrestored SCM services (per above list).
Thank you for your continued support and patience.