From: Gustaf N. <ne...@wu...> - 2024-06-17 17:51:15
Dear all,

I've committed the following change to the NaviServer GitHub repository. It significantly improves FORM uploads of large files: file uploads via multipart/form-data (the usual format) larger than 4GB can now be handled without crashing.

The support is just for NaviServer itself. Applications (e.g. OpenACS) might still try to read such large files into Tcl_Objs, which leads to crashes with Tcl 8. Loading huge data into memory is not good practice and leads to memory bloat, but it should at least work with Tcl 9.

The code replaces an implementation that has not changed for the last 17 years. I am planning to backport this change to the 4.99 branch as well.

All the best
-g

Added experimental command "ns_fseekchars" and use it in form.tcl for parsing multipart/form-data

As a consequence, the file-based parser of multipart/form-data (a) is able to read files >4GB, (b) causes less memory bloat, and (c) is more than a factor of 10 faster.

        file size          old   ns_fseekchars   factor
           65,517        4,471             151    29.61
          124,523        1,139              94    12.12
       74,006,378      682,375          54,752    12.46
    2,104,408,064   18,916,496       1,564,472    12.09
    3,992,977,408   35,942,768       3,061,061    11.74
    5,368,709,120            -       3,817,896        -

The problem with the old file-based parser was that it searched for boundary strings using the Tcl "gets" command:

    if { [string match $boundary* [string trim [gets $fp]]] } { ... }

Since "gets" reads a line (i.e., all characters up to the next newline), Tcl can crash when the next newline is more than 4GB away. Even with e.g. 2GB, it temporarily creates a Tcl_Obj with 2GB of content, which is then trimmed, etc., so that multiple huge Tcl_Objs may be kept in memory at once, leading to potential memory bloat.

The new code avoids all this by performing the search for the boundary in C.

TODO: add documentation page
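
For readers who want to see why searching in C avoids the huge intermediate objects, here is a minimal sketch of a chunked boundary search. It is not the actual ns_fseekchars implementation; the function name find_boundary, the CHUNK_SIZE constant and the return convention are assumptions made for illustration only. The idea is to read fixed-size chunks and carry the last (boundary length - 1) bytes over to the next chunk, so that a boundary straddling two chunks is still found and memory use stays constant regardless of where the next newline is.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 65536   /* hypothetical buffer size, not taken from NaviServer */

/*
 * Hypothetical helper: scan fp from its current position for the first
 * occurrence of boundary (blen bytes). Returns the byte offset relative
 * to the starting position, or -1 if the boundary is not found.
 */
int64_t find_boundary(FILE *fp, const char *boundary, size_t blen)
{
    char    buf[CHUNK_SIZE];
    size_t  carried = 0;   /* bytes kept from the end of the previous chunk */
    int64_t base = 0;      /* offset of buf[0] relative to the start position */

    assert(blen > 0 && blen < CHUNK_SIZE);

    for (;;) {
        size_t got   = fread(buf + carried, 1, CHUNK_SIZE - carried, fp);
        size_t avail = carried + got;

        /* Search every position where a full boundary still fits. */
        for (size_t i = 0; i + blen <= avail; i++) {
            if (buf[i] == boundary[0] && memcmp(buf + i, boundary, blen) == 0) {
                return base + (int64_t)i;
            }
        }
        if (got == 0) {
            return -1;     /* end of file (or read error): no match */
        }

        /* Keep the tail so a boundary split across chunks is not missed. */
        size_t keep = (avail < blen - 1) ? avail : blen - 1;
        memmove(buf, buf + avail - keep, keep);
        base   += (int64_t)(avail - keep);
        carried = keep;
    }
}
```

A caller would open the spooled upload file, call find_boundary(fp, boundary, strlen(boundary)) and use the returned offset to delimit the part. The offset is tracked in a 64-bit integer, so positions beyond 4GB are representable, and at no point does more than one CHUNK_SIZE buffer live in memory.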