|
From: Gustaf N. <ne...@wu...> - 2022-05-18 20:01:13
|
Dear David,
I've committed the option "-fallbackencodings" for the commands
"ns_getform" and "ns_parsequery". The implementation covers
"ns_getform", where the data is provided as
"application/x-www-form-urlencoded" either when parsing from memory or
from the spool file. The "multipart/form-data" implementation (also
separate for memory and spoolfile) is not yet covered.
We can also consider a global parameter for the configuration file
(e.g. FormFallbackEncodings). Probably, we should use the term "charset"
instead of "encoding", since "charset" is the MIME term, also used for
e.g. "URLCharset", while "encoding" is the Tcl name.
Although the names might still change, you might test whether this works
for your test cases.
-gn
On 16.05.22 16:16, David Osborne wrote:
> Hi Gustaf,
>
> I spotted that *ns_getform *takes a charset argument from looking at
> the source code.
> The options for overriding charsets at the moment seem to be:
>
> *ns_getform iso8859-1*
>
> *ns_urlcharset iso8859-1*
> *ns_getform*
>
> *ns_conn urlencoding iso8859-1*
> *ns_getform*
>
> We experimented with some code which tried to trap errors from
> *ns_getform*, and where the error was due to "invalid UTF-8", try a
> fallback charset.
> All 3 of the above techniques worked OK when the Content-Type header
> leaves the charset /unspecified/.
>
> The main issues we had were:
>
> 1. When a *charset=utf-8* is present in the *Content-Type* header,
> this overrides ([1]) any encoding we pass using the 3 techniques
> above.
> In those cases we have to manipulate the headers' ns_set to remove or
> change the charset.
> eg.
> *Content-Type: application/x-www-form-urlencoded; charset=utf-8*
> transform to ->
> *Content-Type: application/x-www-form-urlencoded*
> or
> *Content-Type: application/x-www-form-urlencoded; charset=windows-1252*
>
> 2. Trapping the specific "invalid UTF-8" error - this method seems
> fragile - it would be nice if there was an *errorCode* we could trap.
> *::try {*
> * ns_getform*
> *} on error {msg options} {*
> * if { [string match "*contains invalid UTF-8" $msg] } {*
> * # change Content_type charset (if present)*
> * # try fallback charset*
> * } else {*
> * # rethrow error*
> * }*
> *}*
>
> But I think this presents us with a way forward in cases where client
> apps are not getting the encoding correct.
>
> [1]
> https://bitbucket.org/naviserver/naviserver/annotate/master/nsd/form.c?at=master#form.c-170
>
>
> _______________________________________________
> naviserver-devel mailing list
> nav...@li...
> https://lists.sourceforge.net/lists/listinfo/naviserver-devel
--
Univ.Prof. Dr. Gustaf Neumann
Head of the Institute of Information Systems and New Media
of Vienna University of Economics and Business
Program Director of MSc "Information Systems"
|
|
From: David O. <da...@qc...> - 2022-05-16 14:16:53
|
Hi Gustaf,
I spotted that *ns_getform *takes a charset argument from looking at the
source code.
The options for overriding charsets at the moment seem to be:
*ns_getform iso8859-1*

*ns_urlcharset iso8859-1*
*ns_getform*

*ns_conn urlencoding iso8859-1*
*ns_getform*

We experimented with some code which tried to trap errors from *ns_getform*,
and where the error was due to "invalid UTF-8", try a fallback charset.
All 3 of the above techniques worked OK when the Content-Type header leaves
the charset *unspecified*.
The main issues we had were:
1. When a *charset=utf-8* is present in the *Content-Type* header, this
overrides ([1]) any encoding we pass using the 3 techniques above.
In those cases we have to manipulate the headers' ns_set to remove or
change the charset.
eg.
*Content-Type: application/x-www-form-urlencoded; charset=utf-8*
transform to ->
*Content-Type: application/x-www-form-urlencoded*
or
*Content-Type: application/x-www-form-urlencoded; charset=windows-1252*
2. Trapping the specific "invalid UTF-8" error - this method seems fragile
- it would be nice if there was an *errorCode* we could trap.
*::try {*
* ns_getform*
*} on error {msg options} {*
* if { [string match "*contains invalid UTF-8" $msg] } {*
* # change Content_type charset (if present)*
* # try fallback charset*
* } else {*
* # rethrow error*
* }*
*}*
But I think this presents us with a way forward in cases where client apps
are not getting the encoding correct.
[1]
https://bitbucket.org/naviserver/naviserver/annotate/master/nsd/form.c?at=master#form.c-170
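The "try UTF-8 first, then a fallback charset" idea above does not depend on NaviServer. Here is a minimal sketch in Python, the language the thread already uses for comparison. The function name `decode_with_fallback` and the default charset list are hypothetical, and this is not how "ns_getform" is implemented:

```python
from urllib.parse import unquote_to_bytes


def decode_with_fallback(raw, charsets=("utf-8", "windows-1252")):
    """Percent-decode a form value and decode it with the first charset that fits.

    Returns a (text, charset) pair so the caller can log which charset was used.
    """
    # In application/x-www-form-urlencoded data, "+" stands for a space.
    data = unquote_to_bytes(raw.replace("+", " "))
    for charset in charsets:
        try:
            # bytes.decode() is strict by default and raises on invalid input.
            return data.decode(charset), charset
        except UnicodeDecodeError:
            continue
    raise ValueError("value is not valid in any of: " + ", ".join(charsets))


if __name__ == "__main__":
    print(decode_with_fallback("a%C5%93b"))  # valid UTF-8
    print(decode_with_fallback("a%9Cb"))     # only decodable with the fallback
```

The order of the charsets matters: a single-byte charset such as ISO 8859-1 accepts every byte sequence, so it only makes sense as the last entry.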
|
|
From: Gustaf N. <ne...@wu...> - 2022-05-14 07:59:29
|
Hi Dave,
Maybe I will find time slots before the release for easing this process (e.g. by providing a flag for specifying a charset for "ns_getform" in case it fails; "ns_urldecode" already has a "-charset" flag), but I have not checked the details of how complex this is.
all the best
-g
On 13.05.22 10:32, David Osborne wrote:
> Thanks Gustaf,
>
> I didn't pick up that your latest commit makes it possible to catch
> and handle an encoding error now.
> Thanks - we'll try to address the issue that way.
> Regards,
> Dave
>
> On Thu, 12 May 2022 at 12:27, Gustaf Neumann <ne...@wu...> wrote:
>
> Dear David,
>
> NaviServer is less strict than the W3C-document, since it does not
> send automatically an error back.
> Such invalid characters can show up during decode operations of
> ns_urldecode and ns_getform.
> So, a custom application can catch exceptions and try alternative
> encodings if necessary.
>
> Since there is currently a large refactoring concerning Unicode
> handling going on for
> the Tcl community (with potentially different handling in Tcl 8.6,
> 8.7 and 9.0, ... hopefully
> there will be full support for Unicode already in Tcl 8.7, the
> voting is happening right now)
> it is not a good idea to come up with a special handling by
> NaviServer. These byte sequences
> have to be processed sooner or later by Tcl in various versions...
>
> I do not think it is a good idea to swallow incorrect incoming
> data by transforming this
> on the fly, this will cause sooner or later user concerns (e.g.
> "why is this funny character
> in the user name", ...) When the legacy application sends e.g.
> iso8859 encoded data, then it
> should set the appropriate charset, and it will be properly
> converted by NaviServer.
>
> If for whatever reason this is not feasible to get a proper
> charset, then the NaviServer
> approach allows to make a second attempt of decoding the data with
> a different charset.
>
> all the best
>
> -gn
>
>
>
> _______________________________________________
> naviserver-devel mailing list
> nav...@li...
> https://lists.sourceforge.net/lists/listinfo/naviserver-devel
--
Univ.Prof. Dr. Gustaf Neumann
Head of the Institute of Information Systems and New Media
of Vienna University of Economics and Business
Program Director of MSc "Information Systems"
|
|
From: David O. <da...@qc...> - 2022-05-13 08:33:08
|
Thanks Gustaf,
I didn't pick up that your latest commit makes it possible to catch and handle an encoding error now.
Thanks - we'll try to address the issue that way.
Regards,
Dave
On Thu, 12 May 2022 at 12:27, Gustaf Neumann <ne...@wu...> wrote:
> Dear David,
>
> NaviServer is less strict than the W3C-document, since it does not send
> automatically an error back.
> Such invalid characters can show up during decode operations of
> ns_urldecode and ns_getform.
> So, a custom application can catch exceptions and try alternative
> encodings if necessary.
>
> Since there is currently a large refactoring concerning Unicode handling
> going on for
> the Tcl community (with potentially different handling in Tcl 8.6, 8.7 and
> 9.0, ... hopefully
> there will be full support for Unicode already in Tcl 8.7, the voting is
> happening right now)
> it is not a good idea to come up with a special handling by NaviServer.
> These byte sequences
> have to be processed sooner or later by Tcl in various versions...
>
> I do not think it is a good idea to swallow incorrect incoming data by
> transforming this
> on the fly, this will cause sooner or later user concerns (e.g. "why is
> this funny character
> in the user name", ...) When the legacy application sends e.g. iso8859
> encoded data, then it
> should set the appropriate charset, and it will be properly converted by
> NaviServer.
>
> If for whatever reason this is not feasible to get a proper charset, then
> the NaviServer
> approach allows to make a second attempt of decoding the data with a
> different charset.
>
> all the best
>
> -gn
>
|
|
From: Gustaf N. <ne...@wu...> - 2022-05-12 11:27:03
|
Dear David,
NaviServer is less strict than the W3C document, since it does not
automatically send an error back.
Such invalid characters can show up during decode operations of
ns_urldecode and ns_getform.
So, a custom application can catch exceptions and try alternative
encodings if necessary.
Since there is currently a large refactoring concerning Unicode handling
going on in the Tcl community (with potentially different handling in
Tcl 8.6, 8.7 and 9.0; hopefully there will be full support for Unicode
already in Tcl 8.7, the voting is happening right now), it is not a good
idea to come up with a special handling by NaviServer. These byte
sequences have to be processed sooner or later by Tcl in various
versions...
I do not think it is a good idea to swallow incorrect incoming data by
transforming it on the fly; sooner or later this will cause user
concerns (e.g. "why is this funny character in the user name", ...).
When the legacy application sends e.g. iso8859 encoded data, then it
should set the appropriate charset, and it will be properly converted by
NaviServer.
If for whatever reason it is not feasible to get a proper charset, then
the NaviServer approach allows making a second attempt at decoding the
data with a different charset.
all the best
-gn
On 12.05.22 11:05, David Osborne wrote:
>
> Thanks again Gustaf,
>
> I can see the W3C spec you reference seems quite unequivocal in saying
> an error message should be sent back when decoding invalid UTF-8 form
> data.
>
> But I was curious why other implementations appear to use the UTF-8
> replacement character (U+FFFD) instead, and found a bit of discussion
> in the unicode standard itself [1] & [2].
>
> [1] specifically refers to the WHATWG(W3C) spec for encoding/decoding
> [3] which defines an "error" condition when decoding UTF-8 as being
> one of two possible error modes:
> Namely:
>
> * fatal - "return the error"
> * replacement - "Push U+FFFD (�) to output."
>
> This aligns with the behaviour of, say, Python's bytes.decode() where
> the default is to raise an error for encoding errors ("strict" error
> handling), but optionally, you can specify "replace" error handling
> which will utilise the U+FFFD character instead. I can see this
> working in cases where we're told the data should be UTF-8, or where
> we're assuming by default it's UTF-8.
>
> But I'm not sure how much work this would be to implement and whether
> it is seen as worthwhile to others?
>
> As it stands, we have legacy applications which POSTs data to us which
> regularly (and, by now, expectedly) sends invalid characters despite
> best efforts to fix it.
> I guess we would redirect the POSTs to another non-naviserver system,
> sanitise the data there, then send it on to NaviServer, but it would
> be nice to be able to deal with it within NaviServer itself.
>
> [1] https://www.unicode.org/versions/Unicode14.0.0/ch03.pdf (Section
> 3.9 "U+FFFD Substitution of Maximal Subparts")
> [2] https://www.unicode.org/versions/Unicode14.0.0/ch05.pdf (Section
> 5.22 "U+FFFD Substitution in Conversion")
> [3] https://encoding.spec.whatwg.org/#decoder
> [4] https://docs.python.org/3/library/stdtypes.html#bytes.decode
>
>
> On Mon, 2 May 2022 at 13:30, Gustaf Neumann <ne...@wu...> wrote:
>
> Dear David and all,
>
> I looked into this issue, and I do not like the current situation
> either.
> In the current snapshot, a GET request with invalid coded
> query variables is rejected, while the POST request leads just
> to the warning, and the invalid entry is omitted.
>
> W3C [1] says in the reference for Multilingual form encoding:
> > If non-UTF-8 data is received, an error message should be sent back.
>
> This means, that the only defensible logic is to reject in both cases
> the request as invalid. One can certainly send single-byte funny
> character
> data in URLs, which is invalid UTF8 (e.g. "%9C" or "%E6" etc.),
> but for these requests, the charset has to be specified, either
> via content type, or via the default URL encoding in the NaviServer
> configuration... see example (2) below.
>
> As mentioned earlier, there are increasingly many attacks with invalid
> UTF-8 data (also by vulnerability scanners), so we have to be strict here.
>
> I will try to address the outstanding issues ASAP and provide then
> another RC.
>
> All the best
>
> -gn
>
> [1] https://www.w3.org/International/questions/qa-forms-utf-8
>
>
> # POST request with already encoded form data (x-www-form-urlencoded)
> $ curl -X POST -d "p1=a%C5%93Cb&p2=a%E6b" localhost:8100/upload.tcl
>
> # POST request with already encoded form data, but proper encoding
> $ curl -X POST -H "Content-Type: application/x-www-form-urlencoded; charset=iso-8859-1" -d "p2=a%E6b" localhost:8100/upload.tcl
>
> # POST + x-www-form-urlencoded, but let curl do the encoding
> $ curl -X POST -d "p1=aœb" -d $(echo -e 'p2=a\xE6b') localhost:8100/upload.tcl
>
> # POST + multipart/form-data, let curl do the encoding
> $ curl -X POST -F "p1=aœb" -F $(echo -e 'p2=a\xE6b') localhost:8100/upload.tcl
>
> # GET request with already encoded query variables
> $ curl -X GET "localhost:8100/upload.tcl?p1=a%C5%93Cb&p2=a%E6b"
>
>
>
>
> _______________________________________________
> naviserver-devel mailing list
> nav...@li...
> https://lists.sourceforge.net/lists/listinfo/naviserver-devel
--
Univ.Prof. Dr. Gustaf Neumann
Head of the Institute of Information Systems and New Media
of Vienna University of Economics and Business
Program Director of MSc "Information Systems"
|
|
From: David O. <da...@qc...> - 2022-05-12 09:05:22
|
Thanks again Gustaf,
I can see the W3C spec you reference seems quite unequivocal in saying an
error message should be sent back when decoding invalid UTF-8 form data.
But I was curious why other implementations appear to use the Unicode
replacement character (U+FFFD) instead, and found a bit of discussion in
the Unicode standard itself [1] & [2].
[1] specifically refers to the WHATWG(W3C) spec for encoding/decoding [3]
which defines an "error" condition when decoding UTF-8 as being one of two
possible error modes:
Namely:
- fatal - "return the error"
- replacement - "Push U+FFFD (�) to output."
This aligns with the behaviour of, say, Python's bytes.decode() [4], where the
default is to raise an error for encoding errors ("strict" error handling),
but optionally, you can specify "replace" error handling which will utilise
the U+FFFD character instead. I can see this working in cases where we're
told the data should be UTF-8, or where we're assuming by default it's
UTF-8.
But I'm not sure how much work this would be to implement and whether it is
seen as worthwhile to others?
As it stands, we have legacy applications which POST data to us and which
regularly (and, by now, expectedly) send invalid characters despite best
efforts to fix it.
I guess we would redirect the POSTs to another non-naviserver system,
sanitise the data there, then send it on to NaviServer, but it would be
nice to be able to deal with it within NaviServer itself.
[1] https://www.unicode.org/versions/Unicode14.0.0/ch03.pdf (Section 3.9
"U+FFFD Substitution of Maximal Subparts")
[2] https://www.unicode.org/versions/Unicode14.0.0/ch05.pdf (Section 5.22
"U+FFFD Substitution in Conversion")
[3] https://encoding.spec.whatwg.org/#decoder
[4] https://docs.python.org/3/library/stdtypes.html#bytes.decode
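The two error modes mentioned above ("fatal" and "replacement") can be seen directly with Python's bytes.decode(). This is a short illustrative sketch and the helper names `decode_fatal` and `decode_replacement` are hypothetical:

```python
def decode_fatal(raw):
    """'fatal' mode: return the text, or None when the bytes are not valid UTF-8."""
    try:
        # errors="strict" is the default of bytes.decode().
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


def decode_replacement(raw):
    """'replacement' mode: substitute U+FFFD for each undecodable sequence."""
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    sample = b"a\x9cb"  # 0x9C on its own is not valid UTF-8
    print(decode_fatal(sample))
    print(decode_replacement(sample))
```

With "replace", the decoded value no longer tells the application which byte was received, so a later second attempt with another charset has to start again from the raw bytes.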
On Mon, 2 May 2022 at 13:30, Gustaf Neumann <ne...@wu...> wrote:
> Dear David and all,
>
> I looked into this issue, and I do not like the current situation either.
> In the current snapshot, a GET request with invalid coded
> query variables is rejected, while the POST request leads just
> to the warning, and the invalid entry is omitted.
>
> W3C [1] says in the reference for Multilingual form encoding:
> > If non-UTF-8 data is received, an error message should be sent back.
>
> This means, that the only defensible logic is to reject in both cases
> the request as invalid. One can certainly send single-byte funny character
> data in URLs, which is invalid UTF8 (e.g. "%9C" or "%E6" etc.),
> but for these requests, the charset has to be specified, either
> via content type, or via the default URL encoding in the NaviServer
> configuration... see example (2) below.
>
> As mentioned earlier, there are increasingly many attacks with invalid
> UTF-8 data (also by vulnerability scanners), so we have to be strict here.
>
> I will try to address the outstanding issues ASAP and provide then
> another RC.
>
> All the best
>
> -gn
>
> [1] https://www.w3.org/International/questions/qa-forms-utf-8
>
>
> # POST request with already encoded form data (x-www-form-urlencoded)
> $ curl -X POST -d "p1=a%C5%93Cb&p2=a%E6b" localhost:8100/upload.tcl
>
> # POST request with already encoded form data, but proper encoding
> $ curl -X POST -H "Content-Type: application/x-www-form-urlencoded; charset=iso-8859-1" -d "p2=a%E6b" localhost:8100/upload.tcl
>
> # POST + x-www-form-urlencoded, but let curl do the encoding
> $ curl -X POST -d "p1=aœb" -d $(echo -e 'p2=a\xE6b') localhost:8100/upload.tcl
>
> # POST + multipart/form-data, let curl do the encoding
> $ curl -X POST -F "p1=aœb" -F $(echo -e 'p2=a\xE6b') localhost:8100/upload.tcl
>
> # GET request with already encoded query variables
> $ curl -X GET "localhost:8100/upload.tcl?p1=a%C5%93Cb&p2=a%E6b"
>
>
>
|
|
From: Gustaf N. <ne...@wu...> - 2022-05-03 12:39:17
|
Dear all,
I have committed a change to achieve a more consistent and compliant behavior.
Since all form and query processing of NaviServer happens via API (ns_urldecode, ns_getform), the current architecture does not allow direct error messages. The NaviServer philosophy is that the (Tcl) developer should have the option to handle such cases in an application-specific way.
We recently made changes to address this (mostly driven by vulnerability scanners) by letting e.g. "ns_urldecode" raise an exception when this happens. This change completes this by also raising an exception for "ns_getform" in such conditions. Note that raising an exception might be a potential incompatibility for invalid data (which was "swallowed" before). The regression test was extended to handle such cases.
There is one more thing (in ns_connchan, so far not reproducible) that I would like to have a look at before making the next release candidate available.
all the best
-gn
On 02.05.22 14:29, Gustaf Neumann wrote:
>
> Dear David and all,
>
> I looked into this issue, and I do not like the current situation either.
> In the current snapshot, a GET request with invalid coded
> query variables is rejected, while the POST request leads just
> to the warning, and the invalid entry is omitted.
>
> W3C [1] says in the reference for Multilingual form encoding:
> > If non-UTF-8 data is received, an error message should be sent back.
>
> This means, that the only defensible logic is to reject in both cases
> the request as invalid. One can certainly send single-byte funny character
> data in URLs, which is invalid UTF8 (e.g. "%9C" or "%E6" etc.),
> but for these requests, the charset has to be specified, either
> via content type, or via the default URL encoding in the NaviServer
> configuration... see example (2) below.
>
> As mentioned earlier, there are increasingly many attacks with invalid
> UTF-8 data (also by vulnerability scanners), so we have to be strict here.
>
> I will try to address the outstanding issues ASAP and provide then
> another RC.
>
> All the best
>
> -gn
>
> [1] https://www.w3.org/International/questions/qa-forms-utf-8
>
>
> # POST request with already encoded form data (x-www-form-urlencoded)
> $ curl -X POST -d "p1=a%C5%93Cb&p2=a%E6b" localhost:8100/upload.tcl
>
> # POST request with already encoded form data, but proper encoding
> $ curl -X POST -H "Content-Type: application/x-www-form-urlencoded; charset=iso-8859-1" -d "p2=a%E6b" localhost:8100/upload.tcl
>
> # POST + x-www-form-urlencoded, but let curl do the encoding
> $ curl -X POST -d "p1=aœb" -d $(echo -e 'p2=a\xE6b') localhost:8100/upload.tcl
>
> # POST + multipart/form-data, let curl do the encoding
> $ curl -X POST -F "p1=aœb" -F $(echo -e 'p2=a\xE6b') localhost:8100/upload.tcl
>
> # GET request with already encoded query variables
> $ curl -X GET "localhost:8100/upload.tcl?p1=a%C5%93Cb&p2=a%E6b"
>
> On 28.04.22 17:45, David Osborne wrote:
>> Hi Gustaf,
>>
>> We've been testing *4.99.24 rc1* and it seems pretty solid so far.
>> Thanks for all the work that went into it.
>>
>> One change of behaviour that is causing us issues is the handling of
>> invalid UTF8 characters.
>>
>> We have a system which regularly POSTs data to NaviServer - sometimes
>> (for reasons we're looking into) the POST'ed data received by
>> NaviServer can contain urlencoded characters which don't exist in
>> UTF8 ( for example *%9C* instead of *%C5%93*).
>>
>> In previous versions of NaviServer, this causes an invalid character
>> to be embedded in the data when we save it.
>>
>> Now, in version 4.99.24 we, rightly, get the warning "*Warning:
>> decoded string is invalid UTF-8:*".
>> But the additional behaviour is that the entire form variable seems
>> to be dropped.
>>
>> I just wanted to query if that is the intended behaviour?
>>
>> I've seen some servers convert such invalid characters to *\ufffd*
>> (\ufffd being "replacement character" - "used to replace an incoming
>> character whose value is unknown or unrepresentable in Unicode") -
>> but not sure which is the correct behaviour.
>>
>> Regards,
>> Dave
>>
>>
>>
>>
>>
>> _______________________________________________
>> naviserver-devel mailing list
>> nav...@li...
>> https://lists.sourceforge.net/lists/listinfo/naviserver-devel
> --
> Univ.Prof. Dr. Gustaf Neumann
> Head of the Institute of Information Systems and New Media
> of Vienna University of Economics and Business
> Program Director of MSc "Information Systems"
>
>
> _______________________________________________
> naviserver-devel mailing list
> nav...@li...
> https://lists.sourceforge.net/lists/listinfo/naviserver-devel
--
Univ.Prof. Dr. Gustaf Neumann
Head of the Institute of Information Systems and New Media
of Vienna University of Economics and Business
Program Director of MSc "Information Systems"
|
|
From: Gustaf N. <ne...@wu...> - 2022-05-02 12:29:29
|
Dear David and all,
I looked into this issue, and I do not like the current situation either.
In the current snapshot, a GET request with invalidly encoded query variables is rejected, while the POST request leads just to a warning, and the invalid entry is omitted.
W3C [1] says in the reference for Multilingual form encoding:
> If non-UTF-8 data is received, an error message should be sent back.
This means that the only defensible logic is to reject the request as invalid in both cases. One can certainly send single-byte funny character data in URLs which is invalid UTF-8 (e.g. "%9C" or "%E6" etc.), but for these requests, the charset has to be specified, either via content type, or via the default URL encoding in the NaviServer configuration... see example (2) below.
As mentioned earlier, there are increasingly many attacks with invalid UTF-8 data (also by vulnerability scanners), so we have to be strict here.
I will try to address the outstanding issues ASAP and then provide another RC.
All the best
-gn
[1] https://www.w3.org/International/questions/qa-forms-utf-8
# POST request with already encoded form data (x-www-form-urlencoded)
$ curl -X POST -d "p1=a%C5%93Cb&p2=a%E6b" localhost:8100/upload.tcl
# POST request with already encoded form data, but proper encoding
$ curl -X POST -H "Content-Type: application/x-www-form-urlencoded; charset=iso-8859-1" -d "p2=a%E6b" localhost:8100/upload.tcl
# POST + x-www-form-urlencoded, but let curl do the encoding
$ curl -X POST -d "p1=aœb" -d $(echo -e 'p2=a\xE6b') localhost:8100/upload.tcl
# POST + multipart/form-data, let curl do the encoding
$ curl -X POST -F "p1=aœb" -F $(echo -e 'p2=a\xE6b') localhost:8100/upload.tcl
# GET request with already encoded query variables
$ curl -X GET "localhost:8100/upload.tcl?p1=a%C5%93Cb&p2=a%E6b"
On 28.04.22 17:45, David Osborne wrote:
> Hi Gustaf,
>
> We've been testing *4.99.24 rc1* and it seems pretty solid so far.
> Thanks for all the work that went into it.
>
> One change of behaviour that is causing us issues is the handling of
> invalid UTF8 characters.
>
> We have a system which regularly POSTs data to NaviServer - sometimes
> (for reasons we're looking into) the POST'ed data received by
> NaviServer can contain urlencoded characters which don't exist in UTF8
> ( for example *%9C* instead of *%C5%93*).
>
> In previous versions of NaviServer, this causes an invalid character
> to be embedded in the data when we save it.
>
> Now, in version 4.99.24 we, rightly, get the warning "*Warning:
> decoded string is invalid UTF-8:*".
> But the additional behaviour is that the entire form variable seems to
> be dropped.
>
> I just wanted to query if that is the intended behaviour?
>
> I've seen some servers convert such invalid characters to *\ufffd*
> (\ufffd being "replacement character" - "used to replace an incoming
> character whose value is unknown or unrepresentable in Unicode") - but
> not sure which is the correct behaviour.
>
> Regards,
> Dave
>
>
>
>
>
> _______________________________________________
> naviserver-devel mailing list
> nav...@li...
> https://lists.sourceforge.net/lists/listinfo/naviserver-devel
--
Univ.Prof. Dr. Gustaf Neumann
Head of the Institute of Information Systems and New Media
of Vienna University of Economics and Business
Program Director of MSc "Information Systems"
|
|
From: David O. <da...@qc...> - 2022-04-28 15:46:13
|
Hi Gustaf,
We've been testing *4.99.24 rc1* and it seems pretty solid so far.
Thanks for all the work that went into it.
One change of behaviour that is causing us issues is the handling of invalid UTF8 characters.
We have a system which regularly POSTs data to NaviServer - sometimes (for reasons we're looking into) the POST'ed data received by NaviServer can contain urlencoded characters which don't exist in UTF8 (for example *%9C* instead of *%C5%93*).
In previous versions of NaviServer, this causes an invalid character to be embedded in the data when we save it.
Now, in version 4.99.24 we, rightly, get the warning "*Warning: decoded string is invalid UTF-8:*".
But the additional behaviour is that the entire form variable seems to be dropped.
I just wanted to query if that is the intended behaviour?
I've seen some servers convert such invalid characters to *\ufffd* (\ufffd being "replacement character" - "used to replace an incoming character whose value is unknown or unrepresentable in Unicode") - but not sure which is the correct behaviour.
Regards,
Dave
|
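To make the *%9C* versus *%C5%93* example concrete, here is a small Python sketch showing where the two percent-encoded forms of "œ" could come from. That the sending system uses windows-1252 is an assumption made purely for illustration; the thread does not say which charset the client actually uses. The helper name `is_valid_utf8` is hypothetical:

```python
from urllib.parse import quote, unquote_to_bytes

OE = "\u0153"  # the character "œ"

# The same character percent-encoded from two different charsets.
utf8_form = quote(OE, encoding="utf-8")
cp1252_form = quote(OE, encoding="windows-1252")


def is_valid_utf8(percent_encoded):
    """Return True when the percent-decoded bytes form valid UTF-8."""
    try:
        unquote_to_bytes(percent_encoded).decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(utf8_form, is_valid_utf8(utf8_form))
    print(cp1252_form, is_valid_utf8(cp1252_form))
```

The byte 0x9C is a continuation byte in UTF-8 and cannot stand on its own, which is why a strict decoder rejects it.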
|
From: David O. <da...@qc...> - 2022-04-26 11:12:18
|
Thanks Gustaf - we've successfully built 4.99.24 rc1 and are in the process of testing it.
Much appreciated!
On Sun, 10 Apr 2022 at 13:04, Gustaf Neumann <ne...@wu...> wrote:
> There are now two changes committed to bitbucket:
>
> a) Provide an error message when the configured locale is not installed
> on the host (misconfiguration)
>
> This change causes NaviServer to abort, when the configured locale is
> not installed on the host. Typically, this locale is e.g. used by
> ns_strcoll for determining the default collating order. The
> configuration file for the regression testing sets the environment
> variable LANG to "en_US.UTF-8". This means that for running the stock
> regression test, this locale must be installed on the OS level.
>
> b) Silence warning with recent versions of gcc when certain values of
> _FORTIFY_SOURCE/-Wstringop-overflow are set
>
> Newer versions of gcc support warning of dangerous operations (such as
> e.g. strncat) when these depend on not easy traceable sources. In the
> fixed case, the warning was:
>
> warning: ‘__builtin_strncat’ specified bound depends on
> the length of the source argument
>
> With FORTIFY_SOURCE whenever possible, GCC tries to use buffer-length
> aware replacements for functions, which was not possible in the case in
> question. The documentation says that with _FORTIFY_SOURCE set to
> 2, some more checking is added, but some conforming programs might fail.
>
> The case for (b) was a false positive, but it is still better to silence
> these rather than ignoring it.
>
> all the best
>
> -g
>
>
>
>
|
|
From: Gustaf N. <ne...@wu...> - 2022-04-15 18:26:19
|
Dear all, on sourceforge is a release candidate for NaviServer 4.99.24 [1]. Please test if possible. The release should be in the near future. Below is a preliminary summary of changes. All the best, and have a nice easter weekend! -g [1] https://sourceforge.net/projects/naviserver/files/naviserver/4.99.24/ ======================================= NaviServer 4.99.24, released 2022-04-XX ======================================= 57 files changed, 1840 insertions(+), 824 deletions(-) New Features: - Improved security * Added protection against certain attacks in ns_dbquotevalue Due to the corrected conversion to external UTF-8 in db-output, new potential attack vectors appeared that were protected earlier via the Tcl-internal 'modified UTF-8'. E.g., the binary null character is stored as an overlong (two-byte) encoding of null (0xc0 0x80), so that an actual (embedded) null byte (0x00) never appears in the string. Due to the conversion, the internal representation is translated back to the binary null character. Embedded null byte characters can lead to non-terminated string literals via ns_dbquotevalue. In the updated version of NaviServer, ns_dbquotevalue raises an exception when this occurs. Therefore, the function can be used as well as an input checker (together with "try"). * Raise an exception when trying to use "ns_urldecode" to produce invalid UTF-8 Background: several (external) functions expect valid UTF-8 to be passed in and crash if this is not the case. One such example is tDOM. These nasty byte sequences are used more intensively by vulnerability scanners. Therefore, ns_urldecode raises now an exception, when it tries to convert to invalid UTF-8. It is still possible to use ns_urldecode to convert to other charsets. ns_urldecode -charset iso8859-1 -part path "/mot%C3or" When urldecode() is called internally and would produce invalid UTF-8, it truncates the string (and writes a warning to the system log). 
 - Provide a hint when cache-entry was too large for caching

   Background: the size of the entry is typically determined after
   the execution of a potentially expensive query. During the
   evaluation of the command, the cache entry is locked, which forces
   serialization. This means that in these cases the situation is
   worse than without a cache, where some queries can be executed in
   parallel.

   We faced an unexpected slowdown of the server with many "create
   entry collision" occurrences, where, for application reasons, an
   entry was becoming too large. This situation is not easy to debug,
   especially under stress. The log message would have helped a lot
   to identify the cause.

 - Added support for multibyte numeric entities

   This change supports conversion of numeric entities representing
   multibyte characters into HTML in "ns_striphtml" and
   "ns_unquotehtml". Technically, the numeric entities represent
   Unicode code points, which are transformed into their UTF-8
   serialization. Every entity represents a single code point; the
   values can be provided in decimal or hexadecimal notation. Before
   this change, only single-byte numeric entities were supported.
   ASCII control characters (decimal 0-31) are ignored as before.

 - New and extended commands:

   * ns_unquotehtml /text/

     This command is the inverse operation of "ns_quotehtml". It
     replaces the named and numeric entities in the provided string
     with the native values. The command is similar to
     "ns_striphtml", but "ns_striphtml" also removes other HTML
     markup, which might not be desired in all cases.

     This change also fixes a bug with numeric entities (the old code
     assumed that the number follows directly after the ampersand)
     and adds support for numeric entities with hexadecimal values
     (so far with the same value range as for decimal numeric
     entities).

   * ns_subnetmatch /subnet/ /ipaddr/

     Determine whether a provided IP address (IPv4 or IPv6) is
     included in a subnet specification, which is provided in CIDR
     notation.
     The command makes internal NaviServer functionality available at
     the Tcl level. The regression test was extended to cover the
     functionality.

     The command ns_subnetmatch validates the provided subnet
     specification (IPv4 or IPv6 address followed by a slash and the
     number of significant bits) and the provided IP address, and
     tests whether the IP address is in the implied range. The
     command returns a boolean value as the result. When comparing an
     IPv4 address with an IPv6 CIDR specification or vice versa, the
     result is always false.

     The function can be used e.g. to restrict access to certain
     functionality to some subnets. It can also be used to check
     whether an IP address is an IPv4 or IPv6 address.

     Examples:

        % ns_subnetmatch 137.208.0.0/16 137.208.116.31
        1
        % ns_subnetmatch 137.208.0.0/16 112.207.16.33
        0
        % ns_subnetmatch 2001:628:404:74::31/64 [ns_conn peeraddr]
        ...

        # Is IP address a valid IPv6 address?
        % ns_subnetmatch ::/0 $ip

        # Is IP address a valid IPv4 address?
        % ns_subnetmatch 0.0.0.0/0 $ip

   * ns_connchan: Added new subcommand "ns_connchan connect"

     "ns_connchan connect" is similar to "ns_connchan open", except
     that it does not send an HTTP request (HTTP method, URL, and
     header fields) but just opens the connection. It can be used for
     non-HTTP communication over TCP and TLS via the ns_connchan
     infrastructure.

   * ns_parseheader, Ns_ParseHeader(): return the field number
     (index) of the parsed entry

     Previously, there was no explicit feedback about which field of
     an "ns_set" had been parsed by "ns_parseheader". Now, on
     success, the function returns the index of the new/modified
     entry. This made it possible to generalize and simplify the
     Tcl-level parsing of "multipart/form-data" significantly.

     Additionally, a new optional argument "-prefix" was added. When
     specified, it adds the specified prefix to the key.

API changes:

 - ns_parsequery: added option "-charset" and raise exception on
   failure

   The new option "-charset" can be used to specify the charset for
   the result encoding of the passed-in HTTP query. In case the
   charset is UTF-8 (default on most platforms) and the content is
   invalid UTF-8, an exception is raised (similar to ns_urldecode).

 - Better Unicode support, including emojis requiring 4-byte UTF-8
   characters.

 - ns_trim enhancements: The new option "-prefix ..." can be used to
   strip a string (such as ">> ") from every line starting with it.

Performance Improvements:

Bug Fixes:

 - Improved robustness of "ns_parseurl" for handling query parameters
   and fragments for partial URLs
   * fix over-eager collecting of URL components in tail
   * extended regression test

 - Fixed Ns_ResetFileVec NOT to invalidate residual Ns_FileVec buffer
   (caused problems under Windows).

 - ns_striphtml: Fixed probably very old bug for markup immediately
   after an entity

   This bug fix handles cases where e.g. two entities are in a text
   right next to each other, as in the string "hello&lt;&gt;world".
   The old code correctly decoded the first entity, but output the
   second one literally.

 - Fixed a compilation problem under C++ that was introduced in
   4.99.23 with the change to avoid usage of reserved C identifiers.
   Many thanks to Brendan Graves for reporting the problem.

 - Added missing named entities "apos" and "quote". These have been
   missing for ages.

 - Provide an error message when the configured locale is not
   installed on the host.

   This change causes NaviServer to abort when the configured locale
   is not installed on the host. Typically, this locale is used e.g.
   by "ns_strcoll" for determining the default collating order.

   The configuration file for the regression testing sets the
   environment variable LANG to "en_US.UTF-8". This means that for
   running the stock regression test, this locale must be installed
   on the system.

   Before this change, NaviServer could crash at runtime when trying
   to access the default locale (as e.g. in "ns_strcoll").

Documentation improvements:
---------------------------

 - Improved the following man pages:
      doc/src/manual/admin-install.man
      doc/src/naviserver/ns_conn.man
      doc/src/naviserver/ns_connchan.man
      doc/src/naviserver/ns_crypto.man
      doc/src/naviserver/ns_http.man
      doc/src/naviserver/ns_httptime.man
      doc/src/naviserver/ns_log.man
      doc/src/naviserver/ns_parseheader.man
      doc/src/naviserver/ns_parsequery.man
      doc/src/naviserver/ns_parseurl.man
      doc/src/naviserver/ns_rlimit.man
      doc/src/naviserver/ns_subnetmatch.man
      doc/src/naviserver/ns_urldecode.man
      doc/src/naviserver/ns_urlencode.man
      doc/src/naviserver/textutil-cmds.man
      nsdb/doc/mann/ns_db.man

Configuration Changes:
----------------------

 - Updated OpenACS sample configuration file
   * reflect recent Oracle (tested with Oracle 19c)
   * added documentation for "StaticCSP", "CookieNamespace",
     "NsShutdownWithNonZeroExitCode", "LogIncludeUserId"

Code Changes:
-------------

 - Extended regression test

 - Improve Tcl version compatibility
   * Removed -DTCL_NO_DEPRECATED from default CFLAGS to cope with
     recent deprecation in Tcl 8.7a5

 - Code Cleanup
   * Do not declare reserved C identifiers
   * Improved type cleanness

 - Improved comments, fixed typos

 - Marked "ns_set_precision" as deprecated, since there is no reason
   not to set the Tcl variable ::tcl_precision directly.

 - Don't hard-wire port for https testing to 8443

   The setup code now looks for a free port for HTTPS connections
   starting with 8443, and remembers the free port in the
   configuration values "tls_listenport" and "tls_listenurl". This is
   now fully analogous to the setup of the plain HTTP testing
   (setting "listenport" and "listenurl").

 - Silence warning with recent versions of gcc when certain values of
   _FORTIFY_SOURCE/-Wstringop-overflow are set.

Changes in NaviServer Modules:
==============================
...
|
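
The containment test described for ns_subnetmatch in the release notes above can be illustrated with a small C sketch. This is not the NaviServer implementation: the function name subnet_match is made up for illustration, it relies on the POSIX call inet_pton(), and it simply returns false for malformed input, whereas the notes only say that the real command validates its arguments.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not NaviServer code): return true when "ipaddr"
 * lies inside "subnet", given in CIDR notation ("address/bits").
 * Malformed input and mixed IPv4/IPv6 arguments yield false.
 */
static bool subnet_match(const char *subnet, const char *ipaddr)
{
    assert(subnet != NULL && ipaddr != NULL);

    /* Split "address/bits" at the slash. */
    const char *slash = strchr(subnet, '/');
    if (slash == NULL) {
        return false;
    }
    size_t addrLen = (size_t)(slash - subnet);
    char   base[INET6_ADDRSTRLEN];
    if (addrLen == 0 || addrLen >= sizeof(base)) {
        return false;
    }
    memcpy(base, subnet, addrLen);
    base[addrLen] = '\0';

    /* Parse the number of significant bits. */
    char *end = NULL;
    long  bits = strtol(slash + 1, &end, 10);
    if (end == slash + 1 || *end != '\0' || bits < 0) {
        return false;
    }

    /* The subnet decides the address family; the IP must parse in it too. */
    unsigned char net[16], ip[16];
    int  family;
    long maxBits;
    if (inet_pton(AF_INET, base, net) == 1) {
        family = AF_INET;
        maxBits = 32;
    } else if (inet_pton(AF_INET6, base, net) == 1) {
        family = AF_INET6;
        maxBits = 128;
    } else {
        return false;
    }
    if (bits > maxBits || inet_pton(family, ipaddr, ip) != 1) {
        return false;
    }

    /* Compare the full bytes first, then the remaining high-order bits. */
    size_t fullBytes = (size_t)bits / 8;
    int    restBits = (int)(bits % 8);
    if (memcmp(net, ip, fullBytes) != 0) {
        return false;
    }
    if (restBits != 0) {
        unsigned char mask = (unsigned char)(0xFF << (8 - restBits));
        if ((net[fullBytes] & mask) != (ip[fullBytes] & mask)) {
            return false;
        }
    }
    return true;
}
```

With this sketch, subnet_match("137.208.0.0/16", "137.208.116.31") is true and subnet_match("137.208.0.0/16", "112.207.16.33") is false, matching the two documented examples. A prefix length of 0 ("::/0" or "0.0.0.0/0") accepts every address of the same family and nothing from the other one, which is why the notes suggest it as an IPv4/IPv6 check.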
|
From: Gustaf N. <ne...@wu...> - 2022-04-10 12:04:22
|
There are now two changes committed to Bitbucket:

a) Provide an error message when the configured locale is not
   installed on the host (misconfiguration)

   This change causes NaviServer to abort when the configured locale
   is not installed on the host. Typically, this locale is used e.g.
   by ns_strcoll for determining the default collating order.

   The configuration file for the regression testing sets the
   environment variable LANG to "en_US.UTF-8". This means that for
   running the stock regression test, this locale must be installed
   at the OS level.

b) Silence warning with recent versions of gcc when certain values of
   _FORTIFY_SOURCE/-Wstringop-overflow are set

   Newer versions of gcc can warn about potentially dangerous
   operations (such as strncat) when their arguments depend on values
   that are not easy to trace. In the fixed case, the warning was:

      warning: ‘__builtin_strncat’ specified bound depends on the
      length of the source argument

   With _FORTIFY_SOURCE, GCC tries whenever possible to use
   buffer-length-aware replacements for functions, which was not
   possible in the case in question. The documentation says that with
   _FORTIFY_SOURCE set to 2, some more checking is added, but some
   conforming programs might fail.

   The case for (b) was a false positive, but it is still better to
   silence such warnings than to ignore them.

all the best
-g
|
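
To make the warning in (b) more concrete, here is a small C sketch of the pattern gcc complains about and of one possible way to avoid it. It is an illustration only and not the change that was committed; the helper name append_bounded and its contract are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Pattern behind the warning: the bound is derived from the source, so
 * it says nothing about the space that is left in the destination.
 *
 *     strncat(dest, src, strlen(src));
 *
 * Hypothetical alternative (not the actual NaviServer fix): derive the
 * bound from the destination buffer and report whether everything fit.
 * "dest" must hold a NUL-terminated string within "destSize" bytes.
 */
static bool append_bounded(char *dest, size_t destSize, const char *src)
{
    assert(dest != NULL && src != NULL && destSize > 0);

    size_t used = strlen(dest);
    if (used >= destSize) {
        return false;                    /* dest is not properly terminated */
    }
    size_t room = destSize - used - 1;   /* space left, excluding the NUL */
    size_t need = strlen(src);

    if (need > room) {
        return false;                    /* would not fit: leave dest untouched */
    }
    memcpy(dest + used, src, need + 1);  /* copies the terminating NUL as well */
    return true;
}
```

The design choice here is to refuse the append instead of truncating silently, so the caller can decide how to handle an oversized value.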
|
From: Gustaf N. <ne...@wu...> - 2022-04-08 20:09:53
|
Dear David,

The problem with manual compilation on minimal Debian installations
is that the locale "en_US.UTF-8" is not installed there by default.
The configuration file for the test sets LC_COLLATE to this value so
that the collate test will work.
....

The problem with Debian is that when LC_COLLATE is set to any of the
predefined values, the crash will go away, but the test will fail
(leading to a different comparison result). So we either have to skip
the ns_strcoll tests if no proper locale is defined, or refuse to
start NaviServer if the locale is not installed.

Since other Unixes have the locale "en_US.UTF-8" installed by
default, it is probably best to require its installation during
startup of nsd. This can avoid surprises later.

See below for the installation of the missing locale under Debian;
with this set-up, the problem with strcoll_l() will disappear.

all the best
-gn

   $ locale -a
   C
   C.UTF-8
   POSIX
   $ sed -i 's/^# *\(en_US.UTF-8\)/\1/' /etc/locale.gen
   $ locale-gen
   $ locale -a
   C
   C.UTF-8
   POSIX
   en_US.utf8

On 07.04.22 13:17, David Osborne wrote:
> Thanks very much,
>
> For versions of Tcl less than 8.6.11 we're failing because it's a new
> test exposing an old problem, is that correct?
> This would explain why I don't see any test failures after
> building v4.99.22 with Tcl 8.6.9 - it's only because 4.99.23 has
> introduced the tests and not that 4.99.22 doesn't have the problem.
>
> To test Tcl8.6.11 the easiest way for me is to jump to bullseye
> (Debian v11) which provides 8.6.11+dfsg-1
>
> Strangely I seem to get the same ns_strcoll seg fault with that.
> But if I remove misc.test temporarily, all other tests pass happily.
> > # uname -a > Linux ip-172-0-1-190 5.10.0-12-cloud-amd64 #1 SMP Debian 5.10.103-1 > (2022-03-07) x86_64 GNU/Linux > > # cat /etc/debian_version > 11.3 > > # ls -l /lib/x86_64-linux-gnu/libc.so.6 > lrwxrwxrwx 1 root root 12 Mar 17 21:37 /lib/x86_64-linux-gnu/libc.so.6 > -> libc-2.31.so <http://libc-2.31.so> > > # apt-cache policy tcl8.6 > tcl8.6: > Installed: 8.6.11+dfsg-1 > > # git clone https://bitbucket.org/naviserver/naviserver.git > Cloning into 'naviserver'... > > # cd naviserver > # git checkout tags/naviserver-4.99.23 > Note: switching to 'tags/naviserver-4.99.23'. > > # ./autogen.sh --with-tcl=/usr/lib/tcl8.6 --enable-rpath > --enable-threads --enable-symbols > # make > > Compiler warning for reference: > > gcc -Wall -fPIC -g -O2 > -fdebug-prefix-map=/build/tcl8.6-qxVr7a/tcl8.6-8.6.11+dfsg=. > -fstack-protector-strong -Wformat -Werror=format-security > -fno-unit-at-a-time -pipe -Wdate-time -D_FORTIFY_SOURCE=2 -DNDEBUG > -DSYSTEM_MALLOC -DTCL_NO_DEPRECATED -std=c99 -I../include > -I"/usr/include/tcl8.6" -DHAVE_CONFIG_H -c -o tclenv.o tclenv.c > In file included from /usr/include/string.h:495, > from ../include/nsthread.h:378, > from ../include/ns.h:46, > from nsd.h:38, > from tclenv.c:37: > In function ‘strncat’, > inlined from ‘PutEnv’ at tclenv.c:349:13: > /usr/include/x86_64-linux-gnu/bits/string_fortified.h:136:10: warning: > ‘__builtin_strncat’ specified bound depends on the length of the > source argument [ > ]8;;https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wstringop-overflow=-Wstringop-overflow= > ]8;;] > 136 | return __builtin___strncat_chk (__dest, __src, __len, __bos > (__dest)); > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > tclenv.c: In function ‘PutEnv’: > tclenv.c:314:23: note: length computed here > 314 | valueLength = strlen(value) + 1; > | ^~~~~~~~~~~~~ > > # make memcheck TESTFLAGS="-verbose start -file misc.test" > ---- ns_random-1.1 start > ---- ns_fmttime-1.0 start > ---- ns_fmttime-1.1 start > 
---- ns_trim-0.0 start > ---- ns_trim-0.1 start > ---- ns_trim-0.2 start > ---- ns_trim-1.1 start > ---- ns_trim-1.2 start > ---- ns_trim-1.3 start > ---- ns_trim-1.4 start > ---- ns_trim-1.5 start > ---- ns_trim-2.1 start > ---- ns_trim-2.2 start > ---- ns_quotehtml start > ---- ns_strcoll-1.0.0 start > ==37899== Thread 2: > ==37899== Invalid read of size 8 > ==37899== at 0x49E1361: strcoll_l (strcoll_l.c:260) > ==37899== by 0x48DA9FF: NsTclStrcollObjCmd (tclmisc.c:2802) > ==37899== by 0x4BBC4A1: TclNRRunCallbacks (in > /usr/lib/x86_64-linux-gnu/libtcl8.6.so <http://libtcl8.6.so>) > ==37899== by 0x4BBD71F: ??? (in > /usr/lib/x86_64-linux-gnu/libtcl8.6.so <http://libtcl8.6.so>) > ==37899== by 0x4C794D8: Tcl_FSEvalFileEx (in > /usr/lib/x86_64-linux-gnu/libtcl8.6.so <http://libtcl8.6.so>) > ==37899== by 0x4C818AD: Tcl_MainEx (in > /usr/lib/x86_64-linux-gnu/libtcl8.6.so <http://libtcl8.6.so>) > ==37899== by 0x4B745AF: NsThreadMain (thread.c:232) > ==37899== by 0x4B75A48: ThreadMain (pthread.c:870) > ==37899== by 0x521BEA6: start_thread (pthread_create.c:477) > ==37899== by 0x4A4DDEE: clone (clone.S:95) > ==37899== Address 0x18 is not stack'd, malloc'd or (recently) free'd > ==37899== > ==37899== > ==37899== Process terminating with default action of signal 11 (SIGSEGV) > ==37899== Access not within mapped region at address 0x18 > ==37899== at 0x49E1361: strcoll_l (strcoll_l.c:260) > ==37899== by 0x48DA9FF: NsTclStrcollObjCmd (tclmisc.c:2802) > ==37899== by 0x4BBC4A1: TclNRRunCallbacks (in > /usr/lib/x86_64-linux-gnu/libtcl8.6.so <http://libtcl8.6.so>) > ==37899== by 0x4BBD71F: ??? 
(in > /usr/lib/x86_64-linux-gnu/libtcl8.6.so <http://libtcl8.6.so>) > ==37899== by 0x4C794D8: Tcl_FSEvalFileEx (in > /usr/lib/x86_64-linux-gnu/libtcl8.6.so <http://libtcl8.6.so>) > ==37899== by 0x4C818AD: Tcl_MainEx (in > /usr/lib/x86_64-linux-gnu/libtcl8.6.so <http://libtcl8.6.so>) > ==37899== by 0x4B745AF: NsThreadMain (thread.c:232) > ==37899== by 0x4B75A48: ThreadMain (pthread.c:870) > ==37899== by 0x521BEA6: start_thread (pthread_create.c:477) > ==37899== by 0x4A4DDEE: clone (clone.S:95) > ==37899== If you believe this happened as a result of a stack > ==37899== overflow in your program's main thread (unlikely but > ==37899== possible), you can try to increase the size of the > ==37899== main thread stack using the --main-stacksize= flag. > ==37899== The main thread stack size used in this run was 8388608. > ==37899== > ==37899== HEAP SUMMARY: > ==37899== in use at exit: 12,499,085 bytes in 8,840 blocks > ==37899== total heap usage: 12,059 allocs, 3,219 frees, 29,314,210 > bytes allocated > ==37899== > ==37899== LEAK SUMMARY: > ==37899== definitely lost: 131 bytes in 1 blocks > ==37899== indirectly lost: 0 bytes in 0 blocks > ==37899== possibly lost: 10,466,879 bytes in 2,978 blocks > ==37899== still reachable: 2,032,075 bytes in 5,861 blocks > ==37899== suppressed: 0 bytes in 0 blocks > ==37899== Rerun with --leak-check=full to see details of leaked memory > ==37899== > ==37899== For lists of detected and suppressed errors, rerun with: -s > ==37899== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0) > Segmentation fault > make: *** [Makefile:273: memcheck] Error 139 > > > > On Wed, 6 Apr 2022 at 16:58, Gustaf Neumann <ne...@wu...> wrote: > > > On 06.04.22 16:46, David Osborne wrote: >> >> On Wed, 6 Apr 2022 at 14:53, Gustaf Neumann <ne...@wu...> wrote: >> >> Hi David, >> >> i will setup a VM for testing in your configuration, but >> first i have to >> understand, what pt1/pt2 means. 
>> >> * >> * >> *Sorry that is just an abbreviation for "part1" and "part2" of a >> 2 part email. >> * > > ok, i thought there is a version called "Debian Buster pt1".... > but could not find insights via googling :) > >> *"tcl8.6" debian supplied package version 8.6.9+dfsg-2* > > This seems to be a part of the problem. Tcl 8.6.9 was released in > nov 2018 and has > probably some issues with UTF-8 which were fixed in later releases. > > i have just now installed NaviServer on a fresh Debian Buster > machine using my usual install script [1] (using Tcl 8.6.11) and > everything looks ok. It is not unlikely that the problem with > ns_strcoll is related, since one has to translate the "internal" > UTF-8 to the external variant before calling "strcoll_l()", so, > when this step is broken, then there might be some invalid memory > around. > > For you, it would the best to use a newer version of Tcl. There > are newer Debian packages of Tcl around... > > https://packages.debian.org/search?keywords=tcl > > Is this an option for you? > > Not sure, how NaviServer could address the problem. Deactivating > the ns_strcoll command in NaviServer when it is compiled with Tcl > 8.6.9 or older, is probably no good option, since the > UTF-to-external conversion is now all over the place and the > problem will pop up at other places. We can consider deactivating > the UTF-to-external conversion altogether for older Tcl version > (requires several changes, including PostgreSQL driver) ... but > the many tests will fail as well, which have to be deactivated as > well. > > What do you think? > > -gn > > > > _______________________________________________ > naviserver-devel mailing list > nav...@li... > https://lists.sourceforge.net/lists/listinfo/naviserver-devel -- Univ.Prof. Dr. Gustaf Neumann Head of the Institute of Information Systems and New Media of Vienna University of Economics and Business Program Director of MSc "Information Systems" |
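
The idea of checking for the locale before relying on it can be sketched in C as follows. This is only an illustration under assumptions, not the code in nsd: the helper names are made up, and the sketch relies on the POSIX 2008 calls newlocale(), strcoll_l() and freelocale(). The negative case assumes a C library such as glibc, where newlocale() fails for a locale that is not installed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#ifdef __APPLE__
#include <xlocale.h>
#endif

/* Hypothetical helper: true when the named locale can be loaded for collation. */
static bool collate_locale_available(const char *name)
{
    assert(name != NULL);

    locale_t loc = newlocale(LC_COLLATE_MASK, name, (locale_t)0);
    if (loc == (locale_t)0) {
        return false;   /* e.g. "en_US.UTF-8" missing on a minimal install */
    }
    freelocale(loc);
    return true;
}

/*
 * Hypothetical helper: compare two strings in the named locale.
 * Returns false (and leaves *result untouched) when the locale is
 * missing, so strcoll_l() is never called with an invalid locale object.
 */
static bool collate_in_locale(const char *name, const char *a,
                              const char *b, int *result)
{
    assert(name != NULL && a != NULL && b != NULL && result != NULL);

    locale_t loc = newlocale(LC_COLLATE_MASK, name, (locale_t)0);
    if (loc == (locale_t)0) {
        return false;
    }
    *result = strcoll_l(a, b, loc);
    freelocale(loc);
    return true;
}
```

A server could call something like collate_locale_available("en_US.UTF-8") once at startup and stop with a clear error message when it returns false, instead of failing later inside a comparison.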
|
From: David O. <da...@qc...> - 2022-04-07 11:17:26
|
Thanks very much,

For versions of Tcl less than 8.6.11 we're failing because it's a new
test exposing an old problem, is that correct?
This would explain why I don't see any test failures after building
v4.99.22 with Tcl 8.6.9 - it's only because 4.99.23 has introduced the
tests and not that 4.99.22 doesn't have the problem.

To test Tcl8.6.11 the easiest way for me is to jump to bullseye
(Debian v11) which provides 8.6.11+dfsg-1

Strangely I seem to get the same ns_strcoll seg fault with that.
But if I remove misc.test temporarily, all other tests pass happily.

# uname -a
Linux ip-172-0-1-190 5.10.0-12-cloud-amd64 #1 SMP Debian 5.10.103-1 (2022-03-07) x86_64 GNU/Linux

# cat /etc/debian_version
11.3

# ls -l /lib/x86_64-linux-gnu/libc.so.6
lrwxrwxrwx 1 root root 12 Mar 17 21:37 /lib/x86_64-linux-gnu/libc.so.6 -> libc-2.31.so

# apt-cache policy tcl8.6
tcl8.6:
  Installed: 8.6.11+dfsg-1

# git clone https://bitbucket.org/naviserver/naviserver.git
Cloning into 'naviserver'...

# cd naviserver
# git checkout tags/naviserver-4.99.23
Note: switching to 'tags/naviserver-4.99.23'.

# ./autogen.sh --with-tcl=/usr/lib/tcl8.6 --enable-rpath --enable-threads --enable-symbols
# make

Compiler warning for reference:

gcc -Wall -fPIC -g -O2 -fdebug-prefix-map=/build/tcl8.6-qxVr7a/tcl8.6-8.6.11+dfsg=. -fstack-protector-strong -Wformat -Werror=format-security -fno-unit-at-a-time -pipe -Wdate-time -D_FORTIFY_SOURCE=2 -DNDEBUG -DSYSTEM_MALLOC -DTCL_NO_DEPRECATED -std=c99 -I../include -I"/usr/include/tcl8.6" -DHAVE_CONFIG_H -c -o tclenv.o tclenv.c
In file included from /usr/include/string.h:495,
                 from ../include/nsthread.h:378,
                 from ../include/ns.h:46,
                 from nsd.h:38,
                 from tclenv.c:37:
In function ‘strncat’,
    inlined from ‘PutEnv’ at tclenv.c:349:13:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:136:10: warning: ‘__builtin_strncat’ specified bound depends on the length of the source argument [-Wstringop-overflow=]
  136 |   return __builtin___strncat_chk (__dest, __src, __len, __bos (__dest));
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tclenv.c: In function ‘PutEnv’:
tclenv.c:314:23: note: length computed here
  314 |         valueLength = strlen(value) + 1;
      |                       ^~~~~~~~~~~~~

# make memcheck TESTFLAGS="-verbose start -file misc.test"
---- ns_random-1.1 start
---- ns_fmttime-1.0 start
---- ns_fmttime-1.1 start
---- ns_trim-0.0 start
---- ns_trim-0.1 start
---- ns_trim-0.2 start
---- ns_trim-1.1 start
---- ns_trim-1.2 start
---- ns_trim-1.3 start
---- ns_trim-1.4 start
---- ns_trim-1.5 start
---- ns_trim-2.1 start
---- ns_trim-2.2 start
---- ns_quotehtml start
---- ns_strcoll-1.0.0 start
==37899== Thread 2:
==37899== Invalid read of size 8
==37899==    at 0x49E1361: strcoll_l (strcoll_l.c:260)
==37899==    by 0x48DA9FF: NsTclStrcollObjCmd (tclmisc.c:2802)
==37899==    by 0x4BBC4A1: TclNRRunCallbacks (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4BBD71F: ??? (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4C794D8: Tcl_FSEvalFileEx (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4C818AD: Tcl_MainEx (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4B745AF: NsThreadMain (thread.c:232)
==37899==    by 0x4B75A48: ThreadMain (pthread.c:870)
==37899==    by 0x521BEA6: start_thread (pthread_create.c:477)
==37899==    by 0x4A4DDEE: clone (clone.S:95)
==37899==  Address 0x18 is not stack'd, malloc'd or (recently) free'd
==37899==
==37899==
==37899== Process terminating with default action of signal 11 (SIGSEGV)
==37899==  Access not within mapped region at address 0x18
==37899==    at 0x49E1361: strcoll_l (strcoll_l.c:260)
==37899==    by 0x48DA9FF: NsTclStrcollObjCmd (tclmisc.c:2802)
==37899==    by 0x4BBC4A1: TclNRRunCallbacks (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4BBD71F: ??? (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4C794D8: Tcl_FSEvalFileEx (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4C818AD: Tcl_MainEx (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4B745AF: NsThreadMain (thread.c:232)
==37899==    by 0x4B75A48: ThreadMain (pthread.c:870)
==37899==    by 0x521BEA6: start_thread (pthread_create.c:477)
==37899==    by 0x4A4DDEE: clone (clone.S:95)
==37899==  If you believe this happened as a result of a stack
==37899==  overflow in your program's main thread (unlikely but
==37899==  possible), you can try to increase the size of the
==37899==  main thread stack using the --main-stacksize= flag.
==37899==  The main thread stack size used in this run was 8388608.
==37899==
==37899== HEAP SUMMARY:
==37899==     in use at exit: 12,499,085 bytes in 8,840 blocks
==37899==   total heap usage: 12,059 allocs, 3,219 frees, 29,314,210 bytes allocated
==37899==
==37899== LEAK SUMMARY:
==37899==    definitely lost: 131 bytes in 1 blocks
==37899==    indirectly lost: 0 bytes in 0 blocks
==37899==      possibly lost: 10,466,879 bytes in 2,978 blocks
==37899==    still reachable: 2,032,075 bytes in 5,861 blocks
==37899==         suppressed: 0 bytes in 0 blocks
==37899== Rerun with --leak-check=full to see details of leaked memory
==37899==
==37899== For lists of detected and suppressed errors, rerun with: -s
==37899== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault
make: *** [Makefile:273: memcheck] Error 139

On Wed, 6 Apr 2022 at 16:58, Gustaf Neumann <ne...@wu...> wrote:

> On 06.04.22 16:46, David Osborne wrote:
>
> On Wed, 6 Apr 2022 at 14:53, Gustaf Neumann <ne...@wu...> wrote:
>
>> Hi David,
>>
>> i will setup a VM for testing in your configuration, but first i have to
>> understand, what pt1/pt2 means.
>>
>
> *Sorry that is just an abbreviation for "part1" and "part2" of a 2 part
> email.*
>
> ok, i thought there is a version called "Debian Buster pt1".... but could
> not find insights via googling :)
>
> *"tcl8.6" debian supplied package version 8.6.9+dfsg-2*
>
> This seems to be a part of the problem. Tcl 8.6.9 was released in nov 2018
> and has probably some issues with UTF-8 which were fixed in later releases.
>
> i have just now installed NaviServer on a fresh Debian Buster machine
> using my usual install script [1] (using Tcl 8.6.11) and everything looks
> ok. It is not unlikely that the problem with ns_strcoll is related, since
> one has to translate the "internal" UTF-8 to the external variant before
> calling "strcoll_l()", so, when this step is broken, then there might be
> some invalid memory around.
>
> For you, it would the best to use a newer version of Tcl. There are newer
> Debian packages of Tcl around...
>
> https://packages.debian.org/search?keywords=tcl
>
> Is this an option for you?
>
> Not sure, how NaviServer could address the problem. Deactivating the
> ns_strcoll command in NaviServer when it is compiled with Tcl 8.6.9 or
> older, is probably no good option, since the UTF-to-external conversion is
> now all over the place and the problem will pop up at other places. We can
> consider deactivating the UTF-to-external conversion altogether for older
> Tcl version (requires several changes, including PostgreSQL driver) ... but
> the many tests will fail as well, which have to be deactivated as well.
>
> What do you think?
>
> -gn
>
|
|
From: Gustaf N. <ne...@wu...> - 2022-04-06 15:57:38
|
On 06.04.22 16:46, David Osborne wrote: > > On Wed, 6 Apr 2022 at 14:53, Gustaf Neumann <ne...@wu...> wrote: > > Hi David, > > i will setup a VM for testing in your configuration, but first i > have to > understand, what pt1/pt2 means. > > * > * > *Sorry that is just an abbreviation for "part1" and "part2" of a 2 > part email. > * ok, i thought there is a version called "Debian Buster pt1".... but could not find insights via googling :) > *"tcl8.6" debian supplied package version 8.6.9+dfsg-2* This seems to be a part of the problem. Tcl 8.6.9 was released in nov 2018 and has probably some issues with UTF-8 which were fixed in later releases. i have just now installed NaviServer on a fresh Debian Buster machine using my usual install script [1] (using Tcl 8.6.11) and everything looks ok. It is not unlikely that the problem with ns_strcoll is related, since one has to translate the "internal" UTF-8 to the external variant before calling "strcoll_l()", so, when this step is broken, then there might be some invalid memory around. For you, it would the best to use a newer version of Tcl. There are newer Debian packages of Tcl around... https://packages.debian.org/search?keywords=tcl Is this an option for you? Not sure, how NaviServer could address the problem. Deactivating the ns_strcoll command in NaviServer when it is compiled with Tcl 8.6.9 or older, is probably no good option, since the UTF-to-external conversion is now all over the place and the problem will pop up at other places. We can consider deactivating the UTF-to-external conversion altogether for older Tcl version (requires several changes, including PostgreSQL driver) ... but the many tests will fail as well, which have to be deactivated as well. What do you think? -gn # cat /etc/debian_version 10.12 ... [06/Apr/2022:17:06:44][23784.7fa1b0695800][-main:conf-] Notice: nsmain: NaviServer/4.99.23 (tar-4.99.23) starting ... all.tcl: Total 1860 Passed 1834 Skipped 26 Failed 0 Sourced 72 Test Files. 
Number of tests skipped for each constraint: 20 !usingExternalToUtf 2 binaryMismatch 1 copyAliasBug 2 knownBug 1 stress root@buster:/usr/local/src/naviserver-4.99.23# uname -a Linux buster 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2 (2020-04-29) x86_64 GNU/Linux root@buster:/usr/local/src/naviserver-4.99.23# ls -l /lib/x86_64-linux-gnu/libc.so.6 lrwxrwxrwx 1 root root 12 Mar 15 22:48 /lib/x86_64-linux-gnu/libc.so.6 -> libc-2.28.so root@buster:/usr/local/src/naviserver-4.99.23# make memcheck TESTFLAGS="-verbose start -file misc.test" ... ---- ns_trim-1.5 start ---- ns_trim-2.1 start ---- ns_trim-2.2 start ---- ns_quotehtml start ---- ns_strcoll-1.0.0 start ---- ns_strcoll-1.0.1 start ---- ns_strcoll-1.0.2 start ---- ns_strcoll-1.1 start ---- ns_strcoll-1.2 start [1] https://github.com/gustafn/install-ns > The setup based on install-ns.sh [1] of the release was tested with: > > macOS 11.6.2, Rocky Linux 8.5, Ubuntu 20.04, OpenBSD 6.9, > FreeBSD 13.1 > > The situation of strcoll is also platform dependent (depends on > version > of libc, e.g. the detail behavior is different on *BSD to the tested > Linux versions - but identical on the test cases). Maybe we have to > deactivate it for some platforms until these contain working versions. > > all the best > > -g > > [1] https://github.com/gustafn/install-ns > > On 06.04.22 13:24, David Osborne wrote: > > Hi there, > > > > We're trying to do a build of the official NaviServer v4.99.23 > release > > (from the sourceforge tarball) on Debian Buster (10.12) but we're > > getting some failed tests. > > > > First one is encoding_ns_http-1.1 > > > > Seems it's serving an Emoji but the expected content-length is > wrong > > upon receiving it. > > I see there's been some discussion about Emoji support since > 4.99.23 > > (which I don't fully understand) - not sure if that's relevant > here.. 
> > > > * Test in question: > > > > test encoding_ns_http-1.1 { > > Send body with ns_return and charset utf-8 > > } -constraints usingExternalToUtf -setup { > > ns_register_proc GET /encoding { > > ns_return 200 "text/plain; charset=utf-8" "äöü😃" > > } > > } -body { > > set result [ns_http run [ns_config test listenurl]/encoding] > > set headers [dict get $result headers] > > list [dict get $result status] \ > > [ns_set iget $headers Content-Type] \ > > [ns_set iget $headers Content-Length] \ > > [dict get $result body] > > } -cleanup { > > ns_unregister_op GET /encoding > > } -result [list 200 "text/plain; charset=utf-8" 10 "äöü😃"] > > > > > > * Reproduction steps below. > > > > $ uname -a > > Linux ip-172-0-1-61 4.19.0-18-cloud-amd64 #1 SMP Debian 4.19.208-1 > > (2021-09-29) x86_64 GNU/Linux > > $ cat /etc/debian_version > > 10.12 > > $ locale > > LANG=C.UTF-8 > > LANGUAGE= > > LC_CTYPE="C.UTF-8" > > LC_NUMERIC="C.UTF-8" > > LC_TIME="C.UTF-8" > > LC_COLLATE="C.UTF-8" > > LC_MONETARY="C.UTF-8" > > LC_MESSAGES="C.UTF-8" > > LC_PAPER="C.UTF-8" > > LC_NAME="C.UTF-8" > > LC_ADDRESS="C.UTF-8" > > LC_TELEPHONE="C.UTF-8" > > LC_MEASUREMENT="C.UTF-8" > > LC_IDENTIFICATION="C.UTF-8" > > LC_ALL= > > > > $ apt-get install build-essential git automake tcl8.6-dev libssl-dev > > $ cd naviserver-4.99.23 > > $ ./autogen.sh --disable-ipv6 --with-tcl=/usr/lib/tcl8.6 > > --enable-rpath --enable-threads --enable-symbols > > $ make > > $ make test > > <snip> > > ==== encoding_ns_http-1.1 Send body with ns_return and charset > utf-8 > > FAILED > > ==== Contents of test case: > > > > set result [ns_http run [ns_config test listenurl]/encoding] > > set headers [dict get $result headers] > > list [dict get $result status] [ns_set iget $headers > > Content-Type] [ns_set iget $headers Content-Length] [dict get > > $result body] > > > > ---- Result was: > > 200 {text/plain; charset=utf-8} 14 äöü😃 > > ---- Result should have been (exact matching): > > 200 {text/plain; charset=utf-8} 10 äöü😃 
> > ==== encoding_ns_http-1.1 FAILED > > > > Can anyone shed any light? > > Regards, > > -- > > David > > > > _______________________________________________ > naviserver-devel mailing list > nav...@li... > https://lists.sourceforge.net/lists/listinfo/naviserver-devel > > > > -- > David > > > _______________________________________________ > naviserver-devel mailing list > nav...@li... > https://lists.sourceforge.net/lists/listinfo/naviserver-devel -- Univ.Prof. Dr. Gustaf Neumann Head of the Institute of Information Systems and New Media of Vienna University of Economics and Business Program Director of MSc "Information Systems" |
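
The expected Content-Length of 10 in the failing test follows from the standard UTF-8 byte lengths of the four characters in the body. The C sketch below only shows that arithmetic; the function names are made up for illustration, and the sketch does not explain where the observed value of 14 comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes that standard UTF-8 needs for one Unicode code point (0 if invalid). */
static size_t utf8_length(uint32_t codePoint)
{
    if (codePoint < 0x80) {
        return 1;
    }
    if (codePoint < 0x800) {
        return 2;
    }
    if (codePoint >= 0xD800 && codePoint <= 0xDFFF) {
        return 0;   /* surrogate halves are not valid in UTF-8 */
    }
    if (codePoint < 0x10000) {
        return 3;
    }
    if (codePoint <= 0x10FFFF) {
        return 4;
    }
    return 0;       /* beyond the Unicode range */
}

/* Sum of the UTF-8 lengths of "count" code points. */
static size_t utf8_total_length(const uint32_t *codePoints, size_t count)
{
    assert(codePoints != NULL || count == 0);

    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += utf8_length(codePoints[i]);
    }
    return total;
}
```

For the test body, ä (U+00E4), ö (U+00F6) and ü (U+00FC) take 2 bytes each and 😃 (U+1F603) takes 4 bytes, so utf8_total_length() returns 2 + 2 + 2 + 4 = 10 for these four code points. Any other Content-Length means the body was not sent as plain UTF-8.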
|
From: David O. <da...@qc...> - 2022-04-06 14:47:12
|
Thanks - answers inline....
On Wed, 6 Apr 2022 at 14:53, Gustaf Neumann <ne...@wu...> wrote:
> Hi David,
>
> i will setup a VM for testing in your configuration, but first i have to
> understand, what pt1/pt2 means.
>
*Sorry, that is just an abbreviation for "part1" and "part2" of a 2-part
email. I thought it would be confusing to report both problems in the same
email, so I split it (and therefore unintentionally made it more confusing!).
Both problems are occurring on the exact same build.*
> Is it sufficient to set up a Debian Buster with all available updates?
>
*Yes. I used a standard buster amd64 image on AWS, then ran "apt-get
upgrade" before building NaviServer.*
>
> The emoji/UTF-8 problem points to a Tcl problem. What exact version of
> Tcl is used in this installation?
>
*"tcl8.6" Debian-supplied package, version 8.6.9+dfsg-2*
> The setup based on install-ns.sh [1] of the release was tested with:
>
> macOS 11.6.2, Rocky Linux 8.5, Ubuntu 20.04, OpenBSD 6.9, FreeBSD 13.1
>
> The situation of strcoll is also platform dependent (depends on version
> of libc, e.g. the detail behavior is different on *BSD to the tested
> Linux versions - but identical on the test cases). Maybe we have to
> deactivate it for some platforms until these contain working versions.
>
> all the best
>
> -g
>
> [1] https://github.com/gustafn/install-ns
--
David
|
|
From: Wolfgang W. <wol...@di...> - 2022-04-06 14:45:17
|
Hi!
We are using Debian Buster without any problems. Here are some details:
cat /etc/debian_version
10.12
dcweb:nscp 1> info patchlevel
8.6.11
dcweb:nscp 1> ns_info patchlevel
4.99.23
This is our configure script:
# --with-openssl=/usr/local assumes a manually installed openssl version.
# For debian buster (10), an entry in /etc/ld.so.conf.d/libc.conf for
# /usr/local/lib64 and calling "ldconfig" once is necessary
./configure --enable-64bit --prefix=/usr/local/naviserver --with-openssl=/usr/local/ --with-tcl=/usr/local/lib/ --enable-threads
Tcl is self compiled.
Yours,
Wolfgang
--
Wolfgang Winkler
Management
wol...@di...
mobile +43.699.19971172
dc:büro
digital concepts Novak Winkler OG
Software & Design
Landstraße 68, 5. Stock, 4020 Linz
www.digital-concepts.com
tel +43.732.997117.72
tel +43.699.1997117.2
Company register number: 192003h
Commercial register court: Landesgericht Linz
|
|
From: Gustaf N. <ne...@wu...> - 2022-04-06 13:53:20
|
Hi David,
I will set up a VM for testing in your configuration, but first I have to
understand what pt1/pt2 means.
Is it sufficient to set up a Debian Buster with all available updates?
The emoji/UTF-8 problem points to a Tcl problem. What exact version of
Tcl is used in this installation?
The setup based on install-ns.sh [1] of the release was tested with:
macOS 11.6.2, Rocky Linux 8.5, Ubuntu 20.04, OpenBSD 6.9, FreeBSD 13.1
The situation of strcoll is also platform dependent: it depends on the
version of libc (e.g., the detailed behavior on *BSD differs from that of
the tested Linux versions, although it is identical on the test cases).
Maybe we have to deactivate it for some platforms until these contain
working versions.
all the best
-g
[1] https://github.com/gustafn/install-ns
|
|
From: David O. <da...@qc...> - 2022-04-06 11:54:25
|
Still building the official NaviServer v4.99.23 release on Debian
Buster, we are also seeing some SIGSEGVs, starting from ns_strcoll-1.0.0:
test ns_strcoll-1.0.0 {ns_strcoll without locale (assuming en_US.UTF-8)} \
    -constraints localeCollate -body {
        return [expr {[ns_strcoll Bär Bor] < 0}]
    } -result 1
Running "make memcheck", I get the following output for the first signal 11:
==19345== Thread 2:
==19345== Invalid read of size 8
==19345== at 0x5113371: strcoll_l (strcoll_l.c:260)
==19345== by 0x48D51DA: NsTclStrcollObjCmd (tclmisc.c:2802)
==19345== by 0x49EDFB6: TclNRRunCallbacks (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==19345== by 0x49EF3AE: ??? (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==19345== by 0x4AA83F7: Tcl_FSEvalFileEx (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==19345== by 0x4AB0246: Tcl_MainEx (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==19345== by 0x49A527B: NsThreadMain (thread.c:232)
==19345== by 0x49A64C8: ThreadMain (pthread.c:870)
==19345== by 0x5256FA2: start_thread (pthread_create.c:486)
==19345== by 0x5180EFE: clone (clone.S:95)
==19345== Address 0x18 is not stack'd, malloc'd or (recently) free'd
==19345==
==19345==
==19345== Process terminating with default action of signal 11 (SIGSEGV)
==19345== Access not within mapped region at address 0x18
==19345== at 0x5113371: strcoll_l (strcoll_l.c:260)
==19345== by 0x48D51DA: NsTclStrcollObjCmd (tclmisc.c:2802)
==19345== by 0x49EDFB6: TclNRRunCallbacks (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==19345== by 0x49EF3AE: ??? (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==19345== by 0x4AA83F7: Tcl_FSEvalFileEx (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==19345== by 0x4AB0246: Tcl_MainEx (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==19345== by 0x49A527B: NsThreadMain (thread.c:232)
==19345== by 0x49A64C8: ThreadMain (pthread.c:870)
==19345== by 0x5256FA2: start_thread (pthread_create.c:486)
==19345== by 0x5180EFE: clone (clone.S:95)
==19345== If you believe this happened as a result of a stack
==19345== overflow in your program's main thread (unlikely but
==19345== possible), you can try to increase the size of the
==19345== main thread stack using the --main-stacksize= flag.
==19345== The main thread stack size used in this run was 8388608.
==19345==
==19345== HEAP SUMMARY:
==19345== in use at exit: 12,996,807 bytes in 12,198 blocks
==19345== total heap usage: 40,995 allocs, 28,797 frees, 41,780,348 bytes allocated
==19345==
==19345== LEAK SUMMARY:
==19345== definitely lost: 179 bytes in 3 blocks
==19345== indirectly lost: 0 bytes in 0 blocks
==19345== possibly lost: 11,145,880 bytes in 6,013 blocks
==19345== still reachable: 1,850,748 bytes in 6,182 blocks
==19345== suppressed: 0 bytes in 0 blocks
==19345== Rerun with --leak-check=full to see details of leaked memory
==19345==
==19345== For counts of detected and suppressed errors, rerun with: -v
==19345== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault
make: *** [Makefile:273: memcheck] Error 139
Can you advise on the best course of action?
Regards,
--
David
|
|
From: David O. <da...@qc...> - 2022-04-06 11:48:35
|
Hi there,
We're trying to do a build of the official NaviServer v4.99.23 release
(from the sourceforge tarball) on Debian Buster (10.12) but we're getting
some failed tests.
First one is encoding_ns_http-1.1
Seems it's serving an Emoji but the expected content-length is wrong upon
receiving it.
I see there's been some discussion about Emoji support since 4.99.23 (which
I don't fully understand) - not sure if that's relevant here..
* Test in question:
test encoding_ns_http-1.1 {
    Send body with ns_return and charset utf-8
} -constraints usingExternalToUtf -setup {
    ns_register_proc GET /encoding {
        ns_return 200 "text/plain; charset=utf-8" "äöü😃"
    }
} -body {
    set result [ns_http run [ns_config test listenurl]/encoding]
    set headers [dict get $result headers]
    list [dict get $result status] \
        [ns_set iget $headers Content-Type] \
        [ns_set iget $headers Content-Length] \
        [dict get $result body]
} -cleanup {
    ns_unregister_op GET /encoding
} -result [list 200 "text/plain; charset=utf-8" 10 "äöü😃"]
* Reproduction steps below.
$ uname -a
Linux ip-172-0-1-61 4.19.0-18-cloud-amd64 #1 SMP Debian 4.19.208-1 (2021-09-29) x86_64 GNU/Linux
$ cat /etc/debian_version
10.12
$ locale
LANG=C.UTF-8
LANGUAGE=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_PAPER="C.UTF-8"
LC_NAME="C.UTF-8"
LC_ADDRESS="C.UTF-8"
LC_TELEPHONE="C.UTF-8"
LC_MEASUREMENT="C.UTF-8"
LC_IDENTIFICATION="C.UTF-8"
LC_ALL=
$ apt-get install build-essential git automake tcl8.6-dev libssl-dev
$ cd naviserver-4.99.23
$ ./autogen.sh --disable-ipv6 --with-tcl=/usr/lib/tcl8.6 --enable-rpath \
    --enable-threads --enable-symbols
$ make
$ make test
<snip>
==== encoding_ns_http-1.1 Send body with ns_return and charset utf-8 FAILED
==== Contents of test case:
set result [ns_http run [ns_config test listenurl]/encoding]
set headers [dict get $result headers]
list [dict get $result status] [ns_set iget $headers Content-Type] [ns_set iget $headers Content-Length] [dict get $result body]
---- Result was:
200 {text/plain; charset=utf-8} 14 äöü😃
---- Result should have been (exact matching):
200 {text/plain; charset=utf-8} 10 äöü😃
==== encoding_ns_http-1.1 FAILED
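The two Content-Length values in the output above can be checked by hand. The following Python sketch (Python is used here only for illustration, it is not part of the NaviServer test suite) counts the UTF-8 bytes of the test body: the three umlauts take two bytes each and the emoji takes four, which gives the expected 10. The second calculation is only an assumption about where 14 might come from and is not confirmed anywhere in this thread: if each of the emoji's four bytes were misread as a separate Latin-1 character and then re-encoded as UTF-8, the body would grow to 14 bytes.

```python
# Byte-length check for the test body "äöü😃".
body = "äöü😃"

# Correct UTF-8: 2 + 2 + 2 bytes for the umlauts, 4 bytes for the emoji.
expected = len(body.encode("utf-8"))


def misread_emoji_length(text: str) -> int:
    """Hypothetical failure mode (an assumption, not taken from the thread):
    characters outside the Basic Multilingual Plane are split into their
    UTF-8 bytes, each byte is treated as a Latin-1 character, and the
    result is encoded as UTF-8 again."""
    total = 0
    for ch in text:
        if ord(ch) > 0xFFFF:
            raw = ch.encode("utf-8")  # 4 bytes for the emoji
            # Every one of these bytes is >= 0x80, so each becomes 2 bytes.
            total += len(raw.decode("latin-1").encode("utf-8"))
        else:
            total += len(ch.encode("utf-8"))
    return total


print(expected)                    # 10
print(misread_emoji_length(body))  # 14
```

If this assumption holds, the difference of four bytes would come entirely from the emoji, which fits the remark that the problem only shows up with characters such as 😃.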
Can anyone shed any light?
Regards,
--
David
|
|
From: Gustaf N. <ne...@wu...> - 2022-03-18 11:35:50
|
Just as a short notice: iOS 14.5 (released a few days ago) supports some
more Unicode 14 characters, iOS 15 is supposed to support all of Unicode
14.0.
The melting face of Unicode 14 on the test-page on openacs.org (see link
below) works already.
-gn
On 04.12.21 15:57, Gustaf Neumann wrote:
> It will take some time, until the Emojis from Unicode 14 will be
> generally available, but when this comes, we should have already
> everything working in NaviServer and the DB interfaces. I've added a
> small demo page, one can try when the new clients come out:
>
> https://openacs.org/emojis.tcl
|
|
From: Gustaf N. <ne...@wu...> - 2022-03-03 09:49:28
|
Dear all,
There are more changes related to this problem area:
a) Due to the full support of UTF-8 in the database interface in the last
release, potential new problems showed up which were hidden so far by the
mangled Tcl-UTF-8. Similarly, problems showed up with vulnerability
scanners trying to inject invalid UTF-8, which caused some extensions
expecting only valid UTF-8 (e.g. tDOM) to abort with a fatal error. These
issues were addressed by the stronger input validation changes of the last
weeks and months since the release.
b) For full emoji support, it is also necessary to support emojis
specified as numeric entities in HTML markup. The old versions of
NaviServer were only capable of handling single-byte decimal numeric
entities; now multibyte decimal and hexadecimal numeric entities are
supported as well (see, e.g., the mermaids with the light and dark skin
tones in the regression test [1]). Since HTML entity interpretation was
previously available only through "ns_striphtml" (which also strips
comments and tags), I have added the command "ns_unquotehtml" as a
counterpart to "ns_quotehtml", which just interprets numeric and
non-numeric entities.
The next release should come out around Easter.
all the best
-gn
[1] https://bitbucket.org/naviserver/naviserver/commits/b923ad4384529a80ac88cadcadde1947a6413753#Ltests/ns_striphtml.testT369
|
|
From: Brian F. <Bri...@qu...> - 2022-03-02 15:35:53
|
Great news, and thanks to you and your team, Gustaf for this tremendous work.
Brian
|
|
From: Gustaf N. <ne...@wu...> - 2022-03-02 09:23:19
|
Dear all,
NaviServer has just been recognized with the Community Choice award by
SourceForge, which - to my understanding - reflects in essence the number
of downloads from SourceForge.
As an official winner of the Community Choice Award, we have the
permission to use the award badge wherever appropriate. If you are using
NaviServer in your products, you might use this information and badge to
advertise using an awarded server.
All the best
-gustaf neumann
-------- Forwarded Message --------
Subject: NaviServer Wins an Award from SourceForge
Date: Tue, 01 Mar 2022 23:23:35 +0000
From: SourceForge <no...@so...>
To: ne...@wu...
Hi gustafn,
Congratulations! NaviServer has just been recognized with a Community
Choice award by SourceForge. This honor is awarded only to select projects
that have reached significant milestones in terms of downloads and user
engagement from the SourceForge community.
This is a big achievement, as your project has qualified for this award
out of over 500,000 open source projects on SourceForge. SourceForge sees
nearly 30 million users per month looking for, and developing, open source
software. This award badge will now appear on your project page, and the
award assets can be found in your project admin section.
To recognize NaviServer's achievement, we've awarded you with the
Community Choice award, which you can see below:
Now that NaviServer is an official winner of the Community Choice Award,
you have express permission to use the award badge wherever you'd like.
Feel free to proudly display the award on your personal or organizational
website, social media, or anywhere else you'd like.
You can get the award badge assets here:
https://sourceforge.net/p/naviserver/admin/files/badges/
Congrats again on winning and keep doing amazing work because SourceForge
and our users appreciate it!
Thanks,
The SourceForge Team
|
|
From: Gustaf N. <ne...@wu...> - 2022-01-18 21:02:17
|
Dear Wolfgang,
ns_parseurl is intended to parse a complete URL, including scheme,
authority (host, port), path, query and fragment:
  foo://example.com:8042/over/there?name=ferret#nose
  \_/   \______________/\_________/ \_________/ \__/
   |           |            |            |        |
scheme     authority       path        query   fragment
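As a rough illustration of the five components in the diagram above, Python's standard urllib.parse.urlsplit performs the same kind of split. This is a sketch in Python and not NaviServer code; ns_parseurl returns its own key/value list (path, tail, query, ...) and may differ in detail.

```python
from urllib.parse import urlsplit

# Split the example URL from the diagram into its components.
parts = urlsplit("foo://example.com:8042/over/there?name=ferret#nose")

print(parts.scheme)    # foo
print(parts.netloc)    # example.com:8042  (the authority)
print(parts.hostname)  # example.com
print(parts.port)      # 8042
print(parts.path)      # /over/there
print(parts.query)     # name=ferret
print(parts.fragment)  # nose
```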
The path has to start with a slash; otherwise, ns_parseurl tries to parse
a scheme when the first characters look like a valid scheme, and it
complains when the scheme is not properly terminated. In your example, it
found the terminating colon (the one inside the query value).... I have
now made the parsing more robust, also in cases where such URLs are not
fully percent-encoded, while keeping the error cases constant.
We now see the following results:
% ns_parseurl "/test/index?url=http://www.test.at"
path test tail index query url=http://www.test.at
% ns_parseurl "index?url=http://www.test.com"
tail index query url=http://www.test.com
% ns_parseurl "/index?url=http://www.test.com"
path {} tail index query url=http://www.test.com
% ns_parseurl "index?url=https//www.test.com"
tail index query url=https//www.test.com
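Independent of the more robust parsing, a client that builds such URLs can sidestep the ambiguity by percent-encoding the embedded URL before appending it as a query value, which is what the encodeURIComponent call in the quoted report below does. A minimal Python sketch of the same idea, using only the standard library (the helper name build_relative_url is made up for this illustration):

```python
from urllib.parse import quote, unquote, urlsplit


def build_relative_url(page: str, target: str) -> str:
    """Hypothetical helper: append `target` as a fully percent-encoded
    'url' query parameter, so that its ':' and '/' characters cannot be
    mistaken for a scheme terminator or a path separator."""
    return f"{page}?url={quote(target, safe='')}"


url = build_relative_url("index", "http://www.test.com")
print(url)  # index?url=http%3A%2F%2Fwww.test.com

parts = urlsplit(url)
print(repr(parts.scheme))  # '' (no scheme is detected)
print(parts.path)          # index

# Decoding the query value gives back the original embedded URL.
decoded = unquote(parts.query.split("=", 1)[1])
print(decoded)             # http://www.test.com
```

With the encoded form, no colon is left in the relative URL at all, so a parser has no reason to look for a scheme in the first place.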
Altogether, we now have 81 tests for the URL parsing...
The changes are committed on Bitbucket.
Let me know in case you find other questionable cases.
all the best
-gn
On 14.01.22 08:34, Wolfgang Winkler via naviserver-devel wrote:
>
> Hello all!
>
> We encountered a small problem with ns_parseurl:
>
> When parsing
>
> ns_parseurl "/test/index?url=http://www.test.at"
>
> everything works fine.
>
> With
>
> ns_parseurl "index?url=http://www.test.com"
>
> We get "Could not parse URL "index?url=https://www.test.com": invalid
> scheme"
>
> * ns_parseurl "/index?url=http://www.test.com"
> * ns_parseurl "index?url=https//www.test.com" (notice the missing ":")
>
> work as well.
>
> The "url" Parameter is encoded with ns_urlencode. When encoding with JS:
>
> encodeURIComponent("http://www.test.com");
>
> we get
>
> 'http%3A%2F%2Fwww.test.com'
>
> Using this value:
>
> ns_parseurl "index?url=http%3A%2F%2Fwww.test.com"
>
> works. ns_urldecode decodes the value correctly.
>
> Regards,
>
> Wolfgang
--
Univ.Prof. Dr. Gustaf Neumann
Head of the Institute of Information Systems and New Media
of Vienna University of Economics and Business
Program Director of MSc "Information Systems"
|