From: Wolfgang W. <wol...@di...> - 2021-11-16 15:39:50
Hello all,

the fix worked, thank you Gustaf! But we still have a problem with
emojis when writing them to the database. The error we get is:

    Database operation "dml" failed (exception ERROR, "ERROR: invalid byte sequence for encoding "UTF8": 0xf0 0x9f 0x98 0xff

It occurs when trying to write the emoji to a TEXT or VARCHAR field in
the database. Inserting the same string in the database console works
as expected. When we read the string back and reinsert it, it also
works flawlessly.

We've compared the two strings, written them to files and compared
them with a hex viewer, and converted them with Tcl "encoding
convertto" and iconv, all with no luck.

We are using PostgreSQL 12 and the nsdbpg module with
naviserver-4.99.22-16-g67adf3c34710+

Here is the test case:

In the database console:

    CREATE TABLE test (
        idx SERIAL,
        txt TEXT
    );
    INSERT INTO test (txt) VALUES ('<smiley>😃</smiley>');

In the NaviServer console or in a script:

    # V1: working
    set db [ns_db gethandle]
    set sql "SELECT txt FROM test WHERE idx=1"
    set selection [ns_db 1row $db $sql]
    set str [ns_set value $selection 0]
    set sql "INSERT INTO test (txt) VALUES ('$str')"
    ns_db dml $db $sql
    ns_db releasehandle $db

    # V2: not working
    set db [ns_db gethandle]
    set sql "INSERT INTO test (txt) VALUES ('<smiley>😃</smiley>')"
    ns_db dml $db $sql
    ns_db releasehandle $db

With nscp, pasting the string of V2 already shows a wrong string in
the log:

    Notice: nscp: 3: set sql "INSERT INTO test (txt) VALUES ('<smiley>�������</smiley>')"

Whereas V1 works (the smiley is not printed here, but works in the
console):

    Notice: nscp: 5: puts $str
    <smiley></smiley>

Any help is greatly appreciated!

Wolfgang Winkler

On 09.11.21 at 09:36, Gustaf Neumann wrote:
> Dear all,
>
> The situation is trickier than one might hope. Aside from the Tcl
> version dependencies (as Brian pointed out), Tcl before 8.7 does not
> support TCL_UTF_MAX with longer multi-byte sequences than 4 (see Tcl
> TIP 389), which are also mostly relevant for some newer emojis. So,
> for full emoji support, Tcl 8.7 with the proper compilation options
> is needed.
>
> Anyhow, in the case of Wolfgang's "Smiling Face with Open Mouth" we
> have just a 4-byte UTF-8 character, which is supported by
> out-of-the-box Tcl 8.6. However, this emoji is represented
> Tcl-internally as a 6-byte sequence. Since NaviServer wrongly assumed
> that Tcl-internal representations are also accepted as external
> representations (which is not always true), a conversion step was
> omitted for UTF-8.
>
> In the tip version of NaviServer on Bitbucket, this optimization is
> now removed, the examples work as expected, and the regression test
> is extended for this case.
>
> Many thanks to Wolfgang for the good bug report.
>
> -g

--
*Wolfgang Winkler*
Managing Director
wol...@di...
mobile +43.699.19971172

dc:*büro*
digital concepts Novak Winkler OG
Software & Design
Landstraße 68, 5. Stock, 4020 Linz
www.digital-concepts.com
tel +43.732.997117.72
tel +43.699.1997117.2
Company register number: 192003h
Company register court: Landesgericht Linz
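
For readers of the archive, here is a rough illustration of the 4-byte versus 6-byte point in Gustaf's quoted reply. It is written in Python only to make the bytes easy to inspect, and it is not NaviServer or Tcl source code. It assumes that the 6-byte Tcl-internal form is a UTF-16 surrogate pair with each half encoded as its own 3-byte sequence; the function names are made up for this sketch.

```python
# Illustration only: byte-level view of U+1F603, the smiley from the test case.
# Assumption: the 6-byte internal form is a surrogate pair, each half
# encoded as a separate 3-byte sequence.

EMOJI = "\U0001F603"  # "Smiling Face with Open Mouth"


def external_utf8(ch: str) -> bytes:
    """Standard UTF-8 encoding (4 bytes for this character)."""
    return ch.encode("utf-8")


def surrogate_pair_bytes(ch: str) -> bytes:
    """Split the character into UTF-16 code units and encode each unit
    as if it were a standalone code point (3 bytes each)."""
    utf16 = ch.encode("utf-16-be")
    units = [int.from_bytes(utf16[i:i + 2], "big")
             for i in range(0, len(utf16), 2)]
    # "surrogatepass" lets Python emit bytes for lone surrogates,
    # which strict UTF-8 forbids.
    return b"".join(chr(u).encode("utf-8", "surrogatepass") for u in units)


def is_valid_utf8(data: bytes) -> bool:
    """True if a strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    for label, data in (("external", external_utf8(EMOJI)),
                        ("surrogate pair", surrogate_pair_bytes(EMOJI))):
        hex_bytes = " ".join(f"{b:02x}" for b in data)
        print(f"{label}: {hex_bytes} (valid UTF-8: {is_valid_utf8(data)})")
```

Under that assumption, a strict UTF-8 decoder rejects the 6-byte form. That is consistent with the database refusing the string when the conversion step to the external encoding is skipped.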