From: james r. <ji...@re...> - 2010-05-02 17:42:46
Wow, you are right, Ian. I just ran the following benchmark and the U
functions were faster, 9 seconds versus 16 seconds. I ran it multiple times
(and in reverse order) and got the same results.

I will change all of the existing string functions to the new Unicode-safe
code, remove my test/working U functions, and will commit later.

I am glad you were there to tell me to keep it simple.

Jim

a$ = "abcdef ghijklmno pqrstuvwxyzabcd efghijklmnopqr stuvwxyzabcd efghijklmnopq rs tuvwxyza bcdefghijklm nopqrstuvwxyza bcdefghijklmno pqrstuvwxyzabcde fghijklmnopqrstuvwxy zabcdefghijklmnopq rstuvwxyzabcdefgh ijklmnopqrstuvwxyz abcdefghijklmnopqrstuvwx yzabcdefghijklmnopqrstuvwxyza bcdefghijklmnopq rstuvwxyzabcdefghi jklmnopqrstuvwxyz"

start = hour * 60 * 60 + minute * 60 + second
for r = 1 to 1000
   for n = 1 to ulength(a$)
      c = useq(umid(a$,n,1))
      c$ = uchr(c)
      b$ = uleft(a$,n)
      b$ = uright(a$,n)
   next n
next r
finish = hour * 60 * 60 + minute * 60 + second
print finish - start

start = hour * 60 * 60 + minute * 60 + second
for r = 1 to 1000
   for n = 1 to length(a$)
      c = asc(mid(a$,n,1))
      c$ = chr(c)
      b$ = left(a$,n)
      b$ = right(a$,n)
   next n
next r
finish = hour * 60 * 60 + minute * 60 + second
print finish - start

On Sun, May 2, 2010 at 1:26 PM, james reneau <ji...@re...> wrote:

> Ian,
>
> The U functions do a whole bunch of additional conversion from the char* in
> utf8 to qstring and then back again to char* utf8. For those of us in the
> ASCII/English world I thought it would be slower. Haven't done a benchmark.
> Let me do one before I push.
>
> Jim
>
> On Sun, May 2, 2010 at 12:51 PM, Ian Larsen <dr...@gm...> wrote:
>
>> For the sake of simplicity, why don't you just have your U* functions
>> just replace the default? For ASCII strings they should do the same
>> thing as the regular ones.
>>
>> -Ian
>>
>> On Sun, May 2, 2010 at 12:45 PM, james reneau <ji...@re...> wrote:
>> > Guys,
>> >
>> > I have gotten the save and load to work with UTF8 and I am adding string
>> > functions to handle unicode (ULENGTH, USEQ, UCHR, UMID, ULEFT, URIGHT,
>> > UINSTR) and should have them committed later today.
>> >
>> > Jim
>> >
>> > On Sat, May 1, 2010 at 5:39 PM, Ian Larsen <dr...@gm...> wrote:
>> >>
>> >> Here's a screenshot of the result.
>> >>
>> >> On Sat, May 1, 2010 at 5:34 PM, Ian Larsen <dr...@gm...> wrote:
>> >> > It's even simpler than that. You just have to change most of the
>> >> > QString::toAscii calls to QString::toUtf8.
>> >> >
>> >> > My changes are committed. I've tried to test everything I could but
>> >> > more extensive testing would ensure I've got everything. If you see
>> >> > question marks instead of extended characters anywhere, please let me
>> >> > know.
>> >> >
>> >> > -Ian
>> >> >
>> >> > On Sat, May 1, 2010 at 4:57 PM, james reneau <ji...@re...> wrote:
>> >> >> Ian,
>> >> >>
>> >> >> That was my thought, too. I was going to email you to see if we could
>> >> >> change all the char* stuff in the stack and interpreter to QStrings?
>> >> >>
>> >> >> Looking forward to your commit.
>> >> >>
>> >> >> Jim
>> >> >>
>> >> >> On Fri, Apr 30, 2010 at 9:59 PM, Ian Larsen <dr...@gm...> wrote:
>> >> >>>
>> >> >>> All,
>> >> >>>
>> >> >>> I was wrong about Flex; it handles Utf8 just fine. The problem was
>> >> >>> with the way QStrings were being converted. I have a working version
>> >> >>> that I'm going to test some more and commit tomorrow.
>> >> >>>
>> >> >>> -Ian
>> >> >>>
>> >> >>> On Fri, Apr 30, 2010 at 5:53 PM, Ian Larsen <dr...@gm...> wrote:
>> >> >>> > All,
>> >> >>> >
>> >> >>> > I believe the reason you're seeing the question marks in the output
>> >> >>> > is because GNU Flex and Bison, which the basic256 parser is written
>> >> >>> > in, don't support Unicode at all.
>> >> >>> >
>> >> >>> > There are no simple fixes for this, unfortunately. Here are some
>> >> >>> > possibilities:
>> >> >>> >
>> >> >>> > 1) Encode ALL strings in a program's source code using base64 and
>> >> >>> > then decode them prior to pushing them onto the operand stack. This
>> >> >>> > is an ugly hack, but right now would be the path of least resistance.
>> >> >>> > 2) Find a drop-in replacement for Flex and Bison that supports
>> >> >>> > Unicode.
>> >> >>> > 3) Write a custom parser that supports Unicode. This would be a
>> >> >>> > *lot* of work, but would be a lot of fun for someone interested in
>> >> >>> > learning compiler design.
>> >> >>> >
>> >> >>> > If anyone has any other ideas, please let me know.
>> >> >>> >
>> >> >>> > -Ian
>> >> >>> >
>> >> >>> > On Fri, Apr 30, 2010 at 11:06 AM, <web...@bi...> wrote:
>> >> >>> >> Ian,
>> >> >>> >>
>> >> >>> >> I am very glad that you have returned to the development of
>> >> >>> >> BASIC256! I would just tell you about a serious problem that exists
>> >> >>> >> for users who use the Russian language. Attached - screenshot.
>> >> >>> >>
>> >> >>> >> I made a patch for version 0.9.5, which was published 12/2009, for
>> >> >>> >> the distribution of ALT Linux. Of course, this patch is not urgent,
>> >> >>> >> since you have done a lot of changes. Can I ask you to make the
>> >> >>> >> necessary changes (because I have little experience), or is the
>> >> >>> >> provision for Russian-speaking users only my problem? :-)
>> >> >>> >>
>> >> >>> >>> On this list about two weeks ago we got a French translation if
>> >> >>> >>> anyone would like to add that in. If not, I'll get around to it
>> >> >>> >>> eventually.
>> >> >>> >>
>> >> >>> >> I have a little more experience, so it's better if you did.
>> >> >>> >>
>> >> >>> >> --
>> >> >>> >> Blessing,
>> >> >>> >> Sergei Irupin
>> >> >>> >> http://rnd-lug.blogspot.com/
>> >> >>> >
>> >> >>> > --
>> >> >>> > My PGP Public Key:
>> >> >>> > http://www.scrapshark.com/pubkey.txt
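
For anyone reading the thread later, here is a minimal sketch of why the U functions and the byte-oriented ones agree on ASCII strings but differ on text such as Russian. It is written in plain standard C++ rather than Qt, and the helper names utf8_length and utf8_mid are hypothetical; they are not BASIC-256 source, which per the thread converts through QString instead. The sketch assumes its input is valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration only (not BASIC-256 code).
// Assumption: every input string is valid UTF-8.

// True for UTF-8 continuation bytes, which have the bit pattern 10xxxxxx.
static bool is_continuation(unsigned char byte) {
    return (byte & 0xC0) == 0x80;
}

// Number of code points: count every byte that starts a character.
// For pure ASCII this equals the byte length, as LENGTH would report.
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if (!is_continuation(byte)) ++count;
    }
    return count;
}

// Substring of `count` code points starting at the 1-based code point
// `start`, mirroring the argument order of mid(a$, n, 1) in the benchmark.
std::string utf8_mid(const std::string& s, std::size_t start, std::size_t count) {
    if (start == 0) return std::string();
    std::size_t begin = std::string::npos;  // byte offset of the first wanted character
    std::size_t index = 0;                  // code points seen so far
    std::size_t i = 0;
    for (; i < s.size(); ++i) {
        if (is_continuation(static_cast<unsigned char>(s[i]))) continue;
        ++index;
        if (index == start) begin = i;
        if (index == start + count) break;  // i is now one past the wanted range
    }
    if (begin == std::string::npos) return std::string();
    return s.substr(begin, i - begin);
}
```

With these helpers, utf8_length("abc") and the byte length are both 3, while a two-letter Cyrillic string occupies 4 bytes but has a code point length of 2. That is the case where a byte-based MID would cut a character in half.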