From: james r. <ji...@re...> - 2010-05-02 19:19:39
Agreed, I have committed my changes and created a new Windows install
0.9.6a.

Jim

On Sun, May 2, 2010 at 2:50 PM, Ian Larsen <dr...@gm...> wrote:
> Even if the Utf8 functions were slower, I think simplicity would still
> be more of a factor than speed in this case. How often is performance
> dependent on manipulating strings in a tight loop?
>
> Better a performance hit than having to explain Unicode to a new
> programmer.
>
> -Ian
>
> On Sun, May 2, 2010 at 1:42 PM, james reneau <ji...@re...> wrote:
> > Wow,
> >
> > You are right Ian. I just ran the following benchmark and the U
> > functions were faster, 9 to 16 seconds. I ran it multiple times (and
> > reversed) and got the same results. I will change all of the existing
> > string functions to the new UNICODE safe code, remove my test/working
> > U functions, and will commit later.
> >
> > I am glad you were there to tell me to keep it simple.
> >
> > Jim
> >
> > a$ = "abcdef ghijklmno pqrstuvwxyzabcd efghijklmnopqr stuvwxyzabcd efghijklmnopq rs tuvwxyza bcdefghijklm nopqrstuvwxyza bcdefghijklmno pqrstuvwxyzabcde fghijklmnopqrstuvwxy zabcdefghijklmnopq rstuvwxyzabcdefgh ijklmnopqrstuvwxyz abcdefghijklmnopqrstuvwx yzabcdefghijklmnopqrstuvwxyza bcdefghijklmnopq rstuvwxyzabcdefghi jklmnopqrstuvwxyz"
> >
> > start = hour * 60 * 60 + minute * 60 + second
> > for r = 1 to 1000
> >    for n = 1 to ulength(a$)
> >       c = useq(umid(a$,n,1))
> >       c$ = uchr(c)
> >       b$ = uleft(a$,n)
> >       b$ = uright(a$,n)
> >    next n
> > next r
> > finish = hour * 60 * 60 + minute * 60 + second
> > print finish - start
> >
> > start = hour * 60 * 60 + minute * 60 + second
> > for r = 1 to 1000
> >    for n = 1 to length(a$)
> >       c = asc(mid(a$,n,1))
> >       c$ = chr(c)
> >       b$ = left(a$,n)
> >       b$ = right(a$,n)
> >    next n
> > next r
> > finish = hour * 60 * 60 + minute * 60 + second
> > print finish - start
> >
> > On Sun, May 2, 2010 at 1:26 PM, james reneau <ji...@re...> wrote:
> >> Ian,
> >>
> >> The U functions do a whole bunch of additional conversion from the
> >> char* in utf8 to qstring and then back again to char* utf8. For those
> >> of us in the ASCII/English world I thought it would be slower. Haven't
> >> done a benchmark. Let me do one before I push.
> >>
> >> Jim
> >>
> >> On Sun, May 2, 2010 at 12:51 PM, Ian Larsen <dr...@gm...> wrote:
> >>> For the sake of simplicity, why don't you just have your U* functions
> >>> just replace the default? For ASCII strings they should do the same
> >>> thing as the regular ones.
> >>>
> >>> -Ian
> >>>
> >>> On Sun, May 2, 2010 at 12:45 PM, james reneau <ji...@re...> wrote:
> >>> > Guys,
> >>> >
> >>> > I have gotten the save and load to work with UTF8 and I am adding
> >>> > string functions to handle unicode (ULENGTH, USEQ, UCHR, UMID,
> >>> > ULEFT, URIGHT, UINSTR) and should have them committed later today.
> >>> >
> >>> > Jim
> >>> >
> >>> > On Sat, May 1, 2010 at 5:39 PM, Ian Larsen <dr...@gm...> wrote:
> >>> >> Here's a screenshot of the result.
> >>> >>
> >>> >> On Sat, May 1, 2010 at 5:34 PM, Ian Larsen <dr...@gm...> wrote:
> >>> >> > It's even simpler than that. You just have to change most of the
> >>> >> > QString::toAscii calls to QString::toUtf8.
> >>> >> >
> >>> >> > My changes are committed. I've tried to test everything I could
> >>> >> > but more extensive testing would ensure I've got everything. If
> >>> >> > you see question marks instead of extended characters anywhere,
> >>> >> > please let me know.
> >>> >> >
> >>> >> > -Ian
> >>> >> >
> >>> >> > On Sat, May 1, 2010 at 4:57 PM, james reneau <ji...@re...> wrote:
> >>> >> >> Ian,
> >>> >> >>
> >>> >> >> That was my thought, too. I was going to email you to see if we
> >>> >> >> could change all the char* stuff in the stack and interpreter to
> >>> >> >> QStrings?
> >>> >> >>
> >>> >> >> Looking forward to your commit.
> >>> >> >>
> >>> >> >> Jim
> >>> >> >>
> >>> >> >> On Fri, Apr 30, 2010 at 9:59 PM, Ian Larsen <dr...@gm...> wrote:
> >>> >> >>> All,
> >>> >> >>>
> >>> >> >>> I was wrong about Flex; it handles Utf8 just fine. The problem
> >>> >> >>> was with the way QStrings were being converted. I have a working
> >>> >> >>> version that I'm going to test some more and commit tomorrow.
> >>> >> >>>
> >>> >> >>> -Ian
> >>> >> >>>
> >>> >> >>> On Fri, Apr 30, 2010 at 5:53 PM, Ian Larsen <dr...@gm...> wrote:
> >>> >> >>> > All,
> >>> >> >>> >
> >>> >> >>> > I believe the reason you're seeing the question marks in the
> >>> >> >>> > output is because GNU Flex and Bison, which the basic256
> >>> >> >>> > parser is written in, don't support Unicode at all.
> >>> >> >>> >
> >>> >> >>> > There are no simple fixes for this, unfortunately. Here are
> >>> >> >>> > some possibilities:
> >>> >> >>> >
> >>> >> >>> > 1) Encode ALL strings in a program's source code using base64
> >>> >> >>> > and then decode them prior to pushing them onto the operand
> >>> >> >>> > stack. This is an ugly hack, but right now would be the path
> >>> >> >>> > of least resistance.
> >>> >> >>> > 2) Find a drop-in replacement for Flex and Bison that supports
> >>> >> >>> > Unicode
> >>> >> >>> > 3) Write a custom parser that supports Unicode. This would be
> >>> >> >>> > a *lot* of work, but would be a lot of fun for someone
> >>> >> >>> > interested in learning compiler design.
> >>> >> >>> >
> >>> >> >>> > If anyone has any other ideas, please let me know.
> >>> >> >>> >
> >>> >> >>> > -Ian
> >>> >> >>> >
> >>> >> >>> > On Fri, Apr 30, 2010 at 11:06 AM, <web...@bi...> wrote:
> >>> >> >>> >> Ian,
> >>> >> >>> >>
> >>> >> >>> >> I am very glad that you have returned to the development of
> >>> >> >>> >> BASIC256! I would just tell you about a serious problem that
> >>> >> >>> >> exists for users who use the Russian language. Attached -
> >>> >> >>> >> screenshot.
> >>> >> >>> >>
> >>> >> >>> >> I made a patch for version 0.9.5 which was published 12/2009
> >>> >> >>> >> for the distribution of ALT Linux. Of course, this patch is
> >>> >> >>> >> not urgent, since you have done a lot of changes. Can I ask
> >>> >> >>> >> you to make the necessary changes (because I have little
> >>> >> >>> >> experience), or is provision for Russian-speaking users only
> >>> >> >>> >> my problem? :-)
> >>> >> >>> >>
> >>> >> >>> >>> On this list about two weeks ago we got a French translation
> >>> >> >>> >>> if anyone would like to add that in. If not, I'll get around
> >>> >> >>> >>> to it eventually.
> >>> >> >>> >>
> >>> >> >>> >> I have a little more experience, so it's better if you did.
> >>> >> >>> >>
> >>> >> >>> >> --
> >>> >> >>> >> Blessing,
> >>> >> >>> >> Sergei Irupin
> >>> >> >>> >> http://rnd-lug.blogspot.com/
> >>> >> >>> >
> >>> >> >>> > --
> >>> >> >>> > My PGP Public Key:
> >>> >> >>> > http://www.scrapshark.com/pubkey.txt
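
For readers following the thread, the two effects discussed above (question marks from a lossy ASCII conversion, and the need for character-aware length functions once strings are UTF-8) can be illustrated with a minimal sketch. This is Python, not BASIC-256 or Qt code; the helper names to_ascii_lossy and utf8_round_trip are hypothetical, and Python's errors="replace" handler is only a stand-in for the substitution behaviour described in the thread.

```python
def to_ascii_lossy(text: str) -> bytes:
    """Encode to ASCII, substituting '?' for every non-ASCII character.

    Assumption: this mimics the kind of lossy conversion the thread
    blames for the question marks; it is not Qt's implementation.
    """
    return text.encode("ascii", errors="replace")


def utf8_round_trip(text: str) -> str:
    """Encode to UTF-8 bytes and decode back; lossless for valid text."""
    return text.encode("utf-8").decode("utf-8")


sample = "Привет"  # six Cyrillic letters

print(to_ascii_lossy(sample))              # b'??????'
print(utf8_round_trip(sample) == sample)   # True
print(len(sample))                         # 6 (characters)
print(len(sample.encode("utf-8")))         # 12 (bytes, two per letter)
```

The last two lines show why a byte-oriented LENGTH or MID would miscount or split Cyrillic text stored as UTF-8, which is the gap the Unicode-safe string functions are meant to close.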