Menu

POKE is almost as good as an lvalue MID$,

Programs
2019-05-31
2024-01-23
  • Yet Another Troll

    but my data corruption paranoia seems to have slowed this palindrome finder down at least as much as replacing MID$ with PEEK and POKE has speeded it up. At least it no longer uses four bytes per character with tens of millions of characters like the version using a Corpus%() array instead of Corpus$ did.

    This uses a naive O(N * N) palindrome matching algorithm instead of Manacher's or one of the parallelizable variants, but it's much simpler to analyze. There's a SPAWN command I'm looking at, but how can SPAWNed threads communicate results back without stepping all over each other?

    Be on the lookout for lingering debugging cruft,

    Program Palindromes
     Input "How long? ", N%
    
     Start = Timer
    
     Alpha$ = "GATTACA"
     LenA% = Len(Alpha$)
    
     ' Belt, suspenders, drones, crane, cargo helicopter, orbital skyhook
     Dim Alpha%(256)
     Clr Alpha%()
     For i% = 1 To Len(Alpha$)
      ' Parallelizable, but would it be worth it?
      Alpha%(Asc(Mid$(Alpha$, i%, 1))) = True
     Next i%
    
     Corpus$ = Space$(N% + 2) ! Make room for Sentinels at front and end
    
     Print "Allocated "; Len(Corpus$); " in "; Timer - Start
    
     Clr i%, j%, k%
    
     ' Randomize
    
     ' Watch out on 64-bit
     VPA% = VarPtr(Alpha$)
     VPC% = VarPtr(Corpus$)
     Poke VPC%, Asc("{") ! Insist on genuine Trask Industries Sentinels
     Poke VPC% + N% + 1, Asc("}")
     For i% = 1 To N%
      ' Parallelizable, I guess
      j% = Random(LenA%)
      ' k% = Asc(Mid$(Alpha$, j% + 1, 1))
      k% = Peek(VPA% + j%)
      ' Caution, this sanity check adds significant overhead
      If Not Alpha%(k%) ! InStr(Alpha$, Chr$(k%)) = 0 Then
       Print "Invalid character at 0x"; Hex$(VPA%); " + "; j%, k%
       Stop
      EndIf
      Poke VPC% + i%, k%
     Next i%
    
     Print "Corpus$ initialized in "; Timer - Start
    
     If N% <= 600 Then
      Print "'"; Corpus$; "'"
     EndIf
    
     Clr B0rk3d%
     If Left$(Corpus$, 1) <> "{" Then
      Print "Left sentinel missing!"
      B0rk3d% = True
     EndIf
     If Right$(Corpus$, 1) <> "}" Then
      Print "Right sentinel missing!"
      B0rk3d% = True
     EndIf
     If B0rk3d% Then
      Print "Where are my dragons?!"
      Stop
     EndIf
     ' Corpus$ = Corpus$ + "}"
     If Glob(Corpus$, "{*[!" + Alpha$ + "]*}") Then
      Print "Invalid characters!"
      Stop
     EndIf
    
     Print "Sanity checked in "; Timer - Start
    
     Clr i%, j%, k%, l%, Lo%, Hi%
     ' Hi% = 1 ! Skip uninteresting one or two character 'palindromes'
     VPC% = VarPtr(Corpus$)
     For j% = N% DownTo 1 ! 1 To N% ! Either way works fine
      ' Consider divide and conquer recursive thread launcher
      Clr i%, k%, l%
      ' Caution, this sanity check adds significant overhead
      k% = Peek(VPC% + j%)
      If Not Alpha%(k%) ! InStr(Alpha$, Chr$(k%)) = 0 Then
       Print "Invalid character at 0x"; Hex$(VPC%); " + "; j%, k%
       Stop
      EndIf
      For l% = 0 To 1
       ' I can haz free thread?
       ' Naive check for palindrome, will blow up on certain long patterns
       ' When l% = 0, checks for odd length palindrome centered at j%
       ' When l% = 1, checks for even length palindrome centered at j%-1,j%
       i% = j%
       k% = j% - l%
       ' Caution, loop depends upon the sentinels to stop
       While Peek(VPC% + i% - 1) = Peek(VPC% + k% + 1)
        Dec i%
        Inc k%
       Wend
       ' Synchronize Swatches here
       If 0 And Not ((i% < j%) And (j% < k%)) Then
        ' Ignore one or two character 'palindromes'
        ! No-op
       Else If (Hi% - Lo%) <= (k% - i%) Then
        ' This would be the critical section
        Lo% = i%
        Hi% = k%
        ' Offsets for PEEK are zero based but strings are one based.
        ' Palindrome display includes surrounding characters, possibly
        ' even a sentinel. Delete the '- 1' and '+ 2' to elide them.
        Print i%, k%, "'"; Mid$(Corpus$, i% + 1 - 1, k% - i% + 1 + 2); "'"
       EndIf
      Next l%
     Next j%
    
     Print "Done in "; Timer - Start
     Print "OK"
    End
    
     

    Last edit: Yet Another Troll 2019-05-31
  • Yet Another Troll

    A first try at cloning MS-style lvalue MID. I prioritize safety over speed or replicating the overlapping range behavior. Hopefully xbasic strings aren't copy on write reference counted magic memory, or if they are, VARPTR triggers a string copy if the refcount is greater than one.

    ' Mid$(LV$, LVFirst%, LVLength%) = Mid$(RV$, RVFirst%, RVLength%)
    Procedure MidMid(Var LV$, LVFirst%, LVLength%, RV$, RVFirst%, RVLength%)
     Local VPLV%, VPRV%, LenLV%, LenRV%, LVLast%, RVLast%, i%, j%
     LenLV% = Len(LV$)
     LenRV% = Len(RV$)
     i% = LVFirst%
     j% = RVFirst%
     If i% < 1 Then
      i% = 1
     EndIf
     If j% < 1 Then
      j% = 1
     EndIf
     LVLast% = i% + LVLength% - 1
     RVLast% = j% + RVLength% - 1
     If LVLast% > LenLV% Then
      LVLast% = LenLV%
     EndIf
     If RVLast% > LenRV% Then
      RVLast% = LenRV%
     EndIf
     Print i%, LVLast%, j%, RVLast%
     VPLV% = VarPtr(LV$)
     VPRV% = VarPtr(RV$)
     While (i% <= LVLast%) And (j% <= RVLast%)
      Poke VPLV% + i% - 1, Peek(VPRV% + j% - 1) ! And 0xFF
      Inc i%
      Inc j%
     Wend
    Return
    
     

    Last edit: Yet Another Troll 2019-06-06
  • Markus Hoffmann

    Markus Hoffmann - 2019-06-06

    VARPTR() just returns the current address of a string. It does not trigger any other action. The strings adress stays the same, as long as it will not have been assigned any new value.

    In case a string is passed by reference, the receiving variable gets the same memory pointer than the calling varible (same VARPTR). Insteadof passing a string by reference, You could also have passd a pointer variable (adr%=VARPTR(s$)) by value, and then do the PEEK/POKE. This way, the original variable (s$) cannot change inside the procedure, and the adress is garanteed to stay the same.

    More interesting would be a test, if the content of a variable can change if changed inside a procedure where it has been passed to by reference and where it was assigned a new value.

     

    Last edit: Markus Hoffmann 2019-06-06
    • Yet Another Troll

      I was mainly afraid of an optimization for large strings, seen in languages like Java and C# and some C++ string classes, causing an issue like this,

      A$ = Space$(128 * 4096)
      A% = VarPtr(A$)
      B$ = A$
      Poke A%, Asc("A")
      Print Left$(A$, 1), Left$(B$, 1)

      Both A$ and B$ may see the change, depending upon fiddly language implementation details.

       
    • Yet Another Troll

      Is this kind of what you mean? It appears to work as intended, at least in Windows and Android

      > load "G:\My Drive\Projects\bas\new.bas"
      > list
      #!/lbin/ybasic
      
      Program PokeTheRef
      
       X$ = "Original"
       @PassRef(X$)
       Print X$
      
       Print
       Print "OK"
      End
      
      Procedure PassRef(Var S$)
       Local s%, i%
      
       Print S$
      
       S$ = "New~Data"
       Print S$
      
       s% = VarPtr(S$)
       For i% = 1 To Len(S$)
        Poke s% + i% - 1, Peek(s% + i% - 1) XOr 32
       Next i%
      
       Print S$
      Return
      > run
      Original
      New~Data
      nEW^dATA
      nEW^dATA
      
      OK
      > load "G:\My Drive\Projects\bas\new.b"
      > run
      Original
      New~Data
      nEW^dATA
      nEW^dATA
      
      OK
      > ver
      X11-BASIC Version: 1.27 Jul  8 2019 12:44:26
      >
      
       
  • Markus Hoffmann

    Markus Hoffmann - 2019-06-06

    By default, the assignment operator ("=") makes a copy of the content of the variable. For strings this means, the memory area is duplicated. If you do not want this, you may use the ABSOLUTE command. This can link two Variables together so that they have the same content (and the same VARPTR). I have never used this feature. It might however be usefull when using the SPAWN command, or EVERY and AFTER. ABSOLUTE is more useful when accessing a (foreign) memory location like a variable.

    You could p.ex. allocate a shared memory segment and put a variable into it by using the ABSOLUTE command. This way it can be access by another process.

     

    Last edit: Markus Hoffmann 2019-06-06
    • Yet Another Troll

      Could be very useful for bottom-up approximate string matching dynamic programming algorithms like Smith-Waterman DNA alignment that build up many copies of large strings very quickly, but ABSOLUTE looks very dangerous. I went with a separate string table array that's resized as needed, with the ginormous matrix of strings converted to a matrix of integers pointing into the string array. It cut down the memory use by nearly two orders of magnitude! It's a good thing I remembered that trick from sixth grade and fiddling around with old 8-bit BASIC.

      https://en.m.wikipedia.org/wiki/Smith%E2%80%93Waterman_algorithm

       

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.