From: Vincent R. <vin...@fr...> - 2025-01-08 02:50:01
On 07/01/2025 at 19:00, Thorsten Otto via Freemint-discuss wrote:
> That was a great understatement. I recently took a closer look, and realized
> that you managed to enter already 1700!! symbols. Even with the 1.04 source
> available, that is simply awesome. You even managed to find some aes
> variables although the functions cannot correctly be disassembled by ghidra
> (because of the LineF opcodes). How long did it take to get that far?

Thanks 😄

Actually, I spent a week hunting that damn GEMDOS bug. So I learned Ghidra
and added the relevant labels and function signatures. Then, as that
archeology was highly addictive, I continued adding labels for most TOS
entry points. I started with the BIOS/XBIOS/BDOS function tables, as they
could easily be identified. Same for VDI and Line-A, but unfortunately not
Line-F.

Then, when finding a function, I looked at the function calls inside it, and
by looking at the TOS 1.04 source I was able to find those function names.
And so on. Same for the global variables.

For AES/Desktop, it was tricky because of the ugly Line-F calls. I only
found some functions. I started with the easy ones such as cli/sti. Then the
dos_* wrappers.

A Ghidra trick that helped me later: when pressing 'D', Ghidra stops at the
first 0xf opcode. But by looking 2 bytes further, we can see whether another
Line-F call follows. If not, we can press 'D' again and disassemble a few
more lines. I did that for a few functions like sh_main, gemstart, etc.

But the best method to find functions is to decode the Line-F opcode. For
example, take the opcode 0xf124. It is an even value, so it's a function
call. Keep only the last 3 hex digits, which gives 124. Then in Ghidra, type
'G' (go to), then enter the expression "lineftab+124". This reveals the
address of the called routine. Unfortunately, not its name. See more at the
end of this message (*).

In each case, I entered the names manually with 'L'. Nothing automatic.
I did that kind of stuff for 2 more weeks.
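
The decoding rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the way Ghidra or TOS does it. The function names are hypothetical, and resolve_target assumes that each table entry is a 32-bit big-endian absolute address, which the message does not state explicitly.

```python
import struct


def linef_call_offset(opcode):
    """Return the 12-bit table offset of a Line-F call, or None for odd opcodes."""
    if not 0 <= opcode <= 0xFFFF:
        raise ValueError("opcode must be a 16-bit word")
    if (opcode & 0xF000) != 0xF000:
        raise ValueError("not a Line-F opcode: %#06x" % opcode)
    if opcode & 1:
        # Odd value: not a function call according to the rule above.
        return None
    # Keep only the last 3 hex digits.
    return opcode & 0x0FFF


def goto_expression(opcode, table_label="lineftab"):
    """Build the text to type in Ghidra's 'G' (go to) dialog."""
    offset = linef_call_offset(opcode)
    if offset is None:
        return None
    return "%s+%x" % (table_label, offset)


def next_word_is_linef(code, pos):
    """True if the 16-bit big-endian word 2 bytes after pos is a Line-F opcode."""
    (word,) = struct.unpack_from(">H", code, pos + 2)
    return (word & 0xF000) == 0xF000


def resolve_target(table, opcode):
    """Read the routine address stored in the table for this opcode.

    ASSUMPTION: each table entry is a 32-bit big-endian absolute address.
    """
    offset = linef_call_offset(opcode)
    if offset is None:
        return None
    (address,) = struct.unpack_from(">I", table, offset)
    return address


print(goto_expression(0xF124))  # prints: lineftab+124
```

next_word_is_linef mirrors the 'D' trick: pos is where the disassembly stopped, and the check is only a heuristic on the following word.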
Then, as it wasn't possible to easily go further, I stopped.

For the AES variables, I don't remember well. But I guess I found some list
of variables in your TOS 1.04 sources. A key tool was Ctrl+Shift+F to find
the references to a variable. Seeing how it was used and initialized gave a
clue. I especially did that for VDI.

> Anyway, i've spent some time trying to identify the aes functions. About
> halfway through, (most of aes is done, but most of desktop is still
> missing). I've pushed my current work to a new branch of the sources:
> https://github.com/th-otto/tos1x/tree/TOS_100

Fine. If, for your work, you add some stuff to the Ghidra disassembly,
please consider contributing your findings. I can give you write access to
the DisasTOS repositories, no problem with that. The only issue is the
inconvenient way to share Ghidra projects. But I don't plan to work actively
on it in the near future. So if you want a free hand to add more labels to
the disassembly, then go ahead. On the other hand, if you have your new
labels in a flat text file, we should find a way to import them into Ghidra.
Certainly easy using a script.

> I've also changed already some of the sources where i saw differences, but
> without being able to verify them yet. There are also extracted resource
> files. Handling of the resource file is a bit strange: the format dialog is
> in a separate resource, and there are 2 additional blocks of data for which
> i have no idea yet what they are good for (you already noticed that too, given
> that you already assigned labels to them). The routine that copies them to
> ram (ram_rom) also does some strange juggling with the aes global array.

Indeed, it is different from TOS 1.04. I saw the separate resource files,
DESKTOP.INF, and other data that I didn't understand.
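
On the idea of importing labels from a flat text file, mentioned above: here is a minimal sketch of the parsing side in Python. The file format (one "ADDRESS NAME" pair per line, hex addresses, '#' comments) is purely an assumption, since no format has been agreed on, and the addresses and names in SAMPLE are made up for illustration.

```python
def parse_labels(text):
    """Parse 'ADDRESS NAME' lines into a list of (address, name) tuples.

    ASSUMPTION: hypothetical format, hex address then label name,
    with '#' starting a comment and blank lines ignored.
    """
    labels = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError("line %d: expected 'ADDRESS NAME', got %r" % (lineno, raw))
        addr_text, name = parts
        labels.append((int(addr_text, 16), name))
    return labels


# Hypothetical sample data, not real TOS addresses.
SAMPLE = """
# address  name
00fc1000   example_first
00fc2000   example_second   # trailing comment
"""

for address, name in parse_labels(SAMPLE):
    print("%08x %s" % (address, name))
```

Inside a Ghidra Python script, each pair could then presumably be applied with the flat API, along the lines of createLabel(toAddr(address), name, True). That part is not tested here.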

Some other information: I didn't know, but someone else already wrote some
interesting Ghidra scripts for Atari binaries 😃 I haven't tried them, but
they are certainly worth a look:
https://github.com/czietz/ghidraScripts_for_Atari/

(*) This week I also looked at Ghidra scripts, and after some initial
efforts to understand the object model, it seems to be rather easy. At least
for basic stuff.

The key point is that I found it was possible:
1) To replace a Line-F instruction with "dw" (a.k.a. dc.w) by pressing 'T'
   and assigning the type "word".
2) To add a reference from that pseudo-opcode to the actual Line-F function
   by pressing 'R' and entering the target address.

There are 3 benefits:
- Double-click on the opcode reference to jump to the function.
- On a Line-F function, use Ctrl+Shift+F to find references.
- Ability to rename the function by simply using 'L' on any reference.

Yes, I'm really speaking about those obfuscated Line-F calls! This doesn't
fix the C decompilation. But at least, this eases browsing from the listing
window, finding cross-references, and renaming functions.

Then I went further and I wrote a script for that:
https://github.com/disastos/tos100fr/blob/main/ghidra_scripts/AddLineFReference.java

I will write more complete documentation later. But in a few words:
- Get the script with "git pull -r" as usual.
- In Ghidra, click on Window > Bundle Manager.
- Click on the + icon on the right of the toolbar, and add the
  ghidra_scripts directory.
- Click on Window > Script Manager.
- On the left, go to the Atari section.
- On the AddLineFReference.java line, on the left, check the "In Tool"
  checkbox to enable the script. I assigned it to the '$' key by default.

Then to use the script:
- Go to the Ghidra listing (disassembly) window.
- Type 'G' then gem_main, for example, to go to that function.
- A few lines down, put the cursor on the "?? F7h" line.
- And simply press '$'.
The first time, the script is compiled. After that, it's immediate.

Result: You will see the 2 lines transform into "dw FUN_00fd92fc", for
example. You can double-click on that FUN_00fd92fc to go to its definition,
press Alt+Left to go back, and even rename it with 'L' if you managed to
determine its real name.

This way, it's quite easy to disassemble AES/Desktop:
- 'D' for normal disassembly.
- '$' to disassemble a Line-F call.
- If needed, 'C' to revert disassembly to undefined state.

NB: I pushed the script, but as I didn't do significant work on the
disassembly itself, I'm not going to push a new version of the GAR file.

--
Vincent Rivière