OK, with a lot of help from BSzili we are almost done with the native ftp.module. For now it compiles for OS4, MOS and AROS (for OS3 you need to install the necessary network SDK, the one that provides proto/socket.h and so on).
There were a lot of changes in total to make it work with gcc/ng:
rewrote everything to SDI
made it ABI compatible
new macros for correctly calling 68k hook structures (now used in themes.module and ftp.module)
got rid of as255 completely
all the necessary bsdsocket.library changes
got rid of the old aligned attribute via the D_S macro
and a bunch of other stuff; everything is in Modules/ftp/ now.
As a first test, to be sure that everything is OK with the native ftp.module itself, we test on OS4 until it works properly. So far we can log in to an FTP site (let's say Aminet), browse files, and double-click on them, and they get downloaded and shown. We can also double-click on, say, an .lha archive: it gets downloaded too and opens in a new lister (if we use xadopus.module or those arc ARexx scripts).
Now to the problems we need to fix.
-1--
D&D of files from FTP listers to any local lister does not work. It doesn't crash, it just stays silent. On the serial line it prints this when I try to D&D something from an FTP lister to a local lister:
** trapped 'dropfrom'
check alive
check network
check network this site
LOG: --> NOOP
lister_xfer()
LOG: 200 NOOP command successful
-2--
If I use debug.kernel on OS4 with all those "munge" options and co, select a file on the FTP site and press the "copy" button on the top bar (or move, whatever), it crashes hard with a stack trace.
DAR shows 0xCCCCCCCC, which again may be the same problem we had before with filetype.module (0xCCCCCCCC means that we tried to free a Node a second time).
I assume D&D doesn't work for the same reason copy crashes, but I may be wrong.
As I said, I can only catch that crash with debug.kernel and munge enabled (that 0xCCCCCCCC only ever shows up on debug kernels with munge enabled). With the user kernel, pressing the copy/move/copyas/moveas buttons in the lister also does nothing (same as with D&D).
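To illustrate why a double free shows up as 0xCCCCCCCC only with munge enabled, here is a minimal sketch in plain C. It assumes, as described above, that the debug kernel fills freed memory with 0xCC bytes. `DemoNode` and the `demo_*` functions are made-up names for illustration, not the Exec `struct Node` or any real API.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for an Exec-style list node (not the real struct Node). */
struct DemoNode {
    struct DemoNode *succ;
    struct DemoNode *pred;
};

#define POISON_BYTE 0xCC /* assumed fill value, chosen to match the DAR above */

/* Returns 1 if both links carry the poison pattern, i.e. the node was already removed. */
int demo_is_poisoned(const struct DemoNode *n)
{
    struct DemoNode probe;
    memset(&probe, POISON_BYTE, sizeof probe);
    return memcmp(n, &probe, sizeof probe) == 0;
}

/* Unlink a node and poison it, roughly what a munging allocator does on free. */
void demo_remove(struct DemoNode *n)
{
    assert(n->pred != NULL && n->succ != NULL);
    n->pred->succ = n->succ;
    n->succ->pred = n->pred;
    memset(n, POISON_BYTE, sizeof *n);
}

/* Guarded removal: refuses a second removal instead of following 0xCCCCCCCC links. */
int demo_remove_once(struct DemoNode *n)
{
    if (demo_is_poisoned(n))
        return 0;
    demo_remove(n);
    return 1;
}
```

Without the poisoning (the user kernel case), a second unguarded `demo_remove()` would follow stale but still plausible pointers and silently corrupt the list, which would fit the "nothing happens, no crash" behaviour on the user kernel.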
Also, when I test on the "user" kernel it doesn't crash (the user kernel can't catch those double node frees); it prints this to serial when I press the toolbar "copy" button:
** trapped 'Copy'
check alive
check network
check network this site
LOG: --> NOOP
lister_xfer()
LOG: 200 NOOP command successful
For the "move" button:
** trapped 'Move'
check alive
check network
check network this site
LOG: --> NOOP
lister_xfer()
LOG: 200 NOOP command successful
For the "copy as" button:
** trapped 'CopyAs'
check alive
check network
check network this site
LOG: --> NOOP
lister_xfer()
LOG: 200 NOOP command successful
-3--
The third bug is different: just press "ftp" in the buttons; it brings up an FTP buttons window, where one of the buttons is "localhost". Pressing that spawns a new lister, and then 2 crashes come one after another.
The first crash is in dopus_requester_proc and seems graphics related (more exactly, something with rastports). Its DAR is 00000000, i.e. a NULL-pointer access.
The second crash is in dopus_ftp_lister. The DAR there points to 000064C0 (so not a NULL pointer), and it may just be a side effect of the first, NULL-pointer crash.
It may be an original bug, like not checking whether an FTP port is open at all, but it should be fixed for sure to make everything robust and clean.
It may also be something with the ABI (like those ASM-related functions).
To add: it doesn't have to be localhost. You can create any new entry in the FTP address book with a wrong hostname, and it will crash too.
I assume it just crashes when dopus tries to build a window with "simplerequest" for the error, and fails. The same as in some other parts where SimpleRequest is involved (like in filetype.module, which Xenic fixed some time ago). But I may be wrong; it needs the usual debugging and bug hunting.
Last edit: kas1e 2013-08-20
Now I have checked rev533, up to which BSzili did some cleanup in ftp.module as well. In the AOS4 version I CAN now download files from FTP listers to local listers, and most of the problems have just disappeared!
For example, there are no more crashes when we try to log in to localhost (now there is a normal window with "Cannot log in to localhost (could not connect)"); the same goes for all other "bad" hostnames.
There is also no crash when we copy files from an FTP lister to a local lister via the top bar buttons or via drag&drop, but! One problem still remains. It may be because of some of our previous function replacements, or because of some remaining STDARGS-based funcs, or because of that VA_END which was missing and is now added (maybe that was intended?), or anything else. Anyway, the problem is:
I can't copy 2 files at the same time now. I.e. I mark 2 files on Aminet, D&D them to ram:, and while the first one copies OK, for the other one it says there is no such file. But if I then copy that file on its own, it copies.
It seems that right after we download a file, something goes wrong with buffers (like a forgotten null termination somewhere). For example, if we do a hard reboot, run dopus5, go to aminet/dbase/, mark AA_30.lha and AA_30.readme there and D&D them, then ftp.module downloads the first file and brings up a window for the next one: "550 A_30.readme: No such file or directory" (try again/skip/abort). I.e. you can see that the first character of the second file name just gets eaten, as if there was no null termination in the buffers after the first one => fail.
In the log it says:
** trapped 'dropfrom'
check alive
check network
check network this site
LOG: --> NOOP
LOG: 200 NOOP command successful
LOG: --> PORT 192,168,1,6,4,34
LOG: 200 PORT command successful
LOG: --> RETR AA_30.lha
LOG: 150 Opening BINARY mode data connection for AA_30.lha (371418 bytes)
LOG: 226 Transfer complete
LOG: --> PORT 192,168,1,6,4,35
LOG: 200 PORT command successful
LOG: --> RETR A_30.readme
LOG: 550 A_30.readme: No such file or directory
** get src err
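For reading the PORT lines in the log above: per the FTP spec (RFC 959) the argument is the four IP octets followed by the port split into two bytes, so 4,34 is port 4 * 256 + 34 = 1058 and 4,35 is 1059. A small sketch in plain C; `parse_port_arg` is a hypothetical helper for illustration, not a function from ftp.module.

```c
#include <assert.h>
#include <stdio.h>

/* Parse "h1,h2,h3,h4,p1,p2" as sent after PORT (RFC 959).
   Returns the TCP port (p1 * 256 + p2), or -1 on malformed input.
   If ip is non-NULL it receives the four address octets. */
int parse_port_arg(const char *arg, unsigned char *ip)
{
    unsigned int v[6];
    char extra;
    int i;

    assert(arg != NULL);

    /* Exactly six numbers expected; a seventh conversion means trailing junk. */
    if (sscanf(arg, "%u,%u,%u,%u,%u,%u%c",
               &v[0], &v[1], &v[2], &v[3], &v[4], &v[5], &extra) != 6)
        return -1;

    /* Every field is a single byte on the wire. */
    for (i = 0; i < 6; i++)
        if (v[i] > 255)
            return -1;

    if (ip != NULL)
        for (i = 0; i < 4; i++)
            ip[i] = (unsigned char)v[i];

    return (int)(v[4] * 256 + v[5]);
}
```

So the two transfers in the log use consecutive local ports (1058, then 1059), which suggests the data connection setup itself is fine and the failure only starts at the RETR with the damaged name.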
After that happens, all sorts of weird things can happen when we try to download/rename/move files from an FTP lister to a local one. Names get badly mangled with all sorts of fancy characters, etc.
Also, as far as I can see, pressing the "Aminet" button in the FTP button bank does nothing; it only prints this to serial:
mod_connect()
module connect done (0)
while by default dopus5 has this on that button: Command FTPConnect aminet.net DIR.
It just does the same as if I press the FTPConnect button, where I can enter all my data, but without the window.
Last edit: kas1e 2013-09-07
If you remember I ended up reverting my function replacements. Also VA_END is necessary, it might not do anything on some platforms, but you can end up with memory leaks on others.
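As a reminder of what the VA_END point looks like in code, here is a minimal sketch using the standard C `va_start`/`va_end` pair (the VA_END in the sources is presumably a wrapper around the same thing, which is an assumption on my part). `demo_format` is a made-up helper, not code from ftp.module.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: printf-style formatting into a caller-supplied buffer. */
int demo_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && fmt != NULL);

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    /* The C standard requires va_end on every path after va_start.
       On many ABIs it expands to nothing, but that is not guaranteed. */
    va_end(ap);

    return n;
}
```

Leaving the `va_end` out usually still "works" on platforms where it is a no-op, which is why a missing one can go unnoticed for a long time.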
@BSzili
Right.. maybe it's something with those lsprintf/rawdofmt changes? It sure looks like some buffer is not null terminated..
I doubt a function which opens an error requester has much to do with file transfer. If anything, I might have forgotten to change a strncpy back to stccpy.
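The strncpy vs stccpy difference matters for exactly the kind of symptom discussed here: `strncpy` does not write a terminating NUL when the source fills the buffer, while SAS/C `stccpy` (as far as I know) always terminates. A sketch in plain C; `demo_stccpy` is a portable stand-in written for illustration, and its return convention (characters copied including the NUL) is an assumption, not checked against the SAS/C manual.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for SAS/C stccpy(): copy at most size-1 characters and
   always NUL-terminate. Returns the number of bytes written,
   including the terminator (assumed convention). */
size_t demo_stccpy(char *to, const char *from, size_t size)
{
    size_t i = 0;

    assert(to != NULL && from != NULL);

    if (size == 0)
        return 0;
    while (i < size - 1 && from[i] != '\0') {
        to[i] = from[i];
        i++;
    }
    to[i] = '\0';
    return i + 1;
}
```

For example, with an 8-byte buffer and the name AA_30.readme, `strncpy(buf, name, 8)` leaves the 8 bytes "AA_30.re" with no terminator, while `demo_stccpy(buf, name, 8)` leaves the properly terminated string "AA_30.r". Any later code that treats the `strncpy` result as a C string reads past the end of the buffer.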
IMHO no, as you reverted them in rev522, but maybe it's strcmpi which wasn't reverted? Will check them all tomorrow.
Why on earth should we revert back to using strcmpi?
Shouldn't we try anything just to get the bugs fixed? Do you have any other ideas? I don't, except maybe that sculd or whatever it was called. I can't look at the code properly until tomorrow, so even if the strcmpi idea sounds stupid, it's all I have for tomorrow: revert the changed funcs step by step and see where the differences start. But if you have any other ideas, that would be cool for sure. We can go the kprintf way again tomorrow, but those skipped first characters of the second buffer look like a real missing null termination somewhere (and it arose when we changed the funcs).
Last edit: kas1e 2013-09-09
I'm not sure how random guesswork is supposed to fix bugs. LSprintf is used in a single function for displaying an error requester, and strcmpi / stricmp is used to compare strings. How are these related to the null termination of any string? You are free to experiment, but don't expect me to agree with you.
@BSzili
Null termination is also just my random guesswork. It may not be null termination at all; it just looks like it. It could just as easily be one buffer overwriting another, or some overflow somewhere, or anything else, like some string compare followed by a wrong if/else somewhere, or some long/ulong or char/uchar differences, or some not-so-harmless warnings. It can be anything, and of course it's all random guesswork.
Do you have any other ideas compared with random guesswork?
edit: another random guess: maybe something related to the "fib" stuff, as it is used for filenames with all those FILENAMELEN + 1, so it's quite possible we missed changing one of them somewhere?
Last edit: kas1e 2013-09-09
I'd prefer to trace back to the root of the problem instead of trial and error. I'm not comfortable with the idea that computing is non-deterministic, and changing anything can solve the bug. If that makes me closed-minded, so be it, but as I said, you are free to experiment and prove me wrong. That will be one less bug to take care of.
Another test update:
If we select 2 files and press "copy as" on the top bar, then:
-- for the first file it asks "Enter new filename" and the requester field holds the full name, like this: http://kas1e.mikendezign.com/misc/dopus5/modules_bug/ftp/ftp_1.jpg
-- for the second file it also asks "Enter new filename", and the name in the requester field has its first character skipped: http://kas1e.mikendezign.com/misc/dopus5/modules_bug/ftp/ftp_2.jpg
-- if we now press "copy", we get a window which says there is no such file, with the first character skipped: http://kas1e.mikendezign.com/misc/dopus5/modules_bug/ftp/ftp_3.jpg
-- if we press "skip" instead of "copy", we get the "Enter new filename" window again, in which the file name is just fancy characters: http://kas1e.mikendezign.com/misc/dopus5/modules_bug/ftp/ftp_4.jpg
If we copy just 2 files with the plain "copy" button on the top bar, it copies the first file OK and for the second says the same "no such file" (with the first character skipped). And if we then press "Skip", I see some heavy garbage on my serial line, like this: http://kas1e.mikendezign.com/misc/dopus5/modules_bug/ftp/ftp_5.jpg
@BSzili
Please come on Jabber; I am here the whole day today, and it will surely be faster to add kprintfs and so on together. All I see now is that we end up in the FTPERR_XFER_SRCERR: case in ftp_recursive.c when we can't get the second file.
Last edit: kas1e 2013-09-09
I won't really be able to go on jabber before 17:00 GMT+1 anymore, because the semester just started, and I'm busy with my studies and office routines (yuck!).
Note that I was not having a go at you, but I'm literally swamped, and I have to get ftp.module working on AROS.
@all
Ftp.module is fully working now on OS4! The last bug was caused by the stptok() function, which wasn't close enough to the SAS/C one; we found the right one, and the bug is gone.
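The thread does not say what exactly was different in the replacement stptok(), so the following is only a guess at one failure mode that would produce the "eaten first character" symptom. It assumes the SAS/C function returns a pointer to the break character that stopped the scan. All `demo_*` names are made up for illustration.

```c
#include <assert.h>
#include <string.h>

typedef char *(*demo_stptok_fn)(const char *, char *, size_t, const char *);

/* Stand-in for stptok() under the assumed convention: copy characters from s
   into tok until a character from brk or the end of the string, always
   NUL-terminate, and return a pointer to the character that stopped the scan. */
char *demo_stptok(const char *s, char *tok, size_t toklen, const char *brk)
{
    size_t n = 0;

    assert(s != NULL && tok != NULL && brk != NULL);

    while (*s != '\0' && strchr(brk, *s) == NULL && n + 1 < toklen)
        tok[n++] = *s++;
    if (toklen > 0)
        tok[n] = '\0';
    return (char *)s;
}

/* Hypothetical near-miss replacement: same copy, but it also steps over
   the break character before returning. */
char *demo_stptok_skipping(const char *s, char *tok, size_t toklen, const char *brk)
{
    char *p = demo_stptok(s, tok, toklen, brk);
    return (*p != '\0') ? p + 1 : p;
}

/* Caller written for the first convention: it skips the delimiter itself,
   then reads the second token of a space-separated list into out. */
void demo_second_token(demo_stptok_fn fn, const char *list, char *out, size_t outlen)
{
    char first[64];
    const char *p = fn(list, first, sizeof first, " ");

    if (*p != '\0')
        p++;
    fn(p, out, outlen, " ");
}
```

With the list "AA_30.lha AA_30.readme", the first variant gives "AA_30.readme" as the second token, while the skipping variant gives "A_30.readme", the same damaged name as in the RETR line of the log. Again, this shows only that a small difference in the return pointer is enough to cause it, not that this was the actual difference.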
@all
Found one error in ftp.module (tested the OS4 version). To reproduce:
Directory Opus Request
Error Saving File !
DOS error 205: object not found
retry/cancel
Looks like some code doesn't close the file properly when saving, or something?
@BSzili
Built today's svn (rev733), and ftp.module on OS4 doesn't show any content anymore. I.e. I go to the address book, double-click on any host, it connects, says "reading files" and then shows an empty lister without files. In the log I have:
Also, if I just press the "up" button in this state, it shows me this:
And then it crashes with such a stack trace:
with DAR 0xCCCCCCCC, which means something is wrong again with freeing a Node a second time.
It all seems to be about those PORT changes, which IMHO came in rev759. I can't recheck that rev right now, but will do tomorrow if needed.
Last edit: kas1e 2013-10-09
Committed the fix for the PORT command, the crash is unrelated.
Yep, PORT is fixed, the crash is still here. I assume that in your tests on AROS you just can't catch those "free node" bugs. I can catch them only on the debug kernel with the "munge" option. I.e. the bug is there for sure, I just don't know how you can reproduce it.. Maybe AROS hosted will segfault on it?
It probably would, but I can't get the tunnelled network to work in AROS hosted. One user reported a crash which could be the result of this:
http://aros-exec.org/modules/newbb/viewtopic.php?topic_id=8470&forum=24&post_id=84266#forumpost84266
Right.. In the meantime I will just make a ticket so we have everything in one place.
EDIT: done, ticket #19
Last edit: kas1e 2013-10-11