The QNX demo floppy can be downloaded for free at
http://www.qnx.com/iat/download/network.html
I'm using current CVS on 9/3/2001. With the new PIT
model, the QNX demo disk is reporting segmentation
faults and filesystem failures during the boot process.
QNX Demodisk Loader v1.5
[then a percent done going from 0% to 100%]
starting initial
../disk-COMMON/bin/ramdisk.demo terminated (SIGSEGV) at
10A1:0000BFD3.
ization
[a few seconds pause]
Timed out waiting for '/'.
[Then it continues to the first install screen. Press
space (or just wait) and it moves on to the second
screen, asking what kind of keyboard you have. I chose
F1 for North America.]
Progress Indicator[
Could not open '/image/image1.z', No such file or
directory.
Couldn't start '/bin/Dev -n8'.
Spawn of (No such file or directory).
File not present
Couldn't start '/bin/Net -S -d3 -n401'.
Spawn of (No such file or directory).
File not present
[then comes the screen which says that you don't have a
NIC. Press space or wait]
Progress indicator [ ]
Couldn't start '/bin/emu387'.
Spawn of (No such file or directory).
File not present
Couldn't start '/bin/Socklet -p1 qnxdemo'.
Spawn of (No such file or directory).
File not present
Couldn't start '/bin/phlib_s11'.
The only log message from the PIT code is:
00078501111e[ ] Unknown behavior when latching
during 2-part read.
00078501111e[ ] This message will not be
repeated.
If you instead do
configure --disable-new-pit; make
and boot the QNX demo disk, it works. There are no
SIGSEGV or "Could not open" messages, it loads the OS,
goes into graphics mode, etc. If you revert to the CVS
version from 2001-08-14 it also works.
Strangest of all, if I compile the current CVS with the
new PIT and enable the debugger, IT ALSO WORKS! This
makes it hard to get an instruction trace and say
exactly where it fails.
I'll post more if I find any more clues.
Summary of what I've tried so far.
With current CVS:
  configure --enable-new-pit                      fails
  configure --disable-new-pit                     works
  configure --enable-new-pit --enable-debugger    works
  configure --disable-new-pit --enable-debugger   works
I tried them again with a CVS version from 8/14, with
the same results:
  configure --enable-new-pit                      fails
  configure --disable-new-pit                     works
  configure --enable-new-pit --enable-debugger    works
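For anyone who wants to repeat the test matrix, here is a minimal sketch that just enumerates the configure command lines in the same order as the summary. The helper name configure_commands is made up for illustration; only the flags themselves come from this report, and the script does not run anything.

```python
from itertools import product

# Flags taken from the report. The empty string means "debugger not enabled".
PIT_FLAGS = ["--enable-new-pit", "--disable-new-pit"]
DEBUGGER_FLAGS = ["", "--enable-debugger"]


def configure_commands():
    """Return the configure command lines, debugger-off builds first."""
    commands = []
    for debugger_flag, pit_flag in product(DEBUGGER_FLAGS, PIT_FLAGS):
        parts = ["configure", pit_flag]
        if debugger_flag:
            parts.append(debugger_flag)
        commands.append(" ".join(parts))
    return commands


if __name__ == "__main__":
    for command in configure_commands():
        print(command + "; make")
```

Each printed line would then be run in a clean tree before booting the demo disk.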
Logged In: YES
user_id=125806
Even stranger still, it works if ips=500000 but not if
ips=1000000. You might check to make sure you aren't
masking the ips problem when you enable the debugger. I
have some guesses as to what this might be, but they're just
guesses for now.
The big difference between ips=500000 and ips=1000000 is
that at ips=500000 we only call clock_all(2) instead of
clock_all(1) because of a problem with the timing model of
bochs. This could conceivably mask an interrupt caused by
a data/control word write, which could explain the problem.
Try running the old pit with TIMER_DELTA in devices.cc set
to 1 (warning: this will be hideously slow, but is helpful
as a test.)
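To make the guess above concrete, here is a toy sketch (not bochs code, and not how the PIT model is actually written) of a down-counter advanced one or two ticks per call. The function counts_visited is hypothetical. It only shows that stepping by 2 skips intermediate counter values, so anything that depends on an intermediate value could go unnoticed.

```python
def counts_visited(initial, step):
    """Counter values seen after each clock call when a down-counter
    is advanced `step` ticks per call (toy model, clamped at zero)."""
    visited = []
    count = initial
    while count > 0:
        count = max(count - step, 0)
        visited.append(count)
    return visited


if __name__ == "__main__":
    print(counts_visited(5, 1))  # one tick per call
    print(counts_visited(5, 2))  # two ticks per call
```

With a step of 1 every value from 4 down to 0 is visited. With a step of 2 the values 4 and 2 are never seen.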
Logged In: YES
user_id=125806
Okay, I think I sort of understand the problem, and it's a
QNX "feature," not a bochs problem.
It goes a little something like this:
With a low ips, certain interrupts can be masked off.
Specifically, those resulting from a control word write
might get masked. This is a bochs bug (fundamentally a
problem with the timing model), but is actually helping
here.
With ips=1000000 you get 1 instruction per timer tick, and
the timer is an accurate model. However, this doesn't
accurately represent any real x86 processor, which brings us
to:
With ips=4000000 you model the slowest x86 processor ever
built, the 4MHz 8088 (which has somehow mysteriously had
Pentium architecture changes added :) .) This allows you
4 instructions per timer tick, and fixes the current problem
by allowing a write to the control word followed by writes
of the initial count before the timer gets to register an
interrupt. (And thus QNX works.)
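The arithmetic behind this can be sketched as follows. It assumes the round figure used in this thread of one timer tick per microsecond (the real PIT input clock is a little faster, about 1.193 MHz), and it assumes each port write costs exactly one instruction. The names TIMER_HZ, instructions_per_tick and ticks_during are made up for illustration.

```python
from fractions import Fraction

# Assumption: the thread's round number, not the real PIT clock (~1.193 MHz).
TIMER_HZ = 1_000_000


def instructions_per_tick(ips):
    """Instructions executed per timer tick at the given ips setting."""
    return Fraction(ips, TIMER_HZ)


def ticks_during(n_instructions, ips):
    """Whole timer ticks that elapse while n_instructions run back to
    back, starting exactly on a tick boundary."""
    return (n_instructions * TIMER_HZ) // ips


if __name__ == "__main__":
    # Three writes: control word, then the two bytes of the initial count.
    for ips in (500_000, 1_000_000, 4_000_000):
        print(ips, instructions_per_tick(ips), ticks_during(3, ips))
```

Under these assumptions, three back-to-back writes span 3 whole ticks at ips=1000000 and 6 at ips=500000, but finish before a single tick has passed at ips=4000000.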
There may be some minor fixes I can do to make the 1MHz
behavior a little cleaner, but fundamentally the problem is
that there is no 1MHz x86, so there may be code that won't
run on an imaginary one.
There also may be a PI_C_ problem exposed here, but that's
not my area of expertise.
Logged In: YES
user_id=125806
This gets stranger by the minute.
I found a bug in the latching code, but that's not the
problem you're seeing here. At both 500KHz and 1MHz there
are 2 writes to the PIT, but at 4MHz there are 3 writes.
This makes no sense at all, and doesn't explain why 500KHz
is working but 1MHz isn't (so far as I can tell both should
be broken.) I sure wish I knew what was causing that
SIGSEGV.
Oh, and the masked interrupt problem I was talking about
exists only at ips values less than 500000.
I have a feeling this is a PIC problem (or a PIC model
problem) due to the
[PIC ] IRQ lowest command 0xc2
line, but I'm not sure.
Logged In: YES
user_id=185114
If I can find some combination of settings that causes the
problem with the debugger enabled, I can get instruction
traces at the point that the simulations diverge.
That assumes that the old PIT and new PIT actually put up
their interrupt at exactly the same time. If not, the
instruction traces diverge as soon as the first PIT
interrupt.
Logged In: YES
user_id=185114
Now that bug #465262 (timed events not consistent) is fixed,
I hope that it will be possible to debug this. Before,
there were some significant differences in how the
simulation time got updated when the debugger was on or off,
and even if tracing was on or off.
Logged In: YES
user_id=185114
Postpone until a later version.
Logged In: YES
user_id=93674
What's the status on this bug? Is it still present in 1.4?
Logged In: YES
user_id=125806
This bug still exists. I don't think it's a PIT problem,
though I could be mistaken. The problem is that bochs
naturally runs at unreasonably slow clock speeds in relation
to modern x86 processors, so there isn't a clear fix. I
believe the same problem exists on Linux 2.?0? series
kernels.
Logged In: YES
user_id=376477
This problem seems to be fixed in the current version of
Bochs. In older releases (2.0.2, 2.1.1) QNX crashed at low
ips values (e.g. 1000000), but now it works here, even at
low values.