On 5/3/06, Ed L. Cashin <ecashin@coraid.com> wrote:
On Wed, May 03, 2006 at 05:36:53AM +0000, Jayesh Salvi wrote:
> Hello,
> We are using AOE devices for booting Xen virtual machines so that we can
> share the same physical device across different hosts using AOE.
> It used to work well until a few days ago. We did successful migrations. But lately
> booting a Xen domU from these AOE devices suddenly causes the system to
> crash.

Hi, Jayesh Salvi.  What has changed?

We are not sure. We tried undoing some changes that we had made (the Xen version), but that didn't help.

> The system freezes, without any trace in logs. We have confirmed that
> this is not a Xen problem, as domUs booted from non-AOE devices work fine.
> To reproduce the same condition without Xen, I tried to do repetitive I/O on the
> AOE mounted device, to see if it crashes in the same way. But I couldn't
> make it crash the same way.
> Could anyone suggest how we can debug this problem? Are there some tools (I
> figure tcpdump can't be used to see AOE traffic,

You can use tcpdump, despite its name, to capture non-TCP packets.  To
capture AoE on eth1, for example, you can do this:

  tcpdump -i eth1 ether proto 0x88a2

Great. I will use this and see if I find anything.

> is there any similar
> tool?), some logs?

Even when messages don't make it into the logs, you can often capture
messages using netconsole or a serial console.
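
As a rough sketch of the netconsole route: the module takes one
parameter string of the form
[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]. The helper
name netconsole_param and every port, address, and interface below are
hypothetical placeholders, not values from this thread.

```shell
#!/bin/sh
# Build a netconsole parameter string in the kernel's documented format:
#   [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# netconsole_param is a hypothetical helper; it only formats its arguments.
netconsole_param() {
    # $1 source port   $2 source IP   $3 outgoing network device
    # $4 target port   $5 target IP   $6 target MAC address
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Example use, as root on the machine that freezes (placeholder values):
#   modprobe netconsole \
#     netconsole="$(netconsole_param 6665 192.168.1.10 eth0 6666 192.168.1.20 00:11:22:33:44:55)"
#
# On the receiving machine, listen for the UDP messages, for example:
#   nc -u -l -p 6666
# (netcat flag syntax varies between variants; a syslog daemon that
# accepts remote UDP messages would also work.)
```

Kernel messages printed just before the freeze should then show up on
the receiving machine even if they never reach the local disk.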

> Some modifications in aoe module (debug mode etc.?)?

It's pretty easy to add calls to printk, since the Linux kernel
developers have made sure it can be called from almost all situations.

Thanks a lot Ed. I will try these things.

  Ed L Cashin <ecashin@coraid.com>

Everything you can imagine is real