From: Derek G. <fri...@gm...> - 2013-11-06 22:03:03
When you write and read an XDA file in parallel, the node and element numbering is changed.

This means that nodesets can't work through a "restart" in parallel... and even worse, if you are trying to "append" to the Exodus file it doesn't work (because that Exodus file already has a different numbering, so the solution looks "scrambled").

Further, if you have "holes" in your numbering (i.e. your nodes start at 500 and go to 1000) those numbers aren't preserved at all.

We must write out the element and node numbers and preserve them through a write to XDA and a read from XDA.

Agree, disagree?

Derek
From: Derek G. <fri...@gm...> - 2013-11-06 22:17:02
As a follow-on to this, I think the XDA format is not currently laid out properly. It should go:

Meta Data
Nodes
Elements
BCs

Where the "Nodes" section has:

node_id node_unique_id coord_x coord_y coord_z

That way you don't have to create "surrogate" nodes during read - you just create the correct ones (with the correct ids) and then use them when creating the elements.

But, as usual - I'm sure there's something I'm not thinking of....

Derek
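[Editor's note: the proposed "Nodes" layout above can be sketched as a toy reader. The record layout follows Derek's field order; the struct and function names are illustrative, not libMesh API.]

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical node record matching the proposed layout.
struct NodeRecord
{
  unsigned long unique_id;
  double x, y, z;
};

// Parse one "Nodes" line of the form:
//   node_id node_unique_id coord_x coord_y coord_z
// and store it under its *original* id, so elements can refer to
// nodes by id directly -- no surrogate nodes needed during read.
bool parse_node_line (const std::string & line,
                      std::map<unsigned long, NodeRecord> & nodes)
{
  std::istringstream in(line);
  unsigned long id, uid;
  double x, y, z;
  if (!(in >> id >> uid >> x >> y >> z))
    return false;
  nodes[id] = NodeRecord{uid, x, y, z};
  return true;
}
```

Because the map is keyed on the id stored in the file, sparse numberings (e.g. ids 500-1000) survive the round trip unchanged.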
From: Derek G. <fri...@gm...> - 2013-11-06 22:18:54
Let me be a bit more clear: after writing an XDA file and reading it back in, I want _exactly_ the same Mesh structure that I had to start with.... same numbering, same everything...

Derek
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2013-11-06 23:01:23
On Nov 6, 2013, at 4:18 PM, Derek Gaston <fri...@gm...> wrote:

> Let me be a bit more clear:
>
> After writing an XDA file and reading it back in - I want _exactly_ the same Mesh structure that I had to start with.... same numbering, same everything…

That should be possible… The parallel format is loosely thought out and open to extension. From a serial file the global ids are inferred; for the parallel case I don't see a reason we couldn't include a unique global id too.

> It should go:
>
> Meta Data
> Nodes
> Elements
> BCs

The idea here is we can optionally support a partition file, which defines element ownership. This could allow the elements to be shipped off first in the case of a serial read, or read only on the processors that need them. The important subset of nodes can then be determined.

Reading the nodes first would require caching all of them until you know which ones you can discard. Or closing the buffer and doing some seek business.

-Ben
From: Derek G. <fri...@gm...> - 2013-11-06 22:41:47
For ParallelMesh I was just thinking that each processor would write its own file... so that you could perfectly recreate the exact Mesh data structure on read (not to mention being more amenable to parallel filesystems like Panasas).

Derek
From: Derek G. <fri...@gm...> - 2013-11-06 23:34:58
I don't even understand how the current stuff works at all.

Since the node and element numbers change after running the mesh through an XDA read and write - how do the dof ids match up with what was written out with EquationSystems::write()??

Derek
From: Derek G. <fri...@gm...> - 2013-11-06 23:59:57
I think I see - "globally consistent" ids are assigned, then both the mesh and solution vectors are written out together... then those ids are "swapped" back.

Unfortunately, that "swap" is WRONG... and can in fact change the ids for your nodes and elements in the middle of your simulation just by writing an XDA file! If I have nodes that are numbered 500-1000 and I write an XDA file... after the mesh.write() has completed my nodes will now be numbered 0-500! Even if my ids are zero-indexed and compact (no holes in the numbering)... if I didn't add the nodes in the order of their ids, then writing an XDA file will completely scramble the numbering!

Here's another issue: why are we doing so much parallel communication of meshes in the case of SerialMesh? Why doesn't processor zero just go through and write the mesh out exactly as it is to XDA? Instead, there is a complicated routine that does a lot of parallel communication to push all of the pieces through processor 0 from all the other processors....

I tried to turn off the renumbering and un-renumbering - but it doesn't help in parallel because the parallel communication changes the order of things....

I'm halfway wondering if we should just invent a new format for restart files. I don't see any way to fix all the issues with XDA and maintain backwards compatibility. Is it time to just have a design discussion and lay out a new format altogether?

Unfortunately this has thrown an enormous monkey wrench into my current project (doing perfect restart in MOOSE). I thought I was 99% done (it all works in serial with compact meshes that are zero-indexed... but everything is scrambled in parallel)... but I was counting on the XDA formats in libMesh working in a way that the perfect (as in an exact replica) mesh and EquationSystems showed up on the other side of a write and a read... and it turns out not to be the case...

Derek
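[Editor's note: the scramble Derek describes can be modeled in a few lines. This is a toy model of the round trip, not libMesh code: nothing about the original ids reaches the file, so the reader can only infer ids by position.]

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of the current XDA round trip: on write, ids are replaced
// by "globally consistent" compact ids; on read, ids are inferred from
// file position. Original sparse ids (e.g. 500..1000) never hit disk.
std::vector<unsigned long>
xda_round_trip (const std::vector<unsigned long> & original_ids)
{
  // "write": only the count survives, not the ids themselves
  const std::size_t n = original_ids.size();

  // "read": ids inferred by position -> always 0..n-1
  std::vector<unsigned long> restored(n);
  for (std::size_t i = 0; i < n; ++i)
    restored[i] = static_cast<unsigned long>(i);
  return restored;
}
```

Any numbering that is sparse, non-zero-based, or out of insertion order comes back different, which is exactly the failure mode described above.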
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2013-11-07 00:42:41
On Nov 6, 2013, at 5:59 PM, "Derek Gaston" <fri...@gm...> wrote:

> Is it time to just have a design discussion and lay out a new format altogether?

No issue there - that's part of why I was asking if there is an obvious way to extend an Exodus file to contain the refinement hierarchy. But that's just one possibility.

> Unfortunately this has thrown an enormous monkey wrench into my current project (doing perfect restart in MOOSE). [...]

If you have a minimal example I could help modify the existing parallel file I/O implementation to see if we can get what you want. I don't know of anyone relying on backward compatibility for the parallel format, so I'm not too worried about changes there.

-Ben
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2013-11-07 01:36:41
On Nov 6, 2013, at 6:42 PM, "Kirk, Benjamin (JSC-EG311)" <ben...@na...> wrote:

> If you have a minimal example I could help modify the existing parallel file I/O implementation to see if we can get what you want. [...]

OK, so I've reminded myself of where this was left off - the parallel I/O for the equation systems is in place, while for the mesh it is not implemented yet. I'll review where things were left off, and I don't think it'll be that difficult to implement what you're looking for.

-Ben
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2013-11-07 01:51:17
On Nov 6, 2013, at 5:59 PM, Derek Gaston <fri...@gm...> wrote:

> Here's another issue: Why are we doing so much parallel communication of meshes in the case of Serial mesh? Why doesn't processor zero just go through and write the mesh out exactly as it is to XDA? [...]

The answer to this one is actually easy: write_serialized_…(), the private implementation, is just a bad naming convention. What is meant by that is "write … to a serialized file". The implementation works for a serial or parallel mesh, and uses communication to gather data to write to a serialized file. Rather than maintain two implementations, we just always proceed as if the source mesh is distributed.

-Ben
From: Derek G. <fri...@gm...> - 2013-11-07 01:54:10
Sure - I understand. But a specialization for SerialMesh would make a lot of sense in this case - saving a ton of parallel communication (and making it possible to produce the exact same XDA file on one processor or thousands).

Derek
From: Cody P. <cod...@gm...> - 2013-11-07 02:54:12
On Thu, Nov 7, 2013 at 6:36 PM, Kirk, Benjamin (JSC-EG311) <ben...@na...> wrote:

> OK, so I've reminded myself of where this was left off - the parallel I/O for the equation systems is in place, while for the mesh it is not implemented yet. [...]

Let us know where we can help. I was going to add element IDs back to the connectivity arrays in XDA output and also add node IDs to that format for the first time too. That might be counter-productive if you are going another direction.

Cody
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2013-11-07 03:19:35
On Nov 6, 2013, at 8:54 PM, Cody Permann <cod...@gm...> wrote:

> Let us know where we can help. I was going to add element IDs back to the connectivity arrays in XDA output and also add node IDs to that format for the first time too. That might be counter-productive if you are going another direction.

Sure - I'll make a branch to implement the parallel mesh XDA output. I expect in the parallel files we'll need element IDs and node IDs. It took a lot of work in the serialized implementation to get those inferred by position, and I don't see any reason to mess with that. I actually expect the parallel mesh XDA bit to be pretty straightforward.

As for alternatives, I've looked into HDF5 and the like, and I am not at all sold… I'm open for discussion, though.

-Ben
From: Derek G. <fri...@gm...> - 2013-11-07 03:26:31
On Wed, Nov 6, 2013 at 6:36 PM, Kirk, Benjamin (JSC-EG311) <ben...@na...> wrote:

> OK, so I've reminded myself of where this was left off - the parallel I/O for the equation systems is in place, while for the mesh it is not implemented yet. [...]

Note: I'm not that worried about parallel I/O... mostly I just need all of the element and node numbering preserved if I write out XDA and then read it back in (i.e., I need the exact same Mesh and vector layout). For now, just getting that working for SerialMesh would be useful.

Further, the XDA writer in EquationSystems::read() and EquationSystems::write() can destroy your current node and element numbering (i.e., you call it, and after it finishes, the node and element numbering can be different for the system you called it on!). This is happening because of the combination of globally_renumber_nodes_and_elements() and fix_broken_node_and_element_numbering() calls in those routines.

fix_broken_node_and_element_numbering() will destroy any non-contiguous, non-zero-starting numbering - or any numbering where the numbering of nodes doesn't match the order they are added to the mesh. Note that last one: it is a serious one, as we're getting ready to start using the node and element "maps" in Exodus that will effectively renumber nodes and elements after they are added to the mesh. So just writing out an EquationSystems will destroy our numbering...

Basically, the node and element numbering cannot be inferred from the XDA file. It might not start at 0. It might not be contiguous, and it might not be assigned in the "perfect" order...

Derek
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2013-11-07 03:39:38
On Nov 6, 2013, at 9:26 PM, "Derek Gaston" <fri...@gm...> wrote:

> fix_broken_node_and_element_numbering() will destroy any non-contiguous, non-zero-starting numbering - or any numbering where the numbering of nodes doesn't match the order they are added to the mesh. [...]

Indeed - that method assumes the contiguous and changeable order that's been in the library for a long time. But it doesn't have to - right now it looks through objects and infers their ids by position, which will indeed break when you make the aforementioned change.

That's easy to work around, though - globally_renumber_... can simply save the original ordering, and then it can be restored exactly by providing that to fix_broken_...

We've not historically treated object ids as sacred, but preserving them when important should not be too difficult.

-Ben
From: Derek G. <fri...@gm...> - 2013-11-07 03:47:34
I think we shouldn't be renumbering at all... we also want to read back from the XDA file and get the exact same numbering we currently have...

I think I may be missing the reason why we are renumbering. To me, I just want a direct representation of the current mesh in XDA form...

Derek
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2013-11-07 12:40:21
On Nov 6, 2013, at 9:47 PM, "Derek Gaston" <fri...@gm...> wrote:

> I think I may be missing the reason why we are renumbering. To me, I just want a direct representation of the current mesh in XDA form...

For the serialized solution XDA, it is all about M->N restarts. Forget the mesh for the moment; it's not relevant to this case.

Building a cube on 1 or 10 processors has historically produced a different element/node ordering in libMesh. The same thing will happen if you invoke different partitioners on the same mesh. The same thing will happen if you build a square with 4 elements vs. building one element and refining it once.

We need a way to restart the same solution on topologically the same mesh regardless of the ordering, and the serialized solution format is designed to handle that need. So at the moment the parallel solution I/O stuff proceeds from that history. I agree that if we are limiting the parallel restart capability to (1) always require an associated mesh and (2) only work in the M->M case, we can likely remove the reordering.

The mesh is different - we do not reorder it explicitly, but we write the elements by level, so that can change the "natural" ordering with refinement. This was done as a logical way to get the refinement tree out and also not require additional ids. As I mentioned yesterday, for the parallel mesh case I expect we'll need ids in any case. I'm also open to constructing some kind of integer graph instead to represent the refinement hierarchy, but I don't clearly see what that looks like, so we need to think about it. It may be as easy as the current parent index bit, but read after the elements...

-Ben
From: Derek G. <fri...@gm...> - 2013-11-07 15:13:12
I understand what you're saying - but the current format is so very close to what I actually currently need. If it had element and node ids in it, and tried to restore those ids when you loaded the file, I believe that it would do everything I currently need it to do.

Can we do something simple in the interim, like a configure option to write IDs to the XDA file? After we have that working we can take a look at doing something better.

For instance, I currently have patches that add Silo ( https://wci.llnl.gov/codes/silo/ ) support to libMesh (from a developer at Livermore)... maybe we can take a look at what he's done and improve upon it and make it our official format (one nice thing is that ParaView and VisIt actually read that format - so you could actually look at the files you dump out... and it natively supports AMR).

But - I really just need a quick fix for now to allow progress to continue on my current task...

Derek
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2013-11-07 15:36:20
On Nov 7, 2013, at 9:13 AM, Derek Gaston <fri...@gm...> wrote:

> Can we do something simple in the interim like a configure option to write IDs to the XDA file?

Yeah, this is trivial. Consider what we have currently in the header:

1     # number of elements
27    # number of nodes
.     # boundary condition specification file
n/a   # subdomain id specification file
n/a   # processor id specification file
n/a   # p-level specification file
1     # n_elem at level 0, [ type (n0 ... nN-1) ]

We already support writing the processor id, or not, based on the "processor id specification file". We presently support "n/a" or ".", meaning the processor id is either not included or written to the current file. The idea here is the processor mapping could also be read from a separate file, but this has not been implemented.

Adding either

n/a   # element id specification file
.     # element id specification file

to the header is what we want to do. Similarly for the node ids.

Cody, you just recently changed the I/O version number anyway, right? So we can just add this option into the new, unreleased change.
From: Derek G. <fri...@gm...> - 2013-11-07 15:43:53
After thinking more about it - even getting the ids right is not enough. I need the nodes in the _nodes vector in Mesh to literally be inserted in the exact same order (or else the solution vector passed to the ExodusIO will be in a different order).

At this point I believe that the current goals of XDA (while really cool) are just not aligned with what I'm looking for. I literally just want a serialization of the Mesh and solution vectors. I want to be able to perfectly recreate those structures upon read - exactly as if the code had not exited at all.

I actually just wrote a system in MOOSE for doing exactly that for complicated data structures... and I think I'm going to give writing the Mesh and solution vectors out using that system a shot. Let me see what I can come up with.

Derek
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2013-11-07 15:51:09
On Nov 7, 2013, at 9:43 AM, Derek Gaston <fri...@gm...> wrote:

> At this point I believe that the current goals of XDA (while really cool) are just not aligned with what I'm looking for. I literally just want a serialization of the Mesh and solution vectors. I want to be able to perfectly recreate those structures upon read - exactly as if the code had not exited at all.

Let's denote this feature a "checkpoint/reload." Under the following restrictions I contend it'll be "easy":

(1) checkpoint/reload is restricted to the same number of processors, and
(2) you always have an XDA mesh to go along with an XDA solution.

The biggest pain will be refined meshes. Depending on serial vs. parallel mesh, I do not think there is a guarantee that iterating through the elements as contained in the mesh guarantees children come before parents. Especially for the parallel mesh, we are just iterating through a map of some unknown type (std::map, hash_map, unordered_map, I think). Reconstructing the proper refinement hierarchy when the elements are read in a random order is going to be the tricky part.

-Ben
From: Cody P. <cod...@gm...> - 2013-11-07 16:37:35
|
On Thu, Nov 7, 2013 at 8:36 AM, Kirk, Benjamin (JSC-EG311) <
ben...@na...> wrote:

> On Nov 7, 2013, at 9:13 AM, Derek Gaston <fri...@gm...> wrote:
>
> > I understand what you're saying - but the current format is so very
> > close to what I actually currently need.  If it had element and node
> > ids in it and tried to restore those ids when you loaded the file I
> > believe that it would do everything I currently need it to do.
> >
> > Can we do something simple in the interim like a configure option to
> > write IDs to the XDA file?
>
> Yeah, this is trivial.
>
> Consider what we have currently in the header:
>
> 1      # number of elements
> 27     # number of nodes
> .      # boundary condition specification file
> n/a    # subdomain id specification file
> n/a    # processor id specification file
> n/a    # p-level specification file
> 1      # n_elem at level 0, [ type (n0 ... nN-1) ]
>
> We already support writing the processor id, or not, based on the
> "processor id specification file" entry.  We presently support "n/a" or
> ".", meaning the processor id is either not included or written to the
> current file.  The idea here is that the processor mapping could also be
> read from a separate file, but this has not been implemented.
>
> Adding
>
> "n/a"  # element id specification file
> .      # element id specification file
>
> to the header is what we want to do.  Similarly for the node ids.
>
> Cody, you just recently changed the IO version number anyway, right?  So
> we can just add this option into the new, unreleased change.

Yes I did, and we are still tweaking it even now.  I just added the
nodeset information a few days ago, so let's figure out what we need in
the current version before the next release!

Cody
|
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2013-11-07 19:27:36
|
On Nov 7, 2013, at 10:37 AM, Cody Permann <cod...@gm...> wrote:

> Yes I did, and we are still tweaking it even now.  I just added the
> nodeset information a few days ago, so let's figure out what we need in
> the current version before the next release!

If at all possible I'd like to treat the "unique ids" the same way we do
the other stuff:

XdrIO io(mesh);
io.partition_map_file_name() = ".";
io.unique_ids_file_name() = "n/a";
io.write("mesh.xda");

We can then write the (id,uid) in the connectivity block, and also along
with the coordinates.

As for the header stuff, how about we target something more like this:

libMesh-0.9.2+ parallel type_size=8 sid_size=8 eid_size=8 side_size=8 bid_size=8
1000   # number of elements
1331   # number of nodes
.      # boundary condition specification file
.      # subdomain id specification file
.      # unique id specification file
n/a    # processor id specification file
n/a    # p-level specification file
0      # subdomain id to name map
1000   # n_elem at level 0, [ type sid uid (n0 ... nN-1) ]

Or maybe just

libMesh-0.9.2+ parallel id_size=8
…

I think we can make better use of the "version" string for extensibility,
basically treating it instead as a "feature" string.

---------------------------------------------------------------------

As for Derek's concern, I tend to agree that the current XDA file format
has to address a lot of concerns that overcomplicate it for what he wants.
I'll add another restriction: if we limit the parallel restart capability
to

(1) always require an associated mesh,
(2) only work in the M->M case, and
(3) not worry about version compatibility,

then a new feature is desirable and not too hard.

I recommend

XdrIO io(mesh);

io.checkpoint("foo.xda");
...
io.restore("foo.xda");

and likewise for the EquationSystems.

Agreed?

-Ben
|
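[Editor's note: Ben's idea of treating the version string as a "feature" string could be parsed along the lines below. This is a hedged sketch - the thread only shows example strings, so the grammar (first token is the version; `key=value` tokens are features; bare tokens like `parallel` are flags) and the `parse_feature_string` name are assumptions, not libMesh API:]

```cpp
#include <map>
#include <sstream>
#include <string>

// Parsed form of a header line such as "libMesh-0.9.2+ parallel id_size=8".
struct HeaderInfo {
    std::string version;                        // first token on the line
    std::map<std::string, std::string> features;
};

HeaderInfo parse_feature_string(const std::string& line)
{
    HeaderInfo info;
    std::istringstream iss(line);
    std::string tok;

    iss >> info.version;                        // e.g. "libMesh-0.9.2+"

    while (iss >> tok) {
        const std::string::size_type eq = tok.find('=');
        if (eq == std::string::npos)
            info.features[tok] = "true";        // bare flag, e.g. "parallel"
        else
            info.features[tok.substr(0, eq)] = tok.substr(eq + 1);
    }
    return info;
}
```

A reader built this way only needs to check for the features it understands, which is what makes the string extensible across releases.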
From: Derek G. <fri...@gm...> - 2013-11-07 20:26:52
|
I'm thinking a new object: CheckpointIO

So you do CheckpointIO.write()/read()

Cody and I are starting on this now - I'll provide a pull request once we
get a ways in so you can comment.

Derek


On Thu, Nov 7, 2013 at 12:27 PM, Kirk, Benjamin (JSC-EG311) <
ben...@na...> wrote:

> On Nov 7, 2013, at 10:37 AM, Cody Permann <cod...@gm...> wrote:
>
> > Yes I did, and we are still tweaking it even now.  I just added the
> > nodeset information a few days ago, so let's figure out what we need
> > in the current version before the next release!
>
> If at all possible I'd like to treat the "unique ids" the same way we do
> the other stuff:
>
> XdrIO io(mesh);
> io.partition_map_file_name() = ".";
> io.unique_ids_file_name() = "n/a";
> io.write("mesh.xda");
>
> We can then write the (id,uid) in the connectivity block, and also along
> with the coordinates.
>
> As for the header stuff, how about we target something more like this:
>
> libMesh-0.9.2+ parallel type_size=8 sid_size=8 eid_size=8 side_size=8 bid_size=8
> 1000   # number of elements
> 1331   # number of nodes
> .      # boundary condition specification file
> .      # subdomain id specification file
> .      # unique id specification file
> n/a    # processor id specification file
> n/a    # p-level specification file
> 0      # subdomain id to name map
> 1000   # n_elem at level 0, [ type sid uid (n0 ... nN-1) ]
>
> Or maybe just
>
> libMesh-0.9.2+ parallel id_size=8
> …
>
> I think we can make better use of the "version" string for
> extensibility, basically treating it instead as a "feature" string.
>
> ---------------------------------------------------------------------
>
> As for Derek's concern, I tend to agree that the current XDA file format
> has to address a lot of concerns that overcomplicate it for what he
> wants.  I'll add another restriction: if we limit the parallel restart
> capability to
>
> (1) always require an associated mesh,
> (2) only work in the M->M case, and
> (3) not worry about version compatibility,
>
> then a new feature is desirable and not too hard.
>
> I recommend
>
> XdrIO io(mesh);
>
> io.checkpoint("foo.xda");
> ...
> io.restore("foo.xda");
>
> and likewise for the EquationSystems.
>
> Agreed?
>
> -Ben
>
> _______________________________________________
> Libmesh-devel mailing list
> Lib...@li...
> https://lists.sourceforge.net/lists/listinfo/libmesh-devel
|
From: Kirk, B. (JSC-EG311) <ben...@na...> - 2013-11-07 20:30:35
|
Sounds good.  Feel free to grab anything useful from here:

https://github.com/libMesh/libmesh/tree/checkpoint

On Nov 7, 2013, at 2:26 PM, Derek Gaston <fri...@gm...> wrote:

> I'm thinking a new object: CheckpointIO
>
> So you do CheckpointIO.write()/read()
>
> Cody and I are starting on this now - I'll provide a pull request once
> we get a ways in so you can comment.
>
> Derek
>
> [...]
|