From: <se...@pi...> - 2006-10-20 21:34:10
|
Hi!

This is probably a silly question, but I'm getting confused by a certain problem: a comparison between experimental data points (a set of 2D points) and a model (also a set of 2D points, with no analytical form).

The physical model produces some 2D point data (through sophisticated simulations done by an external program), and one of my tasks is to compare those calculated data with the experimental ones.

The experimental and modeled data have the form of 2D curves, built of n 2D points, i.e.:

expDat=[[x1,x2,x3,...,xn],[y1,y2,y3,...,yn]]
simDat=[[X1,X2,X3,...,Xn],[Y1,Y2,Y3,...,Yn]]

Determining, let's say, a root mean squared error (RMSE) is trivial if x1==X1, x2==X2, etc.

In general, and this is the common situation, xk differs from Xk (k=1..n), so one cannot simply compare successive Yk and yk (k=1..n) to determine the goodness of fit. The distance h=Xk-X(k-1) is constant, but the corresponding distance m(k)=xk-x(k-1) depends on k and is not constant, although the data arrays for simulation and experiment have the same length.

My first idea was to interpolate to obtain the missing points. I did it by hand (which, BTW, gave quite rewarding results), but I suppose there's some method, e.g. in numpy, that would do it for me, isn't there?

I would like to do something like:

gfit(expDat,simDat,'measure_type')

which I hope would return a number quantifying the goodness of fit (mean squared error, root mean squared error, ...) of two sets of discrete 2D data points.

Is there something like that in any of the numerical Python modules (numpy, pylab) that I could use?

I can imagine fitting the data with some polynomial or whatever and then comparing the fitted data, but my goal is to operate on data that are as raw as possible.

Thanks for your comments!
Sebastian |
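A minimal sketch of what such a `gfit` could look like, assuming plain linear interpolation with `numpy.interp` is acceptable. The function name and the `'measure_type'` argument come from the question; the supported measure names (`"mse"`, `"rmse"`) and the choice to ignore experimental points outside the simulated range are assumptions made for this illustration, not an existing numpy or pylab API.

```python
import numpy as np

def gfit(exp_dat, sim_dat, measure="rmse"):
    """Goodness of fit between two curves sampled on different x grids.

    Hypothetical helper: the simulated curve is linearly interpolated
    at the experimental x positions; the experimental data stay untouched.
    """
    x, y = np.asarray(exp_dat, dtype=float)
    X, Y = np.asarray(sim_dat, dtype=float)
    # np.interp expects increasing sample positions, so sort the simulation.
    order = np.argsort(X)
    X, Y = X[order], Y[order]
    # Assumption: only compare where the simulation covers the experiment.
    inside = (x >= X[0]) & (x <= X[-1])
    if not inside.any():
        raise ValueError("experimental x values lie outside the simulated range")
    residuals = y[inside] - np.interp(x[inside], X, Y)
    mse = np.mean(residuals ** 2)
    if measure == "mse":
        return mse
    if measure == "rmse":
        return np.sqrt(mse)
    raise ValueError("unknown measure: %r" % measure)

# Made-up data: a simulated straight line on a uniform grid and
# "experimental" points on an irregular grid, offset by 0.5 in y.
x_sim = np.linspace(0.0, 10.0, 101)
sim_dat = [x_sim, 2.0 * x_sim + 1.0]
x_exp = np.array([0.33, 1.7, 4.05, 8.9])
exp_dat = [x_exp, 2.0 * x_exp + 1.5]
print(gfit(exp_dat, sim_dat, "rmse"))
```

With this made-up data the residual is 0.5 at every experimental point, so the printed RMSE should be very close to 0.5.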
From: Robert K. <rob...@gm...> - 2006-10-20 22:15:28
|
Sebastian Żurek wrote:
> Hi!
>
> This is probably a silly question, but I'm getting confused by a
> certain problem: a comparison between experimental data points (a set
> of 2D points) and a model (also a set of 2D points, with no analytical form).
>
> The physical model produces some 2D point data (through sophisticated
> simulations done by an external program), and one of my tasks is to
> compare those calculated data with the experimental ones.
>
> The experimental and modeled data have the form of 2D curves, built of n
> 2D points, i.e.:
>
> expDat=[[x1,x2,x3,...,xn],[y1,y2,y3,...,yn]]
> simDat=[[X1,X2,X3,...,Xn],[Y1,Y2,Y3,...,Yn]]
>
> Determining, let's say, a root mean squared error (RMSE)
> is trivial if x1==X1, x2==X2, etc.
>
> In general, and this is the common situation, xk differs from Xk (k=1..n),
> so one cannot simply compare successive Yk and yk (k=1..n) to determine
> the goodness of fit. The distance h=Xk-X(k-1) is constant, but the
> corresponding distance m(k)=xk-x(k-1) depends on k and is not constant,
> although the data arrays for simulation and experiment have
> the same length.

Your description is a bit vague. Do you mean that you have some model function f that maps X values to Y values?

f(x) -> y

If that is the case, is there some reason that you cannot run your simulation using the same X points as your experimental data?

OTOH, is there some other independent variable (say Z) that *is* common between your experimental and simulated data?

f(z) -> (x, y)

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco |
From: <se...@pi...> - 2006-10-21 13:42:05
|
Robert Kern wrote:
> Your description is a bit vague.

Possibly because of my weak English... I'll try to make myself clearer now.

> Do you mean that you have some model function f
> that maps X values to Y values?
>
> f(x) -> y

My model is a quantum energy operator, a spin Hamiltonian (SH), with some additional assumptions about the so-called 'line shape', 'line width', etc. It describes the various electron interactions visible in an electron paramagnetic resonance (EPR, ESR) experiment. The simplest SH can be written in the form:

H = m B g S (1)

where m is a constant (the Bohr magneton), B is the magnetic field (my x variable), g is the so-called 'Zeeman matrix' and S is the total spin angular momentum operator.

Summing up, the simple model is parametrized by:
- line shape,
- line width,
- Zeeman matrix (3x3 diagonal matrix, the spatial dependence),
- total spin S.

After diagonalizing the SH (1) one obtains the so-called 'resonance fields' and 'resonance intensities'. After a convolution with an appropriate line shape function, which is parametrized by the line width, one finally gets the simulated EPR spectrum (simDat=[[X1,...,Xn],[Y1,...,Yn]]).

This is a rough, schematic description, appropriate for EPR spectra of monocrystals. In my situation the problem is more complicated: I have polycrystalline (powder) data, and to obtain a simulated EPR powder spectrum I need to sum the EPR spectra of monocrystals over many possible spatial orientations; the resulting spectrum is an envelope of all the monocrystal spectra.

There's no simple model function that maps X -> Y.

> If that is the case, is there some reason that you cannot run your simulation
> using the same X points as your experimental data?

I can only request an X range and the number of X values within that range; there's no way to get Y(X) for a specified X. These limitations come partly from the external program I'm using to simulate the EPR spectra and partly from the spatial averaging of EPR data for powders, where a lot of interpolation is involved.

> OTOH, is there some other independent variable (say Z) that *is* common between
> your experimental and simulated data?
>
> f(z) -> (x, y)

This is probably the situation I'm in. These other variables are my model parameters, namely line shape and width, Zeeman matrix... and they're common to the experiment and the simulation.

To make it clear:

I've already solved the problem by a simple linear interpolation of the simulated points within a narrow neighborhood of each experimental data point. The simulation points are uniformly distributed along the X range, with a density I'm able to tune. It all works quite well, but I find it a 'brute-force' method, and I wonder whether there's a more sophisticated method, maybe already built into some Python module.

Anyway, it looks like it's impossible to compare two discrete 2D data sets without some interpolation... :]

A. M. Archibald has proposed spline fitting, which I'll try. I'll also look at the Numerical Recipes discussion he pointed to.

Sebastian |
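Because the simulated points sit on a uniform X grid, the two neighbours of any experimental x can be located by simple arithmetic rather than a search. The following is only a sketch of that idea; `interp_uniform` and its arguments are hypothetical names, and for points inside the simulated range the result should agree with `numpy.interp`.

```python
import numpy as np

def interp_uniform(x, X0, h, Y):
    """Linearly interpolate Y, sampled at X0, X0+h, X0+2h, ..., at positions x.

    Positions outside the grid are extrapolated from the first or last
    segment (an assumption of this sketch; np.interp clamps instead).
    """
    x = np.asarray(x, dtype=float)
    Y = np.asarray(Y, dtype=float)
    t = (x - X0) / h                                      # position in grid steps
    k = np.clip(np.floor(t).astype(int), 0, len(Y) - 2)   # left neighbour index
    w = t - k                                             # fractional offset from it
    return (1.0 - w) * Y[k] + w * Y[k + 1]

# Made-up example: uniform simulation grid, irregular experimental x values.
X = np.linspace(0.0, 1.0, 11)
Y = X ** 2
x_exp = np.array([0.05, 0.5, 0.97])
y_at_exp = interp_uniform(x_exp, X[0], X[1] - X[0], Y)
print(y_at_exp)
```

Raising the simulation density (a smaller h) reduces the interpolation error, which matches the tuning knob described in the message.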
From: Charles R H. <cha...@gm...> - 2006-10-21 15:51:56
|
On 10/21/06, Sebastian Żurek wrote:
>
> Robert Kern wrote:
>
>
> To make it clear.
>
> I've already solved the problem by a simple linear interpolation of
> simulated points within the narrow neighborhood of experimental data
> point. The simulation points are uniformly distributed along the
> X-range, with a density I'm able to tune. It all works quite well but
> I'm founding it as a 'brute-force' method and I wonder, if there's any
> more sophisticated and maybe already incorporated into any Python module
> method?
>
> Anyway, it looks like it's impossible to compare two discrete 2D data
> sets without any interpolations included... :]

I note that interpolation can be seen as a linear map from the data points to the interpolation points, so a lot of standard tools can be used.

Chuck |
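To illustrate the linear-map view: for fixed sample positions X and target positions x, linear interpolation can be written as a matrix W acting on the sampled values, so that `W @ Y` gives the interpolated values. This is a sketch under the assumption of piecewise-linear interpolation with x inside the range of X; `interpolation_matrix` is a hypothetical helper, not a numpy function.

```python
import numpy as np

def interpolation_matrix(x, X):
    """Return W such that W @ Y equals np.interp(x, X, Y) for x within X's range.

    X must be strictly increasing. Each row of W has two non-zero
    weights, one for each neighbouring sample, and they sum to 1.
    """
    x = np.asarray(x, dtype=float)
    X = np.asarray(X, dtype=float)
    # Index of the sample just left of each x (the last segment handles x == X[-1]).
    k = np.clip(np.searchsorted(X, x, side="right") - 1, 0, len(X) - 2)
    w = (x - X[k]) / (X[k + 1] - X[k])
    W = np.zeros((len(x), len(X)))
    rows = np.arange(len(x))
    W[rows, k] = 1.0 - w
    W[rows, k + 1] = w
    return W

# Made-up data to check the matrix against np.interp.
X = np.linspace(0.0, 5.0, 6)
Y = np.sin(X)
x_exp = np.array([0.2, 2.5, 4.9])
W = interpolation_matrix(x_exp, X)
print(W @ Y)
```

Since W depends only on the two grids, the residual `y - W @ Y` is linear in the simulated values Y, which is what makes standard linear-algebra tools applicable.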
From: <se...@pi...> - 2006-10-21 18:47:32
|
The problem seems to be solved, thanks to A. M. Archibald's hint.

I've used splines to fit the simulation data. After that, I can easily find Y(X) for any X in the range (x_min, x_max), where x_min and x_max are the bounds of the experimental independent variable.

The experimental data stay untouched.

Sorry for all the confusion I've caused. Thanks a lot to all of you!

Sebastian |
From: A. M. A. <per...@gm...> - 2006-10-20 23:28:37
|
On 20/10/06, Sebastian Żurek wrote:

> Is there something like that in any numerical python modules (numpy,
> pylab) I could use?

In scipy there are some very convenient spline fitting tools which will allow you to fit a nice smooth spline through the simulation data points (or near, if they have some uncertainty); you can then easily look at the RMS difference in the y values. You can also, less easily, look at the distance from the curve allowing for some uncertainty in the x values.

I suppose you could also fit a curve through the experimental points and compare the two curves in some way.

> I can imagine, I can fit the data with some polynomial or whatever,
> and then compare the fitted data, but my goal is to operate on
> as raw data as it's possible.

If you want to avoid using an a priori model, Numerical Recipes discuss some possible approaches ("Do two-dimensional distributions differ?" at http://www.nrbook.com/a/bookcpdf.html is one) but it's not clear how to turn the problem you describe into a solvable one - some assumption about how the models vary between sampled x values appears to be necessary, and that amounts to interpolation.

A. M. Archibald |
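One way the spline suggestion might look in practice, as a sketch only: it assumes a modern SciPy with `scipy.interpolate.InterpolatedUnivariateSpline`, which passes exactly through the simulated points. The helper name `rmse_spline` and the test data are made up for illustration. If the simulated points carry some uncertainty, `scipy.interpolate.UnivariateSpline` with a smoothing factor `s` would be the variant that passes near the points instead.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

def rmse_spline(exp_dat, sim_dat, k=3):
    """RMS difference in y between experimental points and a spline
    through the simulated points (hypothetical helper)."""
    x, y = np.asarray(exp_dat, dtype=float)
    X, Y = np.asarray(sim_dat, dtype=float)
    # X must be strictly increasing; k=3 gives a cubic spline.
    spline = InterpolatedUnivariateSpline(X, Y, k=k)
    # Assumption: skip experimental points outside the simulated range.
    inside = (x >= X[0]) & (x <= X[-1])
    residuals = y[inside] - spline(x[inside])
    return np.sqrt(np.mean(residuals ** 2))

# Made-up data: a smooth simulated curve on a uniform grid, and
# irregularly spaced "experimental" points shifted by 0.1 in y.
X = np.linspace(0.0, 2.0 * np.pi, 200)
sim_dat = [X, np.sin(X)]
rng = np.random.default_rng(0)
x_exp = np.sort(rng.uniform(0.1, 6.0, 50))
exp_dat = [x_exp, np.sin(x_exp) + 0.1]
print(rmse_spline(exp_dat, sim_dat))
```

Only the simulated curve is fitted here; the experimental points enter the comparison unmodified, and for this made-up data the result should be close to the 0.1 offset.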
From: <se...@pi...> - 2006-10-21 14:00:51
|
A. M. Archibald wrote:
>
> In scipy there are some very convenient spline fitting tools which
> will allow you to fit a nice smooth spline through the simulation data
> points (or near, if they have some uncertainty); you can then easily
> look at the RMS difference in the y values. You can also, less easily,
> look at the distance from the curve allowing for some uncertainty in
> the x values.

I'll try spline fitting. I've already implemented linear interpolation (see my reply to Robert Kern), which works well enough to use. I'm applying genetic algorithms to the problem of optimizing the model parameters, and this RMSE comparison serves as my 'fitness function'. The fitness function is an important element of the whole procedure, so I'm trying to find the best way to compute it.

> I suppose you could also fit a curve through the experimental points
> and compare the two curves in some way.

Well, I can do that, indeed. But every fitting procedure introduces some additional error, so I must use fitting very cautiously. Fitting the simulated data points should be the only acceptable fitting step, I guess.

> If you want to avoid using an a priori model, Numerical Recipes
> discuss some possible approaches ("Do two-dimensional distributions
> differ?" at http://www.nrbook.com/a/bookcpdf.html is one) but it's not
> clear how to turn the problem you describe into a solvable one - some
> assumption about how the models vary between sampled x values appears
> to be necessary, and that amounts to interpolation.

I'll look at this NR discussion. Thank you for these comments!

Sebastian |