## [Vxl-users] bug report - MUL - pdf1d library - pdf_gaussian_builder

[Vxl-users] bug report - MUL - pdf1d library - pdf_gaussian_builder From: Nhon Trinh - 2009-09-02 21:24:28

```
File     : mul/pdf1d/pdf1d_gaussian_builder.cxx
Function : pdf1d_gaussian_builder::build(pdf1d_pdf& model, mbl_data_wrapper<double>& data)
Line     : 129

It appears that there is a bug in the computation of variance in the function
pdf1d_gaussian_builder::build(...)

With "m" being the mean of the samples, the variance is computed as
  double v = sum_sq/(n_samples-1) - m*m;
while it should have been:
  double v = sum_sq/(n_samples-1) - m*m * (n_samples / (n_samples-1));

This is the only occasion I have encountered so far. I'm not sure if a
similar formula has been used in other places.

The complete function is copied below for reference:
----------------------------------------------------
void pdf1d_gaussian_builder::build(pdf1d_pdf& model, mbl_data_wrapper<double>& data) const
{
  pdf1d_gaussian& g = gaussian(model);

  int n_samples = data.size();

  if (n_samples<2)
  {
    vcl_cerr<<"pdf1d_gaussian_builder::build() Too few examples available.\n";
    vcl_abort();
  }

  if (data.is_class("mbl_data_array_wrapper"))
  {
    // Use more efficient build_from_array algorithm
    mbl_data_array_wrapper<double>& data_array =
                     static_cast<mbl_data_array_wrapper<double>&>(data);
    build_from_array(model,data_array.data(),n_samples);
    return;
  }

  double sum = 0;
  double sum_sq = 0;

  data.reset();
  for (int i=0;i
```

[Vxl-users] bug report - MUL - pdf1d library - pdf_gaussian_builder From: Nhon Trinh - 2009-09-02 22:51:34 (message resent; body identical to the 21:24:28 post above)

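
The correction in the report can be checked with a short derivation. This is only a sketch: the symbols $n$, $S_1$ and $S_2$ are introduced here as shorthand for `n_samples`, `sum` and `sum_sq`, and it assumes the builder is meant to produce the unbiased (divide by $n-1$) sample variance, which the $n-1$ divisor in the code suggests.

```latex
\begin{align*}
m   &= \frac{S_1}{n}, \qquad S_1 = \sum_{i=1}^{n} x_i, \qquad S_2 = \sum_{i=1}^{n} x_i^2 \\
s^2 &= \frac{1}{n-1} \sum_{i=1}^{n} (x_i - m)^2
     = \frac{S_2 - 2 m S_1 + n m^2}{n-1}
     = \frac{S_2 - n m^2}{n-1} \\
    &= \frac{S_2}{n-1} - m^2 \, \frac{n}{n-1}
     = \frac{S_2 - m S_1}{n-1}
\end{align*}
```

Under that assumption, the original expression $S_2/(n-1) - m^2$ overestimates $s^2$ by $m^2/(n-1)$. The last form follows from $n m = S_1$ and matches the expression `(sum_sq - m*sum)/(n_samples-1)` used in the patch in the reply.
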
Re: [Vxl-users] bug report - MUL - pdf1d library - pdf_gaussian_builder From: Peter Vanroose - 2009-09-04 18:38:42

```
Thanks for the bug report, Nhon!
I've applied your fix. (Below is the patch, in "diff -u" format.)

Actually, there was also a similar error in the variance calculation in
the method weighted_build(): I had to replace
  v *= double(n_samples-1)/n_samples;
by
  v *= n_samples/double(n_samples-1);

For large samples, the difference between the two is of course minimal,
so there is no effect of this change on the tests.

-- Peter.

--- pdf1d_gaussian_builder.cxx 2009-03-19 16:50:51 +0100
+++ pdf1d_gaussian_builder.cxx 2009-09-04 20:03:48 +0200
@@ -126,7 +126,7 @@
   }

   double m = sum/n_samples;
-  double v = sum_sq/(n_samples-1) - m*m;
+  double v = (sum_sq - m*sum)/(n_samples-1);
   if (v
```