Fitting

Armin Weiser Jörgen Brandt

Curve fitting

Curve fitting is a central aspect of PMM-Lab. This section presents the use cases in which you may want to fit a curve and explains how to do so. It also introduces how fitted models can be processed further and how the results can be visualized.

Use cases

There are two distinct scenarios in which you might want to fit a curve.

  • Given a microbial data set, you want to derive a model that approximates the tenacity behavior under the same conditions that gave rise to this data set.
  • Given the model parameters derived under a variety of conditions, you want to derive a model that approximates the values of these model parameters as a function of the conditions.

While the first scenario describes the estimation of a Primary Model, the second describes the estimation of a Secondary Model. We will refer to these two concepts frequently, so understanding them is essential for the discussion that follows.

Primary Model Fitting

Fitted primary model curve in PMM-Lab view

In Primary Model Fitting, we consider bacterial growth or inactivation as the concentration C of a bacterial agent as a function of the time t. So

t -> log10(C)

In PMM-Lab this mapping is implemented as a function dependent on a set of parameters p. Hence,

log10(C)=f(t|p)

The goal of the fitting process is to find values for the parameter set p such that f approximates a given data set D as accurately as possible. Consequently, the fitting of a Primary Model consumes a microbial data set D and outputs a set of parameters p, the primary model.
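
As an illustration of what a primary fit does, the following minimal Python sketch fits a hypothetical linear primary model log10(C) = log10_C0 + slope * t to a made-up data set by ordinary least squares. The model, the parameter names, and the data are assumptions chosen for the example; the preset models and the optimizer used by PMM-Lab are not described here and may differ.

```python
from statistics import mean


def f(t, p):
    """Hypothetical primary model: log10(C) = log10_C0 + slope * t."""
    return p["log10_C0"] + p["slope"] * t


def fit_primary(times, log10_conc):
    """Ordinary least squares for the linear model f above."""
    t_mean = mean(times)
    y_mean = mean(log10_conc)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, log10_conc))
    sxx = sum((t - t_mean) ** 2 for t in times)
    slope = sxy / sxx
    return {"log10_C0": y_mean - slope * t_mean, "slope": slope}


# Made-up data set D: time in hours and log10 of the concentration
D_times = [0.0, 1.0, 2.0, 3.0, 4.0]
D_log10_conc = [8.0, 7.4, 7.1, 6.5, 6.0]

# The fitted parameter set p is the primary model
p = fit_primary(D_times, D_log10_conc)
print(p)
print(f(2.5, p))  # model prediction of log10(C) at t = 2.5 h
```

The input is the data set D and the output is the parameter set p, which mirrors the consume and produce roles described above.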

Secondary Model Fitting

Fitted secondary model curve in PMM-Lab view

In Secondary Model Fitting, we assume that bacterial kinetics depend on the environmental conditions e under which the original experiment was conducted. So

e -> p

In PMM-Lab, a model that implements this mapping also depends on a set of parameters s. Hence,

p=f(e|s)

So, in Secondary Model Fitting we try to find a parameter set s such that f approximates the parameters of a primary model as accurately as possible for given environmental conditions. Consequently, Secondary Model Fitting consumes examples of environmental conditions e together with the primary model parameters p associated with them, and produces a set of secondary parameters s.
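
The same idea can be sketched for a secondary fit. The Python example below assumes a single environmental condition (temperature) and a single primary parameter (a rate), and fits a hypothetical linear secondary model p = s0 + s1 * e by least squares. The model form and all numbers are made up for illustration and do not represent a PMM-Lab preset.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope of y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def f(e, s):
    """Hypothetical secondary model: p = s[0] + s[1] * e."""
    return s[0] + s[1] * e


# Made-up examples: (temperature e in deg C, fitted primary parameter p)
examples = [(10.0, 0.12), (20.0, 0.21), (30.0, 0.33), (40.0, 0.42)]
e_values = [e for e, _ in examples]
p_values = [p for _, p in examples]

# The secondary parameter set s
s = fit_line(e_values, p_values)
print(s)
print(f(25.0, s))  # estimated primary parameter at 25 deg C
```

Here each example pairs a condition e with a primary parameter p that was fitted beforehand, and the result s allows p to be estimated for conditions that were not measured.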

Curve fitting in PMM-Lab

In PMM-Lab, both Primary and Secondary Model Fitting are performed by a single KNIME node called "Model Fitting". The node decides which kind of fit to perform from the KNIME table presented to it. If given a combination of a Microbial Data Set and Primary Model Formulas, it performs a Primary Model fit. If given a combination of fitted Primary Models and Secondary Models, it performs a Secondary Model fit.

Formula Notation

PMM-Lab comes with a number of preset formulas that you can deploy right away. You may, however, want to create your own models. The node "Model Creator" lets you create and edit model formulas inside PMM-Lab. These formulas use infix notation, and a number of common math functions are available. For example, you may use

  • sqrt(x) to calculate the square root of the expression x
  • ...

An important part of PMM-Lab formula notation is the disambiguation of the logarithm function. In contrast to many programming languages, in PMM-Lab the function log(x) refers to the decadic logarithm. To avoid ambiguity among the various logarithm functions, you may prefer to use ln(x) for the natural logarithm with base e and log10(x) for the decadic logarithm.
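
To make the difference concrete, the following Python sketch maps the PMM-Lab function names mentioned above to Python's math module. The dictionary pmm_functions is a hypothetical helper for this example only. Note that Python's own math.log is the natural logarithm, so a formula ported naively from PMM-Lab to Python would change its meaning.

```python
import math

# Hypothetical mapping from PMM-Lab function names to Python functions,
# following the conventions stated in the text above.
pmm_functions = {
    "log": math.log10,    # decadic in PMM-Lab, unlike Python's math.log
    "log10": math.log10,  # decadic, unambiguous
    "ln": math.log,       # natural logarithm, base e
    "sqrt": math.sqrt,
}

x = 1000.0
print(pmm_functions["log"](x))    # decadic logarithm of 1000
print(pmm_functions["log10"](x))  # same value as above
print(pmm_functions["ln"](x))     # natural logarithm, a different value
```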

In your formulas you can also use conditional operators. For example

t<=1

evaluates to 1 if t is less than or equal to 1 and to 0 otherwise. You may use the following operators in the same way:

  • <
  • >
  • <=
  • >=
  • >=
  • &&
  • ||
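
Because a condition evaluates to 1 or 0, it can be multiplied with a term to switch that term on or off, which allows piecewise models to be written as a single formula. The Python sketch below mimics this behavior for a hypothetical formula of the form (t<=lag)*y0+(t>lag)*(y0+mu*(t-lag)). The formula and the names y0, mu, lag, start and end are assumptions made for the example, not a PMM-Lab preset.

```python
def indicator(condition):
    """Return 1.0 if the condition holds and 0.0 otherwise."""
    return 1.0 if condition else 0.0


def piecewise(t, y0, mu, lag):
    """Constant y0 up to the lag time, then linear increase with rate mu."""
    return (indicator(t <= lag) * y0
            + indicator(t > lag) * (y0 + mu * (t - lag)))


def in_window(t, start, end):
    """Counterpart of the combined condition t>=start && t<=end."""
    return indicator(t >= start and t <= end)


print(piecewise(0.5, y0=3.0, mu=0.4, lag=1.0))  # before the lag: y0
print(piecewise(3.0, y0=3.0, mu=0.4, lag=1.0))  # after the lag: y0 + mu * 2
print(in_window(2.0, 1.0, 3.0))
```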

Initialization range of formula parameters

For each parameter in a formula you can impose an initialization range. This range is used to initialize the algorithm that searches for the optimal parameter set. The estimated parameters are not required to lie within this range. You will, however, be notified if estimated parameters lie outside their initialization range.
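
The role of the initialization range can be sketched in Python as follows. The example assumes that starting values are drawn uniformly at random from each range and that a simple check reports estimates that ended up outside their range. How PMM-Lab actually chooses starting values is not stated here, so both functions and the parameter names mu and lag are hypothetical.

```python
import random


def initial_guesses(ranges, n_starts, seed=0):
    """Draw n_starts starting points, each parameter uniformly within its range."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(n_starts)
    ]


def out_of_range(estimates, ranges):
    """Names of estimated parameters that lie outside their initialization range."""
    return [
        name for name, value in estimates.items()
        if not ranges[name][0] <= value <= ranges[name][1]
    ]


# Made-up initialization ranges for two hypothetical parameters
ranges = {"mu": (0.0, 1.0), "lag": (0.0, 5.0)}
starts = initial_guesses(ranges, n_starts=3)
print(starts)

# Made-up fit result: mu left its range, which is allowed but reported
estimates = {"mu": 1.5, "lag": 2.0}
for name in out_of_range(estimates, ranges):
    print("Note: estimate for", name, "lies outside its initialization range")
```

The range only constrains where the search begins; the check afterwards is a notification and does not reject the result.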