Re: [Quantlib-users] QuantLib's GBM found flawed. What say the experts? (SOLVED)

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Ioannis Rigopoulos wrote on 09/11/23 11:47:
 > By the way, this is how my histogram of S(t) (for t = 1) simulated values
 > looks like when I used 100,000 time steps and 100,000 simulations. It is
 > clearly lognormal in shape.

Yes, it's lognormally. My values are lognormlly as well,
You just restate the obvious fact.

 > I am sure, if you do the same when time steps = 1,
 > you will get a normal distribution for S(t), which would prove
 > that time steps = 1 does not reproduce the theoretically expected
 > values for S(t).

Not true, simply from the fact that the set boundary is lognormally:
   -1SD: 74.08182206817179
   +1SD: 134.9858807576003
Ie. it's not symmetric (Gaussian, normally distributed) around the
initial mean (100), but is lognormally (and therefore unsymmetric).

FYI: in my other test program (that does not use QuantLib)
I get in such simulations exact results almost identical
to the expected theoretical results! So, your math and arguing is wrong, IMO.

You are too much fixated on timeSteps=1. Maybe you have
a much different understanding of what timeSteps really
means in practice. It's _not_ something like the number
of simulations that you simply can change and expect
to get the same result; no, it has a clearly defined context
and gives different results for different timeSteps.

And you haven't answered the question why you think it has
to be 68.27% for all timeSteps (which IMO is wrong).

For those interested, here are more details about the above said:
https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-3#post-5849548

And, I today did a brief summary of these ongoing discussions on GBM here:
https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-5#post-5861919

Ioannis Rigopoulos wrote on 09/11/23 11:47:
> Our SDE for the evolution of S is:
>
> dS = μSdt + σSdw
>
> In your code you use:
>
>
> PathGenerator<RandomSequenceGenerator<MersenneBoxMuller>>
> gbmPathGenerator(gbm, t, timeSteps, gsg, false);
> const Path& samplePath = gbmPathGenerator.next().value;
>
> The PathGenerator::next().value is designed to return an array of numbers
> {S(T₁) , S(T₂) , ... , S(Tᵤ)}
>
> where
>
> u := timeSteps
>
> Δt := t/u
>
> Tᵢ := iΔt for i = 1,...,u
>
> It follows:
>
> Tᵤ = t = the final time when we are interested to know the simulated values of
> S (in your test you use t = 1).
>
> QuantLib builds the sequence {S(T₁) , S(T₂) , ... , S(Tᵤ)} incrementally
> through a total number of u steps by adding the quantity ΔS := μSΔt + σSΔw on
> the previous member of the sequence, with Δw := random number drawn from
> N~(0,Δt¹ᐟ²).
>
> The previous member of S(T₁) is assumed the T₀ = 0 and the initial value S(T₀)
> = S(0) is assumed deterministic and known.
>
> So, starting with the given S(T₀), QuantLib calculates the S(T₁) using the
> formula:
>
> S(T₁) := S(T₀) + μS(T₀)Δt + σS(T₀)Δw
>
> Then given the S(T₁) as calculated above, it proceeds to calculate S(T₂) as:
>
> S(T₂) := S(T₁) + μS(T₁)Δt + σS(T₁)Δw
>
> For simplicity I have used the same notation for the Δw in the equations
> above, but they are different.
>
> After u steps, the final value S(t) = S(Tᵤ) is calculated as:
>
> S(t) = S(Tᵤ) := S(Tᵤ₋₁) + μS(Tᵤ₋₁)Δt + σS(Tᵤ₋₁)Δw
>
> All equations above hold only in an approximate sense when Δt is finite.
>
> They are accurate only in a limiting sense as Δt → 0.
>
> It follows that the value S(t) will be calculated correctly only if Δt → 0,
> which is equivalent to u → ∞
>
> If you use u = 1 (i.e. one time step), you will still match the 68.27% rule
> when the 1SD interval is defined in accordance with a normal distribution
> because S(T₁) := S(T₀) + μS(T₀)Δt + σS(T₀)Δw corresponds to a normal
> distribution and if you build a histogram of all simulated values you will
> notice that S(T₁) is distributed normally, which is clearly not consistent
> with the SDE that predicts lognormal distribution.
>
> The correct procedure is of course to use a very large u and define the 1SD
> interval in accordance with a lognormal distribution, as I described in my
> previous email.
>
> Then you will get convergence to the 68.27% rule.
>
> By the way, this is how my histogram of S(t) (for t = 1) simulated values
> looks like when I used 100,000 time steps and 100,000 simulations. It is
> clearly lognormal in shape. I am sure, if you do the same when time steps = 1,
> you will get a normal distribution for S(t), which would prove that time steps
> = 1 does not reproduce the theoretically expected values for S(t).
>
> Ioannis
>
> On 9/11/2023 10:19 AM, U.Mutlu wrote:
>> Dear Ioannis Rigopoulos, I now analysed your table and
>> must say that you seem to have a much different expectation to,
>> and understanding of that table.
>>
>> The  values in your table for timeSteps > 1 can definitely
>> not be correct, b/c why do you think it has to be 68.27 for all timeSteps?
>> This is illogical b/c the more timeSteps the more the hitrate%
>> must be, like in my table. It saturates at about 84.93%
>>
>> Sorry, I have to retract my initial enthusiastic judment
>> of your work below, as it's fundamentally flawed & wrong,
>> and I should have been more careful when I quickly & briefly
>> inspected your posting in the early morning here.
>>
>> Since your results in the table for timeSteps > 1 are wrong,
>> then your formulas must be wrong too, so I had to stop my analysis right there.
>>
>> Please just answer why you think it has to be 68.27% for all timeSteps?
>>
>>
>> U.Mutlu wrote on 09/11/23 08:36:
>>> Mr. Ioannis Rigopoulos, that's a brilliant work by you. Thank you very much.
>>> I need some time to study and verify the reults, but it looks very good
>>> to solve the confusion and mystery I had experienced when I started this
>>> research some weeks ago.
>>> I'll try to replicate your results and will report later in detail here.
>>>
>>> Ioannis Rigopoulos wrote on 09/10/23 19:29:
>>>  > On a different topic, I do not understand your statement "/I've never seen
>>>  > that Sx@-1SD and Sx@+1SD get computed with Ito's lemma applied/".
>>>  > What do the Sx@-1SD and Sx@+1SD mean? I suppose m -1SD and m +1SD, where
>>>
>>> Sorry for the consfusion, it just means the stock price (Sx after time t,
>>> ie. S(t) or S_t) for the lower boundary at -1SD and for the upper boundary at
>>> +1SD,
>>> ie. your "min of S(t)" and "max of S(t)" in your other posting.
>>> In future I'll use a more common term like yours, sorry.
>>>
>>> Thx
>>>
>>>
>>> Ioannis Rigopoulos wrote on 09/10/23 19:29:
>>>> Yes, I mean mu * t - 0.5 * sigma * sigma * t
>>>>
>>>> Are you sure, you also adjusted the boundaries corresponding to plus and
>>>> minus
>>>> 1 SD?
>>>>
>>>> I ran an Excel spreadsheet by modifying a spreadsheet I had published several
>>>> years ago at https://blog.deriscope.com/index.php/en/excel-value-at-risk that
>>>> uses the GeometricBrownianMotionProcess , albeit with the
>>>> PseudoRandom::rsg_type generator.
>>>>
>>>> At any case, I got convergence to 68.27% as I increased the number of time
>>>> steps and keeping the number of simulations fixed to 100,000, provided I used
>>>> the correct (Ito-adjusted) means, as below:
>>>>
>>>> As you see above, the "No Ito-adjusted" boundaries are 74.08 and 134.99, just
>>>> like those you use.
>>>>
>>>> The Ito-adjusted boundaries are 70.82 and 129.05 produced by shifting the
>>>> no-adjusted boundaries - in log terms - to the left by 0.5*sigma^2*t.
>>>>
>>>> The simulation results (percentage of values being within the min and max
>>>> boundaries) are below:
>>>>
>>>> As you see, the Ito-adjusted percentage reaches 68.30% for 100,000 time
>>>> steps,
>>>> which is close to the expected 68.27%.
>>>>
>>>> The No Ito-adjusted percentage seems instead to converge to a different value
>>>> around 67.73%
>>>>
>>>> On a different topic, I do not understand your statement "/I've never seen
>>>> that Sx@-1SD and Sx@+1SD get computed with Ito's lemma applied/".
>>>>
>>>> What do the Sx@-1SD and Sx@+1SD mean? I suppose m -1SD and m +1SD, where m is
>>>> the mean of a normally distributed variable. This means that one needs to
>>>> calculate m in order to do the mentioned computation.
>>>>
>>>> If your starting point is an SDE like dS = S*mu*dt + S*sigma*dw, then
>>>> ln(S(t))
>>>> at any time t will be normally distributed with mean m = ln(S(0)) + mu*t -
>>>> 0.5*sigma^2*t , where the last term is a consequence of the Ito lemma.
>>>>
>>>> Do you have any reason to believe that the Ito lemma is somehow not valid so
>>>> that m = ln(S(0)) + mu*t?
>>>>
>>>> Ioannis
>>>>
>>>> On 9/10/2023 3:00 PM, U.Mutlu wrote:
>>>>
>>>>> Ioannis Rigopoulos wrote on 09/10/23 13:06:
>>>>> > new GeometricBrownianMotionProcess(S, mu - 0.5*sigma^2*t, sigma)
>>>>>
>>>>> Thank you for your analysis.
>>>>> I guess you mean "mu * t - 0.5 * sigma * sigma * t",
>>>>> though in our example with t=1 it doesn't make any difference.
>>>>> It gives for timeSteps=1 this result:
>>>>>
>>>>> $ ./Testing_GBM_of_QuantLib.exe 1
>>>>> timeSteps=1 nRuns=1000000 seed=1694350548 S=100.000000 mu=0.000000
>>>>> sigma=0.300000 t=1.000000 : nGenAll=1000000 USE_ITO=1 : -1SD=70.822035
>>>>> +1SD=129.046162
>>>>> cHit=663067/1000000(66.307%)
>>>>>
>>>>> This is far from the expected value of 68.2689% for timeStep=1.
>>>>> So, there must be a bug somewhere. IMO it's our suspect friend Ito :-)
>>>>>
>>>>> Cf. also the following showing similar results for GBM,
>>>>> which then was compared to the standard lognormal calculation:
>>>>> https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-3#post-5849548
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> And, honestly, I've never seen that Sx@-1SD and Sx@+1SD get computed with
>>>>> Ito's lemma applied.
>>>>>
>>>>>
>>>>> Ioannis Rigopoulos wrote on 09/10/23 13:06:
>>>>>> Thank you for testing the BoxMullerGaussianRng code, which - as I wrote -
>>>>>> does
>>>>>> not seem to be used in other areas of the standard part of QuantLib.
>>>>>>
>>>>>> Your numerical results below ...
>>>>>>
>>>>>> indicate that the simulated values for ln(S(t)) produced with timeSteps = 1
>>>>>> are very likely normally distributed with mean (I use your notation)
>>>>>> ln(S(0))
>>>>>> + mu*t and standard deviation sigma * sqrt(t)
>>>>>>
>>>>>> This result _*is consistent*_ with:
>>>>>>
>>>>>> a) the SDE: dS(t) = S(t)*mu*dt + S(t)*sigma*dw
>>>>>>
>>>>>> _*and*_
>>>>>>
>>>>>> b) the fact that in QuantLib the PathGenerator.next().value returns the
>>>>>> result
>>>>>> of the SDE expression _*without *_applying the ITO correction associated
>>>>>> with
>>>>>> the fact that dt is not infinitesimal.
>>>>>>
>>>>>> The b) is also responsible for the lack of convergence in your output
>>>>>> towards
>>>>>> the theoretical target of 68.27%
>>>>>>
>>>>>> You would get the correct convergence if you modified your code by using
>>>>>> the
>>>>>> expressions below:
>>>>>>
>>>>>>      const double m1SD = S * exp(mu * t - 0.5*sigma^2*t + sigma * sqrt(t) *
>>>>>> -1.0);     // Sx at -1SD
>>>>>>      const double p1SD = S * exp(mu * t - 0.5*sigma^2*t + sigma * sqrt(t) *
>>>>>> 1.0);     // Sx at +1SD
>>>>>>
>>>>>>      new GeometricBrownianMotionProcess(S, mu - 0.5*sigma^2*t, sigma) (as
>>>>>> Peter also pointed out)
>>>>>>
>>>>>> Again, due to b) one can produce the correct simulated values of a GBM
>>>>>> diffused quantity (such as a stock price in the GBM model) by using N time
>>>>>> steps, with N very large. Using N = 1 (like in your example), the simulated
>>>>>> values will still be lognormally distributed (whence your good result with
>>>>>> N =
>>>>>> 1), but will be centered at a wrong mean and thus will _*fail *_to
>>>>>> represent
>>>>>> the correct values expected by the GBM SDE.
>>>>>>
>>>>>> Ioannis
>>>>>>
>>>>>> On 9/10/2023 11:29 AM, U.Mutlu wrote:
>>>>>>> As said in other posting here, after fixing the test program
>>>>>>> by skipping the first item (the initial price) in the
>>>>>>> generated sample path, BoxMullerGaussianRng now passes the said test.
>>>>>>>
>>>>>>> The bugfixed test code and the new test results can be found here:
>>>>>>> https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-4#post-5861666
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> That important fact that the generated sample path contains
>>>>>>> not timeSteps elements but 1 + timeSteps elements needs
>>>>>>> to be documented in the library doc.
>>>>>>> For example on this API doc page one normally would expect
>>>>>>> to find such an important information, but it's missing:
>>>>>>> https://www.quantlib.org/reference/class_quant_lib_1_1_path_generator.html
>>>>>>>
>>>>>>> If you or someone else can change/extend the test program by using
>>>>>>> the suggested alternative(s) to BoxMullerGaussianRng, I would be happy
>>>>>>> to hear about it. Thx.
>>>>>>>
>>>>>>>
>>>>>>> Ioannis Rigopoulos wrote on 09/09/23 16:28:
>>>>>>>> If you search within the QuantLib code for BoxMullerGaussianRng, you will
>>>>>>>> see
>>>>>>>> it is used only in the experimental folder. It is therefore not
>>>>>>>> surprising if
>>>>>>>> it doesn't produce the expected results.
>>>>>>>>
>>>>>>>> I use myself the MultiPathGenerator with PseudoRandom::rsg_type, which is
>>>>>>>> used
>>>>>>>> extensively in other areas of QuantLib.
>>>>>>>>
>>>>>>>> This type expands to InverseCumulativeRsg< RandomSequenceGenerator<
>>>>>>>> MersenneTwisterUniformRng > , InverseCumulativeNormal > and gives me good
>>>>>>>> results.
>>>>>>>>
>>>>>>>> Ioannis Rigopoulos, founder of deriscope.com