Re: [Quantlib-users] QuantLib's GBM found flawed. What say the experts? (SOLVED)

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Ioannis Rigopoulos wrote on 09/11/23 13:43:
> Could you please send us a link to a file that contains the simulated values
> produced by your program when you use one time step?
>
> Since you use 1,000,000 simulations, the file should contain 1,000,000 numbers.

On the following web page you can find download links to even 2 such
GBM data sets I just have created and uploaded to a server:
https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-5#post-5861992

"
2 GBM generated test files in CSV format (with header line; see there),
with 1,000,000 generated stock prices (S0=100, IV=30, t=1y, timeSteps=1, r=0, 
q=0),
each about 29 MB uncompressed (about 12 MB zip-compressed).

This GBM generator does not use Ito's lemma.

...
"

> On 9/11/2023 1:33 PM, U.Mutlu wrote:
>> Ioannis Rigopoulos wrote on 09/11/23 11:47:
>> > By the way, this is how my histogram of S(t) (for t = 1) simulated values
>> > looks like when I used 100,000 time steps and 100,000 simulations. It is
>> > clearly lognormal in shape.
>>
>> Yes, it's lognormally. My values are lognormlly as well,
>> You just restate the obvious fact.
>>
>> > I am sure, if you do the same when time steps = 1,
>> > you will get a normal distribution for S(t), which would prove
>> > that time steps = 1 does not reproduce the theoretically expected
>> > values for S(t).
>>
>> Not true, simply from the fact that the set boundary is lognormally:
>>   -1SD: 74.08182206817179
>>   +1SD: 134.9858807576003
>> Ie. it's not symmetric (Gaussian, normally distributed) around the
>> initial mean (100), but is lognormally (and therefore unsymmetric).
>>
>> FYI: in my other test program (that does not use QuantLib)
>> I get in such simulations exact results almost identical
>> to the expected theoretical results! So, your math and arguing is wrong, IMO.
>>
>> You are too much fixated on timeSteps=1. Maybe you have
>> a much different understanding of what timeSteps really
>> means in practice. It's _not_ something like the number
>> of simulations that you simply can change and expect
>> to get the same result; no, it has a clearly defined context
>> and gives different results for different timeSteps.
>>
>> And you haven't answered the question why you think it has
>> to be 68.27% for all timeSteps (which IMO is wrong).
>>
>> For those interested, here are more details about the above said:
>> https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-3#post-5849548
>>
>>
>> And, I today did a brief summary of these ongoing discussions on GBM here:
>> https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-5#post-5861919
>>
>>
>>
>> Ioannis Rigopoulos wrote on 09/11/23 11:47:
>>> Our SDE for the evolution of S is:
>>>
>>> dS = μSdt + σSdw
>>>
>>> In your code you use:
>>>
>>>
>>> PathGenerator<RandomSequenceGenerator<MersenneBoxMuller>>
>>> gbmPathGenerator(gbm, t, timeSteps, gsg, false);
>>> const Path& samplePath = gbmPathGenerator.next().value;
>>>
>>> The PathGenerator::next().value is designed to return an array of numbers
>>> {S(T₁) , S(T₂) , ... , S(Tᵤ)}
>>>
>>> where
>>>
>>> u := timeSteps
>>>
>>> Δt := t/u
>>>
>>> Tᵢ := iΔt for i = 1,...,u
>>>
>>> It follows:
>>>
>>> Tᵤ = t = the final time when we are interested to know the simulated values of
>>> S (in your test you use t = 1).
>>>
>>> QuantLib builds the sequence {S(T₁) , S(T₂) , ... , S(Tᵤ)} incrementally
>>> through a total number of u steps by adding the quantity ΔS := μSΔt + σSΔw on
>>> the previous member of the sequence, with Δw := random number drawn from
>>> N~(0,Δt¹ᐟ²).
>>>
>>> The previous member of S(T₁) is assumed the T₀ = 0 and the initial value S(T₀)
>>> = S(0) is assumed deterministic and known.
>>>
>>> So, starting with the given S(T₀), QuantLib calculates the S(T₁) using the
>>> formula:
>>>
>>> S(T₁) := S(T₀) + μS(T₀)Δt + σS(T₀)Δw
>>>
>>> Then given the S(T₁) as calculated above, it proceeds to calculate S(T₂) as:
>>>
>>> S(T₂) := S(T₁) + μS(T₁)Δt + σS(T₁)Δw
>>>
>>> For simplicity I have used the same notation for the Δw in the equations
>>> above, but they are different.
>>>
>>> After u steps, the final value S(t) = S(Tᵤ) is calculated as:
>>>
>>> S(t) = S(Tᵤ) := S(Tᵤ₋₁) + μS(Tᵤ₋₁)Δt + σS(Tᵤ₋₁)Δw
>>>
>>> All equations above hold only in an approximate sense when Δt is finite.
>>>
>>> They are accurate only in a limiting sense as Δt → 0.
>>>
>>> It follows that the value S(t) will be calculated correctly only if Δt → 0,
>>> which is equivalent to u → ∞
>>>
>>> If you use u = 1 (i.e. one time step), you will still match the 68.27% rule
>>> when the 1SD interval is defined in accordance with a normal distribution
>>> because S(T₁) := S(T₀) + μS(T₀)Δt + σS(T₀)Δw corresponds to a normal
>>> distribution and if you build a histogram of all simulated values you will
>>> notice that S(T₁) is distributed normally, which is clearly not consistent
>>> with the SDE that predicts lognormal distribution.
>>>
>>> The correct procedure is of course to use a very large u and define the 1SD
>>> interval in accordance with a lognormal distribution, as I described in my
>>> previous email.
>>>
>>> Then you will get convergence to the 68.27% rule.
>>>
>>> By the way, this is how my histogram of S(t) (for t = 1) simulated values
>>> looks like when I used 100,000 time steps and 100,000 simulations. It is
>>> clearly lognormal in shape. I am sure, if you do the same when time steps = 1,
>>> you will get a normal distribution for S(t), which would prove that time steps
>>> = 1 does not reproduce the theoretically expected values for S(t).
>>>
>>> Ioannis
>>>
>>> On 9/11/2023 10:19 AM, U.Mutlu wrote:
>>>> Dear Ioannis Rigopoulos, I now analysed your table and
>>>> must say that you seem to have a much different expectation to,
>>>> and understanding of that table.
>>>>
>>>> The  values in your table for timeSteps > 1 can definitely
>>>> not be correct, b/c why do you think it has to be 68.27 for all timeSteps?
>>>> This is illogical b/c the more timeSteps the more the hitrate%
>>>> must be, like in my table. It saturates at about 84.93%
>>>>
>>>> Sorry, I have to retract my initial enthusiastic judment
>>>> of your work below, as it's fundamentally flawed & wrong,
>>>> and I should have been more careful when I quickly & briefly
>>>> inspected your posting in the early morning here.
>>>>
>>>> Since your results in the table for timeSteps > 1 are wrong,
>>>> then your formulas must be wrong too, so I had to stop my analysis right
>>>> there.
>>>>
>>>> Please just answer why you think it has to be 68.27% for all timeSteps?
>>>>
>>>>
>>>> U.Mutlu wrote on 09/11/23 08:36:
>>>>> Mr. Ioannis Rigopoulos, that's a brilliant work by you. Thank you very much.
>>>>> I need some time to study and verify the reults, but it looks very good
>>>>> to solve the confusion and mystery I had experienced when I started this
>>>>> research some weeks ago.
>>>>> I'll try to replicate your results and will report later in detail here.
>>>>>
>>>>> Ioannis Rigopoulos wrote on 09/10/23 19:29:
>>>>>  > On a different topic, I do not understand your statement "/I've never
>>>>> seen
>>>>>  > that Sx@-1SD and Sx@+1SD get computed with Ito's lemma applied/".
>>>>>  > What do the Sx@-1SD and Sx@+1SD mean? I suppose m -1SD and m +1SD, where
>>>>>
>>>>> Sorry for the consfusion, it just means the stock price (Sx after time t,
>>>>> ie. S(t) or S_t) for the lower boundary at -1SD and for the upper
>>>>> boundary at
>>>>> +1SD,
>>>>> ie. your "min of S(t)" and "max of S(t)" in your other posting.
>>>>> In future I'll use a more common term like yours, sorry.
>>>>>
>>>>> Thx
>>>>>
>>>>>
>>>>> Ioannis Rigopoulos wrote on 09/10/23 19:29:
>>>>>> Yes, I mean mu * t - 0.5 * sigma * sigma * t
>>>>>>
>>>>>> Are you sure, you also adjusted the boundaries corresponding to plus and
>>>>>> minus
>>>>>> 1 SD?
>>>>>>
>>>>>> I ran an Excel spreadsheet by modifying a spreadsheet I had published
>>>>>> several
>>>>>> years ago at https://blog.deriscope.com/index.php/en/excel-value-at-risk
>>>>>> that
>>>>>> uses the GeometricBrownianMotionProcess , albeit with the
>>>>>> PseudoRandom::rsg_type generator.
>>>>>>
>>>>>> At any case, I got convergence to 68.27% as I increased the number of time
>>>>>> steps and keeping the number of simulations fixed to 100,000, provided I
>>>>>> used
>>>>>> the correct (Ito-adjusted) means, as below:
>>>>>>
>>>>>> As you see above, the "No Ito-adjusted" boundaries are 74.08 and 134.99,
>>>>>> just
>>>>>> like those you use.
>>>>>>
>>>>>> The Ito-adjusted boundaries are 70.82 and 129.05 produced by shifting the
>>>>>> no-adjusted boundaries - in log terms - to the left by 0.5*sigma^2*t.
>>>>>>
>>>>>> The simulation results (percentage of values being within the min and max
>>>>>> boundaries) are below:
>>>>>>
>>>>>> As you see, the Ito-adjusted percentage reaches 68.30% for 100,000 time
>>>>>> steps,
>>>>>> which is close to the expected 68.27%.
>>>>>>
>>>>>> The No Ito-adjusted percentage seems instead to converge to a different
>>>>>> value
>>>>>> around 67.73%
>>>>>>
>>>>>> On a different topic, I do not understand your statement "/I've never seen
>>>>>> that Sx@-1SD and Sx@+1SD get computed with Ito's lemma applied/".
>>>>>>
>>>>>> What do the Sx@-1SD and Sx@+1SD mean? I suppose m -1SD and m +1SD, where
>>>>>> m is
>>>>>> the mean of a normally distributed variable. This means that one needs to
>>>>>> calculate m in order to do the mentioned computation.
>>>>>>
>>>>>> If your starting point is an SDE like dS = S*mu*dt + S*sigma*dw, then
>>>>>> ln(S(t))
>>>>>> at any time t will be normally distributed with mean m = ln(S(0)) + mu*t -
>>>>>> 0.5*sigma^2*t , where the last term is a consequence of the Ito lemma.
>>>>>>
>>>>>> Do you have any reason to believe that the Ito lemma is somehow not
>>>>>> valid so
>>>>>> that m = ln(S(0)) + mu*t?
>>>>>>
>>>>>> Ioannis
>>>>>>
>>>>>> On 9/10/2023 3:00 PM, U.Mutlu wrote:
>>>>>>
>>>>>>> Ioannis Rigopoulos wrote on 09/10/23 13:06:
>>>>>>> > new GeometricBrownianMotionProcess(S, mu - 0.5*sigma^2*t, sigma)
>>>>>>>
>>>>>>> Thank you for your analysis.
>>>>>>> I guess you mean "mu * t - 0.5 * sigma * sigma * t",
>>>>>>> though in our example with t=1 it doesn't make any difference.
>>>>>>> It gives for timeSteps=1 this result:
>>>>>>>
>>>>>>> $ ./Testing_GBM_of_QuantLib.exe 1
>>>>>>> timeSteps=1 nRuns=1000000 seed=1694350548 S=100.000000 mu=0.000000
>>>>>>> sigma=0.300000 t=1.000000 : nGenAll=1000000 USE_ITO=1 : -1SD=70.822035
>>>>>>> +1SD=129.046162
>>>>>>> cHit=663067/1000000(66.307%)
>>>>>>>
>>>>>>> This is far from the expected value of 68.2689% for timeStep=1.
>>>>>>> So, there must be a bug somewhere. IMO it's our suspect friend Ito :-)
>>>>>>>
>>>>>>> Cf. also the following showing similar results for GBM,
>>>>>>> which then was compared to the standard lognormal calculation:
>>>>>>> https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-3#post-5849548
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> And, honestly, I've never seen that Sx@-1SD and Sx@+1SD get computed with
>>>>>>> Ito's lemma applied.
>>>>>>>
>>>>>>>
>>>>>>> Ioannis Rigopoulos wrote on 09/10/23 13:06:
>>>>>>>> Thank you for testing the BoxMullerGaussianRng code, which - as I wrote -
>>>>>>>> does
>>>>>>>> not seem to be used in other areas of the standard part of QuantLib.
>>>>>>>>
>>>>>>>> Your numerical results below ...
>>>>>>>>
>>>>>>>> indicate that the simulated values for ln(S(t)) produced with
>>>>>>>> timeSteps = 1
>>>>>>>> are very likely normally distributed with mean (I use your notation)
>>>>>>>> ln(S(0))
>>>>>>>> + mu*t and standard deviation sigma * sqrt(t)
>>>>>>>>
>>>>>>>> This result _*is consistent*_ with:
>>>>>>>>
>>>>>>>> a) the SDE: dS(t) = S(t)*mu*dt + S(t)*sigma*dw
>>>>>>>>
>>>>>>>> _*and*_
>>>>>>>>
>>>>>>>> b) the fact that in QuantLib the PathGenerator.next().value returns the
>>>>>>>> result
>>>>>>>> of the SDE expression _*without *_applying the ITO correction associated
>>>>>>>> with
>>>>>>>> the fact that dt is not infinitesimal.
>>>>>>>>
>>>>>>>> The b) is also responsible for the lack of convergence in your output
>>>>>>>> towards
>>>>>>>> the theoretical target of 68.27%
>>>>>>>>
>>>>>>>> You would get the correct convergence if you modified your code by using
>>>>>>>> the
>>>>>>>> expressions below:
>>>>>>>>
>>>>>>>>      const double m1SD = S * exp(mu * t - 0.5*sigma^2*t + sigma *
>>>>>>>> sqrt(t) *
>>>>>>>> -1.0);     // Sx at -1SD
>>>>>>>>      const double p1SD = S * exp(mu * t - 0.5*sigma^2*t + sigma *
>>>>>>>> sqrt(t) *
>>>>>>>> 1.0);     // Sx at +1SD
>>>>>>>>
>>>>>>>>      new GeometricBrownianMotionProcess(S, mu - 0.5*sigma^2*t, sigma) (as
>>>>>>>> Peter also pointed out)
>>>>>>>>
>>>>>>>> Again, due to b) one can produce the correct simulated values of a GBM
>>>>>>>> diffused quantity (such as a stock price in the GBM model) by using N
>>>>>>>> time
>>>>>>>> steps, with N very large. Using N = 1 (like in your example), the
>>>>>>>> simulated
>>>>>>>> values will still be lognormally distributed (whence your good result
>>>>>>>> with
>>>>>>>> N =
>>>>>>>> 1), but will be centered at a wrong mean and thus will _*fail *_to
>>>>>>>> represent
>>>>>>>> the correct values expected by the GBM SDE.
>>>>>>>>
>>>>>>>> Ioannis
>>>>>>>>
>>>>>>>> On 9/10/2023 11:29 AM, U.Mutlu wrote:
>>>>>>>>> As said in other posting here, after fixing the test program
>>>>>>>>> by skipping the first item (the initial price) in the
>>>>>>>>> generated sample path, BoxMullerGaussianRng now passes the said test.
>>>>>>>>>
>>>>>>>>> The bugfixed test code and the new test results can be found here:
>>>>>>>>> https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-4#post-5861666
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> That important fact that the generated sample path contains
>>>>>>>>> not timeSteps elements but 1 + timeSteps elements needs
>>>>>>>>> to be documented in the library doc.
>>>>>>>>> For example on this API doc page one normally would expect
>>>>>>>>> to find such an important information, but it's missing:
>>>>>>>>> https://www.quantlib.org/reference/class_quant_lib_1_1_path_generator.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If you or someone else can change/extend the test program by using
>>>>>>>>> the suggested alternative(s) to BoxMullerGaussianRng, I would be happy
>>>>>>>>> to hear about it. Thx.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Ioannis Rigopoulos wrote on 09/09/23 16:28:
>>>>>>>>>> If you search within the QuantLib code for BoxMullerGaussianRng, you
>>>>>>>>>> will
>>>>>>>>>> see
>>>>>>>>>> it is used only in the experimental folder. It is therefore not
>>>>>>>>>> surprising if
>>>>>>>>>> it doesn't produce the expected results.
>>>>>>>>>>
>>>>>>>>>> I use myself the MultiPathGenerator with PseudoRandom::rsg_type,
>>>>>>>>>> which is
>>>>>>>>>> used
>>>>>>>>>> extensively in other areas of QuantLib.
>>>>>>>>>>
>>>>>>>>>> This type expands to InverseCumulativeRsg< RandomSequenceGenerator<
>>>>>>>>>> MersenneTwisterUniformRng > , InverseCumulativeNormal > and gives me
>>>>>>>>>> good
>>>>>>>>>> results.
>>>>>>>>>>
>>>>>>>>>> Ioannis Rigopoulos, founder of deriscope.com