Re: [Quantlib-users] QuantLib's GBM found flawed. What say the experts? (SOLVED)

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Thank you U.Mutlu.

The histograms indicate that the numbers represent indeed lognormal 
distributions.

The average of all numbers in file GBM_ts1_A amounts to *104.5935229*

The average of all numbers in file GBM_ts1_B amounts to *104.6047793*

Both data sets exhibit the correct log standard deviation, very close to 
*0.3*.

*Since you have assumed a drift of 0, the SDE represents a martingale 
and therefore the average stock price at t=1 owes to be very close to 100.*

*This proves that the simulated numbers cannot represent the GBM 
diffusion of a stock price starting at 100 and having drift = 0.*

Furthermore, the quantity S(0)exp(0.5*sigma^2*t) equals *104.602786*, 
which is extremely close to the above averages and indicates that the 
Ito term 0.5*sigma^2*t is embedded in your results, albeit in a wrong 
fashion, since - as mentioned above - it should not affect the average 
of S(t).

It seems that you have set up your code in such a way that ln(S) is 
driftless and therefore S acquires a positive Ito drift of 0.5*sigma^2.

It also seems that the number of time steps are much greater than 1, 
otherwise the distribution would appear as normal.

At any case, since I cannot run your own code and verify the results, I 
will finish my feedback with this email.

Ioannis

On 9/11/2023 4:19 PM, U.Mutlu wrote:
> Ioannis Rigopoulos wrote on 09/11/23 13:43:
>> Could you please send us a link to a file that contains the simulated 
>> values
>> produced by your program when you use one time step?
>>
>> Since you use 1,000,000 simulations, the file should contain 
>> 1,000,000 numbers.
>
> On the following web page you can find download links to even 2 such
> GBM data sets I just have created and uploaded to a server:
> https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-5#post-5861992 
>
>
> "
> 2 GBM generated test files in CSV format (with header line; see there),
> with 1,000,000 generated stock prices (S0=100, IV=30, t=1y, 
> timeSteps=1, r=0, q=0),
> each about 29 MB uncompressed (about 12 MB zip-compressed).
>
> This GBM generator does not use Ito's lemma.
>
> ...
> "
>
>
>> On 9/11/2023 1:33 PM, U.Mutlu wrote:
>>> Ioannis Rigopoulos wrote on 09/11/23 11:47:
>>> > By the way, this is how my histogram of S(t) (for t = 1) simulated 
>>> values
>>> > looks like when I used 100,000 time steps and 100,000 simulations. 
>>> It is
>>> > clearly lognormal in shape.
>>>
>>> Yes, it's lognormally. My values are lognormlly as well,
>>> You just restate the obvious fact.
>>>
>>> > I am sure, if you do the same when time steps = 1,
>>> > you will get a normal distribution for S(t), which would prove
>>> > that time steps = 1 does not reproduce the theoretically expected
>>> > values for S(t).
>>>
>>> Not true, simply from the fact that the set boundary is lognormally:
>>>   -1SD: 74.08182206817179
>>>   +1SD: 134.9858807576003
>>> Ie. it's not symmetric (Gaussian, normally distributed) around the
>>> initial mean (100), but is lognormally (and therefore unsymmetric).
>>>
>>> FYI: in my other test program (that does not use QuantLib)
>>> I get in such simulations exact results almost identical
>>> to the expected theoretical results! So, your math and arguing is 
>>> wrong, IMO.
>>>
>>> You are too much fixated on timeSteps=1. Maybe you have
>>> a much different understanding of what timeSteps really
>>> means in practice. It's _not_ something like the number
>>> of simulations that you simply can change and expect
>>> to get the same result; no, it has a clearly defined context
>>> and gives different results for different timeSteps.
>>>
>>> And you haven't answered the question why you think it has
>>> to be 68.27% for all timeSteps (which IMO is wrong).
>>>
>>> For those interested, here are more details about the above said:
>>> https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-3#post-5849548 
>>>
>>>
>>>
>>> And, I today did a brief summary of these ongoing discussions on GBM 
>>> here:
>>> https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-5#post-5861919 
>>>
>>>
>>>
>>>
>>> Ioannis Rigopoulos wrote on 09/11/23 11:47:
>>>> Our SDE for the evolution of S is:
>>>>
>>>> dS = μSdt + σSdw
>>>>
>>>> In your code you use:
>>>>
>>>>
>>>> PathGenerator<RandomSequenceGenerator<MersenneBoxMuller>>
>>>> gbmPathGenerator(gbm, t, timeSteps, gsg, false);
>>>> const Path& samplePath = gbmPathGenerator.next().value;
>>>>
>>>> The PathGenerator::next().value is designed to return an array of 
>>>> numbers
>>>> {S(T₁) , S(T₂) , ... , S(Tᵤ)}
>>>>
>>>> where
>>>>
>>>> u := timeSteps
>>>>
>>>> Δt := t/u
>>>>
>>>> Tᵢ := iΔt for i = 1,...,u
>>>>
>>>> It follows:
>>>>
>>>> Tᵤ = t = the final time when we are interested to know the 
>>>> simulated values of
>>>> S (in your test you use t = 1).
>>>>
>>>> QuantLib builds the sequence {S(T₁) , S(T₂) , ... , S(Tᵤ)} 
>>>> incrementally
>>>> through a total number of u steps by adding the quantity ΔS := μSΔt 
>>>> + σSΔw on
>>>> the previous member of the sequence, with Δw := random number drawn 
>>>> from
>>>> N~(0,Δt¹ᐟ²).
>>>>
>>>> The previous member of S(T₁) is assumed the T₀ = 0 and the initial 
>>>> value S(T₀)
>>>> = S(0) is assumed deterministic and known.
>>>>
>>>> So, starting with the given S(T₀), QuantLib calculates the S(T₁) 
>>>> using the
>>>> formula:
>>>>
>>>> S(T₁) := S(T₀) + μS(T₀)Δt + σS(T₀)Δw
>>>>
>>>> Then given the S(T₁) as calculated above, it proceeds to calculate 
>>>> S(T₂) as:
>>>>
>>>> S(T₂) := S(T₁) + μS(T₁)Δt + σS(T₁)Δw
>>>>
>>>> For simplicity I have used the same notation for the Δw in the 
>>>> equations
>>>> above, but they are different.
>>>>
>>>> After u steps, the final value S(t) = S(Tᵤ) is calculated as:
>>>>
>>>> S(t) = S(Tᵤ) := S(Tᵤ₋₁) + μS(Tᵤ₋₁)Δt + σS(Tᵤ₋₁)Δw
>>>>
>>>> All equations above hold only in an approximate sense when Δt is 
>>>> finite.
>>>>
>>>> They are accurate only in a limiting sense as Δt → 0.
>>>>
>>>> It follows that the value S(t) will be calculated correctly only if 
>>>> Δt → 0,
>>>> which is equivalent to u → ∞
>>>>
>>>> If you use u = 1 (i.e. one time step), you will still match the 
>>>> 68.27% rule
>>>> when the 1SD interval is defined in accordance with a normal 
>>>> distribution
>>>> because S(T₁) := S(T₀) + μS(T₀)Δt + σS(T₀)Δw corresponds to a normal
>>>> distribution and if you build a histogram of all simulated values 
>>>> you will
>>>> notice that S(T₁) is distributed normally, which is clearly not 
>>>> consistent
>>>> with the SDE that predicts lognormal distribution.
>>>>
>>>> The correct procedure is of course to use a very large u and define 
>>>> the 1SD
>>>> interval in accordance with a lognormal distribution, as I 
>>>> described in my
>>>> previous email.
>>>>
>>>> Then you will get convergence to the 68.27% rule.
>>>>
>>>> By the way, this is how my histogram of S(t) (for t = 1) simulated 
>>>> values
>>>> looks like when I used 100,000 time steps and 100,000 simulations. 
>>>> It is
>>>> clearly lognormal in shape. I am sure, if you do the same when time 
>>>> steps = 1,
>>>> you will get a normal distribution for S(t), which would prove that 
>>>> time steps
>>>> = 1 does not reproduce the theoretically expected values for S(t).
>>>>
>>>> Ioannis
>>>>
>>>> On 9/11/2023 10:19 AM, U.Mutlu wrote:
>>>>> Dear Ioannis Rigopoulos, I now analysed your table and
>>>>> must say that you seem to have a much different expectation to,
>>>>> and understanding of that table.
>>>>>
>>>>> The  values in your table for timeSteps > 1 can definitely
>>>>> not be correct, b/c why do you think it has to be 68.27 for all 
>>>>> timeSteps?
>>>>> This is illogical b/c the more timeSteps the more the hitrate%
>>>>> must be, like in my table. It saturates at about 84.93%
>>>>>
>>>>> Sorry, I have to retract my initial enthusiastic judment
>>>>> of your work below, as it's fundamentally flawed & wrong,
>>>>> and I should have been more careful when I quickly & briefly
>>>>> inspected your posting in the early morning here.
>>>>>
>>>>> Since your results in the table for timeSteps > 1 are wrong,
>>>>> then your formulas must be wrong too, so I had to stop my analysis 
>>>>> right
>>>>> there.
>>>>>
>>>>> Please just answer why you think it has to be 68.27% for all 
>>>>> timeSteps?
>>>>>
>>>>>
>>>>> U.Mutlu wrote on 09/11/23 08:36:
>>>>>> Mr. Ioannis Rigopoulos, that's a brilliant work by you. Thank you 
>>>>>> very much.
>>>>>> I need some time to study and verify the reults, but it looks 
>>>>>> very good
>>>>>> to solve the confusion and mystery I had experienced when I 
>>>>>> started this
>>>>>> research some weeks ago.
>>>>>> I'll try to replicate your results and will report later in 
>>>>>> detail here.
>>>>>>
>>>>>> Ioannis Rigopoulos wrote on 09/10/23 19:29:
>>>>>>  > On a different topic, I do not understand your statement 
>>>>>> "/I've never
>>>>>> seen
>>>>>>  > that Sx@-1SD and Sx@+1SD get computed with Ito's lemma applied/".
>>>>>>  > What do the Sx@-1SD and Sx@+1SD mean? I suppose m -1SD and m 
>>>>>> +1SD, where
>>>>>>
>>>>>> Sorry for the consfusion, it just means the stock price (Sx after 
>>>>>> time t,
>>>>>> ie. S(t) or S_t) for the lower boundary at -1SD and for the upper
>>>>>> boundary at
>>>>>> +1SD,
>>>>>> ie. your "min of S(t)" and "max of S(t)" in your other posting.
>>>>>> In future I'll use a more common term like yours, sorry.
>>>>>>
>>>>>> Thx
>>>>>>
>>>>>>
>>>>>> Ioannis Rigopoulos wrote on 09/10/23 19:29:
>>>>>>> Yes, I mean mu * t - 0.5 * sigma * sigma * t
>>>>>>>
>>>>>>> Are you sure, you also adjusted the boundaries corresponding to 
>>>>>>> plus and
>>>>>>> minus
>>>>>>> 1 SD?
>>>>>>>
>>>>>>> I ran an Excel spreadsheet by modifying a spreadsheet I had 
>>>>>>> published
>>>>>>> several
>>>>>>> years ago at 
>>>>>>> https://blog.deriscope.com/index.php/en/excel-value-at-risk
>>>>>>> that
>>>>>>> uses the GeometricBrownianMotionProcess , albeit with the
>>>>>>> PseudoRandom::rsg_type generator.
>>>>>>>
>>>>>>> At any case, I got convergence to 68.27% as I increased the 
>>>>>>> number of time
>>>>>>> steps and keeping the number of simulations fixed to 100,000, 
>>>>>>> provided I
>>>>>>> used
>>>>>>> the correct (Ito-adjusted) means, as below:
>>>>>>>
>>>>>>> As you see above, the "No Ito-adjusted" boundaries are 74.08 and 
>>>>>>> 134.99,
>>>>>>> just
>>>>>>> like those you use.
>>>>>>>
>>>>>>> The Ito-adjusted boundaries are 70.82 and 129.05 produced by 
>>>>>>> shifting the
>>>>>>> no-adjusted boundaries - in log terms - to the left by 
>>>>>>> 0.5*sigma^2*t.
>>>>>>>
>>>>>>> The simulation results (percentage of values being within the 
>>>>>>> min and max
>>>>>>> boundaries) are below:
>>>>>>>
>>>>>>> As you see, the Ito-adjusted percentage reaches 68.30% for 
>>>>>>> 100,000 time
>>>>>>> steps,
>>>>>>> which is close to the expected 68.27%.
>>>>>>>
>>>>>>> The No Ito-adjusted percentage seems instead to converge to a 
>>>>>>> different
>>>>>>> value
>>>>>>> around 67.73%
>>>>>>>
>>>>>>> On a different topic, I do not understand your statement "/I've 
>>>>>>> never seen
>>>>>>> that Sx@-1SD and Sx@+1SD get computed with Ito's lemma applied/".
>>>>>>>
>>>>>>> What do the Sx@-1SD and Sx@+1SD mean? I suppose m -1SD and m 
>>>>>>> +1SD, where
>>>>>>> m is
>>>>>>> the mean of a normally distributed variable. This means that one 
>>>>>>> needs to
>>>>>>> calculate m in order to do the mentioned computation.
>>>>>>>
>>>>>>> If your starting point is an SDE like dS = S*mu*dt + S*sigma*dw, 
>>>>>>> then
>>>>>>> ln(S(t))
>>>>>>> at any time t will be normally distributed with mean m = 
>>>>>>> ln(S(0)) + mu*t -
>>>>>>> 0.5*sigma^2*t , where the last term is a consequence of the Ito 
>>>>>>> lemma.
>>>>>>>
>>>>>>> Do you have any reason to believe that the Ito lemma is somehow not
>>>>>>> valid so
>>>>>>> that m = ln(S(0)) + mu*t?
>>>>>>>
>>>>>>> Ioannis
>>>>>>>
>>>>>>> On 9/10/2023 3:00 PM, U.Mutlu wrote:
>>>>>>>
>>>>>>>> Ioannis Rigopoulos wrote on 09/10/23 13:06:
>>>>>>>> > new GeometricBrownianMotionProcess(S, mu - 0.5*sigma^2*t, sigma)
>>>>>>>>
>>>>>>>> Thank you for your analysis.
>>>>>>>> I guess you mean "mu * t - 0.5 * sigma * sigma * t",
>>>>>>>> though in our example with t=1 it doesn't make any difference.
>>>>>>>> It gives for timeSteps=1 this result:
>>>>>>>>
>>>>>>>> $ ./Testing_GBM_of_QuantLib.exe 1
>>>>>>>> timeSteps=1 nRuns=1000000 seed=1694350548 S=100.000000 mu=0.000000
>>>>>>>> sigma=0.300000 t=1.000000 : nGenAll=1000000 USE_ITO=1 : 
>>>>>>>> -1SD=70.822035
>>>>>>>> +1SD=129.046162
>>>>>>>> cHit=663067/1000000(66.307%)
>>>>>>>>
>>>>>>>> This is far from the expected value of 68.2689% for timeStep=1.
>>>>>>>> So, there must be a bug somewhere. IMO it's our suspect friend 
>>>>>>>> Ito :-)
>>>>>>>>
>>>>>>>> Cf. also the following showing similar results for GBM,
>>>>>>>> which then was compared to the standard lognormal calculation:
>>>>>>>> https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-3#post-5849548 
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> And, honestly, I've never seen that Sx@-1SD and Sx@+1SD get 
>>>>>>>> computed with
>>>>>>>> Ito's lemma applied.
>>>>>>>>
>>>>>>>>
>>>>>>>> Ioannis Rigopoulos wrote on 09/10/23 13:06:
>>>>>>>>> Thank you for testing the BoxMullerGaussianRng code, which - 
>>>>>>>>> as I wrote -
>>>>>>>>> does
>>>>>>>>> not seem to be used in other areas of the standard part of 
>>>>>>>>> QuantLib.
>>>>>>>>>
>>>>>>>>> Your numerical results below ...
>>>>>>>>>
>>>>>>>>> indicate that the simulated values for ln(S(t)) produced with
>>>>>>>>> timeSteps = 1
>>>>>>>>> are very likely normally distributed with mean (I use your 
>>>>>>>>> notation)
>>>>>>>>> ln(S(0))
>>>>>>>>> + mu*t and standard deviation sigma * sqrt(t)
>>>>>>>>>
>>>>>>>>> This result _*is consistent*_ with:
>>>>>>>>>
>>>>>>>>> a) the SDE: dS(t) = S(t)*mu*dt + S(t)*sigma*dw
>>>>>>>>>
>>>>>>>>> _*and*_
>>>>>>>>>
>>>>>>>>> b) the fact that in QuantLib the PathGenerator.next().value 
>>>>>>>>> returns the
>>>>>>>>> result
>>>>>>>>> of the SDE expression _*without *_applying the ITO correction 
>>>>>>>>> associated
>>>>>>>>> with
>>>>>>>>> the fact that dt is not infinitesimal.
>>>>>>>>>
>>>>>>>>> The b) is also responsible for the lack of convergence in your 
>>>>>>>>> output
>>>>>>>>> towards
>>>>>>>>> the theoretical target of 68.27%
>>>>>>>>>
>>>>>>>>> You would get the correct convergence if you modified your 
>>>>>>>>> code by using
>>>>>>>>> the
>>>>>>>>> expressions below:
>>>>>>>>>
>>>>>>>>>      const double m1SD = S * exp(mu * t - 0.5*sigma^2*t + sigma *
>>>>>>>>> sqrt(t) *
>>>>>>>>> -1.0);     // Sx at -1SD
>>>>>>>>>      const double p1SD = S * exp(mu * t - 0.5*sigma^2*t + sigma *
>>>>>>>>> sqrt(t) *
>>>>>>>>> 1.0);     // Sx at +1SD
>>>>>>>>>
>>>>>>>>>      new GeometricBrownianMotionProcess(S, mu - 0.5*sigma^2*t, 
>>>>>>>>> sigma) (as
>>>>>>>>> Peter also pointed out)
>>>>>>>>>
>>>>>>>>> Again, due to b) one can produce the correct simulated values 
>>>>>>>>> of a GBM
>>>>>>>>> diffused quantity (such as a stock price in the GBM model) by 
>>>>>>>>> using N
>>>>>>>>> time
>>>>>>>>> steps, with N very large. Using N = 1 (like in your example), the
>>>>>>>>> simulated
>>>>>>>>> values will still be lognormally distributed (whence your good 
>>>>>>>>> result
>>>>>>>>> with
>>>>>>>>> N =
>>>>>>>>> 1), but will be centered at a wrong mean and thus will _*fail 
>>>>>>>>> *_to
>>>>>>>>> represent
>>>>>>>>> the correct values expected by the GBM SDE.
>>>>>>>>>
>>>>>>>>> Ioannis
>>>>>>>>>
>>>>>>>>> On 9/10/2023 11:29 AM, U.Mutlu wrote:
>>>>>>>>>> As said in other posting here, after fixing the test program
>>>>>>>>>> by skipping the first item (the initial price) in the
>>>>>>>>>> generated sample path, BoxMullerGaussianRng now passes the 
>>>>>>>>>> said test.
>>>>>>>>>>
>>>>>>>>>> The bugfixed test code and the new test results can be found 
>>>>>>>>>> here:
>>>>>>>>>> https://www.elitetrader.com/et/threads/simulating-stock-prices-using-gbm.375533/page-4#post-5861666 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> That important fact that the generated sample path contains
>>>>>>>>>> not timeSteps elements but 1 + timeSteps elements needs
>>>>>>>>>> to be documented in the library doc.
>>>>>>>>>> For example on this API doc page one normally would expect
>>>>>>>>>> to find such an important information, but it's missing:
>>>>>>>>>> https://www.quantlib.org/reference/class_quant_lib_1_1_path_generator.html 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> If you or someone else can change/extend the test program by 
>>>>>>>>>> using
>>>>>>>>>> the suggested alternative(s) to BoxMullerGaussianRng, I would 
>>>>>>>>>> be happy
>>>>>>>>>> to hear about it. Thx.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Ioannis Rigopoulos wrote on 09/09/23 16:28:
>>>>>>>>>>> If you search within the QuantLib code for 
>>>>>>>>>>> BoxMullerGaussianRng, you
>>>>>>>>>>> will
>>>>>>>>>>> see
>>>>>>>>>>> it is used only in the experimental folder. It is therefore not
>>>>>>>>>>> surprising if
>>>>>>>>>>> it doesn't produce the expected results.
>>>>>>>>>>>
>>>>>>>>>>> I use myself the MultiPathGenerator with 
>>>>>>>>>>> PseudoRandom::rsg_type,
>>>>>>>>>>> which is
>>>>>>>>>>> used
>>>>>>>>>>> extensively in other areas of QuantLib.
>>>>>>>>>>>
>>>>>>>>>>> This type expands to InverseCumulativeRsg< 
>>>>>>>>>>> RandomSequenceGenerator<
>>>>>>>>>>> MersenneTwisterUniformRng > , InverseCumulativeNormal > and 
>>>>>>>>>>> gives me
>>>>>>>>>>> good
>>>>>>>>>>> results.
>>>>>>>>>>>
>>>>>>>>>>> Ioannis Rigopoulos, founder of deriscope.com
>
>

-- 
This email has been checked for viruses by Avast antivirus software.
www.avast.com