gretl / Feature Requests / #208 GMM estimation of multiple equ. systems

Artur Tarassow - 2024-01-29

Thanks for the ticket, Sven. Indeed, that would be a useful feature.

Sven Schreiber - 2024-02-02

OK, I created a hansl-based sort-of proof of concept (attached).
However, there are some complications here:

1. When it comes to checking the dimensions of the weighting matrix, I believe gretl is not flexible enough. In this two-equation system, I could not use two "orthog" lines, one for each equation. The reason is that with three system-wide instruments (in this case), each orthog line stands for three actual conditions, so the weighting matrix is 6x6. But gretl saw only the two orthog lines and complained about a mismatch. So you need quite verbose orthog lines that do not use lists on the RHS.

2. It seems there is no way to construct the orthog lines programmatically in a more abstract way, which makes it a bit difficult to automate this stuff. This is probably the same situation that we had with the 'system' command until a few years ago. The solution there was to allow an array of lists or something like that. Note that I also tried to use just a single "orthog" line with lists on the left and on the right, like this: "orthog ud us ; const dy dpf", because section 27.2 of the guide says that you can do that. But that also gave an error, so that is perhaps another bug. Allowing a list on the left of the semicolon would probably solve that problem.

3. Not sure, either, about how to construct the N residual series of the system automatically. It says you can use a matrix as the target quantity there, but is that implicitly taken to be a data vector, or is a multi-column matrix also allowed? And what about a list?
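The dimension count behind the weighting-matrix complaint above can be written out explicitly. This is only a sketch in notation that does not appear in the thread: $u_t(\theta)$ (the $n$ residuals at observation $t$), $z_t$ (the $k$ system-wide instruments) and $\bar g_T$ are assumed symbols, and the order in which gretl stacks the conditions internally may differ.

```latex
\bar g_T(\theta) = \frac{1}{T}\sum_{t=1}^{T} u_t(\theta) \otimes z_t \;\in\; \mathbb{R}^{nk},
\qquad
Q(\theta) = \bar g_T(\theta)' \, W \, \bar g_T(\theta),
\qquad
W \in \mathbb{R}^{nk \times nk}
```

With $n = 2$ equations and $k = 3$ instruments this gives $nk = 6$ orthogonality conditions, hence the 6x6 weights matrix, even though only two residual series are involved.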

BTW, I think that the syntax introduction in section 27.2 is missing the part that you also need to provide a formula (assignment) that updates your target variable as a function of the parameters. There it only says you need orthog, weights, params. Further down the rest also becomes clear (sort of), but it's a bit confusing.
Thanks, Sven

sysGMM.inp
- Allin Cottrell - 2024-02-06
  
  Several points merit a reply but right now I'll just deal with one: using a list in an orthog statement. Up till now we've assumed there's only one term on each side of the semicolon in such a statement: a named list would work, but not a "raw" list. However, now in git and snapshots there's a smarter parser which allows for a list in the form of two or more series names -- since in some cases that's more convenient. (We then manufacture an internal named list to keep track of what's wanted.)
  
  - Sven Schreiber - 2024-02-06
    
    Thanks, Allin. Actually, using a named list would have been enough (and of course still must be used with current gretl), it just didn't occur to me to try that.
    The only real remaining question then is number 3.
    
    - Allin Cottrell - 2024-02-06
      
      Number 3 (constructing N residual series): "It says you can use a matrix as the target quantity there". What's the "it" in question and where does it say something about a residuals target, please?
      
      - Sven Schreiber - 2024-02-07
        
        Hi @allin, the context was always section 27.2 (and/or 27.3) of the guide. Specifically: "...must follow the syntax 'orthog x ; Z' where x may be a series, matrix or list of series..."
        Since the first term in the orthog line (the x here) is the stuff that depends on the parameters, I am inferring from that statement that the constructed residuals can also be calculated as a gretl matrix or list.
        And indeed I have now successfully specified the list of residuals (list resids = ud us) in a single orthog line: 'orthog resids ; allinst', so that's progress. I'm still struggling a bit with updating the contained series, because I'm not allowed to use a loop within the gmm block. So I guess I'll have to switch to a matrix-based formulation. I'm working on that.
        
        Allin Cottrell - 2024-02-07
        
        I'm not quite sure what you mean by the "constructed residuals". But the LHS values from an orthog statement are automatically updated in the course of iteration, and on successful completion of a gmm block they hold the final estimates. Here's a series-and-list oriented variant of what I posted on the devel list:
        
        set verbose off
        nulldata 100
        T = $nobs
        k = 3
        n = 2
        set seed 12345
        matrix X = mnormal(T, k)
        matrix Y = mnormal(T, n)
        list LX = mat2list(X, "x")
        list LY = mat2list(Y, "y")
        # OLS
        ols y1 LX --quiet
        u1o = $uhat
        ols y2 LX --quiet
        u2o = $uhat
        list Luo = u1o u2o
        # GMM
        matrix b1 = zeros(k, 1)
        matrix b2 = zeros(k, 1)
        series u1 = 0
        series u2 = 0
        list Lu = u1 u2
        matrix W = I(n*k)
        gmm
            u1 = y1 - lincomb(LX, b1)
            u2 = y2 - lincomb(LX, b2)
            orthog Lu ; LX
            weights W
            params b1 b2
        end gmm
        print Luo Lu --byobs --range=1:10
        
        Sven Schreiber - 2024-02-07
        
        Yes, that's similar to what I have. By "constructed residuals" I mean the u1 and u2 series which are re-calculated (constructed) every time.
        The two generalizations that I'm trying to achieve are:
        
        1. Do the meta-programming for the general case n>2. A loop is not an option because it's forbidden inside gmm. This probably means using matrices.
        
        2. Handle a mix of endogenous and exogenous regressors along with identifying restrictions and so on. In that case not all coefficients are free parameters, so the regressor set with varying coefficients differs across equations.
        
Allin Cottrell - 2024-02-07

I'm going back to the original idea here, which I take to be providing a "method=gmm" option for gretl's "system" command. I'd suggest that it might be more fruitful to figure out, based initially on the current "system" syntax, how one would best go about expressing the information in a system block to facilitate GMM estimation. I mean, in general terms, in pseudo-code if you like -- not necessarily tied to the current syntax for a "gmm" block.

- Sven Schreiber - 2024-02-08
  
  Yes, that's what I have in mind as well. I'm thinking of user input along the lines of section 34.2 of the guide. The required input is basically (i) a list of LHS series and (ii) an array of lists holding the regressor list for each equation. Finally there is (iii) a list of system-wide instruments, or alternatively perhaps (iiiB) a list of system-wide endogenous variables. Item (iii) is not directly mentioned in 34.2 (it probably should be) but only in the previous section 34.1, albeit without an example.
  I think I'm getting there, though. As Jack mentioned elsewhere, writing a helper function that contains a loop circumvents some problems. (I had also tried to build a quasi-block-diagonal matrix with the diagcat function, which was then multiplied with a certain stacked coef vector, but that approach gave numerical issues and non-convergence. So the plain but outsourced loop-based variant is preferred now.)
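The idea of moving the loop into a helper function could look roughly like the following. This is a minimal sketch and not the attached prototype: it uses a matrix of residuals on the left of the orthog line rather than a list of series, the function name sys_resid and the simulated data are made up for illustration, and the coefficients are assumed to be stacked equation after equation in a single vector theta.

```hansl
set verbose off
nulldata 100
set seed 12345

# Hypothetical helper (not from the attached scripts): returns the T x n
# matrix of residuals, given the stacked coefficient vector theta.
function matrix sys_resid (const list Y, const lists X, const matrix theta)
    matrix Ymat = {Y}
    matrix U = zeros(rows(Ymat), nelem(Y))
    scalar pos = 1
    loop i = 1..nelem(Y)
        list Xi = X[i]
        scalar ki = nelem(Xi)
        # equation i uses its own block of theta and its own regressors
        U[,i] = Ymat[,i] - {Xi} * theta[pos:pos+ki-1]
        pos += ki
    endloop
    return U
end function

# simulated data, purely for illustration
series x1 = normal()
series x2 = normal()
series x3 = normal()
series y1 = 1 + 0.5*x1 + normal()
series y2 = -1 + 0.3*x2 - 0.2*x3 + normal()

# user input: LHS list, array of regressor lists, system-wide instruments
list sysLHS = y1 y2
list R1 = const x1
list R2 = const x2 x3
lists sysRHSs = defarray(R1, R2)
list allinst = const x1 x2 x3

matrix theta = zeros(nelem(R1) + nelem(R2), 1)
matrix U = sys_resid(sysLHS, sysRHSs, theta)
matrix W = I(nelem(sysLHS) * nelem(allinst))

gmm
    U = sys_resid(sysLHS, sysRHSs, theta)
    orthog U ; allinst
    weights W
    params theta
end gmm
```

Because each equation picks its own block of theta, the regressor sets may differ across equations, and the number of equations is not hard-coded anywhere in the gmm block.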
  
Sven Schreiber - 2024-02-12

OK, I'm attaching an updated prototype. Given the insight that a loop can be used if it's encapsulated in a function I went back to a series and list-based formulation. (And avoiding a loop proved basically impossible also when working with matrices.)
The user input in this example script is given by the two list types "sysLHS" and "allinst", plus an N-element lists array "sysRHSs". This should already be very close to the programmatic syntax for the system command. An alternative might be to let the user specify the endogenous variables instead of the instruments and back out the instrument list automatically.
What's missing in this prototype, apart from moving everything into functions:

1. Produce sensible starting values automatically, e.g. by initial OLS regressions.

2. Perhaps initialize the weights matrix more cleverly.

3. After the estimation is done, transform the full coefficient vector back to an equation-by-equation representation, and produce some readable estimation output.
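Two of the missing pieces listed above, OLS starting values and splitting the stacked coefficient vector per equation, could be sketched as follows. This is not taken from the attached prototype: the function names ols_start and split_coeffs, the simulated data, and the convention of stacking coefficients equation after equation are all assumptions made for illustration.

```hansl
set verbose off
nulldata 100
set seed 12345

# Hypothetical helper: equation-by-equation OLS, stacked into one column vector
function matrix ols_start (const list Y, const lists X)
    matrix Ymat = {Y}
    matrix theta = {}
    loop i = 1..nelem(Y)
        list Xi = X[i]
        theta |= mols(Ymat[,i], {Xi})
    endloop
    return theta
end function

# Hypothetical helper: cut the stacked vector back into one vector per equation
function matrices split_coeffs (const matrix theta, const lists X)
    matrices B = array(nelem(X))
    scalar pos = 1
    loop i = 1..nelem(X)
        list Xi = X[i]
        scalar ki = nelem(Xi)
        B[i] = theta[pos:pos+ki-1]
        pos += ki
    endloop
    return B
end function

# simulated data, purely for illustration
series x1 = normal()
series x2 = normal()
series y1 = 1 + 0.5*x1 + normal()
series y2 = -1 + 0.3*x2 + normal()

list sysLHS = y1 y2
list R1 = const x1
list R2 = const x2
lists sysRHSs = defarray(R1, R2)

matrix theta0 = ols_start(sysLHS, sysRHSs)
matrices B = split_coeffs(theta0, sysRHSs)
print theta0
eval B[1]
eval B[2]
```

The same split_coeffs call could be applied to the final estimates after the gmm block, as a first step towards per-equation output.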

At least this example works for me.
cheers, sven

sysGMM_lists.inp
- Allin Cottrell - 2024-02-13
  
  Nice job getting this working. I notice that the coefficient estimates are identical with equation-by-equation tsls. That's correct, of course, given the setup. But I wonder: how, under the GMM approach, would one go about constructing a true "system" estimator which takes into account covariance of the errors across the equations?
  
  - Sven Schreiber - 2024-02-13
    
    My more or less spontaneous reaction is that it should differ in the overidentified case, shouldn't it? In that sense this just-identified example is probably not optimal. And do I understand correctly that the weights matrix is updated automatically in the gmm block, generally speaking?
    
  - Riccardo "Jack" Lucchetti - 2024-02-13
    
    > I notice that the coefficient estimates are identical with equation-by-equation tsls. That's correct, of course, given the setup.
    
    Exact identification is also an essential ingredient. If you add a randomly generated instrument, the equality vanishes.
    
    > But I wonder: how, under the GMM approach, would one go about constructing a true "system" estimator which takes into account covariance of the errors across the equations?
    
    I would have thought that iterated GMM would do the trick, but I'm getting results that differ from 3sls.
    
    - Allin Cottrell - 2024-02-13
      
      Hmm. I'm pretty confident about our one- and two-step GMM implementations, but not sure I'd go to the stake claiming our iterated GMM is irreproachable. Maybe something to look into.
      
      - Sven Schreiber - 2024-02-14
        
        Right now I cannot run the system comparison, but why should iterated GMM be exactly equivalent to 3sls?
        
        Allin Cottrell - 2024-02-14
        
        I'm not sure if that's quite right (Jack?), but at any rate Davidson and MacKinnon are clear that 3sls can be represented as a GMM estimator.
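For reference, the textbook representation of 3SLS as a GMM estimator can be sketched as follows. The notation is assumed, not taken from the thread: the $n$ equations are stacked as $y = X\beta + u$ with $X = \mathrm{blockdiag}(X_1, \dots, X_n)$, $Z$ is the $T \times k$ matrix of system-wide instruments, $P_Z = Z(Z'Z)^{-1}Z'$, and $\hat\Sigma$ is an estimate of the $n \times n$ cross-equation error covariance (typically from 2SLS residuals).

```latex
\begin{aligned}
\hat\beta_{GMM}(W) &= \arg\min_{\beta}\; u(\beta)'\,(I_n \otimes Z)\, W \,(I_n \otimes Z')\, u(\beta),
\qquad u(\beta) = y - X\beta \\
W = \bigl(\hat\Sigma \otimes Z'Z\bigr)^{-1}
\;&\Rightarrow\;
\hat\beta_{3SLS} = \bigl[X'(\hat\Sigma^{-1} \otimes P_Z)X\bigr]^{-1} X'(\hat\Sigma^{-1} \otimes P_Z)\,y
\end{aligned}
```

Replacing $\hat\Sigma$ by $I_n$ gives equation-by-equation 2SLS, and under exact identification the choice of $W$ does not affect the estimates at all. A weights matrix estimated without imposing this Kronecker structure, as a generic two-step or iterated update would do (whether gretl's update works that way is an assumption here), need not reproduce 3SLS exactly in finite samples.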
        
        Sven Schreiber - 2024-02-14
        
        Or did Jack mean to apply iteration for both?
        Anyway, yet another incarnation is attached, where I changed the system to be overidentified, and the "system" block formulation is also added for comparison.
        I'm never getting exactly identical estimates, but they (3sls, iterated or not, and iterated gmm) are close enough I think.
        BTW, I guess the --two-step and --iterate options are mutually exclusive for gmm? But I'm not getting an error.
        
        sysGMM_lists.inp
        
Sven Schreiber - 2024-03-11

First of all, for the record, the --two-step and --iterate options are now officially incompatible and their combination is rejected by gretl.
Secondly, as a reminder for the future, the gmm approach to system estimation would especially be useful for nonlinear systems, because the other estimators aren't set up for this (in gretl).

Sven Schreiber - 2024-11-05

OK, so what do we want to do with this now?

1. Take this hansl code (posted above) as a template for natively adding a GMM estimator option to gretl's standard 'system' setup?

2. Instead package up the hansl code as a contributed offering, without pursuing the native implementation?

3. And/or focus on a hansl-based extension to non-linear systems (which aren't covered by native 'system')?
- Artur Tarassow - 2024-11-09
  
  I guess the quickest way is to write a user-contributed package. One could postpone a native implementation, I think.
  
  The extension to non-linear systems would be nice, but may be less demanding now.
  
Artur Tarassow - 2024-12-30

Your code, @svetosch, is a good starting point for a package. I've started to build a package based on it. The project can be found here:

https://github.com/atecon/gmmsys/tree/main/src

Relevant files are:

Functions: https://github.com/atecon/gmmsys/blob/main/src/gmmsys.inp

Sample script: https://github.com/atecon/gmmsys/blob/main/src/gmmsys_sample.inp

I can replicate Sven's example. I've also added many post-estimation statistics to the printout such as R² etc. as you will see. The package is not finished yet but hopefully a good starting point to progress.

If anyone is interested in developing the package jointly, I can "invite" you to the repo. Please let me know.

Best
Artur

Sven Schreiber - 2026-01-18

Hi @atecon, sorry I didn't react to your message about a year ago -- yes please "invite" me officially to this github repo for gmmsys. thanks

- Artur Tarassow - 2026-01-21
  
  Hi, I just sent you a GitHub invitation.
  
  - Sven Schreiber - 2026-03-02
    
    Sorry, I think I didn't react, and it's expired now - can you re-invite me? thanks
    