
#68 accumulation of variables in .RData files

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2012-08-30
Created: 2011-02-24
Creator: Lee Worden
Private: No

via XPJ.

When you load A.RData into B.R and save B.RData for use by C.R, B.RData includes all the variables from both A and B, not just the ones from B. The data from A can be very large, so every later .RData file in the chain grows with it and fills up disk space. The namespace also becomes increasingly polluted by variable names that were used much earlier in the chain of dependencies. For example, C.R might innocently use a variable called "w" without realizing that "w" still holds a value from when it was used as a loop counter in A.R. This can cause hard-to-diagnose bugs.

It's not clear how to implement this, but it might be preferable if only the data created by B.R were saved in B.RData.
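One possible way to get "only data created by B.R" is to record which names exist right after loading A.RData and save only the names that appear afterwards. This is a rough sketch, not how the project currently does it; the file contents and the names `big`, `b_mean`, `b_label`, `before` and `created` are made up for illustration.

```r
# Stand-in for A.RData (hypothetical contents), built in a throwaway scope
# so that nothing leaks into the workspace before the load() below.
a_file <- file.path(tempdir(), "A.RData")
b_file <- file.path(tempdir(), "B.RData")
local({
  big <- rnorm(1000)  # stands in for large upstream data
  w <- 10             # leftover loop counter from A.R
  save(big, w, file = a_file)
})

# --- what B.R might do ---
load(a_file)
before <- ls()        # every name present before B does its own work

b_mean <- mean(big)   # B's own results
b_label <- "summary"

# Names that B introduced: everything new, minus the bookkeeping variable.
created <- setdiff(ls(), c(before, "before"))
save(list = created, file = b_file)

# Inspect what actually ended up in B.RData.
check <- new.env()
saved_names <- sort(load(b_file, envir = check))
print(saved_names)
```

One limitation of comparing names: if B.R modifies a variable that came from A, the name is not new, so the modified value would not be saved in B.RData.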

Discussion

  • Lee Worden

    Lee Worden - 2011-02-24

    I wonder if there's a nice way to do this with "environments":
    http://stat.ethz.ch/R-manual/R-devel/library/base/html/environment.html

    For instance, load A.RData into an environment called "A" so that the variables are accessible but we can leave them out when saving B.RData.
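A minimal sketch of that idea, assuming base R's `load(file, envir = ...)`. The contents of A.RData and the names `big`, `total` and `keep` are invented for the example.

```r
# Stand-in for A.RData (hypothetical contents).
a_file <- file.path(tempdir(), "A.RData")
b_file <- file.path(tempdir(), "B.RData")
local({
  big <- 1:5
  w <- 10   # leftover loop counter from A.R
  save(big, w, file = a_file)
})

# --- what B.R might do ---
A <- new.env()
load(a_file, envir = A)   # A's variables live in A, not in B's workspace

total <- sum(A$big)       # A's data is still reachable through the environment

# Save B's variables but leave out the environment holding A's data.
# (Saving "A" itself would drag all of A's variables along with it.)
keep <- setdiff(ls(), "A")
save(list = keep, file = b_file)

# Inspect what ended up in B.RData.
check <- new.env()
saved_names <- load(b_file, envir = check)
print(saved_names)
```

With this arrangement a bare `w` in B.R is not silently picked up from A; it has to be written as `A$w`. `attach(a_file)` is a related base R option that puts the saved objects on the search path instead, though that brings back unqualified access to A's names.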

     
  • Anonymous

    Anonymous - 2011-02-24

    OK. As JD suggested, here is a way to solve this:

    When we have a small project, we don't need to worry about it.

    But when we have a big project, we should use #rdsave to save variables in each R program; that will break the chain of dependencies.

     
  • Jonathan Dushoff

    I don't think what we already do is functionally different from using environments.

    If we make B.Rout depend on A.RData then B has access to all of the variables that A saved.

    If C is downstream of B, we can:

    • Save everything we need in B, and make C depend on B

    • Save only new things in B, and make C depend on A and B.

    I now understand that Xingpeng is asking an additional question that I didn't get: what if B changes something from A? We should test and confirm that loading A and then B behaves as expected, with B's values overriding A's when we want them to.

    JD
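The load-order test could look something like the sketch below. It relies on `load()` overwriting existing objects of the same name; the file contents and the names `x`, `only_a`, `only_b`, `ab_env` and `ba_env` are made up for the test.

```r
# Two stand-in files that both define x (hypothetical contents).
a_file <- file.path(tempdir(), "A.RData")
b_file <- file.path(tempdir(), "B.RData")
local({
  x <- "from A"
  only_a <- 1
  save(x, only_a, file = a_file)
})
local({
  x <- "from B"   # B changes something that came from A
  only_b <- 2
  save(x, only_b, file = b_file)
})

# Load A, then B, as C.R would if it depended on both.
ab_env <- new.env()
load(a_file, envir = ab_env)
load(b_file, envir = ab_env)
x_after_a_then_b <- ab_env$x

# Reverse order, for comparison.
ba_env <- new.env()
load(b_file, envir = ba_env)
load(a_file, envir = ba_env)
x_after_b_then_a <- ba_env$x

# Whichever file is loaded last wins for names that both files define;
# names defined in only one file are unaffected by the order.
print(c(a_then_b = x_after_a_then_b, b_then_a = x_after_b_then_a))
```

If this holds in the real pipeline, the "save only new things in B" option works as long as C loads A before B.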

     
