Work at SourceForge, help us to make it a better place! We have an immediate need for a Support Technician in our San Francisco or Denver office.

Close

Patch to reduce memory consumption

Help
2005-01-28
2013-03-21
  • Dieter Shirley
    Dieter Shirley
    2005-01-28

    Below is a patch that reduces the memory consumption of nget by about 20% on large newsgroups.  It replaces the multimap<> object used on nntp_parts to keep track of nntp_server_files with a simple vector<>.  My thinking is that there will usually be a very small number of server files per part (often 1, in fact), and using a red-black tree to manage this is way overkill.  No cache changes were needed.

    The patch was created with nget-0.27 on a gentoo x86 machine, but I have no reason to believe that it shouldn't work anywhere.  I happily grant permission to include this change in the main source under the GPL.

    Enjoy!
    -dete

    diff -ur nget-0.27.orig/cache.cc nget-0.27/cache.cc
    --- nget-0.27.orig/cache.cc   2005-01-27 21:42:50.000000000 -0800
    +++ nget-0.27/cache.cc  2005-01-27 22:01:35.749210488 -0800
    @@ -133,25 +133,28 @@
       c_nntp_server_article *sa;
    #ifndef NDEBUG
       if (debug>=DEBUG_MIN){
    -     t_nntp_server_articles::iterator sai=articles.find(h->serverid);
    -     if (sai!=articles.end()){
    -        sa=(*sai).second;
    -        printf("adding server_article we already have %lu %lu %lu %lu(%lu %lu %lu %lu)\n",h->serverid,h->articlenum,h->bytes,h->lines,sa->serverid,sa->articlenum,sa->bytes,sa->lines);
    -        //    return;//could be useful, lets add it.
    +     t_nntp_server_articles::iterator sai=articles.begin();
    +     for (;sai!=articles.end();sai++){
    +        if ((*sai)->serverid == h->serverid)
    +        {
    +           sa=(*sai);
    +           printf("adding server_article we already have %lu %lu %lu %lu(%lu %lu %lu %lu)\n",h->serverid,h->articlenum,h->bytes,h->lines,sa->serverid,sa->articlenum,sa->bytes,sa->lines);
    +           //    return;//could be useful, lets add it.
    +        }
          }
       }
       if (h->date!=date)
          printf("adding server_article with different date, date=%li h->date=%li mid=%s\n",date,h->date,h->messageid.c_str());
    #endif
       sa=new c_nntp_server_article(h->serverid,h->group,h->articlenum,h->bytes,h->lines);
    -  articles.insert(t_nntp_server_articles::value_type(h->serverid,sa));
    +  articles.push_back(sa);
    }

    c_nntp_part::~c_nntp_part(){
       t_nntp_server_articles::iterator i;
       for(i = articles.begin();i!=articles.end();++i){
    -     assert((*i).second);
    -     delete (*i).second;
    +     assert(*i);
    +     delete (*i);
       }
    }

    @@ -191,8 +194,8 @@
          }
          for (t_nntp_server_articles::const_iterator fsai=p->articles.begin(); fsai!=p->articles.end(); ++fsai){

    -        c_nntp_server_article *nsa = new c_nntp_server_article(*fsai->second);
    -        nfpi->second->articles.insert(t_nntp_server_articles::value_type(nsa->serverid,nsa));
    +        c_nntp_server_article *nsa = new c_nntp_server_article(**fsai);
    +        nfpi->second->articles.push_back(nsa);
          }
          t_nntp_file_parts::iterator del_pi = fpi;
          ++fpi;
    @@ -211,7 +214,7 @@
          set<ulong> servers_already_found;

          for (;nsai!=pi->second->articles.end();++nsai) {
    -        serverid=nsai->first;
    +        serverid=(*nsai)->serverid;
             //don't increment count twice if a server has multiple server_articles for a single part
             if (servers_already_found.insert(serverid).second){
                t_server_have_map::iterator hmi(have_map.insert(t_server_have_map::value_type(serverid, 0)).first);
    @@ -360,7 +363,6 @@
       t_nntp_file_parts::const_iterator pi;
       c_nntp_part *np;
       pair<t_nntp_server_articles::const_iterator,t_nntp_server_articles::const_iterator> sarange;
    -  c_nntp_server_article *sa;
       for(i = files.begin();i!=files.end();++i){
          nf=(*i).second;
          assert(!nf.isnull());
    @@ -368,12 +370,12 @@
          for(pi = nf->parts.begin();pi!=nf->parts.end();++pi){
             np=(*pi).second;
             assert(np);
    -        sarange=np->articles.equal_range(servinfo->serverid);
    -        while (sarange.first!=sarange.second){
    -           sa=(*sarange.first).second;
    -           assert(sa);
    -           range->remove(sa->articlenum);
    -           ++sarange.first;
    +        t_nntp_server_articles::const_iterator sai = np->articles.begin();
    +        for (; sai != np->articles.end(); sai++){
    +           if ((*sai)->serverid == servinfo->serverid)
    +           {
    +              range->remove((*sai)->articlenum);
    +           }
             }
          }
       }
    @@ -392,7 +394,6 @@
       c_nntp_file::ptr nf;
       t_nntp_file_parts::iterator pi,pic;
       c_nntp_part *np;
    -  pair<t_nntp_server_articles::iterator,t_nntp_server_articles::iterator> sarange;
       t_nntp_server_articles::iterator sai;
       c_nntp_server_article *sa;
       c_mid_info rel_midinfo("");
    @@ -413,13 +414,11 @@
             ++pi;
             np=(*pic).second;
             assert(np);
    -        sarange=np->articles.equal_range(servinfo->serverid);
    -        while (sarange.first!=sarange.second){
    -           sai=sarange.first;
    -           ++sarange.first;
    -           sa=(*sai).second;
    +        sai = np->articles.begin();
    +        for (; (np != NULL) && (sai != np->articles.end()); sai++){
    +           sa=*sai;
                assert(sa);
    -           if (flushrange.check(sa->articlenum)){
    +           if ((sa->serverid == servinfo->serverid) && flushrange.check(sa->articlenum)){
                   delete sa;
                   np->articles.erase(sai);
                   if (np->articles.empty()){
    @@ -455,10 +454,11 @@
          for(pi = nf->parts.begin();pi!=nf->parts.end();++pi){
             np=(*pi).second;
             assert(np);
    -        sai=np->articles.find(servinfo->serverid);
    -        if (sai!=np->articles.end()){
    -           sa=(*sai).second;
    -           assert(!flushrange.check(sa->articlenum));
    +        for (sai=np->articles.begin(); sai != np->articles.end(); sai++){
    +           sa=*sai;
    +           if (sa->serverid == servinfo->serverid){
    +              assert(!flushrange.check(sa->articlenum));
    +           }
             }
          }
       }
    @@ -604,7 +604,7 @@
                   if (nconfig.hasserver(serverid)) {
                      sa=new c_nntp_server_article(serverid,group,atoul(t[1]),atoul(t[2]),atoul(t[3]));
                      //np->addserverarticle(sa);
    -                 np->articles.insert(t_nntp_server_articles::value_type(sa->serverid,sa));
    +                 np->articles.push_back(sa);
                      counta++;
                   }else
                      countdeada++;
    @@ -762,7 +762,7 @@
                      assert(np);
                      f->putf("%i\t%lu\t%s\n",np->partnum,np->date,np->messageid.c_str());//PART_MODE
                      for (sai = np->articles.begin(); sai != np->articles.end(); ++sai){
    -                    sa=(*sai).second;
    +                    sa=(*sai);
                         assert(sa);
                         f->putf("%lu\t%lu\t%lu\t%lu\n",sa->serverid,sa->articlenum,sa->bytes,sa->lines);//SERVER_ARTICLE_MODE
                         counta++;
    diff -ur nget-0.27.orig/cache.h nget-0.27/cache.h
    --- nget-0.27.orig/cache.h 2005-01-27 21:42:50.000000000 -0800
    +++ nget-0.27/cache.h   2005-01-27 22:01:24.581908176 -0800
    @@ -102,7 +102,7 @@
          ulong bytes,lines;
          c_nntp_server_article(ulong serverid,const c_group_info::ptr &group,ulong articlenum,ulong bytes,ulong lines);
    };
    -typedef multimap<ulong,c_nntp_server_article*> t_nntp_server_articles;
    +typedef vector<c_nntp_server_article*> t_nntp_server_articles;
    typedef pair<c_nntp_server_article*,c_server::ptr> t_real_server_article;
    typedef multimap<float,t_real_server_article,greater<float> > t_nntp_server_articles_prioritized;
    class c_nntp_part {
    @@ -118,7 +118,7 @@
             c_nntp_server_article *highest_sa=NULL;
             float highprio=-10000.0,f;
             for (;nsai!=articles.end();++nsai) {
    -           sa=(*nsai).second;
    +           sa=(*nsai);
                for (t_server_list_range servers = nconfig.getservers(sa->serverid); servers.first!=servers.second; ++servers.first)
                   if ((f=nconfig.trustsizes->getserverpriority(servers.first->second)) > highprio){
                      highest_sa=sa;
    @@ -313,7 +313,7 @@
             const string &mid=f->bamid();
             c_nntp_part *p = f->parts.begin()->second;
             for (t_nntp_server_articles::iterator sai=p->articles.begin(); sai!=p->articles.end(); ++sai)
    -           midinfos.find(sai->second->group->group)->second->insert(mid);
    +           midinfos.find((*sai)->group->group)->second->insert(mid);
          }
          void remove(const string &mid){
             for (t_mid_info_list::iterator mili=midinfos.begin(); mili!=midinfos.end(); ++mili)
    diff -ur nget-0.27.orig/prot_nntp.cc nget-0.27/prot_nntp.cc
    --- nget-0.27.orig/prot_nntp.cc  2005-01-27 21:42:50.000000000 -0800
    +++ nget-0.27/prot_nntp.cc 2005-01-27 21:55:22.000000000 -0800
    @@ -805,7 +805,7 @@
       c_nntp_server_article *sa=NULL;
       float prio;
       for (sai = part->articles.begin(); sai != part->articles.end(); ++sai){
    -     sa=(*sai).second;
    +     sa=(*sai);
          assert(sa);
          for (t_server_list_range servers = nconfig.getservers(sa->serverid); servers.first!=servers.second; ++servers.first) {
             const c_server::ptr &s = servers.first->second;

     
    • Sounds like a good idea.  Unfortunately the forum mangled the patch.  Could you please open an item in the patches tracker and attach the patch file to it?  Thanks.

       
    • Dieter Shirley
      Dieter Shirley
      2005-01-28

      Sorry, I didn't notice the patch section... Whoops!  :-)  It's up there now.

      BTW - I'm thinking of doing the same with the parts list in nntp_file; it's not going to be as profound as the server files change, but I expect it will still be significant...

       
    • TH
      TH
      2005-06-03

      Matthew: I think you should release a version with this new patch right away. I finally got around compiling the CVS version, mostly because it has the option to limit the number of headers. However, with this patch there really wasn't much need to limit any more, although that may of course change soon. I think the 20% savings estimate must have been very conservative, because I never see the memory footprint go above 1 GB anymore. It used to be 1.6 GB. To me this is the most significant update since 0.19, when it stopped loading all articles into memory with the -G option. Thank you, Dieter!