Menu

Tree [379d98] master /
 History

HTTPS access


File Date Author Commit
 icons 2013-10-01 Christian Rozée Christian Rozée [971be8] Initial commit
 .gitignore 2020-06-17 Christian Rozée Christian Rozée [b6a78b] Project configuration files updated.
 LICENSE.TXT 2018-04-19 Christian Rozée Christian Rozée [e57b45] License updated. Template class _<> extended.
 Readme.md 2017-09-19 Christian Rozée Christian Rozée [61c37c] Readme improved.
 basic_ref.h 2018-04-19 Christian Rozée Christian Rozée [e57b45] License updated. Template class _<> extended.
 forEach.h 2018-04-19 Christian Rozée Christian Rozée [e57b45] License updated. Template class _<> extended.
 rebor.h 2018-04-19 Christian Rozée Christian Rozée [06a0c8] rebor.h fixed for gcc.
 rebor.pro 2020-06-17 Christian Rozée Christian Rozée [b6a78b] Project configuration files updated.
 rebor.sln 2020-06-17 Christian Rozée Christian Rozée [b6a78b] Project configuration files updated.
 rebor.vcxproj 2020-06-17 Christian Rozée Christian Rozée [b6a78b] Project configuration files updated.
 rebor.vcxproj.filters 2016-01-13 Christian Rozée Christian Rozée [c92f0f] The recycler list now contains only changed and...
 rebor_collections.h 2021-02-17 Christian Rozée Christian Rozée [379d98] Fixes for GCC.
 rebor_global.h 2018-04-19 Christian Rozée Christian Rozée [e57b45] License updated. Template class _<> extended.
 recycler 2013-10-01 Christian Rozée Christian Rozée [971be8] Initial commit
 recycler.cpp 2018-04-19 Christian Rozée Christian Rozée [e57b45] License updated. Template class _<> extended.
 recycler.h 2018-04-19 Christian Rozée Christian Rozée [e57b45] License updated. Template class _<> extended.
 signal_slot.h 2021-02-17 Christian Rozée Christian Rozée [379d98] Fixes for GCC.
 thread_helper.cpp 2018-04-19 Christian Rozée Christian Rozée [e57b45] License updated. Template class _<> extended.
 thread_helper.h 2018-04-19 Christian Rozée Christian Rozée [e57b45] License updated. Template class _<> extended.

Read Me

Object Recycler Icon Project home sourceforge

A reverse tracing garbage collector for C++

Introduction

The Reference Binding Object Recycler (REBOR) is an implementation of an exact (precise) concurrent
reverse tracing garbage collector for C++.

Usage example:

#include <recycler>

_<int> pInt = _new<int>(42);

printf("The answer is %d", *pInt);

This example creates a new garbage-collected object (on the heap) and
initializes it with 42. Then it is printed out.

"_" is the name of a template class! The reason for the name was to
keep the focus on the program and its classes and not the references
(smart pointers). They should be as inconspicuous as possible though I
could not get rid of the angle brackets.

The project is still in an early development state (it started as a
feasibility study), but I already used it successfully in several
projects.

If you find this garbage collector useful or worth being improved I'm
glad for any comment.

Behavior and terms

Java and .NET made garbage collection a key feature. But it comes with some
disadvantages:

  • There is always a performance loss. Its perceptibility depends
    mostly on the implementation of the GC.

  • A deterministic destruction of objects is no longer guaranteed. New
    mechanisms are introduced to solve this problem (using, finally,
    etc.).

This implementation is not intended to solve these problems. Still there
is an effort to keep them unnoticable.

The reason why objects are not immediately destroyed when they are no longer used
(inaccessible from the program) is that they might reference themselves and detecting
unusable objects (finding orphan object clusters) can take a long
time and would block the execution of the program.

An object "cluster" consists of all objects referencing themselves
directly or indirectly via other objects (circular references).
Orphan means that there is no way to access the cluster from the program any more.

Operating systems

This version of the garbage collector works on Microsoft Windows and
Linux operating systems (actually Ubuntu). MinGW is not recommended because it
doesn't support thread local storage.

Using the garbage collector

As there should be only one recycler instance (garbage collector)
the project is set up as dynamic link library.

The library can be build with qmake (via the Qt Creator project file rebor.pro)
or with MS Visual Studio 2013 (rebor.sln).

To use the recycler first include the header recycler.
Then you can directly start to create objects.
Any type can be used, primitives or classes/structures.

Here is a coding example with a user defined class:

#include <recycler>

class MyClass
{
public:
  MyClass() {}
  ~MyClass() {}

  int value() const { return _value; }
  void setValue(int value) { _value = value; }

private:
  int _value;
};

void foo()
{
  _<MyClass> myClass = _new<MyClass>();

  myClass->setValue(42);
}

Restrictions:\
There is only one thing to avoid: Mixing the standard new operator and
the Recycler _new\<> template functions. Always use _new\<>() for objects
containing references unless you carefully design pure hierarchic object
trees. This restriction also applies to Lists and Maps! If you need lists
or maps of objects use rcy::List or rcy::Map.

Recycler (GC) characteristics

  • Concurrent: The main function of the Recycler (the clean up process) runs
    in a separate thread.

  • The Recycler is thread safe. Different threads can use the same Recycler
    (not tested yet!).

  • The references (smart pointers) and objects themselves are not thread safe
    by default. Hence it is not safe to access an object from different
    threads or to share a smart pointer without further precautions. But
    it is possible to have references to the same object in different
    threads.

  • Object destruction:

    • The object memory is always freed by the Recycler (undeterministic).
    • The destructor of objects without circular references is immediatley called
      after the last reference to the object is destroyed (in the same thread).
    • The destructor of objects with circular references is called by the Recycler
      when orphan state is detected (in the Recycler thread).
  • Polymorphic smart pointers. Assigning references
    of different types is possible.

Using the smart pointers with other libraries

Take care when using references (_<>) in template classes.
References have to be in memory assigned by _new<>() for the Recycler
to detect circular references. The _new<>() functon establishes a link between
the reference and the containing object. Therefore references should only be
members in classes or on the stack.

To still be able to have lists and maps of references there are
collection classes in rebor_collections.h. They work similar to the standard
C++ library including an optimized forEach macro that
accepts template arguments.

Performance

  • First tests showed that creating trees with smart pointers is nearly
    as fast as in .NET with C#. It's a bit difficult to tell because the
    mechanisms are very different.

  • The manipulation of smart pointers (assignment) is nearly constant
    in time and independent of the number of existing objects and
    references.

  • The memory overhead for an object or reference is constant. That means
    that the additional memory needed by the recycler is linear to the number
    of objects and references used.

Advantages

  • Precision: No object is forgotten or accidentally destroyed.

  • Concurrency: The program continues during garbage collection. There
    is no point where the program stops to clean up.

  • Simple: Writing programs is nearly as easy as with normal pointers
    (or even easier as there is no need for delete).

  • Only 1st and 2nd generation: The C++ new operator can be used.

  • Coexistence: Other code is not affected.

  • Multithreading: The same Recycler can be used by multiple threads.

  • Open source: Full control and debugging possibilities.

Disadvantages

  • Special template classes for lists and maps needed.

  • Processor power is needed to check for orphan clusters. On multi-core
    systems one processor core may temporarily be used to 100% for orphan detection.

Background

Why another garbage collector?

In my research studies I found some literature about garbage collection
(see footnote; for an introduction into garbage collection I would
recommend the Wikipedia article). But actually implemented garbage
collectors for C++ I found 4 in 2011 and I'm not speaking of GCs solely
based on reference counting (a GC should be able to cope with circular
references):

The last one (librtgc) is very similar to my implementation but with
less features and abandoned 2003 at Version 0.2.

The other garbage collectors are either conservative (not precise),
based on special memory allocation or not concurrent.

I came to the conclusion that, apart from the difficulty to implement
a precise GC due to the lack of
reflection in C++, there seems to be little need for garbage collection
in C++. Experienced developers can handle objects with other
techniques, for example with simple reference counting or the
parent-child pattern (parents deleting their children
when they are destroyed).

But developers new to C++ (coming from java or .NET) quite miss garbage
collection and have great difficulty to change their custom to use the
new operator everywhere, necessary or not.

What I wanted is a simple but effective GC for all platforms. I found
myself challenged to prove that it is possible to write a GC purely in
C++ without meddling with the OS or process memory. To have full compliance with C++ (using
standard new operator, multiple inheritance etc.) and no side effects to
existing C++ code (if there are no GC objects no computer resources
are ussed).

My first GC did fulfill these requirements but was awkward to use.

In the actual version of the GC I abandoned using the
standard C++ new operator for simplicity of usage.

How the garbage collector was developed

My approach was to start with reference counting by smart pointers (a
template class which behaves like a pointer), which left the problem of
circular references.

The key problem with circular references is reflection. The GC needs to know the
references (pointers) an object has to other objects to resolve circular
references.

My solution was to tell the references (smart pointers)
who their containing object is!

How the containing object were assigned to the references changed several times in the past
years as I got new ideas. I called the different methods "generations",
because they always meant a step forward in usability of the framework.
There is an overview in the next section.

The first and second generation distinguished two types of references (smart pointers):

  • externally controlled references (references mainly on the stack): template <class C> class _ {};

  • references controlled by objects explicitly (as member variables): template <class C> class o {};

In the first version (1st generation) the member references (o\<>) got
their containing object as parameter in their constructor:

class Test
{
public:
   Test() : _ref(this) {}
   ~Test() {}

private:
   o<MyClass> _ref;
};

This was a bit awkward and dangerous as the application crashes when the
initialization of the o\<> smart pointers was missing.

In the second attempt (2nd generation) I used thread local storage to
assign the containing objects to the o\<> references, thus being able
to use the default constructor of o\<>.

In the third version I distinguished externally controlled references and
references controlled by objects implicitly. That left only one
reference type: _\<>.

Alternative versions

The development of the garbage collector took place in
steps. Some time after a version was finished and tested I got new ideas
how it could be made more comfortable. I called these steps
"generations".

1st generation

In this version the smart pointers must be initialized with their
containing objects in the constructor (via this).

2nd generation

Here TLS and overloaded new and delete operators are used to attach the
smart pointers to their objects.

Objects controlled by the GC must be derived from AutoDelete and use
the smart pointers _<> and o<> as in the following example:

class Test : public AutoDelete
{
public:
   Test() {}
   virtual ~Test() {}

   void setReference(_<Test> ref) { _myReference = ref; }

private:
   o<Test> _myReference;
};

void main()
{
   _<Test> test = new Test;
   test->setReference(test);
   // Simplest form of circular reference (a reference to self)
}

There are some important rules to follow:

  • Always declare the destructor of a managed class (a class
    derived from AutoDelete) as virtual! Otherwise the destructor
    might not be called!

  • Never instantiate managed classes on the stack. Though it is
    possible to instantiate managed classes directly as members.

  • Never use _\<> as members in managed classes. Always use o\<>
    instead.

  • Never use o\<> on the stack. Use _\<> instead.

3rd generation

The third generation is the actual version of the
framework.

References

About the author

I am a German "Diplom-Physiker" (graduate physicist) working in Germany
and Switzerland as software developer since 1997.