How does Java Garbage Collection work with Circular References? How does Java Garbage Collection work with Circular References? java java

How does Java Garbage Collection work with Circular References?


Java's GC considers objects "garbage" if they aren't reachable through a chain starting at a garbage collection root, so these objects will be collected. Even though objects may point to each other to form a cycle, they're still garbage if they're cut off from the root.

See the section on unreachable objects in Appendix A: The Truth About Garbage Collection in Java Platform Performance: Strategies and Tactics for the gory details.


yes Java Garbage collector handles circular-reference!

How?

There are special objects called called garbage-collection roots (GC roots). These are always reachable and so is any object that has them at its own root.

A simple Java application has the following GC roots:

  1. Local variables in the main method
  2. The main thread
  3. Static variables of the main class

enter image description here

To determine which objects are no longer in use, the JVM intermittently runs what is very aptly called a mark-and-sweep algorithm. It works as follows

  1. The algorithm traverses all object references, starting with the GCroots, and marks every object found as alive.
  2. All of the heap memory that is not occupied by marked objects isreclaimed. It is simply marked as free, essentially swept free ofunused objects.

So if any object is not reachable from the GC roots(even if it is self-referenced or cyclic-referenced) it will be subjected to garbage collection.

Ofcourse sometimes this may led to memory leak if programmer forgets to dereference an object.

enter image description here

Source : Java Memory Management


You are correct. The specific form of garbage collection you describe is called "reference counting". The way it works (conceptually, at least, most modern implementations of reference counting are actually implemented quite differently) in the simplest case, looks like this:

  • whenever a reference to an object is added (e.g. it is assigned to a variable or a field, passed to method, and so on), its reference count is increased by 1
  • whenever a reference to an object is removed (the method returns, the variable goes out of scope, the field is re-assigned to a different object or the object which contains the field gets itself garbage collected), the reference count is decreased by 1
  • as soon as the reference count hits 0, there is no more reference to the object, which means nobody can use it anymore, therefore it is garbage and can be collected

And this simple strategy has exactly the problem you decribe: if A references B and B references A, then both of their reference counts can never be less than 1, which means they will never get collected.

There are four ways to deal with this problem:

  1. Ignore it. If you have enough memory, your cycles are small and infrequent and your runtime is short, maybe you can get away with simply not collecting cycles. Think of a shell script interpreter: shell scripts typically only run for a few seconds and don't allocate much memory.
  2. Combine your reference counting garbage collector with another garbage collector which doesn't have problems with cycles. CPython does this, for example: the main garbage collector in CPython is a reference counting collector, but from time to time a tracing garbage collector is run to collect the cycles.
  3. Detect the cycles. Unfortunately, detecting cycles in a graph is a rather expensive operation. In particular, it requires pretty much the same overhead that a tracing collector would, so you could just as well use one of those.
  4. Don't implement the algorithm in the naive way you and I would: since the 1970s, there have been multiple quite interesting algorithms developed that combine cycle detection and reference counting in a single operation in a clever way that is significantly cheaper than either doing them both seperately or doing a tracing collector.

By the way, the other major way to implement a garbage collector (and I have already hinted at that a couple of times above), is tracing. A tracing collector is based on the concept of reachability. You start out with some root set that you know is always reachable (global constants, for example, or the Object class, the current lexical scope, the current stack frame) and from there you trace all objects that are reachable from the root set, then all objects that are reachable from the objects reachable from the root set and so on, until you have the transitive closure. Everything that is not in that closure is garbage.

Since a cycle is only reachable within itself, but not reachable from the root set, it will be collected.