[Mono-gc-list] Fast allocation vs lightweight collection

David Jeske jeske@chat.net
Tue, 26 Aug 2003 01:19:03 -0700

On Tue, Aug 26, 2003 at 08:59:02AM +0100, Torstensson, Patrik wrote:
> Not really true. The cmpxchg (and other) will lock the CPU cache (or
> cause a cache invalid signal to happen) on a x86. This causes serious
> performance problems 

> (it's better than a kernel lock but still..)

That is quite an understatement. I don't have a manual handy to look
up the numbers, but my gut ballparks that the kernel context switch,
probable TLB flush, plus the cache invalidation which is required for
the kernel locks anyhow will come in at over 10x of the cost of L1/L2
cache invalidation alone in overall performance cost.

> Still, Paolo is right, reference counting is very expensive for
> Multithreaded programs even if you use x86 specific helper functions. 

I think we have a disagreement about what it means to be "very
expensive". The 1 second worst case pause time of most "modern" GC
systems is very expensive to me. It usually means I can't write
software with them and have to resort to C, C++, or a ref-counted
system like Python. I don't mind a 10%, 20% or sometimes even 30%
performance hit, as long as it is spread evenly throughout the

For me, ANY pausing scheme is "very expensive", and ANY incremental
scheme with bounded worst-case pauses is acceptable.

Compared to other incremental schemes, reference counting adds some
cost to the mutator in exchange for greater throughput when memory is
being turned over quickly.

The write-barrier of a tri-color scheme is possibly less costly, but
yeilds more work for the incremental collector to do.

David Jeske (N9LCA) + http://www.chat.net/~jeske/ + jeske@chat.net