Saturday, 6 November 2010

Mono 2.8: a step closer to a reliable foundation

We previously criticized the use of Boehm's conservative garbage collector in earlier versions of Mono because it is fundamentally flawed and prone to unpredictable memory leaks: applications can die with out-of-memory errors while plenty of garbage remains to be reclaimed. Specifically, we gave a simple 9-line example program that fills and forgets ten hash tables, and it ran out of memory when run on Mono 2.4. What happens when this program is run on Mono 2.8 using the new SGen garbage collector?
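For reference, the test program can be sketched as follows. This is a reconstruction based on the F# Interactive session quoted in the comments below, not necessarily the exact listing from the earlier post; the explicit `Dictionary<float, float>` type annotation is an addition made here for clarity.

```fsharp
// Fill and forget ten hash tables, timing each iteration.
// Each dictionary becomes garbage as soon as its iteration ends,
// so a working collector should be able to reclaim it.
for i in 1..10 do
    let t = System.Diagnostics.Stopwatch.StartNew()
    let m = System.Collections.Generic.Dictionary<float, float>()
    let mutable x = 0.0
    for j = 1 to 10000000 do
        m.[x] <- x
        x <- x + 1.0
    printfn "m[42] = %g" m.[42.0]
    printfn "Took %gs\n" t.Elapsed.TotalSeconds
```

Only one table is reachable at any time, so the live heap never needs to hold more than a single ten-million-entry dictionary.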

Running the test on Mono 2.8 with the default Boehm GC often reproduces the same leak that we saw before, as expected. Repeating the test with the new SGen garbage collector, we find that the program no longer dies with an out-of-memory error after four iterations. Instead, it gets as far as eight of the intended ten iterations before dying with a segmentation fault:

$ mono-sgen TailCall.exe
m[42] = 42
Took 3.40511s

m[42] = 42
Took 3.41273s

m[42] = 42
Took 3.20464s

m[42] = 42
Took 3.96534s

m[42] = 42
Took 3.14944s

m[42] = 42
Took 3.10114s

m[42] = 42
Took 3.14187s

m[42] = 42
Took 3.27123s


  at (wrapper managed-to-native) object.__icall_wrapper_mono_gc_alloc_vector (intptr,intptr,intptr) <0x00003>
  at (wrapper managed-to-native) object.__icall_wrapper_mono_gc_alloc_vector (intptr,intptr,intptr) <0x00003>
  at (wrapper alloc) object.AllocVector (intptr,intptr) <0x000ac>
  at System.Collections.Generic.Dictionary`2<double, double>.Resize () <0x001bc>
  at System.Collections.Generic.Dictionary`2<double, double>.set_Item (double,double) <0x0014f>
  at <StartupCode$TailCall>.$Program.main@ () <0x0007c>
  at (wrapper runtime-invoke) object.runtime_invoke_void (object,intptr,intptr,intptr) <0x0007d>

Native stacktrace:

        mono-sgen [0x80dec34]
        mono-sgen [0x812b2cb]
        mono-sgen [0x8174e17]
        mono-sgen [0x8175428]
        mono-sgen [0x8065318]
        mono-sgen(mono_runtime_invoke+0x40) [0x81a9aa0]
        mono-sgen(mono_runtime_exec_main+0xd6) [0x81ad1f6]
        mono-sgen(mono_main+0x1a41) [0x80bb501]
        mono-sgen [0x805b388]
        /lib/tls/i686/cmov/ [0xb7451b56]
        mono-sgen [0x805b131]

Debug info from gdb:

[Thread debugging using libthread_db enabled]
[New Thread 0xb7103b70 (LWP 8401)]
0xb76f3430 in __kernel_vsyscall ()
  2 Thread 0xb7103b70 (LWP 8401)  0xb76f3430 in __kernel_vsyscall ()
* 1 Thread 0xb7439720 (LWP 8400)  0xb76f3430 in __kernel_vsyscall ()

Thread 2 (Thread 0xb7103b70 (LWP 8401)):
#0  0xb76f3430 in __kernel_vsyscall ()
#1  0xb75a9f75 in sem_wait@@GLIBC_2.1 ()
    at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/sem_wait.S:80
#2  0x0822c778 in mono_sem_wait (sem=0x89ce64c, alertable=0)
    at mono-semaphore.c:102
#3  0x081560c7 in finalizer_thread (unused=0x0) at gc.c:1048
#4  0x08183065 in start_wrapper (data=0xa37c760) at threads.c:747
#5  0x0821a7df in thread_start_routine (args=0xa36762c) at wthreads.c:285
#6  0x0816da8b in gc_start_thread (arg=0xa37c808) at sgen-gc.c:5350
#7  0xb75a380e in start_thread (arg=0xb7103b70) at pthread_create.c:300
#8  0xb75078de in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 1 (Thread 0xb7439720 (LWP 8400)):
#0  0xb76f3430 in __kernel_vsyscall ()
#1  0xb75aac8b in read () from /lib/tls/i686/cmov/
#2  0x080dedfc in read (signal=11, ctx=0xb72fcd0c)
    at /usr/include/bits/unistd.h:45
#3  mono_handle_native_sigsegv (signal=11, ctx=0xb72fcd0c)
    at mini-exceptions.c:1935
#4  0x0812b2cb in mono_arch_handle_altstack_exception (sigctx=0xb72fcd0c,
    fault_addr=0x8, stack_ovf=0) at exceptions-x86.c:1163
#5  <signal handler called>
#6  alloc_large_inner (vtable=<value optimised out>,
    size=<value optimised out>) at sgen-los.c:368
#7  0x08174e17 in mono_gc_alloc_obj_nolock (vtable=0xa3af948, size=0)
    at sgen-gc.c:3219
#8  0x08175428 in mono_gc_alloc_vector (vtable=0xa3af948, size=147681864,
    max_length=18460231) at sgen-gc.c:3437
#9  0xb72ecb0b in ?? ()
#10 0xb72e97d5 in ?? ()
#11 0xb72ec695 in ?? ()
#12 0xb72ec2a8 in ?? ()
#13 0xb72e8d9d in ?? ()
#14 0xb72e8fd6 in ?? ()
#15 0x08065318 in mono_jit_runtime_invoke (method=0xa330bdc, obj=0x0,
    params=0xbfd1aafc, exc=0x0) at mini.c:5392
#16 0x081a9aa0 in mono_runtime_invoke (method=0xa330bdc, obj=0x0,
    params=0xbfd1aafc, exc=0x0) at object.c:2709
#17 0x081ad1f6 in mono_runtime_exec_main (method=0xa330bdc, args=0xb6c00638,
    exc=0x0) at object.c:3838
#18 0x080bb501 in main_thread_handler (argc=2, argv=0xbfd1ace4) at driver.c:999
#19 mono_main (argc=2, argv=0xbfd1ace4) at driver.c:1836
#20 0x0805b388 in mono_main_with_options (argc=2, argv=0xbfd1ace4) at main.c:66
#21 main (argc=2, argv=0xbfd1ace4) at main.c:97

Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
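
The arguments in frame #8 of the gdb trace give a sense of the allocation that failed. The short check below is only a sketch: the values are copied from the trace, while the interpretation (an array of 8-byte doubles plus a fixed array header on this 32-bit build) is an assumption, not something the trace states.

```fsharp
// Values copied from frame #8: mono_gc_alloc_vector (..., size=147681864, max_length=18460231)
let maxLength = 18460231L   // number of array elements requested
let size = 147681864L       // total bytes requested

// Assumption: elements are 8-byte doubles, as in Dictionary<double, double>.
let payload = maxLength * 8L
let header = size - payload

printfn "payload = %d bytes" payload   // 147681848
printfn "header  = %d bytes" header    // 16
```

Under that assumption, the crash occurs while `Dictionary.Resize` requests a single array of roughly 147 MB, which is consistent with the `alloc_large_inner` frame in SGen's large object code (`sgen-los.c`).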


Seven years after the Mono team described their use of the Boehm garbage collector as "an interim measure", the SGen collector is still experimental. Hopefully these issues will be resolved and the Mono platform will benefit from a reliable garbage collector in the not-too-distant future. However, we cannot help but wonder why the Mono team have not chosen to release a simple but reliable garbage collector that people could use while they wait for SGen to be stabilized. After all, multicore-friendly garbage collection can be easy.


gelin yan said...

Running the same program (implemented in C#) with Mono 2.10 and SGen is no problem.

Justin said...

Since gelin says he wrote a C# version, I thought I would pass along that the F# version works fine now as well on 'mono-sgen'.

The exact code here compiled with Mono on Linux and executed using 'mono-sgen' completes all iterations successfully.

Mono JIT compiler version 2.11 (master/e8807c5 Thu Jul 7 17:59:18 PDT 2011)

Using 'mono' instead of 'mono-sgen' still runs into problems before completing.

Wallace said...

What is your opinion on the state of Mono for F# in 2012?

Justin said...

@Wallace - I assume you are asking Flying Frog but I will butt in anyway...

In my experience, Mono now supports tail calls just fine. This was one of the initial complaints. That said, the bug is still open in Bugzilla:

Also, again in my experience, mono-sgen no longer leaks memory in the way that the Flying Frog guys have complained about (see my comment above for example). Of course, sgen is STILL not the default garbage collector. So, plain old Mono (without sgen) still chokes on code like that in the post above.

Brian said...


Thanks for your research. It is quite interesting as I dabble with functional programming and F#.

I ran across a discussion you had about F# and Java a couple of years ago due to a retweet from Don Syme.

I was also curious about the viability of F# on Mono at this point. It turns out that the current (beta) build 3.0.3 bundles the F# 3 compiler and interactive interpreter as fsharpc and fsharpi, respectively.

Invoking mono-sgen does complete your tail call test successfully on my OS X system. The default Boehm GC hangs and burns an entire core after 6 iterations through the loop, though. I ran out of patience waiting for it to crash and spit out a stack trace. The F# interactive interpreter also succeeds because the /usr/bin/fsharpi script invokes mono with --gc=sgen if it is available.

I think this is good news.

//brian reiter

$ mono-sgen --version
Mono JIT compiler version 3.0.3 (master/39c48d5 Tue Jan 8 12:12:24 EST 2013)
Copyright (C) 2002-2012 Novell, Inc, Xamarin Inc and Contributors.
TLS: normal
SIGSEGV: altstack
Notification: kqueue
Architecture: x86
Disabled: none
Misc: softdebug
LLVM: yes(3.1svn-mono)
GC: sgen
$ mono-sgen ./HashTableBenchmark.exe
m[42] = 42
Took 1.09152s

m[42] = 42
Took 1.05821s

m[42] = 42
Took 1.07819s

m[42] = 42
Took 1.0668s

m[42] = 42
Took 1.06811s

m[42] = 42
Took 1.09969s

m[42] = 42
Took 1.06574s

m[42] = 42
Took 1.07106s

m[42] = 42
Took 1.05242s

m[42] = 42
Took 1.06812s

$ fsharpi

F# Interactive for F# 3.0 (Open Source Edition)
Freely distributed under the Apache 2.0 Open Source License

For help type #help;;

> for i in 1..10 do
-   let t = System.Diagnostics.Stopwatch.StartNew()
-   let m = System.Collections.Generic.Dictionary()
-   let mutable x = 0.0
-   for i=1 to 10000000 do
-     m.[x] <- x
-     x <- x + 1.0
-   printfn "m[42] = %g" m.[42.0]
-   printfn "Took %gs\n" t.Elapsed.TotalSeconds;;
m[42] = 42
Took 1.71913s

m[42] = 42
Took 1.27299s

m[42] = 42
Took 1.18967s

m[42] = 42
Took 1.28772s

m[42] = 42
Took 1.20197s

m[42] = 42
Took 1.09899s

m[42] = 42
Took 1.23109s

m[42] = 42
Took 1.13189s

m[42] = 42
Took 1.24924s

m[42] = 42
Took 1.14216s

val it : unit = ()
> #quit;;

- Exit...