[Mono-dev] Performance problems on ARM/linux

Martin Fuzzey mfuzzey at parkeon.com
Thu Apr 9 13:34:13 EDT 2009

I'm running mono on ARM/linux (i.MX21 processor) using svn build 130875 
(the xbuild in the 2.4 release won't build my application).

It works, but the performance is poor.
Specifically, compared to the same hardware under WinCE and the .NET Compact
Framework, I'm seeing startup take about 6 times longer and many operations
about 7 times longer.

The mono application is compiled with xbuild / gmcs  (using framework
2.0 rather than the CF so it's not quite a fair comparison I know).

I've built mono with:
configure: --with-tls=pthread --without-static_mono
--without-sigaltstack --without-mcs_docs --disable-parallel-mark
I'm using the SMALL_CONFIG option for the GC.
I'm not using any --enable-minimal options as I'm not worried (for the
moment) about disk size.

[I need --without-static_mono because the application accesses
WaitForSingleObject etc. I can use the versions supplied by the io-layer
via an assembly config file, but that only works if both the mono runtime
and the imported native DLL use the same code.]

Initially I used ARM_FPU_NONE, but that causes floats (but not doubles)
to fail. In particular, it caused an assertion error related to the
load factor (which is a float) of the Hashtable class. With ARM_FPU_NONE the
following test app gives incorrect output:

using System;

public class FloatTest {

        static void Main()
        {
                        int i1 = 1;
                        int i2 = 2;
                        float f1 = 1.0f;
                        float f2 = 1.5f;
                        double d1 = 1.0;
                        double d2 = 1.5;
                        bool result;

                        System.Console.WriteLine("Hello world");
                        System.Console.WriteLine("i1=" + i1 + " i2=" + i2);
                        System.Console.WriteLine("f1=" + f1 + " f2=" + f2);
                        System.Console.WriteLine("d1=" + d1 + " d2=" + d2);

                        result = i1 < i2;
                        System.Console.WriteLine("i1 < i2 : " + result);
                        result = i1 > i2;
                        System.Console.WriteLine("i1 > i2 : " + result);

                        result = f1 < f2;
                        System.Console.WriteLine("f1 < f2 : " + result);
                        result = f1 > f2;
                        System.Console.WriteLine("f1 > f2 : " + result);

                        result = d1 < d2;
                        System.Console.WriteLine("d1 < d2 : " + result);
                        result = d1 > d2;
                        System.Console.WriteLine("d1 > d2 : " + result);
        }
}

[it gave gibberish values and incorrect comparisons for f1,f2  (but not
for d1,d2)]
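
One way to narrow this down further might be to print the raw bit patterns of
the floats, which would show whether the values are already wrong in storage
or only go wrong in comparison and formatting. The following is a minimal
sketch, not something that was run on the ARM target; the class and helper
names (FloatBits, Bits) are made up for illustration.

```csharp
using System;

public class FloatBits {

        // Returns the IEEE 754 bit pattern of a float as 8 hex digits.
        // GetBytes and ToUInt32 both use native byte order, so the
        // round trip is independent of endianness.
        public static string Bits(float f)
        {
                byte[] raw = BitConverter.GetBytes(f);
                return BitConverter.ToUInt32(raw, 0).ToString("X8");
        }

        static void Main()
        {
                float f1 = 1.0f;
                float f2 = 1.5f;

                // In IEEE 754 single precision, 1.0f is 3F800000
                // and 1.5f is 3FC00000.
                Console.WriteLine("f1 bits=" + Bits(f1));
                Console.WriteLine("f2 bits=" + Bits(f2));

                // Widen to double and dump those bits as well, since
                // doubles appear to behave correctly.
                double w1 = f1;
                Console.WriteLine("(double)f1 bits="
                        + BitConverter.DoubleToInt64Bits(w1).ToString("X16"));
        }
}
```

If the float bit patterns come out right but the printed values and
comparisons are still wrong, that would point at the float handling in the
generated code rather than at how the constants are loaded.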

Back to the performance problem: time shows that it's mostly userspace:

real    0m 51.26s
user    0m 44.05s
sys     0m 5.09s

I've tried using

but that gives me

** ERROR:(handles.c:497):_wapi_handle_new: assertion failed:
(_wapi_has_shut_down == FALSE)

** (fatapp/Fat_App.exe:1994): WARNING **: Thread (nil) may have been
prematurely finalized

Bug 445610 talks about something similar, but on OS X. Should the
profiler work on ARM?
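
While the profiler is unusable, a possible stopgap is to time suspect phases
by hand with System.Diagnostics.Stopwatch, which is part of the 2.0 framework.
This is only a sketch: PhaseTimer, Measure and the Sleep call standing in for
real work are hypothetical and not taken from the application.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public class PhaseTimer {

        // A unit of work to be timed.
        public delegate void Phase();

        // Runs the phase once, prints its wall-clock duration and
        // returns that duration in milliseconds.
        public static long Measure(string name, Phase phase)
        {
                Stopwatch sw = Stopwatch.StartNew();
                phase();
                sw.Stop();
                Console.WriteLine(name + ": " + sw.ElapsedMilliseconds + " ms");
                return sw.ElapsedMilliseconds;
        }

        static void Main()
        {
                // Placeholder workload; in the real application this would
                // wrap e.g. form construction or a slow operation.
                Measure("placeholder phase", delegate { Thread.Sleep(50); });
        }
}
```

Wrapping startup and a few of the slow operations this way should at least
show where the time goes at a coarse level.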

I don't think it's a gc problem because --stats gives:
Major GC collections:   20
Major GC time in msecs: 1012.903000

i.e. the GC only accounts for about 1 s of the 50 s.
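
As a rough check of that claim, the figures can be put side by side. The
numbers below are copied from the time and --stats output above; the class
name is only illustrative. The GC share works out to roughly 2% of both the
real time and the user time.

```csharp
using System;

public class GcShare {

        static void Main()
        {
                double gcMs = 1012.903;   // "Major GC time in msecs" from --stats
                double realSec = 51.26;   // "real" from time
                double userSec = 44.05;   // "user" from time

                double gcSec = gcMs / 1000.0;

                // Fraction of the run spent in major collections.
                Console.WriteLine("GC share of real time: {0:F1}%",
                        gcSec / realSec * 100.0);
                Console.WriteLine("GC share of user time: {0:F1}%",
                        gcSec / userSec * 100.0);
        }
}
```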

I've tried using AOT; it compiles fine and generates a .so, but when
running it I get a segfault:


  at System.Windows.Forms.Control.OnSizeChanged (System.EventArgs)
  at System.Windows.Forms.Control.OnSizeChanged (System.EventArgs) <0x00034>
  at System.Windows.Forms.Control.UpdateBounds (int,int,int,int,int,int)
  at System.Windows.Forms.Control.UpdateBounds (int,int,int,int) <0x000f3>
  at System.Windows.Forms.Control.SetBoundsCoreInternal
(int,int,int,int,System.Windows.Forms.BoundsSpecified) <0x00383>
  at System.Windows.Forms.Control.SetBoundsCore
(int,int,int,int,System.Windows.Forms.BoundsSpecified) <0x0005f>
  at System.Windows.Forms.Control.SetBoundsInternal
(int,int,int,int,System.Windows.Forms.BoundsSpecified) <0x0013b>
  at System.Windows.Forms.Control.SetBounds
(int,int,int,int,System.Windows.Forms.BoundsSpecified) <0x000ab>
  at System.Windows.Forms.Control.set_Size (System.Drawing.Size) <0x00047>
  at (wrapper remoting-invoke-with-check)
System.Windows.Forms.Control.set_Size (System.Drawing.Size) <0xffffffff>
  at Fat_App.Form1.InitializeComponent () <0x00a63>
  at Fat_App.Form1..ctor () <0x00033>
  at (wrapper remoting-invoke-with-check) Fat_App.Form1..ctor ()
  at Fat_App.Program.Main () <0x00033>
  at (wrapper runtime-invoke) object.runtime_invoke_void
(object,intptr,intptr,intptr) <0xffffffff>

Native stacktrace:

        /usr/local/lib/libmono.so.0 [0x4012403c]
        /usr/local/lib/libmono.so.0 [0x40054350]
        /lib/libc.so.6(__default_rt_sa_restorer+0) [0x405788b0]
        /usr/local/lib/libmono.so.0 [0x40224bd0]
        /usr/local/lib/libmono.so.0 [0x40224ed4]
        /usr/local/lib/libmono.so.0 [0x40051f98]
        /usr/local/lib/libmono.so.0 [0x40052a88]
        /usr/local/lib/libmono.so.0 [0x40053750]
        /usr/local/lib/libmono.so.0 [0x40053940]
        /usr/local/lib/libmono.so.0(mono_compile_method+0x7c) [0x40180638]
        /usr/local/lib/libmono.so.0 [0x40125544]

I don't get a full symbolic stack trace even if the binaries are not
But I don't understand why it found mono_compile_method and not the others.

Running the x86 version of mono (built from the same sources) has no
problems (all the profiling options work).
Just one strange thing in that case: even when using --aot, the --stats
output on execution shows
Methods from AOT:       0

I would much appreciate any ideas on:
a) Why the profiling / AOT stuff doesn't work
b) Other avenues to pursue


Martin Fuzzey
