This blog has been moved to http://info.timkellogg.me/blog/

Friday, February 10, 2012

C# Reflection Performance And Ruby

I've always known that reflection method invocations C# are slower than regular invocations, but I've never never known to what extent. So I set out to make an experiment to demonstrate the performance of several ways to invoke a method. Frameworks like NHibernate or the mongoDB driver  are known to serialize and deserialize objects. In order to do either of these activities they have to scan the properties of an object and dynamically invoke them to get or set the values. Normally this is done via reflection. However, I want to know if the possibility of memoizing a method call as an expression tree or delegate could offer significant performance benefits. On the side, I also want to see how C# reflection compares to Ruby method invocations.

I posted the full source to a public github repo. To quickly summarize, I wrote code that sets a property on an object 100 million times in a loop. Any setup (like finding a PropertyInfo or MethodInfo) is not included in the timings. I also checked the generated IL to make sure the compiler wasn't optimizing the loops. Please browse the code there if you need the gritty details.

Before I get into the implementation details, here are the results:



You can see that a reflection invoke is on the order of a hundred times slower than a normal property (set) invocation.

Here's the same chart but without the reflection invocation. It does a better job of showing the scale between the other tests.



Obviously, the lesson here is to directly invoke methods and properties when possible. However, there are times when you don't know what a type looks like at compile time. Again, object serialization/deserialization would be one of those use cases.

Here's an explanation of each of the tests:

Reflection Invoke (link)

This is essentially methodInfo.Invoke(obj, new[]{ value } on the setter method of the property. It is by far the slowest approach to the problem. It's also the most common way to solve the problem of insufficient pre-compile time knowledge.

Direct Invoke (link)

This is nothing other than obj.Property = value. Its as fast as it gets, but impractical for use cases where you don't have pre-compile time knowledge of the type.

Closure (link)

This isn't much more flexible than a direct invoke, but I thought it would be interesting to see how the performance degraded. This is where you create a function/closure ( (x,y) => x.Property = y) prior to the loop and just invoke the function inside the loop (action(obj, value)). At first sight it appears to be half as fast as a direct invoke, but there are actually two method calls involved here, so it's actually not any slower than a direct invoke.

Dynamic Dispatch (link)

This uses the C# 4.0 dynamic feature directly. To do this, I declared the variable as dynamic and assigned it using the same syntax as a direct invoke. Interestingly, this performs only 6x slower than direct invoke and about 20x faster than reflection invoke. Take note, if you need reflection, use dynamic as often as possible since it can really speed up method invocation.

Expression Tree (link)

The shortcoming of most of the previous approaches is that they require pre-compile time knowledge of the type. This time I tried building an expression tree (a C# 3.0 feature) and compiled a delegate that invokes the setter. This makes it flexible enough that you can call any property of an object without compile-time knowledge of the name, as long as you know the return type. In this example, like the closure, we're indirectly setting the property, so two method calls. With this in mind, it took almost 2.5 times as long as the closure example, even though they should be functionally equivalent operations. It must be that expression trees compiled to delegates aren't actually as simple as they appear.

Expression Tree with Dynamic Dispatch (link)

Since the expression tree approach requires compile-time knowledge of the return type, it isn't as flexible. Ideally you could use C# 4.0's covariance feature and cast it to Action which compiles, but fails at runtime. So for this one, I just assigned the closure to a variable typed as dynamic to get around the compile/runtime casting issues.

As expected, it's the slowest approach. However, its still 16 times faster than direct reflection. Perhaps, memoizing method calls, like property sets and gets, like this would actually yield a significant performance improvement.

Compared To Ruby

I thought I'd compare these results to Ruby where all method calls are dynamic. In Ruby, a method call looks first in the object's immediate class and then climbs the ladder of parent classes until it finds a suitable method to invoke. Because of this behavior I thought I would be interesting to also try a worst-case scenario with a deep level of inheritance.

To do this fairly, I initially wrote a while loop in Ruby that counted to 100 million. I rewrote the while loop in n.each syntax and saw the execution time get cut in half. Since I'm really just trying to measure method invocation time, I stuck with the n.each syntax.



I honestly thought C# Reflection would be significantly faster than the Ruby with 5 layers of in inheritance. While C# already holds a reference to the method (MethodInfo), Ruby has to search up the ladder for the method each time. I suppose Ruby's performance could be due to the fact that it's written in C and specializes in dynamic method invocation.

Also, it interests me why C# dynamic is so much faster than Ruby or reflection. I took a look at the IL code where the dynamic invoke was happening and was surprised to find a callvirt instruction. I guess I was expecting some sort of specialized calldynamic instruction (Java 7 has one). The answer is actually a little more complicated. There seems to be several calls - most are call instructions to set the stage (CSharpArgumentInfo.Create) and one callvirt instruction to actually invoke the method.

Conclusion

Since the trend of C# is going towards using more Linq, I find it interesting how much of a performance hit developers are willing to exchange for more readable and compact code. In the grand scheme of things, the performance of even a slow reflection invoke is probably insignificant compared to other bottlenecks like database, HTTP, filesystem, etc.

It seems that I've proved the point that I set out to prove. There is quite a bit of performance to be gained by memoizing method calls into expression trees. The application would obviously be best in JSON serialization, ORM, or anywhere when you have to get/set lots of properties on an object with no compile-time knowledge of the type. Very few people, if any, are doing this - probably because of the added complexity. The next step will be to (hopefully) build a working prototype.


7 comments:

  1. The C# mongodb driver does indeed cache it's reflection by compiled expression trees at runtime.

    ReplyDelete
  2. That's good to know. I didn't get a chance to browse the source. I have a feeling many libraries don't take advantage of reflection caching.

    ReplyDelete
  3. Same with NHibernate. Bytecode is being dynamically generated for data mappings upon startup, which results in a slow up-front load when creating a session factory, but usually you create one session factory per app domain.

    I dont know what they do with ORMs in Ruby (I wish I did) but in .NET all popular ORMs cache data mappings in some format OR they use dynamic expando objects.

    These are some great figure you have put together!

    ReplyDelete
  4. You should checkout out DynamicMethod creation. I used it to implement my Clonable class for extremely fast object cloning. You can find the code for that here: https://github.com/iSynaptic/iSynaptic.Commons/blob/master/Application/iSynaptic.Commons/Runtime/Serialization/Cloneable.cs

    I wrote a little bit about this here: http://blog.jordanterrell.com/post/iSynapticCommons-Cloneablelt;Tgt;.aspx

    ReplyDelete
  5. Jordan - I've looked at iSynapticCommons before and I've been very impressed with what I've seen. I see you're emitting CLR OpCodes to build code. An alternative approach is to use Mono.CSharp.Evaluator to compile significant amounts of code at runtime (http://tirania.org/blog/archive/2008/Sep-10.html)

    ReplyDelete