Tim Kellogg: December 2011

Friday, December 30, 2011

Can Bad Code Ruin Your Career?

I started writing this post over a year ago. I was working at a large company where I was stuck on a hamster wheel - always running to keep up but never getting anywhere. The code I had to work with was downright terrible. This, among other things, prodded me into looking for another job. While I was starting my job search I was pondering this post and decided not to finish it because I wasn't sure whether some prospective employer would hold it against me.

With that said...

I just finished reading through a messy Java file. It was the usual mess of a class with a 500-line god-method (similar to the god-object) and hundreds of instances of copy-and-pasted code. Besides the redundant code and lack of structure, the coder also used nested loops through ArrayLists when they could have used a HashSet, and didn't once use generic collections, using the raw, unchecked versions instead. After several hours of refactoring and renaming variables I finally got to a point where I could begin fixing the bug I was after. There were absolutely no unit tests - all this code was written inline with HTML in a JSP.
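To make the ArrayList-versus-HashSet complaint concrete, here is a minimal sketch. It is not the code from that JSP; the class and method names (CommonIds, commonRaw, commonTyped) and the sample data are made up for illustration. The first method shows the raw-collection, nested-loop style; the second does the same job with generic collections and a HashSet lookup.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical example: none of these names come from the code described in the post.
class CommonIds {

    // The style described above: raw (non-generic) lists and nested loops, O(n * m).
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List commonRaw(List first, List second) {
        List result = new ArrayList();
        for (int i = 0; i < first.size(); i++) {
            for (int j = 0; j < second.size(); j++) {
                // Nothing stops a caller from mixing element types; mistakes only show up at run time.
                if (first.get(i).equals(second.get(j)) && !result.contains(first.get(i))) {
                    result.add(first.get(i));
                }
            }
        }
        return result;
    }

    // Generic collections plus a HashSet: type-checked at compile time, roughly O(n + m).
    static List<String> commonTyped(List<String> first, List<String> second) {
        Set<String> lookup = new HashSet<>(second);
        Set<String> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String id : first) {
            // seen.add returns false for duplicates, so each id is reported once.
            if (lookup.contains(id) && seen.add(id)) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> a = List.of("a1", "b2", "c3", "b2");
        List<String> b = List.of("b2", "c3", "d4");
        System.out.println(commonRaw(a, b));
        System.out.println(commonTyped(a, b));
    }
}
```

Both methods return the same elements in the same order; the difference is that the typed version lets the compiler reject a list of the wrong element type and avoids rescanning the second list for every element of the first.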

I spend so much time reading bad code that sometimes I wonder if I am beginning to specialize in hacks. Is it possible to read so much bad code that you forget what good code looks like? Humans are an especially adaptive species, and I think it's definitely possible that a great programmer can be forced to work in the muck so long that they forget what good code looks like.

I've seen several situations where good developers produced bad code. These situations are almost always a product of an environment where features are more important than bug fixes. These companies typically invest heavily in sales and neglect IT and development costs. Or sometimes the problem is just that product management knows nothing of software development.

The 5 stages of grief

A recent coworker likened our job of working with brittle, badly designed code to the 5 stages of grief. While we were uneasily laughing about it I silently decided that this was more realistic than I wanted to believe.

For instance, imagine starting a new job. In the interview process you were interviewed by intelligent, enthusiastic developers and were led to believe you were going to be working on cutting edge technologies - a dream, right? When you actually get to the job you find out that the code is so backwardly complicated that it's nearly impossible to touch anything without bringing the proverbial house of cards crashing down.

Grief Stage 1: Denial and Isolation

Obviously the code isn't the problem, you just weren't careful enough. They probably have specific guidelines and strategies that help them be more productive. It's probably just something wrong with me...

Grief Stage 2: Anger

Dammit! Who the hell even thinks of this crap? [more cursing...] Is this a god-object?? [hair gets thinner...]

Grief Stage 3: Bargaining

This is typically when you start plotting potential strategies to hide the ugliness of the code. Creativity and hopeful thoughts abound. Many IT managers will talk like they are very supportive of you at this stage.

Grief Stage 4: Depression

This is where reality strikes: the cleanup you were plotting in stage 3 is bad for the business plan because it means spending less time on revenue-producing features. The IT managers who seemed so supportive now flip-flop to the CEO's side and deny you the ability to cope with your problems.

Grief Stage 5: Acceptance

There are only two outcomes of this stage. Either (1) you accept that you can never fix the code so you decide to move on to another job or (2) you accept that you can never fix the code so you give up on trying. This is what separates good coders from bad.

Conclusion

Again, I started this post over a year ago. I've seen a lot of bad code. At my most recent job I almost took the "give up on trying" path in the acceptance stage. Luckily we hired a great older developer who snapped me out of it. I just started my new job today, and I think I will be much happier.

So can bad code ruin your career? My answer is a resounding YES! But it doesn't have to. Honestly, stage 5 can have better endings, but that inevitably requires understanding on the part of management - a scarce resource.

Wednesday, December 28, 2011

Behavior Driven Development in C#

I've been a fan of Test Driven Development since I worked in an XP shop. But every time the work starts getting bigger and more complex I struggle not to get lost in the multitude of tests. I remember many early conversations with my elders about unit test naming conventions. The [method]_[input]_[output] convention starts to break down badly when your inputs become things like mocks, or when there are more than 1 or 2 inputs; the same goes for outputs.

When a coworker introduced me to BDD earlier this year, it really clicked and flowed naturally. The idea of writing tests so they read like sentences out of a book or spec seems like the answer to all my questions. The ruby rspec is beautiful:

The organization of the tests forces you to focus on the expectations of your test and highlight descriptive assertions. This is especially useful for complicated setups with lots of mocks, etc. I put as much of my setup code as I can in one of those before :each blocks, so that the assertions are limited to simple inputs and one or two observations about the outputs.
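The organization described above (a context, shared setup, and assertions that each make one observation) can be sketched without rspec. The following is a minimal, framework-free illustration in Java, not the original rspec listing; the class name, the before method and the tiny runner are all hypothetical stand-ins for what a real framework provides.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical spec for a stack holding two items; plain Java, no test framework.
class WhenAStackHasTwoItems {
    private Deque<String> stack;

    // Plays the role of rspec's "before :each": all arrangement lives here.
    void before() {
        stack = new ArrayDeque<>();
        stack.push("first");
        stack.push("second");
    }

    // Each spec reads as a sentence and makes a single observation.
    boolean itReturnsTheNewestItemOnPop() {
        return "second".equals(stack.pop());
    }

    boolean itHasOneItemLeftAfterAPop() {
        stack.pop();
        return stack.size() == 1;
    }

    // Tiny runner: fresh setup before every spec, as a real framework would do.
    static int runAll() {
        int failures = 0;
        WhenAStackHasTwoItems spec = new WhenAStackHasTwoItems();
        spec.before();
        if (!spec.itReturnsTheNewestItemOnPop()) failures++;
        spec.before();
        if (!spec.itHasOneItemLeftAfterAPop()) failures++;
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failures: " + runAll());
    }
}
```

The class name carries the context and the method names carry the expectations, so a failing spec reads as a sentence ("when a stack has two items, it returns the newest item on pop") rather than as a method_input_output triple.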

There have been a number of people in the .NET community who have attempted BDD but [imo] failed to grasp the simplicity. NBehave is a complete overhaul of unit testing that uses attributes like xUnit. As a result, NBehave doesn't really look at all like rspec - which really isn't a bad thing, necessarily. However, the thing I like about rspec is its ability to describe things of arbitrary depth, which is handy when testing complex code:

This spec is able to describe possible modes that the object under test can be in (complex inputs). This is made possible by rspec's arbitrary nesting depth. This is definitely a language feature that is much harder to implement in C#.

My current approach to BDD in C# usually looks like

I think this is the simplest BDD layer I can slap on top of NUnit. And simple is important to me because (a) I do a lot of open source projects and I want to keep the barrier to entry for contributions low and (b) the people I work with tend to resist change. When people are resistant to change, it's hard to rationalize using something other than NUnit or introducing lots of nested lambdas.

NUnit remains the most popular unit testing framework and has excellent support with a GUI runner, console runner, and IDE integration with R#, TestDriven.NET, and others. Given all that support, I would really rather not abandon NUnit if possible.

FluentAssertions is a nice simple BDD layer on top of NUnit (or whatever you use). It doesn't change the structure of our spec above, but it does change the structure of our assertion to

This assertion is [imo] very clean and succinct. I like how it reads even more clearly than NUnit's fluent syntax. Last weekend I was thinking about this and I decided to explore an idea to make a BDD extension to NUnit that is even clearer than FluentAssertions. The project, BehavioralNUnit for now, is hosted on GitHub. The earliest goal for the project was simply to use operator overloading to make the assertions even more like rspec. For instance, I want to be able to write the previous assertion as:

I was able to do this, but I realized that the C# compiler was insisting that this expression needed to be assigned to something, so I [haven't yet] added another concept somewhat analogous to "it" in rspec:

This is most similar to NSpec's approach, but uses an indexer instead of a method. This appeals to me because I sometimes find matching parentheses to be a pain (I guess I just like ruby & coffeescript). Then again, I don't like NSpec because it feels like it was written by one of those whining .NET developers who wishes dearly he could get a RoR job - it doesn't abide by .NET conventions at all.

I still have a ton of ideas to hash out with Behavioral NUnit. I'm convinced that BDD in C# can be simpler and more beautiful than it currently is. If you have input or ideas, please fork the repository & try out your ideas (pull requests are welcome).

Monday, December 26, 2011

Why I hate generated code

If you've worked with me for any amount of time you'll soon figure out that I often profess that "I hate generated code". This position comes from years of experience with badly generated code. Let me explain.

The baby comes with a lot of bathwater

In the past year I had an experience with a generated data layer where CodeSmith was used to generate a table, 5 stored procedures, an entity class, a data source class, and a factory class for each entity that was generated. My task was to convert this code into NHibernate mappings.

The interesting thing about this work is how little of the generated code was actually being used. I'm sure, in the beginning, the developer's thoughts were along the lines of "oh look at all this code I don't have to write manually :D". However, after some time, subsequent developers' thoughts were along the lines of "with all this dead code, it's hard to find real problems". It's funny how some exciting breakthroughs turn into headaches down the road. The table is always used, but some entities are created & read but never modified, and others are only created during migrations and only read at run time.

Code generators often produce code you don't need. Since all code requires maintenance, dead code is just a liability because it doesn't provide any benefit. I always delete dead code and commented out code (it'll live on in version control, no need to release it into production).

There are several professional developer communities that generate code as a way of life. Ruby on Rails comes prepackaged with scripts to generate models, views, and controllers in a single command. ASP.NET MVC will generate controllers and views with a couple clicks. And if you've ever used either of these frameworks, you'll probably find yourself deleting a lot of generated code.

The problem of transient code generation

The issue that I keep running into with my policy of hating code generation is that it's nearly impossible to be a professional software engineer and not generate code. The most fundamental problem is compilers. When you run a compiler over your source code, it generates some sort of machine readable code that is optimized for various goals like speed or debugging or different platform targets.

While I hate code generators, it's hard to argue how I could possibly hate compilers. They allow me to write code once and compile it several different ways and achieve different goals. Therefore, I have to introduce my first caveat - I don't hate all generated code, I only hate generated source code.

This problem of hating generated code is complicated further by the fact that NHibernate generates source code too. You don't ever check in the code that NHibernate generates because it's done at run time. The most obvious way NHibernate generates code is the SQL that is written in the background to query & perform DML operations. (For those questioning if SQL is source code, consider how SQL is compiled into an execution plan prior to execution). It's also hard to argue that I hate this kind of code generation because it doesn't suffer from the same problems of the CodeSmith generated code. It only generates code just-in-time meaning that it's only generated when needed, so there isn't any extra code generated.

Since NHibernate and compilers do code generation in a way that I like, I'm going to refine my statement to "I hate generated persistent code". This generally means, I still hate generated code when the resulting code sticks around long enough for a fellow developer to have to deal with it.

The thin line between good and bad code generation

When is generated code persistent and when is it transient? We already decided that code generation isn't so bad when it happens during or after the compilation process. But my statement is that I hate persistent generated code. There are other cases of code generators generating transient source code. One such example is in iSynaptic.Commons.

Since C# doesn't yet (and probably won't ever) include variadic templates or variadic generic types, writers of .NET APIs often write some really redundant code to account for all combinations of generic methods or types. I know I've done it. This example uses a T4 template to produce a C# file with a *.generated.cs extension. The T4 template is executed on build, but the generated file is not ignored by version control.

I do like this approach because it takes a DRY approach to a redundant problem without much complication. Another thing I really like about this approach is that T4 templates are a standard part of Visual Studio and are executable from Mono as well. As such, they can be considered a free tool that is openly available (important for open source projects) and, more importantly, are executed as part of the build process.

Another thing I like about this approach is the usage of partial classes to separate the generated portion of the class from the non-generated portion. This minimizes the amount of code that is sheltered from refactoring tools (code inside the *.tt file).

The thing I hate about this particular iSynaptic.Commons example is that the generated file is included in version control. I think, perhaps, this is reduced to a small pet peeve of mine since the generated code isn't wasteful and is updated on every build. Still, I would like a mechanism to (a) have the file ignored from the IDE's perspective and (b) ignored from version control. I wouldn't want anyone to mistakenly edit the file when they should be editing the T4 template.
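The redundancy that such a template removes can be sketched as a small generator. This is not the iSynaptic.Commons T4 template, and it is written in Java rather than C#/T4; ArityGenerator, Func1..FuncN and the apply signature are hypothetical names chosen for illustration. Java, like C#, has no variadic generics, so the emitted declarations are the kind of near-identical code someone would otherwise type by hand.

```java
import java.util.StringJoiner;

// Hypothetical generator: emits one functional-interface declaration per arity,
// the kind of repetition a template would otherwise produce.
class ArityGenerator {

    // Builds e.g. "interface Func2<T1, T2, R> { R apply(T1 a1, T2 a2); }"
    static String declaration(int arity) {
        StringJoiner typeParams = new StringJoiner(", ");
        StringJoiner args = new StringJoiner(", ");
        for (int i = 1; i <= arity; i++) {
            typeParams.add("T" + i);
            args.add("T" + i + " a" + i);
        }
        typeParams.add("R"); // the result type always comes last
        return "interface Func" + arity + "<" + typeParams + "> { R apply(" + args + "); }";
    }

    // Concatenates the declarations for arities 1..maxArity, one per line.
    static String generate(int maxArity) {
        StringBuilder out = new StringBuilder();
        for (int arity = 1; arity <= maxArity; arity++) {
            out.append(declaration(arity)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(generate(3));
    }
}
```

If a generator like this runs as part of the build, its output never needs to be edited or reviewed by hand, which is the property that makes this kind of generated code transient rather than persistent.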

Summary

The end result of my thinking is "I hate source code that is generated prior to the build process". I want to further say that I also hate generated code that is checked into version control, but this is a bit of a lesser point. However, code generation can be a useful tool, as seen in the cases of NHibernate and T4 templates. Even so, code generation should be used wisely and with care. Generating excess code can become a liability that detracts from the overall value of a product.

Thursday, December 1, 2011

Defining Watergile

At my current place of employment we've had a layer of management placed above us that fervently preaches the mightiness of agile. This management devotes much lecture time to informing us of the proper procedure for planning a product. First you gather requirements and architect the entire system and write detailed requirements documents - good enough that developers don't need to refine them any further and QA knows exactly what to test. When requirements are written for the entire system - 12-24 months in advance - then you begin coding. After you're done coding, QA begins to test.

To be clear, anyone reading the previous paragraph should be scratching their head and thinking to themselves, "gee, that sounds a lot like waterfall". Well it is, hence the portmanteau watergile (we considered agilfall but it just doesn't roll off the tongue as well).

The trouble is, even though we coined the term just recently, this watergile thing is a frigging pandemic. Every time I crack open a fresh copy of SD Times there seems to be some guy telling you that you need to be measuring KSLOC and a billion other software metrics but at the same time claiming that agile is the only way. It wouldn't be so scary except that this is the source of direction for software development managers.

It's no wonder watergile is so widespread: IT managers are fed a constant stream of B.S. mixed messages. How could anyone make sense of any of it without dismissing most of it? The truth is, waterfall is hard and so is agile. Anything in between is just ad hoc and set up to fail. If you are a development manager and reading this, find those tech magazines on the corner of your desk and show them to the recycling bin. They're worthless and distracting to progress.