10 April 2018

Open Source Codebase Book Club presents: Dapper.NET (#1)

Jeroen Techniek

We’re doing an ‘Open Source Codebase Book Club’ study group at Infi, and we decided to blog about each codebase we examine. First up in our series is Dapper.NET. We first did a high-level fly-over, followed by a deep dive. Curious what we found? Read on!

We love open source at Infi. So far, though, that has mainly meant using it. We want to do more, and the first step towards doing more is knowing more. So we decided to carefully study a few prominent open source codebases, to see “how they do things”. This is what the series will look like:

Part 1: Dapper.NET (this post)
Part 2: Nodatime
Part 3: JSON.NET

This first post covers our findings from the Dapper.NET codebase. Let’s dive in!

About Dapper

Dapper is a thin layer on top of the IDbConnection interface. It allows you to query databases with idiomatic C#. Return values (especially from SELECT queries) can be automatically mapped to objects. As such, it can be considered a “micro-ORM”. Here’s a typical way to use it:

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

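// Needs `using Dapper;` (for the Query<T> extension method) and `using System.Linq;`.
// `connection` is assumed to be an existing IDbConnection.
// The anonymous object supplies the value for the @Id parameter.
var person42 = connection.Query<Person>(
    "SELECT Id, Name FROM dbo.Person WHERE Id = @Id",
    new { Id = 42 }
).FirstOrDefault();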

It is used (and for a large part maintained) by engineers from Stack Overflow.

Fair warning: this will be an in-depth blog post, reviewing Dapper in a conscientious manner. We won’t blame you if you stop reading right now.

Still with us? Good! Let’s get started!

Basics

First, let’s get an overview of the project and its basics.

Age	7 years (first commit: April, 2011)
Website	Dapper on github.io
Hosting	GitHub: StackExchange/Dapper
License	Apache 2.0
Readme	367 lines (some examples, performance highlights, basic docs, links to more info)
Contributing	No explicit guidelines, but pull requests are welcome.
Wiki	–
Issues	298 open, 337 closed
Community	No user group, mailing list, Slack, or similar. Maintainers are active on Stack Overflow, GitHub’s issues, and Twitter though.
Stack Overflow	Dapper tag has around 2,000 questions.
Branching strategy	?
Tagging and versioning	Semver
CI/CD	AppVeyor and custom shell scripts
.gitignore	basic Visual Studio and Windows stuff
Environment	.editorconfig file with dotnet and visual studio specific settings
Dependencies (core)	Only System.* APIs
Dependencies (tests)	XUnit, BenchmarkDotNet

Dapper is an immensely popular library, used by (and spawned from) an immensely popular website: Stack Overflow. Its core maintainers (Nick Craver and Marc Gravell, and previously Sam Saffron, who now works with Jeff Atwood on Discourse) are active members of the C# community. They are a smart, active, honest, straightforward (sometimes blunt) bunch of no-nonsense folks who keep a tight rein on this library.

Initial Thoughts

First thing to notice is the flat folder structure. The root contains the .sln and projects straight up, with things like the README and build scripts mixed into that root.

Opening the solution in Visual Studio we see a similar picture. A nearly completely flat structure, separating only solution items, core source, and tests. Also noticeable: there are only a handful of projects.

Diving into the main Dapper.csproj project we see this pattern repeated, as if looking at a perfect Mandelbrot set. A completely flat project with 45 .cs files is all there is to it. This should be easy enough to dissect, right!?

Now the hard part starts: how to dive further into this project? I can see various approaches:

Alphabetically: going through all the files of the core project
Test Suite Driven: letting the tests guide me through the public API
API Driven: investigating the public API (e.g. with the Object Browser)
Documentation Guided: letting the documentation or getting started drive me through the codebase
Debug Mode: create my own test app using the source code directly, stepping through the code

All options seem good enough here. I decided to start with the first one, thinking I might discover new approaches along the way.

Deep Dive

It’s hard to describe a voyage of discovery without ending up writing an entire book. So instead, let’s just highlight some things noted while diving.

Xml Comments

The public API of this library is pretty darn well documented. There are Xml Comments for all important public members. Simple, developer-oriented, intellisense-friendly language is used. No wonder people love this library.
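To illustrate the kind of comment meant here, below is a minimal sketch of an XML-documented member. The `Greeter` class and its `Greet` method are made up for this post; they are not taken from Dapper.

```csharp
using System;

public static class Greeter
{
    /// <summary>
    /// Builds a greeting for the given name.
    /// </summary>
    /// <param name="name">The name to greet; must not be null.</param>
    /// <returns>A greeting such as "Hello, Ada!".</returns>
    /// <exception cref="ArgumentNullException"><paramref name="name"/> is null.</exception>
    public static string Greet(string name)
    {
        if (name == null) throw new ArgumentNullException(nameof(name));
        return "Hello, " + name + "!";
    }
}
```

The summary, parameter, and return descriptions are what an editor shows in its IntelliSense tooltip when you call the member.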

.NET Standard

In many places there are examples of compiler directives for .NET Standard versions. As a line-of-business-app developer I seldom feel the pain of having to support multiple target frameworks. I can only imagine how much complexity this adds for a library author.
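For readers who have not seen such directives, here is a rough sketch of the general shape. The `TargetInfo` class is hypothetical and not from Dapper; the `NETSTANDARD1_3` and `NETSTANDARD2_0` symbols are the ones the .NET SDK defines when building for those target frameworks.

```csharp
public static class TargetInfo
{
    // Returns a label for the target framework this code was compiled for.
    // Only one of the branches below survives compilation.
    public static string Describe()
    {
#if NETSTANDARD1_3
        return ".NET Standard 1.3 build";
#elif NETSTANDARD2_0
        return ".NET Standard 2.0 build";
#else
        return "other target framework";
#endif
    }
}
```

Every such block is effectively a fork in the code: each branch has to compile and behave correctly for its own target, which is where the extra complexity comes from.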

Style

I firmly believe that code should ideally read as though it was written by a single person. And even though this codebase is pretty clean and consistent, it feels like there are multiple different people working on it. It’s not at the point where it harms readability though. But it is a useful reminder to stay vigilant in your own code about this.

Spaghettiness

Let’s start with a disclaimer. This codebase was authored by some pretty smart people. People who can balance correctness and pragmatism as if they work at Cirque du Soleil. So the next bit is not direct criticism, but more an observation, and a trigger for introspection. Okay, disclaimers done, let’s get to it.

For whatever reason, several “typical” coding best practices are violated. For example, the CreateParamInfoGenerator method in SqlMapper.cs is a whopping 400 lines long. It has loads of nested branches and loops. And it’s inside a 4000-line partial class. Wow…

Now I can imagine a dozen good reasons for this code being like it is. And there are probably two dozen more I could not think of. Maybe one day I’ll go ahead and dig into it, or ask someone who knows. Until that day, let’s just remember: code quality is merely measured in WTFs/minute, and that number’s pretty low for Dapper.NET.

Keywords and Language Features

The codebase uses several language features, keywords, and patterns I don’t regularly use (or use at all). These include but are not limited to:

unchecked in actual use
a private sealed class
IL Generator usage (this one was particularly interesting, because it is heavily used in a critical path, and I don’t think I’ve ever had a chance to use it in my own code)

If nothing else, it’s fun to see these in action!
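For those equally unfamiliar with these features, here is a small, self-contained sketch of two of them. It is not Dapper’s code: the `LanguageFeatureSketch` class, `CombineHash`, and `BuildAdder` are invented for illustration, and the IL emitted here is far simpler than anything a mapper would generate.

```csharp
using System;
using System.Reflection.Emit;

public static class LanguageFeatureSketch
{
    // unchecked: integer overflow wraps around instead of throwing,
    // even when the project is compiled with overflow checking on.
    public static int CombineHash(int a, int b)
    {
        unchecked
        {
            return (a * 397) ^ b;
        }
    }

    // ILGenerator: build a method at runtime, one IL instruction at a time.
    // The emitted body is equivalent to: (x, y) => x + y
    public static Func<int, int, int> BuildAdder()
    {
        var method = new DynamicMethod("Add", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // push first argument
        il.Emit(OpCodes.Ldarg_1); // push second argument
        il.Emit(OpCodes.Add);     // pop both, push their sum
        il.Emit(OpCodes.Ret);     // return the value on the stack
        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }
}
```

Calling `LanguageFeatureSketch.BuildAdder()(2, 3)` yields 5. The appeal of this technique on a hot path is that the generated delegate runs as ordinary compiled code, without per-call reflection.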

Throw actions

Consider this snippet from DynamicParameters.cs:

Action @throw = () => throw new InvalidOperationException(message);

// stuff

if (blah && blurps)
{
    // stuff
}
else
{
    @throw();
}

// more stuff

// another block using `@throw();`

Pretty straightforward. Makes absolute sense to do something like this. Yet a pattern I hadn’t used before.
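To make the pattern concrete, here is a minimal runnable sketch. The `ThrowActionSketch` class and its `ValueOf` parser are hypothetical; only the `@throw` delegate idea comes from the snippet above.

```csharp
using System;

public static class ThrowActionSketch
{
    // Hypothetical helper: takes "key=value" and returns the value part.
    public static string ValueOf(string pair)
    {
        // The failure is defined once; every branch below reuses it.
        Action @throw = () => throw new InvalidOperationException(
            "Expected 'key=value' but got: " + pair);

        if (pair == null) @throw();

        int split = pair.IndexOf('=');
        if (split <= 0) @throw();

        return pair.Substring(split + 1);
    }
}
```

One trade-off worth knowing: the compiler cannot tell that invoking the delegate never returns, so the code after each `@throw();` still has to compile as if execution could continue.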

Using goto

There’s an actual goto statement in this codebase. It’s been a while since I had seen one. It’s been a long while since I had used one (in 4G languages, at least). I wonder what the history is here, but the archeological effort is a bit big: the git blame commit is a generic one. Just history, I suppose.
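As a reminder of what goto looks like in C#, here is one commonly cited use: leaving nested loops in a single jump. This is a made-up example (`GotoSketch.Contains`), and I am not claiming it is how or why Dapper uses goto.

```csharp
public static class GotoSketch
{
    // Returns true if any cell in the jagged grid equals target.
    public static bool Contains(int[][] grid, int target)
    {
        bool found = false;
        foreach (int[] row in grid)
        {
            foreach (int cell in row)
            {
                if (cell == target)
                {
                    found = true;
                    goto done; // leave both loops at once
                }
            }
        }
    done:
        return found;
    }
}
```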

Humor?

I personally extend Uncle Bob’s “Don’t be cute” advice from Clean Code, and avoid (attempts at) humor in permanent records like code altogether. However, the Dapper maintainers are known to jest and vent in code, specifically in code comments. And even though I personally avoid it, I still enjoy it when I see it. Here’s a typical example:

try
{
    return cmd.ExecuteReader(GetBehavior(wasClosed, behavior));
}
catch (ArgumentException ex)
{ // thanks, Sqlite!
    // etc.

You can just smell the developer’s frustration here.

Towards a Conclusion

Interestingly, most of the things I noticed are “meta” stuff. The codebase as a whole is pretty straightforward. It is a clever facade for the built-in System SQL features, just as it claims. There is no “magic”, just “worker-bee code”.

As to the method I chose (going through it alphabetically): this turned out to work quite nicely. The reason it worked well is probably that it was a relatively small and flat codebase. Next time I think I’ll glance over the project structure and based on that I’ll pick the same or a different approach.

About the codebase itself, the main (pleasant) surprise was how straightforward it is. Given the complex and complicated data-access code I’ve seen (and created) in my past, it’s fun and refreshing to see how simple this data-access library’s structure is. Sometimes things can be simple. And clever.