Why I’ve found verbose tests are infinitely better

A good bit of my job involves a mixture of:

  1. Writing new features
  2. Maintaining existing code
  3. Testing that everything works as expected

DRY (Don’t Repeat Yourself) is a great tool for the first 2 parts, but I’ve found it becomes kludgy once applied to testing, because DRY code and testing code are different tools for different jobs.

DRY code makes sure you reuse code as much as sensibly possible; a smaller code footprint decreases the chance that you introduce bugs.

Also, DRY code allows you to quickly build out features. A lot of production-level code is a tiny bit of business logic, wrapped around scaffolding to keep everything safe.
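As a rough sketch of what I mean (the method and models here are made up for illustration):

# Most of this method is scaffolding; the business logic is a single line.
def apply_discount(order, code)
  raise ArgumentError, "order is required" if order.nil?

  coupon = Coupon.find_by(code: code)
  return order if coupon.nil? || coupon.expired?

  # The one line of actual business logic:
  order.total -= order.total * (coupon.percent_off / 100.0)
  order
end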

Testing code is about verifying behavior. It’s a combination of brain-dump, sanity check, and a battle of wits. And the goal for testing code is to help you sleep at night and enjoy life. You write tests to:

  1. Verify the code behaves the way you expect
  2. Catch regressions before your users do
  3. Document how the feature is supposed to work

When writing a lot of tests, there’s a temptation to start DRYing them up. It makes sense: you want to write & maintain as little code as possible. But in my experience, this ends up causing major problems down the road. The problems are even worse if you start to introduce metaprogramming 😱 (which always happens).

Verbose tests

Almost a year ago, my friend Chase Finch told me about his approach to testing, which was vastly different from what I’d heard before. Essentially: you should explicitly not care if your tests are verbose or have code blocks duplicated over and over for each test, because there are only a few cases where you ever touch them:

  1. When writing the test
  2. Code Reviews
  3. When refactoring something
  4. When the test breaks

Also: the more straightforward your tests are, the less work the test framework has to do, and the faster your suite runs.

Below is a quick summary of what I mean by “verbose tests,” along with the benefits I’ve observed in the time I’ve spent regularly writing verbose tests.

What do you mean by “verbose tests”?

You might also hear verbose tests described as “declarative,” but I like going with “verbose” because it keeps the intent front-and-center.

As a quick primer, here are some tests from Duck Hunt, one of my open-source projects:

class DuckHuntHashSchemaStructModeValidationTest < DuckHuntTestCase
  test "should return false if the object provided is not a hash" do
    schema = DuckHunt::Schemas::HashSchema.define do |s|
      s.test "name"
    end

    assert_equal false, schema.validate?("hello")
    assert_equal 1, schema.errors.size
    assert_equal ["wrong type"], schema.errors["base"]
  end

  test "should return false if one of the properties is not valid" do
    schema = DuckHunt::Schemas::HashSchema.define do |s|
      s.always_wrong_type "name"
    end

    assert_equal false, schema.validate?({:name => "hello"})
    assert_equal 1, schema.errors.size
    assert_equal ["wrong type"], schema.errors["name"]
  end

  test "should return false if the object is missing a required property" do
    schema = DuckHunt::Schemas::HashSchema.define do |s|
      s.test "name", :required => true
      s.always_right_type "hello", :required => false
    end

    assert_equal false, schema.validate?({:hello => "hello"})
    assert_equal 1, schema.errors.size
    assert_equal ["required"], schema.errors["name"]
  end

  test "should return false if the schema has been set to strict mode and the hash provided has extra properties" do
    schema = DuckHunt::Schemas::HashSchema.define do |s|
      s.test "name", :required => true
    end

    assert_equal false, schema.validate?({:name => "hello", :hello => "hello"})
    assert_equal 1, schema.errors.size
    assert_equal ["has properties not defined in schema"], schema.errors["base"]
  end
  #...
end

Verbose tests are fast

Adding dependencies to your test suite slows it down. Spec-style DSLs are especially costly, because they add a lot of syntactic sugar on top.

Below is a benchmark of the same test suite from Duck Hunt: one written in Minitest::Spec, and one in declarative, verbose Minitest:

# Spec Version
Fabulous run in 0.091554s, 4085.0209 runs/s, 10594.8402 assertions/s.

# Verbose
Fabulous run in 0.063634s, 5877.3612 runs/s, 15243.4233 assertions/s.

(Speedup: 0.091554 / 0.063634 ≈ 1.44×)

The basic tests are 1.4x faster because they’re much closer to the metal. This is a small, in-memory example, but over a large test suite you can easily shave off seconds/minutes per build. Faster tests → Less pain → Better code → Happier devs.

Writing the tests is a form of review

Tests are essential to code quality and to guiding code reviews. Once you’ve written a test, you’ve documented how the feature works and what to expect.

If you’re writing a lot of preamble for your tests, you probably need to rethink the code (or write tests in other places). Let’s take an example:

test "#invite_user adds the user to the account by their email" do
  stub_postmark_email_api
  account = accounts(:deep_space_9)
  user = users(:kira)

  assert account.subscription.active?
  assert_equal :admin, user.permissions
  assert account.users.count < account.subscription.user_limit
  assert account.can_add_user?

  assert_difference "Invitations.count", +1 do
    InviteUserViaEmail.invite_user(user: user, email: "worf@federation.space")
  end
end

As you can see, there’s a lot of extra stuff that’s checked and prepared before we test the actual behavior we’re looking for. This isn’t necessarily bad, but it’s a smell.

Maybe this is a crucial part of the app, so you’re willing to write extra tests in order to feel safer about its implications. But it could also be that you aren’t confident in other parts of the code, so you should write tests for those instead (or possibly even refactor them). You might also need to write some more fixtures or test data to better prepare for edge cases, as sketched below.
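As a minimal sketch of that last idea (the helper and the test below are hypothetical, reusing the fixtures from the example above):

# Hypothetical helper: fill the account right up to its user limit.
def account_at_user_limit
  account = accounts(:deep_space_9)
  (account.subscription.user_limit - account.users.count).times do |i|
    account.users.create!(email: "crew-#{i}@federation.space")
  end
  account
end

test "#invite_user refuses to add a user when the account is full" do
  account_at_user_limit

  assert_no_difference "Invitations.count" do
    InviteUserViaEmail.invite_user(user: users(:kira), email: "worf@federation.space")
  end
end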

Refactoring should be safe, straightforward, easy to follow

You want your tests to be a strong safety net when refactoring. When your tests are verbose, with as little sugar and metaprogramming as possible, they’re much more likely to be a strong safety net because:

  1. Each test reads top-to-bottom, in isolation
  2. A failure points directly at the behavior that broke
  3. Changing one test can’t silently break the others

Broken tests are a hassle, even on a good day

If tests are failing, you don’t want to have to spend time re-orienting yourself with the test code. You should just be able to look at a test in isolation and figure out what broke. And that’s even more true when it’s a critical bug late on a Friday night.

Verbose vs. Metaprogrammed

Stepping back a bit, let’s compare some metaprogrammed tests with their verbose counterparts. They both test the same bits of code, but the approaches are wildly different:

# The metaprogrammed example
describe SecurityAuditController do
  shared_examples_for "token_validation" do
    it "does not load if the token is empty" do
      @token = ""
      @action.call
      assert_response :not_found
    end

    it "does not load if the token is incorrect" do
      @token = "blah"
      @action.call
      assert_response :not_found
    end

    it "does not load if the token is old" do
      Timecop.freeze(Token.expiration_for(@token) + 10.minutes) do
        @action.call
        assert_response :not_found
      end
    end
  end

  describe "public reports" do
    describe "#show" do
      describe "public tokens" do
        before do
          @old_access_attempt_count = AccessAttempts.count
        end

        after do
          assert_equal @old_access_attempt_count + 1, AccessAttempts.count
        end

        @token = "valid_token"
        @action = lambda{ get :show, token: @token }
        it_behaves_like "token_validation"

        it "shows the status report" do
          @action.call
          assert_response :ok
          assert_match "Status Report", response.body
        end
      end

      # ... rest of the tests
      # ....
      # ...
    end
  end

  #... LOTS OF OTHER TEST CODE
  #... LOTS OF OTHER TEST CODE
  #... LOTS OF OTHER TEST CODE
  #... LOTS OF OTHER TEST CODE

  describe "guest passes" do
    describe "#accept" do
      describe "public tokens" do
        before do
          @old_access_attempt_count = AccessAttempts.count
        end

        after do
          assert_equal @old_access_attempt_count + 1, AccessAttempts.count
        end

        @token = "valid_token"
        @action = lambda{ post :accept, token: @token }
        it_behaves_like "token_validation"

        it "shows the welcome message" do
          @action.call
          assert_response :ok
          assert_match "Welcome to Deep Space 9", response.body
        end
      end

      # ... rest of the tests
      # ....
      # ...
    end
  end
end

# The verbose example
class SecurityAuditControllerTest < ActionController::TestCase
  test "#show does not load a report if the public token is empty" do
    assert_difference "AccessAttempts.count", +1 do
      get :show, token: "valid_token"
      assert_response :not_found
    end
  end

  test "#show does not load a report if the token is incorrect" do
    assert_difference "AccessAttempts.count", +1 do
      get :show, token: "blah"
      assert_response :not_found
    end
  end

  test "#show does not load a report if the token is old" do
    assert_difference "AccessAttempts.count", +1 do
    Timecop.freeze(Token.expiration_for(@token) + 10.minutes) do
      get :show, token: "valid_token"
      assert_response :not_found
    end
    end
  end

  test "#show loads the status report" do
    assert_difference "AccessAttempts.count", +1 do
      get :show, token: "valid_token"
      assert_response :ok
      assert_match "Status Report", response.body
    end
  end

  test "#accept does not load a guest pass if the public token is empty" do
    assert_difference "AccessAttempts.count", +1 do
      post :accept, token: "valid_token"
      assert_response :not_found
    end
  end

  test "#accept does not load a guest pass if the token is incorrect" do
    assert_difference "AccessAttempts.count", +1 do
      post :accept, token: "blah"
      assert_response :not_found
    end
  end

  test "#accept does not load a guest passif the token is old" do
    assert_difference "AccessAttempts.count", +1 do
    Timecop.freeze(Token.expiration_for(@token) + 10.minutes) do
      post :accept, token: "valid_token"
      assert_response :not_found
    end
    end
  end

  test "#accept gives a guest access" do
    assert_difference "AccessAttempts.count", +1 do
      post :accept, token: "valid_token"
      assert_response :ok
      assert_match "Welcome to Deep Space 9", response.body
    end
  end
end

In the metaprogrammed example, you need to jump between the shared block and the test itself. Then you need to remember how the @action works (and the right syntax for calling a closure).

The deep nesting needed to chain test caveats makes eye tracking a nightmare. And if you change the code in the shared example, you run the risk of breaking other tests that rely on it.

With the verbose example, you can read each test individually. You can throw debugger and print statements wherever you need to. If your application code has changed for one case (but not the other), you can simply change the test for that case.

But there are even more, real-world use cases where a verbose test suite shines!

Beta feature wrappers

When you’re launching a feature into beta, there’s usually wrapper code that will eventually need to get cut: things you’ve stubbed out for the next beta, checks so only specific users can access it, or temporary workaround code.

You can wrap everything into a separate test class, like so:

class BulkInvitationBetaTest < ActiveSupport::TestCase
  test "only allows beta users to bulk-invite people"
  test "redirects to the home page if a non-beta user tries to access it"
  test "does not show the bulk invitiation button on the UI"
end

Then, when the feature is moving out of beta, you can just delete the whole test class!
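To make that concrete, here’s a rough sketch of one of those tests filled in (the sign-in helper, fixture, and routes are all hypothetical):

class BulkInvitationBetaTest < ActionDispatch::IntegrationTest
  test "redirects to the home page if a non-beta user tries to access it" do
    sign_in_as users(:kira) # hypothetical sign-in helper and non-beta fixture

    get bulk_invitations_path # hypothetical route
    assert_redirected_to root_path
  end
end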

Deprecated Code

Likewise, if you know some code is deprecated and will be removed soon, you can move all of its related tests into a separate test class. I prefix the test class with Deprecated, so it’s clear that this code’s days are numbered:

class DeprecatedIEDetectionTest < ActiveSupport::TestCase
  test "detects IE11 and loads the polyfills"
  test "detects IE10 and loads the polyfills"
  test "does not load the polyfills if the browser is not IE-like"
end

Again, once this code is gone, you already know which tests to delete!

Experiments

When experimenting with a new idea or approach, you want to keep your existing tests untouched. They’re good reference points, especially if you’re refactoring existing behavior. Giving the experiment its own test class keeps that reference suite intact, and makes the experiment trivial to delete if it doesn’t pan out.
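A minimal sketch of that setup (the class and test names are hypothetical):

class InvitationTest < ActiveSupport::TestCase
  # ... the existing tests stay untouched as the reference point ...
end

class ExperimentalBulkInvitationTest < ActiveSupport::TestCase
  # The experiment's tests live in their own class, trivial to delete later.
  test "invites every address in an uploaded list" do
    # ... exercise the experimental code path ...
  end
end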

Separate test classes per Feature set

This one is definitely an “it depends” situation. But for particular feature sets, it could be useful to keep their tests in their own test class. You never know if a feature will be cut, or break unexpectedly, and keeping its tests separate can make that transition easier.

Some examples are:

Smart ways to curtail verbosity in your tests

Verbose tests don’t have to mean endless lines of boilerplate code. With a few tricks, you can dramatically improve their readability without losing the performance benefits.

Custom Assertion methods

If you’ve got a block of assertions that you’re copying verbatim across multiple tests, consider making it a custom assert method! This follows the existing conventions of the test framework while also reducing how much you need to scroll:

# Asserts that the schema has exactly one error, on the given field.
def assert_error_on(schema, field:, error:)
  assert_equal 1, schema.errors.size
  assert_equal [error], schema.errors[field]
end

# ...

assert_error_on(schema, field: "name", error: "wrong type")

Modules for unavoidable shared tests

If you truly need to use shared tests, there’s a way to make it happen with reduced complexity: write a module and include it!

You can write a series of tests as a module, which all end up calling the same method (which you’d define in your test class):

module PublicTokenValidationSharedTests
  def test_does_not_load_if_token_is_empty
    assert_difference "AccessAttempts.count", +1 do
      perform_public_token_action(token: "")
      assert_response :not_found
    end
  end

  def test_does_not_load_if_token_is_incorrect
    assert_difference "AccessAttempts.count", +1 do
      perform_public_token_action(token: "blah")
      assert_response :not_found
    end
  end

  def test_does_not_load_if_token_is_expired
    assert_difference "AccessAttempts.count", +1 do
      Timecop.freeze(Token.expiration_for(@token) + 10.minutes) do
        perform_public_token_action(token: @token)
        assert_response :not_found
      end
    end
  end
end

class SecurityAuditPublicReportTest < ActionController::TestCase
  setup do
    @token = "valid_token"
  end
  include PublicTokenValidationSharedTests

  def perform_public_token_action(token:)
    get :show, token: token
  end

  test "#show loads the status report" do
    assert_difference "AccessAttempts.count", +1 do
      get :show, token: @token
      assert_response :ok
      assert_match "Status Report", response.body
    end
  end
end

class SecurityAuditGuestAccessTest < ActionController::TestCase
  setup do
    @token = "valid_token"
  end
  include PublicTokenValidationSharedTests

  def perform_public_token_action(token:)
    post :accept, token: token
  end

  test "#accept gives a guest access" do
    assert_difference "AccessAttempts.count", +1 do
      post :accept, token: @token
      assert_response :ok
      assert_match "Welcome to Deep Space 9", response.body
    end
  end
end

In my opinion, the best version of these tests would be fully duplicated across test classes, but that isn’t always feasible. This approach helps you get as close to that as possible.