Unit testing network interactions

Developer James Coglan explains what lessons the tech team learnt from the problems that came with unit-testing Ruby code.

Sometimes when writing Ruby code, especially when using Rails and other DSL-heavy libraries, you’ll find yourself writing some code and not being sure how to unit-test it. This might be because it hides the objects that collaborate with yours, so you can’t easily mock or stub them, or it creates a lot of singletons, or you’re just not sure which layer you should be mocking or stubbing at. We recently experienced all these problems and found it a useful lesson in how to design tests.

We were just starting out on implementing search functionality for our courses, and we’d decided to go with Elasticsearch, using the elasticsearch-persistence gem to communicate with it. After a little experimentation, we ended up with this code for indexing all our courses by title and searching that index:

require 'elasticsearch/persistence'
require 'course'

class CourseSearch
  include Elasticsearch::Persistence::Repository

  client Elasticsearch::Client.new(url: ENV['ELASTICSEARCH_URL'])

  index :courses
  type  :course
  klass Course

  settings do
    mappings do
      indexes :title, analyzer: :english
    end
  end

  def self.index_courses(courses)
    create_index!(force: true)
    courses.each { |course| save(course) }
  end

  def self.search_courses(text)
    search(query: { match: { title: text } }).results
  end
end

Course is a simple value object that provides a class for Elasticsearch::Persistence to instantiate for each result it returns. We’re storing the title, because that’s what we want to search on, and the slug as well; we want to render these results as links to the relevant course page, and so we need their title and slug to generate those links.

class Course
  def initialize(attributes)
    @attributes = attributes
  end

  def slug
    @attributes['slug']
  end

  def title
    @attributes['title']
  end

  def to_hash
    @attributes
  end
end

This CourseSearch class, although minimal, does work. We can use it to index some courses and search for them using a text query. We express that as a spec:

require 'course_search'

describe CourseSearch do
  before do
    courses = [
      {
        'slug'  => 'dutch',
        'title' => 'Introduction to Dutch',
      },
      {
        'slug'  => 'liveable-cities',
        'title' => 'Water for Liveable and Resilient Cities'
      },
      {
        'slug'  => 'orion',
        'title' => 'In the Night Sky: Orion'
      }
    ]

    CourseSearch.index_courses(courses)
  end

  it 'returns courses matching the search terms' do
    expect(CourseSearch.search_courses('city').map(&:to_hash)).to eq([
      {
        'slug'  => 'liveable-cities',
        'title' => 'Water for Liveable and Resilient Cities'
      }
    ])
  end
end

While this test works, it’s not ideal as a unit test. Since it integrates with Elasticsearch, it requires an Elasticsearch server running, it’s much slower than a unit test would be, and the way it checks the result means it’s mostly testing Elasticsearch rather than our own code.

The big risk with this test is that it omits a lot of important detail. Elasticsearch supports a wide range of options for indexing and searching your documents, and search services in general involve all the complexity of analysing human language. Checking that the query city returns a course with the word Cities in its title does test that word stemming works, but this testing strategy can’t begin to get into all the nuance of search.

And nor should it: all that complexity is Elasticsearch’s job. What we ought to be testing is that our code uses Elasticsearch in the ways we care about: we have made particular choices about how to index and search our content in order to give the best user experience, and we want to enshrine those choices in tests rather than testing how Elasticsearch interprets those choices.

Finding the seams

One of the biggest challenges we faced with this code was deciding what we even want to test. We want a “unit” test, but what does that mean? Where is the boundary between the CourseSearch class and its collaborators?

The CourseSearch class follows the pattern that we see in a lot of Ruby’s frameworks. Most of it is really configuration for some other client: what the name of the index is, how to index each field. This class barely does any work itself; all the real action happens somewhere else. So what should we test?

We could test the interaction between CourseSearch and the Elasticsearch::Persistence::Repository mixin, the boundary between our code and a third-party library. In this scenario, a lot of code cannot be tested; the client, index, type, klass and settings are all at the class level rather than the instance level. These calls happen at “design-time” when the class is loaded, rather than at “run-time” when the class is used. Maybe we can inspect the class after our test has loaded it to examine our config, but we’d just be reproducing the configuration in our tests.

What about the code that does run at runtime: the methods? We could test those using mocking, to check that each method makes the correct calls to the Elasticsearch library.

require 'course_search'
require 'course'

describe CourseSearch do
  describe '.index_courses' do
    before do
      allow(CourseSearch).to receive(:create_index!)
      allow(CourseSearch).to receive(:save)
    end

    it 'creates a new index' do
      expect(CourseSearch).to receive(:create_index!).with(force: true)
      CourseSearch.index_courses([])
    end

    it 'saves each course' do
      course = double(:course)
      expect(CourseSearch).to receive(:save).with(course)
      CourseSearch.index_courses([course])
    end
  end

  describe '.search_courses' do
    let(:response) { double(:response, results: double(:results)) }

    it 'performs a match query on the title' do
      query = { query: { match: { title: 'the query' } } }
      expect(CourseSearch).to receive(:search).with(query).and_return(response)
      CourseSearch.search_courses('the query')
    end

    it 'returns the results' do
      allow(CourseSearch).to receive(:search).and_return(response)
      expect(CourseSearch.search_courses('the query')).to eq(response.results)
    end
  end
end

These tests are reasonably simple, but they don’t say anything interesting; they really just reproduce the implementation. And they don’t actually check very much; yes, they check which kind of query mechanism we’re using, but they don’t check how we’ve indexed the data (the mappings in the config), and indexing and searching need to work hand-in-hand.

These tests also don’t do much to reduce the risk of errors: we’ve only just adopted this library, and don’t have a solid understanding of how it works. When mocking, we should test at boundaries that are well-defined, stable, and that we understand, so that we can say with confidence what the methods we’ve set expectations on will do in production. If we don’t know what they’ll do, then we don’t know that we’re testing the right thing.

The above spec tests the interaction between CourseSearch and its closest collaborator, the Elasticsearch::Persistence::Repository mixin. Maybe there’s a boundary another layer away that we should test at. The only other collaborator visible in this code is the Elasticsearch::Client object. We could write tests that check what commands it receives when our code runs.

But again, there are problems with that approach. First, on a practical level, the Elasticsearch::Client object is another thing created as a singleton at design-time. It’s not something that’s passed into, or created inside, our methods, and that makes it harder to write mock expectations against it. Second, and more importantly, such a test again probably wouldn’t tell us anything useful and would test the wrong thing: it would test how Elasticsearch::Persistence::Repository interacts with Elasticsearch::Client, and those are interactions that happen inside third-party code, not in our own application. We’d just be duplicating the gem’s implementation and checking its internal workings, which might change at any time.

So, if we don’t want to test against our immediate collaborators, and we don’t want to test the internals of those collaborators, and we don’t want an integration test, what’s left? There’s one more boundary we’ve not considered: the calls that our application makes over the wire to the Elasticsearch server. Elasticsearch uses HTTP as its protocol, so we can use a tool like WebMock to check the requests that our app makes.

This has several advantages over our other ideas: it tests a meaningful, stable boundary, since the Elasticsearch protocol is stable and documented, compared to the internals of whatever gems we’re using. And it reduces risk: the thing we’re uncertain about is whether we’re using the Elasticsearch client libraries in the right way to get the results we want, and the things we want to check are how we index and search the data to achieve those results. Observing the requests sent to Elasticsearch will let us check everything our code is really doing.

Recording requests

Having decided on a boundary to test at, we need to find out what should be happening at that boundary. Remember that at this stage we’ve only just adopted this technology and we’re not sure what’s going on internally, so we don’t know what to test for. We want to make sure our app makes the right kinds of requests to Elasticsearch, but we don’t know where in our gems’ code those requests happen, and the CourseSearch class certainly doesn’t interact with anything that directly performs those requests.

We wanted to peek at what was being sent over the wire, and there are various tools for doing this. After trying a few of them out we found tcpflow, which lets you monitor the traffic on any network port. Elasticsearch runs on port 9200, so we can use this command to begin watching it:

sudo tcpflow -c -J -i any port 9200

This prints all incoming and outgoing data on port 9200 to the console, colouring the output blue for data from the client and red for data from the server. With that command running, we can execute a script to create the index and index a course:

courses = [
  {
    'slug'  => 'liveable-cities',
    'title' => 'Water for Liveable and Resilient Cities'
  }
]

CourseSearch.index_courses(courses)

When we run this script, the following is printed by tcpflow:

127.000.000.001.36738-127.000.000.001.09200: DELETE /courses HTTP/1.1
User-Agent: Faraday v0.9.2
Accept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3
Accept: */*
Connection: close
Host: localhost:9200


127.000.000.001.09200-127.000.000.001.36738: HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 21

{"acknowledged":true}
127.000.000.001.36739-127.000.000.001.09200: HEAD /courses HTTP/1.1
User-Agent: Faraday v0.9.2
Accept: */*
Connection: close
Host: localhost:9200


127.000.000.001.09200-127.000.000.001.36739: HTTP/1.1 404 Not Found
Content-Type: text/plain; charset=UTF-8
Content-Length: 0


127.000.000.001.36740-127.000.000.001.09200: PUT /courses HTTP/1.1
User-Agent: Faraday v0.9.2
Accept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3
Accept: */*
Connection: close
Host: localhost:9200
Content-Length: 101
Content-Type: application/x-www-form-urlencoded


127.000.000.001.36740-127.000.000.001.09200:
{"settings":{},"mappings":{"course":{"properties":{"title":{"analyzer":"english","type":"string"}}}}}
127.000.000.001.09200-127.000.000.001.36740: HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 21

{"acknowledged":true}
127.000.000.001.36741-127.000.000.001.09200: POST /courses/course HTTP/1.1
User-Agent: Faraday v0.9.2
Accept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3
Accept: */*
Connection: close
Host: localhost:9200
Content-Length: 76
Content-Type: application/x-www-form-urlencoded


127.000.000.001.36741-127.000.000.001.09200:
{"slug":"liveable-cities","title":"Water for Liveable and Resilient Cities"}
127.000.000.001.09200-127.000.000.001.36741: HTTP/1.1 201 Created
Content-Type: application/json; charset=UTF-8
Content-Length: 142

{"_index":"courses","_type":"course","_id":"AVGGintuILmNrviy3YK9","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}

Here we see a sequence of four requests:

  • DELETE /courses, with a 200 response containing {"acknowledged":true}
  • HEAD /courses, with a 404 response
  • PUT /courses with data {"settings": {}, "mappings": {"course": {"properties": {"title": {"analyzer": "english", "type": "string"}}}}}, with a 200 response containing {"acknowledged":true}
  • POST /courses/course with data {"slug": "liveable-cities", "title": "Water for Liveable and Resilient Cities"}, with a 201 response containing {"_index": "courses", "_type": "course", "_id": "AVGGintuILmNrviy3YK9", "_version": 1, "_shards": {"total": 2, "successful": 1, "failed": 0}, "created": true}

This captures everything we want to assert about our search implementation: the name of the index, the mapping config, the fields we submit for each document. And they’re expressed in terms of the public, documented Elasticsearch API rather than some internal private library functions.

Writing the tests

Let’s begin writing our spec. We can start out by just stubbing out all the requests that index_courses will make, providing just enough information that our code runs without crashing. We don’t want to include extraneous request/response data in the test that doesn’t affect the result, since this only makes it harder to tell which bits of a test are actually important. In this case, all we need to supply are the status codes.

describe CourseSearch do
  let(:elasticsearch_origin) { 'http://localhost:9200' }

  describe '.index_courses' do
    let(:url) { elasticsearch_origin + '/courses' }

    before do
      stub_request(:delete, url).to_return(status: 200)
      stub_request(:head,   url).to_return(status: 404)
      stub_request(:put,    url).to_return(status: 200)

      stub_request(:post, url + '/course').to_return(status: 201)
    end

Then we can write some tests to check each individual call our CourseSearch class makes. First, we know it should delete the existing index, and we can check this by asking WebMock whether DELETE /courses was requested.

    it 'deletes the existing index' do
      CourseSearch.index_courses([])
      expect(WebMock).to have_requested(:delete, url)
    end

That was relatively easy, and we’re already testing something far more detailed than our original integration test did.

Next, we want to check that it creates a new index, configuring the indexed fields correctly.

    it 'creates a new index, analyzing the title as English' do
      CourseSearch.index_courses([])

      expect(WebMock).to have_requested(:put, url).with { |request|
        JSON.parse(request.body).fetch('mappings') == {
          'course' => {
            'properties' => {
              'title' => { 'analyzer' => 'english', 'type' => 'string' }
            }
          }
        }
      }
    end

Rather than checking the body of the request as a string, we’re parsing it into a data structure and comparing against that. This is important for two reasons: it makes sure the body is valid JSON, and it stops the tests being sensitive to serialisation order. For example, {"analyzer": "english", "type": "string"} and {"type": "string", "analyzer": "english"} are equivalent structures in most languages, but they are different strings. Comparing data structures means we test only what we care about – the meaning of the string – and not the exact serialisation details.
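As a quick standalone illustration, using only Ruby’s standard JSON library:

```ruby
require 'json'

a = '{"analyzer": "english", "type": "string"}'
b = '{"type": "string", "analyzer": "english"}'

a == b                          # => false: the strings differ
JSON.parse(a) == JSON.parse(b)  # => true: the parsed hashes are equal
```

Ruby’s Hash#== compares keys and values without regard to insertion order, which is exactly the equality we want for comparing request bodies.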

Also notice that in the above tests, we don’t pass in any courses to CourseSearch.index_courses. That’s because these aspects of the method don’t need any course data, and so leaving such data out makes it clearer that it doesn’t affect the test outcome.

Our final test of this functionality does need course data though: we need to check that Elasticsearch receives one POST /courses/course request for each course we want to index.

    it 'adds each given course to the index' do
      courses = [
        Course.new('slug' => 'dutch',           'title' => 'Introduction to Dutch'),
        Course.new('slug' => 'liveable-cities', 'title' => 'Water for Liveable and Resilient Cities'),
        Course.new('slug' => 'orion',           'title' => 'In the Night Sky: Orion')
      ]
      CourseSearch.index_courses(courses)

      courses.each do |course|
        expect(WebMock).to have_requested(:post, url + '/course').with { |request|
          JSON.parse(request.body) == course.to_hash
        }
      end
    end

Here we index three courses, and check that each of them caused a POST request with their hash contents as the body. We could make it more explicit which fields we’re indexing as well by writing the test as:

        expect(WebMock).to have_requested(:post, url + '/course').with { |request|
          JSON.parse(request.body) == {
            'slug'  => course.slug,
            'title' => course.title
          }
        }

Now that we have some tests that verify how we’re indexing the data, and which fields our search documents have, we can move on to how we send queries. Again, we need to observe what goes over the wire when we send a query. We run the following script with tcpflow running:

CourseSearch.search_courses('city')

And this is what we see:

127.000.000.001.37057-127.000.000.001.09200: GET /courses/course/_search HTTP/1.1
User-Agent: Faraday v0.9.2
Accept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3
Accept: */*
Connection: close
Host: localhost:9200
Content-Length: 36
Content-Type: application/x-www-form-urlencoded


127.000.000.001.37057-127.000.000.001.09200:
{"query":{"match":{"title":"city"}}}
127.000.000.001.09200-127.000.000.001.37057: HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 302

{"took":37,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":0.15342641,"hits":[{"_index":"courses","_type":"course","_id":"AVGGintuILmNrviy3YK9","_score":0.15342641,"_source":{"slug":"liveable-cities","title":"Water for Liveable and Resilient Cities"}}]}}

We send a GET /courses/course/_search request with body {"query":{"match":{"title":"city"}}}, just as it says in our code (the Elasticsearch gem exposes some parts of the wire protocol directly). The response contains this data, reformatted for readability:

{
  "took": 37,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.15342641,
    "hits": [
      {
        "_index": "courses",
        "_type": "course",
        "_id": "AVGGintuILmNrviy3YK9",
        "_score": 0.15342641,
        "_source": {
          "slug": "liveable-cities",
          "title": "Water for Liveable and Resilient Cities"
        }
      }
    ]
  }
}

We want to test two things about our search_courses method: that it makes the right type of query to Elasticsearch, and that it returns the results of the search. Beforehand, we need to stub out this endpoint with enough detail to make the client work. There are a lot of fields in the above response that we can safely omit, so we’ll only include fields that are strictly necessary to avoid crashes, or that contain data we’re checking in our test. In this case, we just need to include the hits.total and hits.hits fields, with the _source of each result, which is used to produce the result objects.

  describe '.search_courses' do
    let(:url) { elasticsearch_origin + '/courses/course/_search' }

    before do
      stub_request(:get, url).to_return(
        status:   200,
        headers:  { 'Content-Type' => 'application/json' },
        body:     JSON.dump(
          'hits' => {
            'total' => 2,
            'hits' => [
              { '_source' => { 'slug' => 'much-ado-about-nothing', 'title' => 'Much Ado About Nothing: in Performance' } },
              { '_source' => { 'slug' => 'shakespeares-hamlet', 'title' => 'Shakespeare\'s Hamlet: Text, Performance and Culture' } }
            ]
          }
        )
      )
    end

Checking that we send the correct query to match on the title field is very similar to checking the index requests as we saw above: we just check that WebMock saw a GET request with the right JSON payload:

    it 'asks Elasticsearch to perform a match query on the title' do
      CourseSearch.search_courses('performance')

      expect(WebMock).to have_requested(:get, url).with { |request|
        JSON.parse(request.body) == { 'query' => { 'match' => { 'title' => 'performance' } } }
      }
    end

To check the results, we don’t so much care about their exact type as that they have the right API; they should respond to slug and title by returning the appropriate value. We can map these method calls over the results to check we get what we expect:

    it 'returns a Course for each search result' do
      courses = CourseSearch.search_courses('performance')

      expect(courses.map { |c| [c.slug, c.title] }).to eq [
        ['much-ado-about-nothing', 'Much Ado About Nothing: in Performance'],
        ['shakespeares-hamlet', 'Shakespeare\'s Hamlet: Text, Performance and Culture']
      ]
    end

Wrapping up

What we’ve seen here is that writing a ‘unit test’ isn’t necessarily about looking at one class with respect to its immediate collaborators. Instead, it can be about one ‘conceptual unit’, which might include third-party library code. That doesn’t make it ‘not a unit test’; it just means we have a fuzzy idea of what a ‘unit’ is.

The important thing is the boundaries between components, not the components themselves. Especially when your code is as simple as this, most of your testing effort will go into checking that the code has the right effects on whichever system it is connected to. It’s important to find a meaningful boundary, one that represents a genuine seam in your architecture where important information passes through, rather than one that happens to exist by accident because of how your internal code and libraries are arranged.

Find out more about how we make FutureLearn.


Comments (1)


  • Arnaud Meuret

    This only covers HTTP. I think the title of this post is misleading. I was hoping for a general approach to mock any protocol not just another Webmock article.