pytest-it: a pytest plugin for BDD test specs

I recently published pytest-it. It's a pytest plugin for decorating tests with markers inspired by RSpec (describe, context, it). These markers are used in pytest's test report to display a plain-text spec of the features under test:

from pytest import mark as m

@m.describe("The test function report format")
class TestPytestItExample(object):

    @m.context("When @pytest.mark.it is used")
    @m.it("Displays an '- It: ' block matching the decorator")
    def test_it_decorator(self):
        pass  # placeholder body

pytest --it

* tests/test_pytest_it.py...

- Describe: The test function report format...
  - ✓ It: Displays a test pass using '- ✓ '
  - ✓ It: Displays a test fail using '- F '
  - ✓ It: Displays a test skip using '- s '
  - ✓ It: Displays the pytest ID for parameterised tests
  - ✓ It: Does not use the docstring in the test name

  - Context: When @pytest.mark.it is used...
    - ✓ It: Displays an '- It: ' block matching the decorator

    - ...when -v is higher than 0...
      - ✓ It: Displays the full module::class::function prefix to the test

  - Context: When @pytest.mark.it is not used...
    - ✓ It: Displays the test function name

    - ...but the test name starts with 'test_it_'...
      - ✓ It: Prettifies the test name into the 'It: ' value

  - Context: When multiple @pytest.mark.it markers are used...
    - ✓ It: Uses the lowest decorator for the 'It : ' value

We use pytest extensively at Ometria. It provides a lot of features (e.g. composable fixtures, parameterised tests, assert rewrites) that make it easy and pleasant to write test code in Python. So why is pytest-it useful?

1. The problem

Software gets complex, and clear communication is crucial to mitigate this. Tests are a critical tool for reducing unwanted changes in software behaviour, and in some cases they will be the only spec you have for legacy software.

Conventional Python test code (including pytest) has up to three levels of hierarchy:

# Level 1 - module

# Level 2 - class
class TestMyClass(object):

    # Level 3 - method
    def test_my_method(self):
        """This is a test."""
        assert 2 + 2 == 4

Under these constraints, it requires a lot of discipline to communicate clearly to the reader, especially as software evolves and requirements change over time:

  • Is there clear separation between setup, test and teardown code?
  • What input situations are tested?
  • What is the output behaviour under test? Is it logic that should only be changed in coordination with the business, or is it a side-effect?
  • Are multiple business rules tested by the same test function?
  • What business logic is not covered by tests?

Despite their importance, tests tend (in my experience) to receive less attention than the main program code: copy/paste approaches are more common, and less care is given to structure, naming and documentation.

These are problems that BDD and other testing tools (including pytest-it) try to mitigate.

2. Trivial to adopt

There are various methods to help improve test clarity and bring it closer to business logic. For example:

  1. Rewrite the test code using a BDD framework that provides tools for this purpose (e.g. Behave, Mamba).
  2. Rewrite the program code to facilitate better test structure.
  3. Write a spec of component behaviours that exists separately from the system and test code.

Decorating an existing pytest codebase with pytest-it is an incremental step. It's cheap to adopt compared to the above solutions, as it can be retroactively applied to a test suite without requiring any other changes: the decorators only provide the spec and don't alter the behaviour of the test.

This is also a good way to introduce BDD thinking to a Python codebase (or team) without requiring a major change in tooling or workflow.
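
As a rough sketch of that incremental step, here is the same test before and after decoration. The total_price function and both test classes are hypothetical, made up for this example; only the describe, context and it markers come from pytest-it.

```python
import pytest


def total_price(quantity, unit_price):
    # Hypothetical function under test, defined here so the example is self-contained.
    return quantity * unit_price


# Before: a plain pytest test class.
class TestTotalPricePlain(object):

    def test_two_items(self):
        assert total_price(2, 5) == 10


# After: the same test body, with only the markers added on top.
@pytest.mark.describe("total_price")
class TestTotalPriceSpec(object):

    @pytest.mark.context("When two items are ordered")
    @pytest.mark.it("Charges twice the unit price")
    def test_two_items(self):
        assert total_price(2, 5) == 10
```

The test body is identical in both versions. The markers are just metadata attached to the class and method, so the decorated test should pass or fail exactly as the plain one does.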

3. More expressive

I often see test names or docstrings that describe the context of the test input (e.g. test_new_customer_with_one_order), but not the output. I can see that the test expects the output of a function to be 3, but what does that represent?

The RSpec semantics provide a framework for communicating these concepts explicitly, and by using decorators to attach them to a test, we can display them clearly in the test report. This is significantly more expressive than trying to convey that information in a function name, and decorators are easier to parse than a docstring.
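
To make the difference concrete, here is a minimal sketch. The orders_until_discount function and its business rule are invented for illustration; they are not from any real codebase.

```python
import pytest


def orders_until_discount(order_count):
    # Hypothetical business rule, assumed for this example:
    # a customer earns a discount on their fourth order.
    return max(4 - order_count, 0)


# The name describes the input situation, but not what the 3 means.
def test_new_customer_with_one_order():
    assert orders_until_discount(1) == 3


# The markers state the subject, the input context and the expected behaviour.
@pytest.mark.describe("orders_until_discount")
class TestOrdersUntilDiscount(object):

    @pytest.mark.context("When a new customer has placed one order")
    @pytest.mark.it("Returns the three further orders needed to earn a discount")
    def test_new_customer_with_one_order(self):
        assert orders_until_discount(1) == 3
```

Both tests make the same assertion. In the second, the context marker carries the input situation and the it marker carries the meaning of the expected value, so neither has to be squeezed into the function name.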

4. Behaviour at a glance

Even in well-structured test suites, it can be difficult to scan the behaviour under test, because docstrings are scattered throughout the file(s). Pulling this information into a plain-text spec makes it easier to read, think about and modify.

5. Org-mode

For those who use org-mode, the spec output can be copied straight into an org buffer to work with. This is the killer feature for me, as it means I can focus on the behaviour of the program: where tests need to be added, deleted or moved, and where program behaviour needs to be clarified with the business.

6. Habit forming

Introducing a framework to structure tests can encourage care around test code and communication. The semantics provide a sensible default that can be implemented without requiring a lot of extra tooling or thought. Integrating with pytest also provides an easy way to evaluate the tradeoffs of BDD approaches compared to the classic Python unittest structure.

7. Further information

You can try pytest-it by running:

pip install pytest-it
pytest --it

See GitHub for more information.