Testing Textual apps: a contributor’s guide
Textual has first-class testing support built around running an app in a test harness and driving it with a Pilot. The core entry point is App.run_test(), which is an async context manager intended for tests; it runs the app headlessly by default and lets you control the app with a Pilot object. Textual’s own testing guide centers on simulating user input with methods like Pilot.press() and Pilot.click(), then asserting on app state. (Textual Documentation)
Recommended stack
For most projects, the sensible default stack is:
- pytest as the test runner
- Textual’s built-in test harness via app.run_test()
- pytest-textual-snapshot for visual regression tests
- pytest’s monkeypatch fixture for targeted patching of environment, globals, and small seams
- unittest.mock when you need richer mocks, call assertions, or spec-constrained doubles (Textual Documentation)
That combination matches Textual’s official guidance: interactive behavior tests with run_test()/Pilot, and snapshot testing with the official pytest-textual-snapshot plugin, which captures SVG screenshots and compares them across runs to catch visual regressions. (Textual Documentation)
What a basic Textual unit test looks like
The basic pattern is:
- Construct the app.
- Enter async with app.run_test() as pilot:
- Simulate keys or clicks.
- Assert on app state, widget state, or rendered behavior. (Textual Documentation)
Example skeleton:
import pytest

from myapp.app import MyApp


@pytest.mark.asyncio
async def test_submit_button_enables_after_typing() -> None:
    app = MyApp()
    async with app.run_test() as pilot:
        await pilot.press("h", "e", "l", "l", "o")
        submit = app.query_one("#submit")
        assert not submit.disabled
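Why async? Because run_test() is an async context manager, so any test that uses it must itself be a coroutine. With pytest, that means installing an async plugin such as pytest-asyncio and either marking each test with @pytest.mark.asyncio, as above, or setting asyncio_mode = auto in the pytest configuration. (Textual Documentation)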
The most useful Textual testing APIs
run_test()
run_test() is the standard harness for testing apps. It runs headless by default, allows a fixed terminal size, and can optionally enable tooltips and notifications during tests. It also accepts a message_hook, which is a callback invoked whenever a message arrives at a message pump in the app. That hook is extremely useful for deep debugging of event/message flow. (Textual Documentation)
Pilot.press()
Use this to simulate key presses. Textual supports passing multiple key values so you can model typing sequences, not just one keystroke at a time. (Textual Documentation)
Pilot.click()
Use this to simulate mouse clicks against a widget selected by CSS selector. One important trap: if another widget is visually on top, the click may land on that topmost widget instead. That is intentional and mirrors real user behavior. (Textual Documentation)
When to write “unit” tests vs snapshot tests
Use ordinary run_test()-style tests when you care about:
- keybindings
- button behavior
- widget state
- actions and commands
- validation
- messages/events
- focus changes
- app logic (Textual Documentation)
Use snapshot tests when you care about:
- layout regressions
- styling regressions
- visual state changes
- subtle rendering changes that are hard to express as assertions (Textual Documentation)
Snapshot testing in Textual is based on SVG screenshots. The official plugin is pytest-textual-snapshot, and Textual notes that it uses this approach internally for builtin widgets as part of release validation. (Textual Documentation)
Suggested testing strategy for contributors
A practical pyramid for Textual projects:
1. Fast interaction tests
These should be the bulk of the suite. Drive the app with Pilot, assert on widget state, message effects, and app properties. They are usually less brittle than visual snapshots. This is the core testing style shown in Textual’s guide. (Textual Documentation)
2. Snapshot tests for important screens
Add snapshots for the main screens, custom widgets, and historically fragile layouts. Since the plugin stores screenshots and compares them later, it is good at catching visual drift. (Textual Documentation)
3. A few concurrency/worker tests
If the app uses background work, test the user-visible effect of worker completion, cancellation, and error states. Textual’s worker guide exists because concurrent work is a common and tricky part of real apps. (Textual Documentation)
Is mocking a good idea?
Usually: some mocking is good; lots of mocking is bad.
Textual apps are UI-driven and event-driven. If you mock too much of Textual itself, you can easily end up testing your own expectations rather than the real app behavior. The official Textual approach is already a kind of high-fidelity harness: run the real app, send real inputs, assert on real state. (Textual Documentation)
Good uses of mocking / patching
Use mocking or patching for things outside the UI framework:
- HTTP calls
- filesystem access that you do not want to hit for real
- environment variables
- clocks / timestamps
- subprocesses
- expensive backends or services
- feature flags and configuration seams (pytest)
Bad uses of mocking
Be cautious about mocking:
- Textual internals
- widget methods just to “prove” they were called
- event/message plumbing that can be observed through state
- rendering behavior that should really be covered by snapshots or real app assertions
Preferred patching tools
For simple cases, pytest’s monkeypatch is excellent because it safely restores changes after the test and directly supports attrs, dicts, env vars, sys.path, and cwd changes. (pytest)
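As a small illustration of monkeypatch at a configuration seam, the sketch below assumes a hypothetical api_url() helper and a made-up MYAPP_API_URL variable; neither comes from Textual or from any real project.

```python
import os

# Hypothetical default; ".invalid" is a reserved, non-resolving TLD.
DEFAULT_API_URL = "https://api.example.invalid"


def api_url() -> str:
    """Hypothetical config seam: read the backend URL from the environment."""
    return os.environ.get("MYAPP_API_URL", DEFAULT_API_URL)


def test_api_url_uses_env_override(monkeypatch) -> None:
    monkeypatch.setenv("MYAPP_API_URL", "http://test")
    assert api_url() == "http://test"


def test_api_url_falls_back_to_default(monkeypatch) -> None:
    # raising=False: do not fail if the variable was never set.
    monkeypatch.delenv("MYAPP_API_URL", raising=False)
    assert api_url() == DEFAULT_API_URL
```

Because monkeypatch undoes both changes when each test ends, the two tests cannot leak environment state into each other.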
For richer mocks, unittest.mock is still useful. If you use it, prefer spec or spec_set so your mocks fail when your assumptions drift from the real object API. (Python documentation)
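To make the spec_set point concrete, here is a short sketch using only the standard library. ItemClient is a hypothetical service boundary invented for this example.

```python
from unittest.mock import Mock


class ItemClient:
    """Hypothetical boundary the app would call to load data."""

    def fetch_items(self) -> list[str]:
        raise NotImplementedError("talks to a real backend")


# spec_set limits the mock to attributes that exist on ItemClient.
client = Mock(spec_set=ItemClient)
client.fetch_items.return_value = ["a", "b"]
items = client.fetch_items()

try:
    client.fetch_item()  # misspelled on purpose: not part of ItemClient
    typo_detected = False
except AttributeError:
    typo_detected = True
```

A bare Mock() would have accepted the misspelled call silently, which is exactly the kind of drift spec and spec_set are meant to catch.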
Strong recommendation: test user-visible outcomes
For a Textual contributor unfamiliar with the framework, the safest mental model is:
Prefer asserting on state and behavior the user would care about, not on implementation trivia.
Examples:
Good:
assert app.query_one("#save").disabled is False
assert app.query_one("#status").renderable.plain == "Saved"
Weaker:
mock_save.assert_called_once()
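The second can still be appropriate, but only when the outward result is hard to observe directly. Note that the attribute used to read a widget’s text (renderable in the example above) varies by widget type and Textual version, so check it against the version you target.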
Common traps
1. Forgetting tests must be async
If you use run_test(), your test must run as a coroutine. This is one of the easiest mistakes for new contributors. (Textual Documentation)
2. Clicking hidden or covered widgets
Pilot.click() follows visible screen behavior. If something overlays the target, your click may hit the overlay instead of the intended widget. That is realistic, but it surprises people. (Textual Documentation)
3. Not controlling terminal size
run_test() accepts a size argument. Layout-sensitive tests can become flaky if you implicitly depend on terminal geometry and do not fix it in the harness. (Textual Documentation)
Example:
async with app.run_test(size=(100, 30)) as pilot:
    ...
4. Treating snapshot tests like ordinary assertions
Snapshot tests are powerful, but they are not always the best first tool. They can fail on legitimate UI changes and require review. Use them where visuals matter, not for every single behavior. The plugin is specifically about SVG screenshot comparison, so it is best for visual regressions. (Textual Documentation)
5. Ignoring concurrency
Textual widgets run in an async environment, and workers exist because apps often need background tasks such as network or subprocess work. Tests that touch those areas need to be written around eventual UI outcomes, not purely synchronous assumptions. (Textual Documentation)
6. Over-mocking async/concurrent code
If background work is central to the feature, replacing too much with mocks can hide timing, sequencing, or message-flow bugs. Patch the external dependency, but let the Textual app and worker machinery run for real where practical. That advice follows from Textual’s own emphasis on real async UI behavior and worker-driven concurrency. (Textual Documentation)
Useful patterns
Pattern: test through selectors
Use stable widget IDs or classes and query them directly from the app:
name_input = app.query_one("#name")
save_button = app.query_one("#save")
This keeps tests readable and avoids depending on fragile tree positions.
Pattern: fix the screen size
For layouts or anything responsive:
async with app.run_test(size=(120, 40)) as pilot:
    ...
That uses a documented feature of run_test() and removes a whole class of flaky failures. (Textual Documentation)
Pattern: use message_hook when debugging hard failures
run_test() has a message_hook callback that receives every message. That can help contributors understand why a handler did not fire or why a state change never happened. (Textual Documentation)
Example:
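messages: list[str] = []

def hook(message) -> None:
    # __qualname__ keeps the owning widget: "Button.Pressed", not just "Pressed".
    messages.append(type(message).__qualname__)

async with app.run_test(message_hook=hook) as pilot:
    # Assumes a button has focus, so that Enter presses it.
    await pilot.press("enter")
    await pilot.pause()
    assert "Button.Pressed" in messages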
Pattern: patch boundaries, not core UI
Good:
def test_loads_data(monkeypatch):
    monkeypatch.setenv("API_URL", "http://test")
Also good:
from unittest.mock import Mock
client = Mock(spec_set=["fetch_items"])
client.fetch_items.return_value = ["a", "b"]
That uses patching where pytest and Python docs say it shines: globals, env, and dependency seams. (pytest)
Libraries to use
Definitely use
- pytest
- textual
- pytest-textual-snapshot for visual regression tests
- unittest.mock from the stdlib
- pytest’s built-in monkeypatch fixture (Textual Documentation)
Often useful
- pytest-xdist if you want to parallelize pytest runs; the snapshot plugin’s README notes it can be used in parallel test runs. (GitHub)
A practical house style for Textual tests
For a contributor guide, I would recommend these rules:
- Prefer pytest over unittest style.
- Use run_test() for nearly all interaction tests.
- Assert on widget/app state after real inputs.
- Use snapshot tests for visuals, not everything.
- Patch only external seams.
- Prefer monkeypatch for simple test-local patching.
- Use Mock(spec=...) or spec_set=... when you do mock.
- Fix terminal size in layout-sensitive tests.
- Give important widgets stable IDs for querying in tests.
- Use message_hook when event/message flow gets confusing. (Textual Documentation)
Example contributor boilerplate
import pytest

from myapp.app import MyApp


@pytest.mark.asyncio
async def test_typing_enables_submit() -> None:
    app = MyApp()
    async with app.run_test(size=(100, 30)) as pilot:
        await pilot.press("h", "e", "l", "l", "o")
        assert app.query_one("#submit").disabled is False


@pytest.mark.asyncio
async def test_clicking_save_updates_status(monkeypatch) -> None:
    # Set the environment before constructing the app, in case it reads config at startup.
    monkeypatch.setenv("MYAPP_OFFLINE", "1")
    app = MyApp()
    async with app.run_test(size=(100, 30)) as pilot:
        await pilot.click("#save")
        assert app.query_one("#status").renderable.plain == "Saved"
This style stays close to Textual’s official testing model and keeps the tests understandable for contributors who are new to the framework. (Textual Documentation)
Bottom line
For Textual, the best default is real app + real simulated input + ordinary assertions, with snapshot tests for visuals and light mocking only at boundaries like HTTP, subprocesses, filesystem, config, and environment. That gets you close to how the app actually behaves while still keeping tests fast and maintainable. (Textual Documentation)