We first introduced UI tests last January. At the time we had developed a suite of UI tests that could be run manually in a development environment. However, we faced a number of issues running these tests consistently in our automated pipelines.
One of the main threads of work that needs to be completed early in the Cwtch Stable roadmap is integrating UI tests into our CI pipelines, in addition to expanding their scope. Now that Flutter 3 has stabilized desktop support, and we have invested effort in improving Cwtch performance, it is time to ensure these tests are running on every build.
Current Limitations of Flutter Gherkin
The original flutter_gherkin is under semi-active development; however, the latest published versions don't support using it with flutter test.
- Flutter Test was originally intended to run single widget/unit tests for a Flutter project.
- Flutter Drive was originally intended to run integration tests on a device or an emulator.
However, in recent releases these lines have become blurred. The new integration_test package that comes built into newer Flutter releases has support for both flutter drive and flutter test. This was a great change because it decreases the overhead required to run larger integration tests: flutter drive sets up a host-controller model that requires a dedicated control channel to be set up, whereas flutter test can take advantage of the knowledge that it is being run in the same process, and is noticeably faster. That matters a great deal when the goal is to run tests as often as possible.
Thankfully, there is code in the flutter_gherkin repository that supports running tests with flutter test; however, this code currently has a few issues:
- The test code generation produces code that doesn't compile without minor changes.
- Certain functionality like "take a screenshot" does not work on desktop.
Additionally, there are a few limitations in built-in flutter_gherkin steps that we noticed our tests running into:
- Certain tests that fail with async timeouts will cause Flutter exceptions instead of a failed test.
- Certain Flutter widgets like DropdownButton are not compatible with built-in steps like tap because they internally contain multiple copies of the same widget.
Because of the above, we have chosen to fork flutter_gherkin to fix some of these issues, with the intent of contributing significant fixes upstream, while allowing us to iterate faster on Flutter UI testing.
Integrating Tests into the Pipeline
One of the major limitations of flutter test is the lack of a headless mode. In order to successfully run tests in our pipeline we need a headless mode, as most of the containers we use do not have any kind of active display.
Thankfully it is possible to use Xvfb to create a virtual framebuffer, and set DISPLAY to render to that buffer:
export DISPLAY=:99
Xvfb -ac :99 -screen 0 1280x1024x24 > /dev/null 2>&1 &
This allows us to neutralize our main issue with flutter test, and efficiently run tests in our pipeline.
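The setup above can be wrapped in a small helper so that the virtual display is started and stopped around a single command. This is a minimal sketch and not necessarily how the pipeline script is written: the function name run_headless, the XVFB_DISPLAY variable, and the fixed one-second startup wait are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: run a command against a temporary virtual display.
# Hypothetical helper; the name, the XVFB_DISPLAY override and the
# one-second wait are illustrative assumptions, not part of Cwtch's tooling.

run_headless() {
    display="${XVFB_DISPLAY:-:99}"

    # Start the virtual framebuffer in the background, as in the command above.
    Xvfb -ac "$display" -screen 0 1280x1024x24 > /dev/null 2>&1 &
    xvfb_pid=$!

    # Crude startup wait; a real pipeline might poll for readiness instead.
    sleep 1

    # Run the given command with DISPLAY pointing at the virtual display,
    # remembering its exit status so it can be reported after cleanup.
    status=0
    DISPLAY="$display" "$@" || status=$?

    # Stop the framebuffer; ignore errors if it has already exited.
    kill "$xvfb_pid" 2> /dev/null || true
    wait "$xvfb_pid" 2> /dev/null || true

    return "$status"
}
```

A call such as run_headless flutter test integration_test would then run the suite against the virtual display, assuming the tests live in a directory with that name. Returning the wrapped command's exit status keeps a failing test run visible to the pipeline.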
Catching Bugs!
This small amount of integration work has already caught its first bug.
Once we had fixed most of the issues outlined above, we were still seeing failures on what should have been a very basic scenario. 02_save_load.feature simply turns a set of experiments on and checks that the state is saved. This test runs perfectly fine on development environments, but when uploaded to our build pipeline it always failed in the same place - turning on the file sharing experiment.
The cause of this was an actual bug in Cwtch UI. The file sharing experiment failed to turn on if the directory $USER_HOME/Downloads didn't exist. That directory is rarely missing on real-world systems, but it is missing in our build pipelines. We have since fixed this behaviour to allow file sharing to be turned on even if the usual Download directories are not available.
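One way to reproduce this kind of pipeline-only condition on a development machine is to run a command with a home directory that contains nothing at all. The sketch below is illustrative only: the helper name with_empty_home is hypothetical, and it assumes the program under test locates the Downloads directory through the HOME environment variable, which may not hold for every platform or lookup method.

```shell
#!/bin/sh
# Sketch: run a command with HOME set to an empty temporary directory,
# so that $HOME/Downloads does not exist (as in a bare build container).
# Hypothetical helper, not part of Cwtch's tooling.

with_empty_home() {
    # Create a fresh, empty directory to stand in for the home directory.
    tmp_home="$(mktemp -d)" || return 1

    # Run the given command with HOME overridden for that command only,
    # remembering its exit status so it can be reported after cleanup.
    status=0
    HOME="$tmp_home" "$@" || status=$?

    # Remove the temporary home directory and anything written into it.
    rm -rf "$tmp_home"

    return "$status"
}
```

Tools that keep caches or configuration under the home directory may behave differently when run this way, so a helper like this is better suited to launching the application under test than to wrapping the whole build.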
As we enable more of our UI tests in our pipeline, and across more platforms, we expect to catch more subtle issues like the above - a big win for people who use Cwtch!
Next Steps
- More automated tests: We have a nice collection of pre-written tests that we can begin to automatically run within pipelines. We have already begun this work, and anticipate finishing it before Cwtch 1.11.
- More platforms: Right now UI tests only run on Linux. In order to fully take advantage of these tests we need to be able to run them across our target platforms. We expect to start this work soon; expect more news in a future Cwtch Testing update!
- More steps: One of our longer-term goals with UI testing was to produce a language around Cwtch testing that went beyond widgets. We had begun to explore this last year with the "expect to see the message" step. As we grow our test library we will be looking for opportunities to build out additional higher-level and Cwtch-specific constructs, e.g. "send a file" or "set profile picture".
Help us go further!
We couldn't do what we do without all the wonderful community support we get, from one-off donations to recurring support via Patreon.
If you want to see us move faster on some of these goals and are in a position to, please donate. If you happen to be at a company that wants to do more for the community and this aligns, please consider donating or sponsoring a developer.
Donors who give $5 or more can opt to receive stickers as a thank-you gift!
For more information about donating to Open Privacy and claiming a thank you gift please visit the Open Privacy Donate page.