Watch Sphinx Doctests

No astronomy at all here; please move on if tooling for improving tooling bores you.

While giving a lecture on pyVO, I am churning out quite a few pull requests against pyVO at the moment. I am also normally also fairly religious about running unit tests before doing a commit. But then PyVO unit tests became really, really slow a while ago when pytesting of the examples in the documentation was turned on, and so I started relying on the github continuous integration, which feels fairly wasteful – and also makes all kinds of minor idiocies public that I would have caught locally with a test suite that finishes within a minute or so.

Regrettably, tooling for inspecting how doctests with sphinx and pytest run is not really great: All the code from one documentation file translates into a single test, and when that runs for five minutes, it's anyone's guess where the time is spent. After a bit of poking and asking around, it seemed to me that there indeed is no “doctest profiler” (if you will), at least not for pytest-executable doctests embedded in sphinx-processable ReStructuredText.

Well, I thought, let's write a quick one. Originally, I had wanted to use the docutils parser for robustness, but once I tried to pull in the sphinx extensions and got lost in their modules I decided a simple, RE-based parser has to be enough.

And here it is, my my quick-and-dirty doctest profiler: watch-doctests.py. Just put it into your path, make it executable, and you can do something like this:

pyvo/docs/dal > watch-doctests.py index.rst | head -30
---0.00---------------

import pyvo as vo
---0.94---------------

service = vo.dal.SIAService("http://dc.zah.uni-heidelberg.de/lswscans/res/positions/siap/siap.xml")
---0.94---------------

print(service.description)
Scans of plates kept at Landessternwarte Heidelberg-Königstuhl. They
were obtained at location, at the German-Spanish Astronomical Center
(Calar Alto Observatory), Spain, and at La Silla, Chile. The plates
cover a time span between 1880 and 1999.

Specifically, HDAP is essentially complete for the plates taken with
the Bruce telescope, the Walz reflector, and Wolf's Doppelastrograph
at both the original location in Heidelberg and its later home on
Königstuhl.
---1.02---------------

import pyvo as vo
---1.02---------------

from astropy.coordinates import SkyCoord
---1.02---------------

from astropy.units import Quantity

– so, you pass in the ReStructuredText with the embedded sphinx/pytest doctests, and then the thing extracts every line to be executed in the doctests (it ignores the outputs, so it will not actually check any assertions), prints the runtime so far in a separator and then runs the code through Python as usual: note that no automatic repr() of any non-None results – that the REPL does – happens. This is for profiling, not for test development.

The quick hack helped me speed up the dal and registry doctests by sizeable factors, for instance because I am now avoiding downloads of large datasets, and I am using faster queries where I can.

So, that's nice. But unless someone asks, I will distribute the code here only and in this ad-hoc fashion (probably with a link in the pyVO hackers' docs). I still believe there must be something a lot less hacky that does about the same thing somewhere out there…