DaCHS is Bustered

DaCHS is developed on Debian, and Debian is the recommended deployment platform. Hence, a new major release of Debian (where major means for them: We may break stuff) is always a big thing for me. And so it was with the release that came in July, codenamed “buster”. Both on the “big thing” and on the “break” counts. This posting gives DaCHS deployers some background for their buster upgrades. Astronomers not running Debian themselves won't risk missing anything if they skip this post.

So, after I upgraded, the first thing I noticed was that DaCHS would no longer even start because astropy (which it needs, in particular, because that's where pyfits sits these days) was gone. Simple explanation: Upstream astropy doesn't support python2 any more, and so Debian buster only has python3-astropy.

Moving DaCHS to python3, unfortunately, isn't that easy; a major dependency, nevow (essentially, a web framework), isn't ported yet, and porting it is a major thing. Believe me, I've tried. The nasty thing, in particular, is that twisted, which still sits below nevow, hands up lots of byte strings. And in python3, b"a"!="a". You wouldn't believe how many interesting bugs that simple truth introduces when you've got a library that handed out “just strings” in python2 and now hands out byte strings in python3. Yikes.
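To make the b"a"!="a" point a bit more concrete, here is a minimal python3 sketch. The header table and the key are made up for illustration; they are not taken from DaCHS, nevow, or twisted.

```python
# Hypothetical header table keyed by text strings, the way python2-era
# code would typically have written it.
headers = {"content-type": "text/xml"}

# What a bytes-based layer might hand up under python3.
key_from_network = b"content-type"

# Bytes and text never compare equal in python3, even for pure ASCII.
same = (b"a" == "a")

# The lookup therefore misses silently instead of raising an error.
found_raw = headers.get(key_from_network)

# Decoding at the boundary makes the lookup work again.
found_decoded = headers.get(key_from_network.decode("ascii"))

print(same, found_raw, found_decoded)  # False None text/xml
```

The silent miss is what makes these bugs interesting: nothing crashes at the point where the types diverge.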

Update (2019-08-28): After quite a bit of experimentation, I finally gave up on providing a python2 version of astropy through release, because for a complicated set of reasons (including numpy declaring a conflict with existing astropys in buster) it is impossible to provide a package that works in buster and doesn't break stretch. So, for buster only you'll have to have a second (or, if running beta, third) gavo line in your sources.list (or equivalent):

deb http://vo.ari.uni-heidelberg.de/debian buster-foreports main

The instructions at our APT repository have been updated, so you won't have to bookmark this particular page.

But that wasn't the end of it. Buster comes with Postgres 11, which I look forward to in particular because it supports parallel query execution. That could help us quite a bit, given our large catalogs, on which we quite often want to run sequential scans. But of course this means upgrading postgres. And attempting to do that on my development machine immediately hit a wall. What's nice is that the q3c and pgsphere extensions that we've had to push out ourselves so far are now part of Debian main. What's rather fatal is that our pgsphere extensions dealing with HEALPixes and MOCs aren't part of the buster pgsphere package (the reasons for that are tedious and arcane and have to do with OpenSSL and the GPL).

Also, the pgsphere package coming with buster is called postgres-pgsphere, which is rather unfortunate as it's missing the version indication. So: If you find it on your system, remove it right away. It will conflict with the one true pgsphere package (postgresql-11-pgsphere). That one you'll get from us, and it has the HEALPix stuff built in. TL;DR: run apt install postgresql-q3c postgresql-11-pgsphere before following the postgres update recipe linked above.

There's a bit more to upgrading the database this time. Because of fairly low-level cleanup in Postgres itself, you're risking index corruption on string indices. Realistically, for almost anything you'll have, it's unlikely that you're affected (it's essentially about non-ASCII in strings), but then it's better to be safe than sorry, and hence you should say:

reindex database gavo

first thing after you've upgraded to Postgres 11 (which you should really do once the box is on buster). Only if you have very large tables might it be worth restricting the index regeneration to indices that could actually need it; see the postgres link above for how to do that.
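As a toy illustration of why rebuilding matters, the following python sketch searches a sorted list that was built under one string ordering using a different one. This is an assumption-laden model, not how Postgres implements its indices: the two sort keys are stand-ins chosen for the example, and it simply assumes that the ordering of some non-ASCII strings changes across the upgrade.

```python
import bisect
import unicodedata

def old_key(s):
    # Stand-in for the ordering the index was built with:
    # accented letters sort next to their base letter.
    return unicodedata.normalize("NFD", s)

def new_key(s):
    # Stand-in for the ordering after the upgrade: plain code points.
    return s

def lookup(index, value, key):
    """Binary search that trusts index to be sorted according to key."""
    keys = [key(item) for item in index]
    pos = bisect.bisect_left(keys, key(value))
    return pos < len(index) and index[pos] == value

words = ["f", "\u00e9", "e"]                 # "\u00e9" is e with acute accent
index = sorted(words, key=old_key)           # built before the upgrade
stale_hit = lookup(index, "f", new_key)      # searched after the upgrade
rebuilt = sorted(words, key=new_key)         # the "reindex" step
fresh_hit = lookup(rebuilt, "f", new_key)

print(stale_hit, fresh_hit)  # False True
```

Note that the entry that goes missing is plain ASCII; one non-ASCII neighbour in the wrong place is enough to mislead the search.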

One last thing on Postgres upgrades: I've not quite tried to work out why, but, probably depending on your /etc/hosts, DaCHS on buster is much more likely to connect to your database using IPv6 than it was before. Many older Postgres configurations won't let you in then. If that happens to you, just edit /etc/postgresql/11/main/pg_hba.conf and add a line:

host    all         all         ::1/128         md5

(or something less permissive if you prefer).
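For those wondering why an existing IPv4 rule doesn't cover the new situation, here is a small sketch using python's ipaddress module. It only models plain CIDR matching; it is not Postgres's own rule parser, and rule_matches is a made-up helper name.

```python
import ipaddress

def rule_matches(cidr, client):
    """Return True if the client address lies inside the CIDR range."""
    network = ipaddress.ip_network(cidr, strict=False)
    return ipaddress.ip_address(client) in network

# An IPv4 loopback rule never matches a client coming in over IPv6.
v4_rule_sees_v6 = rule_matches("127.0.0.1/32", "::1")

# A /128 prefix covers exactly one IPv6 address, the loopback here.
v6_rule_sees_loopback = rule_matches("::1/128", "::1")
v6_rule_sees_other = rule_matches("::1/128", "::2")

print(v4_rule_sees_v6, v6_rule_sees_loopback, v6_rule_sees_other)
# False True False
```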

The next buster-related shock was when TOPCAT's TAP uploads stopped working while my regression tests didn't find anything wrong. After a bit of cursing I eventually figured out that that's not actually buster's fault but twisted's, which in a commit from May 2018 broke chunked uploads (essentially, that's when you're not saying up front how large your upload will be). I've filed a bug report on twisted, but we can't really wait until any sort of fix will be ready and have a broken TOPCAT-DaCHS relationship until then, so for now we're also shipping a fixed twisted package. If you're running DaCHS without our repository enabled, you will have to patch the twisted code itself. The bug report tells what to do (no warranties, though, because I'm not entirely sure why they changed it in the first place; it's a very small change, though).

[Update (2019-08-14): Scratch the part with the fixed twisted packages. They're too much trouble on stretch systems. You can keep using them on buster boxes if you want, though. The most recent stable release monkeypatches the problem out of presumably broken twisteds, and so will the next beta.]
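In case “chunked upload” doesn't ring a bell: with HTTP/1.1 chunked transfer coding, the client sends no Content-Length; instead, each piece of the body is prefixed with its own length in hex, and a zero-length chunk ends the body. A minimal sketch of that framing follows; encode_chunked is a name invented here, and this is neither twisted's nor TOPCAT's code.

```python
def encode_chunked(pieces):
    """Frame an iterable of byte strings as an HTTP/1.1 chunked body."""
    for piece in pieces:
        if not piece:
            # An empty chunk would be read as the end-of-body marker.
            continue
        # Chunk header is the payload length in hex, then CRLF.
        yield b"%x\r\n" % len(piece) + piece + b"\r\n"
    # A zero-length chunk terminates the body.
    yield b"0\r\n\r\n"

body = b"".join(encode_chunked([b"hello", b" world"]))
print(body)  # b'5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n'
```

A server that only looks at Content-Length to decide whether there is a request body will see nothing here, which is why this kind of upload can break while fixed-length test uploads keep working.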

I hope you're not totally discouraged now, because upgrade you should (though perhaps not right before going on vacation) – distribution upgrades are unavoidable if you want to run services for decades, and that's definitely a goal within the VO. See the Debian release notes for Debian's take on dist upgrades, which arguably is a bit more alarmist than it would need to be; a lean, server-only system typically is really simple to upgrade.

Given the relatively large number of Debian packages we override in buster, I'll be particularly grateful if you complain early about breakage you observe (ideally use the dachs-support mailing list, but see Support for alternatives), and as usual you are encouraged to try the upgrade first on a development system if you have one. Which you should.