DaCHS 1.3 is out

Almost a year has passed since release 1.2 of DaCHS – I’ve let the normal autumn release slip last year because there weren’t so many release-worthy new features in DaCHS at the traditional release time (i.e., after the College Park interop), and also because running betas when you do need a new feature is a fairly stable thing by now.

But here it finally is: Release 1.3 (tarball for the die-hard self-builders; everyone else just switches back to the release branch as necessary and then runs an update/upgrade cycle).

Here’s the commented changelog:

  • New //ssap#view mixin that should be used for future SSAP services, and that existing SSAP services should migrate to at some point. See A new view on SSAP in DaCHS on this blog for details.
  • Columns can now be hidden from TAP/ADQL (and other interfaces) by setting hidden="True".
  • There is now a setting [web]maxSyncUploadSize=500000 (meaning: about 500 kByte) as the default upload limit on sync queries. In compensation, clients uploading too much now receive a more useful error message (except that it doesn’t reach TOPCAT users most of the time because TOPCAT does chunked uploads). To get back the behaviour of 1.2 (which is probably ok if you can live with the occasional resource hog), add maxSyncUploadSize=20000000 to the [web] section of your /etc/gavo.rc.
  • Adding support for https (certificate reading, certificate updating with letsencrypt, registering alternate endpoints, no WebSAMP with https). See HTTPS in DaCHS on this blog for details.
  • New source_table and preview columns in obscore. If you’re using the various obscore mixins, this should be automatic. If you have defined views manually, you will have to amend these (and you will have a broken obscore until a dachs upgrade has run without error).
  • No longer producing arraysize="1" in VOTables for scalars (except char, for compatibility with a legacy TOPCAT workaround; see VOTable 1.3 Erratum 3 for background information).
  • Support for draft TIMESYS in VOTable (with STC 2 annotation; ask about details if you’re interested. This is for draft VOTable 1.4 and probably only relevant to you if you’re publishing time series).
  • You can now add targetType and targetTitle properties to URL-valued columns to help Aladin figure out what to do with URLs (see Datalinks as product URLs in the reference documentation).
  • New gavo_transform, gavo_ipix, and gavo_urlescale ufuncs for ADQL, fixed gavo_urlescape to have acceptable performance.
  • Now generating CatalogResource records with auxiliary capabilities in accordance with the Oct 2018 VODataService WD.
  • //soda#sdm_genDesc now matches accref rather than pubDID by default. If you use Datalink with SSA and have a custom pubDID schema (or no index on accref), add a useAccref="False" to your descriptorGenerator statement.
  • There is now a --foreground option for dachs serve start. This is mainly to play nice with systemd, and indeed, the Debian package now comes with a systemd unit file. I’m not terribly familiar with systemd, so please have an eye on DaCHS controlled by systemd and let me know if you see something that’s not as it should be.
  • Fixes for various bugs (most notably: ” in ADQL, WCS in SIAP cutout products) and many minor improvements. Check out the source tree (still via subversion) and read the changelog if you want to know the whole truth.
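To make the maxSyncUploadSize item above a bit more concrete, here is a minimal sketch of what the fragment in /etc/gavo.rc could look like and how the value reads back. It assumes the file can be treated as plain INI and uses Python's standard configparser rather than DaCHS's own configuration machinery; the function name read_sync_upload_limit is made up for this illustration.

```python
import configparser

# Fragment as it might appear in /etc/gavo.rc.  Section and key come from
# the changelog entry; everything else in the file is left out here.
GAVO_RC_FRAGMENT = """
[web]
maxSyncUploadSize=20000000
"""

# The default shipped with 1.3 according to the changelog (about 500 kByte).
DEFAULT_MAX_SYNC_UPLOAD = 500000


def read_sync_upload_limit(rc_text):
    """Return the sync upload limit in bytes from INI-style text.

    Hypothetical helper: falls back to the 1.3 default when the [web]
    section or the key is missing.
    """
    parser = configparser.ConfigParser()
    parser.read_string(rc_text)
    return parser.getint(
        "web", "maxSyncUploadSize", fallback=DEFAULT_MAX_SYNC_UPLOAD)


if __name__ == "__main__":
    print("with override:", read_sync_upload_limit(GAVO_RC_FRAGMENT))
    print("without override:", read_sync_upload_limit(""))
```

The only point is that the override lives in the [web] section; whether DaCHS itself needs a restart or reload to pick it up is not covered by this sketch.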

On systems running from the Debian package, the update should be automatic with the next system upgrade. However, you’ll be saving yourself quite a bit of headache if you check the health of your installation before the upgrade; see Upgrading DaCHS in the operator’s guide on how to upgrade professionally.

A New View on SSAP in DaCHS

When I started working on the VO in 2007, my colleagues in Garching already had software that implemented major parts of the simple spectral access protocol (SSAP) that was being developed back then. It would publish spectra in the FITS format by just blindly dumping all header cards into a database table and then defining a view over that “raw” metadata table to make the whole thing match SSAP’s expectations for what the output table should look like. Sometimes you could just map through a header to an SSA column, sometimes you would just convert a unit, sometimes you would have to write fairly complex SQL expressions combining multiple fields.

Back then, I didn’t like it – why have two things (a table and a view) that can break when one (just a table in SSA’s format) would do, too? Also, SSAP has about 50 metadata fields, but lets you put constant values into VOTable PARAMs, which seemed a very reasonable way to attain more compact responses. So, when DaCHS grew SSAP support, I defined a mixin (essentially, a configurable interface definition) that let operators define SSA tables and their constant parameters in a fairly simple fashion and directly produced a table you could base your SSAP service on.

That made assumptions about which pieces of metadata are constant and which are not; for instance, the original mixin (“hcd” for “homogeneous collection”) assumed all spectra in a data collection came from the same instrument and had the same resolution and (what was I thinking?) SNR. Unsurprisingly, that broke fairly soon. So, I added a second mixin (“mixc”) for when different instruments or codes produced the data.

But even that was a headache, at the latest when I started making time series services using SSAP. And I had to fix a few bugs in the mixins themselves in the meantime, which mostly required re-imports of the data in that design. Such re-imports are non-trivial when you have millions of spectra, and they need to happen at software upgrade time or the services would break with the upgrade. Ouch.

It was about mid-2018 when it dawned on me that sometimes it’s better to have two things that can break even if one would do, after all. Specifically, if fixing the one thing is expensive, it’s an excellent idea to put a facade on top of it that’s cheap to change and can already be used to repair most deficiencies. Why re-build the house if a paint job does the trick?
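The table-plus-facade idea can be sketched outside of DaCHS with nothing but SQLite. This is not how the //ssap#view mixin is implemented; it only illustrates why repairing a view is cheap. The table name raw_spectra, the view name ssa_view, the sample rows, and the column names are all made up for this example, and the target unit for the spectral limits is assumed to be metres.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The expensive-to-rebuild part: a table holding metadata roughly as it
# was found in the files.  All names here are illustrative.
conn.execute(
    "CREATE TABLE raw_spectra ("
    " accref TEXT PRIMARY KEY,"
    " object TEXT,"
    " wave_min_angstrom REAL,"
    " wave_max_angstrom REAL,"
    " instrume TEXT)"
)
conn.executemany(
    "INSERT INTO raw_spectra VALUES (?, ?, ?, ?, ?)",
    [
        ("spec/a.fits", "Vega", 4000.0, 7000.0, "SPECTRO-1"),
        ("spec/b.fits", "Sirius", 3500.0, 9000.0, "SPECTRO-2"),
    ],
)


def define_view(connection, select_sql):
    """(Re)create the facade view; the raw table is never touched."""
    connection.execute("DROP VIEW IF EXISTS ssa_view")
    connection.execute("CREATE VIEW ssa_view AS " + select_sql)


# First attempt: the spectral limits are passed through in Angstrom,
# although consumers of the view expect metres.
define_view(
    conn,
    "SELECT accref, object AS ssa_targname,"
    " wave_min_angstrom AS ssa_specstart,"
    " wave_max_angstrom AS ssa_specend,"
    " instrume AS ssa_instrument"
    " FROM raw_spectra",
)

# The repair replaces only the view definition; nothing is re-imported.
define_view(
    conn,
    "SELECT accref, object AS ssa_targname,"
    " wave_min_angstrom * 1e-10 AS ssa_specstart,"
    " wave_max_angstrom * 1e-10 AS ssa_specend,"
    " instrume AS ssa_instrument"
    " FROM raw_spectra",
)

rows = conn.execute(
    "SELECT ssa_targname, ssa_specstart, ssa_specend, ssa_instrument"
    " FROM ssa_view ORDER BY accref"
).fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM raw_spectra").fetchone()[0]

for row in rows:
    print(row)
```

With millions of rows in the raw table, the second define_view call still only touches the catalogue entry for the view, which is the paint job as opposed to the re-built house.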

As to having more compact query responses when you stuff metadata that’s constant in all the rows into VOTable PARAMs – well, in the age of web pages pulling in a megabyte of javascript and two megabytes of images to display five lines of text, I’ve become a bit cavalier in that department. Sure, the average row may have grown by a factor of three, but we’re still talking only a few megabytes even with large responses. To me, these extra bytes seem a fair price to pay for the increased flexibility and overall more straightforward architecture.

So, I’ve now come up with a view-based solution in DaCHS, too: the //ssap#view mixin. This is a bit less radical than the Garching software of 2007, as it doesn’t dump raw headers but instead lets you do the primary transformations in the RD. But it no longer constrains what pieces of metadata should be constant and which may vary between spectra, and it uses the same names for the same pieces of metadata throughout (which also is a step forward over the old SSAP mixins).

With this, DaCHS operators should no longer use the hcd and mixc mixins for new services. The new technique is already reflected in the respective tutorial chapter, and the SSAP template (you’re using dachs start, aren’t you?) now uses it, too.

If you have a spectra publishing project in your pipeline, this would be the perfect time to upgrade to the DaCHS 1.2.4 beta, which has the new mixin. It would be great if we could iron out remaining wrinkles before the next release makes changes a load on my conscience.

As to migrating existing SSAP services: Well, it would be great if I could drop the old mixins in a couple of years, as they cause quite a bit of ugliness in DaCHS’s built-in //ssap RD. But the migration regrettably isn’t straightforward, so you may want to wait a bit before embarking on that journey (I’ll be happy to help, though).

DaCHS 1.1 released

Today, I have released DaCHS 1.1, with the main selling point that DaCHS should now speak TAP 1.1 (as defined in the current draft).

First off, if you’re not yet on DaCHS 1.0, please read the corresponding release article before upgrading.

As usual, the general upgrading instructions are available in the operator’s guide (in short: do a dachs val ALL before the Debian upgrade). This time, I’d recommend using the opportunity to upgrade your underlying server to stretch if you haven’t done so already. If you do that, please have a look at the hints on postgres upgrades. Stretch comes with postgres 9.6 (jessie: 9.4). Postgres upgrades are generally safe, but please take a dump before migrating anyway.

So, with this out of the way, here’s a short list of the major changes from DaCHS 1.0 to DaCHS 1.1:

    • DaCHS now officially requires python 2.7. If this really is a problem for you, please shout – it wouldn’t be hard to maintain 2.6 compatibility, but by now we feel there’s no reason to bother any more.
    • Now supporting TAP 1.1; in particular, TOP n doesn’t trump MAXREC any more, and it doesn’t affect OVERFLOW indication, which may break things that used TOP to override DaCHS’ default TAP match limit of 2000. Also, TAP_SCHEMA is updated (this happens as a side effect of dachs upgrade).
    • Now serialising spoint, scircle, and friends to DALI 1.1 xtypes (timestamp, point, polygon, circle). Fields explicitly marked with adql:POINT or adql:REGION will still be serialised to STC-S. Do this only if you have no choice (DaCHS has this for obscore and epntap s_region right now).
    • The output column selection is sanitised. This may make for slight changes in service responses, in particular in VOTable formats. See Output Tables in the reference documentation for details if you think this might hit you.
    • DaCHS no longer comes with an outdated version of pyparsing and instead uses what’s installed on the system. The Debian package further re-uses additional system resources if available (rjsmin, jquery).
    • DaCHS now tries a bit harder to come up with sensible names for SODA result files.
    • map/@source is no longer limited to identifier-like strings; any key that’s in your source is fair game.
    • For incremental imports with data that’s updated now and then, there’s now ignoreSources/@fromdbUpdating.
    • Relative imports from custom code (“import foo” in a custom core, for instance, getting res/foo.py) no longer work. See Importing Modules in the reference documentation for details.
    • This release fixes a severe bug in the creation of obscore metadata from SSAP tables. If you use //obscore#publishSSAPHCD or //obscore#publishSSAPMIXC mixins, update the obscore definitions by running dachs imp -m <rdid>, followed by dachs imp //obscore (the latter is only necessary once at the end).
    • You can now define a footer.html template that’s added at the foot of the main page content – with a bit of CSS magic, this lets you overwrite almost anything on DaCHS HTML pages.
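The TAP 1.1 item in the list above is the one most likely to bite existing clients, so here is a rough model of how TOP and MAXREC interact after the change. It is one reading of the changelog entry, not DaCHS code: the function name tap11_result_size is hypothetical, and the rule used (TOP caps the rows the query asks for, MAXREC caps what is delivered, and only MAXREC truncation counts as overflow) is an assumption of this sketch.

```python
# DaCHS' default TAP match limit as quoted in the changelog entry.
DEFAULT_MAXREC = 2000


def tap11_result_size(matching_rows, top=None, maxrec=DEFAULT_MAXREC):
    """Return (rows_returned, overflow) for a query.

    Hypothetical model, assuming:
    - TOP n limits what the query itself asks for,
    - MAXREC limits what is delivered, regardless of TOP,
    - overflow is flagged only when MAXREC cut the result short.
    """
    rows_wanted = matching_rows if top is None else min(matching_rows, top)
    rows_returned = min(rows_wanted, maxrec)
    overflow = rows_wanted > maxrec
    return rows_returned, overflow


if __name__ == "__main__":
    # TOP 5000 no longer overrides the default limit of 2000.
    print(tap11_result_size(10000, top=5000))
    # A small TOP truncates without an overflow indication.
    print(tap11_result_size(10000, top=100))
    # To really get 5000 rows, raise MAXREC explicitly.
    print(tap11_result_size(10000, top=5000, maxrec=5000))
```

Under these assumptions, clients that relied on TOP to lift the default limit would have to pass a larger MAXREC instead.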

    As always, please complain early if something breaks for you; our regression tests can only cover so much. In particular, our support list is there for you.

    Update (2017-12-06): In particular on jessie, you may see that all DaCHS packages are being held back. To resolve this situation, manually say apt-get install python-gavoutils python-gavostc.