GAVO Blog: Virtual Observatory Matters: Categories

Articles from Operations

HTTPS in DaCHS

2018-09-24 Markus Demleitner

Another little aspect of HTTPS support in DaCHS: In the web interface, the webSAMP button must disappear in pages served through HTTPS: it simply wouldn't work.

(Warning: No astronomy-relevant content at all this time).

I can't say I'm a big fan of the mighty push towards HTTPS that's going on right now – as I'm arguing in the updated operator's guide it doesn't do people's privacy a lot of good (compared to, say, pushing for browsers to not execute JavaScript by default or have DNSSEC widely deployed), but it's a fairly substantial operational liability. With HTTPS, operators have to deal with cryptographic material, regularly update their certificates, restart their services in time and assemble the whole thing correctly (don't get me started about proxying, SNI, and all those horrors). Users, on the other hand, have to keep their CA certificates in order, in particular when they do programmatic VO access, where the browser vendors, their employers and who knows who else don't do it for them. Pop quiz: How would you install a new CA certificate on your box? And will your default browser see it?

But on the other hand, there are some scenarios in which HTTPS makes sense, and I can remotely fantasise that some of those may even be relevant to the VO. And people have been asking for HTTPS in DaCHS a number of times, at times even because their administrations urged them to switch. So, here it is, hopefully. Turning it on is reasonably easy when you use Letsencrypt (which in particular entails having ports 80 and 443); the section on Letsencrypt in the operator's guide tells you what to do. In particular, don't forget the cron job, because without it, things would break after three months (when the initial certificate expires).
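To notice a forgotten cron job before the certificate actually expires, a small check along the following lines may help. This is only a sketch and not part of DaCHS: the function name `days_until_expiry` is made up for illustration, and the `notAfter` string is assumed to come from the dictionary that Python's `ssl.SSLSocket.getpeercert()` returns.

```python
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Return the number of whole days until a certificate expires.

    not_after is a notAfter string as found in the dictionary returned by
    ssl.SSLSocket.getpeercert(), e.g. "Dec 23 12:00:00 2018 GMT".
    now is a Unix timestamp; it defaults to the current time.
    """
    if now is None:
        now = time.time()
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - now) // 86400)


# Hypothetical example: a certificate issued on the day of this post
# and valid for 90 days.
issued = ssl.cert_time_to_seconds("Sep 24 12:00:00 2018 GMT")
print(days_until_expiry("Dec 23 12:00:00 2018 GMT", now=issued))  # -> 90
```

A monitoring script could call this once a day and complain when the result drops below whatever margin you are comfortable with.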

Things get difficult after that. For one, if your box is known under several names (our data center, for instance, can be reached as any of dc.g-vo.org, vo.uni-hd.de, and dc.zah.uni-heidelberg.de; this of course also includes things like www.example.org and example.org), you'll now have to tell DaCHS about it in the new [web]alternateHostnames configuration item; for instance, we have:
```
[web]
serverURL: http://dc.zah.uni-heidelberg.de
alternateHostnames:dc.g-vo.org, vo.uni-hd.de
```
in our /etc/gavo.rc.
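If you want to double-check which names your server will answer to, the following sketch pulls them out of a configuration fragment like the one above. It uses Python's generic `configparser` rather than DaCHS' own configuration machinery, so treat it as an illustration only; the function name `all_hostnames` is hypothetical.

```python
import configparser
from urllib.parse import urlparse

# The fragment from above; a real script would read /etc/gavo.rc instead.
SAMPLE_RC = """\
[web]
serverURL: http://dc.zah.uni-heidelberg.de
alternateHostnames:dc.g-vo.org, vo.uni-hd.de
"""


def all_hostnames(rc_text):
    """Return the primary host name followed by the alternate ones."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the camelCase option names as written
    parser.read_string(rc_text)
    web = parser["web"]
    primary = urlparse(web["serverURL"]).hostname
    alternates = [
        name.strip()
        for name in web.get("alternateHostnames", "").split(",")
        if name.strip()
    ]
    return [primary] + alternates


print(all_hostnames(SAMPLE_RC))
# -> ['dc.zah.uni-heidelberg.de', 'dc.g-vo.org', 'vo.uni-hd.de']
```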

And then the Registry has to know you have HTTPS. There's actually no convention for that in the VO yet. But since I'd really like to have at least fallback interfaces with plain HTTP, we'll have to come up with something. For now, my plan is to declare the alternative protocol (i.e., HTTPS for sites that have an HTTP serverURL and vice versa) using the brand-new VOResource 1.1 mirrorURLs (in RegTAP 1.1, they are in the mirror_url column of rr.interface). To make DaCHS declare the alternate URLs, set [web]registerAlternative to True.
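For client developers, here is a rough sketch of how one might ask a RegTAP 1.1 service for interfaces that declare mirrors. The endpoint URL is an assumption (adapt it to the RegTAP service you use), the function name `build_sync_request` is made up, and the query assumes that interfaces without mirrors have a NULL in mirror_url. The sketch only builds the request URL; it does not contact any server.

```python
from urllib.parse import urlencode

# Assumption: the synchronous TAP endpoint of the RegTAP service.
TAP_SYNC_URL = "http://dc.g-vo.org/tap/sync"

# ADQL against the RegTAP table and column named in the text above.
MIRROR_QUERY = """
SELECT ivoid, access_url, mirror_url
FROM rr.interface
WHERE mirror_url IS NOT NULL
"""


def build_sync_request(query, base_url=TAP_SYNC_URL):
    """Return a URL for a synchronous TAP query (nothing is sent)."""
    params = {
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        # Collapse whitespace so the URL stays readable.
        "QUERY": " ".join(query.split()),
    }
    return base_url + "?" + urlencode(params)


print(build_sync_request(MIRROR_QUERY))
```

Fetching the resulting URL (for instance with `urllib.request.urlopen`) should return a VOTable; a client could then prefer the HTTPS variant where both are declared.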

Another change I've introduced for HTTPS is that the default HTML template for the form renderer (i.e., the one people use who come with a browser) now suppresses the SAMP button if the request came in through HTTPS; that's because WebSAMP doesn't work with HTTPS and probably never will – at least I can't see a way to make it happen without totally wrecking what security guarantees HTTPS gives.

All this doesn't yet cater for the case when you use a reverse proxy to terminate HTTPS. If you are in that situation, please talk to me so we can figure out a sane way for you to explain to DaCHS what to tell the Registry.

Anyway, if you want to try things out, just switch to the beta repository and upgrade. Feedback is highly welcome.

Oh, and if you're a client developer: Our data center is now reachable through HTTPS (at https://dc.g-vo.org), and we have already pushed the records with mirrorURLs declaring HTTPS support to the RegTAP service at dc.g-vo.org (the others will have to wait a bit longer, as we haven't re-published our registry records yet; it's all experimental, after all).

Category: Operations

Horror vacui begone

2018-04-13 Markus Demleitner

Mikhail's qrdcreator in a browser and an editor with a dachs start-produced template.

One of the major usability issues our publishing suite DaCHS has for operators (i.e., people who want to publish data) is the “horror vacui”: How do I start a Resource Descriptor (RD – the file DaCHS interprets to build services)?

I used to recommend starting by having a look at the RDs of our existing services and picking whatever best matches your publication project. But finding a matching service and figuring out what is generic, what's a special property of the concrete data collection, and what's a hack that should not be reproduced isn't straightforward at all, not to mention the fact that some of those RDs have been in maintenance mode for almost 10 years and hence may show deprecated practices.

Then came the VESPA implementation workshop last year, during which Mikhail Minin showed me a piece of JavaScript and HTML (source on github) he had written to overcome the empty editor window. Essentially, Mikhail has built a fairly comprehensive form interface in a web browser that asks people the right questions to eventually write an RD for EPN-TAP (i.e., solar system) resources.

I had planned to generalise Mikhail's approach to several types of resources supported by DaCHS, ideally inferring the questions to ask from the built-in documentation of mixins and applys. But during the last year, whenever I felt it would be a good time to tackle that generalisation, I quickly gave up again. It was mostly rather trivial stuff such as how to tell apart repeatable metadata (waveband, say) and non-repeatable metadata (instrument, say). But it was bad enough that I quickly found something else to do each time I got started.

Eventually, I gave up on a menu interface altogether – making it flexible and generatable at the same time seemed a fairly complex problem. But that doesn't mean I forgot about overcoming the horror vacui thing. So, when forms aren't flexible enough for data entry, where do you turn? Right! A text editor.

Enter dachs start. That's a new DaCHS subcommand that gets you started with your RD. For one, you can list the templates available:
```
$ dachs start list
siap -- Image collections via SIAP1 and TAP
ssap+datalink -- Spectra via SSAP and TAP, going through datalink
epntap -- Solar system data via EPN-TAP 2.0
scs -- Catalogs via SCS and TAP
```
More templates are planned; siap+datalink, for instance, would cover some frequent use cases. Feel free to mail in requests.

Once you find a suitable template, create your future resource directory, enter it and run dachs start again, this time passing the name of the template you want:
```
$ mkdir ex_data
$ cd ex_data
$ dachs start scs
$ head -16 q.rd | tail -9
<resource schema="ex_data">
  <meta name="creationDate">2018-04-13T12:34:31Z</meta>

  <meta name="title">%title -- not more than a line%</meta>
  <meta name="description">
    %this should be a paragraph or two (take care to mention salient terms)%
  </meta>
  <!-- Take keywords from
    http://astrothesaurus.org/thesaurus/hierarchical-browse/
```
dachs start uses the directory name as the new schema name and then writes a file q.rd (which is the canonical name for the “main” RD in a resource). Within this file, you'll see things to fill out between pairs of percent signs with short explanations. Where longer explanations are necessary, embedded comments should help.

To give you an idea of the intended use: As a vim user, I've put
```
augroup rd
  au!
  au BufRead,BufNewFile *.rd imap <F8> <ESC>/%[^%]*%<CR>a
  au BufRead,BufNewFile *.rd imap <F9> <ESC>cf%
augroup END
```
into my ~/.vimrc. That way, while editing the template into an actual RD, hitting F8 takes me to the next thing to be edited; I can then read the instructions, and when I have made up my mind, I can either delete the template element or hit F9 and replace the explanation text with whatever belongs there.
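If you don't use vim, or just want to know how much is left to fill in, the same percent-sign pattern can be applied from a script. The following is a minimal sketch and not part of DaCHS; the function name `find_placeholders` is made up, and percent signs in actual content would show up as false positives.

```python
import re

# Same idea as the vim search pattern: anything between two percent signs.
PLACEHOLDER = re.compile(r"%[^%]*%")

# Two of the lines dachs start wrote in the example above.
SAMPLE = """\
<meta name="title">%title -- not more than a line%</meta>
<meta name="description">
  %this should be a paragraph or two (take care to mention salient terms)%
</meta>
"""


def find_placeholders(rd_text):
    """Return (line_number, placeholder) pairs for what is still to be edited."""
    found = []
    for line_number, line in enumerate(rd_text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            found.append((line_number, match.group(0)))
    return found


for line_number, placeholder in find_placeholders(SAMPLE):
    print(line_number, placeholder)
```

For a real resource, you would read q.rd from disk and pass its content in instead of `SAMPLE`.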

The command is available starting with the 1.1.3 beta (available now by switching to the beta repo) and will be part of the 1.2 release, planned for early June after the Victoria interop.

If you have a publication project: just try it out and give feedback. Note that the templates haven't actually been tested yet, and the comments were written by a DaCHS and VO nerd, so they might not always be great either. Thus, when you get stuck: complain early, complain often!

Category: Operations

Heidelberg Data Center Down^WUp again

2017-11-11 Markus Demleitner

Well, it has happened – perhaps it was the strain of restoring a couple of terabytes of data (as reported yesterday), perhaps it's uncorrelated, but our main database server's RAID threw errors and then disappeared from the SCSI bus today at about 15:03 UTC.

This means that all services from http://dc.g-vo.org are broken for the moment. We're sorry, and we will try to at least limp on as fast as possible.

Update (2017-11-13, 14:30 UTC): Well, it's official. What's broken is the lousy Adaptec controller – whatever configuration we tried, it can't talk to its backplane any more. Worse, we don't have a spare part for that piece here. We're trying to get one as quickly as possible, but even medium-sized shops don't have multi-channel SAS controllers in stock, so it'll have to be express mail.

Of course, the results of the weekend's restore are lost; so, we'll need about 24 hours of restore again to get up to 90% of the services after the box is back up, with large tables being restored after that. Again, we're unhappy about the long downtime, but it could only have been averted by having a hot spare, which for this kind of infrastructure just wouldn't have been justifiable over the last ten years.

Another lesson learned: Hardware RAID sucks. It was really hard to analyse the failure, and the messages of the controller BIOS were completely unhelpful. We, at least, will migrate to JBOD (one of the cool IT acronyms with a laid-back expansion: Just a Bunch Of Disks) and software RAID.

And you know what? At least the box had two power supplies. If these weren't redundant, you bet the power supply would have failed.

To give you an idea how bad things are, here is the open server with the controller card that probably caused the mayhem (left), and 12 TB of fast disk, yearning for action (right).

Update (2017-11-14, 12:21 UTC): We're cursed. The UPS guys with the new controller were in the main institute building. They claimed they couldn't find anyone. Ok, our janitor is on sick leave, and it was lunch break, but still. It can't be that hard to walk up a single flight of steps. Do we really have to wait another day?

Update (2017-11-14, 14:19 UTC): Well, UPS must have read this – or the original delivery report was bogus. Anyway, not an hour after the last entry the delivery status changed to "delivered", and there the thing was in our mailbox.

Except – it wasn't the controller in the first place. It turned out that, in fact, four disks had failed at the same time. It's hard to believe but that's what it is. Seems we'll have to step carefully until the disks are replaced. We'll run a thorough check tonight while we prepare the database tables.

Unless more disaster strikes, we should be back by tomorrow morning CET – but without the big tables, and I'm not sure yet whether I dare put them in on these flimsy, enterprise-class, 15k, SAS disks. Well, I'll give you that they've run for five years now.

Update (2017-11-15, 14:37 UTC): After a bit more consideration, I figured I wouldn't trust the aging enterprise disks any more. Our admins then gave me a virtual machine on one of their boxes that should be powerful enough to keep the data center afloat for a while. So, the data center is back up at 90% (counting by the number of regression tests still failing) since an hour ago or so.

Again, the big tables are missing (and a few obscure services the RDs of which showed bitrot and need polishing); they should come in over the next days, one by one; provided the VM isn't much slower than our DB server, you should see about two of them come in per day, with my planned sequence being hsoy, ppmxl, gps1, gaia, 2mass, sdssdr7, urat1, wise, ucac5, ucac4, rosat, ucac3, mwsc, mwsc-e14a, usnob, supercosmos.

Feel free to vote tables up if you severely miss a table.

And all this assumes no further disaster strikes...

Update (2017-11-16, 9:22 UTC): Well, it ain't pretty. The first large catalog, HSOY, is finally in, and the CLUSTER operation (which dominates restore time) took almost 12 hours; and HSOY, at 0.5 Gigarecord, isn't all that large. So, our replacement machine really is a good deal slower than our normal database server, which did that operation in less than three hours. I guess you'll want to do your large-table queries on a different service for the next couple of weeks. Use the Registry!

Update (2017-11-20, 9:05 UTC): With a bit more RAM (DaCHS operators: version 1.1 will have a new configuration item for indexing work memory!), things have been going faster over the weekend. We're now down to 15 regression tests failing (of 330), with just 4 large catalogs missing still, and then a few nitty-gritty, almost invisible tables still needing some manual work.

Update (2017-11-23, 14:51 UTC): Only 10 regression tests are still failing, but progress has become slow again – the machine has been clustering supercosmos.data for the last 36 hours now; it's not that huge a table, so it's a bit hard to understand why this table is holding up things so much. On the plus side, new SSDs for our database server are being shipped, so we should see faster operation soon.

Update (2017-12-01, 13:05 UTC): We've just switched the database server back to our own server with its fresh SSDs. A few esoteric big tables are still missing, but we'd say the crisis is over. Hence, that's the last update. Thank you for your attention.

Category: Operations

A Tale of CLUSTER and Failure

2017-11-10 Markus Demleitner

This command nuked 5 TB of database tables (with a bit of folly before).

Whenever you read “backup”, the phrase “lessons learned” is usually not far off. And so it is here, with a little story for DaCHS operators (food for thought, I'd say), astronomers (knowing what's going on behind the curtain sometimes helps write better queries), and everyone else (for amusement and a generous helping of schadenfreude).

It all started yesterday when I upgraded the main database server of our data center (most anything in the VO with an org.gavo.dc in the IVOID depends on it) to Debian stretch. When that was done, I decided that with about 1000 installed packages, too much cruft had accumulated and started happily removing unused software. Until I accidentally removed the postgres package. In itself, that would not have been so disastrous – we're running Debian, which means packages usually keep the configuration and, in particular, the data around even if you remove them. The postgres packages, at the very least, do, and so does DaCHS.

Unless, that is, you purge the postgres package before you notice you've removed it. I, for one, found it appropriate to purge all packages deleted but not purged right after my package deletion spree. Oh bother. Can you imagine my horror when the beastly machine said “dropping cluster main”? And ignored my panic-induced ^C (which, of course, was the right thing to do; the database was toast already anyway).

There I had just flushed 5 Terabytes of highly structured data down the drain.

Well, go restore from backup, you say? As usual with backups, it's not that simple™. You see, backing up databases is tricky. One can of course just back up the files as they are and then try to restore from them. However, while the database is running, it is continually modifying what's on the disk, so such a backup will be an inconsistent, unusable mess. Even if one had a file system that can do snapshots, a running server has in-memory state that is typically needed to make heads or tails of the disk image.

So, to back up a database, there are essentially variations of two themes, roughly:
- ask the database to dump itself. The result is a conventional file that essentially is a recipe for how to re-create a particular state of the database.
- have a “hot spare”. That's another machine with a database server running. In one way or another that other box snoops on what the main machine is doing and just replicates the actions it sees. The net effect is that you have an immediately usable copy of your database server.
Anyway, after the opening of this article you'll not be surprised to learn that we did neither. The hot spare scenario needs a machine powerful enough to usefully serve as a stand-in and to not slow down the main machine when we feed data by the Gigarecords. Running such a machine just for backup would be a major waste of electricity – after all, this is the first time in about 10 years that it would really have been needed, and such a box slurps juice like it's... well, juice.
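For the record, the first of the two approaches, the dump, could be scripted roughly as follows. This is a sketch under assumptions: the database name `gavo`, the table name and the output path are placeholders, the function name `dump_command` is made up, and the code only assembles the `pg_dump` command line without running it.

```python
import shlex


def dump_command(database, outfile, tables=()):
    """Return the argument list for a pg_dump call in custom format.

    The custom format is what pg_restore reads; tables restricts the
    dump to the named tables (all tables if empty).
    """
    command = ["pg_dump", "--format=custom", "--file", outfile]
    for table in tables:
        command += ["--table", table]
    command.append(database)
    return command


# Placeholder names; adapt database, path and table to your installation.
print(shlex.join(dump_command("gavo", "/backup/hsoy.dump", ["hsoy.main"])))
# -> pg_dump --format=custom --file /backup/hsoy.dump --table hsoy.main gavo
```

To actually run it, pass the list to `subprocess.run(..., check=True)` as a user that may read the tables in question.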

As to maintaining a dump: Well, for the big catalogs, we use DaCHS' direct grammars [PSA: don't follow this link unless you're running DaCHS]. These are, except perhaps for a small factor, just as fast as a restore from a dump. And the indices (i.e., data structures that tell the computer where to look for objects with a certain position or magnitude rather than having to go through the whole table) need to be re-made when restoring from dumps, too, so we'd be pushing around files of several terabytes for almost no benefit.

Except. Except I could have known better, because during catalog ingestions the most time-consuming task usually is the CLUSTER operation. That's when the machine re-organises the data on disk so it matches expected access patterns – for astronomical data, that's usually by spatial location. Having a large table clustered makes an astonishing difference, in particular when you're still using spinning disks (as we are). So, there's really no way around it.

But it takes time. And more time. And that time is saved when restoring from a dump, because the dump (hopefully) largely preserves the on-disk organisation, and so the CLUSTER is almost a no-op.

Well, the bottom line is: on our Heidelberg data center, the big tables are only coming back slowly; as I write this, from the gigarecord league PPMXL and GPS1 are back, with SDSS DR7 and HSOY expected later today. But it'll probably take until late next week until all the big tables are back in and properly indexed and clustered.

Apologies for any inconvenience. On the other hand, as measured by our regression tests (DaCHS operators: required reading!) 90% of our stuff is fine again, so we could fare worse given we just had a database disaster of magnitude 5 on the Terabyte scale.

Which begs the question: Was it better this way? At least many important services are safely back up, and that might very well not be the case were we running the restore from an actual dump. Hm.

Category: Operations

Register your stuff with purx!

2017-11-02 Markus Demleitner

If you open the TAP dialog of TOPCAT, what you see is Registry content.

The VO Registry lets people find astronomical resources (which is jargon for “dataset, service, or stuff”). Currently, most of its users don't even notice they're using the Registry, as when TOPCAT just magically lists what TAP services are available (image above) – but there are also interfaces that let you directly interact with the registry, for instance GAVO's WIRR service or ESAVO's Registry Search.

Arguably, the usefulness of the Registry scales with its completeness. With sufficient completeness, the domain-specific, structured metadata will also make it interesting for generic discovery of astronomical data; in a quip, looking for UCDs in google will never work quite well – and without that, it's hard to find things with queries like “radio fluxes of early-type stars”.

Either way: If you have a data set or a service dealing with astronomy, it'd be great if you could register it. To do this, so far you either had to set up a publishing registry, which is nontrivial even if you have software that natively speaks a protocol called OAI-PMH (DaCHS does, but most other publishing suites don't), or you could use one of two web interfaces to define your resource (notes for a talk on this I gave in 2016).
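To give an impression of what speaking OAI-PMH means from the harvester's side, here is a sketch that builds the kind of request a searchable registry sends to a publishing registry. The endpoint URL is a placeholder and the function name `list_records_url` is made up; `ivo_vor` and `ivo_managed` are the metadata prefix and set name used in VO registry harvesting. Nothing is sent over the network.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="ivo_vor", set_name=None):
    """Return the URL of an OAI-PMH ListRecords request (nothing is sent)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_name is not None:
        params["set"] = set_name
    return base_url + "?" + urlencode(params)


# Placeholder endpoint; a real publishing registry has its own URL.
print(list_records_url("http://registry.example.org/oai.xml", set_name="ivo_managed"))
```

The hard part of running a publishing registry is of course not answering this one request but keeping the records, their datestamps and deletions consistent over time.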

Neither of these options is really attractive if you publish only a few resources (so the overhead of running a publishing registry looks excessive) that change now and then (so using a web browser to update the resource records again and again is tedious). Therefore, GAVO has developed purx, the publishing registry proxy. We've officially announced it during the recent Southern Spring Interop in Santiago de Chile (Program), and the lecture notes for that talk are probably a good introduction to what this is about.

If you're running VO services and have not registered them so far, you probably want to read both these notes and the service documentation. If, on the other hand, you just have a web-published directory of files or a browser-based service, you probably can skip even that. Just grab a sample record (use the one for a simple browser service in both cases) and adapt it to what's fitting for your website. Then put the resulting file online somewhere and paste the URL of that location on purx' enrollment service. In case you're uncertain about some of the terms in the record, perhaps our crib sheet for metadata we ask our data providers for will be helpful.

There's really no excuse any more for not being in the Registry!

Category: Operations
