Posts with the Tag DaCHS:

  • DaCHS 2.8 is out

    Today, I have released DaCHS 2.8 and uploaded it to our APT repository; it should also appear in Debian unstable within the next two weeks. This is the traditional post on what is new in this release.

    If I had to name the highlights of what was added since version 2.7, released last November, I would probably say it's HiPS support and the general move towards SIAPv2, although I have to admit that neither involved large amounts of code, in particular when compared to the various changes related to COOSYS and TIMESYS.

    So, what about HiPS support? As you probably know, HiPSes are zoomable images (or catalogues, too); if you have a survey-like image collection published through SIAP, you owe it to yourself to have a look at this.

    Given HiPSes are so interactive in Aladin and the like, it may be surprising that they do not really require an active server component: technically, they are just a directory tree created and organised in a very clever way. So, why would DaCHS have a HiPS renderer and boast about it? Well, there are a few amenities (such as auto-generated hips.params files and properties once you have your RD), and DaCHS will care about the Registry side of a HiPS publication. For details, see the HiPS section in the tutorial.
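
    To make the “just a directory tree” point a bit more tangible, here is a minimal sketch of how a client could work out where a tile lives, assuming the Norder/Dir/Npix naming scheme of the HiPS standard. The function name hips_tile_path is made up for this illustration and is not part of DaCHS.

```python
def hips_tile_path(order: int, npix: int, ext: str = "png") -> str:
    """Return the path of a HiPS tile relative to the HiPS root directory.

    order is the HEALPix order, npix the HEALPix cell number at that order
    (assumption: naming as in the HiPS standard).
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    if not 0 <= npix < 12 * 4**order:
        raise ValueError(f"No HEALPix cell {npix} at order {order}")
    # Tiles are grouped into directories of 10000 to keep listings short.
    directory = (npix // 10000) * 10000
    return f"Norder{order}/Dir{directory}/Npix{npix}.{ext}"
```

    Since such a path follows from nothing but the order and the cell number, a plain static file server is all a client needs on the other side.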

    The SIAP2 story is that (against my rather substantial skepticism) people insisted on creating a new image search protocol in the early 2010s. Since it doesn't have tangible benefits over the venerable SIA1 and even less over Obscore, DaCHS so far has limited its support for SIAP2 to a single global SIAP2 service based on the Obscore table. But then SIAP1 with its stinky UCDs does show its age, and since support for SIAP2 in various clients has been falling into place over the last few years, DaCHS now nudges you to publish your images through SIAP2, for instance by producing a template for a SIAP2 service in dachs start.

    SIAP2 is also what the image section of the tutorial now reflects. If you already have SIAP1 services, the migration should not be hard (except where you used the siapCutoutCore), but given occasional shakiness in the SIAP2 support of the various tools, I'd still wait for a year or two; I have certainly no plans to remove SIAP1 from DaCHS within the next ten years or so. If you still want to migrate, feel free to ask for a section on doing so in DaCHS' How Do I? document.

    From the department of “this update may break your service”: If you have SODA cutouts of cubes, this update will rather likely break the cutout on the non-spatial axis. To fix things, if that axis is spectral, pass its index in a spectralAxis parameter to //soda#fits_standardDLFuncs (or to //soda#fits_makeWCSParams, if that's what you use)[1]. On the other hand, you can now define a velocityAxis, too (and for other cases, there is still axisMetaOverrides).

    Among the more generally interesting new features may be the UnionGrammar. This is for when you have multiple sorts of inputs that require different parsers, for instance, when the data provider changes the formats in which they deliver the data in the midst of a project. I would hope the example from the unionGrammar documentation illustrates what this could be useful for:

    <unionGrammar>
      <handles pattern=".*\.txt$">
        <reGrammar...>
      </handles>
      <handles pattern=".*\.csv$">
        <csvGrammar...>
      </handles>
    </unionGrammar>
    

    Also note that you can create some uniformity between what the grammars yield (and thus avoid a lot of if-else-ing in the rowmaker) by using rowfilters.
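
    The dispatch idea behind the unionGrammar can be sketched in plain Python. This is only a conceptual illustration, not how DaCHS implements it: the two parsers are made-up stand-ins for real grammars, and I am assuming that the first matching pattern wins.

```python
import re


def parse_txt(line):
    # Stand-in for a reGrammar: whitespace-separated "name value".
    name, value = line.split()
    return {"name": name, "value": float(value)}


def parse_csv(line):
    # Stand-in for a csvGrammar: comma-separated "name,value".
    name, value = line.split(",")
    return {"name": name.strip(), "value": float(value)}


# (pattern, parser) pairs, tried in order.
HANDLERS = [
    (re.compile(r".*\.txt$"), parse_txt),
    (re.compile(r".*\.csv$"), parse_csv),
]


def select_parser(file_name):
    """Return the first parser whose pattern matches file_name."""
    for pattern, parser in HANDLERS:
        if pattern.match(file_name):
            return parser
    raise ValueError(f"No grammar handles {file_name!r}")


def parse_source(file_name, lines):
    """Parse all lines of one source with the parser chosen by its name."""
    parser = select_parser(file_name)
    return [parser(line) for line in lines]
```

    Both stand-in parsers yield dictionaries with the same keys, which is the sort of uniformity the rowfilters are there to create.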

    I would have needed the union grammar several times before but had always quickly hacked around that need with some custom grammar. Another itch that has come up multiple times in this way, and for which 2.8 has what I think is a reasonable solution: I occasionally want to share some logic between multiple RDs, but that logic is not general enough to go into DaCHS itself. For such situations, you can now drop a file local.py into your configuration directory (usually, /var/gavo/etc).

    In code saying from gavo import api (which is what you should in general do when programming against DaCHS; in procs, say <setup imports="gavo.api"/>), you can then access the names defined in there as api.local.<name>. For instance (and that's not contrived), say your observers have several particularly babylonian ways of writing times, and you have to parse these in several data collections (i.e., RDs). You could then add a function like this to your local.py:

    import re

    def parse_babylonian_time(raw_time: str) -> float:
      """Tries to interpret raw_time as a time in one of the many forms
      our observers like so much.
    
      Here are the syntaxes supported by the function:
    
      >>> parse_babylonian_time("1h")
      3600.0
      >>> parse_babylonian_time("4h30m")
      16200.0
      >>> parse_babylonian_time("1h30m20s")
      5420.0
      >>> parse_babylonian_time("20m")
      1200.0
      >>> parse_babylonian_time("10.5m")
      630.0
      >>> parse_babylonian_time("1m10s")
      70.0
      >>> parse_babylonian_time("15s")
      15.0
      >>> parse_babylonian_time("s23m")
      Traceback (most recent call last):
      ValueError: Cannot understand time 's23m'
      """
      mat = re.match(
        r"^(?P<hours>\d+(?:\.\d+)?h)?"
        r"(?P<minutes>\d+(?:\.\d+)?m)?"
        r"(?P<seconds>\d+(?:\.\d+)?s)?$", raw_time)
      if mat is None:
        raise ValueError(f"Cannot understand time '{raw_time}'")
      parts = mat.groupdict()
    
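      # Absent groups come back as None; substitute a zero with the matching
      # unit letter so that [:-1] can strip the unit in either case.
      return (float((parts["hours"] or "0h")[:-1])*3600
        + float((parts["minutes"] or "0m")[:-1])*60
        + float((parts["seconds"] or "0s")[:-1]))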
    

    (or something similarly abominable). That way, the function is available to all RDs, there is just one implementation to maintain, and it can be centrally tested (dachs test could certainly do with a facility to execute local.py doctests, too).
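
    Until there is such a facility, the doctests can be run by hand with Python's standard doctest module. The following is a minimal sketch; the helper run_doctests_in_file is made up for this post, and it assumes that everything the file imports is importable in the Python you run it with.

```python
import doctest
import importlib.util


def run_doctests_in_file(path, module_name="local_under_test"):
    """Import the Python file at path and run the doctests it contains.

    Returns the TestResults(failed, attempted) tuple of doctest.testmod.
    """
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return doctest.testmod(module)


# Typical use, with the usual configuration directory:
#   print(run_doctests_in_file("/var/gavo/etc/local.py"))
```

    A result with failed equal to zero means that all examples in the docstrings still produce what they promise.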

    DaCHS 2.8 also comes with yet another way to declare space-time metadata. That's a longer story, and while all this should have happened 10 years ago, there's no particular hurry now. I will therefore write about improvements in TIMESYS and COOSYS in a later post dedicated to votable:Coords and its products. Meanwhile, just two things: In the unlikely case you already have “stc2” annotations in your RDs, you will have to rename the value attribute in space clauses to location. And: SSAP and SIAP now produce proper TIMESYS-es. If you happen to know the timescales and reference positions of your observation dates, starting in 2.8 you can define them in the respective mixins (the refposition and timescale mixin parameters).

    There are two notable additions in DaCHS' Datalink support (which is newly declared to support version 1.1): For one, you can now pass contentQualifier to descriptor.makeLink or descriptor.makeLinkFromFile, which will normally be a product type taken from http://www.ivoa.net/rdf/product-type (e.g., “image” or “dynamic-spectrum”). Because they can help clients select appropriate clients to send a datalink to, it is certainly a good thing to add them to your datalinks where applicable.

    Also, datalink meta makers can now return ProcLinkDef instances. This lets you have multiple distinct processing services within a single Datalink document. To make that a bit prettier, there is also a secret handshake (as in: an INFO element with a name of title) between DaCHS' datalink service and the XSLT that formats datalink documents in browsers (also available for third-party datalink documents). See multiple processing services in the reference for details.

    Let me briefly mention a few more changes you may be interested in:

    • condDescs can now be declared as inputOptional, which is useful when you want to have syntax-adaptive defaults.
    • you can now configure the size of DaCHS connection pools in [db]poolSize (in particular, set it to 0 to disable connection pooling).
    • in ADQL, you can now do things like CONTAINS(CIRCLE(23, 42, 1), some_moc) (i.e., compute boolean predicates between the classical geometries and MOCs).
    • DaCHS no longer fails with numpy-s later than 1.23, and is no longer dependent on the cgi module that is scheduled for removal from python. In consequence, there is a new dependency, python3-multipart.
    [1] That is, unless you already defined spectralAxis because DaCHS' heuristics were wrong before version 2.8. But then your service won't break, either.
  • DaCHS is now at Version 2.7

    Logo-ish 2.7 with a multi-array plot

    Last Friday, I released Version 2.7 of GAVO's Virtual Observatory server package DaCHS. As is customary, I will give a brief overview of the more noteworthy changes in this blog post. This is probably only of interest to people running DaCHS-based data centres. What I discuss here is both a bit more verbose and a bit less extensive than what you find in the Changes file (when installed from package, you would read it by running zless /usr/share/doc/python3-gavo/changelog.gz).

    In my view, the highlight of this release is simple, numpy-like vector operations in ADQL. Regular readers of this blog will already have seen an example of their use. This is altogether a prototype, which is why what specification exists lives only on the IVOA wiki. It is thus likely that some details of the vector math will change until they make it into any sort of standard (I am hoping for ADQL 2.2). This should not keep you from trying it out and telling your users about it.

    In that same vein, the FITS binary table grammar now copes with vectors, which makes it easier to populate tables that make these operations useful, and for the sort of large tables where the array magic has particularly much promise, it is now a lot simpler to feed array-valued columns with C boosters.

    Other ADQL work includes the addition of proper, standards-compliant epoch propagation (i.e., “application of proper motion and radial velocity”) in the form of the ivo_epoch_prop and ivo_epoch_prop_pos user defined functions. Regrettably, this will not immediately work for you, as it builds on a feature in pgsphere that upstream has not merged yet; comments on that PR will certainly help make that happen. Of course, if you want, you can just build the pgsphere branch containing the new feature yourself. To make up for this complication, DaCHS will no longer advertise UDFs that will not work given the database extensions present – which will help me be a bit more liberal in letting in UDFs wrapping functionality not in Postgres' default distribution in the future.

    If you run datalink services and have multiple items with the same semantics, you may be interested in using local_semantics in Datalink. The use case here is that clients like TOPCAT will remain on, say, light curves in a red filter when the user jumps between records rather than randomly switching between red and blue ones when both have #coderived semantics (Mark's proposal). If you have data of this kind: you can now pass a localSemantics parameter to the makeLink and makeLinkFromFile methods of datalink descriptors; what string you use is up to you, as long as it's the same between similar rows for different datasets.
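
    What a client can do with this is perhaps easiest to see in a little sketch. This is not TOPCAT's code; pick_link is a made-up function, and the rows are plain dictionaries keyed by the Datalink column names semantics, local_semantics, and access_url.

```python
def pick_link(links, semantics, preferred_local=None):
    """Pick a datalink row with the given semantics.

    If preferred_local is given and a matching row carries it in
    local_semantics, that row is returned; otherwise the first row with
    the right semantics wins. Returns None if nothing matches at all.
    """
    candidates = [row for row in links if row["semantics"] == semantics]
    if preferred_local is not None:
        for row in candidates:
            if row.get("local_semantics") == preferred_local:
                return row
    # Without a usable preference, fall back to document order.
    return candidates[0] if candidates else None
```

    A client would remember the local_semantics of the link the user last looked at and pass it as preferred_local when the user moves to the next dataset. That only works if the operator uses the same string for corresponding links in all datasets.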

    I tend to forget that surprisingly many people actually do something with the ADQL form you get on DaCHS' web interface rather than use a TAP client. Well, a DaCHS operator complained about really sub-standard table headings in the HTML tables coming out of this service. Looking again, I had to admit he was right. So, TAP columns now have more meaningful table headings; in particular, if you write expressions, up to a certain length these expressions will be used as table headings. At least in this respect the ADQL form now has an advantage over using a proper client.

    In case you have a processor doing astrometric calibration with astrometry.net (you probably don't because it would have been very hard to make that work without a lot of hacks so far) – have another look at the documentation because I have had various reasons to change api.AnetHeaderProcessor's API in quite a number of ways. It's now a lot easier to use with astrometry.net and source-extractor as distributed by Debian, but I'd still not have broken the API so badly if I had suspected anyone but me had significant code against this.

    I should also warn you that DaCHS now uses astropy to format sexagesimal times and coordinates. This is probably welcome news to those who ever encountered one of DaCHS' 05:59:60 outputs (which happened due to the way it did its rounding). Still, if you have regression tests testing for strings like that, you will need to update them.
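
    For those wondering how something like 05:59:60 comes about: it happens when the value is split into fields first and the seconds are rounded afterwards. The sketch below is neither DaCHS' old code nor what astropy does internally; both functions are made up to show the effect and one way around it.

```python
def naive_hms(hours: float) -> str:
    """Split into fields first, round the seconds last (the buggy order)."""
    h = int(hours)
    rest = (hours - h) * 60
    m = int(rest)
    s = (rest - m) * 60
    # Rounding happens only here, so 59.6 seconds turn into "60".
    return f"{h:02d}:{m:02d}:{s:02.0f}"


def rounded_hms(hours: float) -> str:
    """Round to whole seconds first, then split into fields."""
    total = round(hours * 3600)
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"
```

    With an input of 5.9999 hours, naive_hms yields 05:59:60, whereas rounded_hms carries the overflow into the minutes and hours and yields 06:00:00.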

    From the many minor fixes I should probably mention that DaCHS is now ready for Postgres 15 (which will probably be the Postgres version in the next Debian stable). This used to be broken on new installations because Postgres 15 no longer lets normal users write to the public schema. DaCHS needs a database role that can do this, though, because it defines public functions. Since version 2.7, it does the necessary setup to make this possible. If you make your public schema non-world-writable manually – Postgres upgrades will not do that for you, and I would say there is no strong reason to do so for databases backing DaCHS –, do not forget to GRANT ALL ON SCHEMA public TO gavoadmin.

    With this – don't wait, upgrade. If you have GAVO's repository enabled, apt update && apt upgrade would probably do the trick, though of course I recommend having a look at our upgrading guide for robustness and good housekeeping.

  • What's new in DaCHS 2.6

    Rainbowy image with a DaCHS logo

    The transitions of four-times ionised Technetium, with the energies of the lower and upper states on the two axes and the colour a measure of the frequency of the emitted light. Well: DaCHS 2.6 has preliminary support for LineTAP.

    After six months of development, I have just released DaCHS 2.6. This blog post is the traditional discussion of major news for operators of DaCHS-based services. Also have a look at the changelog, which has finally made it to the Debian package; if you installed from package, you can now read it using zless /usr/share/doc/python3-gavo/changelog.gz.

    This post's title picture alludes to LineTAP, an upcoming standard for disseminating data on spectral lines intended to obviate SLAP and play nicely with VAMDC. So far, the standard only exists as a rather preliminary draft, but there should be a working draft soon-ish. If you have line data to publish or can get your hands on some, consider trying //linetap#table-0 (the “-0” suggests that there will be changes, but I'd hope not terribly many).

    Quite a few changes resulted from a seemingly minor user request: “How do I put a form interface in front of my EPN-TAP table?” I rather foolishly chose to use the obscore table as an example, which was about the worst choice I could have made, as ivoa.obscore is a view in DaCHS (which means, for instance, that you can't simply add indexes), and a rather large one in Heidelberg at that (more than 80 Megarows, which means that without indexes, interactive services are impossible).

    The first change in that direction was supporting form conditions over pairs of columns; you need that whenever your table has intervals in column pairs, as for instance em_min/em_max in obscore. With the new code, when users write something like 8000 .. 10000, you can instruct DaCHS to translate that into SQL computing whether or not the intervals overlap.
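
    The overlap test itself is simple: two closed intervals overlap if each one starts before the other one ends. Here is a minimal sketch of that logic and of an SQL fragment expressing it. The function names and the %(lo)s/%(hi)s placeholders are my assumptions for this illustration; this is not necessarily the SQL DaCHS generates.

```python
def intervals_overlap(lo1, hi1, lo2, hi2):
    """Return True if the closed intervals [lo1, hi1] and [lo2, hi2] overlap."""
    return lo1 <= hi2 and lo2 <= hi1


def overlap_condition(min_col, max_col):
    """Return an SQL fragment selecting rows whose [min_col, max_col]
    interval overlaps a query interval passed in as lo and hi."""
    return f"{min_col} <= %(hi)s AND {max_col} >= %(lo)s"
```

    For a query like 8000 .. 10000, a row covering 9500 to 12000 matches, while one covering 10001 to 12000 does not (all in the same unit, of course).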

    The spectral queries from that form still timed out, even after I had made sure there were indexes on the larger contributing tables' spectral columns. The reason for that was that the obscore mixin cast the spectral coordinates to double precision[1], and even if there is an index on a real-valued my_col, a condition like:

    my_col::double precision < 4
    

    will not use the index (unless it were over the cast expression, of course). I have hence shortened a few obscore columns (specifically, s_fov, s_resolution, em_min, em_max, em_res_power, and s_pixel_scale) to real; that's what they are in SSAP, and for now I cannot see a case where these would need to be double precision in a discovery protocol.

    Having this service reminded me that registering obscore as an independent resource (rather than just as a table in a tap service's tableset) was something I've been wanting to tackle for quite a time now. This needs proper metadata, in particular coverage metadata. Determining the coverage of obscore is now possible (run dachs limits //obscore), and using codeItems (more or less explicitly), you can inject that metadata where you need it.

    The cover story (“use case,” if you will) underlying this form-based service on top of obscore that started all that was that it was supposed to be friendly to optical astronomers, who by and large are still stuck with Ångström (that is, 10⁻¹⁰ m), and hence I wanted to write the spectral information in Ångström, too. In this case, the old displayUnit display hint would have done (because Obscore uses wavelengths, too), but by the time I noticed that, I had already written a spectralUnit display hint. With that, you can write something like:

    <column name="e_min"
      unit="J"
      description="Lower energy in the spectrum"
      displayHint="spectralUnit=Angstrom"/>
    

    This would convert e_min to Ångström when written to an HTML table (but not otherwise, following the assumption that non-HTML data will be consumed by machines that have no use for legacy units).
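
    The physics behind such a conversion is just λ = hc/E. A minimal sketch of the arithmetic follows; the function name is made up, and this is not DaCHS' actual conversion code.

```python
PLANCK_H = 6.62607015e-34  # Planck constant in J s (exact in the SI)
LIGHT_C = 299792458.0      # speed of light in m/s (exact in the SI)


def joule_to_angstrom(energy_j: float) -> float:
    """Return the vacuum wavelength in Ångström of a photon of energy_j Joule."""
    if energy_j <= 0:
        raise ValueError("Photon energies must be positive")
    wavelength_m = PLANCK_H * LIGHT_C / energy_j
    return wavelength_m * 1e10  # 1 m = 1e10 Ångström
```

    Note that the relation is inverse: the lower energy limit of a spectrum corresponds to its upper wavelength limit.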

    Talking about HTML: If your root template is derived from root-tree.html (it is not unless you made it so), you have to apply a minor update to it; locate the tmpl_resDetails “script” (it's actually some HTML) in /var/gavo/web/templates/root.html. In there, there's a $description, which for the javascript templater that interprets this thing means “insert the content of the description field, properly escaping it”. Since 2.6, however, DaCHS produces these descriptions in HTML. That's progress, since these descriptions often contain links or other formatting. But it means that you have to tell the templater to not escape things: Just write $!description instead.

    There are a few new things you can do in RDs. First, there are relocatable RDs: It is now recommended to have resdir="." in the opening resource (and dachs start's templates are nudging you to do that). Without that, the resource directory defaults to inputsDir/<schema>, which breaks as soon as you need to rename that directory. Now: renaming resource directories is never easy in DaCHS (for instance, because they are reflected in URLs). But for instance with mirrors, or when forking a resource, such renames happen, and relocatable RDs make that a lot simpler. You can obtain the current value of the resource directory from the new \resdir macro.

    Then, by popular request, you can now have index options. If you look at the documentation for create index in the postgres docs, you will notice that there are quite a few things you can do to an index. Acquainting DaCHS' index element with all of these seemed wrong to me, in particular because most of these things are only interesting in rather special circumstances beyond DaCHS' control. Instead, you can now add option elements to an index to change its behaviour, each of which can reflect some postgres configuration item. DaCHS will order your fragments so the resulting command fits Postgres' grammar.

    Since this is somewhat low-level, I recommend isolating the details in userconfig. For instance, you could add streams there saying:

    <STREAM id="staticindex">
      <doc>For indexes on tables that never change, save about 10% storage
      by feeding this.</doc>
      <option>WITH (fillfactor=100)</option>
    </STREAM>
    
    <STREAM id="onfastdisk">
      <doc>FEED this into an index to let it live on a fast disk</doc>
      <option>TABLESPACE fast</option>
    </STREAM>
    

    (the second stream assumes you have set up such a tablespace). You could then configure your indexes like this:

    <index columns="foo">
      <FEED source="%#staticindex"/>
      <FEED source="%#onfastdisk"/>
    </index>
    

    A feature I have put in mainly because of, say, due diligence is that you can now store the administrator password as a hash in /etc/gavo.rc. This has the advantage that people that get to read your configuration cannot (reasonably) become administrators on DaCHS' web interface; I'd consider the hash strong enough that you could put that into version control. Of course, that administrator can't do all that much in the first place.

    The drawback of hashing the admin password is that then DaCHS itself cannot use the password to authenticate against a running server. That is not a disaster, but it will keep it from automatically discarding the root page on changes and automatically clearing a few caches when you import a resource.

    As usual, there are many other changes; let me mention

    • the modern VOTables from SCS I have celebrated here before,
    • the makeIAUId(prefix, long, lat) rowmaker function that makes creating IAU-compliant identifiers a bit simpler,
    • a function utils.formatFloat that may be helpful when producing human-readable floating-point numbers (it's not in gavo.api yet, but I think it will migrate there),
    • the statistics property on columns that you can set to enumerate on TEXT-typed columns to make DaCHS collect preliminary statistics on those (more on that in a later post),
    • the -d option to dachs limits to dump the column statistics DaCHS has gathered (see the DaCHS 2.4 announcement for more on these stats), and
    • that the maximum order of a MOC is now given in ASCII-MOCs DaCHS produces.

    With this: If you have GAVO's repository enabled, you will get DaCHS 2.6 with the next apt upgrade. I will also try to get it into the Debian backports, and if I manage that, you will read about it on this blog.

    [1]

    In case you wonder why it did that: The obscore mixin basically fills out templates like:

    CAST(\em_min AS real) AS em_min,
    CAST(\em_max AS real) AS em_max,
    

    where the macro replacements are taken from whatever you give in the mixin's parameters. Now, if \em_min happens to work out to NULL, Postgres just picks any old type (text, IIRC) for the corresponding column. That is not a problem until the result of that table definition is UNION-ed together with another table where \em_min is a proper floating point number: Postgres will then complain about incompatible types in a union. To avoid that, I must give a type to anything contributing to the obscore view.

  • Small Change, Big Win

    Screenshot with the Erratum content (2 lines) highlighted

    That's SCS 1.03 Erratum 2 rendered in my browser with a bit of image processing to celebrate that there's one painful VO legacy less in this world.

    PSA: what follows is VO lore that may be entertaining but will not help you use or publish astronomical data.

    Today, I've made a very small commit to my VO publication package DaCHS (revision 8452):

    --- gavo/web/vodal.py (revision 8451)
    +++ gavo/web/vodal.py (working copy)
    @@ -260,7 +260,6 @@
            version = "1.0"
            parameterStyle = "dali"
            standardId = "ivo://ivoa.net/std/ConeSearch"
    -     defaultOutputFormat = "votable1.1"
    

    One deleted line, small cause, huge effect.

    This story starts with the oldest “operational” VO standard, Simple Cone Search, which was formally published in 2008 but really got its current shape a lot earlier.

    I wasn't there back then, but I think the authors expected that clients would be parsing the VOTables that the services were returning using something called XML binding. That, well, was a technique where code was generated from an XML schema, and only instance documents conforming to that exact schema could be parsed with that code.

    That is of course the opposite of the golden rule of interoperability (“be strict in what you produce and lenient in what you accept”) and thus would have been a terrible implementation choice for interoperable clients (and I believe nobody ever tried it). But somehow – or that is my explanation – the XML binding reasoning translated into the requirement that SCS services could only return VOTable 1.0 or VOTable 1.1, and that made it into the standard. It was hence the law. And that is why DaCHS had to keep alive VOTable 1.1 for writing (which the above commit of course doesn't remove, but I can remove it now any time I feel like it). And why it couldn't do a lot of useful things that required features not present in VOTable 1.1.

    Nobody dared to touch the problem for about a decade, as it was actually unclear whether some ancient code might still be doing useful work with SCS and XML binding. And I shouldn't be scolding them after I have recently broken ESO examples under the assumption that “aw, nobody's gonna do this”. Then, starting about five years ago, we had a couple of discussions at various conferences about how we might bring SCS into the present VO (where it, it has to be said, sticks out a bit for several other reasons, too, like its funky error reporting and the funny UCDs it uses). But these weren't easy: What exactly are we allowed to break within a minor version under the above assumption (“aw, nobody…”)? If we do a major version, how do we plan for co-existence of two parallel major versions?

    Well: For the version restriction, in the end a simple Erratum was enough. On January 26, 2022, the IVOA Technical Coordination Group accepted SCS 1.03 Erratum 2. And now I can return whatever VOTable version suits me. Phewy.

    I can now have GROUPs in GROUPs (which I need to annotate photometry), I can finally return tables with my old proposal for STC in VOTable in SCS results (where they would have mattered most – not that anyone cares any more, as that ship has sailed somewhere completely different).

    Hey, I can have xtypes. Doesn't mean anything to you? Well, try this: In TOPCAT, open VO/Cone Search. Type “Constellations” and select the “cslt cone” service. Run a query for some part of the sky, with a size of a few 10s of degrees. Open a sky plot, and in there, do Layers → Add Area Control, and in that control select the table you have just pulled in. Presto: You'll see the constellation boundaries without further configuration, and that's because TOPCAT has the xtype to figure out that the odd numbers it sees are really the vertex coordinates of a spherical polygon in DALI serialisation.
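
    In case you wonder what “DALI serialisation” amounts to here: a polygon is a flat array of numbers that are to be read as longitude/latitude pairs, one pair per vertex. Here is a minimal sketch of the decoding a client does once the xtype tells it so. The function name is made up, and this is of course not TOPCAT's code.

```python
def polygon_vertices(values):
    """Turn a flat DALI-style polygon array into (lon, lat) vertex pairs.

    values is a sequence lon1, lat1, lon2, lat2, ... in degrees.
    """
    if len(values) % 2:
        raise ValueError("Polygon arrays need an even number of values")
    if len(values) < 6:
        raise ValueError("A polygon needs at least three vertices")
    # Even positions are longitudes, odd positions are latitudes.
    return list(zip(values[0::2], values[1::2]))
```

    Without the xtype, a client only sees an array of floats and has no way of knowing that pairing them up like this makes sense.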

    Not a big deal, you say? Perhaps. But lots of small deals accumulated make the difference between what you can do and what you cannot, in particular across services (which is what the VO is about).

    Removing the erroneous constraint on VOTable versions in SCS opened the standard up for quite a few small deals. Thanks, TCG!

  • DaCHS 2.5: Check your UCDs

    DaCHS logo on top of a map of UCDs

    In the background of the DaCHS 2.5 release picture: UCDs grabbed from the Registry. The factual background: DaCHS 2.5 will now moan at you when you invent or mistype UCDs

    This afternoon, I have released DaCHS 2.5. As usual, I will discuss the more important changes in a blog post – this one.

    A change many of you will not like too much is that DaCHS now validates UCDs you give it, and it will warn you when you do not follow the UCD rules. This may seem like nit-picking, but as blind discovery is on the verge of becoming usable in the VO, making sure these strings actually are what they should be is becoming operationally important: If I want to find resources that give errors for their photometry, I have to know whether it's stat.error;phot.mag.b or phot.mag.b;stat.error, or else I will miss half the resources out there.
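
    To see why the order matters for discovery, consider a client that compares UCD strings literally, as a simple query against the Registry would. The sketch below contrasts that with a comparison of the individual words; both helpers are made up for this post, and neither checks whether a UCD is actually valid.

```python
def ucd_atoms(ucd: str):
    """Split a UCD into its lower-cased words."""
    return [atom.strip().lower() for atom in ucd.split(";") if atom.strip()]


def has_atoms(ucd: str, *wanted: str) -> bool:
    """Return True if all wanted words occur in ucd, in whatever order."""
    atoms = set(ucd_atoms(ucd))
    return all(word.lower() in atoms for word in wanted)
```

    A literal comparison of stat.error;phot.mag.b and phot.mag.b;stat.error fails, while has_atoms finds both. Since one cannot expect every client and every query to be this forgiving, it is the data providers who have to get the strings right.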

    So, I'm sorry if DaCHS starts complaining about half of your RDs after you update, but it's for a good cause. And don't feel bad about the complaints: DaCHS complained about close to half of my RDs after I had put in that feature.

    By the way, this comes as part of a larger effort on the side of the Operations IG to improve the validity of UCDs and units in the VO, an effort that has unearthed bugs in the SSAP and SLAP specifications in that they require UCDs forbidden by the UCD standard. DaCHS 2.5 still follows SSAP and SLAP, and hence external tools like stilts will protest because of bad UCDs even if DaCHS is happy. Errata for the specifications are being worked on, and once they are accepted, DaCHS and stilts will finally agree on UCD validity, or so I hope.

    Code-wise, a much more intrusive change was that asynchronous services (in particular, async TAP) now use the same formalism for parsing parameters as their synchronous counterparts. It may seem odd that that hasn't been the case up to now, but there were good reasons for that; for instance, with async, people can post incomplete parameter sets that would be rejected by normal sync processing.

    Unless you are running User UWS services, you should not notice anything. If you do run User UWS services, please contact me before upgrading. I would like to work with you on what these should look like in the future.

    Another change that might break your services is that DaCHS now actually complies with VOUnits, which has always forbidden whitespace of all kinds in unit strings. DaCHS, on the other hand, has foolishly encouraged putting whitespace between scale factors and pure units, as in 1e-10 m. That's not interoperable, and hence DaCHS now rejects such units. This may lead to hidden failures when dachs val doesn't notice something is a unit, and things only break during execution. I'm aware of one place where that's relevant: spectral cutout services that need to know the spectral unit. If you're running those, make double sure that the spectralUnit in the SSAP mixin does not contain any whitespace. It's 0.1nm according to VOUnits, not 0.1 nm.
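
    If you want to go through unit strings you keep outside of RDs (say, in scripts or configuration files), a check for the whitespace rule is quickly written. This sketch only tests for whitespace; it is not a VOUnits parser, and the function name is made up.

```python
def check_no_whitespace(unit: str) -> str:
    """Return unit unchanged, or raise a ValueError if it contains whitespace."""
    if any(ch.isspace() for ch in unit):
        raise ValueError(
            f"Unit string {unit!r} contains whitespace,"
            " which VOUnits does not allow")
    return unit
```

    So, check_no_whitespace("0.1nm") passes, whereas check_no_whitespace("0.1 nm") raises an error.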

    An update that should silently make your services more compliant is that DaCHS' representation of EPN-TAP is updated to what is currently under IVOA review. After you upgrade, DaCHS will try to update your EPN tables' metadata, which in turn should make stilts taplint a lot happier. It will also make DaCHS pass on the new, IVOA table utype to the Registry, which is how people should in the future find EPN-TAP data.

    DaCHS now also contains some code that may help you import data from HDF5 files. For one, there is the HDF5 grammar, which rather directly pulls data from HDF5s written by astropy or vaex. But, really: HDF5 is a rather low-level format not particularly well suited for relational data, and it is virtually impossible to write generic code for doing something sensible with it. The two flavours DaCHS supports have very little in common, and it is therefore almost certain that if you have HDF5s coming from somewhere else, hdf5Grammar will not understand them. Still, let us know what you've got, we may be able to put support for it in.

    Hdf5grammar is written in Python, and thus imports perhaps a few thousand rows per second. For Gigarow-sized data collections, that's nowhere near fast enough, and hence for vaex-written HDF5s, there is booster support. As before, if you have bulk data in HDF5 that you want to put into a database and that was not written by vaex, let us know and we'll see what we can do.

    A surprisingly minor change enabled DaCHS to deal with materialised views, database views that are turned into actual tables by postgres. See the corresponding section in the tutorial for how you can use them. We do not have any materialised views in our Heidelberg data center yet. So, if you use them and notice something is clunky, your feedback is particularly appreciated.

    There are many smaller changes and improvements; let me mention what the changelog euphemistically calls “better systemd integration”, which really means that so far systemctl restart dachs simply didn't do anything at all. Apologies. And shame on everyone who was bewildered but failed to report this to dachs-support.

    Also, you can use float arrays in boosters now, and DaCHS' ADQL has just learned about COALESCE. That's a SQL feature that lets you deal sensibly with NULLs in some cases: COALESCE(arg1, arg2, ...) will return the first non-NULL argument it encounters. That may sound like a slightly exotic function. Until you need it, at which point you wonder how ADQL could reach its ripe age without COALESCE.
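
    If you think in Python rather than in SQL, the behaviour of COALESCE corresponds to the following little function, with None playing the role of NULL. This is just an illustration of the semantics, not code from DaCHS.

```python
def coalesce(*args):
    """Return the first argument that is not None, or None if there is none."""
    for arg in args:
        if arg is not None:
            return arg
    return None
```

    Note that only None counts as missing here: coalesce(None, 0) is 0, just as COALESCE(NULL, 0) is 0 in SQL.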

    Finally, let me mention something that is not part of the release, though it is DaCHS-related and is new since the last release: I have cleaned up the access log processing machinery we have used in Heidelberg in the past 15 years or so, and I have packaged it up for general consumption. It is, of course, a DaCHS RD that you can just check out and use in your own DaCHS installation if you have to keep access logs and want to do that with at least some basic respect for your users' rights. See http://docs.g-vo.org/DaCHS/tutorial.html#access-logs for details.
