• DaCHS 1.0 released

    Today, I have released DaCHS 1.0 – after long years in the 0.9 range, it was finally time to do so. The jump in the major version number was an opportunity to remove some cruft that had accumulated over the years; this, on the other hand, means that if you're running DaCHS, you should watch the upgrade and see if anything broke later (this might be the perfect time to add regression tests to your RDs).

    The changelog is below, but before that a bold-faced warning:

    Install python-astropy before upgrading

    This is because DaCHS now depends on astropy rather than pyfits and pywcs. The latter is no longer part of Debian stretch, and so we made the jump to astropy (that would have been due during Debian stretch's lifetime anyway) even before 1.0.

    Now, Debian holds back packages with new dependencies, and due to the way DaCHS' modules are distributed, DaCHS will break when some of its packages are held back. The symptom is error messages like "pkg_resources.DistributionNotFound: gavodachs==0.9.8". If you already see those, an apt-get dist-upgrade should get you in business again.

    With this out of the way, here is an annotated log of the major changes:

    • DaCHS' main entry point is now actually called dachs (i.e., call dachs imp q and such in the future). gavo will work as an alias for quite a while to come, though, and it's still used a lot in the documentation (you're welcome to fix this: the docs are maintained on github).
    • Hopefully more useful manpage (of course, also available with man dachs) – have a peek!
    • UWS support is now at version 1.1 (i.e., there's creationDate in jobs, filters in the joblist, and slow polling).
    • Added “declarative” licenses. Please read the Licensing chapter in the tutorial and slap licenses on your data.
    • Now using astropy.wcs instead of pywcs, and astropy.io.fits instead of pyfits. The respective APIs have, unfortunately, changed quite a bit. If you're using them (e.g., in processors), you'll have to change your code; it's unlikely services are impacted at runtime. (see also How do I update my code?).
    • Removed the //epntap#table-2_0 mixin. Use //epntap2#table-2_0 instead (sorry).
    • Removed sdmCore (use Datalink/SODA instead); the SODA procs in //datalink are also gone, use the ones from //soda instead (sorry, SODA development has been difficult on the IVOA level).
    • Removed imp -u flag and the corresponding updateMode parse option. If you used that or the uploadCore, just mark the DDs involved with updating="True" instead.
    • Massive sanitation of input parameter processing. If you've been using inputTable, inputDD, or have been doing creative things with inputKeys, please check the respective services carefully after upgrading. See also DaCHS' Service Interface in the reference documentation. The most user-visible change in this department is if you've been using repeated parameters to fill array-valued inputs. That's no longer allowed; if you actually must have this kind of thing, you'll need a custom core and must fill the arrays by hand.
    • In DaCHS' SQL interface, tuples now are matched to records and lists to arrays (it was the other way round before). If while importing you manually created tuples to fill to array-like columns, you'll have to make lists from these now.
    • rsc.makeData or rsc.TableForDef no longer automatically make connections when used on database tables. You must give them explicit connection arguments now (with base.getTableConn() as conn:).
    • logo_tiny.png and logo_big.png are now ignored by DaCHS, all logos spit out by it are now based on logo_medium.png, including, if not overridden, the favicon (that you will now get if you have not set it before).
    • Removed (probably largely unused) features editCore, SDM2 support, pkg_resource overrides, simpleView, computedCore.
    • Removed the argparse module shipped with DaCHS. This breaks compatibility with Python 2.6 (although you can still run DaCHS with a manually installed argparse.py in 2.6).
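
    For the item on tuples and lists above, the fix in import code usually amounts to turning tuples into lists before the row reaches the database. The following is a minimal sketch under that assumption; the function name and the column names are made up for illustration and are not part of DaCHS.

```python
def tuples_to_lists(row):
    """Return a copy of a row dictionary with tuple values turned into lists.

    Lists end up as database arrays under the new mapping, so array-like
    columns that were previously filled from tuples need this conversion.
    """
    return {
        key: list(value) if isinstance(value, tuple) else value
        for key, value in row.items()
    }


# Hypothetical row as it might come out of a row maker.
old_row = {"name": "star-1", "fluxes": (1.0, 2.5, 3.0)}
new_row = tuples_to_lists(old_row)
```

    Values that are not tuples pass through unchanged, so the helper can be applied to whole rows without knowing which columns are array-valued.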

    Even though that's quite a mouthful, I expect few people will actually experience breaking services. If you do, by all means let us know on the DaCHS-support mailing list.

    As usual, the general upgrading instructions are available in the operator's guide; if you plan on upgrading to stretch soon, also have a look at hints on postgres upgrades. Stretch comes with postgres 9.6 (jessie: 9.4), and you should migrate sooner or later anyway.

    Users not using Debian's package management can, as usual, grab tarballs from http://soft.g-vo.org/dachs.

  • ADQL tricks at MPIA

    Aerial image of Heidelberg and Königstuhl

    The 2017-06-29 ADQL talk (red circle) from 30000 ft

    Today I was up on Heidelberg's signature mountain, Königstuhl, at the Max-Planck-Institute for Astronomy for a little talk on what I'd provisionally call “intermediate ADQL” – discussing some aspects of ADQL and some TAP techniques that may not be immediately obvious but still generally and straightforwardly applicable to everyday problems. Since I suspect the lecture notes for that talk may be of interest to some readers of this blog, I thought I should share them here.

    What this also contains is a very quick piece of pyVO-based python (which needs both this helper and a recent pyVO) for a use case that comes up fairly often: “Give me all proper motions (radio fluxes, distances, radial velocities, whatever) for objects in this region.”

    This uses a discovery case I've been after for quite a while now: Find services by the UCDs of tables within them. And while that's been possible for quite a while on GAVO's Registry UI WIRR, there are still too many services that don't declare their tables to the Registry, and when talking about TAP, the situation is still a bit worse (as has been mentioned in my account of the last interop). So – enjoy the code, but very frankly, you'll still see wires sticking out for several months yet.
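
    To give an idea of what such a UCD-based discovery looks like on the Registry side, here is a rough sketch that only builds a RegTAP query string. It is not the helper from the lecture notes; the function name is invented, and the query assumes the standard RegTAP tables rr.table_column, rr.capability, and rr.interface with lowercased UCDs.

```python
def ucd_discovery_query(ucd_pattern):
    """Return an ADQL query for TAP services having columns with a given UCD.

    ucd_pattern is an SQL LIKE pattern, for instance "pos.pm%".
    """
    # Double single quotes so the pattern stays a valid ADQL string literal.
    safe_pattern = ucd_pattern.lower().replace("'", "''")
    return (
        "SELECT DISTINCT ivoid, access_url\n"
        "FROM rr.table_column\n"
        "NATURAL JOIN rr.capability\n"
        "NATURAL JOIN rr.interface\n"
        "WHERE ucd LIKE '{}'\n"
        "AND standard_id LIKE 'ivo://ivoa.net/std/tap%'\n"
        "AND intf_type = 'vs:paramhttp'"
    ).format(safe_pattern)


proper_motion_query = ucd_discovery_query("pos.pm%")
```

    The resulting string can be sent to any TAP service that carries the rr schema; the access URLs that come back are the TAP services to query for the actual data.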

    And if you run a TAP service yourself, please have a look at how to enable table discovery over on the IVOA wiki so we can finally get those pesky wires out of our users' eyes.

  • GAVO at the Northern Spring Interop

    A cake celebrating IVOA 2002-2017

    15 Years of IVOA: The birthday cake our Shanghai hosts prepared for us.

    Every half year, VO enthusiasts from all over the world gather for an “Interoperability conference”, or Interop for short. The latest such event, the Shanghai Interop 2017, ended Friday a week ago. It has been a “long” one again after the short southern spring Interop in Trieste last year (featured in this blog).

    As usual, it was a week of many discussions and much consensus-building. In this post, I'd like to mention a few of the GAVO-related contributions; links typically go to slides or lecture notes PDFs.

    On the Registry side of things, we're currently (among many other things) bridging the gap between DOIs and the Registry in VOResource 1.1, and we invited registry providers to take up the new features, as well as proposing how to update RegTAP (which is used to actually query the Registry) to cope with the new metadata.

    Also in Registry, our efforts of almost a decade to properly support registering tables and similar data collections bore fruit (Britain's Mark Taylor reported on his experiences taking up our current proposal), and the fairly spectacular new Aladin V10 (presented by the CDS' Pierre Fernique, who showed off what I'm tempted to call a “visual registry interface”) urgently needs what we've developed over the years.

    We furthermore reported on new steps to finally let people search the registry using Space-Time constraints (spoiler: the tech is almost there, registry records need lots of work).

    Spatial searches in the registries are one thing enabled by storing and searching for MOCs in relational databases, as was reported by Markus Nullmeier over in an Applications session. The setting may already tell you that these MOCs (Multi Order Coverages, a healpix-based way of representing fairly arbitrary areas on the sky) have applications far beyond Registry.

    Also in Apps, Ole reported on progress in packaging VO applications for easy and reliable installation, in this case for Debian and derivatives. Finally for Apps, Margarida reported on getting lines and line lists into the spectral analysis package SPLAT: Implementation of SLAP and VAMDC interfaces in SPLAT-VO.

    In the wider area of data access protocols and underlying data models, we contributed to Marco's talk on the long-overdue facelifting for the VO's bedrock, Simple Cone Search (Keeping SCS up-to-date within DAL landscape) – the fact that there's an installed base of 15000 such services may let you guess that we need to tread lightly here. On the bleeding-edge side of things, we presented our current ideas on how, eventually, several data models, data modelling as such and the annotation of data according to these data models might play together in publishing time domain data with DaCHS (previously featured on this blog in a slightly less technical way).

    We also talked about education and outreach. Hendrik reported on our ADQL course and how it helps future astronomers learn to deal efficiently with even very large datasets. Hendrik's long-lasting dedication to these topics did not go unpunished at this interop: since the Exec meeting on the Interop Wednesday he is vice-chairing the education interest group of the IVOA. Back in the session I also mused a bit about what metadata changes are needed to make the VO tutorial collection VOTT more useful.

    It is a particular pleasure for me to mention that the IVOA has a new interest group: “Solar System”. Regular readers of the blog will have noticed that I have a particularly soft spot in my heart for that crowd, and so I gave a short overview of how DaCHS is used among them, too.

    And that's just the official programme. Much more fixing, designing, and discussion went on between sessions or in the evenings. The latter, of course, included some decidedly less technical aspects, among them, as pictured above, a nice birthday cake for the IVOA, as it is now 15 years since the first Interop meeting in January 2002.

  • Time Series

    The IVOA's committee on science priorities (CSP) declared the “time domain” as one of its focus topics quite a while ago, an action boiling down to a call to the IVOA member projects to think about support for time series and their analysis in services, standards, and clients.

    While for several years, response has been lackluster, work on time series has gathered quite a bit of steam recently. For instance, the spectral client SPLAT (co-maintained by GAVO) has grown some preliminary support to properly display time series (very rudimentary in what's currently released), and lively discussions on proper metadata for time series have been going on on the Data Models mailing list of the IVOA – if you're interested in the time domain, this would be a good time to subscribe for a while and comment as appropriate.

    Meanwhile, in our Heidelberg data center, we've joined the fray by publishing our first time series service (science background: searching for exoplanets in the Milky Way bulge using gravitational lensing), which is available through SSA (look for k2c9vst) and through ObsCore (at http://dc.g-vo.org/tap, collection name k2c9vst), too. For details see also the service info.
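
    As an illustration of the ObsCore route, the following sketch builds the URL of a synchronous TAP request against the service mentioned above. It assumes the usual TAP sync endpoint and the standard ObsCore columns obs_id, access_url, and obs_collection; the function name is made up, and nothing here contacts the network.

```python
from urllib.parse import urlencode

TAP_URL = "http://dc.g-vo.org/tap"


def sync_query_url(adql, tap_url=TAP_URL):
    """Return the URL for running an ADQL query on a TAP sync endpoint."""
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql}
    return tap_url + "/sync?" + urlencode(params)


# Pick a few datasets from the time series collection.
query = (
    "SELECT TOP 5 obs_id, access_url FROM ivoa.obscore"
    " WHERE obs_collection = 'k2c9vst'"
)
url = sync_query_url(query)
```

    Retrieving that URL with any HTTP client should yield a VOTable; TAP clients such as TOPCAT or pyVO do the same thing behind the scenes.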

    Since right now future standards are being worked out, this is a perfect time to publish your time series; this way you get to influence what people will be able to tell machines about their time series in the next couple of years. Ask our staff (contact below) if you want us to publish for you. But you can also self-publish using the DaCHS publication package. Refer to the resource descriptor of the k2c9vst service to get started.

    At its heart is the table definition of the time series, which is basically:

    <table id="instance">
      <column name="hjd" type="double precision"
          unit="d" ucd="time.epoch"
          tablehead="Time"
          description="Time this photometry corresponds to."
          verbLevel="1"/>
      <column name="df" type="double precision"
          unit="adu" ucd="phot.flux"
          tablehead="Diff. Flux"
          description="Difference as defined by 2008MNRAS.386L..77B"
          verbLevel="1"/>
      <column name="e_df"
          unit="adu" ucd="stat.error;phot.flux"
          tablehead="Err. DF"
          description="Error in difference flux."
          verbLevel="15"/>
    </table>
    

    – in the actual service, there are a few more columns, but time, value, and error actually make up a full time series.

    Except that a machine can't really tell what this is yet (well, perhaps it could using UCDs, but that's a different matter). What it needs to work out is what's the independent axis, what the frames are, etc. And to do that, the machine needs annotation, i.e., machine-readable, structured declarations alongside the data and the “classic” metadata like units and descriptions.

    In actual VOTables, this will be happening through VO-DML annotation, which is also still seriously being discussed; whatever we currently spit out you can inspect in the XML source of this example document.

    DaCHS, however, isolates you from the concrete details of writing VOTables. Instead, you write annotations in a JSON-inspired little language we've christened SIL (“Simple Instance Language”; reference). The complicated part is to know what types and attributes you have to declare, which is exactly what the data models are about. As said initially, the details are still in flux here, but this is what things look like right now:

    <dm>
      (ivoa:Measurement) {
        value: @df
        statError: @e_df
      }
    </dm>
    
    <dm>
      (stc2:Coords) {
        time: (stc2:Coord) {
          frame:
            (stc2:TimeFrame) {
              timescale: UTC
              refPosition: BARYCENTER
              kind: JD }
          loc: @hjd
        }
        space:
          (stc2:Coord) {
            frame:
              (stc2:SpaceFrame) {
                orientation: ICRS
                epoch: "J2000.0"
              }
            loc: [@raj2000 @dej2000]
        }
      }
    </dm>
    
    <dm>
      (ndcube:Cube) {
        independent_axes: [@hjd]
        dependent_axes: [@df @mag]
      }
    </dm>
    

    If you consider this for a moment, you'll see that each dm element corresponds to something like an object template of a certain “type”. The first, for instance, defines a measurement with a value and a statistical error. Both happen to be given as references to columns in the table defined above (as indicated by the @ signs).
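
    One way to picture the “object template” idea is to think of an annotation as a nested structure in which every @ reference is replaced by the value of the corresponding column in one row. The sketch below does just that with plain dictionaries; it is a conceptual illustration only, not how DaCHS or VO-DML clients actually process annotations, and the sample values are invented.

```python
def instantiate(template, row):
    """Fill a nested annotation template with the values from one table row."""
    if isinstance(template, dict):
        return {key: instantiate(value, row) for key, value in template.items()}
    if isinstance(template, list):
        return [instantiate(item, row) for item in template]
    if isinstance(template, str) and template.startswith("@"):
        # A column reference: look up the column's value in this row.
        return row[template[1:]]
    # Anything else is a literal such as "UTC" and is kept as it is.
    return template


measurement = {"value": "@df", "statError": "@e_df"}
row = {"hjd": 2457500.5, "df": 12.5, "e_df": 0.5}
filled = instantiate(measurement, row)
```

    With the made-up row above, filled is a measurement object carrying the row's flux difference and its error.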

    The last annotation defines a data cube; a time series in this definition is simply a data cube with just a single non-degenerate independently varying axis (the independent_axes attribute; in the value the square brackets indicate a sequence) that happens to be time-like. And that hjd is time-like, VO-DML enabled clients will work out when interpreting the STC (“Space-Time-Coordinates”) annotation. In there, you will see that hjd is referenced from the time attribute, together with a time-like frame that also defines that this particular flavor of HJD is what a hypothetical clock at the solar system's barycenter would measure if it stood in the gravitational potential in Greenwich, and had leap seconds thrown in now and then. And that long story is communicated through “literals”, constant strings like “BARYCENTER” or “UTC”, which are also legal within DaCHS data model annotations.
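
    The reasoning a client might go through can be sketched in a few lines. Here the cube and STC annotations are written as plain Python dictionaries mirroring the SIL above, and a time series is recognised when the single independent axis is the column referenced from the time coordinate. All names in the code are hypothetical; actual clients will work on the VO-DML annotation in the VOTable.

```python
def is_time_series(cube, coords):
    """Tell whether a cube annotation describes a time series.

    This holds when there is exactly one independent axis and that axis
    is the column the time coordinate points to.
    """
    independent = cube.get("independent_axes", [])
    time_coord = coords.get("time", {})
    return len(independent) == 1 and time_coord.get("loc") == independent[0]


cube = {"independent_axes": ["@hjd"], "dependent_axes": ["@df", "@mag"]}
coords = {
    "time": {
        "frame": {"timescale": "UTC", "refPosition": "BARYCENTER", "kind": "JD"},
        "loc": "@hjd",
    }
}
looks_like_time_series = is_time_series(cube, coords)
```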

    This may seem a bit complicated at first. I argue, though, that given what time series clients will have to do anyway, going through the cube and STC annotations is actually about the most straightforward thing you can do.

    But perhaps I'm wrong, so again: None of this is cast in stone right now. Comments are even more welcome than usual, either below or at gavo@ari.uni-heidelberg.de.

  • Asterics Tech Forum

    The third Asterics DADI Tech Forum took place last week in Strasbourg - and many GAVO members made contributions as well. This time, there were three slots for hackathon sessions, which were also used for discussions. We'll mention two highlights of our contributions here.

    We took the opportunity to push our Provenance Data Model efforts and used the hackathon slots for provenance discussions.

    One topic was the links between the simulation data model and ProvenanceDM, and how to map from SimDM to ProvenanceDM classes. This mapping works quite well and will be included in the working draft for the data model. We also had an interesting talk by José Enrique Ruiz on his view on Provenance, workflows, and - very important - the "deployer" and "system" provenance for storing all the environment variables that may be needed to rerun the processing of some observational data. Michèle Sanguillon also presented for the first time her extension to the prov Python library (W3C) with extensions from our IVOA Provenance Data Model. We also had people from outside the usual provenance crowd joining in, e.g. from the Astron project. More about our Provenance modelling efforts can be found on the IVOA Provenance wiki page.

    A world premiere (of sorts) was the first discussion of RegTAP 1.1. RegTAP is a search interface to the VO Registry; it is what TOPCAT or other VO clients use when you type in keywords to locate services. A fairly direct web-based interface is our WIRR registry interface. RegTAP will need a bit of a makeover since VOResource, the underlying metadata scheme, is currently receiving one, allowing, in particular, for including DOIs and ORCIDs (John Does of this world, rejoice: People can finally uniquely find your data and not that of all the other J. Does) in Registry records and figuring out licenses on data. Licensing may not matter when you use data in a paper but it does matter if you want to redistribute data, e.g. for planetarium programs with catalog data or pretty pictures, or when re-mixing data.

    But of course the GAVOistas happily joined the fray on the many other topics discussed, from a standard format for a time series to interoperable authentication, from datalink applications to figuring out if data coming into a program should be treated as a collection of spectra or rather an object catalog – the latter in the context of the upcoming version 10 of the VO's premier image tool Aladin, which we saw (probably another premiere) demoed. We can already promise you an exciting update!
