Time Series

The IVOA’s committee on science priorities (CSP) declared the “time domain” one of its focus topics quite a while ago, an action that boils down to a call to the IVOA member projects to think about support for time series and their analysis in services, standards, and clients.

While the response was lackluster for several years, work on time series has gathered quite a bit of steam recently. For instance, the spectral client SPLAT (co-maintained by GAVO) has grown some preliminary support to properly display time series (very rudimentary in what’s currently released), and lively discussions on proper metadata for time series have been going on on the Data Models mailing list of the IVOA – if you’re interested in the time domain, this would be a good time to subscribe for a while and comment as appropriate.

Meanwhile, in our Heidelberg data center, we’ve joined the fray by publishing our first time series service (science background: searching for exoplanets in the Milky Way bulge using gravitational lensing), which is available through SSA (look for k2c9vst) and through ObsCore (at http://dc.g-vo.org/tap, collection name k2c9vst), too. For details see also the service info.
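If you have a TAP client or a script at hand, the ObsCore route is a quick way to locate the data. Here is a minimal sketch using Python and the third-party requests package against the TAP sync endpoint; the column selection is just an example:

# Locate the k2c9vst time series through ObsCore over TAP.
import requests

TAP_SYNC = "http://dc.g-vo.org/tap/sync"

query = """
SELECT TOP 5 obs_publisher_did, access_url
FROM ivoa.obscore
WHERE obs_collection='k2c9vst'
"""

response = requests.get(TAP_SYNC, params={
    "REQUEST": "doQuery",
    "LANG": "ADQL",
    "QUERY": query})
response.raise_for_status()
print(response.text)  # a VOTable listing the matching datasets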

Since future standards are being worked out right now, this is a perfect time to publish your time series; this way, you get to influence what people will be able to tell machines about their time series in the next couple of years. Ask our staff (contact below) if you want us to publish for you. But you can also self-publish using the DaCHS publication package. Refer to the resource descriptor of the k2c9vst service to get started.

At its heart is the table definition of the time series, which is basically


<table id="instance">
  <column name="hjd" type="double precision"
      unit="d" ucd="time.epoch"
      tablehead="Time"
      description="Time this photometry corresponds to."
      verbLevel="1"/>
  <column name="df" type="double precision"
      unit="adu" ucd="phot.flux"
      tablehead="Diff. Flux"
      description="Difference as defined by 2008MNRAS.386L..77B"
      verbLevel="1"/>
  <column name="e_df"
      unit="adu" ucd="stat.error;phot.flux"
      tablehead="Err. DF"
      description="Error in difference flux."
      verbLevel="15"/>
</table>

– in the actual service, there are a few more columns, but time, value, and error actually make up a full time series.

Except that a machine can’t really tell what this is yet (well, perhaps it could using UCDs, but that’s a different matter). What it needs to work out is what the independent axis is, what the frames are, etc. And to do that, the machine needs annotation, i.e., machine-readable, structured declarations alongside the data and the “classic” metadata like units and descriptions.

In actual VOTables, this will happen through VO-DML annotation, which is also still being seriously discussed; you can inspect whatever we currently spit out in the XML source of this example document.

DaCHS, however, isolates you from the concrete details of writing VOTables. Instead, you write annotations in a JSON-inspired little language we’ve christened SIL (“Simple Instance Language”; reference). The complicated part is knowing what types and attributes you have to declare, which is exactly what the data models are about. As said initially, the details are still in flux here, but this is what things look like right now:


<dm>
  (ivoa:Measurement) {
    value: @df
    statError: @e_df
  }
</dm>

<dm>
  (stc2:Coords) {
    time: (stc2:Coord) {
      frame:
        (stc2:TimeFrame) {
          timescale: UTC
          refPosition: BARYCENTER 
          kind: JD }
      loc: @hjd
    }
    space: 
      (stc2:Coord) {
        frame:
          (stc2:SpaceFrame) {
            orientation: ICRS
            epoch: "J2000.0"
          }
        loc: [@raj2000 @dej2000]
    }
  }
</dm>

<dm>
  (ndcube:Cube) {
    independent_axes: [@hjd]
    dependent_axes: [@df @mag]
  }
</dm>

If you consider this for a moment, you’ll see that each dm element corresponds to something like an object template of a certain “type”. The first, for instance, defines a measurement with a value and a statistical error. Both happen to be given as references to columns in the table defined above (as indicated by the @ signs).

The last annotation defines a data cube; a time series in this definition is simply a data cube with just a single non-degenerate independently varying axis (the independent_axes attribute; in the value, the square brackets indicate a sequence) that happens to be time-like. That hjd is time-like is something VO-DML enabled clients will work out when interpreting the STC (“Space-Time-Coordinates”) annotation. In there, you will see that hjd is referenced from the time attribute and comes with a time-like frame that also defines that this particular flavor of HJD is what a hypothetical clock at the solar system’s barycenter would measure if it stood in the gravitational potential in Greenwich and had leap seconds thrown in now and then. And that long story is communicated through “literals”, constant strings like “BARYCENTER” or “UTC”, which are also legal within DaCHS data model annotations.

This may seem a bit complicated at first. I argue, though, that given what time series clients will have to do anyway, going through the cube and STC annotations is actually about the most straightforward thing you can do.

But perhaps I’m wrong, so again: None of this is cast in stone right now. Comments are even more welcome than usual, either below or at gavo@ari.uni-heidelberg.de.

Asterics Tech Forum

The third Asterics DADI Tech Forum took place last week in Strasbourg – and many GAVO members made contributions as well.
This time, there were three slots for hackathon sessions, which were also used for discussions. We’ll mention two highlights of our contributions here.

We took the opportunity to push our Provenance Data Model efforts and used the hackathon slots for provenance discussions.
One topic was the links between the simulation data model and ProvenanceDM, and how to map from SimDM to ProvenanceDM classes. This mapping works quite well and will be included in the working draft for the data model. We also had an interesting talk by José Enrique Ruiz on his view on provenance, workflows, and – very important – the “deployer” and “system” provenance for storing all the environment variables that may be needed to rerun the processing of some observational data. Michèle Sanguillon also presented, for the first time, her extension of the (W3C) prov Python library to support our IVOA Provenance Data Model. And we had people from outside the usual provenance crowd joining in, e.g. from the Astron project. More about our provenance modelling efforts can be found at the IVOA Provenance wiki page.

A world premiere (of sorts) was the first discussion of RegTAP 1.1. RegTAP is a search interface to the VO Registry; it is what TOPCAT and other VO clients use when you type in keywords to locate services. A fairly direct web-based interface is our WIRR registry interface. RegTAP will need a bit of a makeover, since VOResource, the underlying metadata scheme, is currently receiving one, allowing, in particular, for including DOIs and ORCIDs in Registry records (John Does of this world, rejoice: people can finally uniquely find your data and not that of all the other J. Does) and for figuring out licenses on data. Licensing may not matter when you use data in a paper, but it does matter if you want to redistribute data, e.g. for planetarium programs with catalog data or pretty pictures, or when re-mixing data.
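To give an idea of what such a search looks like under the hood: RegTAP queries are plain ADQL against the rr schema. Here is a minimal sketch of a keyword search using Python and the requests package; the endpoint and the keyword are just examples, and any RegTAP-capable TAP service will do:

# Keyword search in the relational registry (RegTAP).
import requests

TAP_SYNC = "http://dc.g-vo.org/tap/sync"

# ivo_hasword is a user defined function from the RegTAP standard.
query = """
SELECT TOP 10 ivoid, res_title
FROM rr.resource
WHERE 1=ivo_hasword(res_description, 'exoplanet')
"""

response = requests.get(TAP_SYNC, params={
    "REQUEST": "doQuery",
    "LANG": "ADQL",
    "QUERY": query})
response.raise_for_status()
print(response.text)  # a VOTable of matching registry records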

But of course the GAVOistas happily joined the fray on the many other topics discussed, from a standard format for time series to interoperable authentication, from datalink applications to figuring out whether data coming into a program should be treated as a collection of spectra or rather as an object catalog – the latter in the context of the upcoming version 10 of the VO’s premier image tool Aladin, which we saw demoed (probably another premiere). We can already promise you an exciting update!

DaCHS, SODA, and Datalink

DaCHS, the Data Center Helper Suite, is a comprehensive suite for publishing astronomical data to the Virtual Observatory, supporting most major protocols out there. On Dec 12, GAVO released a new version, 0.9.8. The most notable change is that SODA is now supported as specified in the latest IVOA Proposed Recommendation.

This is fairly big news, as SODA is the VO’s answer to providing cutout services and the like, which obviously is important for datasets in the multi-gigabyte range and for the VO’s wider programme of trying to enable users to download only what they need. But even for spectra, which typically aren’t terribly large, we have been using SODA; for instance, when you just want to see the development of a single line over time, it’s nice to not have to bother with the full spectrum. The spectral client SPLAT has been offering such functionality for a couple of years now – watch out for the scissors icons in discovery results. These indicate SODA support on the respective services.
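On the wire, a SODA request is just an HTTP request against the service’s SODA endpoint, with the dataset identifier in ID and cutout parameters like BAND or CIRCLE next to it. A minimal sketch with Python and requests – the endpoint URL, dataset identifier, and spectral interval here are made-up placeholders; take the real values from your dataset’s datalink document:

# Fetch a spectral cutout through a (hypothetical) SODA service.
import requests

SODA_URL = "http://example.org/soda"           # placeholder endpoint
DATASET_ID = "ivo://example/data?spec-0001"    # placeholder dataset id

response = requests.get(SODA_URL, params={
    "ID": DATASET_ID,
    # BAND gives the spectral interval to cut out, in metres.
    "BAND": "4.8e-7 5.2e-7"})
response.raise_for_status()

with open("cutout.fits", "wb") as f:
    f.write(response.content)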

Another client that will support SODA and its basis, Datalink, is Aladin – we’ve seen a promising demo of that during the last Interop in Trieste. Until the clients are there, DaCHS contains a (largely re-usable) stylesheet that generates simple UIs for Datalink documents and SODA services. Some examples:

Note again that all of these are not actually web pages; they’re machine-readable metadata collections. If you don’t believe it, pull the URLs with curl. To learn more about the combo of Datalink and SODA, check out this ADASS 2015 poster (preferably before even looking at the not terribly readable standards texts).

If you’re running DaCHS yourself and can’t wait to run Datalink and SODA — here’s how to do that.

ProvenanceDM Working Draft released

We’ve released the first version of the working draft for the IVOA Provenance Data Model at the IVOA documents page:

ProvenanceDM Working draft.

Updated versions will be put at the same URL (check the date! The first version is from 21st November 2016).

Want to get your hands on the very latest version?
Check out the volute svn repository! Since it’s not so easy to find what you want there, here’s the path to the Provenance Data Model at volute, and here’s a direct link to the latest development draft [pdf].

We’re happy to receive feedback on the document via the IVOA data modelling mailing list, dm@ivoa.net.

UWS 1.1 approved!

UWS stands for Universal Worker Service; it is an IVOA standard that provides a protocol for running asynchronous jobs on databases and other web services, e.g. from the command line using the Python uws-client.
It lets you create (asynchronous) jobs on a web service (e.g. an SQL query), check their status, retrieve their results, and abort or delete them.
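In terms of plain HTTP, the pattern is simple: you POST your job parameters to the service’s job list, which creates a job and redirects you to its URL; further requests against that URL then control the job. A minimal sketch of the lifecycle against a TAP service’s async endpoint, using Python and the requests package (the endpoint and query are just examples):

# Run an asynchronous TAP query following the UWS pattern.
import time
import requests

JOBS_URL = "http://dc.g-vo.org/tap/async"   # the UWS job list

# Create a job; the service redirects to the new job's URL.
resp = requests.post(JOBS_URL, data={
    "REQUEST": "doQuery",
    "LANG": "ADQL",
    "QUERY": "SELECT TOP 5 * FROM ivoa.obscore"},
    allow_redirects=False)
job_url = resp.headers["Location"]

# Start the job and poll its phase until it has finished.
requests.post(job_url + "/phase", data={"PHASE": "RUN"})
while requests.get(job_url + "/phase").text in (
        "PENDING", "QUEUED", "EXECUTING"):
    time.sleep(1)

# List the results the job has produced, then destroy the job.
print(requests.get(job_url + "/results").text)
requests.delete(job_url)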

The updated version 1.1 was approved at the Interoperability Meeting last week and brings some nice new features:

  • Job list filtering: When retrieving the job list, one can now retrieve only jobs created after a certain date, only the latest n jobs, or only jobs in a certain phase (e.g. EXECUTING or COMPLETED).
  • WAIT: When asking for job details, it is now possible to append a WAIT parameter giving an integer wait-time in seconds. The job details will then only be returned when the wait-time is over or the job’s phase has changed, whichever comes first. A sketch of both features follows below.
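Both features map to simple HTTP parameters. A minimal sketch with Python and requests – the endpoint and the job id are hypothetical placeholders:

# Exercise the UWS 1.1 additions against a (hypothetical) service.
import requests

JOBS_URL = "http://example.org/service/async"   # placeholder job list

# Job list filtering: at most the latest 10 COMPLETED jobs
# created after a given instant.
resp = requests.get(JOBS_URL, params={
    "PHASE": "COMPLETED",
    "AFTER": "2016-12-01T00:00:00",
    "LAST": "10"})
print(resp.text)

# WAIT: block for up to 30 seconds; the job representation is
# returned earlier if the job's phase changes in the meantime.
job_url = JOBS_URL + "/4711"                    # placeholder job id
print(requests.get(job_url, params={"WAIT": "30"}).text)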

For all the details, have a look at the standard itself:
UWS 1.1 Recommendation.

A few examples using the CosmoSim database are given here:
UWS tutorial for CosmoSim (pdf), covering version 1.0, and
UWS 1.1 update at CosmoSim.

And if you want to implement UWS 1.1 for your own service, here is a test tool that may be useful for validating the new features:
uws-validator.