As usual in May, the people making the Virtual Observatory happen meet for their Interoperability Conference, better known as the Interop – where “meet” still has to be taken with a generous helping of salt (more on this near the end of this post). As has become customary on this blog, let me briefly discuss contributions with a significant involvement of GAVO.
A major thing from my perspective actually happened in the run-up: The IVOA executive committee (“Exec“) approved Version 2.0 of Vocabularies in the VO, a standard saying how hierarchical word lists (“vocabularies“) can be managed, disseminated, and consumed within the VO. Developing the main ideas from sufficiently restricting RDF to coming up with desise (which makes complicated things possible with surprisingly little code), and trying things out on our growing number of vocabularies took up quite a bit of my standards time in the last 20 months or so – and I’m fairly happy with the outcome, which I celebrated with a brief talk on programming with IVOA semantics during Wednesday morning’s semantics session.
In that session I gave a second, more discussion-oriented, talk, probing how to formalise data product types – which is surprisingly involved, even with the relatively straightforward use case “figure out a programme to handle the data“: What’s a spectrum? Well, something that maps a spectral coordinate to… hm. Is it still a spectrum if there’s multiple sorts of values (perhaps flux, magnitude, and polarisation)? If we allow, in effect, tuples, why not whole images, which would make spectral cubes spectra – but of course few client programmes that deal with spectra do anything useful with cubes, so clearly such a definition would kill our use case. And what about slit spectra, mapping a spatial coordinate to spectra?
All this of course is reminiscent of the classical problems of semantics: An elephant is a big animal with a trunk. But when an elephant loses its trunk in an accident: does it stop being an elephant? So, much of the art here is finding the sweet spot of usability between strict and formal semantics (that will never fit the real world) and just tossing around loosely defined strings (that will simply not be machine-readable). After the session, I came up with the 2021-05-26 draft of product-type. If you read this a few years down the road, it might be interesting to compare with what product-type is today. I’m curious myself.
Quite a bit after midnight my time (still Thursday UTC), Mark Taylor talked about Software Identification, something I’ve been working on with him recently. It’s one of the things that is short and trivial but that, when unregulated, just doesn’t work; in this case it’s servers and clients saying what they are when they speak HTTP. I stumbled into the problem while trying to locate severely outdated DaCHS installations – so, in a way I put effort into the Note Mark was talking about (and which I have just uploaded to the IVOA Document Repository) as a sort of penance.
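To make “saying what they are” a bit more concrete: in HTTP, clients identify themselves in the User-Agent header and servers in the Server header, both as a sequence of product/version tokens with optional comments in parentheses. What follows is only a minimal sketch of pulling such tokens apart; the function name and the header values in the usage note are made up for illustration, and it does not attempt to reproduce what the Note itself prescribes.

```python
import re

# A "token" in the HTTP sense: letters, digits and a few punctuation characters.
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# A product is a token, optionally followed by "/" and a version token.
_PRODUCT = re.compile(r"(%s)(?:/(%s))?" % (_TOKEN, _TOKEN))


def parse_products(header_value):
    """Return (name, version) pairs from a User-Agent or Server header value.

    version is None when a product comes without one.  Comments in
    parentheses are dropped; nested comments are not handled in this sketch.
    """
    without_comments = re.sub(r"\([^()]*\)", " ", header_value)
    return [(match.group(1), match.group(2))
            for match in _PRODUCT.finditer(without_comments)]
```

With a hypothetical header value like "DaCHS/2.3 twistedWeb/18.9.0", this yields the pairs ("DaCHS", "2.3") and ("twistedWeb", "18.9.0"), which is about all one needs in order to spot installations that have fallen behind.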
While I was already asleep when Mark gave his talk, I was back at the Interop Friday morning CEST, when Hendrik Heinl talked about the LOFAR TAP service (which, I’m proud to say, runs on top of DaCHS); this was mainly live operations in TOPCAT (which is why there’s no exciting slides), but Hendrik used a pyVO script doing cutouts in an (optical) mosaic of the Fornax cluster built on top of – and that’s the main point – Datalink and SODA. Working this out with Hendrik made me realise the documentation of Datalink in pyVO really needs… love. Or, better, work.
Later on Friday, there was the Registry session, where I gave brief (and somewhat cramped) talks on advanced column metadata (which is intended to one day let you query the registry for things like “roughly complete to 18 mag” or “having objects out to redshift 4“) and how to put VODataService 1.2 coverage into RegTAP – I expect you’ll read more on both topics on this blog as they mature to a level at which this can leave the Registry nerd circles.
And now, about 10 pm on Friday, the meeting is slowly winding down; beyond all the talks (which were, regrettably for a free software spirit like me, on zoom), the real bonus was that there was a gather.town attached to the conference. Now, that’s a closed, proprietary, non-self-hostable platform, too, and so I have every reason to grumble. But: for the first time since February 2020 it felt like a conference, with the most useful action happening outside of the lecture halls, from trying to reach consensus on VEP-006 to teaching DaCHS datalink service declaration to learning about working with visibilities coming from VLBI (where it’s even more difficult than it is with the big antenna arrays). So… this one time I’ve made my peace with proprietary platforms.
A propos of “say no to platforms“ (in this case, slack): Due to the recent troubles with freenode, last week saw, in addition to the Interop, the GAVO IRC channel move to libera.chat (where it’s still #gavo). So, for instant messaging us now that the Interop is (in effect) over: Come there.
It was already in the wee hours of Friday last week (CET) when the second “virtual Interop” had its rather unceremonious closing ceremony. Its predecessor in May had about it an air of a state of emergency. For instance, all sessions were monothematic. That was nice on the one hand, because a relatively large part of the time was available for discussion – which, really, is what the Interops are about. But then Interops are also about noticing what everyone else in the Virtual Observatory is cooking up, for which the short-ish talks we usually have at Interops work really well.
In contrast to that first Corona Interop, this second one, replacing what would have taken place in Granada, Spain, had a much more conventional format, which again accommodated many talks. But of course, this made one feel the lack of possibilities to quickly hash out a problem during a coffee break or in a spontaneous splinter quite a bit more.
Be that as it may, I would like to give you some insights on what I’m currently up to at the IVOA level; I am grateful for any feedback you can give on any of these topics.
Given that I currently chair the Semantics Working group, there was a natural focus on topics around vocabularies, and I gave two talks in that department. The one in DAL (DAL is the working group that builds the actual access protocols such as TAP or SIAP) was mainly on Datalink-related aspects of my Vocabularies in the VO 2 draft (VocInVO2), which in particular was an opportunity to thank everyone involved in the Vocabulary Enhancement Proposals we have been running this last year (all of which were about Datalink and hence closely tied to DAL). One thing I was asking for was reviews on a github pull request that would make the bysemantics method of Datalink accesses semantics-aware; basically, as intended by the original Datalink authors, when asking for #calibration links, this will also return, say, #bias links. If you can spare a moment for this: Please do!
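What “semantics-aware” means here can be sketched in a few lines. The snippet below is an illustration only, not the code from the pull request: the TERMS dictionary is a hand-written, hypothetical excerpt that merely imitates a vocabulary in which each term lists its narrower terms, and the function names and link dictionaries are invented for this example.

```python
# Hypothetical, hand-written excerpt imitating a vocabulary in which each
# term lists its narrower terms; it is not fetched from the IVOA.
TERMS = {
    "calibration": {"narrower": ["bias", "dark", "flat"]},
    "bias": {"narrower": []},
    "dark": {"narrower": []},
    "flat": {"narrower": []},
    "preview": {"narrower": []},
}


def with_narrower(term, terms):
    """Return the set consisting of term and all terms narrower than it.

    The walk is transitive, so it does not matter whether the narrower
    lists in terms are direct children only or already complete.
    """
    found = set()
    todo = [term]
    while todo:
        current = todo.pop()
        if current in found:
            continue
        found.add(current)
        todo.extend(terms.get(current, {}).get("narrower", []))
    return found


def by_semantics(links, term, terms):
    """Return the links whose semantics is term or one of its narrower terms.

    links is a list of dicts with a "semantics" key holding values
    like "#bias"; term is given without the leading hash.
    """
    wanted = {"#" + name for name in with_narrower(term, terms)}
    return [link for link in links if link["semantics"] in wanted]
```

Asking for calibration on a list of links tagged #bias, #preview, and #calibration would then return the #bias and #calibration links and skip the preview.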
Another thing I tried to raise some interest for is the proposed vocabulary of product types; this, I think, should eventually define what people may put into the dataproduct_type column of Obscore results, and there are related uses in Datalink and, believe it or not, the registration of SSAP (spectral) services. A question Alberto raised while I was discussing that made me realise I forgot to mention another vocabularies-related development relevant for DAL: I’ve put the gavo_vocmatch ADQL user-defined function into DaCHS. It lets you match something against a term or its narrower terms, referencing an IVOA vocabulary. For instance, if we had different sorts of time series (which, of course, would be odd for obscore that has the o_ucd column for this kind of thing), you could, using ADQL, still get all time series by querying
SELECT TOP 5 *
Here, the first argument is the vocabulary name (whatever is after the http://www.ivoa.net/rdf in the vocabulary URL), the second the “root” term, and the third the column to match against. Since postgres, for now, isn’t aware of IVOA vocabularies, the second argument must be a literal string rather than, say, an expression involving columns.
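For scripting, such a query could be put together along the following lines. Treat this as a sketch under several assumptions that go beyond what is said above: that gavo_vocmatch returns 1 on a match (so the condition is written as a comparison against 1), that the table queried is ivoa.obscore, and that the product-type vocabulary has a term timeseries. The helper name is made up.

```python
def vocmatch_query(vocabulary, root_term, column,
                   table="ivoa.obscore", limit=5):
    """Return an ADQL query selecting rows where column matches root_term
    or one of its narrower terms in the given IVOA vocabulary.

    Assumption: gavo_vocmatch returns 1 on a match.  The vocabulary name
    and the root term are written as string literals, since the function
    wants a literal for the term; column is inserted as a column reference.
    """
    def literal(value):
        # ADQL string literals escape a single quote by doubling it
        return "'" + value.replace("'", "''") + "'"

    return ("SELECT TOP %d * FROM %s WHERE 1=gavo_vocmatch(%s, %s, %s)"
            % (int(limit), table, literal(vocabulary),
               literal(root_term), column))
```

The resulting string can be pasted into TOPCAT or handed to whatever TAP client you prefer, provided the service runs a DaCHS version that has the function.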
I gave a second semantics-related talk in the Registry session. That had its focus on the Unified Astronomy Thesaurus (UAT), from which people should pick the subject keywords in the VO Registry (actually, they should pick from its representation at http://www.ivoa.net/rdf/uat). I’ll probably blog about that a little more some other time. For now, let me recommend a little UAT-based game on my Semantics Based Registry Browser sembarebro: Choose two terms that are pretty far apart (like, perhaps, ionized-coma-gases and cosmic-background-radiation) and then try to join the two sub-graphs. Warning: This may waste your time. But it will acquaint you with the UAT, which may be a good thing.
In that second talk, I also mentioned a second draft vocabulary I’ve put up in the past six months, http://www.ivoa.net/rdf/messenger. This builds upon the terms for VODataService’s waveband element, which enumerated certain flavours of photons (like Radio, Optical, or X-ray). Now that we explore other messengers as well and have more and more solar system resources in the Registry, I’m arguing we ought to open up things by making “Photon” explicit in there and then adding Neutrinos and, later, other messengers. I’ve received a certain amount of pushback there on mixing the electromagnetic spectrum with particle types; on the other hand, the hierarchical nature of our vocabularies would, I think, let us smartly get away with that.
Speaking about solar system resources, I’m also listed as an author on Stéphane Erard’s talk on EPN-TAP and EPNCore v2.0, probably due to my involvement in finally bringing EPN-TAP into the IVOA document repository. I’ve already talked about that in a 2017 post on this blog – and again, if you’re interested in solar system data, this would be a good time to review the EPN-TAP working draft.
Talking about things regular readers of this blog will have heard of: September’s Crazy Shapes post I’ve referenced in a talk on MOCs in pgsphere, together with a fervent appeal to data centers to become involved in pgsphere maintenance.
And then there was my colleague Margarida’s talk on LineTAP, a proposal to obsolete the little-used SLA protocol (which lets people search for spectral lines) with something combining the much more successful VAMDC with our beloved TAP. Me, I’m in this because I’d like to bring TOSS data closer to VAMDC – but also because having competing infrastructures for the same thing sucks.
And finally, I gave a talk I’ve called Data Model Posture Review in a session of the Data Models working group; I was somewhat worried that given its rather skeptical outlook it wouldn’t be really well-received. But in fact quite a few people shared my main conclusions – and perhaps it was another step towards resolving my decade-old spot of pain: that the VO still doesn’t offer tech to reliably bring two catalogues to the same epoch without human intervention.
With this number of talks I’ve been involved in, I’m essentially back to the level of a normal Interop. Which means I’ve been fairly knocked-out on Friday. And I can’t lie: I still regret I didn’t get to spend a few more warm days in Granada. Corona begone!
The Corona pandemic, regrettably, has also brought with it a dramatic move to closed, proprietary communication and collaboration platforms: I’m being bombarded by requests to join Zoom meetings, edit Google docs, chat on Slack, “stream” something on any of Youtube, Facebook, Instagram, or Sauron (I’ve made one of these up).
Mind you, that’s within the Virtual Observatory. Call me pig-headed, but I feel that’s a disgrace when we’re out to establish Free and open standards (for good reasons). To pick a particularly sad case, Slack right now is my pet peeve because they first had an interface to IRC (which has been doing what they do since the late 80ies, though perhaps not as prettily in a web browser) and then cut it when they had sufficient lock-in. Of course, remembering how Google first had XMPP (that’s the interoperable standard for instant messaging) in Google talk and then cut that, too… ah well, going proprietary unfortunately is just good business sense once you have sufficient lock-in.
Be that as it may, I was finally fed up with all this proprietary tech and set up something suitable for conferencing building on open, self-hostable components. It’s on https://telco.g-vo.org, and you’re welcome to use it for your telecons (assuming that when you’re reading this blog, you have at least some relationship to astronomy and open standards).
What’s in there?
Unfortunately, there doesn’t seem to be an established, Free conferencing system based on SIP/RTP, which I consider the standard for voice communication on the internet (if you’ve never heard of it: it’s what your landline phone uses in all likelihood). That came as a bit of a surprise to me, but the next best thing is a Free and multiply implemented solution, and there’s the great mumble system that (at least for me) works so much better than all the browser-based horrors, not to mention it’s quite a bit more bandwidth-efficient. So: Get a client and connect to telco.g-vo.org. Join one of the two meeting rooms, done.
Mumble doesn’t have video, which, considering I’ve seen enough of people’s living rooms (not to mention Zoom’s silly bluebox backgrounds) to last a lifetime, counts as an advantage in my book. However, being able to share a view on a document (or slide set) and point around in it is a valid use case. Bonus points if the solution to that does not involve looking at other people’s mail, IM notifications, or screen backgrounds.
Now, a quick web search did not turn up anything acceptable to me, and since I’ve always wanted to play with websockets, I’ve created poatmyp: With it, you upload a PDF, distribute the link to your meeting partners, and all participants will see the slides and a shared pointer. And they can move around in the document together.
What’s left is shared editing. I’ve looked at a few implementations of this, but, frankly, there’s too much npm and the related curlbashware in this field to make any of it enjoyable; also, it seems nobody has bothered to provide a Debian package of one of the systems. On the other hand, there are a few trustworthy operators of etherpads out there, so for now we are pointing to them on telco.g-vo.
Setting up a mumble server and poatmyp isn’t much work if you know how to configure an nginx and have a suitable box on the web. So: perhaps you’ll use this opportunity to re-gain a bit of self-reliance? You see, there’s little point in having your local copy of the Gaia catalogue, and doing that right is hard. Thanks to people writing Free software, running a simple telecon infrastructure, on the other hand, isn’t hard any more.
The people that create the Virtual Observatory standards, organised in the IVOA, meet twice a year: Once in spring for a five-day meeting (this year it happened in Paris), and once in autumn for a three-day meeting back-to-back to ADASS, the venerable (this year it’s the 29th installment) meeting of people dealing with astronomy and computers.
We’re now on day three of ADASS, and for me, so far this has been more or less an endless hackathon, with discussing and hacking on things like mirrors for DFBS, ADQL 2.1, the evolution of IVOA vocabularies (more on this soon somewhere around here), a vocabulary of object types, getting LAMOST 5 published properly in the VO, the measurements data model, convincing more registries to push out space-time coverage for their resources (I’m showing a poster on that), and a lot more.
So, getting to actually listen to talks during ADASS almost is something of a luxury, and a mind-widening one at that – I’ve just listened to a talk about effectively doubling the precision of VLBI geodesy (in this case, measuring the location of radio telescopes to a few millimeters) by a piece of clever software, and before that I could learn a bit about how complex it is to figure out how much interference something emitting radio waves will cause in some other place on earth (like, well, a radio telescope). In case you’re curious: A bit more than a year from now, short papers on the topics will appear in the proceedings of ADASS XXIX, which in turn you’ll find in the ADASS proceedings collections (or on arXiv before that).
Given the experience of the last few days, I doubt I’ll do anything like the live blog from Paris linked above. I still can’t resist mentioning that at ADASS, I’m having a poster that’s little more than an ad blitz for STC in the registry.
Update (2019-10-13): Well, one week later I’m sitting in the closing session of the Interop, and I’ve even already given my summary of Semantics activities during the interop. Other topics I’ve talked about at this interop include interoperable authentication (I’m really interested in this because I’d like to enable persistent TAP uploads, where your uploaded tables are still there for you when you come back), a minor update to SimpleDALRegExt (which is overall rather technical and you probably don’t want to look at), on the takeup of new Registry tech (which might come over as somewhat sad, but considering that you have to pull along many people to have changes in “the” Registry, it’s not so bad at all), and on, as Mark Taylor called it, operational identification of server software (which I consider entertaining in its somewhat erratic narrative).
And now, after 7 days of essentially nonstop discussion and brainstorming, I’m longing to slump into a chair on the train back to Heidelberg and just enjoy the landscape rolling by.
And, of course, there’s a puzzler again: you could win a beautiful towel if you solve a little VO-related problem. This year’s puzzler is about where in the sky you’ll see “nebulae” (in the classic sense defined by NGC) batched together most closely. If you’ve been following this blog for a while, it shouldn’t be too hard, but to participate you’d have to find someone in Stuttgart to hand in your solution.
If you are in Stuttgart: As usual, we’ll be giving hints during the coffee breaks on Tuesday and Wednesday. So, be sure to visit our booth.
About every six months, the people making the standards for the Virtual Observatory meet to sort out the next things we need to tackle, to show off what we’ve done, and to meet each other in person, which sometimes is what it takes to take some excessive heat out of a debate or two. We’ve talked about Interops before. And now it’s time for this (northern) spring’s Interop, which is taking place in Paris (Program).
This time I thought I’d see if there’s any chance I can copy the pattern I’m enjoying at Skyweek now and then: A live blog, where I’ll extend the post as I go. Whether that’s a plan that can fly remains to be seen, as I’ll give seven talks until Friday, and there’s a plethora of side meetings and other things requiring my attention.
Anyway, the first agenda item is a meeting of the TCG, the Technical Coordination Group, which is made up of the chairs and vice-chairs of the IVOA’s working groups (I’m in there as the vice chair of the semantics WG). We’ll review how the standards under review progress, sanction (or perhaps defer) errata, and generally look at issues of general VO interest.
Update (2019-05-12, 10:50): Oh dang, my VOResource 1.1 Erratum 1 hasn’t quite made it. You see, it’s about authentication, i.e., restricting service access, which, in a federated, interoperating system is trickier than you would think, and quite a few discussions on that will happen during this Interop. So, the TCG has just decided to only consider it passed if nothing happens this week that would kill it. To give you an idea of other things we’ve talked about: Obscore 1.1 Erratum 1 and SODA 1.0 Erratum 1 both try to fix problems with UCD annotation (i.e., a rough indication of what a column is) not directly related to the standards themselves but intended to help when service results are consumed outside of the standard context, and RegTAP 1.0 Erratum 1 fixes an example in the standard regulating registry discovery that didn’t properly take into account my old nemesis, case-insensitivity of IVOA identifiers. So, yay!, at least one of my Errata made the TCG review.
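Why case-insensitivity keeps biting is easy to show: two spellings of an IVOA identifier that differ only in case denote the same resource, but a plain string comparison says otherwise. The usual way around that is to normalise to lower case before comparing, as in this minimal sketch (the function names are invented here, and this is not the text of the erratum):

```python
def normalize_ivoid(ivoid):
    """Return ivoid in a form suitable for comparisons: stripped of
    surrounding whitespace and converted to lower case."""
    return ivoid.strip().lower()


def same_ivoid(first, second):
    """Return True if the two IVOA identifiers denote the same resource,
    comparing them case-insensitively."""
    return normalize_ivoid(first) == normalize_ivoid(second)
```

With this, ivo://org.gavo.dc/tap and IVO://ORG.GAVO.DC/TAP compare as equal, whereas a naive == on the raw strings would claim they are different resources.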
Update (2019-05-12, 12:15): Yay! After some years of back and forth, the TCG has finally endorsed my Discovering Data Collections note. This is another example of the class of text you don’t really notice. It’s supposed to let you, for instance, type in a table name into TOPCAT and then figure out at which TAP service to query it. You say: I can already do that! I say: Yeah, but only because I’m running a non-standard service, which I’d like to cease at some point.
Update (2019-05-12, 15:55): The TCG meeting slowly draws to an end. This second half was, in particular, concerned with reports from Working and Interest Groups; this is, essentially, an interactive version of the roadmaps, where the various chairs say what they’d like to do in the six months following an Interop. The one from after College Park (VO insiders live by Interops, named by the towns they’re in) you could read at 2018 B Roadmap in the IVOA Wiki – but really, as of next Friday, you’d rather look at what’s going to be cooked up here (which will be at 2019 A Roadmap).
Update (2019-05-12, 16:30) It’s now Exec, i.e., the governing body of the VO, consisting of the principal investigators (or, bosses), of the national VO projects (I’m just sitting in for my boss, really). This has, for instance, the final say on what gets to be a standard and what doesn’t. This is, of course, a bit more formal than the hands-on debates going on in the TCG, so I get to look around a bit in the meeting room. And what a meeting room they have here at Paris observatory. Behind me there’s a copy of Louis XIV’s most famous portrait (and for a reason: Louis XIV had the main part of the building we’re in built), along the walls around me are the portraits of the former directors of Paris observatory (among them names all mathematicians or astronomers know: Laplace, Delaunay, Lalande, the Cassinis, and so on), and above me, in the meeting room’s dome, there’s an allegoric image of a Venus transit that I can’t link here lest schools block this important outreach site. What a pity we’ll have to move into a tent when everyone else comes in tomorrow…
Update (2019-05-13, 9:11) The logistics speech is being given by Baptiste Cecconi, who’s just given the carbon footprint of this meeting – 155 tons of CO2 for travel alone, or 1.2 tons per person. That, as he points out, is about what would be sustainable per year. Well, they’re trying to make amends as far as possible. We’ll have vegetarian-only food today (good for me), and locally grown food as far as possible. Also, the conference freebie is a reusable cup so people won’t produce endless amounts of waste plastic cups. I have to say I’m impressed.
Update (2019-05-13, 9:43): One important function of these meetings is that when software authors and users sit together, it’s much easier to fix things. And, first success for me this time around: The LAMOST services at the data center of the Czech academy of sciences do fast positional searches now; you’ll find them by looking for LAMOST in TOPCAT’s SSAP window, in Aladin 10, in Splat, or really wherever clients let you do discovery of spectral services in the VO.
Update (2019-05-13, 10:59): Next up: “Charge to the Working Groups”. That’s when the various working group chairs give lightning talks on what’s going to happen in their sessions and try to pull as many people as they can. Meanwhile, in the coffee break, I’ve had the next little success: With the people involved, we’ve worked out a good way to fix a Registry problem briefly described by “two publishing registries claim the same authority” (it’s always nice to pretend I’m in Star Trek) – indeed, we’ll only need a single deletion at a single point. Given the potential fallout of such a problem, that’s very satisfying.
Update (2019-05-13, 14:07): While the IG/WG chairs presented their plans, the Ghost of Le Verrier (or was it just the wind?) occasionally haunted the tent, which gave off dreadful noises. And after the session, I quickly ported the build infrastructure for the future EPN-TAP specification (SVN for nerds; previously in this blog for the rest of you) to python 3. Le Verrier was quiet during that time, so I’m sure the guy who led the way to the discovery of Neptune approved.
Update (2019-05-13, 14:29): Mark SubbaRao from Chicago’s Adler Planetarium is giving a plenary talk (in other places, this might be called a “keynote”) on Planetaria and the VO. And he makes the point that there’s 150 million people visiting a planetarium each year, which, he claims, is a kind of outreach opportunity that no other science has. I’d not bet on that last statement given all the natural history museums, exploratoria, maker faires and the like, but still: That the existence of planetaria says something about the relationship of the public with astronomy is an insight I just had.
Update (2019-05-13, 15:07): So, you think you just sit back and enjoy a colourful talk, and then suddenly there’s work in there. Specifically, there’s a standard called AVM designed to annotate astronomical images to show them in the right place on a planetarium dome (ok, FITS WCS can do that as well) and furnish it with other metadata useful in outreach and education. As Registry and Semantics enthusiast, I immediately clicked on the AVM link at the foot of http://www.data2dome.org and was greeted by something pretty close to a standard IVOA document header. Except it declares itself as an “IVOA draft”; such a document category doesn’t really exist. Even if it did, after around 10 years (there are conflicting date specs in the document) a document shouldn’t be a “draft” any more. If it’s survived that long and is still used, it deserves to be some sort of proper document, I think. So, I took the liberty of cold-contacting one of the authors. Let’s see where that goes.
Update (2019-05-13, 16:29): We’ve just learned about the standardisation process at IPDA (that’s a bit like the IVOA, just for planetary data), and interestingly, people are voting there on their standards – this is against the IVOA practice of requiring consensus. Our argument has always been that a standard only makes sense if all interested parties adopt it and thus have to at least not veto it. I wonder if these different approaches have to do with the different demographics: within the IPDA, there are far fewer players (space agencies, really) with much clearer imbalances (e.g., between NASA and the space agency of the UAE). Hm. I couldn’t say how these would impact our arguments for requiring consensus…
Update (2019-05-13, 17:11): Isn’t that nice? In the session of the solar system interest group, Eleonora Alei is just reporting on her merged catalog of exoplanets – which is nice in itself, but what’s pleasant for me is to learn she got to make this because of the skills she learned at the ASTERICS school in Strasbourg last November. You see, I was one of the tutors there!
Update (2019-05-14, 8:50): Next up is the first Registry session, with a talk on how to get the information on all our fine VO services into B2Find, a Registry-like thing for the European Open Science Cloud as its highlight. I’ll also present my findings on what we (as the VO) have gotten wrong when we used “capabilities” to describe things, and also progress on VODataService 1.2; this latter thing is, as far as users are concerned, mainly about finally enabling registry searches by space, time, and spectral coverage.
Update (2019-05-14, 14:11): So, I did run into overtime a bit with my talks, which mostly is a good sign in Interops, because it indicates there’s discussion, which again indicates interest in the topic at hand. The rest of the morning I spent trying to work out how we can map the VO Registry (i.e., the set of metadata records about our services) into b2find in a way that it’s actually useful. I guess we – that’s Claudia from b2find, Theresa as Registry chair, and me – made good progress on this, perhaps not the least because of the atmosphere of the meeting: In the sun in the beautiful garden of Paris observatory. And now: Data Models I.
Update (2019-05-14, 14:51): Whoops – Steve just mentions in his talk on the Planetary Data System that there’s ISO 14721, a reference model for an Open Archival Information System. Since I run such an archive, I’m a bit embarrassed to admit I’ve never heard of that standard. The question, of course, being if this has the same relationship to actually running an Archive as ISO 9001 has to “quality” (Scott Adams once famously said something to the effect of: if you’ve not worked with ISO 9001, you probably don’t know what it is. If you have worked with ISO 9001, you certainly don’t know what it is).
Update (2019-05-15, 9:30): I’ve already given my first talk today: TIMESYS and TOPOCENTER, on a quick way to deal with the problem of adjusting for light travel times when people have not reduced the times they give to one of the standard reference positions. There’s more things close to my heart in this session: MOCs in Space and Time, which might become relevant for the Registry [up-update: and, wow, of quick searches against planetary or asteroid orbits. Gasp]; you see, MOCs are rather compact representations of (so far only spatial) coverages, and the space MOCs are already in use for the Registry in the rr.stc_spatial table on the TAP service at http://dc.g-vo.org/tap. The temporal part of STC-based discovery is just intervals at this point, which probably is good enough – but who knows? And I’m also curious about Dave’s thoughts on the registration of VOEvents, which takes up something I’ve reviewed ages ago and that went dormant then – which was somewhat of a pity, because there’s to this day no way to find active VOEvent streams.
Update (2019-05-15, 14:18): After another Exec session over lunch I ran over to a session somewhat flamboyantly called “TAP-fostered Authentication in the Server-Client scenario“. This is about enabling running access-controlled services, which I’m not really a fan of; but then I figure if people can use VO tools to access their proprietary data, chances are better that that data will eventually be usable from everyone’s VO tools. Data dumped behind custom-written web pages will much less likely be freed in the end, or so I believe. Anyway, I’m now in the game of figuring out how to do this, and I’m giving the (current) Registry perspective. The main part of the session, however, will be free discussion, a time-honored and valuable tradition at Interops.
Update (2019-05-16, 9:00): I’m now in the Theory session, where people deal with simulated data and such things (rather than, as you might guess, with the theory of publishing and/or processing data). The main reason I’m here is that theory was an early adopter of vocabularies. Due to my new(ish) role in the semantics WG, I’ll have to worry about this, because things changed a bit since they started (I’ll talk about that later today) – and also, some of their vocabularies – for instance, object types – are of general interest and probably shouldn’t be theory-only. Let’s see how far my charm goes…
Update (2019-05-16, 12:20): I was doing a bit of back-and-forth between a DAL session (in which, among other things, my colleague Jon gave a talk on a machine-readable grammar for ADQL and Dave told us how ADQL 2.1 goes on (previously on this blog)) and a code sprint the astropy folks have next to the conference, where we’ve been discussing pyVO’s future (remember pyVO? See the update for yesterday 11:16 if not).
Update (2019-05-16, 14:27): Again, in-session running: I gave a quick talk on how we’ll finally get to do data collection-based discovery (rather than service-based, as we do now; lecture notes) and then walked through the garden of Paris observatory to the semantics session, where I joined while people were still discussing the age-old problem of enumerating the observatories, space-probes, and instruments in the world (an endeavour that, very frankly, scares me a tiny bit because of its enormous size). After talks on the use of vocabularies in CAOM2 (Pat) and theory (Emeric), I’ll then do my first formal action in the semantics WG: I’ll disclose my plans for specifying how the IVOA should do vocabulary work in the future.
Update (2019-05-16, 17:56): So, the afternoon, between my talks in Registry II and Semantics, planning for the Semantics roadmap (this is something where WG chairs say what they’re planning until the next Interop; more on this, I guess, tomorrow), talking with the theory people about how their vocabularies will better integrate with the wider VO, and passing on pyVO to core astropy folks, was a bit too busy for live-blogging. I conclude with a “splinter” on the development of Datalink. This is pure discussion without a formal talk, which, frankly, often is the most useful format for things we do at Interops, and there’s almost 20 people here. In contrast to yesterday’s after-show splinter (which was on integration of the VO Registry with b2find), I’m just a participant here. Phewy.
Update (2019-05-17, 8:52): We’re going to start the last act of Interops, where the working group chairs report on the progress made during the interop. That only three WGs have their slides up at the time of writing shows that this is always a bit of a real-time affair – understandably, because the last bargains and agreements are being worked out as I write. This time around, though, there’s a variation to that theme: The astropy hackathon that ran in parallel to the Interop will also present its findings, and I particularly rejoice because they’re taking over pyVO development. That’s excellent news because Stefan, who’s curated pyVO for the last couple of years from Heidelberg, has moved on and so pyVO might have been orphaned. That’s what I call a happy end.
Update (2019-05-17, 13:01): So, after reviews and a kind good-bye speech by the Exec chair Mark Allen – which included quite a bit of well-deserved applause for the organisers of the meeting –, the official part is over. Of course, I still have a last side-meeting: planning for what we’re going to do within ESCAPE, a project linking astronomy with the European Open Science Cloud. But that’s not going to be more than an hour. Good-bye.
Using and re-using is of course what the Virtual Observatory is about, and we’ve been keeping fairly large plate collections in our data center for quite a while (among them the Archives of Landessternwarte Königstuhl and the Palomar-Leiden Trojan surveys; the WFPDB is TAP-accessible there, too). Therefore, people from GAVO Heidelberg have been to all past astroplate conferences.
For this one, I brought a brand-new tutorial on plate scans in the VO, which, I hope, also works as a general introduction to image discovery in the VO using SIAP, Datalink, and Obscore. If you’re doing image stuff now and then, please have a quick look at the thing – I am particularly grateful for hints on what to improve or perhaps particularly obvious use cases for the material discussed.
Such VO proselytising aside, the conference is discussing the wide variety of creative, low-cost data collectors out there as well as computer-aided re-analysis extracting new knowledge from decades-old data. If I had to choose a single come-to-think-of-it moment, it would be Norbert Zacharias’ observation that if you have a well-behaved object and you’d like to know where it was in 1900, it’s now more accurate to extrapolate Gaia astrometry to the epoch of observation than to measure it on the plate itself. Which is saying a lot about the amazing feat of engineering that Gaia is.
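A rough back-of-the-envelope sketch shows why that observation is plausible. It propagates a position uncertainty linearly in time, treating position and proper-motion errors as independent and ignoring correlations, parallax, and radial velocity. All numbers below are illustrative assumptions and not figures from the talk.

```python
import math


def extrapolated_sigma_mas(sigma_pos_mas, sigma_pm_mas_per_yr, dt_years):
    """1-sigma position uncertainty after propagating over dt_years.

    Adds the reference-epoch position error and the accumulated
    proper-motion error in quadrature (independent errors assumed).
    """
    return math.hypot(sigma_pos_mas, sigma_pm_mas_per_yr * dt_years)


# Assumed, illustrative values for a well-behaved star:
SIGMA_POS_MAS = 0.05         # position error at the reference epoch
SIGMA_PM_MAS_PER_YR = 0.05   # proper-motion error
REFERENCE_EPOCH = 2016.0     # assumed catalogue reference epoch
PLATE_SIGMA_MAS = 100.0      # assumed error of a single plate measurement

gaia_sigma = extrapolated_sigma_mas(
    SIGMA_POS_MAS, SIGMA_PM_MAS_PER_YR, REFERENCE_EPOCH - 1900.0
)
print(f"extrapolated to 1900: {gaia_sigma:.1f} mas; plate: {PLATE_SIGMA_MAS:.0f} mas")
```

Under these assumptions the proper-motion term dominates after more than a century, and the extrapolated position is still more than an order of magnitude tighter than the assumed plate measurement.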
This is not, however, an argument for dumping the old data. Usually, it is exactly what is not so well-behaved (like those) that’s interesting – both in terms of astrometry and in terms of photometry (for which there’s a lot more unruly behaviour in the first place). To figure out how objects misbehave, and to find out which objects only appear well-behaved on the time scale of (say) the Gaia mission, the key is “old” data. The freshness of which we’re discussing this week.
The VO Registry lets people find astronomical resources (which is jargon for “dataset, service, or stuff“). Currently, most of its users don’t even notice they’re using the Registry, as when TOPCAT just magically lists what TAP services are available (image above) – but there are also interfaces that let you directly interact with the registry, for instance GAVO’s WIRR service or ESAVO’s Registry Search.
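For the curious, a minimal sketch of the kind of Registry query that sits behind such a service list follows. The table and column names (rr.resource, rr.capability, rr.interface, standard_id, intf_role, access_url) are those of the RegTAP relational schema; that any particular client sends exactly this query is an assumption, and the helper name is made up for illustration.

```python
# Standard identifier of TAP capabilities (RegTAP stores it lowercased).
TAP_STANDARD_ID = "ivo://ivoa.net/std/tap"


def tap_services_query(limit=None):
    """Return ADQL listing registered TAP services with their access URLs.

    Hypothetical helper: the joins follow the RegTAP schema, but real
    clients may well phrase their queries differently.
    """
    top = f"TOP {int(limit)} " if limit is not None else ""
    return (
        f"SELECT {top}ivoid, res_title, access_url "
        "FROM rr.resource "
        "NATURAL JOIN rr.capability "
        "NATURAL JOIN rr.interface "
        f"WHERE standard_id = '{TAP_STANDARD_ID}' "
        "AND intf_role = 'std'"
    )


print(tap_services_query(limit=5))
```

The resulting string can be pasted into any TAP client pointed at a service that carries the RegTAP tables.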
Arguably, the usefulness of the Registry scales with its completeness. With sufficient completeness, the domain-specific, structured metadata will also make it interesting for generic discovery of astronomical data; in a quip, looking for UCDs in google will never work all that well – and without that, it’s hard to find things with queries like “radio fluxes of early-type stars”.
Either way: If you have a data set or a service dealing with astronomy, it’d be great if you could register it. To do this, so far you either had to set up a publishing registry, which is nontrivial even if you have software that natively speaks a protocol called OAI-PMH (DaCHS does, but most other publishing suites don’t), or you could use one of two web interfaces to define your resource (notes for a talk on this I gave in 2016).
Neither of these options is really attractive if you publish only a few resources (so the overhead of running a publishing registry looks excessive) that change now and then (so using a web browser to update the resource records again and again is tedious). Therefore, GAVO has developed purx, the publishing registry proxy. We’ve officially announced it during the recent Southern Spring Interop in Santiago de Chile (Program), and the lecture notes for that talk are probably a good introduction to what this is about.
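To give an idea of what a publishing registry has to answer, here is a minimal sketch of the request a harvester would send over OAI-PMH. The verb, metadataPrefix, set, and from parameters are plain OAI-PMH; the values ivo_vor and ivo_managed are, to the best of my knowledge, what the IVOA Registry Interfaces standard prescribes. The endpoint URL is a made-up placeholder.

```python
from urllib.parse import urlencode


def list_records_url(endpoint, from_date=None):
    """Build an OAI-PMH ListRecords URL for harvesting VO resource records.

    endpoint is the publishing registry's OAI-PMH URL (hypothetical in
    the example below); from_date optionally restricts the harvest to
    records changed since that date (YYYY-MM-DD).
    """
    params = {
        "verb": "ListRecords",
        "metadataPrefix": "ivo_vor",  # VOResource records
        "set": "ivo_managed",         # records this registry originates
    }
    if from_date is not None:
        params["from"] = from_date
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint, purely for illustration:
url = list_records_url("http://example.org/oai.xml", from_date="2017-01-01")
print(url)
```

A publishing registry has to respond to requests like this one (and a handful of other verbs) with well-formed XML, which is the overhead purx takes off your hands.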
If you’re running VO services and have not registered them so far, you probably want to read both these notes and the service documentation. If, on the other hand, you just have a web-published directory of files or a browser-based service, you probably can skip even that. Just grab a sample record (use the one for a simple browser service in both cases) and adapt it to what’s fitting for your website. Then put the resulting file online somewhere and paste the URL of that location on purx’ enrollment service. In case you’re uncertain about some of the terms in the record, perhaps our crib sheet for metadata we ask our data providers for will be helpful.
There’s really no excuse any more for not being in the Registry!
For the 11th time, GAVO has a booth at a meeting of the venerable Astronomische Gesellschaft (AG). This year, we are in Göttingen, again offering advice to users and data providers at our booth (if you’re looking for us: We’re close to the entrance of Hörsaal 5).
And again we have a Puzzler, a little problem easily solved if you know your VO tech – and if you don’t we’ll gladly help you at our booth. We are also giving hints there, one being released at each coffee break on Tuesday and Wednesday (there are little posters with them, too, if you miss one). Of course, if you’re not in Göttingen, you’re still welcome to try your hand. You won’t get to win our great first prize then, the big Crab Nebula towel (it should be easy to spot on the image above).
If, on the other hand, you are in Göttingen, be sure to drop by our splinter meeting. Yours truly, for instance, will speak about EPN-TAP (remember And the Solar System, too right here? That’s what this is about).
Update 2017-09-20, 17:00 We’ve just given out the last hint for the puzzler, and so we can publish them all over on the puzzler archive: Hints for the 2017 puzzler. If you’re in Göttingen, you still have until tomorrow 16:00 to hand in a solution and perhaps win our nice and fuzzy Crab Nebula towel.
Update 2017-09-21, 17:00 And the winner is… again not from Marburg, which is beginning to become a running gag, and they’ve been unlucky for the last three years in a row. Anyway, here’s our proposed solution.
The 3rd Asterics DADI Tech Forum took place last week in Strasbourg – and many GAVO members made contributions as well.
This time, there were 3 slots for hackathon sessions, which were also used for discussions. We’ll mention two highlights of our contributions here.
We took the opportunity to push our Provenance Data Model efforts and used the hackathon slots for provenance discussions.
One topic was the links between the simulation data model and ProvenanceDM, and how to map from SimDM to ProvenanceDM classes. This mapping works quite well and will be included in the working draft for the data model. We also had an interesting talk by José Enrique Ruiz on his view on Provenance, workflows, and – very important – the “deployer” and “system” provenance for storing all the environment variables that may be needed to rerun the processing of some observational data. Michèle Sanguillon also presented for the first time her extension of the prov Python library (W3C) with features from our IVOA Provenance Data Model. We also had people from outside the usual provenance crowd joining in, e.g. from the Astron project. More about our Provenance modelling efforts can be found on the IVOA Provenance wiki page.
A world premiere (of sorts) was the first discussion of RegTAP 1.1. RegTAP is a search interface to the VO Registry; it is what TOPCAT or other VO clients use when you type in keywords to locate services. A fairly direct web-based interface is our WIRR registry interface. RegTAP will need a bit of a makeover since VOResource, the underlying metadata scheme, is currently receiving one, allowing, in particular, for including DOIs and ORCIDs (John Does of this world, rejoice: People can finally uniquely find your data and not that of all the other J. Does) in Registry records and figuring out licenses on data. Licensing may not matter when you use data in a paper but it does matter if you want to redistribute data, e.g. for planetarium programs with catalog data or pretty pictures, or when re-mixing data.
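The keyword lookup that RegTAP enables can be sketched in a few lines. The function ivo_hasword and the rr.resource columns res_title and res_description are part of RegTAP; the helper name and the choice to search title and description only are assumptions made for illustration, not a description of what any particular client does.

```python
def keyword_query(word):
    """Return ADQL finding resources mentioning word in title or description.

    Single quotes in the keyword are doubled, which is how ADQL string
    literals escape them.
    """
    safe = word.replace("'", "''")
    return (
        "SELECT ivoid, res_title FROM rr.resource "
        f"WHERE 1=ivo_hasword(res_description, '{safe}') "
        f"OR 1=ivo_hasword(res_title, '{safe}')"
    )


print(keyword_query("quasar"))
```

Real clients typically also match against subject keywords and combine the result with constraints on the service type.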
But of course the GAVOistas happily joined the fray on the many other topics discussed, from a standard format for a time series to interoperable authentication, from datalink applications to figuring out if data coming into a program should be treated as a collection of spectra or rather an object catalog – the latter in the context of the upcoming version 10 of the VO’s premier image tool Aladin, which we saw (probably another premiere) demoed. We can already promise you an exciting update!