Sofa instead of Granada

[Screenshot from an online talk]
Gesticulating wildly to a computer is what happens in an online conference. To me, at least. Let’s hope nobody watched me through the window.

It was already in the wee hours of Friday last week (CET) when the second “virtual Interop” had its rather unceremonious closing ceremony. Its predecessor in May had about it an air of a state of emergency. For instance, all sessions were monothematic. That was nice on the one hand, because a relatively large part of the time was available for discussion – which, really, is what the Interops are about. But then Interops are also about noticing what everyone else in the Virtual Observatory is cooking up, for which the short-ish talks we usually have at Interops work really well.

In contrast to that first Corona Interop, this second one, replacing what would have taken place in Granada, Spain, had a much more conventional format, which again accommodated many talks. But of course, this made one feel quite a bit more keenly the lack of possibilities to quickly hash out a problem during a coffee break or in a spontaneous splinter.

Be that as it may, I would like to give you some insights on what I’m currently up to at the IVOA level; I am grateful for any feedback you can give on any of these topics.

Given that I currently chair the Semantics Working group, there was a natural focus on topics around vocabularies, and I gave two talks in that department. The one in DAL (DAL is the working group that builds the actual access protocols such as TAP or SIAP) was mainly on Datalink-related aspects of my Vocabularies in the VO 2 draft (VocInVO2), which in particular was an opportunity to thank everyone involved in the Vocabulary Enhancement Proposals we have been running this last year (all of which were about Datalink and hence closely tied to DAL). One thing I was asking for was reviews on a github pull request that would make the bysemantics method of Datalink accesses semantics-aware; basically, as intended by the original Datalink authors, when asking for #calibration links, this will also return, say, #bias links. If you can spare a moment for this: Please do!
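
To make concrete what “semantics-aware” means here, a minimal Python sketch may help. It is not the code of the pull request: the function names and the link rows are hypothetical, and the little hierarchy is hand-written for illustration (only the #calibration/#bias relation comes from the text above; #dark and #flat are assumed to be further narrower terms).

```python
# Toy excerpt of a vocabulary: term -> directly narrower terms.
# Only "#calibration" -> "#bias" is taken from the text; the rest is assumed.
NARROWER = {
    "#calibration": ["#bias", "#dark", "#flat"],
}


def expand(term, narrower=NARROWER):
    """Return the term together with all terms narrower than it."""
    found = {term}
    todo = [term]
    while todo:
        for child in narrower.get(todo.pop(), []):
            if child not in found:
                found.add(child)
                todo.append(child)
    return found


def by_semantics(links, term):
    """Select the link rows whose semantics is term or narrower than term."""
    wanted = expand(term)
    return [link for link in links if link["semantics"] in wanted]


# Hypothetical link rows, reduced to the two fields needed here.
links = [
    {"semantics": "#this", "access_url": "http://example.org/data"},
    {"semantics": "#bias", "access_url": "http://example.org/bias"},
    {"semantics": "#preview", "access_url": "http://example.org/preview"},
]
calibration_links = by_semantics(links, "#calibration")
```

A naive string comparison against "#calibration" would return nothing for these rows; with the expansion, the #bias link is found.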

Another thing I tried to raise some interest for is the proposed vocabulary of product types; this, I think, should eventually define what people may put into the dataproduct_type column of Obscore results, and there are related uses in Datalink and, believe it or not, the registration of SSAP (spectral) services. A question Alberto raised while I was discussing that made me realise I forgot to mention another vocabularies-related development relevant for DAL: I’ve put the gavo_vocmatch ADQL user-defined function into DaCHS. It lets you match something against a term or its narrower terms, referencing an IVOA vocabulary. For instance, if we had different sorts of time series (which, of course, would be odd for obscore that has the o_ucd column for this kind of thing), you could, using ADQL, still get all time series by querying

SELECT TOP 5 *
FROM ivoa.obscore
WHERE
  1=gavo_vocmatch(
    'product-type',
    'timeseries',
    dataproduct_type)

Here, the first argument is the vocabulary name (whatever is after the http://www.ivoa.net/rdf in the vocabulary URL), the second the “root” term, and the third the column to match against. Since postgres, for now, isn’t aware of IVOA vocabularies, the second argument must be a literal string rather than, say, an expression involving columns.
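
Why a literal? One way to picture it, as a sketch of the idea and explicitly not DaCHS’ actual implementation: the service resolves the term against the vocabulary while it translates the query, and the database only ever sees a plain list of strings. That only works if the term is known before the query runs. In the Python below, all names are hypothetical, and the narrower terms are invented, since the example assumes sub-kinds of time series that do not exist.

```python
# Invented hierarchy for illustration:
# vocabulary name -> term -> directly narrower terms.
VOCABULARIES = {
    "product-type": {
        "timeseries": ["light-curve", "velocity-curve"],
    },
}


def expand_vocmatch(voc_name, term, column):
    """Turn a vocabulary match into a condition a plain database understands."""
    narrower = VOCABULARIES[voc_name]
    terms, todo = [term], [term]
    while todo:
        for child in narrower.get(todo.pop(0), []):
            if child not in terms:
                terms.append(child)
                todo.append(child)
    # Double any single quotes so the terms are safe inside SQL string literals.
    literals = ", ".join("'{}'".format(t.replace("'", "''")) for t in terms)
    return "{} IN ({})".format(column, literals)


condition = expand_vocmatch("product-type", "timeseries", "dataproduct_type")
```

If the second argument were a column, the list of terms would differ from row to row, and there would be nothing to expand at translation time.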

I gave a second semantics-related talk in the Registry session. That had its focus on the Unified Astronomy Thesaurus (UAT), from which people should pick the subject keywords in the VO Registry (actually, they should pick from its representation at http://www.ivoa.net/rdf/uat). I’ll probably blog about that a little more some other time. For now, let me recommend a little UAT-based game on my Semantics Based Registry Browser sembarebro: Choose two terms that are pretty far apart (like, perhaps, ionized-coma-gases and cosmic-background-radiation) and then try to join the two sub-graphs. Warning: This may waste your time. But it will acquaint you with the UAT, which may be a good thing.

In that second talk, I also mentioned a second draft vocabulary I’ve put up in the past six months, http://www.ivoa.net/rdf/messenger. This builds upon the terms for VODataService’s waveband element, which enumerated certain flavours of photons (like Radio, Optical, or X-ray). Now that we explore other messengers as well and have more and more solar system resources in the Registry, I’m arguing we ought to open up things by making “Photon” explicit in there and then adding Neutrinos and, later, other messengers. I’ve received a certain amount of pushback there on mixing the electromagnetic spectrum with particle types; on the other hand, the hierarchical nature of our vocabularies would, I think, let us smartly get away with that.

Speaking about solar system resources, I’m also listed as an author on Stéphane Erard’s talk on EPN-TAP and EPNCore v2.0, probably due to my involvement in finally bringing EPN-TAP into the IVOA document repository. I’ve already talked about that in a 2017 post on this blog – and again, if you’re interested in solar system data, this would be a good time to review the EPN-TAP working draft.

Talking about things regular readers of this blog will have heard of: September’s Crazy Shapes post I’ve referenced in a talk on MOCs in pgsphere, together with a fervent appeal to data centers to become involved in pgsphere maintenance.

And then there was my colleague Margarida’s talk on LineTAP, a proposal to obsolete the little-used SLA protocol (which lets people search for spectral lines) with something combining the much more successful VAMDC with our beloved TAP. Me, I’m in this because I’d like to bring TOSS data closer to VAMDC – but also because having competing infrastructures for the same thing sucks.

And finally, I gave a talk I’ve called Data Model Posture Review in a session of the Data Models working group; I was somewhat worried that given its rather skeptical outlook it wouldn’t be really well-received. But in fact quite a few people shared my main conclusions – and perhaps it was another step towards resolving my decade-old spot of pain: that the VO still doesn’t offer tech to reliably bring two catalogues to the same epoch without human intervention.

With this number of talks I’ve been involved in, I’m essentially back to the level of a normal Interop. Which means I’ve been fairly knocked-out on Friday. And I can’t lie: I still regret I didn’t get to spend a few more warm days in Granada. Corona begone!

Building consensus

[image: Markus, handwringing]
Sometimes, building consensus takes a little bending: Me, at the Shanghai Interop of 2017. In-joke: there’s “STC” on the slide.

In the Virtual Observatory, procedures are built on consensus: No (relevant) decisions are passed based on some sort of majority vote. While I personally think that’s a very good thing in general (you really don’t want to clobber minorities, and I couldn’t even give a minimal size of such a minority below which it might be ok to ignore them), there is a profound operational reason for that: We cannot force data centers or software writers to comply with our standards, so they had better agree with them in the first place.

However, building consensus (to avoid Chomsky’s somewhat odious notion of manufacturing consent) is hard. In my current work, this insight manifests itself most strongly when I wear my hat as chair of the IVOA Semantics Working Group, where we need to sort items from a certain part of the world into separate boxes and label those, that is, we’re building vocabularies. “Part of the world” can be formalised, and there are big phrases like “universe of discourse” to denote such formalisations, but to give you an idea, it’s things like reference frames, topics astronomy in general talks about (think journal keywords), relationships between data collections and services, or the roles of files related to or making up a dataset. If you visit the VO’s vocabulary repository, you will see what parts we are trying to systematise, and if you skim the current draft for the next release of Vocabularies in the VO, in section two you can find a few reasons why we are bothering to do that.

As you may expect if you have ever tried classifications like this, what boxes (“concepts” in the argot of the semantics folks) there should be and how to label them are questions with plenty of room for dissent. A case study for this is the discussion on VEP-001 and its successors that has been going on since late last year; it also illustrates that we are not talking about bikeshedding here. The discussion clarified much and, in particular, led to substantial improvements not only to the concept in question but also far beyond that. If you are interested, have a look at a few mail threads (here, here, here, or here; more discussion happened live at meetings).

An ideal outcome of such a process is, of course, a solution that is obvious in retrospect, so everyone just agrees. Sometimes, that doesn’t happen, and one of these times is VEP-001 and the VEP-003 it evolved into. A spontaneous splinter between sessions of this week’s Virtual Interop yielded two rather sensible names for the concept we had identified in the previous debates: #sibling on the one hand, and #co-derived on the other (in case you’re RDF-minded: the full vocabulary URIs are obtained by prefixing these with the vocabulary URI, http://www.ivoa.net/rdf/datalink/core). Choosing between the two is a bit of a matter of taste, but also of perhaps changing implementations, and so I don’t see a clear preference. And the people in the conference didn’t reach an agreement before people on the North American west coast really had to have some well-deserved sleep.

In such a situation (extensive discussion yields very few, apparently rather equivalent solutions), I suspect it is time to resort to some sort of polling after all. So, in the session I’ve asked the people involved to give their pain level on a scale of 1 to 10. Given there are quite a few consensus scales out there already (I’m too lazy to look for references now, but I’ll retrofit them here if you send some in), I felt this was a bit hasty after I had closed the z**m^H^H^H^H telecon client. But then, thinking about it, I started to like that scale, and so during a little bike ride I came up with what’s below. And since I started liking it, I thought I could put it into words, and into a form I can reference when similar situations come up in the future. And so, here it is:

Markus’ Pain Level Scale

  1. Oh wow. I’m enthusiastic about it, and I’d get really cross if we didn’t do it.
  2. It’s great. I don’t think we’ll find a better solution. People better have really strong reasons to reject it.
  3. Fine. Just go ahead.
  4. Quite reasonable. I have some doubts, but I either don’t have a good alternative, or the alternatives certainly won’t improve matters.
  5. Reasonable. I can live with it, possibly accepting a very moderate amount of pain (like: change an implementation that I think is fine as it is).
  6. Sigh. I don’t like it much. If you think it’s useful, do it, but don’t blame me if it later turns out it stinks.
  7. Ouch. I wish we didn’t have to go there. For instance: This is going to uglify a few things I care about.
  8. Yikes. I think it’s a bad idea. Honestly, let’s not do it. It’s going to make quite a few things a lot uglier, though I give you it might still just barely work.
  9. OMG. What are you thinking? I won’t go near it, and I pity everyone who will have to. And it’s quite likely going to blow up some things I care about.
  10. Blech. To me, this clearly is a grave mistake that will impact a lot of things very adversely. If I can do anything within reason to stop it, I’ll do it. Consider this a veto, and shame on you if you override it.

You can qualify this with:

+
I’ve thought long and hard about this, and I think I understand the matter in depth. You’ll hence need arguments of the profundity of the Earth’s outer core to sway me.
(unqualified)
I’ve thought about this, and as far as I understand the matter I’m sure about it. More information, solid arguments, or a sudden inspiration while showering might still sway me.
This is a gut feeling. It could very well be phantom pain. Feel free to try a differential diagnosis.

If you like the scale, too, feel free to reference it as https://blog.g-vo.org/building-consensus/#scale.