Posts with the Tag Tutorials:

  • A Proposal for Persistent TAP Uploads

    From its beginning, the IVOA's Table Access Protocol TAP has let users upload their own tables into the services' databases, which is an important element of TAP's power (cf. our upload crossmatch use case for a minimal example). But these uploads only exist for the duration of the request. Having more persistent user-uploaded tables, however, has quite a few interesting applications.

    Inspired by Pat Dowler's 2018 Interop talk on youcat I have therefore written a simple implementation for persistent tables in GAVO's server package DaCHS. This post discusses what is implemented, what is clearly still missing, and how you can play with it.

    If all you care about is using this from Python, you can jump directly to a Jupyter notebook showing off the features; it by and large explains the same things as this blogpost, but using Python instead of curl and TOPCAT. Since pyVO does not know about the proposed extensions, the code necessarily is still a bit clunky in places, but if something like this will become more standard, working with persistent uploads will look a lot less like black art.

    Before I dive in: This is certainly not what will eventually become a standard in every detail. Do not do large implementations against what is discussed here unless you are prepared to throw away significant parts of what you write.

    Creating and Deleting Uploads

    Where Pat's 2018 proposal re-used the VOSI tables endpoint that every TAP service has, I have provisionally created a sibling resource user_tables – and I found that usual VOSI tables and the persistent uploads share virtually no server-side code, so for now this seems a smart thing to do. Let's see what client implementors think about it.

    What this means is that for a service with a base URL of http://dc.g-vo.org/tap[1], you would talk to (children of) http://dc.g-vo.org/tap/user_tables to operate the persistent tables.

    As with Pat's proposal, to create a persistent table, you do an http PUT to a suitably named child of user_tables:

    $ curl -o tmp.vot https://docs.g-vo.org/upload_for_regressiontest.vot
    $ curl -H "content-type: application/x-votable+xml" -T tmp.vot \
      http://dc.g-vo.org/tap/user_tables/my_upload
    Query this table as tap_user.my_upload
    

    The actual upload at this point returns a reasonably informative plain-text string, which feels a bit ad-hoc. Better ideas are welcome, in particular after careful research of the rules for 30x responses to PUT requests.

    Trying to create tables with names that will not work as ADQL regular table identifiers will fail with a DALI-style error. Try something like:

    $ curl -H "content-type: application/x-votable+xml" -T tmp.vot
      http://dc.g-vo.org/tap/user_tables/join
    ... <INFO name="QUERY_STATUS" value="ERROR">'join' cannot be used as an
      upload table name (which must be regular ADQL identifiers, in
      particular not ADQL reserved words).</INFO> ...
    

    After a successful upload, you can query the VOTable's content as tap_user.my_upload:

    A TOPCAT screenshot with a query 'select avg("3.6mag") as blue, avg("5.8mag") as red from tap_user.my_upload' that has a few red warnings, and a result window showing values for blue and red.

    TOPCAT (which is what painted these pixels) does not find the table metadata for tap_user tables (yet), as I do not include them in the “public“ VOSI tables. This is why you see the reddish syntax complaints here.

    I happen to believe there are many good reasons for why the volatile and quickly-changing user table metadata should not be mixed up with the public VOSI tables, which can be several 10s of megabytes (in the case of VizieR). You do not want to have to re-read that (or discard caches) just because of a table upload.

    If you have the table URL of a persistent upload, however, you inspect its metadata by GET-ting the table URL:

    $ curl http://dc.g-vo.org/tap/user_tables/my_upload | xmlstarlet fo
    <vtm:table [...]>
      <name>tap_user.my_upload</name>
      <column>
        <name>"_r"</name>
        <description>Distance from center (RAJ2000=274.465528, DEJ2000=-15.903352)</description>
        <unit>arcmin</unit>
        <ucd>pos.angDistance</ucd>
        <dataType xsi:type="vs:VOTableType">float</dataType>
        <flag>nullable</flag>
      </column>
      ...
    

    – this is a response as from VOSI tables for a single table. Once you are authenticated (see below), you can also retrieve a full list of tables from user_tables itself as a VOSI tableset. Enabling that for anonymous uploads did not seem wise to me.

    When done, you can remove the persistent table, which again follows Pat's proposal:

    $ curl -X DELETE http://dc.g-vo.org/tap/user_tables/my_upload
    Dropped user table my_upload
    

    And again, the text/plain response seems somewhat ad hoc, but in this case it is somewhat harder to imagine something less awkward than in the upload case.

    If you do not delete yourself, the server will garbage-collect the upload at some point. On my server, that's after seven days. DaCHS operators can configure that grace period on their services with the [ivoa]userTableDays setting.

    Authenticated Use

    Of course, as long as you do not authenticate, anyone can drop or overwrite your uploads. That may be acceptable in some situations, in particular given that anonymous users cannot browse their uploaded tables. But obviously, all this is intended to be used by authenticated users. DaCHS at this point can only do HTTP basic authentication with locally created accounts. If you want one in Heidelberg, let me know (and otherwise push for some sort of federated VO-wide authentication, but please do not push me).

    To just play around, you can use uptest as both username and password on my service. For instance:

      $ curl -H "content-type: application/x-votable+xml" -T tmp.vot \
      --user uptest:uptest \
      http://dc.g-vo.org/tap/user_tables/privtab
    Query this table as tap_user.privtab
    

    In recent TOPCATs, you would enter the credentials once you hit the Log In/Out button in the TAP client window. Then you can query your own private copy of the uploaded table:

    A TOPCAT screenshot with a query 'select avg("3.6mag") as blue, avg("5.8mag") as red from tap_user.my_upload' that has a few red warnings, and a result window showing values for blue and red; there is now a prominent Log In/Out-button showing we are logged in.

    There is a second way to create persistent tables (that would also work for anonymous): run a query and prepend it with CREATE TABLE. For instance:

    A TOPCAT screenshot with a query 'create table tap_user.smallgaia AS SELECT * FROM gaia.dr3lite TABLESAMPLE(0.001)'. Again, TOPCAT flags the create as an error, and there is a dialog "Table contained no rows".

    The “error message” about the empty table here is to be expected; since this is a TAP query, it stands to reason that some sort of table should come back for a successful request. Sending the entire newly created table back without solicitation seems a waste of resources, and so for now I am returning a “stub” VOTable without rows.

    As an authenticated user, you can also retrieve a full tableset for what user-uploaded tables you have:

    $ curl --user uptest:uptest http://dc.g-vo.org/tap/user_tables | xmlstarlet fo
    <vtm:tableset ...>
      <schema>
        <name>tap_user</name>
        <description>A schema containing users' uploads. ...  </description>
        <table>
          <name>tap_user.privtab</name>
          <column>
            <name>"_r"</name>
            <description>Distance from center (RAJ2000=274.465528, DEJ2000=-15.903352)</description>
            <unit>arcmin</unit>
            <ucd>pos.angDistance</ucd>
            <dataType xsi:type="vs:VOTableType">float</dataType>
            <flag>nullable</flag>
          </column>
          ...
        </table>
        <table>
          <name>tap_user.my_upload</name>
          <column>
            <name>"_r"</name>
            <description>Distance from center (RAJ2000=274.465528, DEJ2000=-15.903352)</description>
            <unit>arcmin</unit>
            <ucd>pos.angDistance</ucd>
            <dataType xsi:type="vs:VOTableType">float</dataType>
            <flag>nullable</flag>
          </column>
          ...
        </table>
      </schema>
    </vtm:tableset>
    

    Open Questions

    Apart from the obvious question whether any of this will gain community traction, there are a few obvious open points:

    1. Indexing. For tables of non-trivial sizes, one would like to give users an interface to say something like “create an index over ra and dec interpreted as spherical coordinates and cluster the table according to it”. Because this kind of thing can change runtimes by many orders of magnitude, enabling it is not just some optional embellishment.

      On the other hand, what I just wrote already suggests that even expressing the users' requests in a sufficiently flexible cross-platform way is going to be hard. Also, indexing can be a fairly slow operation, which means it will probably need some sort of UWS interface.

    2. Other people's tables. It is conceivable that people might want to share their persistent tables with other users. If we want to enable that, one would need some interface on which to define who should be able to read (write?) what table, some other interface on which users can find what tables have been shared with them, and finally some way to let query writers reference these tables (tap_user.<username>.<tablename> seems tricky since with federated auth, user names may be just about anything).

      Given all this, for now I doubt that this is a use case sufficiently important to make all the tough nuts delay a first version of user uploads.

    3. Deferring destruction. Right now, you can delete your table early, but you cannot tell my server that you would like to keep it for longer. I suppose POST-ing to a destruction child of the table resource in UWS style would be straightforward enough. But I'd rather wait whether the other lacunae require a completely different pattern before I will touch this; for now, I don't believe many persistent tables will remain in use beyond a few hours after their creation.

    4. Scaling. Right now, I am not streaming the upload, and several other implementation details limit the size of realistic user tables. Making things more robust (and perhaps scalable) hence will certainly be an issue. Until then I hope that the sort of table that worked for in-request uploads will be fine for persistent uploads, too.

    Implemented in DaCHS

    If you run a DaCHS-based data centre, you can let your users play with the stuff I have shown here already. Just upgrade to the 2.10.2 beta (you will need to enable the beta repo for that to happen) and then type the magic words:

    dachs imp //tap_user
    

    It is my intention that users cannot create tables in your DaCHS database server unless you say these words. And once you say dachs drop --system //tap_user, you are safe from their huge tables again. I would consider any other behaviour a bug – of which there are probably still quite a few. Which is why I am particularly grateful to all DaCHS operators that try persistent uploads now.

    [1]As already said in the notebook, if http bothers you, you can write https, too; but then it's much harder to watch what's going on using ngrep or friends.
  • Learn To Use The VO

    Thumbnails of the first 60 pages of the lecture notes, grayish goo with occasional colour spots thrown in.

    The first 60 pages of the lecture notes as they currently are. I give you a modern textbook would probably look a bit more colorful from this distance, but perhaps this will still do.

    About ten years ago, I had planned to write something I tentatively called VadeVOcum: A guide for people wanting to use the Virtual Observatory somewhat more creatively than just following and slightly adapting tutorials and use cases. If you will, I had planned to write a textbook on the VO.

    For all the usual reasons, that project never went far. Meanwhile, however, GAVO's courses on ADQL and on pyVO grew and matured. When, some time in 2021, I was asked whether I could give a semester-long course “on the VO”, I figured that would be a good opportunity to finally make the pyVO course publishable and complement the two short courses with enough framing that some coherent story would emerge, close enough to the VO textbook I had in mind in about 2012.

    Teaching Virtual Observatory Matters

    The result was a course I taught at Universität Heidelberg in the past summer semester together with Hendrik Heinl and Joachim Wambsganss. I have now published the lecture notes, which I hope are textbooky enough that they work for self-study, too. But of course I would be honoured if the material were used as a basis of similar courses in other places. To make this simpler, the sources are available on Codeberg without relevant legal restrictions (i.e., under CC0).

    The course currently comprises thirteen “lectures”. These are designed so I can present them within something like 90 minutes, leaving a bit of space for questions, contingencies, and the side tracks. You can build the slides for each of these lectures separately (see the .pres files in the source repository), which makes the PDF to work while teaching less cumbersome. In addition to that main trail, there are seven “side tracks”, which cover more fundamental or more general topics.

    In practice, I sprinkled in the side tracks when I had some time left. For instance, I showed the VOTable side track at the ends of the ADQL 2 and ADQL 3 lectures; but that really had no didactic reason, it was just about filling time. It seemed the students did not mind the topic switches to much. Still, I wonder if I should not bring at least some of the side tracks, like those on UCDs, identifiers, and vocabularies, into the main trail, as it would be unfortunate if their content fell through the cracks.

    Here is a commented table of contents:

    • Introduction: What is the VO and why should you care? (including a first demo)
    • Simple Protocols and their clients (which is about SIAP, SSAP, and SCS, as well as about TOPCAT and Aladin)
    • TAP and ADQL (that's typically three lectures going from the first SELECT to complex joins involving subqueries)
    • Interlude: HEALPix, MOC, HiPS (this would probably be where a few of the other side tracks might land, too)
    • pyVO Basics (using XService objects and a bit of SAMP, mainly along an image discovery task)
    • pyVO and TAP (which is developed around a multi-catalogue SED building case)
    • pyVO and the Registry (which, in contrast to the rest of the course, is employing Jupyter notebooks because much of the Registry API makes sense mainly in interactive use)
    • Datalink (giving a few pyVO examples for doing interesting things with the protocol)
    • Higher SAMP Magic (also introducing a bit of object oriented programming, this is mainly about tool building)
    • At the Limit: VO-Wide TAP Queries (cross-server TAP queries with query building, feature sensing and all that jazz; I admit this is fairly scary and, well, at the limit of what you'd want to show publicly)
    • Odds and Ends (other pyVO topics that don't warrant a full section)
    • Side Track: Terminology (client, server, dataset, data collection, oh my; I had expected this to grow more than it actually did)
    • Side Track: Architecture (a deeper look at why we bother with standards)
    • Side Track: Standards (a very brief overview of what standards the IVOA has produced, with a view of guiding users away from the ones they should not bother with – and perhaps towards those they may want to read after all)
    • Side Track: UCDs (including hints on how to figure out which would denote a concept one is interested in)
    • Side Track: Vocabularies (I had some doubts whether that is too much detail, but while updating the course I realised that vocabularies are now really user-visible in several places)
    • Side Track: VOTable (with the intention of giving people enough confidence to perform emergency surgery on VOTables)
    • Side Track: IVOA Identifiers (trying to explain the various ivo:// URIs users might see).

    Pitfalls: Technical, Intellectual, and Spiritual

    The course was accompanied by lab work, again 90 minutes a week. There are a few dozen exercises embedded in the course, and in the lab sessions we worked on some suitable subset of those. With the particular students I had and the lack of grading pressure, the fact that solutions for most of the exercises come with the lecture notes did not turn out to be a problem.

    The plan was that the students would explain their solutions and, more importantly, the places they got stuck in to their peers. This worked reasonably well in the ADQL part, somewhat less for the side tracks, and regrettably a lot less well in the pyVO part of the course. I cannot say I have clear lessons to be learned from that yet.

    A piece of trouble for the student-generated parts I had not expected was that the projector only interoperated with rather few of the machines the students brought. Coupling computers and projectors was occasionally difficult even in the age of universal VGA. These days, even in the unlikely event one has an adapter for the connectors on the students' computers, there is no telling what part of a computer screen will end up on the wall, which distortions and artefacts will be present and how much the whole thing will flicker.

    Oh, and better forget about trying to fix things by lowering the resolution or the refresh rate or whatever: I have not had one instance during the course in which any plausible action on the side of the computer improved the projected image. Welcome to the world of digital video signals. Next time around, I think I will bring a demonstration computer and figure out a way in which the students can quickly transfer their work there.

    Talking about unexpected technical hurdles: I am employing PDF-attached source code quite extensively in the course, and it turned out that quite a few PDF clients in use no longer do something reasonable with that. With pdf.js, I see why that would be, and it's one extra reason to want to avoid it. But even desktop readers behaved erratically, including some Windows PDF reader that had the .py extension on some sort of blacklist and refused to store the attached files on grounds that they may “damage the computer”. Ah well. I was tempted to have a side track on version control with git when writing the course. This experience is probably an encouragement to follow through with that and at least for the pyVO part to tell students to pull the files out of a checkout of the course's source code.

    Against the outline in the lecture as given, I have now promoted the former HEALPix side track to an interlude session, going between ADQL and pyVO. It logically fits there, and it was rather popular with the students. I have also moved the SAMP magic lecture to a later spot in the course; while I am still convinced it is a cool use case, and giving students a chance to get to like classes is worthwhile, too, it seems to be too much tool building to have much appeal to the average participant.

    Expectably, when doing live VO work I regularly had interesting embarrassments. For instance, in the pyvo-tap lecture, where we do something like primitive SEDs from three catalogues (SDSS, 2MASS and WISE), the optical part of the SEDs was suddenly gone in the lecture and I really wondered what I had broken. After poking at things for longer than I should have, I eventually promised to debug after class and report next time, only to notice right after the lecture that I had, to make some now-forgotten point, changed the search position – and had simply left the SDSS footprint.

    But I believe that was actually a good thing, because showing actual errors (it does not hurt if they are inadvertent) and at least brief attempts to understand them (and, possibly later, explain how one actually understood them) is a valuable part of any sort of (IT-related) education. Far too few people routinely attempt to understand what a computer is trying to tell them when it shows a message – at their peril.

    Reruns, House Calls, TV Shows

    Of course, there is a lot more one could say about the VO, even when mainly addressing users (as opposed to adopters). An obvious addition will be a lecture on the global dataset discovery API I have recently discussed here, and I plan to write it when the corresponding code will be in a pyVO release. I am also tempted to have something on stilts, perhaps in a side track. For instance, with a view to students going on to do tool development, in particular stilts' validators would deserve a few words.

    That said, and although I still did quite a bit of editing based on my experiences while teaching, I believe the material is by and large sound and up-to-date now. As I said: everyone is welcome to the material for tinkering and adoption. Hendrik and I are also open to give standalone courses on ADQL (about a day) or pyVO (two to three days) at astronomical institutes in Germany or elsewhere in not-too remote Europe as long as you house (one of) us. The complete course could be a 10-days block, but I don't think I can be booked with that[1].

    Another option would be a remote-teaching version of the course. Hendrik and I have discussed whether we have the inclination and the resources to make that happen, and if you believe something like that might fit into your curriculum, please also drop us a note.

    And of course we welcome all sorts of bug reports and pull requests on codeberg, first and foremost from people using the material to spread the VO gospel.

    [1]Well… let me hedge that I don't think I'd find a no in myself if the course took place on the Canary Islands…
  • DASCH is now in the VO

    Black dots on a white-ish background.  In the middle, some diffuse greyish stuff around a relatively large black dot.

    This frame would show comet 2P/Encke during its proximity to Earth in 1941 – if it went deep enough. But never mind practicalities: If you want to learn about matching ephemeris against the DASCH plate collection (or, really, any sort of obscore-like table), read on.

    For about a century – that is, into the 1980s –, being an observational astronomer meant taking photographic plates and doing tricks with them (unless you were a radio astronomer or one of the very few astronomers peeking beyond radio and optical in those days, of course). This actually is somewhat fortunate for archivists, because unlike many of the early CCD observations that by now are lost with our ability to read the tapes they were stored on, the plates are still there.

    Why Bother?

    However, to make them usable, the plates need to be digitised. In the GAVO data centre, we keep the results of several scan campaigns large and small, such as HDAP, the various data collections joined in the historical photographic plate image archive HPPA, or the delightfully quirky Münster Flare Plates.

    I personally care a lot about these data collections. This is partly because they are indispensible for understanding the history of astronomy. But more importantly, they are the next best thing we have to a time machine; if we have a way of knowing how the sky looked like seventy years ago, it is these plate collections. Having such a time machine is important for all kinds of scientific efforts, including figuring out whether there are aliens (i.e., 2016ApJ...822L..34S) on Tabby's Star.

    Somewhat to my chagrin, the cited paper 2016ApJ...822L..34S did not use the VO to obtain the plate images but went straight to DASCH's web interface. DASCH, in case you have not heard of it before, is probably the most ambitious project concerned with plate digitisation at the moment – or perhaps: “was”, because they just finished scanning the core part of Harvard's plate collections, which was their primary goal.

    I can understand why Bradley Schaefer, the paper's author, did not bother with a VO search In 2016. For starters, working with halfway homogeneous data from instruments you are somewhat familiar saves a substantial amount of work and thought, in particular if you are, in addition, up against the usual lack of machine-readable metadata. Also, at that time DASCH probably had about as many digitised plates as all the VO's contemporary plate collections taken together.

    DASCH: The Harvard Plates

    Given such stats, I have always wanted to have at least the metadata from DASCH's plates in the VO. Thanks to a recent update to DASCH's publication system, this is now a reality. Since 2024-04-29, I am publishing the metadata of the DASCH plates via Obscore and and SIAP2.

    Followup (2024-05-03)

    This is now DASCH news, and one of my two main contacts on the DASCH side, Peter Williams, has written an insightful post on this, too. Let me use this opportunity to thank him for the delightful cooperation, and extend these thanks to Ben Sabath, who is primarily responsible for the update to the DASCH publication system I mentioned above.

    Matching plates are returned as datalink documents, pointing to a preview, photos of the plate and its jacket, and links to the science data, once downsampled by a factor of 16, once in the original size (example). For now, #this points to the downsampled version, as Amazon charges DASCH about three cents per full-scale plate at the moment, and that can quickly add up by accident (there's nothing wrong with consciously downloading full-scale FITS-es if you need them, of course).

    This is a bit fishy in that the size of the image in the obscore/SIAP2 fields s_xel1 and s_xel2 refers to the unscaled image, and thus I should be returning the full-scale image as datalink #this. I hope I will not cause much confusion with this design.

    In case you look at the links in the datalink documents, let me include a disclaimer: Although they point into the GAVO data centre, the data is served courtesy of the DASCH project. The links only go to us because we need to sign links for you. I mention this because you can save the datalink documents and the links within them; the URLs you are redirected to from there, however, will expire fast. Just do not look at them.

    Plates in Global Discovery

    So – what can you do with DASCH in the VO that you could not do before?

    Most importantly, you will discover DASCH in registry interfaces and its datasets in global queries (in particular the global dataset queries I have discussed a few weeks ago). For instance, DASCH is now in Aladin's discovery tree:

    A screen shot with many selected points, highlighted in green, on the right side.  On the left side, an tree display with many branches folded in.  On a folded-out branch, there is “DASCH SIAP2“ highlighted.  On the right side, there is a large rectangle overplotted in red.

    You can now find DASCH in Aladin and do the usual “in view“ searches. However, currently this yields many matches that are, in practical terms, spurious, as they come from extremely wide-angle instruments. The red rectangle is the footprint of one of these images; note that the view here is a full two pi sky. We will probably do something about this “noise“.

    The addition of DASCH to the VO has a strong effect in some use cases. For instance, at the end of the GAVO plates tutorial, we do an all-VO obscore query that, at the time of the last update of the tutorial in 2019, yielded 4067 datasets (of course, including modern and/or non-optical observations) potentially showing some strongly lensed quasar. With DASCH – and, admittedly, a few more collections that came into the VO since 2019 –, that number is now 10'489; the range of observation dates grew from MJD 12550…52000 to MJD 9800…58600, with the mean decreasing from 51'909 to 30'603. That the mean observation date moves that much back in time is a certain sign that a major part of the expansion is due to DASCH (well, and certainly to APPLAUSE, too).

    Followup (2024-05-03)

    As discussed in my DASCH update, I have taken out the large-coverage plates from my obscore table, which changes the stats (but not the conclusions) quite a bit. They is now 10'098 plates and mean observation date 36'396

    TAP, Uploads, and pyVO on DASCH

    But this is not just about bringing astronomical heritage to the VO. It is also about exposing DASCH through the powerful ADQL/TAP interface. As an example of how this may be useful, consider the comet P2/Encke, which, according to JPL's Small-Body Database was relatively close to Earth (about half an AU) in May 1941. It would have had about 14.5 mag at that point and hence was safely within reach of several of the instruments archived in DASCH. Perhaps we can find serendipitous or even targeted observations of the comet in the collection?

    The plan to find that out is: compute an ephemeris (we are lazy and use an external service, Miriade ephemcc) and then for each day see whether there are DASCH observations in the vicinity of the sky location obtained in this way.

    As usual, it's never that easy because the call to the ephemeris webservice (paste the link into TOPCAT to have a look) returns cursed sexagesimal coordinates. We need to fix them before doing anything serious with the table, and while we are at it, we also repair the date, which is simpler to consume if it is MJD to begin with. Getting the ephemeris thus takes quite a few lines:

    from astropy import table
    from astropy import units as u
    from astropy.coordinates import SkyCoord
    from astropy.time import Time
    
    ephem = table.Table.read(
      "https://vo.imcce.fr/webservices/miriade/ephemcc.php?-from=vespa"
      "&-name=c:p/encke&-ep=1941-04-01&-nbd=90&-step=1d&-observer=500"
      &-mime=votable")
    
    parsed = SkyCoord(ephem["ra"], ephem["dec"], unit=(u.hourangle, u.deg))
    ephem["ra"] = parsed.ra.degree
    ephem["dec"] = parsed.dec.degree
    
    parsed = Time(ephem["epoch"])
    ephem["epoch"] = parsed.mjd
    

    Compared to that, the actual matching against DASCH is almost trivial if you are somewhat familiar with crossmatching in ADQL and the Obscore schema:

    svc = pyvo.dal.TAPService("http://dc.g-vo.org/tap")
    res = svc.run_sync("""
        SELECT *
        FROM
            dasch.plates
            JOIN tap_upload.orbit
            ON (1=CONTAINS(POINT(ra, dec), s_region))
        WHERE
            t_min<epoch
            AND t_max>epoch""",
        uploads={"orbit": ephem})
    

    Followup (2024-05-03)

    You would probably query the dasch.narrow_plates table in actual operations; querying dasch.plates is probably more for people interested in the history of astronomy or DASCH itself.

    Inspect the query for a moment: This is a normal upload join, except we are constructing an ADQL POINT on the fly to be able to see whether we are in the spatial region covered by a DASCH dataset (given in obscore's s_region column). We could have put the temporal condition into the join's ON; but I think the intention is somewhat clearer with the WHERE constraint, and the database engine will probably go through identical motions for both queries – the beauty of having a query planner in the loop is that you do not need to think about such details most of the time.

    Actually, in this case there is one last complication: As said above, we have put a datalink service between you and the downloads to discourage accidental large downloads. We hence use pyVO's (suboptimally documented) datalink interface (iter_datalinks):

    with pyvo.samp.connection() as conn:
        for dl in res.iter_datalinks():
            link = next(dl.bysemantics("#preview-image"))
            pyvo.samp.send_image_to(
                conn,
                link.access_url,
                client_name="Aladin")
    

    Among the artefacts available we pick the scaled jpegs in this fragment (#preview-image), since these are almost free even on the Amazon cloud. Change that #preview-image to #this in the to get scaled calibrated FITS-es, which are still fairly small. This would, for instance, let you overplot the ephemeris in Aladin, which you cannot do with the jpegs as they lack astrometric calibration (for now). But even with #preview-image, we can use Aladin as a glorified image viewer by SAMP-sending the images there, which is why we do the minor magic with functions from pyvo.samp.

    If you want to try this yourself or mangle the program to do something else that requires querying against a reasonable number positions in time and space, just get encke.py and hack away. Make sure to start Aladin before running the program so it has something to send the images to.

    Disclaimers

    This is a contrived example, and it is likely that this particular use case is astronomically wrong in several ways. Let me enumerate a few things that would need looking into before this approaches proper science:

    • We compute the ephemeris for the center of the Earth. At half an AU distance, the resulting parallax will not shift the position enough to hide a plate we should know about, but at least for anything closer, you should try to do a bit better; admittedly, for a resource like DASCH – that contains plates from observatories all over the place – you will have to compromise.
    • The ephemeris is probably wrong; comet's orbits change over time, and I have no idea if the ephemeris service actually uses 2P/Encke's 1941 orbit to compute the positions.
    • The coordinate metadata may be wrong. Ephemcc's documentation says something that sounds a lot as if they were sometimes returning RA and Dec for the equator of the time rather than for J2000 (i.e., ICRS for all intents and purposes), but of course our obscore coverages are for the ICRS. Regrettably, the VOTable returned by the service does not contain a COOSYS element yet, and so there is no easy way to tell.
    • If you look at the table with DASCH matches, you will see they all were observed with an extremely wide-angle instrument sporting an aperture of a mere three inches. Even at the whopping exposure times (two hours), there is probably no way you would see a diffuse object of 14th mag on a plate with a 1940s-era photographic emulsion with that kind of optics (well: feel free to prove me wrong).
    • It would of course be a huge waste of bandwidth to pull the entire plates if we already had a good idea of where we would expect the comet (i.e., had a reliable ephemeris). Hence, a cutout service that would let you retrieve more or less exactly the pixels you would like to use for your research and not the cruft around it would be a nifty supplement. It's in the works, and I'd say you can almost hold your breath. The cutout will simply appear as a SODA service in the datalink documents. See 2020ASPC..522..295D for how you would operate such a service.
  • Updates to GAVO's Tutorials

    Over the years, GAVO has produced a number of VO tutorials, i.e., texts that introduce some technique related to using the Virtual Observatory, preferably within some halfway plausible scenario. In effect, they are software documentation, and as software itself, software documentation suffers from bit rot. To work against that, the tutorials have to be revised occasionally.

    My two student assistants Sonja Gabriel and Chuanming Mao have recently done some of that revising. Let me use this opportunity to show off some of these freshly polished tutorials.

    A classic one (that has, if I may say so myself, aged rather well), is Adding catalog data to object lists using the VO. This is a thinly disguised introduction to TAP uploads, arguably the most powerful of all the VO tech to date. If you have come to this place without ever having done a TAP upload, you owe it to yourself to at least skim the tutorial and quickly follow along the few steps to do positional crossmatches with just about any astronomical catalog and with just about any level of sophistication.

    part of a screenshot: a histogram, a sky photo with overplotted points

    Another classic – it has its roots in the original Italian VO Days[1] – is TOPCAT and Aladin working together. It is using SDSS data of some galaxy cluster to try and get you to to send around data and positions between different programs using SAMP. If you are reading VO blogs, it is not unlikely this kind of thing will make you yawn. But at VO Days, it's little things like this that usually most immediately appeal to students and researchers alike.

    part of a screenshot: a color-magnitude diagram is a very narrow main sequence, and a proper motion plot

    From a tech point of view, Explore the Pleiades with TOPCAT and Aladin also mainly looks at SAMP (perhaps even somewhat less convincingly), but it's such a striking demo of what an amazing instrument Gaia is, and it's a nice introduction to TOPCAT's VO interface and subsetting facility that it's definitely worth a look, in particular as a showcase of having instant results with the VO.

    circular cloud of red crosses and blue circles in a celestial coordinate system

    An entirely different topic (well: it also employs SAMP for a moment) is covered by Data Discovery Using the Virtual Observatory Registry. This is trying to motivate looking for data collections in the VO Registry (in the form of our Browser interface to it). This tutorial has grown quite a bit during the review and now includes two sections joining data from different resources for various purposes. One section illustrates how systematics of quasar redshifts might be looked into using different sources, the other investigates the Tully-Fisher relationship in different spectral bands.

    A TOPCAT-plotted histogram with a sharp peak around 39.5 AU and a much wider one around 44.

    The tutorial on Asteroids in the Solar System was entriely overhauled. It was (and still is) mainly intended to be used in schools, and thus it originally just built on things that ran in a web browser. As is typical of things in web browsers, they have long since vanished. Hence, a rather fundamental update was necessary anyway. While we were looking for interesting things to do – the plot above, by the way, is the distribution of major halfaxes in the Kuiper belt –, we ended up even includeding a brief bit on ADQL.

    Due to its school focus, we are also offering this particular text in German as well as in English. If you are an Astronomy teacher with particularly motivated pupils , we would like to hear from you…

    An aladin window showing two aligned photos of the ring nebula in Lyra

    The last revised tutorial I would like to mention also has a somewhat special (main) target audience: Astrometric Calibration using Aladin. Admittedly, automatic, or “blind” calibration has become really great, and I think getting their images located on the sky is not much of a problem even for amateurs any more, thanks in part to services like astrometry.net. But then – sometimes there is nothing like a good, old manual, ummm, “plate” solution. Aladin and the VO make that lot less tedious than it used to be.

    Of course, I cannot have a post on tutorials without mentioning the VO Text Treasures, a web page that shows the educational material currently registered in the VO Registry. This little page also accounts for bit rot: You can sort by the time last inspected there, and thanks to Sonja's and Chuanming's efforts, our tutorials look very good in that representation at the moment.

    In case you have some material suitable for WIRR yourself: Please register it, too. Send me a mail and I will lend you a hand (or, if you are a VO pro, directly read the pertinent standard).

    [1]That's block courses on VO matters lasting a day or two. If you are in Germany, you can book us for your very own one!
  • From Byurakan to L2: Short Spectra

    A snapshot from the DFBS tutorial: Carbon Stars in different spectral bands.

    A snapshot from the DFBS tutorial: Carbon Stars in different spectral bands.

    On June 30, a small project we've done together with the Armenian Virtual Observatory has ended. Its objective was to publish the spectra from the First Byurakan Survey (the DFBS) in a VO-compilant way. The data comes from one of the big surveys with Schmidt telescopes that form a sizable part of the observational heritage from the second part of the 20th century (you're still using a few of them daily if you tell Aladin to show a DSS plane).

    In this case, spectra from objects on the entire northern sky off the milky way down to about 18th mag were obtained. In a previous cooperation between Armenian and Italian astronomers a good decade ago, the plates were digitised and calibrated, and spectra were extracted. However, they resided behind a web interface so far, which made them somewhat clumsy to work with.

    Now, they're in the VO, and to give you a few ideas for what kind of things you can do with this kind of data, within the project we've also written the tutorial “Outlier Analysis in Low-Resolution Spectra”.

    Have a glance at the tutorial – you see, while the Byurakan survey certainly is a valuable resource by itself, I happen to believe at this point it's particularly valuable because with the next Gaia data release (planned for next year), a deluxe version of it will come: Gaia's RP/BP spectra will be all-sky, properly calibrated, and quite a bit deeper, but still low-resolution. So, if you're just waiting for such a data collection, you can train your methods right now on the DFBS.

Page 1 / 2 »