Implementing SIAv2 Over Rubin Observatory's Data Butler
Jenness, Voutsinas, Dubois-Felsmann et al.
The IVOA Simple Image Access version 2 protocol defines an easy way to provide community access to a collection of data. At the Vera C. Rubin Observatory we currently enable ObsTAP access to our data holdings via an ObsCore export or view of our Data Butler repositories. This approach comes with some deployment constraints, such as requiring pgsphere and compatibility with our CADC TAP implementation, so recently we decided to see whether we could instead provide an SIAv2 service that talks directly to our Data Butler. Here we describe our motivation, implementation strategies, and current deployment status, as well as discussing some metadata mismatches between the Butler data models and SIAv2.
The IVOA Simple Image Access Protocol version 2 (SIAv2) defines a straightforward method for providing community access to data collections. At Vera C. Rubin Observatory, we currently implement ObsTAP data access through ObsCore exports or views from the Data Butler repository. However, this approach has deployment constraints, such as requiring pgsphere support and compatibility with CADC TAP implementations. Consequently, we explored whether we could provide an SIAv2 service that communicates directly with the Data Butler. This paper describes our motivation, implementation strategy, current deployment status, and several metadata mismatch issues between the Butler data model and SIAv2.
Rubin Observatory's Data Butler system consists of a metadata registry and file data storage, with the registry containing sufficient information to construct ObsCore records. Previously, two approaches were used to provide ObsCore tables:
Export records as CSV or Parquet files and load them into a static database
Use registry backend hooks to provide real-time synchronization to ObsCore tables
Static Export Method: Suitable for formal data releases and for integration into high-performance Qserv databases, but unsuitable for dynamic datasets such as nightly rapid products.
Real-time ObsCore Method: Requires pgsphere support in the deployment environment, and the entire table must be rebuilt whenever the configuration changes.
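As a rough illustration of the static export approach, the sketch below turns a few registry-like metadata rows into ObsCore-style records and writes them as CSV. The input row layout, its keys, and the helper names `to_obscore` and `export_csv` are assumptions made for this example and are not the Butler API; only the ObsCore column names are taken from the IVOA ObsCore data model.

```python
import csv
import io

# Subset of ObsCore columns (names from the IVOA ObsCore data model).
OBSCORE_COLUMNS = [
    "dataproduct_type", "obs_id", "s_ra", "s_dec",
    "t_min", "t_max", "instrument_name",
]


def to_obscore(row: dict) -> dict:
    """Map one hypothetical registry row to an ObsCore-style record."""
    return {
        "dataproduct_type": "image",
        "obs_id": f"{row['instrument']}-{row['visit']}-{row['detector']}",
        "s_ra": row["ra_deg"],
        "s_dec": row["dec_deg"],
        "t_min": row["mjd_start"],
        # Exposure time is in seconds; ObsCore times are MJD (days).
        "t_max": row["mjd_start"] + row["exptime_s"] / 86400.0,
        "instrument_name": row["instrument"],
    }


def export_csv(rows) -> str:
    """Serialize the rows as a CSV document with an ObsCore header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=OBSCORE_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(to_obscore(row))
    return buffer.getvalue()
```

Because the output is a snapshot, any new dataset requires another export and load, which is why this route fits fixed data releases better than nightly products.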
These limitations prompted us to seek a simpler yet standardized query layer built directly on the Butler. The IVOA SIAv2 protocol became the obvious choice because:
A direct Butler interface provides greater flexibility
Configuration changes require only a simple service restart
The goal is to map IVOA SIAv2 protocol queries directly onto the Rubin Data Butler query system, providing a standardized astronomical data access interface while avoiding the deployment constraints of the ObsCore table methods.
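The first step of such a mapping can be sketched as parsing the SIAv2 query parameters into structured constraints. The example below is a minimal sketch, not the Rubin implementation: the `POS` shapes (`CIRCLE`, `RANGE`, `POLYGON`) and the numeric `TIME` and `BAND` values follow the SIAv2 standard, while the class names, `parse_query`, and the returned dictionary layout are assumptions for illustration. Translating these constraints into an actual Butler query is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Circle:
    ra: float
    dec: float
    radius: float


@dataclass(frozen=True)
class Range:
    ra_min: float
    ra_max: float
    dec_min: float
    dec_max: float


@dataclass(frozen=True)
class Polygon:
    vertices: tuple  # sequence of (ra, dec) pairs in degrees


def parse_pos(value: str):
    """Parse an SIAv2 POS value such as 'CIRCLE 10.0 -5.0 0.5'."""
    parts = value.split()
    if not parts:
        raise ValueError("empty POS value")
    shape, numbers = parts[0], [float(p) for p in parts[1:]]
    if shape == "CIRCLE" and len(numbers) == 3:
        return Circle(*numbers)
    if shape == "RANGE" and len(numbers) == 4:
        return Range(*numbers)
    if shape == "POLYGON" and len(numbers) >= 6 and len(numbers) % 2 == 0:
        return Polygon(tuple(zip(numbers[0::2], numbers[1::2])))
    raise ValueError(f"unsupported POS value: {value!r}")


def parse_interval(value: str):
    """Parse a TIME or BAND value: one scalar or a 'min max' pair."""
    numbers = [float(p) for p in value.split()]
    if len(numbers) == 1:
        return (numbers[0], numbers[0])
    if len(numbers) == 2 and numbers[0] <= numbers[1]:
        return (numbers[0], numbers[1])
    raise ValueError(f"invalid interval: {value!r}")


def parse_query(params: dict) -> dict:
    """Collect the recognized SIAv2 parameters into one constraint dict."""
    constraints = {}
    if "POS" in params:
        constraints["region"] = parse_pos(params["POS"])
    if "TIME" in params:
        constraints["time"] = parse_interval(params["TIME"])  # MJD
    if "BAND" in params:
        constraints["band"] = parse_interval(params["BAND"])  # metres
    return constraints
```

Keeping the parsed constraints separate from the backend call means the same parsing layer could feed either a Butler query or an ObsCore table lookup.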
Summary: This paper presents a practical engineering case that solves real deployment problems while remaining compatible with international standards. Although data model mismatches between the Butler and SIAv2 remain a challenge, the implementation provides a useful reference and tooling for astronomical data management.