Secure data-out API - enabling encrypted htsget transactions
The European Genome-phenome Archive (EGA) is a service for archiving and sharing personally identifiable genetic and phenotypic data, while the Genomic Data Infrastructure (GDI) project is enabling access to genomic and related phenotypic and clinical data across Europe. Both projects aim to build federated, secure infrastructure that lets researchers archive data and share it with the research community to support further research.

This proposal focuses on the data access part of that infrastructure. Files are stored encrypted in the archives using the Crypt4GH standard. The existing data access processes either decrypt the files on the server side before transferring them to the user, or re-encrypt them server-side and provide them to the user in an outbox.

Htsget, as a data access protocol, also allows access to parts of files, but there are currently no production-level client tools that support access to encrypted data. Our goal is to create a client tool that can access encrypted data over the htsget protocol. The tool should also work with the GA4GH Passport and Visa standard, so that we can enhance the security of our data access interfaces. We will also modify htsget-rs, a Rust htsget server, and crypt4gh-rust as required to support these standards. Finally, we will work to implement this feature in existing tools such as samtools and IGV.
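To illustrate the kind of partial-file access described above, here is a minimal Python sketch of the client side of an htsget request. It only builds the ticket URL for a genomic region and parses the JSON ticket the server returns; it does not perform any network transfer or Crypt4GH decryption. The host name, dataset ID, and token are hypothetical placeholders, and passing the Passport access token as a bearer token is an assumption about how a deployment would be configured, not a description of the planned tool.

```python
import json
from urllib.parse import urlencode


def build_htsget_url(base_url, dataset_id, reference_name, start, end, fmt="BAM"):
    """Build an htsget 'reads' ticket URL for one genomic region."""
    query = urlencode({
        "format": fmt,
        "referenceName": reference_name,
        "start": start,
        "end": end,
    })
    return f"{base_url.rstrip('/')}/reads/{dataset_id}?{query}"


def auth_headers(access_token):
    """Assumption: the access token is sent as an OAuth2 bearer token."""
    return {"Authorization": f"Bearer {access_token}"}


def parse_ticket(ticket_json):
    """Return (url, headers) pairs from an htsget ticket, in download order."""
    ticket = json.loads(ticket_json)["htsget"]
    return [(entry["url"], entry.get("headers", {})) for entry in ticket["urls"]]


# Hypothetical example values, for illustration only.
EXAMPLE_TICKET = json.dumps({
    "htsget": {
        "format": "BAM",
        "urls": [
            {"url": "https://data.example.org/block1",
             "headers": {"Range": "bytes=0-1023"}},
            {"url": "data:;base64,AAAA"},
        ],
    }
})

if __name__ == "__main__":
    print(build_htsget_url("https://htsget.example.org", "sample1", "chr1", 100, 200))
    for url, headers in parse_ticket(EXAMPLE_TICKET):
        print(url, headers)
```

A client would fetch each returned URL in order, with its headers, and concatenate the bodies. For encrypted data, the concatenated stream would additionally need to be decrypted with the user's Crypt4GH key, which is the part existing client tools do not yet support.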