DocTIS makes key molecular datasets and analytical resources available to the scientific community
As the DocTIS project approaches its conclusion, the consortium is making available a growing collection of molecular datasets, metadata and analytical tools generated throughout six and a half years of research into immune-mediated inflammatory diseases (IMIDs).
These resources represent one of the most comprehensive molecular and computational collections developed within the project, supporting future research on disease mechanisms, treatment response and precision medicine approaches across rheumatoid arthritis, psoriatic arthritis, psoriasis, ulcerative colitis, Crohn’s disease and systemic lupus erythematosus.
In line with the FAIR (Findable, Accessible, Interoperable and Reusable) data principles adopted throughout the project, DocTIS has deposited datasets and analytical resources in public and controlled-access repositories to facilitate transparency, reproducibility and long-term reuse by the scientific community.
The resources have been generated through the collaborative work of the DocTIS consortium, coordinated by the Vall d’Hebron Research Institute, VHIR (Sara Marsal), and involving Cardiff University (Ernest Choy), the University of Verona (Giampiero Girolomoni), Charité – Universitätsmedizin Berlin (Britta Siegmund), the Institut d’Investigacions Biomèdiques August Pi i Sunyer, IDIBAPS (Pere Santamaria), the Centro Nacional de Análisis Genómico, CNAG (Holger Heyn), IMIDomics Inc. (Manuel Lopez-Figueroa), HudsonAlpha Institute for Biotechnology (Richard M. Myers), Linköping University (Mikael Benson), Karolinska Institutet (Mikael Benson) and Zabala Innovation.
Publicly available resources
1- Preprocessed scRNA-sequencing data and metadata for psoriatic arthritis (PsA) patients treated with anti-IL17 therapy
This dataset contains preprocessed single-cell RNA sequencing (scRNA-seq) data and associated metadata from PsA patients classified as responders or non-responders to anti-IL17 treatment at baseline, together with healthy controls.
- Repository: Figshare
- Contact person: Samuel Schäfer (Linköping University)
- Associated publication: scDrugPrio: a framework for the analysis of single-cell transcriptomics to address multiple problems in precision medicine in immune-mediated inflammatory diseases (Genome Medicine)
- Status: Data available.
2- Preprocessed scRNA-sequencing data and metadata for psoriatic arthritis (PsA) patients treated with anti-TNF therapy
This dataset contains preprocessed scRNA-seq data and metadata from PsA patients classified as responders or non-responders to anti-TNF treatment at baseline, together with healthy controls.
- Repository: Figshare
- Contact person: Samuel Schäfer (Linköping University)
- Associated publication: scDrugPrio: a framework for the analysis of single-cell transcriptomics to address multiple problems in precision medicine in immune-mediated inflammatory diseases (Genome Medicine)
- Status: Data available.
3- Preprocessed scRNA-sequencing data and metadata for all IMIDs at baseline and healthy controls
This resource provides harmonised and preprocessed scRNA-seq data from patients across all IMIDs included in DocTIS, together with healthy controls, enabling cross-disease analyses of immune-cell states and inflammatory mechanisms.
- Repository: Zenodo
- Contact person: Holger Heyn (CNAG)
- Associated publication: Interpretable inflammation landscape of circulating immune cells (Nature Medicine)
- Status: Data available.
4- Code repository for the paper “Interpretable Inflammation Landscape of Circulating Immune Cells”
The repository contains the complete analytical workflows and code used to generate the results presented in the publication, facilitating reproducibility and reuse.
- Repository: GitHub
- Contact person: Holger Heyn (CNAG)
- Associated publication: Interpretable inflammation landscape of circulating immune cells (Nature Medicine)
- Status: Code available.
Resources awaiting release following publication
5- Raw scRNA-sequencing data and metadata for all IMIDs at baseline and healthy controls
This dataset contains the raw single-cell transcriptomic data and associated metadata generated across all IMIDs included in the project.
- Repository: European Genome-Phenome Archive (EGA)
- Contact person: Sara Marsal (VHIR)
- Associated publication: Interpretable inflammation landscape of circulating immune cells (Nature Medicine)
- Status: Data submitted.
The dataset has been submitted to EGA but has not yet been released. Following release, access will be provided through EGA under controlled-access procedures.
6- Raw scRNA-sequencing data and metadata for all IMIDs at follow-up
This resource contains longitudinal scRNA-seq profiles and metadata generated during patient follow-up across all IMIDs included in the project.
- Repository: European Genome-Phenome Archive (EGA)
- Contact person: Sara Marsal (VHIR)
- Status: Data uploaded.
The dataset has been uploaded to the repository but has not yet been formally submitted for release. Submission is planned following acceptance of the associated publications (Martínez-Mateu et al., under revision, and Guillén et al., in preparation), currently expected by the end of 2026. Once submitted, the data will be available upon request through EGA under controlled-access procedures.
7- Raw bulk RNA-seq data and metadata for all IMIDs at baseline and follow-up
This dataset contains raw bulk transcriptomic sequencing data and associated metadata generated from patients across all IMIDs, covering both baseline and follow-up time points.
- Repository: European Genome-Phenome Archive (EGA)
- Contact person: Sara Marsal (VHIR)
- Status: Data uploaded.
The dataset has been uploaded to the repository but has not yet been formally submitted for release. Submission is expected following acceptance of the associated publication (Martínez-Mateu et al. under revision), currently anticipated by the end of 2026. Once submitted, access will be provided through EGA under controlled-access procedures.
8- Preprocessed bulk RNA-seq and additional scRNA-seq datasets
This resource includes preprocessed bulk RNA-seq data for all IMIDs at baseline and follow-up and for healthy controls, together with scRNA-seq data generated from rheumatoid arthritis donors at baseline and follow-up and from healthy donors.
The dataset has been uploaded and will be released following acceptance of the associated publication (Martínez-Mateu et al., under revision), currently expected by the end of 2026. Once released, the data will be available upon request.
9- Code repository supporting the combinatorial therapy analyses
This repository contains the analytical code used in the combinatorial therapy study described in the manuscript by Martínez-Mateu et al., currently under review.
- Repository: GitHub
- Contact person: Sergio Martínez (IMIDomics)
- Status: Code uploaded.
The code has already been uploaded and will be publicly released once the associated publication has been accepted.
Summary of DocTIS data and code resources
| Resource | Repository | Status | Contact person |
| --- | --- | --- | --- |
| Preprocessed scRNA-seq data and metadata from PsA patients treated with anti-IL17 therapy | Figshare | Available | Samuel Schäfer (Linköping University) |
| Preprocessed scRNA-seq data and metadata from PsA patients treated with anti-TNF therapy | Figshare | Available | Samuel Schäfer (Linköping University) |
| Harmonised preprocessed scRNA-seq data from all IMIDs and healthy controls | Zenodo | Available | Holger Heyn (CNAG) |
| Analytical code for “Interpretable inflammation landscape of circulating immune cells” | GitHub | Available | Holger Heyn (CNAG) |
| Raw scRNA-seq data and metadata from all IMIDs at baseline and healthy controls | European Genome-Phenome Archive (EGA) | Submitted, pending publication | Sara Marsal (VHIR) |
| Raw scRNA-seq data and metadata from all IMIDs at follow-up | European Genome-Phenome Archive (EGA) | Uploaded, pending publication | Sara Marsal (VHIR) |
| Raw bulk RNA-seq data and metadata from all IMIDs at baseline and follow-up | European Genome-Phenome Archive (EGA) | Uploaded, pending publication | Sara Marsal (VHIR) |
| Preprocessed bulk RNA-seq and additional scRNA-seq datasets | Zenodo | Uploaded, pending publication | Sara Marsal (VHIR) |
| Code repository supporting combinatorial therapy analyses | GitHub | Uploaded, pending publication | Sergio Martínez (IMIDomics) |
Supporting future research in IMIDs
The datasets generated within DocTIS combine single-cell and bulk transcriptomic information, clinical metadata and computational workflows across multiple immune-mediated inflammatory diseases. Together, they provide a valuable resource for researchers interested in disease heterogeneity, treatment response, biomarker discovery and systems medicine approaches.
By making these resources available through recognised public and controlled-access repositories, the DocTIS consortium aims to maximise the long-term scientific impact of the project and support future advances in precision medicine for patients living with IMIDs.