Chinese Scientists Construct Transcriptome-related Databases for Fish

With the rapid expansion of transcriptome studies in many fishes, large amounts of RNA-seq data have been published, allowing for a more systematic understanding of the general profiles and details of gene expression in fishes.   

Transcriptome databases offering extensive RNA-seq, scRNA-seq and spatial transcriptome information have already been reported for human, mouse, plants and other organisms, but no comparable database has been reported for fish. A large amount of useful fish omics information remains buried in raw sequencing data, and only a small part of it is scattered across various general-purpose databases, leaving fish researchers with few online resources.

Recently, a research group led by Prof. XIA Xiaoqin from the Institute of Hydrobiology (IHB) of the Chinese Academy of Sciences collected fish transcriptome data and established three related databases: the bulk RNA-seq database FishGET (Fish Transcriptome and Expression Database), the scRNA-seq database FishSCT (Fish Single-Cell Transcriptome Database), and the spatial transcriptome database FishSED (Fish Spatial Expression Database). These studies were published in iScience and Science China Life Sciences (FishGET, FishSCT, FishSED).

FishGET contains a total of 1,362 sets of paired-end RNA-seq data (covering mRNA and lncRNA) from 97 studies of eight fish species, including zebrafish, grass carp and rainbow trout. The researchers performed transcript assembly, weighted gene co-expression network analysis (WGCNA), functional annotation, neighbor location annotation, lncRNA type annotation, homology annotation, and other analyses on the data. The website also provides a variety of dynamic, interactive visualization services for querying and displaying gene expression and co-expression networks in various tissues and organs at different developmental stages of fish, with the aim of promoting research into fish genes at the transcriptional level.

FishSCT contains scRNA-seq data for nine fish species and is the most complete online resource for zebrafish single-cell transcriptome data. Through unified analysis of 129 datasets from 44 studies published prior to October 2022, the researchers obtained information on 964 marker genes and 26,965 potential marker genes, as well as expression profiles at single-cell resolution (646,641 cells), covering 245 cell types across the nine species. The zebrafish data form the backbone of the database, comprising 848 marker genes and 13,800 potential marker genes for 222 cell types and covering tissues and organs at various stages of zebrafish growth and development. FishSCT provides a user-friendly web interface for browsing the expression patterns and marker information of target genes, along with a cell type identification function to assist researchers in analyzing scRNA-seq data.

FishSED collates the published raw data on the zebrafish spatial transcriptome, covering spatial expression profiles from 56 datasets in 10 projects. The sample types include embryos and several other tissues at all stages of development. After analysis and processing, 3D gene expression profiles covering five sequencing techniques were obtained, and an interactive platform for fish spatial transcriptome data was established. As the only database dedicated to providing spatial transcriptome data for fish, FishSED offers visualization services tailored to each sequencing technology and also lets users search and map multi-gene expression patterns across datasets, making it easier for researchers to carry out comparative analyses.

Together, the three databases provide spatial and temporal gene expression profiles from a variety of experiment types across the entire course of development, offering a reference for the study of gene function in fish.

Fish Gene Expression Databases (Image by IHB)

(Editor: MA Yun)