Researchers Uncover Extensive Sequence Divergence between Reference Genomes of Two Zebrafish Strains

Because 70% of protein-coding genes are conserved between zebrafish and humans, the zebrafish (Danio rerio) has become one of the most widely used model organisms for studying vertebrate gene function and human disease. Tuebingen and AB are the two most common laboratory zebrafish strains. The zebrafish reference genome is derived from the Tuebingen strain; the AB strain is also used worldwide, but it has lacked a high-quality genome assembly comparable to that of the Tuebingen strain.

A research group led by Prof. HE Shunping from the Institute of Hydrobiology (IHB) of the Chinese Academy of Sciences has reported a high-quality de novo genome assembly of the AB strain (DrAB1) and uncovered extensive sequence divergence between the reference genomes of the two zebrafish strains, Tuebingen and AB. The study was published in Molecular Ecology Resources.

In this study, the researchers combined Illumina short-read sequencing, Nanopore long-read sequencing and Hi-C-based chromatin mapping to generate a 1.40 Gb representative de novo genome assembly of the AB strain (DrAB1) with a contig N50 of 21 Mb. Compared with the published zebrafish Zv11 reference genome (GRCz11), this assembly shows considerable improvements in both contiguity and completeness.
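
For readers unfamiliar with the metric, contig N50 is the contig length at which contigs of that length or longer contain at least half of all assembled bases. The short Python sketch below illustrates the calculation. The function name `n50` and the toy contig lengths are illustrative assumptions, not code or data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk from the longest contig down, accumulating assembled bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to at least half
        # of the assembly defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units, not values from the study).
toy_contigs = [8, 7, 5, 4, 3, 2, 1]
print(n50(toy_contigs))  # 8 + 7 = 15 reaches half of 30, so this prints 7
```

A larger N50 means that the assembly is made up of fewer, longer pieces, which is why the value is reported as a measure of contiguity.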

Through whole-genome comparison, the researchers uncovered substantial structural differences and extensive sequence divergence between the two zebrafish strains at unprecedented resolution, including 9,029,929 single-nucleotide polymorphisms (SNPs), 2,376,812 InDels, 32,623 insertions, 22,089 deletions and 220 inversions, which together constitute ~2.6% of the DrAB1 genome.

“Many of these variants may have potential functional effects on phenotype,” said Prof. HE. Among the coding sequences affected by the candidate structural variations, the researchers found genes that are predicted or reported to be expressed in the nervous system or to be associated with brain development and photoreceptor cell development.

Furthermore, by comparing the two zebrafish genomes, the researchers found that 32.8 Mb of the long arm of chromosome 4 is deleted in the DrAB1 assembly. These regions consist mainly of repeat elements and genes encoding zinc-finger proteins. Considering the wide range of molecular functions of zinc-finger proteins, these data can serve as a reference for choosing an appropriate zebrafish strain in future studies.

According to the results of this study, the two zebrafish strains may harbor dramatically different complements of proteins and regulatory sequences. This suggests that strain-specific genetic variation should be considered in experimental design, since it could confound studies whose results are intended to be applied or translated to human diseases.
(Editor: MA Yun)