Huynh Ky*, Van Quoc Giang, Nguyen Loc Hien, Nguyen Chau Thanh Tung, Huynh Nhu Dien, Nguyen Nhut Thanh, Vo Cong Thanh and Yeap Swee Keong

* Corresponding author (hky@ctu.edu.vn)

Abstract

High-throughput sequencing is generating increasingly comprehensive catalogs of crop genetic variation. To make optimal use of this vast body of data for research, a robust and reproducible analytical pipeline is required that can accurately detect and prioritize variants. Whole-genome sequencing data from the rice variety Nang Thom Cho Dao were analyzed using a bioinformatic pipeline. A total of 21 million reads, amounting to 6.6 GB of data, were analyzed. SNPs and InDels were identified in the Nang Thom Cho Dao genome by comparison with the Nipponbare reference rice genome. The results revealed a novel InDel in the BADH2 gene of the Nang Thom Cho Dao genome. This study contributes valuable information for developing genetic markers for rice breeding strategies using the Nang Thom Cho Dao rice variety.

Keywords: Bioinformatic, InDels, Nang Thom Cho Dao, SNPs, Variants
