Deep polygenic neural network for predicting and identifying yield-associated genes in Indonesian rice accessions

Lee, S., Lozano, A., Kambadur, P. & Xing, E. P. An efficient nonlinear regression approach for genome-wide detection of marginal and interacting genetic variations. J. Comput. Biol. 23, 372–389 (2016).
Banerjee, S., Zeng, L., Schunkert, H. & Söding, J. Bayesian multiple logistic regression for case-control GWAS. PLoS Genet. 14, 1–27 (2018).
Yoo, Y. J., Sun, L. & Bull, S. B. Gene-based multiple regression association testing for combined examination of common and low frequency variants in quantitative trait analysis. Front. Genet. 4, 1–17 (2013).
Yoo, Y. J., Sun, L., Poirier, J. G., Paterson, A. D. & Bull, S. B. Multiple linear combination (MLC) regression tests for common variants adapted to linkage disequilibrium structure. Genet. Epidemiol. 41, 108–121 (2017).
Li, X. et al. Genetic control of the root system in rice under normal and drought stress conditions by genome-wide association study. PLoS Genet. 13, 1–24 (2017).
McMahan, C. et al. A Bayesian hierarchical model for identifying significant polygenic effects while controlling for confounding and repeated measures. Stat. Appl. Genet. Mol. Biol. 16, 407–419 (2017).
International Rice Genome Sequencing Project. The map-based sequence of the rice genome. Nature 436, 793–800 (2005).
Yao, W. et al. Exploring the rice dispensable genome using a metagenome-like assembly strategy. Genome Biol. 16, 1–20 (2015).
Zhao, H. et al. RiceVarMap: A comprehensive database of rice genomic variations. Nucleic Acids Res. 43, D1018–D1022 (2015).
Chen, H. et al. A high-density SNP genotyping array for rice biology and molecular breeding. Mol. Plant 7, 541–553 (2014).
Food and Agriculture Organization of the United Nations. FAO’s Director-General on how to feed the world in 2050. Popul. Dev. Rev. 35, 837–839 (2009).
World Population Review. Megadiverse Countries 2020. https://worldpopulationreview.com/country-rankings/megadiverse-countries (2020).
UN DESA. World Population Prospects. https://population.un.org/wpp/Graphs/Probabilistic/POP/TOT/360 (2019).
Goff, S. A. et al. A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296, 92–100 (2002).
Yu, J. et al. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296, 79–92 (2002).
Jiang, C. K. et al. Identification and distribution of a single nucleotide polymorphism responsible for the catechin content in tea plants. Hortic. Res. 7, 1–9 (2020).
Sapkota, S., Boatwright, J. L., Jordan, K., Boyles, R. & Kresovich, S. Identification of novel genomic associations and gene candidates for grain starch content in sorghum. Genes (Basel). 11, 1–15 (2020).
Wu, D. et al. Identification of a candidate gene associated with isoflavone content in soybean seeds using genome-wide association and linkage mapping. Plant J. 104, 950–963 (2020).
Sun, L. et al. New quantitative trait locus (QTLs) and candidate genes associated with the grape berry color trait identified based on a high-density genetic map. BMC Plant Biol. 20, 1–13 (2020).
To, H. T. M. et al. A genome-wide association study reveals the quantitative trait locus and candidate genes that regulate phosphate efficiency in a Vietnamese rice collection. Physiol. Mol. Biol. Plants 26, 2267–2281 (2020).
Lin, Y. et al. Phenotypic and genetic variation in phosphorus-deficiency-tolerance traits in Chinese wheat landraces. BMC Plant Biol. 20, 1–9 (2020).
Liu, W. et al. Genome-wide association study reveals the genetic basis of fiber quality traits in upland cotton (Gossypium hirsutum L.). BMC Plant Biol. 20, 1–13 (2020).
Thabet, S. G., Moursi, Y. S., Karam, M. A., Börner, A. & Alqudah, A. M. Natural variation uncovers candidate genes for barley spikelet number and grain yield under drought stress. Genes (Basel). 11, 1–23 (2020).
Su, Y., Xu, H. & Yan, L. Support vector machine-based open crop model (SBOCM): Case of rice production in China. Saudi J. Biol. Sci. 24, 537–547 (2017).
Basith, S., Manavalan, B., Shin, T. H. & Lee, G. SDM6A: A web-based integrative machine-learning framework for predicting 6mA sites in the rice genome. Mol. Ther. Nucleic Acids 18, 131–141 (2019).
Yu, H. & Dai, Z. SNNRice6mA: A deep learning method for predicting DNA N6-methyladenine sites in rice genome. Front. Genet. 10, 1–6 (2019).
Putri, R. E., Yahya, A., Adam, N. M. & Abd Aziz, S. Rice yield prediction model with respect to crop healthiness and soil fertility. Food Res. 3, 171–176 (2019).
Supro, I. A., Mahar, J. A. & Mahar, S. A. Rice yield prediction and optimization using association rules and neural network methods to enhance agribusiness. Indian J. Sci. Technol. 13, 1367–1379 (2020).
Maeda, Y., Goyodani, T., Nishiuchi, S. & Kita, E. Yield prediction of paddy rice with machine learning. In Proc. 2018 Int. Conf. Parallel Distrib. Process. Tech. Appl. 361–365 (2018).
Das, B., Nair, B., Reddy, V. K. & Venkatesh, P. Evaluation of multiple linear, neural network and penalised regression models for prediction of rice yield based on weather parameters for west coast of India. Int. J. Biometeorol. 62, 1809–1822 (2018).
Amaratunga, V. et al. Artificial neural network to estimate the paddy yield prediction using climatic data. Math. Probl. Eng. 2020, (2020).
Chu, Z. & Yu, J. An end-to-end model for rice yield prediction using deep learning fusion. Comput. Electron. Agric. 174, 105471 (2020).
Armagan, A., Dunson, D. B. & Lee, J. Generalized double pareto shrinkage. Stat. Sin. 23, 119–143 (2013).
van Erp, S., Oberski, D. L. & Mulder, J. Shrinkage priors for Bayesian penalized regression. J. Math. Psychol. 89, 31–50 (2019).
Huang, S., Shingaki-Wells, R. N., Taylor, N. L. & Millar, A. H. The rice mitochondria proteome and its response during development and to the environment. Front. Plant Sci. 4, 1–6 (2013).
Teixeira, P. F. & Glaser, E. Processing peptidases in mitochondria and chloroplasts. Biochim. Biophys. Acta Mol. Cell Res. 1833, 360–370 (2013).
Sharma, M. & Pandey, G. K. Expansion and function of repeat domain proteins during stress and development in plants. Front. Plant Sci. 6, 1–15 (2016).
Sheikh, A. H. et al. Interaction between two rice mitogen activated protein kinases and its possible role in plant defense. BMC Plant Biol. 13, 1–11 (2013).
Yang, Z. et al. Transcriptome-based analysis of mitogen-activated protein kinase cascades in the rice response to Xanthomonas oryzae infection. Rice 8, 1–13 (2015).
Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 5999–6009 (2017).
Cheng, H. T. et al. Wide & deep learning for recommender systems. In ACM Int. Conf. Proceeding Ser. 7–10 (2016) https://doi.org/10.1145/2988450.2988454.
Bahdanau, D., Cho, K. H. & Bengio, Y. Neural machine translation by jointly learning to align and translate. In 3rd Int. Conf. Learn. Represent. ICLR 2015—Conf. Track Proc. 1–15 (2015).
Baurley, J. W., Budiarto, A., Kacamarga, M. F. & Pardamean, B. A web portal for rice crop improvements. Int. J. Web Portals 10, 15–31 (2018).
Wang, D. R. et al. An imputation platform to enhance integration of rice genetic resources. Nat. Commun. 9, 1–10 (2018).
Dominic, N., Prayoga, J. S., Kumala, D., Surantha, N. & Soewito, B. The comparative study of algorithms in building the green mobile cloud computing environment. Springer B. Lect. Notes Netw. Syst. 343, 43–54 (2021).
Mittag, F., Römer, M. & Zell, A. Influence of feature encoding and choice of classifier on disease risk prediction in genome-wide association studies. PLoS One 10, e0135832 (2015).
Song, M., Wheeler, W., Caporaso, N. E., Landi, M. T. & Chatterjee, N. Using imputed genotype data in the joint score tests for genetic association and gene–environment interactions in case-control studies. Genet. Epidemiol. 42, 146–155 (2018).
Yusuf, I. et al. Genetic risk factors for colorectal cancer in multiethnic Indonesians. Sci. Rep. 11, 1–9 (2021).
Probst, P., Boulesteix, A. L. & Bischl, B. Tunability: Importance of hyperparameters of machine learning algorithms. J. Mach. Learn. Res. 20, 1–32 (2019).
Dominic, N., Daniel Cenggoro, T. W., Budiarto, A. & Pardamean, B. Transfer learning using inception-resnet-v2 model to the augmented neuroimages data for autism spectrum disorder classification. Commun. Math. Biol. Neurosci. 2021, 1–21 (2021).
Lattes, M. B. Report: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 67, 301–320 (2005).
Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1–22 (2010).
Zou, H. & Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 67, 301–320 (2005).
Shannon, C. E. A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948).
Shannon, C. E. A mathematical theory of communication part III: Mathematical preliminaries. Bell Syst. Tech. J. 27, 623–656 (1948).
Croiseau, P. et al. Fine tuning genomic evaluations in dairy cattle through SNP pre-selection with the Elastic-Net algorithm. Genet. Res. (Camb) 93, 409–417 (2011).
Sarkar, R. K., Rao, A. R., Meher, P. K., Nepolean, T. & Mohapatra, T. Evaluation of random forest regression for prediction of breeding value from genomewide SNPs. J. Genet. 94, 187–192 (2015).
Rashkin, S. R. et al. A pharmacogenetic prediction model of progression-free survival in breast cancer using genome-wide genotyping data from CALGB 40502 (Alliance). Clin. Pharmacol. Ther. 105, 738–745 (2019).
Wen, J., Ford, C. T., Janies, D. & Shi, X. A parallelized strategy for epistasis analysis based on Empirical Bayesian Elastic Net models. Bioinformatics 36, 3803–3810 (2020).
Chen, C., Twycross, J. & Garibaldi, J. M. A new accuracy measure based on bounded relative error for time series forecasting. PLoS One 12, 1–23 (2017).
Elavarasan, D., Vincent, D. R., Sharma, V., Zomaya, A. Y. & Srinivasan, K. Forecasting yield by integrating agrarian factors and machine learning models: A survey. Comput. Electron. Agric. 155, 257–282 (2018).
Spiess, A. N. & Neumeyer, N. An evaluation of R2 as an inadequate measure for nonlinear models in pharmacological and biochemical research: A Monte Carlo approach. BMC Pharmacol. 10, 1–11 (2010).
Pal, R. Chapter 4: Validation methodologies. Predict. Model. Drug Sensit. https://doi.org/10.1016/b978-0-12-805274-7.00004-x (2017).
Nallamilli, B. R. R. et al. Polycomb group gene OsFIE2 regulates rice (Oryza sativa) seed development and grain filling via a mechanism distinct from Arabidopsis. PLoS Genet. 9, e1003322 (2013).
Jeong, K. et al. Phosphorus remobilization from rice flag leaves during grain filling: an RNA-seq study. Plant Biotechnol. J. 15, 15–26 (2017).
Zhu, Q.-L. et al. In silico analysis of a MRP transporter gene reveals its possible role in anthocyanins or flavonoids transport in Oryze sativa. Am. J. Plant Sci. 04, 555–560 (2013).
Liu, Y. et al. Anthocyanin biosynthesis and degradation mechanisms in Solanaceous vegetables: A review. Front. Chem. 6, 52 (2018).
Panche, A. N., Diwan, A. D. & Chandra, S. R. Flavonoids: An overview. J. Nutr. Sci. 5, (2016).
Singh, V., Sharma, V. & Katara, P. Comparative transcriptomics of rice and exploitation of target genes for blast infection. Agric. Gene 1, 143–150 (2016).
van Ooijen, G. et al. Structure-function analysis of the NB-ARC domain of plant disease resistance proteins. J. Exp. Bot. 59, 1383–1397 (2008).
Głowacki, S., Macioszek, V. K. & Kononowicz, A. K. R proteins as fundamentals of plant innate immunity. Cell. Mol. Biol. Lett. 16, 1–24 (2011).
Tian, L. et al. RNA-binding protein RBP-P is required for glutelin and prolamine mRNA localization in rice endosperm cells. Plant Cell 30, 2529–2552 (2018).
Wang, C. et al. Chloroplastic Os3BGlu6 contributes significantly to cellular ABA pools and impacts drought tolerance and photosynthesis in rice. New Phytol. 226, 1042–1054 (2020).
Sun, L. et al. Carbon Starved Anther modulates sugar and ABA metabolism to protect rice seed germination and seedling fitness. Plant Physiol. https://doi.org/10.1093/plphys/kiab391 (2021).
Talla, S. K. et al. Cytokinin delays dark-induced senescence in rice by maintaining the chlorophyll cycle and photosynthetic complexes. J. Exp. Bot. 67, 1839–1851 (2016).
Chandran, A. K. N., Jeong, H. Y., Jung, K. H. & Lee, C. Development of functional modules based on co-expression patterns for cell-wall biosynthesis related genes in rice. J. Plant Biol. 59, 1–15 (2016).
Wang, Y. et al. Genetic bases of source-, sink-, and yield-related traits revealed by genome-wide association study in Xian rice. Crop J. 8, 119–131 (2020).
Patishtan, J., Hartley, T. N., Fonseca de Carvalho, R. & Maathuis, F. J. M. Genome-wide association studies to identify rice salt-tolerance markers. Plant Cell Environ. 41, 970–982 (2018).
Saha, J., Sengupta, A., Gupta, K. & Gupta, B. Molecular phylogenetic study and expression analysis of ATP-binding cassette transporter gene family in Oryza sativa in response to salt stress. Comput. Biol. Chem. 54, 18–32 (2015).
Leonard, G. D., Fojo, T. & Bates, S. E. The role of ABC transporters in clinical practice. Oncologist 8, 411–424 (2003).
Mackon, E. et al. Recent insights into anthocyanin pigmentation, synthesis, trafficking, and regulatory mechanisms in rice (Oryza sativa L.) caryopsis. Biomolecules 11, 1–26 (2021).
Nguyen, Q.-T.T., Huang, T.-L. & Huang, H.-J. Identification of genes related to arsenic detoxification in rice roots using microarray analysis. Int. J. Biosci. Biochem. Bioinform. 4, 22–27 (2014).
Narsai, R. et al. Mechanisms of growth and patterns of gene expression in oxygen-deprived rice coleoptiles. Plant J. 82, 25–40 (2015).
Wu, Y. S. & Yang, C. Y. Comprehensive transcriptomic analysis of auxin responses in submerged rice coleoptile growth. Int. J. Mol. Sci. 21, 1292 (2020).
Chen, X. et al. Transcriptome and proteome profiling of different colored rice reveals physiological dynamics involved in the flavonoid pathway. Int. J. Mol. Sci. 20, 2463 (2019).
Kim, C. K. et al. Multi-layered screening method identification of flavonoid-specific genes, using transgenic rice. Biotechnol. Biotechnol. Equip. 27, 3944–3951 (2013).
Koes, R. E., Quattrocchio, F. & Mol, J. N. M. The flavonoid biosynthetic pathway in plants: Function and evolution. BioEssays 16, 123–132 (1994).
Davies, K. M. et al. The evolution of flavonoid biosynthesis: A bryophyte perspective. Front. Plant Sci. 11, 1–21 (2020).