3 Biotech. 2023 Jun 10;13(7):230. doi: 10.1007/s13205-023-03659-z

Genome-wide identification and characterization of glutathione S-transferase gene family in quinoa (Chenopodium quinoa Willd.)

Shivani Tiwari 1, Swati Vaish 2, Nootan Singh 2, Mahesh Basantani 3, Atul Bhargava 1
PMCID: PMC10257622  PMID: 37309406

Abstract

The present investigation was undertaken for large-scale in-silico genome-wide identification and characterization of glutathione S-transferases (GSTs) in Chenopodium quinoa. A total of 120 GST genes (CqGSTs) were identified and divided into 11 classes, of which tau and phi had the most members. The average protein length was 279.06 amino acids, with a corresponding average molecular weight of 31,819.4 Da (approximately 31.8 kDa). Subcellular localization analysis showed that the proteins were predominantly localized in the cytoplasm, followed by chloroplast, mitochondria and plastids. Structural analysis revealed the presence of 2–14 exons in CqGST genes. Most of the genes possessed a two-exon, one-intron organization. MEME analysis identified 15 significantly conserved motifs with a width of 6–50 amino acids. Motifs 1, 2, 3, 5, 6, 8, 9 and 13 were found specifically in the tau class; motifs 3, 4, 5, 6, 7 and 9 were found in the phi class, while motifs 3, 4, 13 and 14 were found in the metaxin class. Multiple sequence alignment revealed a highly conserved N-terminus with an active-site serine (Ser; S) or cysteine (Cys; C) residue for GSH binding and GST catalytic activity. The gene loci were unevenly distributed across 18 chromosomes, with a maximum of 17 genes located on chromosome 7. In the predicted secondary structures, alpha helix was dominant, followed by coil, extended strand and beta turn. Gene duplication analysis revealed that segmental duplication and purifying selection were the most frequent and were the main source of expansion of the GST gene family. Analysis of cis-acting regulatory elements showed the presence of 21 different elements involved in stress, hormone and light response and in cellular development. The evolutionary relationship of CqGST proteins, inferred using the maximum likelihood method, revealed that all the tau and phi class GSTs were closely associated with those of G. max, O. sativa and A. thaliana. Molecular docking of GST proteins with the fungicide metalaxyl showed that CqGSTF1 had the lowest binding energy. This comprehensive study of the CqGST gene family provides groundwork for further functional analysis of CqGST genes at the molecular level and has potential applications in plant breeding.

Keywords: Quinoa, GST, Characterization, Phylogenetic analysis, Chromosomal localization, Gene duplication

Introduction

Quinoa (Chenopodium quinoa Willd.) is an allotetraploid (2n = 4x = 36) pseudocereal of the Amaranthaceae family that originated in the Andean region of South America (Hong et al. 2017). Quinoa produces gluten-free, nutritious grains with an exceptional balance of carbohydrates, vitamins, minerals, essential amino acids, dietary fibers and oils (Yasui et al. 2016; Dakhili et al. 2019). It is usually cultivated in well-drained clay loam or sandy loam soils with a pH of approximately 5.5–8.5 on moderate slopes (Fuentes and Bhargava 2011). The optimum temperature for quinoa is around 8–15 °C, although it can withstand temperatures as low as −4 °C (Alvar-Beltrán et al. 2020). Quinoa is recognized as a crop of great value due to its nutritious grain and high tolerance towards abiotic stresses (Yasui et al. 2016). It is adapted to a wide range of marginal soils, including drought-prone and highly saline ones, which helps it to thrive under adverse agroclimatic conditions (Morales et al. 2017; Vita et al. 2021). Due to its potential health benefits and adaptability to adverse climatic conditions, the year 2013 was declared by the FAO as the International Year of Quinoa (Bhargava and Ohri 2015; Jarvis et al. 2017). Considering its economic and nutritional significance, it could be a fascinating challenge for plant breeders to use transgenic approaches to develop varieties resistant to diverse biotic and abiotic stresses. The availability of the sequenced quinoa genome provides an opportunity to search for and characterize diverse, functionally important gene families. Recently, several gene families like HSP70 (Liu et al. 2018), WRKY (Yue et al. 2019), trihelix transcription factor (Li et al. 2022), PYL (Pizzio 2022) and CIPK CBL (Xiaolin et al. 2022) have been identified and well characterized in quinoa.

Environmental stresses such as high salinity, drought and extreme temperatures are often very harmful to plants because they trigger the production of reactive oxygen species (ROS), which cause oxidative stress and cell damage (Czarnocka and Karpiński 2018). To maintain the balance between production and elimination of ROS within cells, plants rely on a series of antioxidant enzymes such as superoxide dismutase (SOD), peroxidase (POD), catalase (CAT), glutathione peroxidase (GPx) and glutathione S-transferase (GST), together with non-enzymatic antioxidants like glutathione, alpha-tocopherol, carotenoids and flavonoids (Mittler et al. 2004).

GSTs (EC 2.5.1.18) constitute an ancient superfamily of multifunctional proteins and are an integral part of the antioxidant enzymatic defense system in plants. They work downstream of Cyt P450 by catalyzing the nucleophilic conjugation of the reduced tripeptide glutathione (GSH; γ-Glu–Cys–Gly) to a wide variety of hydrophobic and electrophilic substrates, making these compounds more hydrophilic so that they can be expelled from the cell (Marimo et al. 2016; Vaish et al. 2020; Song et al. 2021). GSTs are characterized by a canonical thioredoxin fold in the highly conserved N-terminal domain, which consists of α-helices and β-strands with a β1α1β2α2β3β4α3 topology and possesses a G-site for glutathione binding. The variable C-terminal domain consists entirely of α-helices and possesses an H-site for binding the secondary hydrophobic substrate (Vaish et al. 2020).

In plants, GSTs play vital roles in the detoxification of xenobiotics and toxic lipid peroxides (Chronopoulou et al. 2014; Fafián-Labora et al. 2020), glucosinolate biosynthesis (Czerniawski and Bednarek 2018) and metabolism (Liu et al. 2014). The GST proteins of plants have been classified into 14 distinct classes, namely tau, phi, theta, zeta, lambda, γ-subunit of the eukaryotic translation elongation factor 1B (EF1B), dehydroascorbate reductase (DHAR), metaxin, tetrachlorohydroquinone dehalogenase (TCHQD), Ure2p, microsomal prostaglandin E synthase type 2 (mPGES2), hemerythrin, iota, and glutathionyl-hydroquinone reductases (GHR) (Nianiou-Obeidat et al. 2017). Given the important detoxification effects of GSTs during plant stress, GSTs have been studied in many plants. With the advent of whole genome sequencing, a large number of GSTs have been reported and characterized through genome-wide analyses in plants such as Physcomitrella patens (Liu et al. 2013), Solanum lycopersicum (Islam et al. 2017), Ipomoea batatas (Ding et al. 2017), Cucurbita maxima (Kayum et al. 2018), Vigna radiata (Vaish et al. 2018), Malus domestica (Fang et al. 2020), Raphanus sativus (Gao et al. 2020), Cicer arietinum (Ghangal et al. 2020), Medicago truncatula (Han et al. 2018; Hasan et al. 2021), Triticum aestivum (Hao et al. 2021) and Cucumis melo (Song et al. 2021).

Despite the availability of the whole genome sequence of quinoa, there is to date no report of large-scale in-silico genome-wide identification and characterization of the GST gene family in this underutilized crop. This gap provided us with an opportunity to perform such an analysis.

Material and methods

The in-silico analysis was carried out over a 1 Gbps LAN connection on a personal computer with 8 GB RAM, an AMD Ryzen 5 5500U processor with Radeon Graphics (2.10 GHz) and a 64-bit operating system.

Protein sequence retrieval and search in quinoa database available at Phytozome

The C. quinoa genome database available at Phytozome (https://phytozome-next.jgi.doe.gov/) was used to conduct in-silico genome-wide identification and characterization of GST genes. GST protein sequences of Arabidopsis thaliana were obtained from The Arabidopsis Information Resource (TAIR) (http://www.arabidopsis.org/). Glycine max, Physcomitrella patens and Medicago truncatula GST sequences were retrieved from NCBI by the locus ID or the accession number published by Liu et al. (2013), Liu et al. (2015) and Hasan et al. (2021), respectively, while sequences of Oryza sativa were obtained from the Rice Genome Annotation Project (RGAP) by the accession number published by Jain et al. (2010). BLASTP searches for each GST class were performed separately, using the retrieved GST protein sequences of the five species as queries with an e-value of 0.0001. The identified amino acid, genomic and coding sequences were downloaded and subjected to NCBI Batch CD (conserved domain) search (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) (Marchler-Bauer et al. 2017), the SMART (Simple Modular Architecture Research Tool) database (http://smart.embl-heidelberg.de/) (Letunic et al. 2021) and Pfam search (http://pfam.xfam.org/search) to confirm the presence of the conserved C-terminal domain and the conserved N-terminal domain with the thioredoxin fold.
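The e-value filtering step can be illustrated with a minimal sketch. It assumes the BLASTP results are available as tab-separated text in the common 12-column tabular layout (query ID, subject ID, ..., e-value, bit score); the searches described above may well have been run through a web interface, and the query and subject identifiers below are hypothetical placeholders, not data from the study.

```python
# Minimal sketch: keep subject sequences with at least one BLASTP hit at or
# below the e-value threshold stated in the text (0.0001).
# Assumed column order (12-column tabular BLAST output):
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
EVALUE_CUTOFF = 1e-4


def candidate_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return the set of subject IDs having a hit with e-value <= cutoff."""
    hits = set()
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        evalue = float(fields[10])
        if evalue <= cutoff:
            hits.add(subject_id)
    return hits


# Hypothetical example rows (identifiers and scores are invented for illustration).
example = "\n".join([
    "AtGSTU1\tsubject_A\t62.1\t220\t80\t2\t1\t219\t3\t222\t1e-95\t280",
    "AtGSTU1\tsubject_B\t28.4\t95\t60\t4\t10\t100\t15\t108\t0.02\t31.2",
    "OsGSTF2\tsubject_A\t55.0\t210\t90\t3\t2\t211\t5\t214\t3e-70\t215",
])

print(sorted(candidate_hits(example)))
```

Collecting subject IDs in a set removes duplicates that arise when several query species hit the same quinoa protein; the surviving candidates would still need the domain confirmation described above.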

Subcellular localization and in-silico physicochemical characterization of identified GSTs

Subcellular localization of the identified GSTs was predicted using three different tools, namely CELLO online tool v.2.5 (http://cello.life.nctu.edu.tw/) (Yu et al. 2006), WoLF PSORT (www.genscript.com/wolf-psort.html) (Horton et al. 2007) and DeepLoc (Armenteros et al. 2017) (http://www.cbs.dtu.dk/services/DeepLoc/). The physicochemical parameters such as isoelectric point (pI), molecular weight, aliphatic index and Grand Average of Hydropathy (GRAVY) were analysed using default parameters of the ProtParam tool at the Expasy server (http://web.expasy.org/protparam/) (Gasteiger et al. 2005).
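Two of these parameters are simple functions of amino acid composition, and a short sketch may clarify what they measure. The code below is an independent illustration, not the ProtParam implementation: GRAVY is taken as the mean Kyte-Doolittle hydropathy per residue, and the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percent of each residue. The peptide in the usage line is a hypothetical toy sequence.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = sequence.upper()

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical toy peptide, used only to show the calls.
toy_peptide = "MAVLK"
print(round(gravy(toy_peptide), 3), round(aliphatic_index(toy_peptide), 2))
```

A negative GRAVY value indicates a protein that is hydrophilic on average, and a higher aliphatic index is usually read as an indicator of greater thermostability.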

Gene structure visualization

The exon/intron organization of quinoa GSTs was analysed using Gene Structure Display Server 2.0 (GSDS, https://gsds.cbi.pku.edu.cn/) (Hu et al. 2015) by aligning the CDS and genomic sequences retrieved from Phytozome (https://phytozome-next.jgi.doe.gov/).

Motif identification

Conserved motifs of quinoa GSTs were identified using Multiple Em for Motif Elicitation (MEME) analysis (http://meme-suite.org/) (Bailey et al. 2009). The number of motifs was set to 20 and the motif width to 6–50. The results were visualized using TBtools software.

Protein sequence alignment and catalytic residue position prediction

The CqGST protein sequences were aligned with GST protein sequences of A. thaliana, O. sativa, G. max, P. patens and M. truncatula using Clustal Omega (Sievers et al. 2011). The protein sequence alignments were visualized in ESPript 3.0 (http://espript.ibcp.fr/ESPript/cgi-bin/ESPript.cgi) (Robert and Gouet 2014).

Chromosomal localization and gene duplication analysis of CqGST genes

The genomic locations of the CqGST genes were retrieved from the genomic data (Jarvis et al. 2017). The locations of these genes were diagrammatically depicted on their respective chromosomes using TBtools software v0.667 (https://github.com/CJ-Chen/TBtools). Gene duplication events were analysed by BLASTP search of CqGSTs against each other on NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins). CqGSTs having sequence similarity > 80% were assumed to be duplicated genes (Kong et al. 2013). Homologous gene pairs present on the same chromosome were considered tandem duplicated (TD), while those located at different chromosomal locations were considered segmental duplicated (SD) genes (Holub et al. 2001). The PAL2NAL online tool (https://bio.tools/pal2nal) (Suyama et al. 2006) was used to estimate the synonymous rate (dS), the non-synonymous rate (dN) and the evolutionary constraint (dN/dS) between the duplicated CqGST gene pairs. The input consisted of protein sequences aligned using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) along with their respective mRNA sequences. The dN/dS ratio indicates the mode of selection acting on duplicated genes: values > 1, = 1 and < 1 were taken to indicate positive, neutral and purifying selection, respectively. The divergence time T (million years ago, Mya) of each duplicated gene pair was calculated using the formula T = dS/(2λ), where dS is the number of synonymous substitutions per site and λ is the fixed rate of 1.5 × 10⁻⁸ synonymous substitutions per site per year for dicotyledonous plants (Koch et al. 2000); the result in years is divided by 10⁶ to express it in Mya.
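The classification rules and the divergence-time formula above can be summarized in a short sketch. The function names are hypothetical, "different chromosomal locations" is interpreted here as "different chromosomes" (an assumption), and the dN and dS values in the usage example are invented for illustration rather than taken from the study.

```python
# Fixed synonymous substitution rate for dicots stated in the text
# (substitutions per site per year).
LAMBDA = 1.5e-8


def duplication_type(chrom_a, chrom_b):
    """Tandem if both genes lie on the same chromosome, otherwise segmental.

    Assumption: 'different chromosomal locations' means different chromosomes.
    """
    return "tandem" if chrom_a == chrom_b else "segmental"


def selection_mode(dn, ds):
    """Classify the mode of selection from the dN/dS ratio."""
    ratio = dn / ds
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "purifying"
    return "neutral"


def divergence_time_mya(ds, rate=LAMBDA):
    """Divergence time T = dS / (2 * lambda), converted from years to Mya."""
    return ds / (2 * rate) / 1e6


# Hypothetical gene pair (values invented for illustration).
pair = {"chrom_a": "Chr07", "chrom_b": "Chr17", "dn": 0.01, "ds": 0.03}
print(duplication_type(pair["chrom_a"], pair["chrom_b"]),
      selection_mode(pair["dn"], pair["ds"]),
      round(divergence_time_mya(pair["ds"]), 2))
```

In practice an exact ratio of 1 almost never occurs with floating-point estimates, so a tolerance band around 1 might be preferred when labelling a pair as neutral.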

Protein secondary structure prediction

The components of protein secondary structure like random coil, beta turn, extended strand and alpha helix of CqGSTs were predicted using Self-Optimized Prediction Method with Alignment (SOPMA), a secondary structure prediction tool (https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_sopma.html) (Combet et al. 2000).

Evolutionary or phylogenetic analysis

The evolutionary relationship of CqGST proteins with GST proteins of P. patens (a bryophyte) and the angiosperms A. thaliana, O. sativa, G. max and M. truncatula was inferred in MEGA 7 (Molecular Evolutionary Genetics Analysis) (Kumar et al. 2016) using the maximum likelihood method. Bootstrap analysis was performed with 1000 replicates to assess the reliability of the constructed tree.

Promoter analysis in CqGST genes

To analyze cis-acting elements in the promoter region, the 2000 bp genomic sequence upstream of the initiation codon (ATG) of each identified CqGST was retrieved from the Chenopodium database (https://www.cbrc.kaust.edu.sa/chenopodiumdb/). To identify the various elements, the extracted genomic sequences were submitted to the online software PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) (Lescot et al. 2002).
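Which 2000 bp count as "upstream" depends on the strand of the gene, which a small sketch can make explicit. The function below is a hypothetical helper, assuming 1-based inclusive genomic coordinates and a known position of the A of the ATG; the coordinates in the usage lines are borrowed from the gene boundaries in Table 1 purely for illustration, and treating a gene boundary as the ATG position is an assumption.

```python
def promoter_window(atg_pos, strand, length=2000):
    """Return (start, end) of the upstream window, 1-based and inclusive.

    atg_pos is the genomic coordinate of the 'A' of the start codon.
    On the forward strand the promoter lies at lower coordinates; on the
    reverse strand it lies at higher coordinates. The forward window is
    clamped so that it never starts before position 1.
    """
    if strand in ("+", "Forward"):
        return max(1, atg_pos - length), atg_pos - 1
    if strand in ("-", "Reverse"):
        return atg_pos + 1, atg_pos + length
    raise ValueError(f"unknown strand: {strand!r}")


# Illustrative calls; gene boundaries are used as stand-ins for the ATG position.
print(promoter_window(22_258_067, "Forward"))
print(promoter_window(22_267_465, "Reverse"))
```

For a reverse-strand gene the extracted window would additionally need to be reverse-complemented before motif scanning, and a window near the end of a chromosome would need clamping to the chromosome length, which this sketch does not handle.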

Homology modelling and molecular docking analysis

One member from each GST class was selected for docking against the acylalanine fungicide metalaxyl (methyl N-(methoxyacetyl)-N-(2,6-xylyl)-dl-alaninate), which is used to control downy mildew infestation in a number of crops, including quinoa (Danielsen et al. 2003). Docking was not performed for metaxin and EF1B since the catalytic residue was not predicted for these two classes. Homology modelling of the proteins was done using I-TASSER (Zheng et al. 2021) (https://zhanggroup.org/I-TASSER/), which selects the template with the best identity among the Protein Data Bank hits and returns a predicted model. The 3D structure of the ligand was downloaded from the PubChem database (http://www.pubchem.ncbi.nlm.nih.gov) in SDF format. The structure was then converted into PDB file format using the PyMol tool and used for docking studies with the selected quinoa proteins. Molecular docking was carried out using the AutoDock v.4 tool (Morris et al. 2009).

Results

Quinoa GST genes

Comprehensive searches of the Phytozome database led to the identification of a total of 120 GST genes in quinoa. To classify the GSTs, all protein sequences were retrieved from Phytozome and analyzed through SMART, Pfam and NCBI Batch CD search. The 120 full-length GST genes containing both domains were named CqGSTs and classified into 11 different classes: tau (65 members), phi (19 members), lambda (2 members), theta (3 members), zeta (4 members), DHAR (2 members), hemerythrin (4 members), metaxin (2 members), mPGES (6 members), EF1B (7 members) and GHR (6 members). The tau and phi classes had the most members. The classes were named CqGSTU, CqGSTF, CqGSTL, CqGSTT, CqGSTZ, CqGSTDHAR, CqGSTH, CqGSTM, CqGSTMi, CqGSTEF1B and CqGSTGHR, respectively. Members of each class were numbered in ascending order based on their chromosomal localization (Table 1).

Table 1.

List of identified GST members in Chenopodium quinoa along with their detailed genomic information and physicochemical features

Sr no. Locus ID Gene name Chr. no. Start End Base pair Strand Protein (aa) pI Mol. Wt. (Da) GRAVY Aliphatic index
1 AUR62037578 CqGSTU1 Chr05 22,258,067 22,265,792 7725 Forward 229 5.64 24,913.84 − 0.4 65.76
2 AUR62037579 CqGSTU2 Chr05 22,266,529 22,267,465 936 Reverse 222 5.92 25,784.89 − 0.216 98.78
3 AUR62037930 CqGSTU3 Chr05 31,850,247 31,851,615 1368 Reverse 172 5.21 20,184.12 − 0.459 79.24
4 AUR62037936 CqGSTU4 Chr05 32,278,507 32,279,315 808 Forward 155 6.21 18,104.97 − 0.166 90.65
5 AUR62037937 CqGSTU5 Chr05 32,279,645 32,281,429 1784 Reverse 225 5.18 26,005.1 − 0.257 90.18
6 AUR62037938 CqGSTU6 Chr05 32,288,793 32,295,251 6458 Forward 387 4.97 42,858.16 − 0.055 97.26
7 AUR62037941 CqGSTU7 Chr05 32,449,451 32,450,850 1399 Forward 222 4.89 26,250.07 − 0.345 90.45
8 AUR62037944 CqGSTU8 Chr05 32,654,240 32,654,593 353 Forward 117 4.58 13,507.64 0.144 121.62
9 AUR62017444 CqGSTU9 Chr06 43,484,072 43,484,836 764 Reverse 220 6 24,952 − 0.125 96.5
10 AUR62044663 CqGSTU10 Chr00 45,020,735 45,021,025 290 Reverse 96 4.84 11,060.67 − 0.051 94.38
11 AUR62042823 CqGSTU11 Chr06 57,771,916 57,775,409 3493 Reverse 220 6.84 24,997.03 − 0.138 97.45
12 AUR62026482 CqGSTU12 Chr06 66,508,196 66,510,448 2252 Reverse 227 4.98 26,087.02 − 0.171 85.07
13 AUR62026484 CqGSTU13 Chr06 66,489,506 66,492,004 2498 Reverse 229 5.99 25,748.75 − 0.18 89.08
14 AUR62017445 CqGSTU14 Chr06 43,533,054 43,534,970 1916 Reverse 210 6.83 24,091.14 − 0.039 100.24
15 AUR62026483 CqGSTU15 Chr06 66,493,346 66,495,958 2612 Reverse 229 4.87 26,220.13 − 0.187 92.05
16 AUR62042824 CqGSTU16 Chr06 57,779,096 57,780,971 1875 Forward 222 5.54 25,364.37 − 0.112 90.45
17 AUR62035903 CqGSTU17 Chr07 24,669,985 24,671,288 1303 Forward 224 5.43 26,261.42 − 0.176 96.56
18 AUR62035904 CqGSTU18 Chr07 24,672,990 24,673,832 842 Forward 149 6.3 17,470.16 − 0.317 79.87
19 AUR62001302 CqGSTU19 Chr07 98,865,582 98,867,040 1458 Reverse 235 6.07 27,207.71 − 0.319 92.64
20 AUR62001701 CqGSTU20 Chr07 103,459,631 103,460,839 1208 Reverse 202 6.54 23,733.44 − 0.39 89.26
21 AUR62025397 CqGSTU21 Chr07 106,343,591 106,346,764 3173 Forward 161 5.02 18,404.18 − 0.111 96.21
22 AUR62025473 CqGSTU22 Chr07 107,763,570 107,765,588 2018 Forward 223 5.58 25,745.54 − 0.473 87.44
23 AUR62021697 CqGSTU23 Chr08 8,943,013 8,945,332 2319 Forward 347 7.63 38,973.37 − 0.131 95.19
24 AUR62021696 CqGSTU24 Chr08 8,946,584 8,947,569 985 Forward 226 6.42 26,211.54 − 0.25 105.18
25 AUR62021695 CqGSTU25 Chr08 8,966,413 8,970,065 3652 Reverse 228 8.28 26,524 − 0.232 91.45
26 AUR62021692 CqGSTU26 Chr08 9,050,946 9,051,821 875 Forward 200 5.9 23,015.75 − 0.149 106.25
27 AUR62021691 CqGSTU27 Chr08 9,068,011 9,069,374 1363 Forward 230 6.54 26,555.9 − 0.24 91.52
28 AUR62021690 CqGSTU28 Chr08 9,072,050 9,075,464 3414 Reverse 348 6.11 40,313.64 − 0.261 93.56
29 AUR62021689 CqGSTU29 Chr08 9,077,561 9,078,839 1278 Forward 232 5.74 26,702.89 − 0.316 98.32
30 AUR62021688 CqGSTU30 Chr08 9,080,067 9,081,578 1511 Forward 230 5.39 26,466.62 − 0.287 92.87
31 AUR62021687 CqGSTU31 Chr08 9,085,491 9,087,253 1762 Forward 231 5.61 26,657.75 − 0.383 90.78
32 AUR62018920 CqGSTU32 Chr10 11,529,705 11,531,271 1566 Reverse 224 5.28 25,766.81 − 0.114 96.16
33 AUR62018919 CqGSTU33 Chr10 11,531,703 11,533,254 1551 Reverse 214 4.96 24,449.53 − 0.057 100.61
34 AUR62033918 CqGSTU34 Chr10 47,430,348 47,431,888 1540 Reverse 224 5.5 25,805.84 − 0.287 89.6
35 AUR62035336 CqGSTU35 Chr11 11,262,390 11,263,905 1515 Reverse 224 5.42 26,026.11 − 0.163 98.3
36 AUR62035335 CqGSTU36 Chr11 11,336,767 11,337,934 1167 Reverse 149 6.52 17,588.21 − 0.462 73.36
37 AUR62041794 CqGSTU37 Chr12 34,308,452 34,310,039 1587 Forward 224 6.23 26,150.29 − 0.261 98.35
38 AUR62026287 CqGSTU38 Chr14 25,331,081 25,334,093 3012 Reverse 197 4.75 22,448.84 − 0.072 88.17
39 AUR62026289 CqGSTU39 Chr14 25,349,451 25,353,218 3767 Reverse 228 5.26 25,944.95 − 0.131 96.36
40 AUR62026290 CqGSTU40 Chr14 25,355,749 25,356,129 380 Reverse 126 5.01 13,945 − 0.13 81.51
41 AUR62026291 CqGSTU41 Chr14 25,369,752 25,370,216 464 Reverse 107 5.88 12,293.33 − 0.172 101.12
42 AUR62037811 CqGSTU42 Chr14 44,238,894 44,240,398 1504 Forward 219 5.3 25,413.43 − 0.295 94.75
43 AUR62037813 CqGSTU43 Chr14 44,344,324 44,348,405 4081 Forward 203 5.56 23,561.36 − 0.114 99.9
44 AUR62037814 CqGSTU44 Chr14 44,358,959 44,360,695 1736 Forward 227 8.19 25,675.99 − 0.076 96.61
45 AUR62025080 CqGSTU45 Chr15 50,947,192 50,947,825 633 Reverse 114 6.15 13,639.66 − 0.455 75.26
46 AUR62025085 CqGSTU46 Chr15 50,990,916 50,992,018 1102 Forward 202 7.7 23,848.52 − 0.473 84.95
47 AUR62025088 CqGSTU47 Chr15 51,029,526 51,035,545 6019 Forward 371 5.72 42,037.14 − 0.231 88.54
48 AUR62025095 CqGSTU48 Chr15 51,102,015 51,111,215 9200 Reverse 187 4.95 21,387.47 − 0.212 84.97
49 AUR62008398 CqGSTU49 Chr16 4,065,234 4,070,967 5733 Reverse 429 5.29 49,549.39 − 0.236 101.72
50 AUR62008404 CqGSTU50 Chr16 4,153,451 4,154,576 1125 Forward 221 5.34 25,390.38 − 0.237 89.5
51 AUR62008405 CqGSTU51 Chr16 4,162,275 4,163,834 1559 Forward 228 5.17 26,813.09 − 0.289 96.62
52 AUR62008406 CqGSTU52 Chr16 4,165,237 4,166,614 1377 Forward 232 5.22 26,717.93 − 0.18 105.86
53 AUR62008407 CqGSTU53 Chr16 4,169,081 4,169,886 805 Reverse 226 5.53 25,852.93 − 0.208 97.57
54 AUR62008408 CqGSTU54 Chr16 4,171,850 4,173,231 1381 Reverse 234 5.91 27,070.29 − 0.344 92.52
55 AUR62008409 CqGSTU55 Chr16 4,179,428 4,181,751 2323 Reverse 225 5.61 26,142.13 − 0.322 92.27
56 AUR62008410 CqGSTU56 Chr16 4,186,653 4,188,495 1842 Reverse 230 5.65 26,513.68 − 0.289 89.04
57 AUR62041790 CqGSTU57 Chr17 40,148,842 40,149,664 822 Forward 221 5.75 25,605.64 − 0.209 95.7
58 AUR62041791 CqGSTU58 Chr17 40,150,004 40,151,521 1517 Reverse 225 5.34 26,205.29 − 0.274 88.44
59 AUR62041792 CqGSTU59 Chr17 40,156,851 40,179,701 22,850 Forward 290 6.38 33,610.25 − 0.592 75.31
60 AUR62008765 CqGSTU60 Chr17 66,283,593 66,287,858 4265 Forward 236 6.27 26,596.27 − 0.813 80.59
61 AUR62008847 CqGSTU61 Chr17 67,490,100 67,491,129 1029 Reverse 222 5.39 25,660.54 − 0.285 92.79
62 AUR62008848 CqGSTU62 Chr17 67,491,834 67,492,999 1165 Reverse 223 5.36 26,018.82 − 0.35 86.91
63 AUR62025684 CqGSTU63 Chr18 23,464,056 23,477,054 12,998 Reverse 240 6.66 27,708.19 − 0.363 92.75
64 AUR62020080 CqGSTU64 Chr18 30,540,853 30,542,115 1262 Forward 220 6.85 25,895.92 − 0.392 82.86
65 AUR62020081 CqGSTU65 Chr18 30,542,691 30,544,606 1915 Forward 221 6.91 25,739.75 − 0.309 89.1
1 AUR62026020 CqGSTF1 Chr07 80,504,079 80,505,139 1060 Forward 216 7.74 24,561.57 − 0.161 86.71
2 AUR62033160 CqGSTF2 Chr07 94,403,967 94,405,978 2011 Forward 217 5.5 24,562.2 − 0.19 93.5
3 AUR62033161 CqGSTF3 Chr07 94,433,227 94,443,156 9929 Forward 240 6.67 27,399.39 − 0.403 88.21
4 AUR62033162 CqGSTF4 Chr07 94,485,894 94,489,106 3212 Forward 214 6.24 24,012.56 − 0.265 87.52
5 AUR62033175 CqGSTF5 Chr07 95,026,283 95,027,659 1376 Reverse 242 5.9 27,546.77 − 0.126 91.57
6 AUR62033176 CqGSTF6 Chr07 95,036,182 95,036,815 633 Reverse 184 5.56 21,203.59 − 0.15 94.51
7 AUR62033177 CqGSTF7 Chr07 95,037,534 95,039,279 1745 Reverse 214 5.75 24,289.96 − 0.233 93.5
8 AUR62033178 CqGSTF8 Chr07 95,040,874 95,041,248 374 Reverse 124 5.34 14,487.85 − 0.091 103.95
9 AUR62033179 CqGSTF9 Chr07 95,057,377 95,058,945 1568 Reverse 214 5.31 24,037.62 − 0.08 83.46
10 AUR62033180 CqGSTF10 Chr07 95,061,685 96,064,148 1,002,463 Reverse 218 5.04 25,000.56 − 0.29 91.7
11 AUR62013858 CqGSTF11 Chr10 15,176,795 15,179,961 3166 Reverse 223 6.12 25,280.79 − 0.409 83.5
12 AUR62008598 CqGSTF12 Chr17 62,528,187 62,530,381 2194 Forward 197 5.75 22,882.2 − 0.434 89.14
13 AUR62008599 CqGSTF13 Chr17 62,534,839 62,536,278 1439 Forward 214 5.3 24,008.56 − 0.071 82.99
14 AUR62008602 CqGSTF14 Chr17 62,632,018 62,638,824 6806 Forward 430 6.32 48,830.46 − 0.209 92.14
15 AUR62008604 CqGSTF15 Chr17 62,658,237 62,669,988 11,751 Forward 218 5.91 24,720.42 − 0.248 83.21
16 AUR62008607 CqGSTF16 Chr17 62,679,846 62,681,734 1888 Forward 214 5.91 24,108.69 − 0.19 91.17
17 AUR62008609 CqGSTF17 Chr17 62,850,817 62,853,467 2650 Forward 214 6.25 24,060.62 − 0.286 85.28
18 AUR62008610 CqGSTF18 Chr17 62,859,591 62,861,331 1740 Forward 218 5.8 24,992.13 − 0.136 100.14
19 AUR62008630 CqGSTF19 Chr17 63,533,471 63,533,854 383 Forward 127 8.52 15,012.66 − 0.134 99.13
1 AUR62040536 CqGSTL1 Chr00 13,469,432 13,473,006 3574 Reverse 260 5.21 29,085.43 − 0.139 92.73
2 AUR62041623 CqGSTL2 Chr01 100,372,908 100,385,913 13,005 Reverse 1028 6.82 113,884.86 − 0.115 84.37
1 AUR62010591 CqGSTT1 Chr13 10,768,543 10,777,898 9355 Forward 230 9.22 26,192.53 − 0.188 100.52
2 AUR62010590 CqGSTT2 Chr13 10,781,037 10,783,997 2960 Forward 233 6.5 26,741.66 − 0.279 90.86
3 AUR62017234 CqGSTT3 Chr16 68,183,290 68,195,926 12,636 Forward 292 9.14 33,000.1 − 0.276 94.93
1 AUR62024246 CqGSTZ1 Chr06 1,069,058 1,070,356 1298 Reverse 229 5.14 26,235.94 − 0.335 93.28
2 AUR62013634 CqGSTZ2 Chr10 12,347,463 12,349,907 2444 Forward 223 5.24 25,317.09 − 0.246 85.25
3 AUR62032505 CqGSTZ3 Chr14 42,313,607 42,321,587 7980 Forward 242 8.44 27,732.89 − 0.404 92.81
4 AUR62004214 CqGSTZ4 Chr01 116,540,900 116,543,258 2358 Reverse 224 5.36 25,201.91 − 0.205 83.57
1 AUR62035961 CqGSTDHAR1 Chr04 22,089,257 22,094,551 5294 Reverse 174 5.95 19,123.12 − 0.03 97.47
2 AUR62022530 CqGSTDHAR2 Chr01 93,473,248 93,477,932 4684 Reverse 212 6.09 23,533.29 − 0.099 96.57
1 AUR62008342 CqGSTH1 Chr16 3,358,572 3,371,304 12,732 Forward 1143 6.09 131,200.91 − 0.364 77.02
2 AUR62021756 CqGSTH2 Chr08 8,201,575 8,214,048 12,473 Reverse 1120 5.8 128,670.9 − 0.376 76.5
3 AUR62037305 CqGSTH3 Chr14 10,282,014 10,295,553 13,539 Forward 1246 5.7 139,397.54 − 0.275 79.26
4 AUR62002840 CqGSTH4 Chr06 19,255,868 19,269,917 14,049 Forward 1247 5.81 139,438.58 − 0.288 78.97
1 AUR62021331 CqGSTM1 Chr01 16,537,256 16,542,557 5301 Reverse 289 5.77 32,460.06 − 0.192 87.4
2 AUR62033246 CqGSTM2 Chr01 52,855,100 52,861,286 6186 Reverse 329 5.22 36,922.8 − 0.257 86.53
1 AUR62006187 CqGSTMi1 Chr15 3,433,426 3,442,275 8849 Forward 363 8.2 39,964 − 0.146 86.47
2 AUR62040567 CqGSTMi2 Chr00 14,528,356 14,539,835 11,479 Forward 328 8.89 36,427.4 − 0.278 76.46
3 AUR62013931 CqGSTMi3 Chr10 16,899,006 16,906,074 7068 Reverse 377 9.1 41,690.79 − 0.183 80
4 AUR62037507 CqGSTMi4 Chr12 26,058,521 26,062,960 4439 Forward 151 9.28 17,148.39 0.35 85.83
5 AUR62028521 CqGSTMi5 Chr17 44,448,009 44,456,799 8790 Forward 315 8.55 34,578.84 − 0.131 85.11
6 AUR62004745 CqGSTMi6 Chr05 61,536,667 61,541,109 4442 Reverse 151 9.06 17,112.38 0.407 85.23
1 AUR62035146 CqGSTEF1B1 Chr03 1,343,991 1,349,912 5921 Reverse 225 5.78 26,332.32 − 0.794 64.49
2 AUR62022367 CqGSTEF1B2 Chr10 27,104,863 27,109,452 4589 Reverse 415 5.74 47,454.59 − 0.343 78.51
3 AUR62022369 CqGSTEF1B3 Chr10 27,140,328 27,159,888 19,560 Reverse 491 6.6 55,025.98 − 0.465 74.73
4 AUR62034332 CqGSTEF1B4 Chr03 51,421,029 51,424,593 3564 Forward 416 5.79 47,231.4 − 0.354 79.5
5 AUR62034333 CqGSTEF1B5 Chr03 51,481,446 51,485,168 3722 Forward 416 5.45 47,374.58 − 0.324 81.83
6 AUR62042585 CqGSTEF1B6 Chr10 58,953,083 58,953,451 368 Reverse 122 4.59 14,489.55 − 0.481 67.05
7 AUR62042586 CqGSTEF1B7 Chr10 58,955,993 58,959,931 3938 Reverse 311 8.83 34,887.53 − 0.221 92.15
1 AUR62033029 CqGSTGHR1 Chr01 8,677,338 8,680,969 3631 Forward 329 6.56 37,869.89 − 0.488 72.61
2 AUR62033030 CqGSTGHR2 Chr01 8,731,951 8,736,630 4679 Forward 348 6.27 39,855.23 − 0.437 78.48
3 AUR62040763 CqGSTGHR3 Chr02 14,209,838 14,214,347 4509 Reverse 348 7.07 40,147.73 − 0.435 82.1
4 AUR62040764 CqGSTGHR4 Chr02 14,222,320 14,225,704 3384 Reverse 296 6.32 33,928.12 − 0.611 69.19
5 AUR62008787 CqGSTGHR5 Chr17 66,538,547 66,543,242 4695 Forward 408 8 46,176.97 − 0.257 88.6
6 AUR62025453 CqGSTGHR6 Chr07 107,323,805 107,335,636 11,831 Forward 408 7.09 46,048.76 − 0.236 88.6

Subcellular localization analysis

Subcellular localization analysis showed that most of the proteins were predicted to reside in the cytoplasm, followed by chloroplast, mitochondria, plastid, plasma membrane, nucleus, extracellular space, ER and cytoskeleton (Table 2). The average protein length was 279.06 amino acids, with a corresponding average molecular weight of 31,819.4 Da (approximately 31.8 kDa). The mean isoelectric point (pI) was 6.14; the highest pI values, about 9.2, were found in the theta and mPGES classes. The grand average of hydropathy (GRAVY) values of most proteins in all classes were negative, indicating that most CqGST proteins are hydrophilic and interact favourably with water. The mean aliphatic index was 89.33.

Table 2.

Subcellular location of various CqGSTs

Sr no. Locus ID Gene name Subcellular localization by WoLFPSORT Subcellular localization by DeepLoc Subcellular localization by CELLO
1 AUR62037578 CqGSTU1 Cytoskeleton Soluble cytoplasmic Cytoplasmic
2 AUR62037579 CqGSTU2 Cytoplasm Soluble cytoplasmic Cytoplasmic
3 AUR62037930 CqGSTU3 Cytoplasm Soluble cytoplasmic Cytoplasmic
4 AUR62037936 CqGSTU4 Nucleus Soluble cytoplasmic Cytoplasmic
5 AUR62037937 CqGSTU5 Cytoplasm Soluble cytoplasmic Cytoplasmic
6 AUR62037938 CqGSTU6 Cytoplasm Soluble cytoplasmic Cytoplasmic
7 AUR62037941 CqGSTU7 Cytoplasm Soluble cytoplasmic Cytoplasmic
8 AUR62037944 CqGSTU8 Cytoplasm Soluble cytoplasmic Extracellular
9 AUR62017444 CqGSTU9 Cytoplasm Soluble cytoplasmic Cytoplasmic
10 AUR62044663 CqGSTU10 Cytoplasm Soluble cytoplasmic Cytoplasmic
11 AUR62042823 CqGSTU11 Cytoplasm Soluble cytoplasmic Cytoplasmic
12 AUR62026482 CqGSTU12 Chloroplast Soluble cytoplasmic Cytoplasmic
13 AUR62026484 CqGSTU13 Chloroplast Soluble cytoplasmic Cytoplasmic
14 AUR62017445 CqGSTU14 Nucleus Soluble cytoplasmic Cytoplasmic
15 AUR62026483 CqGSTU15 Cytoplasm Soluble cytoplasmic Cytoplasmic
16 AUR62042824 CqGSTU16 Nucleus Soluble cytoplasmic Cytoplasmic
17 AUR62035903 CqGSTU17 Cytoplasm Soluble cytoplasmic Cytoplasmic
18 AUR62035904 CqGSTU18 Cytoplasm Soluble cytoplasmic Cytoplasmic
19 AUR62001302 CqGSTU19 Cytoplasm Soluble cytoplasmic Cytoplasmic
20 AUR62001701 CqGSTU20 Cytoplasm Soluble cytoplasmic Cytoplasmic
21 AUR62025397 CqGSTU21 Cytoplasm Soluble cytoplasmic Cytoplasmic
22 AUR62025473 CqGSTU22 Cytoplasm Soluble cytoplasmic Cytoplasmic
23 AUR62021697 CqGSTU23 Cytoplasm Soluble cytoplasmic Cytoplasmic
24 AUR62021696 CqGSTU24 Cytoplasm Soluble cytoplasmic Cytoplasmic
25 AUR62021695 CqGSTU25 Cytoplasm Soluble cytoplasmic Extracellular
26 AUR62021692 CqGSTU26 Cytoplasm Soluble cytoplasmic Cytoplasmic
27 AUR62021691 CqGSTU27 Mitochondria Soluble cytoplasmic Cytoplasmic
28 AUR62021690 CqGSTU28 Chloroplast Soluble cytoplasmic Cytoplasmic
29 AUR62021689 CqGSTU29 Chloroplast Soluble cytoplasmic Cytoplasmic
30 AUR62021688 CqGSTU30 Cytoplasm Soluble cytoplasmic Cytoplasmic
31 AUR62021687 CqGSTU31 Mitochondria Soluble cytoplasmic Cytoplasmic
32 AUR62018920 CqGSTU32 Chloroplast Soluble cytoplasmic Cytoplasmic
33 AUR62018919 CqGSTU33 Cytoplasm Soluble cytoplasmic Cytoplasmic
34 AUR62033918 CqGSTU34 Cytoplasm Soluble cytoplasmic Cytoplasmic
35 AUR62035336 CqGSTU35 Cytoplasm Soluble cytoplasmic Cytoplasmic
36 AUR62035335 CqGSTU36 Cytoplasm Soluble cytoplasmic Cytoplasmic
37 AUR62041794 CqGSTU37 Cytoplasm Soluble cytoplasmic Cytoplasmic
38 AUR62026287 CqGSTU38 Cytoplasm Soluble cytoplasmic Cytoplasmic
39 AUR62026289 CqGSTU39 Cytoplasm Soluble cytoplasmic Cytoplasmic
40 AUR62026290 CqGSTU40 Chloroplast Soluble mitochondrial Cytoplasmic
41 AUR62026291 CqGSTU41 Chloroplast Soluble cytoplasmic Mitochondrial
42 AUR62037811 CqGSTU42 Nucleus Soluble cytoplasmic Cytoplasmic
43 AUR62037813 CqGSTU43 Cytoplasm Soluble cytoplasmic Cytoplasmic
44 AUR62037814 CqGSTU44 Nucleus Soluble cytoplasmic Chloroplast
45 AUR62025080 CqGSTU45 Cytoplasm Soluble cytoplasmic Cytoplasmic
46 AUR62025085 CqGSTU46 Cytoplasm Soluble cytoplasmic Cytoplasmic
47 AUR62025088 CqGSTU47 Cytoplasm Soluble cytoplasmic Cytoplasmic
48 AUR62025095 CqGSTU48 Cytoplasm Soluble cytoplasmic Cytoplasmic
49 AUR62008398 CqGSTU49 Cytoplasm Soluble cytoplasmic Cytoplasmic
50 AUR62008404 CqGSTU50 Chloroplast Soluble cytoplasmic Cytoplasmic
51 AUR62008405 CqGSTU51 Cytoplasm Soluble cytoplasmic Cytoplasmic
52 AUR62008406 CqGSTU52 Cytoplasm Soluble cytoplasmic Cytoplasmic
53 AUR62008407 CqGSTU53 Cytoplasm Soluble cytoplasmic Cytoplasmic
54 AUR62008408 CqGSTU54 Cytoplasm Soluble cytoplasmic Cytoplasmic
55 AUR62008409 CqGSTU55 Cytoplasm Soluble cytoplasmic Cytoplasmic
56 AUR62008410 CqGSTU56 Cytoplasm Soluble cytoplasmic Cytoplasmic
57 AUR62041790 CqGSTU57 Cytoplasm Soluble cytoplasmic Cytoplasmic
58 AUR62041791 CqGSTU58 Cytoplasm Soluble cytoplasmic Cytoplasmic
59 AUR62041792 CqGSTU59 Cytoplasm Soluble plastid Cytoplasmic
60 AUR62008765 CqGSTU60 Cytoplasm Soluble cytoplasmic Cytoplasmic
61 AUR62008847 CqGSTU61 Cytoplasm Soluble cytoplasmic Cytoplasmic
62 AUR62008848 CqGSTU62 Cytoplasm Soluble cytoplasmic Cytoplasmic
63 AUR62025684 CqGSTU63 Cytoplasm Soluble cytoplasmic Cytoplasmic
64 AUR62020080 CqGSTU64 Cytoplasm Soluble cytoplasmic Cytoplasmic
65 AUR62020081 CqGSTU65 Cytoskeleton Soluble cytoplasmic Cytoplasmic
1 AUR62026020 CqGSTF1 Soluble cytoplasmic Cytoplasmic
2 AUR62033160 CqGSTF2 Chloroplast Soluble cytoplasmic Cytoplasmic
3 AUR62033161 CqGSTF3 Cytoplasm Soluble cytoplasmic Cytoplasmic
4 AUR62033162 CqGSTF4 Chloroplast Soluble cytoplasmic Cytoplasmic
5 AUR62033175 CqGSTF5 Cytoplasm Soluble mitochondrial Cytoplasmic
6 AUR62033176 CqGSTF6 Cytoplasm Soluble cytoplasmic Cytoplasmic
7 AUR62033177 CqGSTF7 Chloroplast Soluble cytoplasmic Cytoplasmic
8 AUR62033178 CqGSTF8 Cytoplasm Soluble cytoplasmic Cytoplasmic
9 AUR62033179 CqGSTF9 Chloroplast Soluble cytoplasmic Cytoplasmic
10 AUR62033180 CqGSTF10 Extracellular Soluble cytoplasmic Cytoplasmic
11 AUR62013858 CqGSTF11 Chloroplast Soluble cytoplasmic Cytoplasmic
12 AUR62008598 CqGSTF12 Cytoplasm Soluble cytoplasmic Cytoplasmic
13 AUR62008599 CqGSTF13 Chloroplast Soluble cytoplasmic Cytoplasmic
14 AUR62008602 CqGSTF14 Cytoplasm Soluble cytoplasmic Cytoplasmic
15 AUR62008604 CqGSTF15 Chloroplast Soluble cytoplasmic Cytoplasmic
16 AUR62008607 CqGSTF16 Chloroplast Soluble cytoplasmic Cytoplasmic
17 AUR62008609 CqGSTF17 Chloroplast Soluble cytoplasmic Cytoplasmic
18 AUR62008610 CqGSTF18 Cytoplasm Soluble cytoplasmic Cytoplasmic
19 AUR62008630 CqGSTF19 Cytoplasm Soluble cytoplasmic Cytoplasmic
1 AUR62040536 CqGSTL1 Chloroplast Soluble plastid Cytoplasmic
2 AUR62041623 CqGSTL2 Plastid Membrane plastid Plasma membrane
1 AUR62010591 CqGSTT1 Mitochondria Soluble nucleus Cytoplasmic
2 AUR62010590 CqGSTT2 Cytoplasm Soluble cytoplasmic Cytoplasmic
3 AUR62017234 CqGSTT3 Cytoplasm Soluble cytoplasmic Cytoplasmic
1 AUR62024246 CqGSTZ1 Mitochondria Soluble cytoplasmic Cytoplasmic
2 AUR62013634 CqGSTZ2 Chloroplast Soluble cytoplasmic Chloroplast
3 AUR62032505 CqGSTZ3 Mitochondria Soluble mitochondrial Mitochondrial
4 AUR62004214 CqGSTZ4 Chloroplast Soluble cytoplasmic Chloroplast
1 AUR62035961 CqGSTDHAR1 Cytoplasm Soluble cytoplasmic Cytoplasmic
2 AUR62022530 CqGSTDHAR2 Cytoplasm Soluble cytoplasmic Cytoplasmic
1 AUR62008342 CqGSTH1 Nucleus Soluble nucleus Nuclear
2 AUR62021756 CqGSTH2 Nucleus Soluble nucleus Nuclear
3 AUR62037305 CqGSTH3 Cytoplasm Soluble nucleus Nuclear
4 AUR62002840 CqGSTH4 Nucleus Soluble nucleus Nuclear
1 AUR62021331 CqGSTM1 Nucleus Membrane mitochondrial Plasma membrane
2 AUR62033246 CqGSTM2 Cytoplasm Membrane mitochondrial Plasma membrane
1 AUR62006187 CqGSTMi1 Chloroplast Membrane plastid Chloroplast
2 AUR62040567 CqGSTMi2 Chloroplast Membrane mitochondrial Chloroplast
3 AUR62013931 CqGSTMi3 Chloroplast Membrane mitochondrial Chloroplast
4 AUR62037507 CqGSTMi4 Cytoplasm Membrane ER Plasma membrane
5 AUR62028521 CqGSTMi5 Chloroplast Membrane plastid Chloroplast
6 AUR62004745 CqGSTMi6 Cytoplasm Membrane ER Plasma membrane
1 AUR62035146 CqGSTEF1B1 Mitochondria Soluble cytoplasmic Cytoplasmic
2 AUR62022367 CqGSTEF1B2 Chloroplast Soluble cytoplasmic Cytoplasmic
3 AUR62022369 CqGSTEF1B3 Cytoplasm Soluble nucleus Chloroplast
4 AUR62034332 CqGSTEF1B4 Chloroplast Soluble cytoplasmic Cytoplasmic
5 AUR62034333 CqGSTEF1B5 Chloroplast Soluble cytoplasmic Cytoplasmic
6 AUR62042585 CqGSTEF1B6 Chloroplast Soluble cytoplasmic Cytoplasmic
7 AUR62042586 CqGSTEF1B7 Cytoplasm Membrane peroxisome Cytoplasmic
1 AUR62033029 CqGSTGHR1 Cytoplasm Soluble cytoplasmic Mitochondrial
2 AUR62033030 CqGSTGHR2 Mitochondria Soluble mitochondrial Mitochondrial
3 AUR62040763 CqGSTGHR3 Chloroplast Soluble mitochondrial Mitochondrial
4 AUR62040764 CqGSTGHR4 Cytoplasm Soluble cytoplasmic Mitochondrial
5 AUR62008787 CqGSTGHR5 Chloroplast Soluble plastid Mitochondrial
6 AUR62025453 CqGSTGHR6 Chloroplast Soluble plastid Extracellular

Gene structure organization

The exon and intron organization of the CqGST genes was analyzed from their coding and genomic sequences using Gene Structure Display Server 2.0 to investigate possible structural evolution of the GST gene family. Structural analysis revealed the presence of 2–14 exons in CqGST genes (Fig. 1). Most of the genes possessed a two-exon, one-intron organization. The other classes showed greater intron numbers, with mixed conservation of splice site sequences.

Fig. 1.


Gene structure analysis of quinoa CqGST genes, drawn using the GSDS tool. Color description: CDS, blue; intron, black line; upstream and downstream regions, green

Motif analysis

Further, we analyzed quinoa GST proteins for motif discovery with the MEME suite (v4.11.2) using the following parameters: motif discovery mode, Normal; site distribution, 0 or 1 occurrence per sequence; motif length, 10–50; number of motifs, 15. The top fifteen motifs with the lowest E-values were identified and selected. Motifs 1, 2, 3, 5, 6, 8, 9 and 13 were found specifically in the tau class, motifs 3, 4, 5, 6, 7 and 9 were found in the phi class, while motifs 3, 4, 13 and 14 were found in the metaxin class. Motif 15 and motif 11 were highly conserved in the GHR and EF1B classes, respectively, while motif 4 was present in almost all classes (Fig. 2).
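The reported settings map onto the command-line version of MEME roughly as follows. This is a minimal sketch, not the authors' procedure: the text does not say whether the web server or the command line was used, the discovery mode is left at its default here, and the helper name `build_meme_command` and the paths `cqgst_proteins.fasta` and `meme_out` are hypothetical. In MEME, `zoops` is the name of the "zero or one occurrence per sequence" site distribution.

```python
import shlex

def build_meme_command(fasta_path, out_dir, n_motifs=15, min_width=10, max_width=50):
    """Return a MEME argument list matching the settings reported in the text."""
    return [
        "meme", fasta_path,
        "-protein",                 # input sequences are proteins
        "-oc", out_dir,             # output directory (overwritten if it exists)
        "-mod", "zoops",            # zero or one occurrence per sequence
        "-nmotifs", str(n_motifs),  # number of motifs to report
        "-minw", str(min_width),    # minimum motif width
        "-maxw", str(max_width),    # maximum motif width
    ]

# Hypothetical file names; the command is only printed, not executed.
cmd = build_meme_command("cqgst_proteins.fasta", "meme_out")
print(shlex.join(cmd))
```

Building the argument list separately from running it keeps the parameter choices easy to inspect before passing them to `subprocess.run`.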

Fig. 2.


Conserved motifs of CqGSTs identified with the MEME suite from the complete CqGST protein sequences, with the number of motifs set to 15. Individual motifs are represented by different colored boxes

Multiple sequence alignment

Multiple sequence alignment was performed with GST protein sequences of A. thaliana, O. sativa, G. max, P. patens and M. truncatula to identify the conserved and catalytic residues among the different GST classes (Fig. 3). It revealed a highly conserved N-terminus with an active site serine (Ser; S) or cysteine (Cys; C) residue for the activation of GSH binding and GST catalytic activity. Tau, phi, theta and zeta possessed serine active site residues, whereas lambda, GHR, DHAR and mPGES had cysteine active site residues.

Fig. 3.


Catalytic residue depiction in CqGST proteins using ESPript. Multiple sequence alignment was performed with A. thaliana, G. max, P. patens, M. truncatula and O. sativa GST sequences. The red colour indicates the active site cysteine in GHR, DHAR and lambda, and the active site serine in theta, zeta and tau CqGSTs

Chromosomal localization and gene duplication analysis of CqGST genes

All 120 CqGST gene loci were found to be unevenly distributed across 18 different chromosomes. Seventeen genes were located on chromosome 7, 16 on chromosome 17, 10 each on chromosomes 8, 10 and 16, and 9 each on chromosomes 5, 6 and 14. Only two genes each were present on chromosomes 2, 11, 12 and 13, and a single gene on chromosome 4. No genes were found on chromosome 9 (Fig. 4). Tandem and segmental duplication play an important role in gene family expansion. Further analysis revealed that segmental duplication and purifying selection were the most frequent, and were the main source of expansion of the GST gene family, compared with tandem duplication and positive selection. We found 92 duplicated gene pairs (Fig. 5) with more than 80% identity to each other in a BLAST search, and calculated the ratio of non-synonymous to synonymous substitutions (Ka/Ks) for all duplicated CqGST genes to examine the type of selection acting on them. Twenty-one duplicated gene pairs were tandemly duplicated, while 71 were segmentally duplicated. The Ka/Ks ratio of most gene pairs was less than 1, indicating that non-synonymous substitutions were not favored among duplicated genes and that purifying selection was more prevalent. We also calculated the divergence time for these genes. The mean divergence time was approximately 9.27 million years ago (Mya) (Table 3). CqGSTU genes accounted for the majority of the duplication events, which played a major role in quinoa GST gene family expansion.
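The duplication times listed in Table 3 are consistent with the commonly used relation T = Ks / (2λ). The text does not state the substitution rate, so the value λ = 1.5 × 10⁻⁸ synonymous substitutions per site per year used below is an assumption inferred from the tabulated values (for example, Ks = 0.7940 corresponds to about 26.5 Mya, against 26.46 Mya in the table). The function name is illustrative only.

```python
# Assumed rate, inferred from Table 3 rather than stated by the authors.
LAMBDA = 1.5e-8  # synonymous substitutions per site per year

def divergence_time_mya(ks, rate=LAMBDA):
    """Estimate duplication time in million years ago as T = Ks / (2 * rate)."""
    return ks / (2 * rate) / 1e6

# Ks values copied from the first three rows of Table 3.
examples = [
    ("CqGSTU1/CqGSTU5", 0.7940),
    ("CqGSTU1/CqGSTU58", 0.7123),
    ("CqGSTU2/CqGSTU4", 0.1078),
]
for pair, ks in examples:
    print(f"{pair}: {divergence_time_mya(ks):.2f} Mya")
```

Small differences in the last digit against the table are expected, since the tabulated times appear to be truncated rather than rounded.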

Fig. 4.


Chromosomal distribution of 120 CqGST genes on 18 different chromosomes, drawn using TBtools software. CqGST genes present on different chromosomes are represented by different colors. Color description: chr1, rust orange; chr2, light green; chr3 and chr4, grey; chr5 and chr11, aqua; chr6, brown; chr7 and chr16, green; chr8, dark brown; chr10, red; chr12 and chr14, mustard yellow; chr13 and chr15, yellow; chr17, pink; chr18, sky blue

Fig. 5.


A circular plot depicting the 92 duplicated gene pairs present in quinoa in different colors, drawn using TBtools software

Table 3.

Estimated Ka/Ks ratios and divergence times of the duplicated CqGST genes

S no. Gene name 1 Chr no. Gene name 2 Chr no. Percent identity Ka Ks Ka/Ks Duplication time (Mya) Duplication type Selection type
1 CqGSTU1 Chr05 CqGSTU5 Chr05 89.43% 0.4158 0.7940 0.5236 26.46 Tandem Purifying
2 CqGSTU1 Chr05 CqGSTU58 Chr17 88.62% 0.4201 0.7123 0.5898 23.74 Segmental Purifying
3 CqGSTU2 Chr05 CqGSTU4 Chr05 93.59% 0.0316 0.1078 0.2932 3.593 Tandem Purifying
4 CqGSTU2 Chr05 CqGSTU57 Chr16 94.62% 0.0321 0.0972 0.3306 3.24 Segmental Purifying
5 CqGSTU5 Chr05 CqGSTU58 Chr17 92.04% 0.0381 0.1432 0.2663 4.77 Segmental Purifying
6 CqGSTU6 Chr05 CqGSTU59 Chr17 93% 0.5209 0.9775 0.5328 32.58 Segmental Purifying
7 CqGSTU8 Chr05 CqGSTU37 Chr11 90.48% 0.1207 0.1955 0.6178 6.51 Segmental Purifying
8 CqGSTU9 Chr06 CqGSTU44 Chr14 92.76% 0.0324 0.1090 0.2976 3.63 Segmental Purifying
9 CqGSTU10 Chr00 CqGSTU17 Chr07 100% 0.0000 0.0000 0.2035 0 Segmental Purifying
10 CqGSTU10 Chr00 CqGSTU35 Chr10 92.78% 0.0303 0.1070 0.2829 3.56 Segmental Purifying
11 CqGSTU12 Chr06 CqGSTU38 Chr12 93.79% 0.0785 0.1124 0.6981 3.74 Segmental Purifying
12 CqGSTU12 Chr06 CqGSTU41 Chr14 84.31% 0.1181 0.5356 0.2205 17.85 Segmental Purifying
13 CqGSTU13 Chr06 CqGSTU40 Chr14 92.80% 0.0418 0.1305 0.3204 4.35 Segmental Purifying
14 CqGSTU13 Chr06 CqGSTU41 Chr14 97.14% 0.0269 0.0848 0.3170 2.82 Segmental Purifying
15 CqGSTU14 Chr06 CqGSTU43 Chr14 91.67% 0.0386 0.1567 0.2462 5.22 Segmental Purifying
16 CqGSTU15 Chr06 CqGSTU39 Chr14 92.95% 0.0363 0.1519 0.2387 5.06 Segmental Purifying
17 CqGSTU17 Chr07 CqGSTU35 Chr10 95.96% 0.0190 0.0717 0.2653 2.39 Segmental Purifying
18 CqGSTU18 Chr07 CqGSTU36 Chr11 90.67% 0.0470 0.0334 1.4071 1.11 Segmental Positive
19 CqGSTU19 Chr07 CqGSTU63 Chr17 95.74% 0.0222 0.1866 0.1189 6.22 Segmental Purifying
20 CqGSTU20 Chr07 CqGSTU45 Chr14 92.98% 0.0350 0.0281 1.2486 0.93 Segmental Positive
21 CqGSTU20 Chr07 CqGSTU46 Chr15 94.58% 0.0255 0.0511 0.4987 1.7 Segmental Purifying
22 CqGSTU20 Chr07 CqGSTU48 Chr15 89.74% 0.3383 0.3252 1.0404 10.84 Segmental Positive
23 CqGSTU20 Chr07 CqGSTU64 Chr18 91.13% 0.0398 0.1847 0.2155 6.15 Segmental Purifying
24 CqGSTU20 Chr07 CqGSTU65 Chr18 87.19% 0.0704 0.0754 0.9331 2.51 Segmental Purifying
25 CqGSTU21 Chr07 CqGSTU62 Chr17 93.12% 0.0374 0.0907 0.4128 3.02 Segmental Purifying
26 CqGSTU22 Chr07 CqGSTU60 Chr17 92.91% 0.3185 0.8690 0.3665 28.97 Segmental Purifying
27 CqGSTU23 Chr07 CqGSTU24 Chr08 88.16% 0.0639 0.1998 0.3197 6.66 Segmental Purifying
28 CqGSTU23 Chr07 CqGSTU49 Chr15 87.50% 0.0788 0.2402 0.3281 8.01 Segmental Purifying
29 CqGSTU24 Chr08 CqGSTU49 Chr15 80.34% 0.0925 0.2220 0.4166 7.40 Segmental Purifying
30 CqGSTU27 Chr08 CqGSTU50 Chr16 88.74% 0.0492 0.1251 0.3934 4.17 Segmental Purifying
31 CqGSTU29 Chr08 CqGSTU52 Chr16 88.41% 0.0642 0.1255 0.5120 4.18 Segmental Purifying
32 CqGSTU30 Chr08 CqGSTU31 Chr08 84.48% 0.0841 0.2455 0.3426 8.18 Tandem Purifying
33 CqGSTU30 Chr08 CqGSTU53 Chr16 86.58% 0.0604 0.1497 0.4035 4.99 Segmental Purifying
34 CqGSTU30 Chr08 CqGSTU54 Chr16 82.13% 0.0943 0.3304 0.2853 11.01 Segmental Purifying
35 CqGSTU31 Chr08 CqGSTU53 Chr16 80.17% 0.0994 0.2779 0.3576 9.26 Segmental Purifying
36 CqGSTU31 Chr08 CqGSTU54 Chr16 92.34% 0.0333 0.1984 0.1677 6.61 Segmental Purifying
37 CqGSTU33 Chr10 CqGSTU34 Chr10 87.11% 0.0654 0.0611 1.0700 2.04 Tandem Positive
38 CqGSTU38 Chr12 CqGSTU41 Chr14 92.45% 0.2519 1.2935 0.1948 43.12 Segmental Purifying
39 CqGSTU45 Chr15 CqGSTU46 Chr15 96.49% 0.0192 0.0574 0.3342 1.91 Tandem Purifying
40 CqGSTU45 Chr15 CqGSTU47 Chr15 92.31% 0.5645 0.9946 0.5675 33.15 Tandem Purifying
41 CqGSTU45 Chr15 CqGSTU48 Chr15 93.86% 0.0365 0.0245 1.4908 0.82 Tandem Positive
42 CqGSTU45 Chr15 CqGSTU64 Chr18 91.23% 0.0427 0.1953 0.2184 6.51 Segmental Purifying
43 CqGSTU45 Chr15 CqGSTU65 Chr18 84.21% 0.0989 0.1047 0.9445 3.49 Segmental Purifying
44 CqGSTU46 Chr15 CqGSTU64 Chr18 90.64% 0.0409 0.2245 0.1822 7.48 Segmental Purifying
45 CqGSTU46 Chr15 CqGSTU65 Chr18 86.21% 0.0719 0.1264 0.5691 4.21 Segmental Purifying
46 CqGSTU46 Chr15 CqGSTU48 Chr15 91.45% 0.3091 0.3581 0.8631 11.94 Tandem Purifying
47 CqGSTU47 Chr15 CqGSTU48 Chr15 97.87% 0.5602 0.7139 0.7847 23.80 Tandem Purifying
48 CqGSTU48 Chr15 CqGSTU64 Chr18 88.03% 0.3132 0.6329 0.4949 21.10 Segmental Purifying
49 CqGSTU48 Chr15 CqGSTU65 Chr18 81.20% 0.3798 0.4596 0.8264 15.32 Segmental Purifying
50 CqGSTU55 Chr16 CqGSTU56 Chr16 89.04% 0.0852 0.1898 0.4486 6.33 Tandem Purifying
51 CqGSTF2 Chr07 CqGSTF16 Chr17 96.26% 0.0175 0.0888 0.1973 2.96 Segmental Purifying
52 CqGSTF3 Chr07 CqGSTF4 Chr07 91.39% 0.2246 0.3272 0.6866 10.91 Tandem Purifying
53 CqGSTF3 Chr07 CqGSTF17 Chr17 90.07% 0.2242 0.3650 0.6142 12.17 Segmental Purifying
54 CqGSTF4 Chr07 CqGSTF17 Chr17 97.67% 0.0115 0.1161 0.0991 3.87 Segmental Purifying
55 CqGSTF5 Chr07 CqGSTF7 Chr07 88.43% 0.0780 0.1430 0.5455 4.77 Tandem Purifying
56 CqGSTF5 Chr07 CqGSTF14 Chr17 93.64% 0.1004 0.2293 0.4378 7.64 Segmental Purifying
57 CqGSTF6 Chr07 CqGSTF8 Chr07 95.92% 0.0314 0.0452 0.6948 1.51 Tandem Purifying
58 CqGSTF6 Chr07 CqGSTF19 Chr17 85.71% 0.0955 0.1053 0.9065 3.51 Segmental Purifying
59 CqGSTF7 Chr07 CqGSTF14 Chr17 87.04% 0.0618 0.1391 0.4444 4.64 Segmental Purifying
60 CqGSTF8 Chr07 CqGSTF19 Chr17 83.87% 0.0887 0.1014 0.8748 3.38 Segmental Purifying
61 CqGSTF9 Chr07 CqGSTF13 Chr17 96.28% 0.0170 0.1201 0.1411 4.00 Segmental Purifying
62 CqGSTF10 Chr07 CqGSTF12 Chr17 91.37% 0.0515 0.1630 0.3159 5.43 Segmental Purifying
63 CqGSTT2 Chr13 CqGSTT3 Chr16 80.69% 0.1201 0.3129 0.3837 10.43 Segmental Purifying
64 CqGSTZ2 Chr10 CqGSTZ4 Chr01 95.09% 0.0176 0.1252 0.1410 4.17 Segmental Purifying
65 CqGSTDHAR1 Chr04 CqGSTDHAR2 Chr01 81.13% 0.0075 0.1168 0.0643 3.89 Segmental Purifying
66 CqGSTH1 Chr16 CqGSTH2 Chr08 92.42% 0.0135 0.0597 0.2262 1.99 Segmental Purifying
67 CqGSTH3 Chr14 CqGSTH4 Chr06 98.56% 0.0071 0.0587 0.1216 1.96 Segmental Purifying
68 CqGSTM1 Chr01 CqGSTM2 Chr01 85.20% 0.0111 0.0750 0.1474 2.50 Tandem Purifying
69 CqGSTMi1 Chr15 CqGSTMi5 Chr17 83.52% 0.0301 0.1066 0.2829 3.55 Segmental Purifying
70 CqGSTMi2 Chr00 CqGSTMi3 Chr10 84.13% 0.0171 0.0695 0.2463 2.32 Segmental Purifying
71 CqGSTMi4 Chr12 CqGSTMi6 Chr05 92.76% 0.0324 0.0646 0.5008 2.15 Segmental Purifying
72 CqGSTEF1B1 Chr03 CqGSTEF1B2 Chr10 82.74% 0.0917 0.7587 0.1209 25.29 Segmental Purifying
73 CqGSTEF1B1 Chr03 CqGSTEF1B3 Chr10 83.63% 0.0851 0.8136 0.1046 27.12 Segmental Purifying
74 CqGSTEF1B1 Chr03 CqGSTEF1B4 Chr03 84.07% 0.0828 0.8725 0.0949 29.08 Tandem Purifying
75 CqGSTEF1B1 Chr03 CqGSTEF1B5 Chr03 83.63% 0.0847 0.8571 0.0988 28.57 Tandem Purifying
76 CqGSTEF1B1 Chr03 CqGSTEF1B6 Chr10 99.19% 0.0036 0.1574 0.0232 5.25 Segmental Purifying
77 CqGSTEF1B1 Chr03 CqGSTEF1B7 Chr10 97.00% 0.1271 0.2749 0.4622 9.16 Segmental Purifying
78 CqGSTEF1B2 Chr10 CqGSTEF1B3 Chr10 90.41% 0.0490 0.2030 0.2414 6.77 Tandem Purifying
79 CqGSTEF1B2 Chr10 CqGSTEF1B4 Chr03 91.61% 0.0437 0.2507 0.1743 8.36 Segmental Purifying
80 CqGSTEF1B2 Chr10 CqGSTEF1B5 Chr03 90.41% 0.0473 0.2175 0.2175 7.25 Segmental Purifying
81 CqGSTEF1B2 Chr10 CqGSTEF1B6 Chr10 91.06% 0.0407 0.6499 0.0626 21.66 Tandem Purifying
82 CqGSTEF1B3 Chr10 CqGSTEF1B4 Chr03 98.56% 0.0066 0.1303 0.0503 4.34 Segmental Purifying
83 CqGSTEF1B3 Chr10 CqGSTEF1B5 Chr03 96.40% 0.0188 0.1438 0.1306 4.79 Segmental Purifying
84 CqGSTEF1B3 Chr10 CqGSTEF1B6 Chr10 92.68% 0.0328 0.7165 0.0457 23.88 Tandem Purifying
85 CqGSTEF1B4 Chr03 CqGSTEF1B5 Chr03 96.16% 0.0186 0.1663 0.1120 5.54 Tandem Purifying
86 CqGSTEF1B4 Chr03 CqGSTEF1B6 Chr10 92.62% 0.0322 0.7986 0.0404 26.62 Segmental Purifying
87 CqGSTEF1B5 Chr03 CqGSTEF1B6 Chr10 92.68% 0.0328 0.6440 0.0510 21.47 Segmental Purifying
88 CqGSTGHR1 Chr01 CqGSTGHR2 Chr01 84.66% 0.0884 0.4827 0.1832 16.09 Tandem Purifying
89 CqGSTGHR1 Chr01 CqGSTGHR3 Chr02 83.44% 0.0998 0.4451 0.2241 14.84 Segmental Purifying
90 CqGSTGHR1 Chr01 CqGSTGHR4 Chr02 84.05% 0.0413 0.0960 0.4302 3.20 Segmental Purifying
91 CqGSTGHR2 Chr01 CqGSTGHR3 Chr02 95.42% 0.0232 0.0416 0.5579 1.39 Segmental Purifying
92 CqGSTGHR5 Chr17 CqGSTGHR6 Chr07 97.07% 0.0130 0.0556 0.2346 1.85 Segmental Purifying

Protein secondary structure prediction

Alpha helix was the dominant secondary structure element, followed by coil, extended strand and beta turn. The percentage of alpha helix was the highest in all CqGSTs except CqGSTU60, CqGSTZ3, CqGSTMi1, CqGSTMi5, CqGSTEF1B1, CqGSTEF1B2, CqGSTEF1B3, CqGSTEF1B6 and all GHR genes, in which the coil percentage was higher. The highest alpha helix percentage (69.84) was found in CqGSTU40 and the highest coil percentage (54.6) in CqGSTMi5 (Fig. 6) (Table 4).

Fig. 6.


Secondary structure prediction of all CqGSTs using SOPMA showed the dominance of alpha helices followed by coils

Table 4.

Quinoa GSTs secondary structure prediction using SOPMA

Gene name Coil (C) (%) Beta-Turn (T) (%) Extended strand (E) (%) Alpha helix (H) (%)
CqGSTU1 37.12 4.8 14.41 43.67
CqGSTU2 23.87 5.41 13.51 57.21
CqGSTU3 26.16 4.65 11.05 58.14
CqGSTU4 21.29 1.29 9.03 68.39
CqGSTU5 29.33 7.56 13.33 49.78
CqGSTU6 30.75 7.24 16.54 45.48
CqGSTU7 26.13 3.6 14.86 55.41
CqGSTU8 31.62 5.98 11.97 50.43
CqGSTU9 29.09 4.55 14.09 52.27
CqGSTU10 20.83 7.29 10.42 61.46
CqGSTU11 29.55 3.64 13.64 53.18
CqGSTU12 27.31 3.52 11.01 58.15
CqGSTU13 25.33 4.37 10.48 59.83
CqGSTU14 27.14 3.33 13.81 55.71
CqGSTU15 24.45 3.06 11.35 61.14
CqGSTU16 27.03 5.41 13.51 54.05
CqGSTU17 25.89 6.7 12.95 54.46
CqGSTU18 24.16 7.38 9.4 59.06
CqGSTU19 28.51 4.68 11.06 55.74
CqGSTU20 26.73 2.97 12.38 57.92
CqGSTU21 27.33 7.45 16.15 49.07
CqGSTU22 29.6 5.38 12.11 52.91
CqGSTU23 27.38 8.65 15.85 48.13
CqGSTU24 25.66 4.87 9.29 60.18
CqGSTU25 29.39 5.26 9.21 56.14
CqGSTU26 28.5 7.5 13 51
CqGSTU27 29.57 4.35 12.17 53.91
CqGSTU28 28.16 6.9 11.78 53.16
CqGSTU29 30.17 5.6 9.05 55.17
CqGSTU30 26.09 6.96 12.61 54.35
CqGSTU31 23.38 8.23 12.99 55.41
CqGSTU32 25.45 5.36 12.05 57.14
CqGSTU33 28.04 6.07 13.55 52.34
CqGSTU34 25.89 6.25 11.16 56.7
CqGSTU35 28.57 4.46 11.61 55.36
CqGSTU36 25.5 4.03 6.71 63.76
CqGSTU37 26.34 4.46 16.52 52.68
CqGSTU38 25.38 4.57 11.68 58.38
CqGSTU39 25.44 4.82 9.65 60.09
CqGSTU40 15.87 3.97 10.32 69.84
CqGSTU41 36.45 7.48 18.69 37.38
CqGSTU42 28.31 5.02 12.33 54.34
CqGSTU43 28.08 4.43 9.85 57.64
CqGSTU44 30.84 4.41 14.1 50.66
CqGSTU45 21.05 6.14 17.54 55.26
CqGSTU46 25.74 5.45 9.9 58.91
CqGSTU47 34.77 6.74 17.52 40.97
CqGSTU48 22.46 6.42 14.97 56.15
CqGSTU49 26.34 7.93 11.42 54.31
CqGSTU50 31.22 7.69 10.41 50.68
CqGSTU51 28.51 4.39 10.53 56.58
CqGSTU52 26.29 5.17 11.21 57.33
CqGSTU53 26.99 4.87 12.39 55.75
CqGSTU54 29.49 6.41 11.97 52.14
CqGSTU55 28.89 4.89 12.44 53.78
CqGSTU56 29.13 3.91 9.57 57.39
CqGSTU57 27.15 4.07 14.48 54.3
CqGSTU58 32 4.44 11.56 52
CqGSTU59 39.66 3.45 13.1 43.79
CqGSTU60 41.53 6.36 15.25 36.86
CqGSTU61 27.48 4.5 13.51 54.5
CqGSTU62 26.91 5.83 11.66 55.61
CqGSTU63 27.92 6.25 11.25 54.58
CqGSTU64 27.73 6.82 14.09 51.36
CqGSTU65 29.41 4.52 14.03 52.04
CqGSTF1 32.41 6.02 15.28 46.3
CqGSTF2 31.8 6.45 14.75 47
CqGSTF3 32.8 4.58 15 48.33
CqGSTF4 31.31 7.94 13.55 47.2
CqGSTF5 29.75 7.02 11.57 51.65
CqGSTF6 23.91 6.52 8.7 60.87
CqGSTF7 29.91 7.01 13.55 49.53
CqGSTF8 27.42 6.45 9.68 56.45
CqGSTF9 28.04 6.54 14.95 50.47
CqGSTF10 32.32 5.05 14.22 45.41
CqGSTF11 37.22 5.83 14.35 42.6
CqGSTF12 32.99 7.61 13.2 46.19
CqGSTF13 32.24 7.48 15.42 44.86
CqGSTF14 30.93 7.67 15.81 45.58
CqGSTF15 31.19 5.96 14.68 48.17
CqGSTF16 32.24 6.54 13.55 47.66
CqGSTF17 33.18 5.61 14.95 46.26
CqGSTF18 27.06 7.8 14.68 50.46
CqGSTF19 23.62 7.09 4.72 64.57
CqGSTL1 38.46 3.85 14.62 43.08
CqGSTL2 31.81 8.95 14.4 44.84
CqGSTT1 32.61 6.09 8.26 53.04
CqGSTT2 35.19 4.72 10.73 49.36
CqGSTT3 28.77 7.53 10.96 52.74
CqGSTZ1 37.12 3.93 11.79 47.16
CqGSTZ2 38.57 4.48 10.76 46.19
CqGSTZ3 42.15 3.72 12.4 41.74
CqGSTZ4 35.71 4.91 12.5 46.88
CqGSTDHAR1 41.38 2.3 5.17 51.15
CqGSTDHAR2 38.68 3.77 14.15 43.4
CqGSTH1 37.45 2.01 6.12 54.42
CqGSTH2 39.55 2.14 6.7 51.61
CqGSTH3 40.05 3.69 7.62 48.64
CqGSTH4 40.34 3.45 6.98 49.24
CqGSTM1 39.79 2.08 13.15 44.98
CqGSTM2 37.69 5.17 12.16 44.98
CqGSTMi1 47.38 5.79 17.36 29.48
CqGSTMi2 31.4 3.05 8.23 57.32
CqGSTMi3 33.95 3.98 11.41 50.66
CqGSTMi4 25.17 3.97 9.93 60.93
CqGSTMi5 54.6 4.13 13.33 27.94
CqGSTMi6 23.84 3.31 13.91 58.94
CqGSTEF1B1 48 4 21.33 26.67
CqGSTEF1B2 42.65 2.89 14.94 39.52
CqGSTEF1B3 43.79 4.07 12.83 39.31
CqGSTEF1B4 40.38 2.4 14.9 42.31
CqGSTEF1B5 39.42 3.61 14.66 42.31
CqGSTEF1B6 31.97 5.74 29.51 31.79
CqGSTEF1B7 40.51 2.89 10.93 45.66
CqGSTGHR1 44.38 2.43 11.85 41.34
CqGSTGHR2 44.83 3.16 11.21 40.8
CqGSTGHR3 44.54 3.16 11.49 40.8
CqGSTGHR4 47.64 3.38 11.49 37.5
CqGSTGHR5 46.32 2.45 13.73 37.5
CqGSTGHR6 45.59 3.68 11.27 39.46
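The statement that alpha helix dominates in most CqGSTs, with coil dominating in a few, can be checked row by row from Table 4. The sketch below uses four rows copied from the table; the dictionary layout and the function name `dominant_state` are illustrative assumptions, not part of the SOPMA output format.

```python
# Percentages copied from Table 4, in the column order
# (coil, beta turn, extended strand, alpha helix).
sopma = {
    "CqGSTU40": (15.87, 3.97, 10.32, 69.84),
    "CqGSTU60": (41.53, 6.36, 15.25, 36.86),
    "CqGSTMi5": (54.6, 4.13, 13.33, 27.94),
    "CqGSTEF1B6": (31.97, 5.74, 29.51, 31.79),
}
STATES = ("coil", "beta turn", "extended strand", "alpha helix")

def dominant_state(percentages):
    """Return the name of the state with the largest percentage."""
    best = max(range(len(STATES)), key=lambda i: percentages[i])
    return STATES[best]

for gene, row in sopma.items():
    print(gene, dominant_state(row))
```

Applied to the full table, the same function would list every protein in which coil rather than alpha helix is the largest fraction.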

Evolutionary or phylogenetic analysis

To understand the evolutionary relationships between GST gene family members, an unrooted phylogenetic tree was generated for quinoa and other species including A. thaliana, O. sativa, G. max, P. patens and M. truncatula. The phylogenetic analysis showed that CqGSTs were grouped into 11 GST classes: tau, phi, theta, zeta, lambda, DHAR, mPGES, GHR, EF1B, metaxin and hemerythrin. In quinoa, tau and phi are the largest classes of GSTs with 65 and 19 members, respectively, followed by EF1B (7 members), GHR and mPGES (6 members each), hemerythrin and zeta (4 members each), theta (3 members), and metaxin, DHAR and lambda (2 members each). The occurrence of tau and phi as the major classes of the GST gene family is in accordance with GSTs reported in G. max, O. sativa and A. thaliana. All the tau and phi class CqGSTs were found to be closely associated with those of G. max, O. sativa and A. thaliana. The analysis shows that CqGSTs of a given class clustered with GSTs of the same class, with the exception of the lambda class, which clustered with DHAR, and metaxin, which clustered with tau and mPGES class GSTs (Fig. 7).

Fig. 7.


Phylogenetic relationship among GST proteins of A. thaliana, G. max, O. sativa, M. truncatula, P. patens and C. quinoa using MEGA 7. A total of 11 different clades were depicted in different colors (Color description: Blue—Tau; Brown—Hemerythrin; Light green—Lambda; Purple—DHAR; Red—GHR; Light blue—Zeta; Yellow—Theta; Orange—EF1B; Dark green—Phi; Grey—Microsomal; Light orange—Metaxin)

Promoter analysis

Various cis-acting regulatory elements (CAREs) are present in the promoter regions of genes. This study identified 21 cis-elements responsible for various responses, including light, hormone and stress responses, cellular development and other functions. After the core promoter elements (AT~TATA box, TATA box and CAAT box), MYB (395) was the most numerous, followed by STRE (280), ARE (279), ABRE (231), ERE (160), AAGAA (112), TCT (112), TGA (57), TC rich repeat (52), P box (44), Circadian (36), AE and CCAAT (35 each), TCCC (29), DRE (28), GARE (27), CTAG (15) and ACA (1). CqGSTL2 (275) and CqGSTU7 (243) possessed the highest numbers of cis-acting elements, while CqGSTU4 (49), CqGSTL1 and CqGSTT3 (50 each) had the fewest (Fig. 8). The high number of the stress-responsive element MYB can be linked with the role of these genes in stress responses.

Fig. 8.


Cis-acting regulatory element in upstream promoter region of CqGSTs using PlantCARE. The scale represents the number of particular CARE elements in corresponding genes. Grey color is indicative of absence of cis acting regulatory elements

Molecular docking

One candidate member from each identified family was selected for a molecular docking study with metalaxyl. Docking of the selected proteins (CqGSTU1, CqGSTF1, CqGSTL1, CqGSTT1, CqGSTZ1, CqGSTDHAR1, CqGSTH1, CqGSTMi1, CqGSTGHR1) showed that metalaxyl bound to CqGSTF1 with the lowest binding energy of all (Fig. 9) (Table 5).

Fig. 9.


Molecular docking study of one member of each quinoa GST gene family proteins with metalaxyl

Table 5.

Binding energies of CqGST molecules with metalaxyl

Protein molecules Run Binding energy Cluster RMSD Reference RMSD
CqGSTU1 2 −3.63 0.00 99.10
CqGSTF1 25 −5.21 0.00 97.63
CqGSTL1 24 −4.77 0.00 90.22
CqGSTT1 18 −3.25 0.00 98.47
CqGSTZ1 22 −2.66 0.00 93.71
CqGSTDHAR1 13 −2.52 0.00 106.64
CqGSTH1 10 −3.26 0.00 179.61
CqGSTMi1 7 +6.7 0.00 115.18
CqGSTGHR1 1 −3.7 0.00 113.82
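The ranking implied by Table 5 can be reproduced by sorting the binding energies, where a more negative value means a more favourable predicted binding. This is only a sketch of how the table is read: the values are copied from Table 5 (which gives no units), and the variable names are illustrative.

```python
# Binding energies copied from Table 5; units are not given in the table.
binding_energy = {
    "CqGSTU1": -3.63, "CqGSTF1": -5.21, "CqGSTL1": -4.77,
    "CqGSTT1": -3.25, "CqGSTZ1": -2.66, "CqGSTDHAR1": -2.52,
    "CqGSTH1": -3.26, "CqGSTMi1": 6.7, "CqGSTGHR1": -3.7,
}

# Sort from most negative (most favourable) to least favourable.
ranked = sorted(binding_energy.items(), key=lambda item: item[1])
best_protein, best_energy = ranked[0]
print(best_protein, best_energy)
```

CqGSTMi1 is the only protein with a positive binding energy in the table and therefore ends up last in the ranking.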

Discussion

GST genes are involved in various physiological activities in plants, including growth and development, signal transduction pathways, tetrapyrrole signaling, hormone responses and, importantly, biotic and abiotic stress metabolism (Csiszár et al. 2019). GSTs have therefore become a potential target for plant breeders. Comprehensive genome-wide identification of gene families using the genomic information available in different databases is precise and highly significant (Vaish et al. 2022). In the present investigation, a genome-wide analysis of GST genes in C. quinoa is reported for the first time. This study identified 120 GST genes in quinoa, which is higher than the numbers reported in several other species: 31 in V. radiata (Vaish et al. 2018), 60 in A. thaliana (Lallement et al. 2014a, b), 65 in Brassica oleracea (Vijayakumar et al. 2016), 81 in S. lycopersicum (Csiszár et al. 2014), 82 in O. sativa (Jain et al. 2010), 84 in H. vulgare (Rezaei et al. 2013), 85 in C. annuum L. (Islam et al. 2019) and 101 in G. max (Liu et al. 2015), but fewer than in Triticum aestivum and B. napus, which have 346 and 179 GST genes, respectively (Wei et al. 2019; Hao et al. 2021).

Among the 11 classes identified in quinoa, tau and phi class GSTs were the most represented with 65 and 19 members, respectively, followed by EF1B (7), GHR and microsomal (6 each), hemerythrin and zeta (4 each) and theta (3). Metaxin, DHAR and lambda were represented by 2 members each. Tau and phi were the most abundant classes, in accordance with reports for other crops like O. sativa and C. arietinum (Jain et al. 2010; Ghangal et al. 2020). The high number of tau and phi genes reflects their functional importance in plant growth and development, and they can therefore be termed ‘plant-specific GSTs’ owing to their dominance over other classes (Kumar et al. 2013). The CqGSTs varied in size, sequence and physicochemical parameters such as isoelectric point, molecular weight, GRAVY and aliphatic index, which were comparable with those of GSTs of other plant species.

Prediction of the subcellular location of proteins is vital for an in-depth understanding of their function and physicochemical characteristics (Liao et al. 2021; Cong et al. 2022). In the present investigation, subcellular localization prediction showed that the proteins were predominantly located in the cytoplasm, followed by other cellular compartments including chloroplast, mitochondria, plastid, plasma membrane, nucleus, extracellular space, ER and cytoskeleton. The identified GST genes were distributed over 18 chromosomes at different locations. To study the mechanism of expansion of GST genes in quinoa, gene duplication events were analyzed, which showed that the origin of new members in the gene family was mainly due to gene duplication. A ratio of non-synonymous to synonymous substitutions (Ka/Ks) < 1 indicates that a gene pair is under purifying selection, Ka/Ks > 1 indicates positive selection, and Ka/Ks = 1 indicates neutral selection. In the current study, Ka/Ks was less than 1 for most of the gene pairs, indicating the dominance of purifying selection over positive selection. In quinoa, tandem and segmental duplication were the two major driving forces for expansion of the GST gene family. Chromosome 7 possessed the highest number of CqGST genes (seventeen), while chromosome 4 possessed the lowest, i.e., only one. Segmental duplication made the major contribution to the rapid expansion of GSTs in quinoa, as also reported in other crops like C. annuum (Islam et al. 2019), B. napus (Wei et al. 2019), C. arietinum (Ghangal et al. 2020), S. lycopersicum (Islam et al. 2017) and P. bretschneideri (Wang et al. 2018), while tandem duplication was reported as the major event for expansion of GSTs in M. acuminata (Vaish et al. 2022), T. aestivum (Hao et al. 2021), B. oleracea and B. rapa (Wei et al. 2019). Secondary structure prediction of GSTs showed the highest percentage of alpha helix, followed by coil and extended strand.
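The Ka/Ks thresholds described above translate directly into a small classifier. The sketch below applies it to three (Ka, Ks) pairs copied from Table 3; the function name `selection_type` and the handling of Ks = 0 as "undefined" are assumptions made for illustration.

```python
def selection_type(ka, ks):
    """Classify a duplicated gene pair by its Ka/Ks ratio."""
    if ks == 0:
        return "undefined"  # the ratio cannot be computed when Ks is zero
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"

# (Ka, Ks) values copied from Table 3.
pairs = {
    "CqGSTU1/CqGSTU5": (0.4158, 0.7940),
    "CqGSTU18/CqGSTU36": (0.0470, 0.0334),
    "CqGSTU45/CqGSTU48": (0.0365, 0.0245),
}
for name, (ka, ks) in pairs.items():
    print(name, round(ka / ks, 4), selection_type(ka, ks))
```

In practice an exact ratio of 1 is rarely observed, so the neutral case mainly serves to make the three-way rule explicit.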

The phylogenetic analysis revealed that all the tau and phi class GSTs were closely associated with those of G. max, O. sativa and A. thaliana. Well-conserved signature motifs were identified in CqGSTs, i.e. W(A/V)S(P/M) in tau, SQPS/C in theta, SSCS/A in zeta, CPF/YA in lambda, CPFC/S in DHAR, and CPWA in GHR (Vaish et al. 2020). Similar results have also been reported in S. lycopersicum (Islam et al. 2017), C. annuum (Islam et al. 2019) and T. aestivum (Wang et al. 2019a, b). The presence of conserved motifs validates them as GST proteins and underlies their functions in plants. There are 2 exons in most tau class members, 3–6 in phi, 5–6 in DHAR, 6–8 in EF1B, 3–6 in GHR, 11–14 in hemerythrin, 6 in metaxin, 4–12 in microsomal prostaglandin E synthetase and 2–9 in zeta class. A similar gene organization (2 exons and 1 intron) has also been reported in cotton and mung bean (Dong et al. 2016; Vaish et al. 2018). Structural heterogeneity was observed in almost all classes, along with a variable number of exons/introns among members of the same class. A number of previous studies have also reported the presence of multiple introns in the zeta and lambda classes, and fewer in tau (Ding et al. 2017; Islam et al. 2017; Kayum et al. 2018; Han et al. 2018; Vaish et al. 2018).

A CARE, a specific motif in the promoter region of a gene, can bind transcription factors and aid in the regulation of downstream genes (Zhu et al. 2021). The identification and analysis of CAREs in gene promoter regions is therefore of immense utility in understanding the molecular regulation of these genes (Xiaolin et al. 2022). Promoter analysis revealed the presence of different CAREs responsive to hormones, light, cellular development and stress (Kaur et al. 2017), as well as core promoter elements including the AT~TATA box, TATA box and CAAT box (Rahman et al. 2021). Various regulatory elements such as the ERE motif, GARE motif, ABRE, AAGAA, TCCC, TCT, TGA, MYB and P box, which are responsive to light and to hormones such as auxin, gibberellins and abscisic acid, were found in the promoter regions of CqGSTs and thus play an important role in plant growth and metabolism. The presence of different stress-responsive elements such as STRE, DRE and TC rich repeats supports the role of CqGSTs in biotic and abiotic stress. Circadian elements responsible for circadian control were also found in CqGSTs. Only sporadic reports are available regarding the role of this element in plant metabolism. Alderete et al. (2018) extensively studied putative circadian regulation while analyzing the NtGST gene (phi) in tobacco seedlings.

Downy mildew caused by Peronospora farinosa (Fr.) Fr. f. sp. chenopodii is a serious disease of quinoa that leads to a considerable drop in crop yields across continents (Colque-Little et al. 2021). Metalaxyl, a member of the phenylamide group of systemic fungicides, has traditionally been used to control downy mildew in several economically important crops. However, health hazards associated with this fungicide have been reported (Gupta 2018). The molecular docking study showed that CqGSTF1 possessed the highest affinity for metalaxyl, suggesting that the fungicide could be applied to enhance in vivo GST activity in quinoa. Binding of metalaxyl to phi family members could therefore be a potential target for enhancing GST activity for the detoxification of fungicides. Although safener-induced GST activity has been reported in a variety of crops including Arabidopsis, maize and wheat, no report of fungicide-induced GST activity is available in quinoa to date. The results of the present study identified metalaxyl as a potential GST inducer that could be beneficial for crop development and stress modulation. The availability of this information might encourage further functional validation.

Conclusion

The in-silico genome-wide identification and characterization of GST genes in quinoa assumes significance since a number of these genes assist the plant in its physiological functions and enable it to grow under biotic and abiotic stresses. Modern biotechnological tools coupled with conventional breeding approaches can help raise plants adapted to various abiotic stresses. These findings can be used for developing stress-tolerant plants with enhanced productivity through molecular cloning and characterization of the genes, as well as expression studies under stress conditions. The role of GSTs in crop improvement, with special emphasis on plant growth and productivity, will open new avenues for the genetic improvement of plants.

Author contributions

Conceptualization, ST, MB and AB. Methodology, ST, SV, NS and MB. Analysis, ST, SV, MB and AB. Writing – original draft preparation, ST, MB and AB. Writing – review and editing, ST, SV and MB. Supervision, AB. All authors have read and agreed to the published version of the manuscript.

Funding

No funding was received, and no financial or non-financial interests were involved.

Data availability

The datasets were derived from sources in the public domain.

Declarations

Conflict of interest

The authors report there are no competing interests to declare.

Ethical approval

Not applicable.

Research involving human participants and/or animals

Not applicable. No human or animal trials were conducted.

Informed consent

Not applicable. No human trials involved.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

References

  1. Abdul Kayum M, Nath UK, Park J-I, Biswas MK, Choi EK, Song J-Y et al (2018) Genome-wide identification, characterization, and expression profiling of Glutathione S-Transferase (GST) family in pumpkin reveals likely role in cold-stress tolerance. Genes 9:84. 10.3390/genes9020084
  2. Alderete LGS, Guido ME, Agostini E, Mas P (2018) Identification and characterization of key circadian clock genes of tobacco hairy roots: putative regulatory role in xenobiotic metabolism. Environ Sci Pollut Res Int 25:1597–1608. 10.1007/s11356-017-0579-9
  3. Alvar-Beltrán J, Verdi L, Marta AD, Dao A, Vivoli R, Sanou J et al (2020) The effect of heat stress on quinoa (cv. Titicaca) under controlled climatic conditions. J Agric Sci 158:1–7. 10.1017/S0021859620000556
  4. Armenteros JJA, Sønderby CK, Sønderby SK, Nielsen H, Winther O (2017) DeepLoc: prediction of protein subcellular localization using deep learning. Bioinformatics 33:3387–3395. 10.1093/bioinformatics/btx431
  5. Bailey TL, Bodén M, Buske FA, Frith M, Grant CE, Clementi L et al (2009) MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res 37(Web Server Issue):202–208. 10.1093/nar/gkp335
  6. Bhargava A, Ohri D (2015) Quinoa in the Indian subcontinent. In: Bazile A et al (eds) FAO and CIRAD: state of the art report of Quinoa in the World in 2013. FAO, Rome, pp 511–523
  7. Chronopoulou E, Kontouri K, Chantzikonstantinou M, Pouliou F, Perperopoulou F, Voulgari G et al (2014) Plant glutathione transferases: structure, antioxidant catalytic function and in planta protective role in biotic and abiotic stress. Curr Chem Biol 8:58–75. 10.2174/2212796809666150302213733
  8. Colque-Little C, Amby DB, Andreasen C (2021) A review of Chenopodium quinoa (Willd.) diseases: an updated perspective. Plants (Basel, Switzerland) 10:1228. 10.3390/plants10061228
  9. Combet C, Blanchet C, Geourjon C, Deléage G (2000) NPS@: network protein sequence analysis. TIBS 25:147–150. 10.1016/s0968-0004(99)01540-6
  10. Cong H, Liu H, Cao Y, Chen Y, Liang C (2022) Multiple protein subcellular locations prediction based on deep convolutional neural networks with self-attention mechanism. Interdiscip Sci Comput Life Sci 14:421–438. 10.1007/s12539-021-00496-7
  11. Csiszár J, Horvath E, Vary Z, Galle A, Bela K, Brunner S, Tari I (2014) Glutathione transferase supergene family in tomato: salt stress-regulated expression of representative genes from distinct GST classes in plants primed with salicylic acid. Plant Physiol Biochem 78:15–26. 10.1016/j.plaphy.2014.02.010
  12. Csiszár J, Hecker A, Labrou NE, Schröder P, Riechers DE (2019) Plant glutathione transferases: diverse, multi-tasking enzymes with yet-to-be discovered functions. Front Plant Sci 10:1304. 10.3389/fpls.2019.01304
  13. Czarnocka W, Karpiński S (2018) Friend or foe? Reactive oxygen species production, scavenging and signaling in plant response to environmental stresses. Free Radic Biol Med 122:4–20. 10.1016/j.freeradbiomed.2018.01.011
  14. Czerniawski P, Bednarek P (2018) Glutathione S-transferases in the biosynthesis of sulfur-containing secondary metabolites in Brassicaceae plants. Front Plant Sci 9:1639. 10.3389/fpls.2018.01639
  15. Dakhili S, Abdolalizadeh L, Hosseini SM, Shojaee-Aliabadi S, Mirmoghtadaie L (2019) Quinoa protein: composition, structure and functional properties. Food Chem 299:125161. 10.1016/j.foodchem.2019.125161
  16. Danielsen S, Bonifacio A, Ames T (2003) Diseases of quinoa (Chenopodium quinoa). Food Rev Int 19:43–59. 10.1081/FRI-120018867
  17. Ding N, Wang A, Zhang X, Wu Y, Wang R, Cui H et al (2017) Identification and analysis of glutathione S-transferase gene family in sweet potato reveal divergent GST-mediated networks in aboveground and underground tissues in response to abiotic stresses. BMC Plant Biol 17:225. 10.1186/s12870-017-1179-z
  18. Dong Y, Li C, Zhang Y, He Q, Daud MK, Chen J, Zhu S (2016) Glutathione S-Transferase gene family in Gossypium raimondii and G. arboreum: Comparative genomic study and their expression under salt stress. Front Plant Sci 7:139. 10.3389/fpls.2016.00139
  19. Fafián-Labora JA, Rodríguez-Navarro JA, O’Loghlen A (2020) Small extracellular vesicles have GST activity and ameliorate senescence-related tissue damage. Cell Metab 32:71–86. 10.1016/j.cmet.2020.06.004
  20. Fang X, An Y, Zheng J, Shangguan L, Wang L (2020) Genome-wide identification and comparative analysis of GST gene family in apple (Malus domestica) and their expressions under ALA treatment. 3 Biotech 10:307. 10.1007/s13205-020-02299-x
  21. Fuentes FF, Bhargava A (2011) Morphological analysis of quinoa germplasm grown under lowland desert conditions. J Agron Crop Sci 197:124–134. 10.1111/j.1439-037X.2010.00445.x
  22. Gao J, Chen B, Lin H, Liu Y, Wei Y, Chen F, Li W (2020) Identification and characterization of the glutathione S-transferase (GST) family in radish reveals a likely role in anthocyanin biosynthesis and heavy metal stress tolerance. Gene 743:144484. 10.1016/j.gene.2020.144484
  23. Gasteiger E, Hoogland C, Gattiker A, Duvaud S, Wilkins MR, Appel RD et al (2005) Protein identification and analysis tools on the ExPASy server. The proteomics protocols handbook. Springer Protocols Handbooks, Humana Press, pp 571–607. 10.1385/1-59259-890-0:571
  24. Ghangal R, Rajkumar MS, Garg R et al (2020) Genome-wide analysis of glutathione S-transferase gene family in chickpea suggests its role during seed development and abiotic stress. Mol Biol Rep 47:2749–2761. 10.1007/s11033-020-05377-8
  25. Gupta PK (2018) Toxicity of fungicides. In: Gupta RC (ed) Veterinary toxicology. Academic Press Elsevier, pp 569–580. 10.1016/B978-0-12-811410-0.00045
  26. Han XM, Yang ZL, Liu YJ, Yang HL, Zeng QY (2018) Genome-wide profiling of expression and biochemical functions of the Medicago, glutathione S-transferase gene family. Plant Physiol Biochem 126:126–133. 10.1016/j.plaphy.2018.03.004
  27. Hao Y, Xu S, Iyu Z, Wang H, Kong L, Sun S (2021) Comparative analysis of the GLUTATHIONE S-transferase gene family of four Triticeae species and transcriptome analysis of GST Genes in common wheat responding to salt stress. Int J Genomics 18:6289174. 10.1155/2021/6289174
  28. Hasan MS, Singh V, Islam S, Islam MS, Ahsan R, Kaundal A et al (2021) Genome-wide identification and expression profiling of glutathione S-transferase family under multiple abiotic and biotic stresses in Medicago truncatula L. PLoS ONE 16(2):e0247170. 10.1371/journal.pone.0247170
  29. Holub EB (2001) The arms race is ancient history in Arabidopsis, the wildflower. Nat Rev Genet 2:516–527. 10.1038/35080508
  30. Hong S, Cheon K, Yoo K, Lee H, Cho K, Suh J et al (2017) Complete chloroplast genome sequences and comparative analysis of C. quinoa and C. album. Front Plant Sci 8:1696. 10.3389/fpls.2017.01696
  31. Horton P, Park K, Obayashi T, Fujita N, Harada H, Adams-Collier CJ et al (2007) WoLF PSORT: protein localization predictor. Nucleic Acids Res 35(Web Server Issue):W585–W587. 10.1093/nar/gkm259
  32. Hu B, Jin J, Guo A-Y, Zhang H, Luo J, Gao G (2015) GSDS 2.0: an upgraded gene feature visualization server. Bioinformatics 31:1296–1297. 10.1093/bioinformatics/btu817
  33. Islam S, Rahman IA, Islam T, Ghosh A (2017) Genome-wide identification and expression analysis of glutathione S-transferase gene family in tomato: gaining an insight to their physiological and stress-specific roles. PLoS ONE 12:e0187504. 10.1371/journal.pone.0187504
  34. Islam S, Sajib SD, Jui ZS, Arabia S, Islam T (2019) Genome-wide identification of glutathione S-transferase gene family in pepper, its classification, and expression profiling under different anatomical and environmental conditions. Sci Rep 9:910. 10.1038/s41598-019-45320-x
  35. Jain M, Ghanashyam C, Bhattacharjee A (2010) Comprehensive expression analysis suggests overlapping and specific roles of glutathione S-transferases during development and stress responses in rice. BMC Genomics 11:73. 10.1186/1471-2164-11-73
  36. Jarvis DE, Ho YS, Lightfoot DJ, Schmöckel SM, Li B, Borm TJA et al (2017) The genome of Chenopodium quinoa. Nature 542:307–312. 10.1038/nature21370
  37. Kaur A, Pati PK, Pati AM, Nagpal AK (2017) In-silico analysis of cis-acting regulatory elements of pathogenesis-related proteins of Arabidopsis thaliana and Oryza sativa. PLoS ONE 12:e0184523. 10.1371/journal.pone.0184523
  38. Koch MA, Haubold B, Mitchell-Olds T (2000) Comparative evolutionary analysis of chalcone synthase and alcohol dehydrogenase loci in Arabidopsis, Arabis, and related genera (Brassicaceae). Mol Biol Evol 17:1483–1498. 10.1093/oxfordjournals.molbev.a026248
  39. Kong X, Lv W, Jiang S, Zhang D, Cai G, Pan J, Li D (2013) Genome-wide identification and expression analysis of calcium-dependent protein kinase in maize. BMC Genomics 14:433. 10.3389/fpls.2016.00469
  40. Kumar S, Asif MH, Chakrabarty D, Tripathi RD, Dubey RS, Trivedi PK (2013) Differential expression of rice lambda class GST gene family members during plant growth, development, and in response to stress conditions. Plant Mol Biol Rep 31:569–580. 10.1007/s11105-012-0524-5
  41. Kumar S, Stecher G, Tamura K (2016) MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol 33:1870–1874. 10.1093/molbev/msw054
  42. Lallement PA, Brouwer B, Keech O, Hecker A, Rouhier N (2014a) The still mysterious roles of cysteine-containing glutathione transferases in plants. Front Pharmacol 5:192. 10.3389/fphar.2014.00192
  43. Lallement PA, Meux E, Gualberto JM, Prosper P, Didierjean C, Saul F, Haouz A, Rouhier N, Hecker A (2014b) Structural and enzymatic insights into Lambda glutathione transferases from Populus trichocarpa, monomeric enzymes constituting an early divergent class specific to terrestrial plants. Biochem J 462:39–52. 10.1042/BJ20140390
  44. Lescot M, Déhais P, Thijs G, Marchal K, Moreau Y, Van de Peer Y et al (2002) PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences. Nucleic Acids Res 30:325–327. 10.1093/nar/30.1.325
  45. Letunic I, Khedkar S, Bork P (2021) SMART: recent updates, new developments and status in 2020. Nucleic Acids Res 49(D1):D458–D460. 10.1093/nar/gkaa937
  46. Li K, Fan Y, Zhou G et al (2022) Genome-wide identification, phylogenetic analysis, and expression profiles of trihelix transcription factor family genes in quinoa (Chenopodium quinoa Willd.) under abiotic stress conditions. BMC Genomics 23:499. 10.1186/s12864-022-08726-y
  47. Liao Z, Pan G, Sun C, Tang J (2021) Predicting subcellular location of protein with evolution information and sequence-based deep learning. BMC Bioinform 22(Suppl 10):515. 10.1186/s12859-021-04404-0
  48. Liu YJ, Han XM, Ren LL, Yang HL, Zeng QY (2013) Functional divergence of the glutathione S-transferase supergene family in Physcomitrella patens reveals complex patterns of large gene family evolution in land plants. Plant Physiol 161:773–786. 10.1104/pp.112.205815
  49. Liu S, Liu Y, Yang X, Tong C, Edwards D, Parkin IA et al (2014) The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes. Nat Commun 5:3930. 10.1038/ncomms4930
  50. Liu H-J, Tang Z-X, Han X-M, Yang Z-L, Zhang F-M, Yang H-L et al (2015) Divergence in enzymatic activities in the soybean GST supergene family provides new insight into the evolutionary dynamics of whole-genome duplicates. Mol Biol Evol 32:2844–2859. 10.1093/molbev/msv156
  51. Liu J, Wang R, Liu W, Zhang H, Guo Y, Wen R (2018) Genome-wide characterization of heat-shock protein 70s from Chenopodium quinoa and expression analyses of Cqhsp70s in response to drought stress. Genes 9:35. 10.3390/genes9020035
  52. Marchler Bauer A, Bo Y, Han L, He J, Lanczycki CJ, Lu S et al (2017) CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res 45:D200–D203. 10.1093/nar/gkw1129
  53. Marimo P, Hayeshi R, Mukanganyama S (2016) Inactivation of glutathione transferase 2 by Epiphyllocoumarin. Biochem Res Int 2016:1–8. 10.1155/2016/2516092
  54. Mittler R, Vanderauwera S, Gollery M, Breusegem FV (2004) Reactive oxygen gene network of plants. Trends Plant Sci 9:490
  55. Morales A, Zurita-Silva A, Maldonado J, Silva H (2017) Transcriptional responses of Chilean quinoa (Chenopodium quinoa Willd.) under water deficit conditions uncovers ABA-independent expression patterns. Front Plant Sci 8:216. 10.3389/fpls.2017.00216
  56. Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS et al (2009) AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J Comput Chem 30:2785–2791. 10.1002/jcc.21256
  57. Nianiou-Obeidat I, Madesis P, Kissoudis C, Voulgari G, Chronopoulou E, Tsaftaris A et al (2017) Plant glutathione transferase-mediated stress tolerance: functions and biotechnological applications. Plant Cell Rep 36:791–805. 10.1007/s00299-017-2139-7
  58. Pizzio GA (2022) Genome-wide identification of the PYL gene family in Chenopodium quinoa: from genes to protein 3D structure analysis. Stresses 2:290–307. 10.3390/stresses2030021
  59. Rahman MM, Rahman MM, Eom JS et al (2021) Genome-wide identification, expression profiling and promoter analysis of trehalose6-phosphate phosphatase gene family in rice. J Plant Biol 64:55–71. 10.1007/s12374-020-09279-x
  60. Rezaei MK, Shobbar ZS, Shahbazi M, Abedini R, Zare S (2013) Glutathione S-transferase (GST) family in barley: identification of members, enzyme activity, and gene expression pattern. J Plant Physiol 170:1277–1284. 10.1016/j.jplph.2013.04.005
  61. Robert X, Gouet P (2014) Deciphering key features in protein structures with the new ENDscript server. Nucleic Acids Res 42(Web Server Issue):W320–W324. 10.1093/nar/gku316
  62. Sievers F, Wilm A, Dineen DG, Gibson TJ, Karplus K, Li W et al (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7:539. 10.1038/msb.2011.75
  63. Song W, Zhou F, Shan C, Zhang Q, Ning M, Liu X et al (2021) Identification of glutathione S-transferase genes in hami melon (Cucumis melo var. saccharinus) and their expression analysis under cold stress. Front Plant Sci 12:672017. 10.3389/fpls.2021.672017
  64. Suyama M, Torrents D, Bork P (2006) PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res 34:W609–W612. 10.1093/nar/gkl315
  65. Vaish S, Awasthi P, Tiwari S, Tiwari SK, Basantani MK (2018) In silico genome-wide identification and characterization of glutathione S-transferase gene family in Vigna radiata (L.) Wilczek. Genome 61:311–322. 10.1139/gen-2017-0192
  66. Vaish S, Gupta D, Mehrotra R, Mehrotra S, Basantani MK (2020) Glutathione S-transferase: a versatile protein family. 3 Biotech 10:321. 10.1007/s13205-020-02312-3
  67. Vaish S, Praveen R, Gupta D, Basantani MK (2022) Genome-wide identification and characterization of glutathione S-transferase gene family in Musa acuminata L. AAA group and gaining an insight to their role in banana fruit development. J Appl Genet 63:609–631. 10.1007/s13353-022-00707-x
  68. Vijayakumar H, Thamilarasan SK, Shanmugam A, Natarajan SK, Jung HJ, Park JI et al (2016) Glutathione transferases superfamily: cold-inducible expression of distinct GST genes in Brassica oleracea. Int J Mol Sci 17:1211. 10.3390/ijms17081211
  69. Vita F, Ghignone S, Bazihizina N, Rasouli F, Sabbatini L, Kiani-Pouya A et al (2021) Early responses to salt stress in quinoa genotypes with opposite behavior. Physiol Plant 173:1392–1420. 10.1111/ppl.13425
  70. Wang L, Qian M, Wang R et al (2018) Characterization of the glutathione S-transferase (GST) gene family in Pyrus bretschneideri and their expression pattern upon superficial scald development. Plant Growth Regul 86:211–222. 10.1007/s10725-018-0422-4
  71. Wang C, Xu H, Lin S, Deng W, Zhou J, Zhang Y, Shi Y, Peng D, Xue Y (2019a) GPS 5.0: an update on the prediction of kinase-specific phosphorylation sites in proteins. Genom Proteom Bioinform 18:72–80. 10.1016/j.gpb.2020.01.001
  72. Wang R, Ma J, Zhang Q et al (2019b) Genome-wide identification and expression profiling of glutathione transferase gene family under multiple stresses and hormone treatments in wheat (Triticum aestivum L). BMC Genomics 20:986. 10.1186/s12864-019-6374-x
  73. Wei L, Zhu Y, Liu R, Zhang A, Zhu M, Xu W et al (2019) Genome wide identification and comparative analysis of glutathione transferases (GST) family genes in Brassica napus. Sci Rep 9:9196. 10.1038/s41598-019-45744-5
  74. Xiaolin Z, Baoqiang W, Xian W, Xiaohong W (2022) Identification of the CIPK-CBL family gene and functional characterization of CqCIPK14 gene under drought stress in quinoa. BMC Genomics 23:447. 10.1186/s12864-022-08683-6
  75. Yasui Y, Hirakawa H, Oikawa T, Toyoshima M, Matsuzaki C, Ueno M et al (2016) Draft genome sequence of an inbred line of Chenopodium quinoa, an allotetraploid crop with great environmental adaptability and outstanding nutritional properties. DNA Res 23:535–546. 10.1093/dnares/dsw037
  76. Yu C-S, Chen Y-C, Lu C-H, Hwang J-K (2006) Prediction of protein subcellular localization. Proteins 64:643–651. 10.1002/prot.21018
  77. Yue H, Chang Xi, Zhi Y, Wang L, Xing G, Song W, Nie X (2019) Evolution and identification of the WRKY gene family in quinoa (Chenopodium quinoa). Genes 10:131. 10.3390/genes10020131
  78. Zheng W, Zhang C, Li Y, Pearce R, Bell EW, Zhang Y (2021) Folding non-homology proteins by coupling deep-learning contact maps with I-TASSER assembly simulations. Cell Rep Methods 1:100014. 10.1016/j.crmeth.2021.100014
  79. Zhu X, Wang B, Wang X, Zhang C, Wei X (2021) Genome-wide identification, characterization and expression analysis of the LIM transcription factor family in quinoa. Physiol Mol Biol Plants 27:787–800. 10.1007/s12298-021-00988-2

Articles from 3 Biotech are provided here courtesy of Springer