2023 Oct 5;24:223. doi: 10.1186/s13059-023-03071-z

Table 4.

Summary of pangene clusters obtained for datasets ACK2 and rice3 and the corresponding orthogroups in Ensembl Plants. Core clusters contain genes from all analyzed genomes; in rice, shell clusters contain genes from two species. BUSCO completeness percentages for core sets are shown in square brackets. Clusters with multiple copies have several genes from the same species. gDNA segments are shell clusters that bring together a gene model and a matching genomic segment from the underlying WGA. Column ‘match Compara’ shows the number of pangene clusters that contain the same genes as the corresponding Compara orthogroups. The last column shows the number of pangene clusters that contain sequences that share an InterPro domain (the number in square brackets is for core clusters only).

| Clusters | Dataset | Core clusters [%BUSCO] | Multiple copies | Shell clusters | gDNA segments | Match Compara | Share InterPro domains |
|---|---|---|---|---|---|---|---|
| Compara orthogroups | ACK2 | 20,192 [90.6] | 161 | | | | [18,259] |
| minimap2 clusters | ACK2 | 20,647 [94.1] | 731 | | | 18,245 | [18,792] |
| GSAlign clusters | ACK2 | 16,476 [74.9] | 454 | | | 14,181 | [14,817] |
| Compara orthogroups | rice3 | 13,020 [65.6] | 219 | 6386 | | | 16,766 [11,571] |
| minimap2 clusters | rice3 | 22,880 [85.2] | 3360 | 7825 | 6521 | 18,281 | 23,062 [19,239] |
| GSAlign clusters | rice3 | 20,399 [84.6] | 2885 | 9730 | 6103 | 17,103 | 22,834 [17,135] |
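
The cluster categories defined in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `classify_cluster`, the `(genome, gene_id)` input layout, the genome labels `g1` to `g3`, and the `"other"` label for clusters found in a single genome are all assumptions made for illustration. A cluster is called core when every analyzed genome contributes a gene, shell when at least two but not all genomes do, and it is flagged as multi-copy when any genome contributes more than one gene.

```python
from collections import Counter


def classify_cluster(genes, n_genomes):
    """Classify one cluster by genome occupancy.

    genes: list of (genome, gene_id) pairs (hypothetical layout).
    n_genomes: number of genomes analyzed in the dataset.
    Returns (category, has_multiple_copies).
    """
    # Count how many genes each genome contributes to this cluster.
    per_genome = Counter(genome for genome, _ in genes)
    occupancy = len(per_genome)

    if occupancy == n_genomes:
        category = "core"    # genes from all analyzed genomes
    elif occupancy >= 2:
        category = "shell"   # genes from some, but not all, genomes
    else:
        category = "other"   # assumption: label for single-genome clusters

    # Multi-copy: several genes from the same genome.
    has_multiple_copies = any(count > 1 for count in per_genome.values())
    return category, has_multiple_copies


# Hypothetical three-genome example (identifiers are made up).
clusters = {
    "c1": [("g1", "a1"), ("g2", "b1"), ("g3", "c1")],
    "c2": [("g1", "a2"), ("g2", "b2")],
    "c3": [("g1", "a3"), ("g1", "a4"), ("g2", "b3"), ("g3", "c3")],
}
summary = {name: classify_cluster(genes, 3) for name, genes in clusters.items()}
```

With only two genomes (`n_genomes=2`), every cluster that spans both genomes is core, which is consistent with the ACK2 rows having no shell-cluster counts.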