Figure S23. Cluster Analysis of Sequences clustering of receptor and related molecule sequences.
A Cluster Analysis of Sequences clustering layout in which shapes represent sequences and lines connect sequences whose similarity meets or surpasses the P-value threshold. Sequences are positioned in clusters based on similarity. Initial identification and annotation of clusters was performed using the built-in convex clustering at a P-value of 1 × 10⁻¹⁰⁰. (A) Clustering was loosened until the groups annotated as canonical receptors formed a cluster, at a P-value of 1 × 10⁻⁶⁵. (B) Loosening the P-value to 1 × 10⁻⁶⁰ identified relationships between clusters of interest and showed that the intermediate group connects to both the canonical and the chemokine-like plus groups. All sequences connected to groups of interest are vertebrate sequences. (C) Further loosening to a P-value of 1 × 10⁻⁵⁰ connects the vertebrate sequences of interest to a large cluster containing vertebrate and invertebrate sequences annotated as opioid receptors, somatostatin receptors and other GPCRs. Crosses indicate invertebrate sequences and Y-shapes indicate the reference viral sequences included. Shapes are colour-coded by group of interest: purple = canonical chemokine receptors; yellow = chemokine-like plus; green = atypical receptor 3/GPR182; blue = intermediate group; pink = relaxin receptors.