Author manuscript; available in PMC: 2023 Feb 14.
Published in final edited form as: Methods Mol Biol. 2022 Jan 1;2443:27–55. doi: 10.1007/978-1-0716-2067-0_2

Table 7.

Programming recipes to analyze data in Ensembl Plants, including Perl API (A), R biomaRt (B), FTP (F), SQL (S), REST (R), and Ensembl VEP (V) examples. These recipes and their software dependencies, together with a few more scripts for phylogenomic analyses, are kept up to date at https://github.com/Ensembl/plant-scripts

Recipe Description
A1 Load the Registry object with details of genomes available
A2 Check which analyses are available for a species
A3 Get soft-masked sequences from Arabidopsis thaliana
A4 Get BED file with repeats in chr4
A5 Find the DEAR3 gene
A6 Get the transcript used in Compara analyses
A7 Find all orthologues of a gene
A8 Get markers mapped on chr1D of bread wheat
A9 Find all syntelogues among rices
A10 Print all translations for otherfeatures genes
B1 Check plant marts and select dataset
B2 Check available filters and attributes
B3 Download GO terms associated with genes
B4 Get Pfam domains annotated in genes
B5 Get SNP consequences from a selected variation source
C1 Find RNA-seq CRAM files for a genome assembly
F1 Download peptide sequences in FASTA format
F2 Download CDS nucleotide sequences in FASTA format
F3 Download transcripts (cDNA)
F4 Download soft-masked genomic sequences
F5 Upstream/downstream sequences
F6 Get mappings to UniProt proteins
F7 Get indexed, bgzipped VCF file with variants mapped
F8 Get precomputed VEP cache files
F9 Download all homologies in a single TSV file (several GB)
F10 Download UniProt report of Ensembl Plants
F11 Retrieve list of new species in current release
F12 Get current plant species tree cladogram
S1 Check currently supported Ensembl Genomes (EG) core schemas
S2 Count protein-coding genes of a particular species
S3 Get stable_ids of transcripts used in Compara analyses
S4 Get variants significantly associated with phenotypes
S5 Get Triticum aestivum homeologous genes across A, B, and D subgenomes
S6 Count the number of whole-genome alignments of all genomes
S7 Extract all the mutations and consequences for a known line of triticum_aestivum
R1 Create an HTTP client and helper functions
R2 Get metadata for all plant species
R3 Find features overlapping genomic region
R4 Fetch phenotypes overlapping genomic region
R5 Find homologues of selected gene
R6 Get annotation of orthologous genes/proteins
R7 Fetch variant consequences for multiple variant ids
R8 Check consequences of single SNP within CDS sequence
R9 Retrieve variation sources of a species
V1 Download, install, and update VEP
V2 Unpack downloaded cache file and check SIFT support
V3 Predict effect of variants
V4 Predict effect of variants for species not in Ensembl
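
To give a flavour of the REST recipes (R1 to R3), the following is a minimal Python sketch and not the code distributed in the plant-scripts repository. It assumes the public Ensembl REST server at https://rest.ensembl.org and its `/info/species` and `/overlap/region` endpoints. The helper names (`build_url`, `get_json`, `overlap_url`, `demo`) and the example region are hypothetical and chosen only for illustration.

```python
import json
import urllib.parse
import urllib.request

SERVER = "https://rest.ensembl.org"  # assumed public Ensembl REST server


def build_url(endpoint, params=None):
    """Join the server, an endpoint path, and optional query parameters."""
    url = SERVER + endpoint
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def get_json(url, timeout=30):
    """GET a URL and decode the JSON response (needs network access)."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def overlap_url(species, region, feature="gene"):
    """URL for features of one type overlapping a region like '2:10000-20000'."""
    path = "/overlap/region/{}/{}".format(
        species, urllib.parse.quote(region, safe=":-")
    )
    return build_url(path, {"feature": feature})


def demo():
    """Illustrative queries in the spirit of R2 and R3 (hypothetical helper)."""
    # R2-style: metadata for all species in the EnsemblPlants division
    info = get_json(build_url("/info/species", {"division": "EnsemblPlants"}))
    print(len(info["species"]), "plant species")
    # R3-style: genes overlapping an arbitrary example region
    for gene in get_json(overlap_url("arabidopsis_thaliana", "2:10000-20000")):
        print(gene.get("id"), gene.get("start"), gene.get("end"))
```

Calling `demo()` performs the two network requests. The URL builders are kept separate from the HTTP call so that queries can be inspected or logged before anything is sent to the server.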