2022 Nov 29;13:1067562. doi: 10.3389/fgene.2022.1067562

FIGURE 1.

Overall scheme for constructing and developing TSSNote and CyaPromBERT. TSSNote downloads raw dRNA-seq datasets from the NCBI SRA database, then aligns, sorts, and filters the reads to extract promoter and non-promoter sequences. These sequences are then used to train a BERT model for promoter prediction. Randomly generated DNA sequences of similar length to the promoters are added to reduce bias and overfitting and to improve the model’s robustness. The trained model supports promoter prediction, regional scanning, and visualization at base-pair resolution.
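The random-sequence step in the caption can be illustrated with a minimal sketch. The caption does not say how the random sequences were generated, so this is an assumption: bases are drawn uniformly from A, C, G, and T, and each length is sampled from the lengths of the real promoters. The function names `random_dna` and `make_random_negatives` are hypothetical and are not part of TSSNote or CyaPromBERT.

```python
import random


def random_dna(length, rng):
    """Return a random DNA string of the given length (uniform base composition, an assumption)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def make_random_negatives(promoters, n, seed=0):
    """Generate n random sequences whose lengths are sampled from the promoter lengths.

    Hypothetical helper for illustration only; the seed makes the output reproducible.
    """
    if not promoters:
        raise ValueError("at least one promoter sequence is required")
    rng = random.Random(seed)
    lengths = [len(p) for p in promoters]
    return [random_dna(rng.choice(lengths), rng) for _ in range(n)]


# Toy promoter sequences, made up for this example.
example_promoters = ["ACGT" * 10, "TTGACA" * 8]
negatives = make_random_negatives(example_promoters, n=5, seed=42)
```

Sampling lengths from the real promoters keeps the random set at a similar size to the promoter set, so sequence length alone cannot separate the two classes. A real dataset might instead match the GC content of the source genome, which this sketch does not attempt.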