Reviewer name and names of any other individuals who aided in the review |
Jiaxu Wang |
Do you understand and agree to our policy of having open and named reviews, and having your review included with the published papers? (If no, please inform the editor that you cannot review this manuscript.) |
Yes |
Is the language of sufficient quality? |
Yes |
Please add additional comments on language quality to clarify if needed
|
|
Are all data available and do they match the descriptions in the paper? |
Yes |
Additional Comments |
|
Are the data and metadata consistent with relevant minimum information or reporting standards? See GigaDB checklists for examples: http://gigadb.org/site/guide |
Yes |
Additional Comments |
|
Is the data acquisition clear, complete and methodologically sound? |
Yes |
Additional Comments |
|
Is there sufficient detail in the methods and data-processing steps to allow reproduction? |
Yes |
Additional Comments |
|
Is there sufficient data validation and statistical analyses of data quality? |
No |
Additional Comments |
The authors ran DSR on in vitro transcribed RNAs from 6 cell lines to remove possible natural modifications. The data can be used as a control RNA pool for studies of natural or artificial modifications.
However, more statistical analyses of data quality should be performed. See the comments below:
(1) To support broader use of this data, some QC analysis should be provided to confirm the quality of the sequencing data. For example: 1) What is the correlation between the in vitro transcribed RNA data and the original DSR data for each cell line? 2) How many genes were captured in each cell line?
(2) In Figure 2B, the authors provide 3 conditions for ‘exclude’ and ‘include’. Some statistical analysis should be performed to show how many cases fall under condition 1, condition 2, and condition 3. How many mismatches appear in only 1 cell line, in some cell lines, or in all cell lines? The correct genes shared across cell lines may be more reliable references for the modification analysis.
(3) Different reads of the same gene could have different mismatches in the IVT RNAs due to RT-PCR bias or other reasons (especially for lowly expressed RNAs). For example, if there are 100 reads in total at a given position, 90 of which carry the correct nucleotide and 10 of which carry a mismatch in the IVT sample, how should the signal be defined for the control reference? Given that natural modifications are rare in RNA, some threshold should be applied to obtain confident results. For example, what is the lowest expression level at which a gene can be used as a confident control reference?
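To illustrate the kind of threshold asked for in comment (3), here is a minimal Python sketch. It is only an illustration: the names `PositionCounts`, `mismatch_fraction`, and `usable_as_control`, and the cut-off values `min_coverage=20` and `max_mismatch_fraction=0.05`, are hypothetical placeholders of mine and are not taken from the manuscript. The authors would need to choose and justify their own values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PositionCounts:
    """Read counts at one reference position in an IVT sample (hypothetical structure)."""
    matches: int      # reads carrying the reference nucleotide
    mismatches: int   # reads carrying any other nucleotide


def mismatch_fraction(counts: PositionCounts) -> Optional[float]:
    """Return the fraction of reads with a mismatch, or None if there is no coverage."""
    total = counts.matches + counts.mismatches
    if total == 0:
        return None
    return counts.mismatches / total


def usable_as_control(
    counts: PositionCounts,
    min_coverage: int = 20,                # assumed cut-off, for illustration only
    max_mismatch_fraction: float = 0.05,   # assumed cut-off, for illustration only
) -> bool:
    """Accept a position as a control reference only if it has enough reads
    and its IVT mismatch fraction is at or below the chosen limit."""
    total = counts.matches + counts.mismatches
    if total < min_coverage:
        return False
    return counts.mismatches / total <= max_mismatch_fraction


# The example from comment (3): 100 reads, 90 correct, 10 with a mismatch.
example = PositionCounts(matches=90, mismatches=10)
print(mismatch_fraction(example))
print(usable_as_control(example))
```

With these placeholder cut-offs, the position in the example has a mismatch fraction of 10/100 and would be rejected as a control reference. A report of how many positions and genes pass such a filter in each cell line would help users judge the data.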
|
Is the validation suitable for this type of data? |
Yes |
Additional Comments |
|
Is there sufficient information for others to reuse this dataset or integrate it with other data? |
No |
Additional Comments |
To support broader use of this data, more QC analyses should be performed; please refer to my comments above. |
Any Additional Overall Comments to the Author |
|
Recommendation |
Major Revision |