Reviewer name and names of any other individuals who aided in the review |
Endre Barta |
Do you understand and agree to our policy of having open and named reviews, and having your review included with the published papers? (If no, please inform the editor that you cannot review this manuscript.) |
Yes |
Is the language of sufficient quality? |
Yes |
Please add additional comments on language quality to clarify if needed
|
|
Are all data available and do they match the descriptions in the paper? |
No |
Additional Comments |
The authors provided only the assembly in FASTA and GenBank formats and the contigs (scaffolds?) in GenBank format. Neither the annotation nor the raw Illumina reads are available. |
Are the data and metadata consistent with relevant minimum information or reporting standards? See GigaDB checklists for examples: http://gigadb.org/site/guide |
Yes |
Additional Comments |
For the data that have been uploaded, the provided metadata are consistent. |
Is the data acquisition clear, complete and methodologically sound? |
Yes |
Additional Comments |
|
Is there sufficient detail in the methods and data-processing steps to allow reproduction? |
No |
Additional Comments |
The exact parameters used during processing are missing entirely. For example, it is unclear how the RagTag-based correction and scaffolding were carried out. |
Is there sufficient data validation and statistical analyses of data quality? |
Not my area of expertise |
Additional Comments |
|
Is the validation suitable for this type of data? |
No |
Additional Comments |
Without having the raw Illumina reads and the exact command line parameters used, it is not possible to validate the provided results. |
Is there sufficient information for others to reuse this dataset or integrate it with other data? |
Yes |
Additional Comments |
|
Any Additional Overall Comments to the Author |
Assembling reference genomes of endangered species is a task of immense importance, with the potential to significantly advance our understanding and conservation of these species. This work provides an initial genome assembly based on Illumina short-read sequencing. The contigs were corrected and scaffolded with the RagTag program using the PacBio-based chromosome-level red deer assembly. The potential benefits of this work are considerable, from gaining basic knowledge to initiating and furthering population studies aimed at preserving the species. According to the annotation and the BUSCO analysis, the final assembly seems especially good, considering that it is short-read based. However, there are some concerns about the methodology and the provided data.
1. The Illumina short reads and the annotation data (GFFs, VCFs) are not available.
2. The methods used are not reproducible because the exact parameters are not described.
3. It seems that the authors did not use the ‘-r’ parameter during the scaffolding, which resulted in fixed 100 bp runs of Ns being inserted instead of gaps sized according to the red deer reference genome.
4. There is no k-mer-based genome size estimation.
5. The chromosome number is not known. Is there any chromosomal rearrangement between the red deer and the Visayan Spotted Deer?
6. It is not justified why the protein- and mitochondria-based trees are drawn as cladograms and not as phylograms. As drawn, the actual distances between the different species cannot be seen.
7. Although the short reads were mapped back to the assembly, no variation data are provided.
8. Is it necessary to include such a high number (46,104) of short (<1000 bp) contigs in the assembly?
9. Although the red deer assembly was used for the correction and scaffolding, the annotation was compared to that of the mule deer. |
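Points 3 and 8 above could be checked directly on the deposited FASTA file. The following is only a minimal sketch, not the authors' pipeline: the function names (`read_fasta`, `summarize_assembly`) and the `min_length` threshold of 1000 bp are illustrative assumptions. It tallies the lengths of all runs of N (a tally dominated by 100 would be consistent with fixed-size gaps) and counts sequences shorter than the threshold.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def summarize_assembly(lines: Iterable[str], min_length: int = 1000) -> Dict:
    """Count sequences, sequences shorter than min_length, and N-run lengths.

    min_length is a hypothetical cut-off chosen for illustration.
    """
    gap_lengths: Counter = Counter()
    total = 0
    short = 0
    for _name, seq in read_fasta(lines):
        total += 1
        if len(seq) < min_length:
            short += 1
        # Each maximal run of N/n is treated as one gap.
        for match in re.finditer(r"[Nn]+", seq):
            gap_lengths[len(match.group())] += 1
    return {
        "sequences": total,
        "short_sequences": short,
        "gap_lengths": dict(gap_lengths),
    }


if __name__ == "__main__":
    # Toy input for illustration only; a real run would pass an open file handle.
    toy = [">scaf1", "ACGT" + "N" * 100 + "ACGT", ">ctg2", "ACGTACGT"]
    print(summarize_assembly(toy, min_length=50))
```

For a real assembly, the same function could be called as `summarize_assembly(open(path))`, where `path` is a placeholder for the location of the FASTA file.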
Recommendation |
Major Revision |