| Upload additional files | DRR-202011-01/form/gx-DR-1604540547_SW.pdf |
| Reviewer name and names of any other individuals who aided in the review | Sven Winter |
| Do you understand and agree to our policy of having open and named reviews, and having your review included with the published papers. (If no, please inform the editor that you cannot review this manuscript.) | Yes |
| Is the language of sufficient quality? | No |
| Please add additional comments on language quality to clarify if needed | In general, most of the manuscript is of sufficient quality, but certain parts need improvement. Please see detailed comments below. |
| Are all data available and do they match the descriptions in the paper? | No |
| Additional Comments | The data under the listed BioProject PRJNA597275 is not the same as described in the manuscript. I would suggest that the authors update the species name on NCBI and make sure that the PacBio data is uploaded (so far I only see Nanopore data). The amount of stLFR data is also greater than described in the manuscript. |
| Are the data and metadata consistent with relevant minimum information or reporting standards? See GigaDB checklists for examples: http://gigadb.org/site/guide | Yes |
| Additional Comments | |
| Is the data acquisition clear, complete and methodologically sound? | No |
| Additional Comments | It mostly is, but it needs to be verified that PacBio and not ONT data was used. |
| Is there sufficient detail in the methods and data-processing steps to allow reproduction? | No |
| Additional Comments | I would like to see more details about the library preparations. Even though the authors reference protocols, basic information, e.g., what tissue type was used for Hi-C, should be given in the manuscript. Again, the issue of PacBio versus ONT needs to be resolved, and details about the respective library preparation need to be specified (for PacBio: CLR or CCS; for ONT: library preparation kit and sequencer). |
| Is there sufficient data validation and statistical analyses of data quality? | Yes |
| Additional Comments | I assume that there is, but I would like to see more detail on how the data was filtered and how the quality was checked. For example, how exactly did SOAPnuke filter the stLFR reads? What is an obvious sequencing error rate? Why was 50% of the data lost in the process? The same applies to the Hi-C data: why was so much of the raw data lost? |
| Is the validation suitable for this type of data? | Yes |
| Additional Comments | Yes, I think it is, but it needs to be explained in more detail. See detailed comments below. |
| Is there sufficient information for others to reuse this dataset or integrate it with other data? | Yes |
| Additional Comments | |
| Any Additional Overall Comments to the Author | Overall, this is a good genome assembly and the manuscript is suitable for publication in Gigabyte, yet some methods need to be described in more detail to be reproducible. I hope that my comments will help the authors improve the readability and completeness of the manuscript. As page and line numbers are missing, it was easier to add my comments directly to the manuscript file. Therefore, please find my detailed comments attached. Looking forward to seeing this manuscript published in Gigabyte soon. Sven Winter |
| Recommendation | Major Revision |