| Reviewer name and names of any other individuals who aided in the review | Li HL |
| Do you understand and agree to our policy of having open and named reviews, and having your review included with the published papers. (If no, please inform the editor that you cannot review this manuscript.) | Yes |
| Is the language of sufficient quality? | Yes |
| Please add additional comments on language quality to clarify if needed | |
| Are all data available and do they match the descriptions in the paper? | Yes |
| Additional Comments | |
| Are the data and metadata consistent with relevant minimum information or reporting standards? See GigaDB checklists for examples: http://gigadb.org/site/guide | Yes |
| Additional Comments | |
| Is the data acquisition clear, complete and methodologically sound? | Yes |
| Additional Comments | |
| Is there sufficient detail in the methods and data-processing steps to allow reproduction? | Yes |
| Additional Comments | |
| Is there sufficient data validation and statistical analyses of data quality? | No |
| Additional Comments | |
| Is the validation suitable for this type of data? | Yes |
| Additional Comments | |
| Is there sufficient information for others to reuse this dataset or integrate it with other data? | Yes |
| Additional Comments | |
| Any Additional Overall Comments to the Author | Voelker and colleagues report a high-quality de novo genome assembly, built from PacBio and Illumina sequencing data, as a platform for comparative genomics in the Myrtaceae. The concept and some of the results are attractive; however, several details need clarification. My major concerns are as follows. 1. The PacBio sequencing corresponds to a genome coverage (sequencing depth) of around 55x. Why did the authors sequence to this depth? Coverage of 80x or more would be a good improvement. 2. The authors assembled the genome using PacBio and Illumina sequencing data. However, some important information about the results is unclear. For example, what are the percentage of repeat sequences and the heterozygosity of the tea tree genome in the latest version? 3. Three different tools (Canu, Flye and MaSuRCA) were used to assemble the tea tree genome. The Canu assembly had a larger total length (451 Mb), whereas the other tools generated assemblies of 365.23 Mb or below. It is difficult to understand how such a difference could arise. Why might Canu have assembled more than one haplotype? |
| Recommendation | Major Revision |