A high-quality draft genome for Melaleuca alternifolia (tea tree): a new platform for evolutionary genomics of myrtaceous terpene-rich species

2021 Aug 9;2021:gigabyte28. doi: 10.46471/gigabyte.28

Reviewer name and names of any other individuals who aided in the review	Li HL
Do you understand and agree to our policy of having open and named reviews, and having your review included with the published paper? (If no, please inform the editor that you cannot review this manuscript.)	Yes
Is the language of sufficient quality?	Yes
Please add additional comments on language quality to clarify if needed
Are all data available and do they match the descriptions in the paper?	Yes
Additional Comments
Are the data and metadata consistent with relevant minimum information or reporting standards? See GigaDB checklists for examples: http://gigadb.org/site/guide	Yes
Additional Comments
Is the data acquisition clear, complete and methodologically sound?	Yes
Additional Comments
Is there sufficient detail in the methods and data-processing steps to allow reproduction?	Yes
Additional Comments
Is there sufficient data validation and statistical analyses of data quality?	No
Additional Comments
Is the validation suitable for this type of data?	Yes
Additional Comments
Is there sufficient information for others to reuse this dataset or integrate it with other data?	Yes
Additional Comments
Any Additional Overall Comments to the Author	Voelker and colleagues report a high-quality de novo genome assembly, built from PacBio and Illumina sequencing data, as a platform for comparative genomics in the Myrtaceae. The concept and some of the results are attractive; however, several details need clarification. My major concerns about this work are as follows. 1. The PacBio sequencing implies a genome coverage (sequencing depth) of around 55x. Why did the authors choose this depth? A coverage of 80x or more would be a good improvement. 2. The authors assembled the genome using PacBio and Illumina sequencing data, but some important information about the results is unclear. For example, what are the percentage of repeat sequences and the heterozygosity of the tea tree genome in the latest version? 3. Three different tools (Canu, Flye and MaSuRCA) were used to assemble the tea tree genome. The Canu assembly has a larger total length (451 Mb), whereas the other tools generated assemblies of 365.23 Mb or less. It is difficult to understand how such a difference could arise. Why might Canu have assembled more than one haplotype?
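The figures quoted in concerns 1 and 3 can be put in proportion with a little arithmetic. The following is a minimal Python sketch, not part of the manuscript or of any assembler. The function names are hypothetical, and it assumes that coverage scales linearly with the amount of sequence for a fixed genome size. It uses only the numbers mentioned in the comments (55x, 80x, 451 Mb, 365.23 Mb).

```python
from typing import Tuple


def extra_data_fraction(current_depth: float, target_depth: float) -> float:
    """Fraction of additional sequence needed to raise coverage from
    current_depth to target_depth, assuming the genome size is unchanged
    and depth scales linearly with total bases (an assumption)."""
    if current_depth <= 0:
        raise ValueError("current_depth must be positive")
    return target_depth / current_depth - 1.0


def assembly_excess(larger_mb: float, smaller_mb: float) -> Tuple[float, float]:
    """Return (absolute excess in Mb, excess relative to the smaller assembly)."""
    if smaller_mb <= 0:
        raise ValueError("smaller_mb must be positive")
    excess = larger_mb - smaller_mb
    return excess, excess / smaller_mb


if __name__ == "__main__":
    # Concern 1: going from about 55x to the suggested 80x.
    more = extra_data_fraction(55.0, 80.0)
    print(f"Additional PacBio data needed: {more:.1%}")

    # Concern 3: Canu total length versus the 365.23 Mb figure for the other tools.
    excess_mb, rel = assembly_excess(451.0, 365.23)
    print(f"Canu excess: {excess_mb:.2f} Mb ({rel:.1%} of the smaller assembly)")
```

Under these assumptions, the Canu assembly exceeds the 365.23 Mb figure by roughly a quarter. A gap of that size is the kind that retained alternative haplotypes could produce, although this sketch cannot distinguish that from other causes.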
Recommendation	Major Revision