2023 Aug 23;2023:gigabyte87. doi: 10.46471/gigabyte.87
Reviewer name and names of any other individuals who aided in the review Tomasz Neugebauer
Do you understand and agree to our policy of having open and named reviews, and having your review included with the published manuscript? (If no, please inform the editor that you cannot review this manuscript.) Yes
Is the language of sufficient quality? Yes
Please add additional comments on language quality to clarify if needed
Is there a clear statement of need explaining what problems the software is designed to solve and who the target audience is? Yes
Additional Comments
Is the source code available, and has an appropriate Open Source Initiative license (https://opensource.org/licenses) been assigned to the code? Yes
Additional Comments
As Open Source Software, are there guidelines on how to contribute, report issues or seek support on the code? No
Additional Comments This is hosted on GitHub, and there is an "Issues" forum on the repository, but it has no posted open or closed issues. I would encourage the authors to add a statement about support and how to submit issues.
Is the code executable? Unable to test
Additional Comments I did not execute the code, because I do not have an AWS account.
Is installation/deployment sufficiently outlined in the paper and documentation, and does it proceed as outlined? Unable to test
Additional Comments
Is the documentation provided clear and user friendly? Yes
Additional Comments
Is there enough clear information in the documentation to install, run and test this tool, including information on where to seek help if required? Yes
Additional Comments If a user encounters errors with s3md5, should they contact the author of s3md5 or aws-s3-integrity-check? This tool relies on s3md5 for a significant portion of the functionality, but s3md5 doesn't look like it has been updated or maintained.
Is there a clearly-stated list of dependencies, and is the core functionality of the software documented to a satisfactory level? Yes
Additional Comments
Have any claims of performance been sufficiently tested and compared to other commonly-used packages? Yes
Additional Comments
Is test data available, either included with the submission or openly available via cited third party sources (e.g. accession numbers, data DOIs)? Yes
Additional Comments
Are there (ideally real world) examples demonstrating use of the software? Yes
Additional Comments
Is automated testing used or are there manual steps described so that the functionality of the software can be verified? Yes
Additional Comments
Any Additional Overall Comments to the Author Overall, I think this is a useful tool that offers important functionality for digital preservation of datasets. The manuscript describes the functionality of the tool clearly, and offers detailed instructions and test results. I recommend accepting this article, but suggest a minor revision to address the following two points.
The first is a weakness alluded to in the limitations section with the following comment: "Fourth, this tool has not been tested using server-side encryption different from the default option using an SSE-S3 key." I would suggest a clearer statement here, as it seems that the tool will not work for server-side encryption other than SSE-S3; it does not appear to be just a matter of not having tested it. In summary, it would be important to state whether or not the authors expect the tool to work for the objects described in the AWS documentation as follows: "If an object is created by the PUT Object, POST Object, or Copy operation, or through the AWS Management Console, and that object is encrypted by server-side encryption with customer-provided keys (SSE-C) or server-side encryption with AWS Key Management Service (AWS KMS) keys (SSE-KMS), that object has an ETag that is not an MD5 digest of its object data." (https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html)
The other significant issue is the sustainability and support of this solution, given its reliance on s3md5, a separate script written by someone other than the authors, developed about 10 years ago and last updated in 2016. There is an open issue with s3md5 from March 2019 (https://github.com/antespi/s3md5/issues/11) that has not received any response or comment. Is s3md5 supported or maintained? Will the authors of aws-s3-integrity-check commit to supporting questions about it, and potentially a fork of s3md5 if necessary, since they rely on it for such a significant part of the integrity check?
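To make the encryption concern above concrete, the following is a minimal sketch of the kind of local ETag calculation that an MD5-based integrity check depends on. It is not taken from s3md5 or from aws-s3-integrity-check; the function names and the part size used are hypothetical, and it assumes the commonly observed convention that a multipart ETag is the MD5 of the concatenated binary MD5 digests of the parts, followed by a hyphen and the number of parts. For objects encrypted with SSE-C or SSE-KMS, the AWS documentation quoted above states that the ETag is not an MD5 digest, so a comparison of this kind would not be expected to match.

```python
import hashlib


def local_etag(data: bytes, part_size: int) -> str:
    """Compute an S3-style ETag for `data` (illustrative helper, not from s3md5).

    Assumption: data that fits in one part was uploaded with a single PUT,
    so its ETag is the plain MD5 hex digest. Larger data is assumed to have
    been uploaded in fixed-size parts of `part_size` bytes.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")

    parts = [data[i:i + part_size] for i in range(0, len(data), part_size)]
    if len(parts) <= 1:
        return hashlib.md5(data).hexdigest()

    # Multipart convention: MD5 over the concatenated *binary* part digests,
    # then "-<number of parts>".
    joined = b"".join(hashlib.md5(part).digest() for part in parts)
    return f"{hashlib.md5(joined).hexdigest()}-{len(parts)}"


def etag_matches(remote_etag: str, data: bytes, part_size: int) -> bool:
    """Compare a remote ETag (possibly wrapped in double quotes) with the local one."""
    return remote_etag.strip('"') == local_etag(data, part_size)
```

A check built this way only succeeds when the part size used locally equals the one used for the upload, which is one reason a statement from the authors on supported upload and encryption configurations would be helpful.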
Recommendation Minor Revisions