Nestor: A Tool for Natural Language Annotation of Short Texts

Rachael TB Sexton; Michael P Brundage

2019 Nov 1;124:1–5. doi: 10.6028/jres.124.029

PMCID: PMC7339772 PMID: 34877166

1. Summary

Nestor is a software tool that annotates natural language data stored in CSV (comma-separated values) files with UTF-8 (Unicode Transformation Format – 8-bit) encoding, using a process called tagging [1]. The objective of Nestor is to help analysts make their natural language data, which is often unstructured and filled with technical content, jargon, misspellings, and abbreviations, computable so that it can be analyzed more effectively. An example of natural language data that could be input to Nestor, along with the corresponding output, is shown in Table 1.

Table 1.

An example of natural language input (Raw Text column in this example) and subsequent output (Item(s), Problem(s), Solution(s), Problem(s) & Item(s), Solution(s) & Item(s) columns in this example) for Nestor. These input files often also contain other non-text based data points that can be used for other analysis, but are not directly used by Nestor.

Raw Text	Item(s)	Problem(s)	Solution(s)	Problem(s) & Item(s)	Solution(s) & Item(s)
Hyd leak at saw attachment. Replaced seal in saw attachment but still leaking - Reapirs pending with ML	Hydraulic; Saw attachment; Seal	Leak	Replaced; Repaired	Hydraulic Leak	Replaced Seal
HP Coolant pressure at 75 psi; Bad gauge/Low pressure lines cleaned ou	High Pressure Coolant; Gauge; Low Pressure Line	Broken; Low Pressure	Cleaned	Broken Gauge	Cleaned Low Pressure Line
Major hydraulic leak at SP#6 horseshoe. Repaired horseshoe seals.	Hydraulic; SP#6; Horseshoe Seal	Leak	Repaired	Hydraulic Leak	Repaired Horseshoe Seal
Clamping spool guard broken, replaced - operator could have done this!	Clamping Spool Guard; Operator	Broken	Replaced	Clamping Spool Guard Broken	N/A

The annotated datasets generated by Nestor (as either a CSV or .h5 file) can be used for different analysis techniques, such as failure prediction, problem hot spot identification, and maintenance technician expertise assessment, as shown in [2–10]. Currently, the majority of use cases involve maintenance in the engineering domain (manufacturing, mining, and heating, ventilation, and air conditioning (HVAC)); however, any natural language CSV file with UTF-8 encoding can be input to Nestor.
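As a rough illustration of problem hot spot identification, the sketch below counts how often each tag appears in an annotated file. It assumes the annotated CSV uses the column names of Table 1 and separates multiple tags in a cell with semicolons; the actual layout of Nestor's output files may differ, and the helper names `split_tags` and `count_tags` are hypothetical.

```python
import csv
import io
from collections import Counter

# Assumed annotated output, built from the tag columns of Table 1.
# The real file layout produced by Nestor may differ.
ANNOTATED_CSV = """\
Item(s),Problem(s),Problem(s) & Item(s)
Hydraulic; Saw attachment; Seal,Leak,Hydraulic Leak
High Pressure Coolant; Gauge; Low Pressure Line,Broken; Low Pressure,Broken Gauge
Hydraulic; SP#6; Horseshoe Seal,Leak,Hydraulic Leak
Clamping Spool Guard; Operator,Broken,Clamping Spool Guard Broken
"""


def split_tags(cell):
    """Split a semicolon-separated cell into clean tags, skipping N/A."""
    tags = [tag.strip() for tag in cell.split(";")]
    return [tag for tag in tags if tag and tag != "N/A"]


def count_tags(rows, column):
    """Count how many times each tag occurs in one column."""
    counts = Counter()
    for row in rows:
        counts.update(split_tags(row[column]))
    return counts


rows = list(csv.DictReader(io.StringIO(ANNOTATED_CSV)))
hot_spots = count_tags(rows, "Problem(s) & Item(s)")
problems = count_tags(rows, "Problem(s)")

# List problem-item pairs from most to least frequent.
for tag, count in hot_spots.most_common():
    print(f"{count}  {tag}")
```

On these four rows, "Hydraulic Leak" is the only problem-item pair that occurs more than once, which is the kind of signal a hot spot analysis looks for.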

2. Software Specifications

NIST Operating Unit	Engineering Laboratory, Systems Integration Division, Information Modeling and Testing Group
Category	Analysis Graphical User Interface (GUI).
Targeted Users	Manufacturers, Maintainers, Maintenance Technicians, Analysts
Operating Systems	Windows: Windows 10 or greater; Mac: OSx v10.1 or greater; Linux: Linux 5.0 x86_64 or greater
Programming Language	Executable: None; Source: Python v3.6 or greater. See https://github.com/usnistgov/nestor/tree/master/requirements
Inputs/Outputs	Input: UTF-8 encoded .csv file. Output(s): Annotated .csv file, .h5 file dashboard.
Documentation	User’s Guide - https://nestor.readthedocs.io/en/latest/index.html Source Code: https://github.com/usnistgov/nestor
Disclaimer	https://www.nist.gov/disclaimer
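
The input requirement listed above (a UTF-8 encoded .csv file) can be checked before a file is handed to the tool. The following is a minimal sketch using only the Python standard library; `read_utf8_csv` is a hypothetical helper and is not part of Nestor.

```python
import csv
import io


def read_utf8_csv(raw_bytes):
    """Decode bytes as UTF-8 and parse them into a list of row dicts.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    """
    # "utf-8-sig" also accepts files that start with a byte-order mark.
    text = raw_bytes.decode("utf-8-sig")
    return list(csv.DictReader(io.StringIO(text, newline="")))


# A tiny in-memory example with one natural language column.
sample = "Raw Text\nHP Coolant pressure at 75 psi\n".encode("utf-8")
rows = read_utf8_csv(sample)
print(rows[0]["Raw Text"])
```

For a file on disk, the same check would read the file in binary mode and pass the bytes to the helper; a `UnicodeDecodeError` indicates the file should be re-saved as UTF-8 first.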

3. Methods

This software provides a graphical user interface (GUI), available both as a standalone application¹ and as source code², as seen in Fig. 1.

The software takes natural language inputs in the form of UTF-8 encoded CSV files and allows a user to select the columns containing natural language text. After the columns are selected, the software ranks the concepts by how frequently they occur in the data and allows the user to group similar concepts, create an alias for them, and assign a classification. Once the user completes this process, the software automatically annotates the dataset and provides an annotated CSV file and .h5 file, as shown in Fig. 2. These files can then be used for various analysis techniques, such as problem identification, failure prediction, and technician skill assessment [2–7].

Fig. 2. A screenshot of the Nestor GUI report tab.
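
The workflow described above (rank concepts by frequency, assign aliases and classifications, then annotate) can be sketched in a few lines. This is not Nestor's implementation: a plain word count stands in for its ranking, the `VOCAB` mapping stands in for the decisions a user makes in the GUI, and all function names are hypothetical. The raw texts are taken from Table 1.

```python
import re
from collections import Counter

RAW_TEXTS = [
    "Hyd leak at saw attachment. Replaced seal in saw attachment but still leaking",
    "Major hydraulic leak at SP#6 horseshoe. Repaired horseshoe seals.",
    "Clamping spool guard broken, replaced - operator could have done this!",
]


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9#]+", text.lower())


def rank_tokens(texts):
    """Return (token, count) pairs, most frequent first."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts.most_common()


# Assumed user decisions: token -> (alias, classification).
VOCAB = {
    "hyd": ("hydraulic", "Item"),
    "hydraulic": ("hydraulic", "Item"),
    "seal": ("seal", "Item"),
    "seals": ("seal", "Item"),
    "leak": ("leak", "Problem"),
    "leaking": ("leak", "Problem"),
    "broken": ("broken", "Problem"),
    "replaced": ("replaced", "Solution"),
    "repaired": ("repaired", "Solution"),
}


def annotate(text, vocab):
    """Map a text to {classification: sorted list of unique aliases}."""
    tags = {}
    for token in tokenize(text):
        if token in vocab:
            alias, label = vocab[token]
            tags.setdefault(label, set()).add(alias)
    return {label: sorted(aliases) for label, aliases in tags.items()}


ranking = rank_tokens(RAW_TEXTS)
annotations = [annotate(text, VOCAB) for text in RAW_TEXTS]

for text, tags in zip(RAW_TEXTS, annotations):
    print(text, "->", tags)
```

Because "hyd" and "hydraulic" share one alias, as do "leak" and "leaking", the first two work orders end up with the same Item and Problem tags even though they are worded differently.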

Biography

About the authors: Rachael T.B. Sexton, MS is a Mechanical Engineer in the Information Modeling and Testing Group of the Systems Integration Division at NIST, currently researching the usability of natural language processing for mining useful system representations for Smart Manufacturing Systems. Their interests include statistical network analysis, Bayesian global optimization, human factors, inverse reinforcement learning, and hybrid (physics/data-driven) modeling.

Michael P. Brundage, PhD is an Industrial Engineer in the Information Modeling and Testing Group. Dr. Brundage serves as the Project Leader for the Knowledge Extraction and Application for Manufacturing Operations project in the Model-Based Enterprise Program. Dr. Brundage’s interests include Smart Manufacturing Diagnostics for Intelligent Maintenance, Sustainable Manufacturing Performance Measurement, Smart Manufacturing Capability Assessment, and Manufacturing Knowledge Visualization.

The National Institute of Standards and Technology is an agency of the U.S. Department of Commerce.

Footnotes

¹ https://www.nist.gov/services-resources/software/nestor

² https://github.com/usnistgov/nestor

4. References

[1] Madhusudanan Navinchandran F, Bones L, Brundage M, Hoffman M, Moccozet S, Sexton R (2018) Nestor: a toolkit for quantifying tacit maintenance knowledge, for investigatory analysis in smart manufacturing. doi:10.18434/t4/1502464. Available at https://github.com/usnistgov/nestor
[2] Sexton R, Hodkiewicz M, Brundage MP, Smoker T (2018) Benchmarking for keyword extraction methodologies in maintenance work orders. Proceedings of the Annual Conference of the PHM Society, Vol. 10.
[3] Sexton R, Brundage MP, Hoffman M, Morris KC (2017) Hybrid datafication of maintenance logs from ai-assisted human tags. 2017 IEEE International Conference on Big Data (Big Data) (IEEE), pp 1769–1777.
[4] Brundage MP, Morris K, Sexton R, Moccozet S, Hoffman M (2018) Developing maintenance key performance indicators from maintenance work order data. ASME 2018 13th International Manufacturing Science and Engineering Conference (American Society of Mechanical Engineers), pp V003T02A027–V003T02A027.
[5] Brundage MP, Sexton R, Hodkiewicz M, Morris KC, Arinez J, Ameri F, Ni J, Xiao G (2019) Where do we start? guidance for technology implementation in maintenance management for manufacturing. Journal of Manufacturing Science and Engineering 141(9):091005.
[6] Sharp M, Sexton R, Brundage MP (2017) Toward semi-autonomous information. IFIP International Conference on Advances in Production Management Systems (Springer, Cham), pp 425–432.
[7] Brundage MP, Kulvantunyou B, Ademujimi T, Rakshith B (2017) Smart manufacturing through a framework for a knowledge-based diagnosis system. Proceedings of the ASME 2017 International Manufacturing Science and Engineering Conference, MSEC, Vol. 2017, pp 1–9.
[8] Hastings E, Sexton R, Brundage MP, Hodkiewicz M (2019) Agreement behavior of isolated annotators for maintenance work-order data mining. Proceedings of the Annual Conference of the PHM Society, Vol. 11.
[9] Sexton R, Hodkiewicz M, Brundage MP (2019) Categorization errors for data entry in maintenance work-orders. Proceedings of the Annual Conference of the PHM Society, Vol. 11.
[10] Navinchandran M, Sharp ME, Brundage MP, Sexton RTB (2019) Studies to predict maintenance time duration and important factors from maintenance workorder data. Proceedings of the Annual Conference of the PHM Society, Vol. 11.
