Evaluating causes of error in landmark-based data collection using scanners

Brian M Shearer; Siobhán B Cooke; Lauren B Halenar; Samantha L Reber; Jeannette E Plummer; Eric Delson; Melissa Tallman

doi:10.1371/journal.pone.0187452

. 2017 Nov 3;12(11):e0187452. doi: 10.1371/journal.pone.0187452

Evaluating causes of error in landmark-based data collection using scanners

Brian M Shearer ^1,^2,³, Siobhán B Cooke ^3,⁴, Lauren B Halenar ^2,^3,⁵, Samantha L Reber ⁶, Jeannette E Plummer ⁷, Eric Delson ^1,^2,^3,^8,⁹, Melissa Tallman ^3,^10,^*

Editor: Kornelius Kupczik¹¹

¹Ph.D. Program in Anthropology, The Graduate Center (CUNY), New York, New York, United States of America

²New York Consortium in Evolutionary Primatology, New York, New York, United States of America

³NYCEP Morphometrics Group, New York, New York, United States of America

⁴Center for Functional Anatomy and Evolution, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America

⁵Department of Biology, Farmingdale State College (SUNY), Farmingdale, New York, United States of America

⁶School of Forensic and Applied Sciences, University of Central Lancashire, Preston, United Kingdom

⁷Department of Archaeology, University of Sheffield, South Yorkshire, United Kingdom

⁸Division of Vertebrate Paleontology, American Museum of Natural History, New York, New York, United States of America

⁹Department of Anthropology, Lehman College (CUNY), Bronx, New York, United States of America

¹⁰Department of Biomedical Sciences, Grand Valley State University, Grand Valley, Michigan, United States of America

¹¹Max Planck Institute for Evolutionary Anthropology, GERMANY

Competing Interests: The authors have declared that no competing interests exist.

^✉

* E-mail: tallmame@gvsu.edu

Roles

Brian M Shearer: Conceptualization, Data curation, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing

Siobhán B Cooke: Data curation, Investigation, Visualization, Writing – original draft, Writing – review & editing

Lauren B Halenar: Data curation, Investigation, Writing – review & editing

Samantha L Reber: Data curation, Formal analysis, Investigation

Jeannette E Plummer: Formal analysis, Investigation

Eric Delson: Conceptualization, Funding acquisition, Writing – review & editing

Melissa Tallman: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Supervision, Visualization, Writing – original draft, Writing – review & editing

Kornelius Kupczik: Editor

PMCID: PMC5669428 PMID: 29099867

Abstract

In this study, we assess the precision, accuracy, and repeatability of craniodental landmarks (Types I, II, and III, plus curves of semilandmarks) on a single macaque cranium digitally reconstructed with three different surface scanners and a microCT scanner. Nine researchers with varying degrees of osteological and geometric morphometric knowledge landmarked ten iterations of each scan (40 total) to test the effects of scan quality, researcher experience, and landmark type on levels of intra- and interobserver error. Two researchers additionally landmarked ten specimens from seven different macaque species using the same landmark protocol to test the effects of the previously listed variables relative to species-level morphological differences (i.e., observer variance versus real biological variance). Error rates within and among researchers by scan type were calculated to determine whether or not data collected by different individuals or on different digitally rendered crania are consistent enough to be used in a single dataset. Results indicate that scan type does not impact rate of intra- or interobserver error. Interobserver error is far greater than intraobserver error among all individuals, and is similar in variance to that found among different macaque species. Additionally, experience with osteology and morphometrics both positively contribute to precision in multiple landmarking sessions, even where less experienced researchers have been trained in point acquisition. Individual training increases precision (although not necessarily accuracy), and is highly recommended in any situation where multiple researchers will be collecting data for a single project.

Introduction

Over the last decade, landmark based three-dimensional geometric morphometrics (3DGM) utilizing digital specimen scans has become an increasingly integral tool in the fields of physical anthropology and paleontology. 3DGM allows researchers to analyze complex (i.e., non-linear) shape data through the application of landmarks to anatomically homologous points on multiple specimens [1]. Landmarks can be acquired either directly from a physical specimen, as with a Microscribe digitizer, or digitally via a computer program, such as Landmark Editor [2], on a virtual rendition of a bone. The latter method has become popular recently with the decreased price and increased ease-of-use of surface scanners, which allow researchers to create a permanent digital copy of a specimen for later use in landmark-based analyses and/or for storage and sharing with other researchers via an online database (e.g., www.morphosource.org). Many researchers have also begun using computed tomography scanners (CT) to digitally render their specimens when interested in both internal and external morphology, as dramatic increases in processing power of commercial computers and greater access to CT scanners has made this technology more practical in non-medical research (see [3,4,5] for reviews). Digital renderings of bony tissue from both surface and CT scanners are often treated as equivalent by researchers (e.g., [6]) and are used interchangeably based upon availability. However, there is no broadly consistent protocol for rendering digital scans or for applying landmarks to digital models, and the possibility that landmark-based 3DGM studies can potentially suffer from problems of inter- and intraobserver error as a result of these variables has not been thoroughly investigated (but see [7]).

In any landmark-based study using digitally rendered specimens there are multiple factors which may introduce error. Technological sources of error potentially include scanner type and brand (which inherently vary in their surface capture abilities based on design features) resolution at which a specimen is scanned, and the fitting and smoothing algorithms that may be used in post-processing of the surfaces that may differ per proprietary software programming idiosyncrasies. Scanning protocol-based sources of error result from the individual choices made by a researcher regardless of what scan technology they choose to utilize, and may include scanning methods (e.g., particular number of frames, scanning angle, or overall number of image families used at the discretion of the researcher), or reconstruction/rendering methods used that may include differences in a particular scan model refinement method (e.g., to what extent the “Mesh Doctor” function in Geomagic Studio or Wrap is used rather than a targeted refinement protocol using other available tools). User-based sources of error include differences in data collection experience among researchers, inherent researcher tendencies for precision and accuracy, and comprehension of instructions. Data collection-based sources of error involve repeatability of landmark protocols.

Landmarks are traditionally classified into three different types based on potential for anatomical homology. Type I landmarks are generally the most desirable type of landmark because of their ease of reproducibility and in identification of anatomical homology. They can be defined as points where multiple tissues intersect [8], for example, where the coronal and sagittal sutures meet (Bregm(A). Type II landmarks can be defined as points of potential homology that are based only on geometric evidence. Type II landmarks are often placed on the maxima or minima of structures, such as the tip of the canine. Type III landmarks are mathematically deficient in at least one coordinate, and are generally defined only with respect to other landmarks in that they characterize more than a single region of an object’s form [8]. Landmark types II and III are less desirable than Type I, as they are more difficult to accurately find and precisely mark, and generally describe structures that are not necessarily homologous in the traditional sense of the word [8], but are more likely to be mathematically or geometrically homologous. More recent research has introduced semilandmarks from 2D morphometrics [9,10] to 3DGM studies (e.g., [11]). Semilandmarks are used to compare the shapes of biological curves that are suspected to hold some functional or phylogenetic information but present an even more difficult case of repeatability. These curves are usually anchored with anatomically homologous landmarks which are also spaced equidistantly between the anchoring points. These points are then “slid” into their most “homologous” positions prior to multivariate analyses by minimizing either the bending energy or Procrustes distances in the sample (see [12] for an example of how both of these methods affect data processing). Semilandmark curves have been demonstrated to be most useful when applied over large surfaces that do not contain numerous traditional landmarks (e.g., the occipital bone of the cranium [13] or the trochlear surface of the tibia [14]).

Several researchers have conducted small-scale error studies examining between-scanner error and interobserver error with non-GM data and their results mostly suggest these types of error are of minimal concern. For example, Tocheri et al. [15] conducted an error study using non-landmark-based methods, in which they examined the variance in surface shape metrics of gorilla tarsals as collected by two researchers on virtual 3D models generated from both CT and laser surface scanners. They found that laser scan surfaces and those extracted from CT scans were not distinguishable, and that the two individuals who rendered and collected the data did not do so in a statistically different fashion. Likewise, Sholts et al. [16] measured scan model area and volume when constructed with multiple protocols and by two different individuals. They report intra- and interobserver error in scan construction at 0.2% and 2% variance, respectively, which they interpret as non-significant for scan sharing.

In a study conceived concurrently with this one, Robinson and Terhune [17] compared both inter- and intraobserver error rates between the two researchers on 14 differently sized crania of 11 primate taxa using traditional linear measurements, tactile 3D landmarking (i.e., Microscribe), and digital landmarking of computer rendered models. In regards to variance levels when applying landmarks to digital 3D models for morphometric analyses, they demonstrate negligible differences in rates of error between how scans were created (e.g., NextEngine vs CT), and that interobserver variation is higher than both intraobserver and intraspecific variation. Conversely, Fruciano and colleagues [18] also compared intra- and interobserver rates between two researchers using three different surface scan methodologies for a series of marsupial crania. These researchers found significant differences in landmark protocols both between observers and among the different scan types, and found that the differences in landmark collection protocols led to statistically different results when estimating phylogenetic signal in their dataset.

These studies demonstrate that training and a consistently applied protocol could reduce some technological and user-based error, although many of these results are contradictory. All previous studies thus far fail to address the possibility that in-person training may be impractical or impossible in some cases, and they use only three scan types while a wide variety of scanners is currently available on the market. Additionally, with the involvement of many more researchers of varying expertise levels, this study will provide more robust results regarding the magnitude of potential interobserver error.

As landmark-based studies increasingly move toward the use of surface scanners for creating virtual specimens of fossil (e.g., [19,20, 21, 22]) and extant (e.g., [23, 24, 25]) organisms that can be archived for sharing and future use, questions addressing the compatibility of data collected by different researchers with inherently different methods and equipment are paramount if truly collaborative and accurate research is to be achieved. Quantifying and understanding how intra- and interobserver error are affected by both technology and user error is especially relevant now as data sharing efforts are becoming common in the paleoanthropology and paleontology communities through open-access web databases like PRIMO (http://primo.nycep.org) and MorphoSource (www.morphosource.org), where both morphometric data and raw scans are shared freely among researchers.

Given the multiple potential sources of error in any landmark-based study, our goal here is to investigate whether landmarks can be placed at truly homologous points given the inherent differences in researcher experience, landmarking techniques, and the quality of a digital model resulting from different scanners and scanning protocols. To evaluate the gravity of some of these issues, we assess the compatibility of landmark data gathered by nine researchers with varying degrees of experience on scans of a single macaque cranium digitally rendered by four different scanners (see Table 1). We apply multivariate statistics to evaluate rates of precision and accuracy among researchers, and test the following three predictions:

Table 1. List of scanners and scanner types used for this project.

Faces refers to the number of triangles in a surface.

Scanner name	Type (abbreviations used in later tables)	Scanner resolution	Scan surface area (mm²) / volume (mm³)
NextEngine, Inc. NextEngine 3D Scanner HD	Laser surface scanner (NE)	0.1 mm	47,075 / 208,180
Breuckmann OptoTOP-HE	Structured white light surface scanner ((B)	2 μm	46,085 / 256,581
Minolta Vivid 910	Laser surface scanner (M)	1.12 mm	49,000 / 275,592
General Electric Phoenix v\|tome\|x s240	Computed Tomography (CT)	< 1 μm	5,905,620 / 566,477

Observer	User experience		Order
Researcher 1 R1 (LX)	AMNH volunteer; undergraduate experience in osteology; first time collecting 3DGM data; received in-person instruction from R9 (T) in how to collect the data		M, CT, NE, B
Researcher 2 R2 (MX)	AMNH volunteer; undergraduate experience in osteology; 1 year of experience collecting 3DGM data; received in-person instruction from R9 (T) in how to collect the data		CT, B, NE, M
Researcher 3 R3 (LX)	AMNH volunteer; undergraduate experience in osteology; first time collecting 3DGM data; received in-person instruction from R9 (T) in how to collect the data		B, NE, CT, M
Researcher 4 R4 (MX)	AMNH volunteer; undergraduate experience in osteology; 1 year of experience collecting 3DGM data; received in-person instruction from R9 (T) in how to collect the data		CT, M, NE, B
Researcher 5 R5 (HX)	Ph.D. in physical anthropology with a morphology emphasis; regular user of 3DGM data; received the list of landmark definitions but no in-person training		B, M, CT, NE
Researcher 6 R6 (HX)	Ph.D. in physical anthropology with a morphology emphasis; regular user of 3DGM data; received the list of landmark definitions but no in-person training		B, CT, M, NE
Researcher 7 R7 (MX)	AMNH volunteer; undergraduate experience in osteology; 1 year of experience collecting 3DGM data; received in-person instruction from R9 (T) in how to collect the data		M, B, CT, NE
Researcher 8 R8 (HX)	Graduate student in physical anthropology with morphology emphasis; significant experience in osteology; significant experience collecting 3DGM data; received the list of landmark definitions and in-person clarification of questions from R9 (T)		M, CT, NE, B
Researcher 9 (HX, T)	Ph.D. in physical anthropology with a morphology emphasis; regular user of 3DGM data, Trainer.		M, NE, B, CT
Low experience (LX)	Medium experience (MX)	High experience (HX)	Trainer
Researcher 1 Researcher 3	Researcher 2 Researcher 4 Researcher 7	Researcher 5 Researcher 6 Researcher 8	Researcher 9

#	Osteometric Point Name	Description	Side	Landmark type	Included in Landmark Set:
1	Glabella	Most anterior point in the mid-sagittal plane between the supraciliary arches	Midline	III	F, R
2	Nasion	Point where nasals and frontal meet in midline	Midline	I	F, R
3	Rhinion	Most inferior point in midline where nasals meet		I	F, R
4	Nasiospinale	Most inferior point in midline on nasal aperture		I	F, R
5	Alare (L)	Most lateral point on nasal aperture in transverse plane	Left	III	F, R
6	Alare (R)	Most lateral point on nasal aperture in transverse plane	Right	III	F, R
7		Point of maximum curvature on inferiormost corner of nasal aperture	Left	III	F, R
8		Point of maximum curvature on inferiormost corner of nasal aperture	Right	III	F, R
9		Superior most point in lateral half of supraorbital margin	Left	III	F, R
10	Orbitale (L)	Most inferior point on infraorbital margin	Left	III	F, R
11	Ectoconchion (L)	Lateral most point on orbit in transverse plane	Left	III	F, R
12		Medial most point on orbit in transverse plane	Left	III	F, R
13	Frontomalare temporale (L)	Point where zygomatico-frontal suture crosses lateral edge of zygoma.	Left	I	F, R
14		Center of supraorbital foramen/notch	Left	II	F, R
15		Point of maximum curvature on inferolateral infraorbital margin	Left	III	F, R
16		Point of maximum curvature on inferomedial infraorbital margin	Left	III	F, R
17		Superior most point in lateral half of supraorbital margin	Right	III	F, R
18	Orbitale (R)	Most inferior point on infraorbital margin	Right	III	F, R
19		Medial most point on orbit in transverse plane	Right	III	F, R
20	Ectoconchion (R)	Lateral most point on orbit in transverse plane	Right	III	F, R
21		Center of supraorbital foramen/notch	Right	II	F, R
22	Frontomalare temporale (R	Point where zygomatico-frontal suture crosses lateral edge of zygoma	Right	I	F, R
23		Point of maximum curvature on inferomedial infraorbital margin	Right	III	F, R
24		Point of maximum curvature on inferolateral infraorbital margin	Right	III	F, R
25		Point of maximum postorbital constriction	Left	III	F
26		Point of maximum postorbital constriction	Right	III	F
27	Porion (L)	Most superolateral point of external auditory meatus	Left	III	F, R
28	Porion (R)	Most superolateral point of external auditory meatus	Right	III	F, R
29	Zygion (L)	Most lateral Point of zygomatic arch	Left	III	F
30	Zygion (R)	Most lateral Point of zygomatic arch	Right	III	F
31	Prosthion	Most anterior point of alveolar process of maxilla in midline	Midline	I	F, R
32		Widest breadth of alveolar process of maxilla	Left	III	F
33		Widest breadth of alveolar process of maxilla	Right	III	F
34	Opisthocranion	Most posterior point of cranium in midline	Midline	II	F, R
35	Opisthion	Most posterior point of foramen magnum in midline	Midline	III	F, R
36	Basion	Most anterior point of foramen magnum in midline	Midline	III	F, R
37		Most posterior point of horizontal plate of palatine bone in midline	Midline	II	F, R
38–47	Curve 1	Asterion (L) to Opisthocranion	SLC	S	F, S
48–57	Curve 2	Opisthocranion to Asterion (R)	SLC	S	F, S
58–67	Curve 3	Opisthocranion to Bregma	SLC	S	F, S

Taxon	N	Specimen numbers
Macaca mulatta	1	NMNH (National Museum of Natural History) 173813
Macaca nemestrina	2	AMNH 11090, 106037
Macaca nigra	1	AMNH 196414
Macaca ochreata	1	AMNH 153599
Macaca sylvanus	2	NMNH 476780, 476785
Macaca thibetana	1	AMNH 83994
Macaca tonkeana	2	AMNH 152907, 153401

#	Individual Procrustes alignments									Procrustes alignment—All users
#	R1 (LX)	R2 (MX)	R3 (LX)	R4 (MX)	R5 (HX)	R6 (HX)	R7 (MX)	R8 (HX)	R9 (T)	Procrustes alignment—All users
1	0.006	0.007	0.006	0.003	0.004	0.005	0.019	0.005	0.009	0.017
2	0.004	0.007	0.005	0.005	0.003	0.009	0.005	0.008	0.009	0.014
3	0.002	0.004	0.003	0.002	0.001	0.002	0.006	0.002	0.002	0.006
4	0.004	0.005	0.005	0.003	0.003	0.005	0.009	0.003	0.003	0.009
5	0.003	0.006	0.004	0.003	0.003	0.003	0.006	0.004	0.003	0.009
6	0.004	0.006	0.004	0.004	0.003	0.002	0.006	0.004	0.004	0.008
7	0.003	0.005	0.004	0.002	0.004	0.003	0.008	0.004	0.003	0.008
8	0.004	0.005	0.006	0.003	0.004	0.002	0.008	0.003	0.004	0.009
9	0.009	0.004	0.004	0.002	0.003	0.002	0.004	0.003	0.004	0.008
10	0.011	0.006	0.004	0.003	0.002	0.002	0.005	0.004	0.004	0.008
11	0.003	0.004	0.004	0.004	0.003	0.003	0.005	0.004	0.010	0.008
12	0.003	0.005	0.005	0.003	0.003	0.003	0.005	0.005	0.006	0.007
13	0.008	0.006	0.005	0.004	0.003	0.002	0.006	0.003	0.004	0.011
14	0.006	0.003	0.003	0.002	0.002	0.002	0.005	0.005	0.004	0.006
15	0.011	0.003	0.004	0.003	0.003	0.002	0.006	0.003	0.006	0.007
16	0.007	0.005	0.004	0.003	0.004	0.002	0.006	0.007	0.003	0.007
17	0.004	0.005	0.006	0.002	0.002	0.002	0.004	0.004	0.004	0.007
18	0.009	0.008	0.005	0.002	0.003	0.002	0.005	0.003	0.004	0.009
19	0.003	0.004	0.003	0.003	0.002	0.003	0.005	0.005	0.004	0.006
20	0.004	0.005	0.005	0.003	0.004	0.002	0.005	0.004	0.012	0.009
21	0.003	0.003	0.004	0.002	0.002	0.002	0.005	0.007	0.004	0.006
22	0.006	0.007	0.005	0.004	0.003	0.002	0.005	0.003	0.004	0.010
23	0.009	0.007	0.009	0.002	0.003	0.003	0.009	0.011	0.005	0.009
24	0.006	0.005	0.005	0.002	0.003	0.003	0.005	0.004	0.005	0.007
25	0.008	0.007	0.008	0.007	0.003	0.005	0.011	0.003	0.011	0.025
26	0.007	0.006	0.007	0.007	0.003	0.005	0.012	0.005	0.011	0.024
27	0.006	0.005	0.006	0.003	0.004	0.002	0.007	0.004	0.006	0.008
28	0.006	0.004	0.006	0.004	0.003	0.002	0.007	0.003	0.005	0.008
29	0.008	0.008	0.013	0.006	0.003	0.009	0.012	0.008	0.007	0.025
30	0.006	0.007	0.012	0.006	0.003	0.006	0.011	0.006	0.007	0.025
31	0.004	0.004	0.004	0.003	0.003	0.003	0.008	0.003	0.005	0.009
32	0.010	0.013	0.007	0.003	0.002	0.004	0.030	0.003	0.008	0.032
33	0.007	0.012	0.006	0.004	0.002	0.004	0.031	0.003	0.009	0.032
34	0.007	0.007	0.008	0.003	0.002	0.002	0.010	0.005	0.004	0.014
35	0.004	0.003	0.005	0.002	0.002	0.002	0.005	0.003	0.002	0.008
36	0.002	0.004	0.005	0.002	0.002	0.002	0.005	0.003	0.003	0.007
37	0.003	0.004	0.004	0.003	0.003	0.004	0.005	0.003	0.004	0.009

Researcher	Landmark Set	NextEngine	Breuckmann	Minolta	CT	Total average variance by Landmark set
R1 (LX)	Full	0.019	0.031	0.026	0.026	0.034
	Reduced	0.026	0.034	0.030	0.025	0.042
	Semilandmark	0.017	0.024	0.038	0.026	0.038
R2 (MX)	Full	0.040	0.035	0.030	0.029	0.040
	Reduced	0.039	0.032	0.035	0.025	0.038
	Semilandmark	0.057	0.050	0.044	0.047	0.055
R3 (LX)	Full	0.015	0.051	0.064	0.043	0.052
	Reduced	0.013	0.028	0.033	0.027	0.034
	Semilandmark	0.053	0.101	0.110	0.077	0.091
R4 (MX)	Full	0.019	0.015	0.028	0.019	0.025
	Reduced	0.015	0.016	0.023	0.017	0.021
	Semilandmark	0.026	0.026	0.047	0.030	0.041
R5 (HX)	Full	0.019	0.032	0.023	0.021	0.037
	Reduced	0.015	0.019	0.016	0.014	0.020
	Semilandmark	0.030	0.053	0.039	0.033	0.061
R6 (HX)	Full	0.018	0.019	0.019	0.017	0.022
	Reduced	0.015	0.016	0.019	0.014	0.022
	Semilandmark	0.034	0.040	0.036	0.041	0.041
R7 (MX)	Full	0.028	0.021	0.025	0.042	0.052
	Reduced	0.021	0.020	0.023	0.041	0.040
	Semilandmark	0.043	0.027	0.037	0.047	0.061
R8 (HX)	Full	0.040	0.031	0.025	0.030	0.034
	Reduced	0.043	0.034	0.023	0.028	0.035
	Semilandmark	0.075	0.066	0.051	0.058	0.066
R9 (T)	Full	0.026	0.024	0.033	0.043	0.038
	Reduced	0.023	0.020	0.027	0.036	0.037
	Semilandmark	0.038	0.046	0.050	0.068	0.057
Total average variance by scanner for all users and landmark sets	Full	0.026	0.029	0.031	0.030	0.0288
	Reduced	0.024	0.026	0.025	0.025	0.0252
	Semilandmark	0.041	0.048	0.05	0.48	0.0468

	Sum of Squares	df	Mean Square	F	p-value
Between Groups	.001	3	.000	1.957	.120
Within Groups	.072	356	.000
Total	.073	359

	Sum of Squares	df	Mean Square	F	p-value
Between Groups	.004	3	.001	1.843	.139
Within Groups	.255	356	.001
Total	.259	359

Source	Type III Sum of Squares	df	Mean Square	F	p-value
Corrected Model	.041	35	.001	11.688	p<0.001
Intercept	.299	1	.299	3005.062	p<0.001
Scanner	.001	3	.000	3.965	.008
User	.024	8	.003	30.201	p<0.001
Scanner User	.015	24	.001	6.483	p<0.001
Error	.032	324	.000
Total	.372	360
Corrected Total	.073	359

(I) user	(J) user	Mean Difference (I-J)	p-value	95% Confidence Interval
(I) user	(J) user	Mean Difference (I-J)	p-value	Lower Bound	Upper Bound
R1 (LX)	R8 (HX)	.0361	p<0.001	.0229	.0494
	R2 (MX)	.0127	.070	-.0005	.0260
	R3 (LX)	-.0227	p<0.001	-.0360	-.0095
	R7 (MX)	.0237	p<0.001	.0105	.0370
	R5 (HX)	.0236	p<0.001	.0104	.0369
	R9 (T)	.0119	.119	-.0014	.0251
	R4 (MX)	.0301	p<0.001	.0169	.0433
	R6 (HX)	.0245	p<0.001	.0113	.0377
R8 (HX)	R1 (LX)	-.0361	p<0.001	-.0494	-.0229
	R2 (MX)	-.0234	p<0.001	-.0366	-.0101
	R3 (LX)	-.0588	p<0.001	-.0721	-.0456
	R7 (MX)	-.0124	.088	-.0256	.0009
	R5 (HX)	-.0125	.082	-.0257	.0007
	R9 (T)	-.0242	p<0.001	-.0375	-.0110
	R4 (MX)	-.0060	.891	-.0193	.0072
	R6 (HX)	-.0116	.140	-.0249	.0016
R2 (MX)	R1 (LX)	-.0127	.070	-.0260	.0005
	R8 (HX)	.0234	p<0.001	.0101	.0366
	R3 (LX)	-.0355	p<0.001	-.0487	-.0222
	R7 (MX)	.0110	.194	-.0022	.0242
	R5 (HX)	.0109	.206	-.0024	.0241
	R9 (T)	-.0009	1.000	-.0141	.0124
	R4 (MX)	.0174	.002	.0041	.0306
	R6 (HX)	.0118	.127	-.0015	.0250
R3 (LX)	R1 (LX)	.0227	p<0.001	.0095	.0360
	R8 (HX)	.0588	p<0.001	.0456	.0721
	R2 (MX)	.0355	p<0.001	.0222	.0487
	R7 (MX)	.0465	p<0.001	.0332	.0597
	R5 (HX)	.0463	p<0.001	.0331	.0596
	R9 (T)	.0346	p<0.001	.0214	.0479
	R4 (MX)	.0528	p<0.001	.0396	.0661
	R6 (HX)	.0472	p<0.001	.0340	.0605
R7 (MX)	R1 (LX)	-.0237	p<0.001	-.0370	-.0105
	R8 (HX)	.0124	.088	-.0009	.0256
	R2 (MX)	-.0110	.194	-.0242	.0022
	R3 (LX)	-.0465	p<0.001	-.0597	-.0332
	R5 (HX)	-.0001	1.000	-.0134	.0131
	R9 (T)	-.0119	.121	-.0251	.0014
	R4 (MX)	.0064	.855	-.0069	.0196
	R6 (HX)	.0008	1.000	-.0125	.0140
R5 (HX)	R1 (LX)	-.0236	p<0.001	-.0369	-.0104
	R8 (HX)	.0125	.082	-.0007	.0257
	R2 (MX)	-.0109	.206	-.0241	.0024
	R3 (LX)	-.0463	p<0.001	-.0596	-.0331
	R7 (MX)	.0001	1.000	-.0131	.0134
	R9 (T)	-.0117	.130	-.0250	.0015
	R4 (MX)	.0065	.841	-.0068	.0197
	R6 (HX)	.0009	1.000	-.0124	.0141
R9 (T)	R1 (LX)	-.0119	.119	-.0251	.0014
	R8 (HX)	.0242	p<0.001	.0110	.0375
	R2 (MX)	.0009	1.000	-.0124	.0141
	R3 (LX)	-.0346	.000	-.0479	-.0214
	R7 (MX)	.0119	.121	-.0014	.0251
	R5 (HX)	.0117	.130	-.0015	.0250
	R4 (MX)	.0182	.001	.0050	.0315
	R6 (HX)	.0126	.076	-.0006	.0259
R4 (MX)	R1 (LX)	-.0301	p<0.001	-.0433	-.0169
	R8 (HX)	.0060	.891	-.0072	.0193
	R2 (MX)	-.0174	.002	-.0306	-.0041
	R3 (LX)	-.0528	p<0.001	-.0661	-.0396
	R7 (MX)	-.0064	.855	-.0196	.0069
	R5 (HX)	-.0065	.841	-.0197	.0068
	R9 (T)	-.0182	.001	-.0315	-.0050
	R6 (HX)	-.0056	.925	-.0188	.0077
R6 (HX)	R1 (LX)	-.0245	p<0.001	-.0377	-.0113
	R8 (HX)	.0116	.140	-.0016	.0249
	R2 (MX)	-.0118	.127	-.0250	.0015
	R3 (LX)	-.0472	p<0.001	-.0605	-.0340
	R7 (MX)	-.0008	1.000	-.0140	.0125
	R5 (HX)	-.0009	1.000	-.0141	.0124
	R9 (T)	-.0126	.076	-.0259	.0006
	R4 (MX)	.0056	.925	-.0077	.0188

(I) scanner	(J) scanner	Mean Difference (I-J)	p-value	95% Confidence Interval
(I) scanner	(J) scanner	Mean Difference (I-J)	p-value	Lower Bound	Upper Bound
BR	CT	-.0009	.939	-.0047	.0030
	M	-.0013	.817	-.0051	.0025
	NE	.0033	.116	-.0005	.0072
CT	BR	.0009	.939	-.0030	.0047
	M	-.0004	.991	-.0043	.0034
	NE	.0042	.027	.0003	.0080
M	BR	.0013	.817	-.0025	.0051
	CT	.0004	.991	-.0034	.0043
	NE	.0046	.011	.0008	.0085
NE	BR	-.0033	.116	-.0072	.0005
	CT	-.0042	.027	-.0080	-.0003
	M	-.0046	.011	-.0085	-.0008

(I) scanner	(J) scanner	Mean Difference (I-J)	p-value	95% Confidence Interval
(I) scanner	(J) scanner	Mean Difference (I-J)	p-value	Lower Bound	Upper Bound
BR	CT	.0006	.997	-.0067	.0079
	M	-.0021	.875	-.0094	.0052
	NE	.0068	.079	-.0005	.0141
CT	BR	-.0006	.997	-.0079	.0067
	M	-.0027	.767	-.0100	.0046
	NE	.0062	.129	-.0011	.0135
M	BR	.0021	.875	-.0052	.0094
	CT	.0027	.767	-.0046	.0100
	NE	.0089	.009	.0016	.0162
NE	BR	-.0068	.079	-.0141	.0005
	CT	-.0062	.129	-.0135	.0011
	M	-.0089	.009	-.0162	-.0016

Researcher	Full Landmark	Reduced Landmark	Semilandmark Only
1 (LX)	PC 1: 26.4%	PC 1: 29.9%	PC 1: 49.7%
	PC 2: 15.2%	PC 2: 16.9%	PC 2: 22.2%
	PC 3: 12.1%	PC 3: 14.2%	PC 3: 10.6%
2 (MX)	PC 1: 33.9%	PC 1: 31.7%	PC 1: 52.8%
	PC 2: 10.4%	PC 2: 11.7%	PC 2: 13.4%
	PC 3: 9.9%	PC 3: 10.7%	PC 3: 8%
3 (LX)	PC 1: 39.1%	PC 1: 16.4%	PC 1: 46.5%
	PC 2: 19.9%	PC 2: 14.1%	PC 2: 20.4%
	PC 3: 9.0%	PC 3: 9.4%	PC 3: 11.6%
4 (MX)	PC 1: 33.4%	PC 1: 35.0%	PC 1: 54.0%
	PC 2: 25.6%	PC 2: 14.1%	PC 2: 22.4%
	PC 3: 8.8%	PC 3: 7.9%	PC 3: 8.9%
5 (HX)	PC 1: 87.6%	PC 1: 37.5%	PC 1: 92.7%
	PC 2: 1.9%	PC 2: 12.8%	PC 2: 1.6%
	PC 3: 1.4%	PC 3: 6.1%	PC 3: 1.2%
6 (HX)	PC 1: 28.5%	PC 1: 28.5%	PC 1: 34.6%
	PC 2: 14.7%	PC 2: 14.7%	PC 2: 17.4%
	PC 3: 11.1%	PC 3: 11.1%	PC 3: 8.0%
7 (MX)	PC 1: 54.7%	PC 1: 78.3%	PC 1: 37.9%
	PC 2: 17.2%	PC 2: 7.2%	PC 2: 22.6%
	PC 3: 8.2%	PC 3: 2.9%	PC 3: 15.6%
8 (HX)	PC 1: 20.6%	PC 1: 30.3%	PC 1: 33.1%
	PC 2: 16.7%	PC 2: 20.3%	PC 2: 21.5%
	PC 3: 11.6%	PC 3: 11.0%	PC 3: 10.6%
9 (T)	PC 1: 25.2%	PC 1: 35.5%	PC 1: 35.8%
	PC 2: 21.9%	PC 2: 15.8%	PC 2: 24.7%
	PC 3: 13.5%	PC 3: 8.3%	PC 3: 10.2%

	Full	Full with Sliding	Reduced	Semilandmark	Semilandmark with Sliding
R6 (HX) intraobserver error	0.03	0.02	0.03	0.05	0.03
R8 (HX) intraobserver error	0.05	0.04	0.05	0.09	0.07
R6 (HX) different macaques	0.11	0.10	0.13	0.13	0.10
R8 (HX) different macaques	0.10	0.09	0.12	0.12	0.10
interobserver error	0.13	0.11	0.08	0.10	0.09

Source	Type III Sum of Squares	df	Mean Square	F	p-value
Corrected Model	.028	35	.001	7.938	p < 0.001
Intercept	.228	1	.228	2252.955	p < 0.001
scanner	.000	3	.000	.379	.768
user	.016	8	.002	19.504	p < 0.001
scanner user	.012	24	.001	5.028	p < 0.001
Error	.033	324	.000
Total	.289	360
Corrected Total	.061	359

Source	Type III Sum of Squares	df	Mean Square	F	p-value
Corrected Model	.143	35	.004	11.343	p<0.001
Intercept	.788	1	.788	2190.950	p<0.001
scanner	.004	3	.001	3.675	.013
user	.103	8	.013	35.776	p<0.001
scanner user	.036	24	.001	4.157	p<0.001
Error	.117	324	.000
Total	1.048	360
Corrected Total	.259	359

PERMALINK

Evaluating causes of error in landmark-based data collection using scanners

Brian M Shearer

Siobhán B Cooke

Lauren B Halenar

Samantha L Reber

Jeannette E Plummer

Eric Delson

Melissa Tallman

Roles

Abstract

Introduction

Table 1. List of scanners and scanner types used for this project.

Materials and methods

Materials

Fig 1. Scan comparison anterior view of Macaca thibetana (AMNH 129).

Fig 2. Scan comparison inferior view of Macaca thibetana (AMNH 129).

Methods

Table 2. List of observers who collected data, their experience, and the order in which they landmarked the scan replicates (scanner abbreviations from Table 1).

Fig 3. Landmarks employed in this study.

Table 3. List of landmarks used in this study.

Table 4. Sample of Macaca used for testing the magnitude of interobserver error.

Effects of landmark position on error

Effects of scan type on error

Effects of experience on error

Effects of training on error

Interobserver error vs. shape variability in multiple species

Results

Effects of landmark type on error

Table 5. Average Procrustes distance from the centroid to each replicate for every Type I, II or III landmark in the analysis.

The effects of scan type on error

Table 6. Average variance for intraobserver trials for different scan types for the entire landmark protocol.

Fig 4. Box plot illustrating the amount of intraobserver error for each user with each scanner using each landmark set.

Fig 5. Box plot illustrating the amount of intraobserver error for each scanner type for each landmark set.

Table 7. Results of a one-way ANOVA for scanner for the Full data set.

Table 9. One-way ANOVA for scanner of the Semilandmark data set.

Table 8. One-way ANOVA for scanner of the Reduced landmark dataset.

Table 10. Results of a two-way ANOVA for user and scanner for the Full landmark set.

Table 18. Tukey’s post hoc pairwise comparisons of users for the Semilandmark dataset.

Table 11. Tukey’s post hoc pairwise comparisons for scanners for the Full landmark set.

Table 12. Tukey’s post hoc pairwise comparisons for users for the Full landmark set.

Table 13. Results of a two-way ANOVA for the Reduced landmark dataset.

Table 14. Tukey’s post hoc pairwise comparisons for scanners for the Reduced landmark set.

Table 15. Tukey’s post hoc pairwise comparisons for users for the Reduced landmark set.

Table 16. Results from a two-way ANOVA of the Semilandmark dataset.

Table 17. Tukey’s post hoc pairwise comparisons of scanning types for the Semilandmark dataset.

Fig 6. Boxplot of the distribution of pairwise Procrustes distances between different users for each scanner and landmark configuration.

Effects of user experience on error

Fig 7. Boxplot illustrating the range of intraobserver error for each researcher for all forty trials.

Table 19. Percent of variance on the first three axes from principal component analyses by user for each landmark set combining all scan types and replicates (n = 40 combined scans per user).

Effects of in-person training on error

Fig 8. PCA plots of all trials from all users.

Fig 9. Boxplots illustrating the change in interobserver error when those without in-person training were removed.

Fig 10. UPGMA dendrograms illustrating how different researchers cluster.

Interobserver error vs. shape variance among multiple specimens

Fig 11. Boxplots comparing inter- and intraobserver for Researchers 6 and 8 relative to the variation found among difference species of Macaca.

Table 20. Average pairwise Procrustes distance between landmarked trials by the same user (intraobserver error), landmarked trials between two different users (interobserver error) and between different macaques.

Fig 12. Boxplots comparing inter- and intraobserver error Researchers 6 and 8 to the variation in different species of Macaca for the Full and Semilandmark only configurations after semilandmark sliding.

Discussion

(1) Error rates appear to remain consistent among and within users regardless of overall scan quality or type

(2) Users with more osteology and 3DGM experience generally had less intraobserver error, but experience with osteology or morphometrics did not improve interobserver error

(3) Interobserver error was consistently higher than all other potential error types observed among researchers in this study

Conclusions

Acknowledgments

Data Availability

Funding Statement

References

Associated Data

Data Availability Statement

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases