On the combination of two visual cognition systems using combinatorial fusion

Amy Batallones; Kilby Sanchez; Brian Mott; Cameron Coffran; D Frank Hsu

2015 Feb 3;2(1):21–32. doi: 10.1007/s40708-015-0008-0

PMCID: PMC4883159 PMID: 27747501

Abstract

When combining decisions made by two separate visual cognition systems, statistical means such as the simple average (M₁) and weighted averages (M₂ and M₃), which incorporate the confidence level of each system, have been used. Although combination using these means can improve each of the individual systems, it is not known when and why this can happen. By extending a visual cognition system to become a scoring system based on each of the statistical means M₁, M₂, and M₃, respectively, the problem of combining visual cognition systems is transformed into the problem of combining multiple scoring systems. In this paper, we examine the combined results in terms of performance and diversity using combinatorial fusion, and study the issue of when and why a combined system can be better than the individual systems. A data set from an experiment with twelve trials is analyzed. The findings demonstrate that the combination of two visual cognition systems, based on weighted means M₂ or M₃, can improve each of the individual systems only when both of them have relatively good performance and they are diverse.

Keywords: Combinatorial fusion analysis (CFA), Decision-making, Visual cognition, Rank-score characteristics (RSC) function, Cognitive diversity

Introduction

Many decisions that humans have to make are partially, or even wholly, based on visual input. The split second nature of such decisions may make the process seem simple. However, there are many factors that are considered and combined during this short time frame. On a neurological level, there has been growing interest in understanding the factors that are combined within the visual aspect alone [1, 2], as well as how visual information is joined with information from other senses [3–7]. Combination of multiple visual decisions has also been explored [5, 8, 9].

Prior research into how pairs of people can interactively make decisions based on visual perception has been conducted by several researchers including Bahrami et al. [8], Ernst [5], and Kepecs et al. [9]. In Bahrami’s work, four predictive models are applied to experiments with varying degrees of noise, feedback, and communication: coin-flip (CF), behavioral feedback (BF), weighted confidence sharing (WCS), and direct signal sharing (DSS). Bahrami concludes that the WCS model is the only one that fits the empirical data. His findings indicate that the accuracy of decision-making is aided by communication between the members of a pair, which can greatly improve the pair’s overall performance.

Marc O. Ernst expands on the concept of WCS [5] between pairs by proposing a hypothetical soccer match during which two referees determine whether the ball falls behind a goal line. Similar to Bahrami’s proposal, Ernst’s findings indicate that simply taking the approach of BF or a CF omits information which could lead to an optimal joint decision between the pair. However, while Ernst agrees that the WCS model can lead to a beneficial joint determination, his findings also indicate that there are improvements that can be made to the WCS model to achieve a more optimal joint decision. With Ernst’s scenario, Bahrami’s WCS model can be applied as the distance of the individual’s decision (d _i) divided by the spread of the confidence distribution (σ), which is d _i/σ _i. A modified version of WCS (which closely resembles DSS) using sigma-square can produce a more accurate estimate through the joint opinion, which is represented as d _i/σ ²_i. In an affirmation of Bahrami’s research, Ernst also notes that joint decision-making comes with a cost when individuals with dissimilar judgments attempt to come to a consensus in such a manner. Bahrami and Ernst set forth very different experimental methods, but their aim is very much the same: to devise an algorithm for optimal decision-making between two people based on visual sensory input.

In the other direction, neural bases for decision-making and for combining sensory information within senses have been studied by Gold and Shadlen [10] and Hillis et al. [1]. Koriat [11] indicated that there is no need to combine two heads’ decisions under a normal environment. His suggestion is to simply take the decision of the most confident person.

Combinatorial Fusion Analysis (CFA), an emerging information fusion paradigm, was proposed for analyzing the combination of multiple scoring systems (MSS) (see Hsu et al. [12–14]). CFA has been shown to be useful in several research domains, including sensor feature selection and combination [15, 16], information retrieval, system selection and combination [12, 17], text categorization [18], protein structure prediction [19], image recognition [20], target tracking [21], ChIP-seq peak detection [22], and virtual screening [23]. Each of these studies has shown, in its respective domain, that a combination of MSS performs better than the individual systems when the individual scoring systems perform relatively well and are characteristically different [13, 14].

In a series of previous studies [24–26], a modified version of the soccer goal line decision proposed by Ernst was used as the data collection method. In this experiment, two subjects observe a small target being thrown into a grass field. The subjects are separately asked for their decisions on the perceived landing point of the target and for their respective confidence in those decisions. More recently, we conducted two sets of experiments with a total of 20 trials on two different days (12 trials and 8 trials) [27, 28]. In each of these trials, a small token was thrown into a grass field and landed at location A = (A_x, A_y). Two subjects P and Q, standing 40 feet away from the landing site, perceived the landing site to be at locations P = (P_x, P_y) and Q = (Q_x, Q_y) with confidence radii σ_P and σ_Q, respectively. In these works, each visual cognition system is treated as a scoring system which assigns a score to each of the partitioned intervals in the common visual space. The problem of combining visual cognition systems is thereby transformed into the problem of combining multiple scoring systems. The combination is analyzed using the CFA framework. The results showed that combination by rank as well as by score can improve on the individual systems.

In this paper, we explore the issue of when and why a combination of two cognitive systems is better than each individual system using the CFA. In particular, we use the concept of “cognitive diversity” and the notion of “performance ratio” to analyze the outcome of the combination. Using the data set from the experiment with twelve trials [27], we demonstrate, as in other domain applications, that combination is positive (better than or equal to the best of the two individual systems) only if the two systems, based on weighted mean using confidence radius, are relatively good (higher performance ratio) and they are diverse (higher cognitive diversity).

Section 2 of this paper discusses two methods of combining visual cognition systems: statistical mean and combinatorial fusion. In Sect. 2.1, three statistical means M ₁, M ₂, and M ₃ are calculated as average or weighted mean using the confidence radius as the weight. Based on these means, scoring systems p and q are constructed from the two visual cognition systems P and Q, respectively, in Sect. 2.2. Section 2.3 gives the method to combine these two visual scoring systems using the CFA framework. Section 3 gives the definition of cognitive diversity and the notion of performance ratio. Section 4 consists of examples, in particular the data set of an experiment with twelve trials of pairs of visual cognition systems [27]. Combination of these two visual cognition systems and analysis of the combination for the data set is discussed in more detail in Sect. 4.2 and 4.3. A summary of the results and possible future works is discussed in Sect. 5.

The CFA framework for combining two visual cognition systems

Computing various statistical means

When we make a decision based on visual input, we can consider this decision-making as a contemplation of various choices or candidates. Given two perceived locations P = (P_x, P_y) and Q = (Q_x, Q_y) (with confidence radii σ_P and σ_Q, respectively) of the actual landing site A = (A_x, A_y), we wish to find a new location L (obtained by the joint decision of P and Q) so that L is better than both P and Q, that is, the distance between L and A is smaller than both the distance between P and A and the distance between Q and A. When determining a joint decision, typically an average or a weighted average approach is used to determine a mean. The average mean M₁ = (M_{1x}, M_{1y}) of the two locations P = (P_x, P_y) and Q = (Q_x, Q_y) is calculated as

$$M_{1} = \frac{P + Q}{2}, \tag{1}$$

and the weighted means are obtained by

$$M_{2} = \frac{P / \sigma_{P} + Q / \sigma_{Q}}{1 / \sigma_{P} + 1 / \sigma_{Q}}, \tag{2}$$

and

$$M_{3} = \frac{P / \sigma_{P}^{2} + Q / \sigma_{Q}^{2}}{1 / \sigma_{P}^{2} + 1 / \sigma_{Q}^{2}}, \tag{3}$$

where P and Q are the perceived locations of the individual subjects P and Q, and σ_P and σ_Q are the confidence measurements of the two subjects, respectively.
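As a worked illustration of the three means defined above, the following Python sketch computes M₁, M₂, and M₃ for Trial 2 of Table 1 and their distances to the actual landing site A. The function names `weighted_mean` and `statistical_means` are illustrative and not part of the original study, and the σ values are assumed to be the ones listed in Table 1.

```python
import math

def weighted_mean(p, q, w_p, w_q):
    """Weighted mean of two 2-D points with positive weights w_p and w_q."""
    total = w_p + w_q
    return tuple((w_p * a + w_q * b) / total for a, b in zip(p, q))

def statistical_means(p, q, sigma_p, sigma_q):
    """Return (M1, M2, M3): simple average, 1/sigma and 1/sigma^2 weighting."""
    m1 = weighted_mean(p, q, 1.0, 1.0)
    m2 = weighted_mean(p, q, 1.0 / sigma_p, 1.0 / sigma_q)
    m3 = weighted_mean(p, q, 1.0 / sigma_p ** 2, 1.0 / sigma_q ** 2)
    return m1, m2, m3

# Trial 2 of Table 1: perceived points P and Q, actual landing site A.
P, Q, A = (23.5, 56.0), (112.0, 96.75), (28.5, 43.0)
SIGMA_P, SIGMA_Q = 7.0, 21.5

means = statistical_means(P, Q, SIGMA_P, SIGMA_Q)
distances = [math.dist(m, A) for m in means]  # performance m_i = d(M_i, A)

for i, (m, d) in enumerate(zip(means, distances), start=1):
    print(f"M{i} = ({m[0]:.2f}, {m[1]:.2f}), distance to A = {d:.2f}")
```

Under these assumptions the three distances come out at approximately 51.52, 28.45, and 17.26, which agrees with the Trial 2 column of Table 3. Because only the ratio of the two weights matters, scaling both σ values by the same factor leaves M₂ and M₃ unchanged.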

Converting each visual cognition system to a scoring system

In the experiments we conducted, each of the two subjects provides an individually determined decision on where he or she perceived the same target to have landed in a field. Each coordinate on the field can be considered a candidate for the respective participant’s decision on the perceived landing point. We are able to obtain a weight for each decision and their combination by asking each subject for a radius measure of confidence around his or her decision. The smaller the radius measure of confidence, the more confident the participant is. We use the radius R to calculate the spread (i.e., standard deviation), σ, of the distribution around the perceived landing point. In our research, we use

$$\sigma = 0.5 R. \tag{4}$$

Set common visual space

Formulas (1), (2), and (3) determine the positions of the means M₁, M₂, and M₃, respectively, with the σ values serving as the weights in Formulas (2) and (3). The distance between M_i and A, m_i = d(M_i, A), where A is the actual landing site, is used to evaluate the performance of M_i. With the field used as a two-dimensional coordinate grid, P, Q, and A are represented by x- and y-coordinates. Each of the three formulas gives a mean of P and Q, denoted M_i, where i = 1, 2, or 3. M_i falls on the line segment between points P and Q and is likewise represented as a coordinate pair.

The longer of the two segments PM_i and M_iQ is extended by 30% (to the left to point P′ if PM_i is the longer one, or to the right to point Q′ if M_iQ is the longer one). The shorter side is then extended further, so that M_i is the midpoint of P′ and Q′, which creates the widened observation area P′Q′. We refer to the line segment P′Q′ as the common visual space (Fig. 1).

Fig. 1 — The extension of PQ to P′Q′ based on M _i for i = 1, 2, or 3

We partition the line segment P′Q′, of length d(P′, Q′), into 127 intervals, each of length d(P′, Q′)/127, and denote the midpoint of interval i by d_i, i = 1, 2, …, 127. Because M_i is the midpoint of P′Q′, it lies in the center interval and, in this case, coincides with its midpoint d₆₄.
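The construction of the common visual space can be sketched in Python as follows. This is a minimal sketch under one stated assumption: "extended 30%" is read as multiplying the length of the longer of the two segments PM_i and M_iQ by 1.3, with the other side stretched to the same length so that M_i is the midpoint. The function name `common_visual_space` and its parameters are illustrative, not taken from the original study.

```python
import math

N_INTERVALS = 127
EXTENSION = 0.30  # assumption: the longer segment is lengthened by 30%

def common_visual_space(p, q, m, n=N_INTERVALS, extension=EXTENSION):
    """Return (P', Q', midpoints d_1..d_n) for the line through p and q,
    centred on the mean m (which must lie on the segment pq)."""
    length_pq = math.dist(p, q)
    # Unit vector pointing from P towards Q.
    unit = ((q[0] - p[0]) / length_pq, (q[1] - p[1]) / length_pq)
    # Half-length of P'Q': the longer of PM and MQ, extended by 30%.
    half = (1.0 + extension) * max(math.dist(p, m), math.dist(m, q))
    p_prime = (m[0] - half * unit[0], m[1] - half * unit[1])
    q_prime = (m[0] + half * unit[0], m[1] + half * unit[1])
    step = 2.0 * half / n  # length of each interval
    midpoints = []
    for k in range(1, n + 1):
        offset = (k - 0.5) * step  # distance of d_k from P'
        midpoints.append((p_prime[0] + offset * unit[0],
                          p_prime[1] + offset * unit[1]))
    return p_prime, q_prime, midpoints

# Trial 2 of Table 1, centred on the simple average M1.
P, Q = (23.5, 56.0), (112.0, 96.75)
M1 = ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)
p_prime, q_prime, d = common_visual_space(P, Q, M1)

print(len(d), "intervals; d_64 =", d[63], "; M1 =", M1)
```

With an odd number of intervals, the midpoint of the 64th interval falls on M_i, which matches the description of the center interval above.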

Treat P and Q as two scoring systems p and q

Normal distribution probability curves are created for each participant, with the points P and Q as the means and with σ²_P and σ²_Q, obtained from the confidence radii of P and Q, as the respective variances (see Fig. 2 for the case of 15 intervals). The following formula gives the normal distribution:

$$Y = \frac{1}{\sigma \sqrt{2 \pi}} \, e^{- \frac{(x - \mu)^{2}}{2 \sigma^{2}}}, \tag{5}$$

where x is a normal random variable, μ is the mean, and σ is the standard deviation. A normal distribution curve extends infinitely to the right and to the left. Therefore, our two scoring systems p and q create overlapping distributions that span the entire common visual space between P′ and Q′. Scoring systems p and q each score all of the 127 intervals on the common visual space. Using the normal distribution functions with points P and Q as the means and σ_P and σ_Q as the standard deviations, respectively, each of the scoring systems p and q assigns interval d_i a score between 0 and 1 according to formula (5) (see Fig. 2 for the case of 15 intervals). These are the score functions s_p and s_q. The values of each score function are sorted from highest to lowest to obtain the rank functions r_p and r_q, respectively (see Fig. 3). The d_i with the highest score receives the lowest integer, 1, as its rank.
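The scoring and ranking step can be sketched in Python as follows. Several choices here are assumptions rather than details confirmed by the text: positions are measured along the line from P towards Q, the σ values of Table 1 are used directly as standard deviations, the 30% extension is read as a factor of 1.3, and ties in score are broken by interval index. All function names are illustrative.

```python
import math
from statistics import NormalDist

N_INTERVALS = 127

def interval_midpoints(x_p, x_q, x_m, n=N_INTERVALS, extension=0.30):
    """Midpoints d_1..d_n of the common visual space, as positions on line PQ."""
    half = (1.0 + extension) * max(abs(x_m - x_p), abs(x_q - x_m))
    step = 2.0 * half / n
    return [x_m - half + (k - 0.5) * step for k in range(1, n + 1)]

def score_function(mu, sigma, midpoints):
    """Normal density centred on the perceived point, evaluated at each d_i."""
    density = NormalDist(mu, sigma)
    return [density.pdf(x) for x in midpoints]

def rank_function(scores):
    """Rank 1 goes to the highest score; ties are broken by interval index."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    ranks = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks

# Trial 2 of Table 1, with positions measured along the line from P towards Q.
P, Q = (23.5, 56.0), (112.0, 96.75)
SIGMA_P, SIGMA_Q = 7.0, 21.5
x_p, x_q = 0.0, math.dist(P, Q)
x_m1 = (x_p + x_q) / 2  # M1 is the midpoint of PQ

d = interval_midpoints(x_p, x_q, x_m1)
s_p = score_function(x_p, SIGMA_P, d)
s_q = score_function(x_q, SIGMA_Q, d)
r_p = rank_function(s_p)
r_q = rank_function(s_q)

print("top-ranked interval for p: d_%d" % (r_p.index(1) + 1))
print("top-ranked interval for q: d_%d" % (r_q.index(1) + 1))
```

Each scoring system ranks first the interval whose midpoint lies closest to its own perceived point, since the normal density decreases with distance from its mean.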

Fig. 2 — Partition of P′Q′ into 15 intervals with center M _i

Fig. 3 — Score and rank function for respective scoring systems p and q undergo CFA to produce score combination C and rank combination D

Combining scoring systems p and q using both score and rank combination

Let D be a set of candidates with |D| = n. Let N = [1, n] be the set of integers from 1 to n and R be the set of real numbers. In the context of the CFA framework, a scoring system A consists of a score function s_A and a rank function r_A on the set D of n possible positions (in this paper, D = {d_i | i = 1, 2, …, 127}).

In the setting of this paper, the score function s _C of the score combination of derived scoring systems p and q in our experiment is

$$s_{C}(d_{i}) = \frac{s_{p}(d_{i}) + s_{q}(d_{i})}{2}. \tag{6}$$

The score function s_D of the rank combination of the two scoring systems p and q in our experiment is

$$s_{D}(d_{i}) = \frac{r_{p}(d_{i}) + r_{q}(d_{i})}{2}. \tag{7}$$

When we sort s _C(d _i) in descending order, we obtain the rank function of the score combination, called r _C(d _i). When we sort s _D(d _i) in ascending order, we obtain the rank function of the rank combination, called r _D(d _i). The top ranked interval in r _C(d _i) is called C. The top ranked interval in r _D(d _i) is called D (see Fig. 3). These points are considered the optimal score and rank combination, respectively, and are used for evaluation of the combination result. The performance of the points (P, Q, M _i, C, and D) is determined by each respective point’s distance from target A. A shorter distance indicates higher performance (Fig. 4).
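The score combination and rank combination described above can be sketched in Python as follows. The seven scores per system are made-up values chosen only to illustrate the mechanics (they are not data from the study), the function names are illustrative, and breaking ties by interval index is an assumption, since the text does not state a tie rule.

```python
def rank_function(values, descending=True):
    """Assign ranks 1..n (1 = best). Ties are broken by interval index,
    which is an assumption; the source does not specify a tie rule."""
    sign = -1.0 if descending else 1.0
    order = sorted(range(len(values)), key=lambda i: (sign * values[i], i))
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks

def combine(s_p, s_q):
    """Return the score combination, the rank combination and their ranks."""
    r_p, r_q = rank_function(s_p), rank_function(s_q)
    s_c = [(a + b) / 2 for a, b in zip(s_p, s_q)]  # average of the scores
    s_d = [(a + b) / 2 for a, b in zip(r_p, r_q)]  # average of the ranks
    r_c = rank_function(s_c, descending=True)      # high score is good
    r_d = rank_function(s_d, descending=False)     # low average rank is good
    return s_c, s_d, r_c, r_d

# Made-up scores for seven intervals d_1..d_7 (illustrative, not study data).
s_p = [0.18, 0.50, 0.22, 0.06, 0.02, 0.01, 0.005]
s_q = [0.05, 0.12, 0.22, 0.26, 0.18, 0.10, 0.03]

s_c, s_d, r_c, r_d = combine(s_p, s_q)
top_c = r_c.index(1) + 1  # 1-based index of the interval called C
top_d = r_d.index(1) + 1  # 1-based index of the interval called D
print(f"C is interval d_{top_c}, D is interval d_{top_d}")
```

For these toy scores the script prints `C is interval d_2, D is interval d_3`: the score combination follows the sharply peaked system p, while the rank combination settles on an interval that both systems rank second, which shows how C and D can differ.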

Fig. 4 — Layout of M_i, i = 1, 2, or 3, C, and D in relation to P, Q, and their distance to A. The distances between the 5 estimated points and A are noted on each line [24]

Cognitive diversity and performance ratio

Cognitive diversity

Given the score function s_A of a scoring system A and its derived rank function r_A, the rank-score characteristic (RSC) function f_A, defined by Hsu et al. [13, 14], is the composite of s_A and the inverse of r_A. It is a function from N to R and is computed as (see Fig. 5)

$$f_{A}(i) = (s_{A} \circ r_{A}^{-1})(i) = s_{A}(r_{A}^{-1}(i)). \tag{8}$$

Fig. 5 — Score function s _A, rank function r _A, and RSC function f _A of the scoring system A [13, 14]

The cognitive diversity between two scoring systems p and q, d(p,q) is calculated using RSC functions f _p and f _q (also see [23]) as

$$d(p, q) = d(f_{p}, f_{q}) = \left( \frac{\sum_{i = 1}^{127} (f_{p}(i) - f_{q}(i))^{2}}{127} \right)^{1/2}. \tag{9}$$
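A minimal Python sketch of the RSC function and the cognitive diversity follows. Since f_A(i) is the score held by the item of rank i, the RSC function is simply the list of scores sorted in descending order. The function names and the seven-interval scores are illustrative and not data from the study; the sketch returns the raw value, without the normalization to (0, 1] that is applied later in the paper.

```python
import math

def rsc_function(scores):
    """f(i) = s(r^{-1}(i)): the score of the item ranked i, for i = 1..n."""
    return sorted(scores, reverse=True)

def cognitive_diversity(scores_p, scores_q):
    """Root-mean-square difference between the two RSC functions."""
    f_p, f_q = rsc_function(scores_p), rsc_function(scores_q)
    n = len(f_p)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_p, f_q)) / n)

# Made-up scores for seven intervals (illustrative, not study data).
s_p = [0.18, 0.50, 0.22, 0.06, 0.02, 0.01, 0.005]
s_q = [0.05, 0.12, 0.22, 0.26, 0.18, 0.10, 0.03]

diversity = cognitive_diversity(s_p, s_q)
print(f"d(p, q) = {diversity:.4f}")
```

The measure compares the shapes of the two sorted score curves rather than the positions of the intervals, so two systems that peak at different intervals but with identically shaped score curves have zero diversity.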

Performance ratio

The performances of P and Q in all trials are used in calculating the performance ratio. The performance of P (or Q) is determined by the distance between P (or Q) and A, d(P, A) [or d(Q, A)]. A shorter distance indicates higher performance. Each distance is inverted and then multiplied by the maximum distance md = max{d(P_i, A_i), d(Q_i, A_i) | i = 1, 2, …, 12} over all trials. Let $MAX = \max \{\frac{md}{d(P_{i}, A_{i})}, \frac{md}{d(Q_{i}, A_{i})} \mid i = 1, 2, \dots, 12\}$. Each number in this set is then divided by MAX, so that the performance of each of the 12 P and Q values lies in (0, 1]. For each trial, the performance ratio P_l/P_h is the lower of the two performances of P and Q divided by the higher; the twelve ratios are then normalized again to lie in (0, 1].
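The normalization steps above can be sketched in Python using the distances d(P_i, A_i) and d(Q_i, A_i) listed in column (a) of Table 2. The function names are illustrative, and the final step assumes that "normalized again" means dividing each ratio by the largest of the twelve ratios, which the text does not state explicitly.

```python
# d(P_i, A_i) and d(Q_i, A_i) for Trials 1-12, copied from Table 2, column (a).
DIST_P = [20.41, 13.93, 67.25, 13.12, 34.56, 104.86,
          8.53, 99.5, 11.28, 26.97, 11.17, 79.54]
DIST_Q = [24.52, 99.3, 49.98, 41.74, 11.29, 38.26,
          32.83, 59.98, 4.61, 47.68, 21.83, 26.17]

def normalized_performance(dist_p, dist_q):
    """Invert each distance, scale by the maximum distance md, divide by MAX."""
    md = max(dist_p + dist_q)
    inv_p = [md / d for d in dist_p]
    inv_q = [md / d for d in dist_q]
    max_inv = max(inv_p + inv_q)  # the quantity called MAX in the text
    return [v / max_inv for v in inv_p], [v / max_inv for v in inv_q]

def performance_ratios(dist_p, dist_q):
    """Lower performance over higher performance per trial, rescaled so the
    largest ratio equals 1 (assumed meaning of 'normalized again')."""
    perf_p, perf_q = normalized_performance(dist_p, dist_q)
    raw = [min(a, b) / max(a, b) for a, b in zip(perf_p, perf_q)]
    top = max(raw)
    return [r / top for r in raw]

perf_p, perf_q = normalized_performance(DIST_P, DIST_Q)
ratios = performance_ratios(DIST_P, DIST_Q)
for trial, ratio in enumerate(ratios, start=1):
    print(f"Trial {trial:2d}: performance ratio = {ratio:.3f}")
```

Because performance is inversely proportional to distance, the unnormalized ratio for a trial reduces to the shorter distance divided by the longer one; the scaling by md and MAX cancels within each trial.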

Example

Data set

We use the data set from an experiment of twelve trials conducted by the authors in [27]. Each trial consists of two volunteers P and Q with confidence radius σ _P and σ _Q. Each gives a visual cognitive estimate of the actual token landing site A as P and Q respectively.

Table 1 lists coordinates of P (P _x, P _y), Q (Q _x, Q _y), and A (A _x, A _y) as well as the confidence radius σ _P and σ _Q of P and Q respectively.

Table 1.

Coordinates of P, Q, and A and confidence radius (σ) of P and Q for the 12 trials [27]

Trial	(P _x, P _y)	σ _P	(Q _x, Q _y)	σ _Q	(A _x , A _y)
1	(11.5, 134.5)	11.5	(78.5, 105)	16	(94, 124)
2	(23.5, 56)	7	(112, 96.75)	21.5	(28.5, 43)
3	(105, 134.25)	21	(78.5, 87.75)	22	(39.5, 119)
4	(229.25, 151.5)	14	(256, 162.5)	15.5	(216.25, 149.75)
5	(125.5, 13.5)	0.5	(112.75, 57.25)	3	(113.75, 46)
6	(184.5, 108.25)	21.5	(164.5, 249.75)	12	(173.25, 212.5)
7	(22, 190.5)	7	(17, 227.75)	6	(14.75, 195)
8	(98.75, 57)	12.5	(71.25, 25.5)	12	(16.5, 1)
9	(205.5, 15)	17	(204, 21.5)	6.5	(203, 26)
10	(100.5, 4.5)	19.5	(172, 25.25)	6	(127, 9.5)
11	(236.25, 43)	4	(234, 72.75)	4.5	(229, 51.5)
12	(98.5, −75.5)	10	(99, 30)	12	(96, 4)

Combination results and analysis

The decision of Participant p, marked as P, and the decision of Participant q, marked as Q, are used to obtain the line segment PQ. The radii of confidence are used to calculate the two σ values, which locate the coordinates of the points M₁, M₂, and M₃ along the extended segment P′Q′. To combine and compare the two visual decision systems p and q, a common space that both systems can evaluate must be established. The 127 intervals along the line P′Q′ serve as this common visual space to be scored.

When P′Q′ has been partitioned into the 127 intervals mapped according to M _i, the intervals are scored according to the normal distribution curves of P and Q using the standard deviation σ _P and σ _Q, respectively. Both systems assume the set of common interval midpoints d ₁, d ₂, d ₃,…,d ₁₂₇. Each scoring system, p and q, consists of a score function. We define score functions s _P(d _i) and s _Q(d _i) that map each interval, d _i, to a score in systems p and q, respectively. The rank function of each of the systems p and q maps each element d _i to a positive integer in N, where N = {x | 1 ≤ x ≤ 127}. We obtained the rank functions r _P(d _i) and r _Q(d _i) by sorting s _P(d _i) and s _Q(d _i) in descending order and assigning a rank value from 1 to 127 to each interval. C and D based on M _i, for i = 1, 2, and 3, are calculated, and the distances to target A are computed. The point with the shorter distance from the target is considered the point with the better performance.

Table 2 lists the performance of (P, Q), the confidence radius of P and Q, and the performance of C and D based on M_i, i = 1, 2, and 3. Table 3 lists the performance of M_i, i = 1, 2, and 3, in the twelve trials. Table 4 gives comparisons of the performance of C or D to that of P and Q, and to that of M_i. We note that Koriat’s criterion, taking the decision of the most confident system, gives a correct prediction in 7 out of the 12 trials (Trials 1, 2, 4, 6, 8, 9, and 11). The score combination C or rank combination D obtained by CFA improves on P and Q in 8, 7, and 6 out of the 12 trials when the common visual space mean is M₁, M₂, and M₃, respectively. It is interesting to note that C or D improves on P and Q in more trials when based on M₁ than when based on M₂ or M₃, because M₁, unlike the weighted means, does not take the confidence radius into consideration (Table 4(a)). The same reasoning applies to Table 4(b), where C or D improves on M₁ in more trials than on M₂ or M₃. In addition, the five trials in which Koriat’s criterion fails (Trials 3, 5, 7, 10, and 12, according to Table 2) can all be improved using the CFA framework based on M₁ (Table 4(a)).

Table 2.

Performance of combination: (a) Performance of P, Q, (b) Confidence radius of P, Q, (c) Performance of C and D based on M ₁, M ₂, and M ₃, respectively

Trial	(a) Performance of (P, Q)	(b) Confidence radius (σ_P, σ_Q)	(C)(1) Performance of (C, D) based on M₁	(C)(2) Performance of (C, D) based on M₂	(C)(3) Performance of (C, D) based on M₃
1	(20.41, 24.52)	(11.5, 16)	(20.24, 20.24)	(20.63, 20.07)	(20.14, 20.14)
2	(13.93, 99.3)	(7, 21.5)	(13.96, 13.96)	(13.91, 13.91)	(13.91, 13.91)
3	(67.25, 49.98)	(21, 22)	(66.71, 49.94)	(66.72, 67.13)	(66.70, 67.15)
4	(13.12, 41.74)	(14, 15.5)	(14.47, 13.23)	(14.40, 13.11)	(14.48, 13.19)
5	(34.56, 11.29)	(0.5, 3)	(34.38, 11.12)	(10.95, 10.95)	(34.51, 10.94)
6	(104.86, 38.26)	(21.5, 12)	(37.70, 37.70)	(37.63, 37.63)	(37.95, 37.95)
7	(8.53, 32.83)	(7, 6)	(32.68, 8.44)	(32.88, 32.44)	(32.68, 32.68)
8	(99.5, 59.98)	(12.5, 12)	(60.13, 60.13)	(59.79, 59.79)	(59.90, 59.90)
9	(11.28, 4.61)	(17, 6.5)	(4.86, 4.64)	(4.86, 4.65)	(4.95, 4.56)
10	(26.97, 47.68)	(19.5, 6)	(47.38, 26.68)	(47.73, 46.48)	(47.24, 47.24)
11	(11.17, 21.83)	(4, 4.5)	(11.08, 11.08)	(11.08, 11.08)	(11.22, 10.92)
12	(79.54, 26.17)	(10, 12)	(79.12, 25.76)	(79.80, 78.53)	(78.86, 78.86)

Bold numbers indicate C and/or D perform better than P and Q in (C)(1), (C)(2) and (C)(3). Bold numbers indicate better performance of the two systems in (a) and higher confidence in (b)
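The trials in which Koriat’s criterion (taking the decision of the most confident system) picks the better performer can be checked directly from columns (a) and (b) of Table 2. The following Python sketch does so; the function name is illustrative, and a smaller confidence radius is taken to mean higher confidence, as stated earlier in the paper.

```python
# (d(P, A), d(Q, A)) per trial, from Table 2 column (a).
DIST = [(20.41, 24.52), (13.93, 99.3), (67.25, 49.98), (13.12, 41.74),
        (34.56, 11.29), (104.86, 38.26), (8.53, 32.83), (99.5, 59.98),
        (11.28, 4.61), (26.97, 47.68), (11.17, 21.83), (79.54, 26.17)]
# (sigma_P, sigma_Q) per trial, from Table 2 column (b).
SIGMA = [(11.5, 16), (7, 21.5), (21, 22), (14, 15.5),
         (0.5, 3), (21.5, 12), (7, 6), (12.5, 12),
         (17, 6.5), (19.5, 6), (4, 4.5), (10, 12)]

def koriat_correct_trials(dist, sigma):
    """Trials where the more confident subject is also the closer to A."""
    correct = []
    for trial, ((d_p, d_q), (s_p, s_q)) in enumerate(zip(dist, sigma), start=1):
        most_confident = "P" if s_p < s_q else "Q"  # smaller radius wins
        best_performer = "P" if d_p < d_q else "Q"  # shorter distance wins
        if most_confident == best_performer:
            correct.append(trial)
    return correct

correct = koriat_correct_trials(DIST, SIGMA)
failed = [t for t in range(1, 13) if t not in correct]
print("Koriat's criterion correct in trials:", correct)
print("Koriat's criterion fails in trials:", failed)
```

Run on the Table 2 values, this lists Trials 1, 2, 4, 6, 8, 9, and 11 as correct, which is the 7 out of 12 reported in the text.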

Table 3.

Performance of M ₁, M ₂, M ₃ in 12 trials

	Trial
	1	2	3	4	5	6	7	8	9	10	11	12
M ₁	4.37	51.52	52.86	27.35	11.91	33.52	14.90	79.45	7.95	10.70	8.84	26.89
M ₂	4.13	28.45	53.08	26.62	33.09	13.53	16.21	79.05	6.45	30.20	8.28	31.66
M ₃	6.28	17.26	53.32	25.90	34.51	5.41	17.53	78.64	5.46	41.25	7.78	36.36

Each bold number indicates the performance of M _i in the Trial is better than P and Q. M ₃ is best among M _i’s in Trials 2, 4, 6, 8, 9, and 11
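The two comparisons summarized in the note to Table 3 can be reproduced from the tabulated distances. The Python sketch below counts, for each trial, which of M₁, M₂, and M₃ is closest to A, and lists the trials in which each mean is closer to A than both P and Q (using column (a) of Table 2). The variable names are illustrative.

```python
# Distances to A for P and Q (Table 2, column (a)), Trials 1-12.
DIST_P = [20.41, 13.93, 67.25, 13.12, 34.56, 104.86,
          8.53, 99.5, 11.28, 26.97, 11.17, 79.54]
DIST_Q = [24.52, 99.3, 49.98, 41.74, 11.29, 38.26,
          32.83, 59.98, 4.61, 47.68, 21.83, 26.17]
# Distances to A for the three means (Table 3), Trials 1-12.
MEANS = {
    "M1": [4.37, 51.52, 52.86, 27.35, 11.91, 33.52,
           14.90, 79.45, 7.95, 10.70, 8.84, 26.89],
    "M2": [4.13, 28.45, 53.08, 26.62, 33.09, 13.53,
           16.21, 79.05, 6.45, 30.20, 8.28, 31.66],
    "M3": [6.28, 17.26, 53.32, 25.90, 34.51, 5.41,
           17.53, 78.64, 5.46, 41.25, 7.78, 36.36],
}

# How often each mean is the best of the three (smallest distance to A).
best_counts = {name: 0 for name in MEANS}
for trial in range(12):
    best = min(MEANS, key=lambda name: MEANS[name][trial])
    best_counts[best] += 1

# Trials in which a mean is closer to A than both P and Q.
improving = {
    name: [t + 1 for t in range(12) if values[t] < min(DIST_P[t], DIST_Q[t])]
    for name, values in MEANS.items()
}

print("best mean per trial, counts:", best_counts)
print("trials where each mean improves on P and Q:", improving)
```

On these values M₁, M₂, and M₃ are the best of the three means in 5, 1, and 6 trials, and they improve on both P and Q in 4, 3, and 3 trials, respectively.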

Table 4.

Comparisons of performance of C or D to that (a) of P and Q, (b) of M _i, and (c) of P, Q, and M _i (set of 36 cases in Table 2)

	(a) C or D ≥ P and Q	(b) C or D ≥ M _i	(c) C or D ≥ P, Q,& M _i
M ₁	1, 3, 5, 6, 7, 10, 11, 12 (8/12)	2, 3, 4, 5, 7, 8, 9, 12 (8/12)	3, 5, 7, 12 (4/12)
M ₂	1, 2, 4, 5, 6, 8, 11 (7/12)	2, 4, 5, 8, 9 (5/12)	2, 4, 5, 8 (4/12)
M ₃	1, 2, 5, 6, 8, 9 (6/12)	2, 4, 5, 8, 9 (5/12)	2, 5 (2/12)
Total	21/36	18/36	10/36

Figures 6 and 7 illustrate the performances of P, C, D, M_i, and Q for i = 1, 2, and 3 in Trials 2 and 7, respectively. In Trial 2, P performs quite well and has higher confidence (a smaller confidence radius) than Q. With the weighted means M₂ and M₃, combinatorial fusion C or D performs better than P and Q. In Trial 7, however, P performs better but has lower confidence (a larger confidence radius) than Q. In this case, C or D does not improve on P and Q based on M₂ or M₃, where more weight is given to Q. Therefore, we observe that giving more weight to the better performer, when it also has the higher confidence, leads to a combination which improves on P and Q. We call such a case a positive case. In Sect. 4.3, we investigate in general when a combination (either rank or score combination) can improve on P and Q.

Fig. 6 — Performance of P, Q, C, and D for Trial 2 based on M₁ (a), M₂ (b), and M₃ (c), respectively

Fig. 7 — Performance of P, Q, C, and D for Trial 7 based on M₁ (a), M₂ (b), and M₃ (c), respectively

Positive cases versus Negative cases

We plot the result of each score or rank combination of P and Q, marking positive cases as “□” or “◊” and negative cases as “×” or “+”, on a two-dimensional coordinate plane with the cognitive diversity d(P, Q) on the y-axis and the performance ratio P_l/P_h (lower performance over higher performance) on the x-axis, for all the trials and for each M_i, i = 1, 2, or 3. Each trial within each graph is noted as positive when the rank or score combination performs better than both P and Q, and negative when it does not. The averages of all positive cases and of all negative cases are also marked on each graph as “■” and “X”, respectively.

Cognitive diversity between P and Q, d(P, Q), is the diversity between two RSC functions f _p and f _q, d(f _p, f _q), and is calculated using formula (9). Cognitive diversity values are normalized to (0, 1] in each case based on M _i, i = 1, 2, and 3 (see Table 5). Figure 8 depicts the positive versus negative cases based on each M_i, i = 1, 2, and 3 (Fig. 8a–c respectively) in terms of cognitive diversity (y-axis) and performance ratio (x-axis).

Table 5.

Cognitive diversity

Trial	d(p, q) in M ₁	d(p, q) in M ₂	d(p, q) in M ₃
1	0.338434959	0.194412684	0.291268428
2	0.785773308	0.596314746	0.758254198
3	0.056297571	0.003988946	0.059975847
4	0.081718215	0.007480963	0.106193355
5	0.546649181	0.257373029	0.394914152
6	0.474573259	0.315880355	0.443491266
7	0.053300385	0.017943343	0.016196129
8	0.040005652	0.002402607	0.039874004
9	0.516003209	0.36502678	0.779226911
10	1	0.774343099	1
11	0.024840875	1	0.02068517
12	0.068319595	0.060741956	0.093857643

Fig. 8 — Positive versus negative cases resulting from the 24 score and rank combinations in terms of cognitive diversity d(P, Q) (y-axis) and performance ratio P_l/P_h (x-axis) based on M₁ (a), M₂ (b), and M₃ (c), respectively

Summary and future work

In our previous work [27, 28], it was demonstrated that a combination of two visual cognition systems using the CFA framework can improve on each of the individual systems. In this paper, we analyze the outcomes of these combinations as positive or negative cases using the notions of cognitive diversity and performance ratio on the data set of an experiment with 12 trials [27]. It is demonstrated that in the majority of the 72 cases of rank combinations and score combinations (12 × 2 × 3 = 72) (see Fig. 8a–c), a combination of two visual systems, based on weighted means M₂ or M₃, can outperform each of the individual systems only if they each perform relatively well (with a higher performance ratio) and they are diverse (with high cognitive diversity).

In an earlier work by Hsu and Taksa [12], it was shown that under certain conditions, rank combination can be better than score combination. In the current study, each of six trials (Trials 1, 2, 5, 6, 9, and 10) has higher diversity than each of the remaining six trials. Similar to the results in [12], these six trials do have better rank combination (D) than score combination (C). It is also interesting to note that improvement in the other six trials (Trials 3, 4, 7, 8, 11, and 12) was achieved by rank combination only. In other cases, whenever score combination (C) improves on P and Q, rank combination (D) also improves on them. All of this indicates that the CFA framework, which uses both score and rank combination, is robust in analyzing combination and decision problems for visual cognition systems.

In the combination of decisions or visual cognition systems, as well as the integration of signals from different sensors, statistical means or weighted means such as M ₁, M ₂, or M ₃ are often used [1, 3, 4, 5, 8]. It has been observed in these previous studies that M ₃, using 1/σ ²_P (or 1/σ ²_Q) as the weight assigned to system P (or Q), provides better combination results. In our current study, when comparing M ₁, M ₂, and M ₃ in each of the 12 trials, it is shown that M ₃ is better than M ₁ and M ₂ in 6 of the 12 trials, while M ₁ and M ₂ are the best in 5 and 1 of the 12 trials respectively, independent of the performance of P and Q. So our current study supports that observation. However, when comparing improvements of M _i over P and Q, it was shown in our study that the statistical means M ₁, M ₂, and M ₃ can improve P and Q in 4, 3, and 3 trials, respectively (see Table 3). On the other hand, the CFA framework (C or D) based on M₁, M₂, or M₃ can improve P and Q in 8, 7, or 6 trials. All these indicate that the CFA framework is a viable analytic method in combining visual cognition systems and can be generalized to analyze data in bioinformatics and neuroscience.

In summary, our CFA framework provides two criteria, performance ratio and cognitive diversity, to guide the combination of two visual cognition systems with confidence radii. In the case of unsupervised learning, or when the performance cannot be evaluated (e.g., the location of A is not known), cognitive diversity alone can be used to decide when to combine (when the cognitive diversity is large enough) and how to combine (using rank combination or score combination) (see [12, 14, 21–23]).

Our future work includes the following:

Apply CFA framework to the combination of more than two visual systems;
Study the effect of the number of partition intervals in the common visual space defined by P′Q′;
Use other diversity measurements such as Pearson’s correlation (between two score functions s _A and s _B) and Kendall’s tau (see [29]) or Spearman’s rho (between two rank functions r _A and r _B); and
Apply CFA framework to combination of multiple sensing systems or combination of multi-modal physiological systems.

Acknowledgments

We thank two anonymous referees for helpful comments and suggestions which led to improvement of the manuscript. DFH is partially supported by a travel fund provided by DIMACS and CCICADA at Rutgers University.

Biographies

Amy Batallones received her Bachelor of Science in Computer Science at Fordham University in New York. Her current research interests center around the methods of combination of visual cognition systems using informatics and information fusion.

Kilby Sanchez received a B.A. and an M.S., both in computer science, from Fordham University. His research interests include the application of combinatorial techniques in econometrics, economic forecasting, decision making, and financial modeling. He is also interested in the design of intelligent motion and signal detection systems using machine learning and data mining.

Brian Mott received a B.A. from the University of Buffalo and an M.S. from Fordham University. He is currently continuing his research in combinatorial fusion for financial decision makers. He is also interested in predictive analytics in the financial field.

Cameron Coffran received a B.S. and an M.S., both in computer science, from Fordham University. His current work involves clinical bioinformatics and software engineering for time series analysis. He is also interested in further work related to combinatorial fusion and the management of large clinical data.

D. Frank Hsu, Ph.D., is the Clavius Professor of Science and a Professor of Computer and Information Science at Fordham University in New York, NY. Dr. Hsu received an M.S. from the University of Texas at El Paso, Texas, and a Ph.D. from the University of Michigan. His research areas include combinatorics and algorithms; network interconnections and communications; and informatics and analytics. He has co-developed “combinatorial fusion algorithm” and showed its applications in a variety of domain areas including cognitive science, bioinformatics, virtual screening, target tracking, and multimodal feature fusion. He has served on several editorial boards including Brain Informatics, Health Information Science (book monograph series), IEEE Transactions on Computer, International Journal of Foundation of Computer Science, Journal of Advanced Mathematics and Applications, and Journal of Interconnection Networks.

Contributor Information

Amy Batallones, Email: abatallones@fordham.edu.

Kilby Sanchez, Email: kisanchez@fordham.edu.

Brian Mott, Email: bmott290@gmail.com.

Cameron Coffran, Email: cameron@rockefeller.edu.

D. Frank Hsu, Email: hsu@cis.fordham.edu.

References

1. Hillis JM, Ernst MO, Banks MS, Landy MS. Combining sensory information: mandatory fusion within, but not between, senses. Science. 2002;298(5598):1627–1630. doi: 10.1126/science.1075396.
2. Tong F, Meng M, Blake R. Neural basis of binocular rivalry. Trends Cogn Sci. 2006;10(11):502–511. doi: 10.1016/j.tics.2006.09.003.
3. Ernst MO, Banks MS. Humans integrate visual and haptic information in a statistically optimal fashion. Nature. 2002;415:429–433. doi: 10.1038/415429a.
4. Ernst MO. Learning to integrate arbitrary signals from vision and touch. J Vis. 2007;7(5):7. doi: 10.1167/7.5.7.
5. Ernst MO. Decisions made better. Science. 2010;329(5995):1022–1023. doi: 10.1126/science.1194920.
6. Gepshtein S, Burge J, Ernst O, Banks S. The combination of vision and touch depends on spatial proximity. J Vis. 2009;5(11):1013–1023. doi: 10.1167/5.11.7.
7. Lunghi C, Binda P, Morrone C. Touch disambiguates rivalrous perception at early stages of visual analysis. Curr Biol. 2010;20(4):R143–R144. doi: 10.1016/j.cub.2009.12.015.
8. Bahrami B, Olsen K, Latham P, Roepstroff A, Rees G, Frith C. Optimally interacting minds. Science. 2010;329(5995):1081–1085. doi: 10.1126/science.1185718.
9. Kepecs A, Uchida N, Zariwala H, Mainen Z. Neural correlates, computation and behavioural impact of decision confidence. Nature. 2008;455:227–231. doi: 10.1038/nature07200.
10. Gold JI, Shadlen N. The neural basis of decision making. Annu Rev Neurosci. 2007;30:535–574. doi: 10.1146/annurev.neuro.29.051605.113038.
11. Koriat A. When are two heads better than one. Science. 2012;336:360–362. doi: 10.1126/science.1216549.
12. Hsu DF, Taksa I. Comparing rank and score combination methods for data fusion in information retrieval. Inf Retrieval. 2005;8(3):449–480. doi: 10.1007/s10791-005-6994-4.
13. Hsu DF, Chung YS, Kristal BS. Combinatorial fusion analysis: methods and practice of combining multiple scoring systems. In: Hsu HH, editor. Advanced data mining technologies in bioinformatics. Hershey: Idea Group Inc; 2006. pp. 1157–1181.
14. Hsu DF, Kristal BS, Schweikert C. Rank-score characteristics (RSC) function and cognitive diversity. Brain Inform. 2010;6334:42–54. doi: 10.1007/978-3-642-15314-3_5.
15. Deng Y, Hsu DF, Wu Z, Chu CH. Combining multiple sensor features for stress detection using combinatorial fusion. J Interconnect Netw. 2012;13(03n04).
16. Deng Y, Wu Z, Chu CH, Zhang Q, Hsu DF. Sensor feature selection and combination for stress identification using combinatorial fusion. Int J Adv Rob Syst. 2013;10.
17. Liu CY, Tang CY, Hsu DF. Comparing system selection methods for the combinatorial fusion of multiple retrieval systems. J Interconnect Netw. 2013;14(01).
18. Li Y, Hsu DF, Chung SM. Combination of multiple feature selection methods for text categorization by using combinatorial fusion analysis and rank-score characteristic. Int J Artif Intell Tools. 2013;22(02).
19. Lin K-L, Lin C-Y, Huang C-D, Chang H-M, Yang C-Y, Lin C-T, Tang CY, Hsu DF. Feature selection and combination criteria for improving accuracy in protein structure prediction. IEEE Trans Nanobiosci. 2007;6(2):186–196. doi: 10.1109/TNB.2007.897482.
20. Liu H, Wu ZH, Zhang X, Hsu DF. A skeleton pruning algorithm based on information fusion. Pattern Recogn Lett. 2013;34(10):1138–1145. doi: 10.1016/j.patrec.2013.03.013.
21. Lyons DM, Hsu DF. Combining multiple scoring systems for target tracking using rank–score characteristics. Inf Fusion. 2009;10(2):124–136. doi: 10.1016/j.inffus.2008.08.009.
22. Schweikert C, Brown S, Tang Z, Smith PR, Hsu DF. Combining multiple ChIP-seq peak detection systems using combinatorial fusion. BMC Genom. 2012;13(Suppl 8):S12. doi: 10.1186/1471-2164-13-S8-S12.
23. Yang JM, Chen YF, Shen TW, Kristal BS, Hsu DF. Consensus scoring criteria for improving enrichment in virtual screening. J Chem Inf Model. 2005;45:1134–1146. doi: 10.1021/ci050034w.
24. McMunn-Coffran C, Paolercio E, Liu H, Tsai R, Hsu DF. Joint decision making in visual cognition using Combinatorial Fusion Analysis. In: Proceedings of the 10th IEEE international conference on cognitive informatics and cognitive computing; 2011. pp. 254–261.
25. McMunn-Coffran C, Paolercio E, Fei Y, Hsu DF. Combining visual cognition systems for joint decision making using combinatorial fusion. In: Proceedings of the 11th IEEE international conference on cognitive informatics and cognitive computing; 2012. pp. 313–322.
26. Paolercio E, McMunn-Coffran C, Mott B, Hsu DF, Schweikert C. Fusion of two visual perception systems utilizing cognitive diversity. In: Proceedings of the 12th IEEE international conference on cognitive informatics and cognitive computing; 2013. pp. 226–235.
27. Batallones A, McMunn-Coffran C, Mott B, Sanchez K, Hsu DF. Comparative study of joint decision-making on two visual cognition systems using combinatorial fusion. Active Media Technology. Lecture Notes in Computer Science, Series No. 7669; 2012. pp. 215–225.
28. Batallones A, McMunn-Coffran C, Mott B, Sanchez K, Hsu DF. Combining two visual cognition systems using confidence radius and combinatorial fusion. Brain and Health Informatics. Lecture Notes in Computer Science, Series No. 8211; 2013. pp. 72–81.
29. Ng KB, Kantor PB. Predicting the effectiveness of naive data fusion on the basis of system characteristics. J Am Soc Inform Sci. 2000;51(12):1177–1189. doi: 10.1002/1097-4571(2000)9999:9999<::AID-ASI1030>3.0.CO;2-E.

