Proceedings. Mathematical, Physical, and Engineering Sciences. 2019 Aug 21;475(2228):20180897. doi: 10.1098/rspa.2018.0897

Quantitative classification of vortical flows based on topological features using graph matching

Paul S Krueger 1, Michael Hahsler 2, Eli V Olinick 2, Sheila H Williams 1, Mohammadreza Zharfa 1
PMCID: PMC6735483  PMID: 31534418

Abstract

Vortical flow patterns generated by swimming animals or flow separation (e.g. behind bluff objects such as cylinders) provide important insight to global flow behaviour such as fluid dynamic drag or propulsive performance. The present work introduces a new method for quantitatively comparing and classifying flow fields using a novel graph-theoretic concept, called a weighted Gabriel graph, that employs critical points of the velocity vector field, which identify key flow features such as vortices, as graph vertices. The edges (connections between vertices) and edge weights of the weighted Gabriel graph encode local geometric structure. The resulting graph exhibits robustness to minor changes in the flow fields. Dissimilarity between flow fields is quantified by finding the best match (minimum difference) in weights of matched graph edges under relevant constraints on the properties of the edge vertices, and flows are classified using hierarchical clustering based on computed dissimilarity. Application of this approach to a set of artificially generated, periodic vortical flows demonstrates high classification accuracy, even for large perturbations, and insensitivity to scale variations and number of periods in the periodic flow pattern. The generality of the approach allows for comparison of flows generated by very different means (e.g. different animal species).

Keywords: vortices, wakes, topology, graph matching, Gabriel graph

1. Introduction

A variety of flow configurations including flow around blocks and oscillating cylinders [1–4] and fluid motion resulting from animal locomotion [5–12] are well known to produce a range of distinct vortical flow patterns depending on the flow generating conditions. Categorization of flow fields according to general patterns observed has been helpful in describing the flow behaviour as the vortical arrangement is related to the vortex generation mechanisms and directly determines system dynamics such as drag (e.g. in flow around cylinders) or propulsive performance (e.g. in animal locomotion). Often flow field categorization is accomplished qualitatively based on flow field images, but this approach is subjective, difficult to apply as flow fields become more complex (e.g. due to a larger number of degrees of freedom in the flow generating system), and cannot quantitatively assess how similar flows are to one another in the absence of measurements spanning the full parameter space of the flow generating system. Such incomplete coverage of the parameter space is common in animal studies as animals typically perform only a subset of all possible motions in a natural swimming or flying environment.

Salient features of a flow field are indicated by critical points of the fluid velocity field, u. With the location and character of critical points known, they can be used to reconstruct the basic structure of the flow field [13] and characterize flow separation processes that generate vortices [14]. Critical points are locations where |u| = 0 and det(∇u) ≠ 0, making them the only locations in the flow where streamlines can cross. Consequently, they can indicate such features as vortices and divisions between regions of the flow entrained by different vortices. The streamline patterns near the two types of critical points allowable in two-dimensional, incompressible fluid flows (centres and saddle points) are illustrated in figure 1a.
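The distinction between centres and saddle points can be illustrated with a short Python sketch. It assumes a two-dimensional, incompressible flow (so that the trace of ∇u vanishes) and takes the four entries of ∇u at the critical point as inputs; the function name, argument names and tolerance are hypothetical and not part of the original work.

```python
def classify_critical_point(dudx, dudy, dvdx, dvdy, tol=1e-12):
    """Classify a 2D critical point from the entries of grad(u).

    Assumes incompressible flow (dudx + dvdy = 0), so the eigenvalues
    satisfy lambda**2 = -det(grad u).
    """
    det = dudx * dvdy - dudy * dvdx
    if abs(det) <= tol:
        return "degenerate"      # det = 0: not an isolated critical point
    if det < 0:
        return "saddle"          # real eigenvalues of opposite sign
    # det > 0 with zero trace: purely imaginary eigenvalues (a centre).
    # The sign of the vorticity gives the sense of rotation.
    vorticity = dvdx - dudy
    return "CCW centre" if vorticity > 0 else "CW centre"


# Solid-body rotation u = -y, v = x is a counter-clockwise centre.
print(classify_critical_point(0.0, -1.0, 1.0, 0.0))
# Pure strain u = x, v = -y is a saddle point.
print(classify_critical_point(1.0, 0.0, 0.0, -1.0))
```

For flows that are not divergence-free, det(∇u) > 0 would also admit nodes and foci, which this sketch does not distinguish.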

Figure 1.


Critical points and flow patterns: (a) representative streamline patterns near two types of critical points: centres and saddle points. Centres can have a clockwise (CW) and counter-clockwise (CCW) sense, denoted by blue and red dots, respectively. (b) Flow fields (flow indicated by green streamlines) and critical point patterns for von Kármán and reverse von Kármán vortex streets (repeated vortex patterns) with saddle points denoted by asterisks (*) and a typical repeated unit of the flow pattern enclosed with a dashed box. Orange edge direction vectors, Qij, are illustrated for edges between points i and j in both flow fields. (Online version in colour.)

Using critical points allows the flow field characteristics to be reduced to a set of points and then a quantitative comparison between two flow fields A and B can be made using various point-set distance measures [15] that include distances determined by one-to-one, many-to-one and many-to-many mappings of points in field A to those in field B. Lavin et al. [16] were the first to introduce such a comparison of flow fields based on critical points. They computed the overall flow field similarity using the Earth Mover's Distance, which is a one-to-one comparison between critical points based on the mapping of critical points that minimizes the total computed distance between the paired critical points. In their implementation, Lavin et al. [16] used topological characteristics only (based on the eigenvalues of ∇u at the critical point locations) to determine the distance (dissimilarity) between paired points. They were able to characterize similarities and differences between several simple vector fields constructed to have different types of critical points. Later, Theisel & Weinkauf [17] refined the approach to comparing topological properties of critical points and Batra et al. [18] introduced consideration of streamline connections between critical points (separatrices) in a given flow field as an additional element to use in comparison of flow fields, with apparently improved ability to distinguish flow field differences. None of these approaches considered the geometrical location of the critical points in the analysis. Conversely, Depardon et al. [19] used the Euclidean distance between paired critical points in two fields to assess flow field similarity, which allowed characterization of the time evolution of flows downstream of the same physical system.

While successful in introducing the concept of quantitative flow field comparison, the approaches discussed above have several limitations. First, a comparison approach that considers only topological features (as in Lavin et al. [16]) can miss important flow characteristics. A simple example is von Kármán and reverse von Kármán vortex streets, illustrated in figure 1b. They are topologically equivalent (just repeated patterns of clockwise (CW) and counter-clockwise (CCW) vortices), and hence would not be distinguished if only topological properties of critical points are considered, but the von Kármán vortex street is characteristic of drag while a reverse von Kármán vortex street indicates thrust production. Second, considering critical point location alone (as in Depardon et al. [19]) does not account for simple transformations like rotation or scaling of the flow field so that flows which are topologically equivalent might appear different under this scheme. Finally, one-to-one mapping between critical points requires that flow fields with different numbers of critical points will appear different, even if both represent the same repeated flow pattern but one contains a larger portion of the pattern (more repeated units) than the other. Repeated patterns are common in flow fields associated with both drag and propulsion, and a typical repeated unit of the flow patterns shown in figure 1b is illustrated by a dashed box.

Given these shortcomings, it is worth considering general properties that would be desirable for a flow field comparison scheme. In addition to the basic mathematical properties desirable for a distance measure (i.e. positive definite, commutative and zero when computing the distance between a flow field and itself), the following properties are also desirable for comparing flow fields:

  • (i) Sensitivity to dominant topological structure. That is, the distance measure should contemplate differences in topological features (e.g. critical point type), but be insensitive to rotation, global scaling and (if possible) global stretching of the flow fields.

  • (ii) Sensitivity to structural geometric variations. Specifically, trading locations of critical points of different types (such as in transforming a von Kármán to a reverse von Kármán vortex street) should be recognized.

  • (iii) Robustness to minor variations in critical point location.

The objective of this paper is to introduce a flow field comparison approach using graph theory that incorporates the properties listed above and test it on representative 2D flow fields.

2. Graph theory for flow field pattern identification

A graph G is an ordered pair G = (V, E), where V is a set of points/vertices and E is a set of edges representing connections between two elements of V [20]. Some recent work has used graphs combined with tools from network theory to analyse complex fluid flows. Examples include point vortex interactions [21] and turbulent flows [22–24]. In these approaches, the graph spanned large numbers of points/elements in the flow and was constructed to represent relevant physics such as interactions between vortices or correlations between fluid motion at different points in the flow. Application of network theory then helped to identify key flow features such as vortex clusters or coherent spatial patterns.

While valuable, network theory approaches require a large graph to provide meaningful results (due to their statistical nature) and focus on analysing the features of a given flow field. By contrast, the approach developed here emphasizes using relatively few, but dominant, flow features to compare different flow fields and assess their overall similarity. To achieve this, V is selected as the critical points identified in a flow field, as they are the chosen features of interest which encode the relevant flow physics (although features identified by the network theory approach could also be used with appropriate modifications to the algorithm described in the following).

To capture the structure and arrangement of the critical points (vertices), the graph edges should be constructed based on proximity of vertices and concentrate on local, rather than global, structure. A family of graph structures with this property that is widely used in computer vision and pattern classification is called the proximity graph family [25]. It includes relative neighbourhood graphs, β skeletons and Gabriel graphs, all of which are related to the well-known Delaunay triangulation and minimum spanning trees.

(a). Weighted Gabriel graphs

Here, we focus on Gabriel graphs [26,27], for which (undirected) edges are constructed according to the following definition:

\[ \text{for } i, j \in V,\quad \{i, j\} \in E \iff d^2(i, j) < d^2(i, k) + d^2(k, j) \quad \forall k \in V \text{ such that } k \neq i, j, \tag{2.1} \]

where d(i, j) is the Euclidean distance between i and j. When V represents a set of points in Euclidean space (e.g. critical points in a flow field), (2.1) states that i and j are adjacent (i.e. {i, j}∈E) if, and only if, the closed disc having the line segment between i and j as a diameter contains no vertices, as in the left-most example in figure 3.

Figure 3.


Edge weights in a weighted Gabriel graph. (Online version in colour.)

The Gabriel graph gives a quick view of the overall structure of the vertices based on their proximity to one another (see figure 2 for an example Gabriel graph). Note that the graph only retains local structure, i.e. connects vertex pairs without vertices in between. A drawback of the construction of Gabriel graphs is that if the points are shifted slightly, as happens regularly in fluid flows, then the graph structure (i.e. the edges) can also change, sometimes significantly, even though the overall topology of the flow is essentially the same. Thus, comparisons of flow fields based purely on the Gabriel graph structure would not be robust. To address this shortcoming for the present application, we introduce the new concept of a weighted Gabriel graph defined as the set of edges E formed from all possible connections of elements of V (the complete graph), with elements of E having a ‘weight’ property Hij∈[0, 1] that measures the extent to which {i, j} violates (2.1). Edges in the (true/unweighted) Gabriel graph have weight zero while edges that maximally violate (2.1) have weight one. If there are vertices on the boundary of the disc defined by i and j, but no vertices inside the disc, then Hij = 0. If there are vertices inside the disc, then Hij is determined by the vertex that is closest to the line segment between i and j. As shown in figure 3, Hij = 1 if there is a vertex halfway between i and j, and 0 < Hij < 1 if the nearest vertex is elsewhere inside the disc. Formally, we define the violation of the Gabriel condition as

\[ H_{ij} \equiv 2\left(1 - \min\left(\min_{k}\left(\frac{d^2(i, k) + d^2(k, j)}{d^2(i, j)}\right),\, 1\right)\right), \tag{2.2} \]

for vertex pair i, j ∈ V. The weights Hij are likely to exhibit only small variations with minor changes in the locations of elements of V and therefore, comparison of graphs based on Hij should be robust (relating to Property (iii) above). Moreover, the Hij are normalized by the local geometric scale and are therefore independent of uniform scaling (relating to Property (i) above). The proposed definition of a weighted Gabriel graph extends the concept of a Gabriel graph in a way conceptually similar to k-Gabriel graphs [28,29], but the present definition is more versatile in that each edge has a new property Hij∈[0, 1] associated with it, allowing quantitative comparison of edge relationships between graphs.
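As a minimal illustration of (2.2), the following Python sketch computes Hij for every vertex pair of a small point set. It assumes distinct points given as (x, y) tuples, takes the inner minimum over all vertices k other than i and j, and assigns zero weight when no third vertex exists; the function names are hypothetical.

```python
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def gabriel_weights(points):
    """Return {(i, j): H_ij} for all i < j, following (2.2)."""
    weights = {}
    for i, j in combinations(range(len(points)), 2):
        d_ij = sq_dist(points[i], points[j])
        ratios = [
            (sq_dist(points[i], points[k]) + sq_dist(points[k], points[j])) / d_ij
            for k in range(len(points))
            if k not in (i, j)
        ]
        # With no third vertex there is nothing to violate (2.1).
        smallest = min(ratios, default=1.0)
        weights[(i, j)] = 2.0 * (1.0 - min(smallest, 1.0))
    return weights


# A vertex halfway between the end points gives the maximum weight of one.
print(gabriel_weights([(0.0, 0.0), (2.0, 0.0), (1.0, 0.0)]))
```

Pairs with zero weight have no vertex strictly inside the disc on their diameter. Multiplying all coordinates by a common factor leaves the ratios, and hence the weights, unchanged apart from floating-point rounding.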

Figure 2.


The Gabriel graph for the von Kármán vortex street shown in figure 1b. The flow streamlines are not shown to avoid clutter. The critical point types are represented using the same symbols as in figure 1. (Online version in colour.)

A weighted Gabriel graph does not explicitly consider streamline connections between saddle points and other critical points (separatrices), though such connections can be used to create an attributed, relational graph (ARG) as in Batra et al. [18]. While separatrices provide valuable information about the flow topology, a given separatrix is not guaranteed to connect with another critical point in the frame, so it is possible for a graph based on separatrices to be very sparse and hence, contain little information for comparison. Moreover, the computed location of separatrices is sensitive to noise and flow field distortions, so a comparison using separatrix connections to critical points is not likely to be robust.

(i). Edge direction

To characterize structural geometric variations such as reflections (Property (ii)), a direction vector is computed for each graph edge according to

\[ \mathbf{Q}_{ij} \equiv (S_j - S_i)(\mathbf{x}_j - \mathbf{x}_i), \tag{2.3} \]

where xi, xj are the position vectors of vertices i and j, and S = 1, 0, or −1 for CCW centres, saddle points, and CW centres, respectively. Figure 1b illustrates Qij for corresponding vertices in the von Kármán and reverse von Kármán flow fields, showing that Q changes direction for these edges, as desired, if the flow pattern is reflected about the free stream flow direction (horizontal for these flows). Note that edges connecting critical points of the same type have no direction as transposing the critical points would not change the topology. For purposes of comparing graphs, aligned edges (positive dot product of the respective edge vectors Q) are preferred.
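The behaviour of the direction vector under reflection can be checked with a small Python sketch of (2.3). The type labels, the dictionary SENSE and the sample coordinates are illustrative assumptions rather than values taken from the figures.

```python
# S = 1, 0, -1 for CCW centres, saddle points and CW centres, as in (2.3).
SENSE = {"CCW": 1, "saddle": 0, "CW": -1}


def edge_direction(x_i, type_i, x_j, type_j):
    """Direction vector Q_ij = (S_j - S_i)(x_j - x_i) of edge {i, j}."""
    s = SENSE[type_j] - SENSE[type_i]
    return (s * (x_j[0] - x_i[0]), s * (x_j[1] - x_i[1]))


def dot(a, b):
    """Dot product of two 2D vectors."""
    return a[0] * b[0] + a[1] * b[1]


# One CW/CCW vortex pair and its mirror image about the horizontal axis.
q_original = edge_direction((0, 1), "CW", (1, -1), "CCW")
q_reflected = edge_direction((0, -1), "CW", (1, 1), "CCW")
print(q_original, q_reflected, dot(q_original, q_reflected))
```

Because both factors change sign when i and j are exchanged, Q is well defined for an undirected edge, and it vanishes for edges joining critical points of the same type.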

3. Methods

(a). Flow field data

In order to accurately test classification of flow fields, it was important to have data for which the flow pattern was known (constructed). To this end, two-dimensional, incompressible flow fields were generated by placing potential vortices in a domain according to the patterns shown in figures 1b and 4. For each vortex arrangement, the associated velocity vector field was computed in the presence of a uniform upstream flow velocity selected to be somewhat smaller (25–50%) than the induced velocity between a representative pair of positive and negative vortices. The velocity field was computed on a uniform grid of resolution similar to that expected from experimental flow field measurements.
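A flow field of this kind can be sketched in Python by superposing ideal point vortices on a uniform stream. The source does not state how the vortex cores were treated, so the sketch below simply skips the singular point at each vortex; the function names, the grid layout and the sample values are assumptions for illustration only.

```python
import math


def velocity(x, y, vortices, u_inf=0.0):
    """Velocity (u, v) at (x, y) from point vortices plus a uniform stream.

    vortices: list of (x0, y0, gamma), with gamma > 0 for counter-clockwise
    circulation. u_inf is the uniform velocity in the x direction.
    """
    u, v = u_inf, 0.0
    for x0, y0, gamma in vortices:
        dx, dy = x - x0, y - y0
        r2 = dx * dx + dy * dy
        if r2 == 0.0:
            continue  # induced velocity is singular at the vortex itself
        u += -gamma * dy / (2.0 * math.pi * r2)
        v += gamma * dx / (2.0 * math.pi * r2)
    return u, v


def sample_grid(vortices, u_inf, xs, ys):
    """Evaluate the velocity on a uniform grid; returns rows of (u, v)."""
    return [[velocity(x, y, vortices, u_inf) for x in xs] for y in ys]


# One pair of opposite-signed vortices in a weak uniform stream.
pair = [(0.0, 0.5, -1.0), (0.0, -0.5, 1.0)]
xs = [-1.0 + 0.25 * i for i in range(9)]
ys = [-1.0 + 0.25 * j for j in range(9)]
field = sample_grid(pair, 0.1, xs, ys)
print(field[4][4])  # velocity midway between the two vortices
```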

Figure 4.


Flow patterns for squid fin swimming modes: (a) mode 1, (b) mode 2, (c) mode 3 and (d) mode 4. Two repeating units (M = 2) are shown in all cases. The second repeated unit is enclosed in a dashed box. The symbols follow those in figure 1. Also, for scale comparison purposes, D is the same as defined in figure 1b. (Online version in colour.)

While flow patterns in figure 1b represent simple flow patterns associated with objects producing drag or thrust, the patterns of vortices in figure 4 were constructed to be qualitatively similar to the wake signatures associated with the four swimming modes identified in squid fin propulsion by Stewart et al. [10]. The nominal magnitude of circulation of vortices was the same in all cases except for the vortices in the bottom of the flow pattern for squid fin mode 3, which were larger than the rest by a factor of 2. Two versions of the patterns in figure 1b were created, with one being a uniformly larger-scaled version of the other. In total, six different flow patterns were generated, but two of the patterns had two configurations of different scale.

To allow for testing classification of flow fields in the presence of disturbances, ten sets of a random number of instances (between 3 and 10 with a median value of 8) of each flow pattern shown in figure 1b (including both scaled versions) and figure 4 were generated with random shifts in the origin, random variations in the position (in x and y, separately) and/or strength of the vortices of magnitude up to a specified perturbation amplitude, and a random number (M, between 2 and 4) of repeated units for each instance. As the flow patterns consisted of groupings of positive and negative vortices, the perturbation amplitude for vortex position was specified as a fraction of the separation between representative pairs of vortices (e.g. R in figure 1b).

The critical points were determined from the computed velocity vector fields following the method of Depardon et al. [4]. This approach works equally well for viscous flows and experimental data, even though only inviscid potential flows are investigated here. For experimental data, pre-processing using an appropriate filtering approach may be required to remove noise and to eliminate spurious critical points that do not represent the dominant flow features (Depardon et al. [4]).

The first step of the critical point detection algorithm creates a histogram of the vector angle in a 5 × 5 grid point search window surrounding each point in the flow. Points where 75% or more of the histogram bins were non-empty were classified as possible critical points (histograms of eight bins over the range 0 − 2π were used), producing groups of points (neighbourhoods) in which critical point(s) existed. The neighbourhoods were further divided into subdomains containing only one critical point by repeated calculation of the Poincaré–Bendixson index over subdivisions of each neighbourhood. Subgrid location of the critical points in the identified subdomains was determined using area integrals specific to the critical point properties (see Depardon et al. [4]). Properties of the critical points were determined by computing the eigenvalues of ∇u at the critical point locations, but the only properties used in this investigation were the classification (saddle point or centre) and the sense of rotation (for centres).
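The first step of the detection procedure (the direction histogram) can be sketched in Python as follows. Only the candidate test for a single search window is shown; the subdivision by Poincaré–Bendixson index and the subgrid localization are omitted. The function name and the handling of zero vectors are assumptions.

```python
import math


def is_candidate(window_vectors, n_bins=8, threshold=0.75):
    """Return True if the vector directions in a search window occupy at
    least `threshold` of the `n_bins` histogram bins over [0, 2*pi).

    window_vectors: iterable of (u, v) pairs, e.g. the 25 vectors of a
    5 x 5 grid point window.
    """
    occupied = set()
    for u, v in window_vectors:
        if u == 0.0 and v == 0.0:
            continue  # direction is undefined for a zero vector
        angle = math.atan2(v, u) % (2.0 * math.pi)
        # Clamp guards against rounding that places an angle at exactly 2*pi.
        occupied.add(min(int(angle / (2.0 * math.pi) * n_bins), n_bins - 1))
    return len(occupied) >= threshold * n_bins


# A 5 x 5 window around a centre (u = -y, v = x) spans all directions,
# whereas a window in uniform flow occupies a single bin.
rotation = [(-float(y), float(x)) for x in range(-2, 3) for y in range(-2, 3)]
uniform = [(1.0, 0.0)] * 25
print(is_candidate(rotation), is_candidate(uniform))
```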

(b). Graph matching solution

Each instance of a flow pattern described above is a unique flow field that can be compared with others in order to classify all flow fields in the dataset into groups containing similar flow patterns. The required flow field data for the graph matching algorithm described in this subsection is the locations and types of the critical points. The algorithm was applied to all pairs of flow fields in a given dataset in order to populate a distance matrix for input to the clustering algorithm described in the next subsection.

For purposes of the graph matching algorithm, a matching is defined as a one-to-one pairing of a subset of the critical points in flow field A with a subset of the critical points in flow field B according to type (saddle, CW centre, CCW centre), in accordance with Property (i). The size of a matching is the number of points matched in each flow field, n = min(SD_A, SD_B) + min(CW_A, CW_B) + min(CCW_A, CCW_B), where SD_f, CW_f and CCW_f are the number of saddle, CW and CCW centre points in frame f, respectively. Pairing (matching) critical points implies a matching of edges in the weighted Gabriel graphs of the two flow fields. Specifically, if critical points i and j in A are matched to critical points k and ℓ in B, respectively, then edges {i, j} and {k, ℓ} are matched from the respective weighted Gabriel graphs of A and B. For example, figure 5 illustrates a matching of size 8 and shows four of the 28 pairs of matched edges, 〈1A, 1B〉, 〈2A, 2B〉, 〈3A, 3B〉 and 〈4A, 4B〉. For a pair of matched edges {i, j} and {k, ℓ}, a direction penalty is defined as zero if Qij · Qkℓ ≥ 0 and one otherwise, and a Gabriel penalty is defined as (Hij − Hkℓ)². The procedure to compute the distance between flow fields A and B has four main steps:

  • (1) Calculate n. Let m = n(n − 1)/2 denote the number of edges matched from each weighted Gabriel graph.

  • (2) Find a matching with size n that minimizes the total direction penalty for the matched edges, pd.

  • (3) Find a matching with size n and direction penalty pd that minimizes the total Gabriel penalty for the matched edges, pG.

  • (4) Compute the distance between the flow fields as the average Gabriel penalty, DAB = pG/m∈[0, 1].
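The four steps can be sketched for very small point sets with an exhaustive search in Python. This is not the binary integer programming formulation used in the paper: it enumerates every type-consistent matching of maximal size and selects the lexicographic minimum of (pd, pG), which is feasible only for a handful of critical points. The field representation, a list of (type, (x, y)) tuples, and all function names are hypothetical.

```python
from itertools import combinations, permutations, product

SENSE = {"CCW": 1, "saddle": 0, "CW": -1}


def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def weight(field, i, j):
    """H_ij from (2.2) for vertices i, j of one field."""
    d_ij = sq_dist(field[i][1], field[j][1])
    ratios = [(sq_dist(field[i][1], field[k][1])
               + sq_dist(field[k][1], field[j][1])) / d_ij
              for k in range(len(field)) if k not in (i, j)]
    return 2.0 * (1.0 - min(min(ratios, default=1.0), 1.0))


def direction(field, i, j):
    """Q_ij from (2.3)."""
    s = SENSE[field[j][0]] - SENSE[field[i][0]]
    (xi, yi), (xj, yj) = field[i][1], field[j][1]
    return (s * (xj - xi), s * (yj - yi))


def pairings(idx_a, idx_b):
    """All maximal one-to-one pairings (a, b) between two index lists."""
    if len(idx_a) <= len(idx_b):
        for perm in permutations(idx_b, len(idx_a)):
            yield list(zip(idx_a, perm))
    else:
        for perm in permutations(idx_a, len(idx_b)):
            yield list(zip(perm, idx_b))


def flow_distance(field_a, field_b):
    """Return (D_AB, n, p_d) by brute force over type-consistent matchings."""
    per_type = []
    for t in SENSE:
        idx_a = [i for i, p in enumerate(field_a) if p[0] == t]
        idx_b = [i for i, p in enumerate(field_b) if p[0] == t]
        per_type.append(list(pairings(idx_a, idx_b)))
    best = None
    for combo in product(*per_type):
        pairs = [pair for part in combo for pair in part]
        p_d, p_g = 0, 0.0
        for (i, k), (j, l) in combinations(pairs, 2):
            qa, qb = direction(field_a, i, j), direction(field_b, k, l)
            p_d += 0 if qa[0] * qb[0] + qa[1] * qb[1] >= 0 else 1
            p_g += (weight(field_a, i, j) - weight(field_b, k, l)) ** 2
        # Steps 2 and 3: minimize p_d first, then p_G among the ties.
        if best is None or (p_d, p_g) < best[:2]:
            best = (p_d, p_g, len(pairs))
    p_d, p_g, n = best
    m = n * (n - 1) // 2          # Step 1: number of matched edges
    return (p_g / m if m else 0.0), n, p_d   # Step 4: average Gabriel penalty


street = [("CW", (0.0, 1.0)), ("CCW", (1.0, -1.0)),
          ("CW", (2.0, 1.0)), ("CCW", (3.0, -1.0))]
mirrored = [(t, (x, -y)) for t, (x, y) in street]
print(flow_distance(street, street))
print(flow_distance(street, mirrored))
```

In this small example the mirrored pattern cannot be matched without incurring direction penalties, whereas the pattern matched against itself gives zero for both penalties.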

Figure 5.


Example matching with fields of two and three repeated units. Both frame A (M = 2) and frame B (M = 3) are reverse von Kármán vortex streets with 15% vortex position perturbation. Solid lines indicate matched critical points and dashed lines are graph edges. Only some of the graph edges are shown for clarity. (Online version in colour.)

The matching problems described above were formulated as binary integer programs (BIPs). For the BIPs, xij is defined as a binary variable indicating whether or not critical point i in flow field A is matched with critical point j in flow field B, and yijkℓ is a binary variable indicating whether or not edge {i, j} in the weighted Gabriel graph for flow field A is matched with edge {k, ℓ} in the weighted Gabriel graph for flow field B. Let P denote the set of pairs of critical points {i ∈ A, j ∈ B} such that i and j are the same type (saddle, CW centre, CCW centre), let M = {(i ∈ A, j ∈ A, k ∈ B, ℓ ∈ B) : {i, k} ∈ P and {j, ℓ} ∈ P} be the set of four-tuples corresponding to matchable edges, and let M0 ⊆ M and M1 ⊆ M be the sets of matchable edge pairs with direction penalties of zero and one, respectively. The BIP for Step 2 of the matching algorithm is

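\[ \min\; p_d = \sum_{(i,j,k,\ell) \in M_1} y_{ijk\ell}, \tag{3.1} \]
\[ \text{such that}\quad \sum_{j \in B:\,\{i,j\} \in P} x_{ij} \le 1 \quad \forall i \in A, \tag{3.2} \]
\[ \sum_{i \in A:\,\{i,j\} \in P} x_{ij} \le 1 \quad \forall j \in B, \tag{3.3} \]
\[ \sum_{\{i,j\} \in P} x_{ij} = n \tag{3.4} \]
\[ \text{and}\quad x_{ik} + x_{j\ell} \le 1 + y_{ijk\ell} \quad \forall (i,j,k,\ell) \in M. \tag{3.5} \]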

The objective function (3.1) minimizes the direction penalty, and constraint sets (3.2) and (3.3) ensure that each critical point in A is matched with at most one critical point in B, and vice versa. Constraint (3.4) fixes the size of the matching at the value of n found in Step 1. Constraint set (3.5) forces the indicator variable yijkℓ = 1 if critical points i and j in field A are matched to critical points k and ℓ in field B.

Step 3 of the algorithm minimizes the objective function

\[ p_G = \sum_{(i,j,k,\ell) \in M} (H_{ij} - H_{k\ell})^2\, y_{ijk\ell}, \tag{3.6} \]

subject to (3.2)–(3.5) and an additional constraint to fix the direction penalty to the value found in Step 2.

An ideal matching would simultaneously minimize pd and pG; however, a matching algorithm must in general make a trade-off between the two objectives. Step 2 of the matching algorithm gives priority to pd so that DAB is determined by restricting the solution space to matchings with the minimum possible direction penalty. Preliminary experiments indicated that this approach not only yields more accurate clusters but also reduces the time to compute DAB.

(c). Clustering

Distance information can be used to classify objects when class labels for a reasonably large training set are available (supervised learning) or cluster objects when the desired groups are not a priori known, but learned from the data (unsupervised learning). Here, we focus on clustering which is often used in exploratory data analysis with the goal to group objects based on available data, such that objects in the same cluster (group) are similar to each other, while objects from different groups are dissimilar. We apply this approach to group flow fields using a hierarchical clustering algorithm as follows.

From the pairwise distances between a set of flow fields obtained via graph matching, a distance matrix, DM = [DAB], is assembled. The clustering algorithm starts with each flow field forming its own group and then recursively joins the most similar (least distance) groups of flow fields till only a single group containing all flow fields is left. The distance between two groups of flow fields is measured using complete-linkage, i.e. the largest distance between a flow field from each group. This approach usually leads to stable, balanced clusters. The result of the hierarchical clustering process is a dendrogram, a tree representation of the clustering hierarchy. The tree structure can be inspected to evaluate the similarity between groups of flow fields. Groupings can also be extracted by cutting the tree structure at an appropriate height and looking at the connected subtrees below the cut.
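The agglomerative procedure with complete linkage can be sketched in plain Python. The sketch stops merging once a requested number of groups remains, which corresponds to cutting the dendrogram at that level; it does not record merge heights and is not the implementation used to produce the results reported here. The function name and the sample distance matrix are hypothetical.

```python
def complete_linkage(dist, n_clusters):
    """Agglomerative clustering of items 0..N-1 with complete linkage.

    dist: symmetric N x N matrix (list of lists) of pairwise distances.
    Returns the clusters (sorted lists of item indices) that remain when
    merging stops at n_clusters groups.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: largest distance between any two members.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Four flow fields forming two well-separated groups.
DM = [
    [0.00, 0.01, 0.50, 0.50],
    [0.01, 0.00, 0.50, 0.50],
    [0.50, 0.50, 0.00, 0.02],
    [0.50, 0.50, 0.02, 0.00],
]
print(complete_linkage(DM, 2))
```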

For the experiments conducted in this paper, the number of distinct flow patterns and the original grouping of the flow fields (based on the generating pattern) is known. We, therefore, can evaluate the performance of our approach by comparing the known original grouping of the flow fields with the computed clustering using the known number of clusters (i.e. six, the number of distinct flow patterns). Each grouping represents a partition of the set of flow fields. The agreement of two partitions can be measured using the Rand index [30] which is the fraction of pairs of flow fields for which the two partitions agree (i.e. a pair is either in the same group or different groups in both partitions). Since two partitions can agree by chance, correspondence is typically measured using the adjusted Rand index [30,31], obtained from the Rand index by subtracting the expected agreement of random partitions.
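Both agreement measures can be computed directly from two label vectors, as in the Python sketch below. The adjusted index is written in the commonly used Hubert–Arabie form, in which the chance-corrected agreement is also divided by its maximum attainable value; the function names and sample labels are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def adjusted_rand_index(labels_a, labels_b):
    """Rand index corrected for chance agreement (Hubert-Arabie form)."""
    n = len(labels_a)
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)


truth = [0, 0, 1, 1]
relabelled = [1, 1, 0, 0]   # same grouping, different label names
crossed = [0, 1, 0, 1]      # grouping that cuts across the true one
print(rand_index(truth, relabelled), adjusted_rand_index(truth, relabelled))
print(rand_index(truth, crossed), adjusted_rand_index(truth, crossed))
```

Label names do not matter, only the grouping: the relabelled partition scores 1 on both measures, while the crossed partition receives a negative adjusted index because it agrees with the reference less often than expected by chance.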

4. Results

To test the implementation of weighted Gabriel graphs with edge direction vectors for comparing and distinguishing flow fields, 10 sets of multiple realizations of the flow fields described in the Methods section were created. The perturbation amplitude was specified as the maximum allowable amplitude in terms of a percentage of the nominal circulation or nominal separation between vortices, as appropriate. Only results with two or more repeated units in the flow patterns are considered here because a flow pattern with only one repeated unit was too generic (typically only four critical points) to form a pattern distinctive enough to be easily distinguished from other cases. The graph matching and clustering algorithms were applied to each of the 10 datasets to provide enough results to evaluate the overall accuracy and reliability of the approach.

(a). Graph matching algorithm

Examples of graph matching are illustrated in figures 6 and 7. The coloured lines indicate critical points that are matched in the graph matching process. The graph edges are not shown because the weighted Gabriel graph includes all possible connections between vertices, which is difficult to interpret visually. As noted in the Methods section, matching critical points implies a matching of edges in the weighted Gabriel graphs of the two flow fields.

Figure 6.


Graph matching for two flow fields of the same type (reverse von Kármán vortex streets): (a) 0% perturbation amplitude with M = 2 in frame A and M = 3 in frame B and (b) 15% vortex position perturbation amplitude with M = 2 in frame A and M = 3 in frame B. (Online version in colour.)

Figure 7.


Graph matching for two flow fields of different type with 15% vortex position perturbation and M = 2: (a) reverse von Kármán (frame A) and Squid Fin 2 (frame B) and (b) reverse von Kármán (frame A) and Squid Fin 4 (frame B). (Online version in colour.)

Matching for flow fields of the same type is illustrated in figure 6 for 0% perturbation (figure 6a) and 15% vortex position perturbation (figure 6b). The results of the matching follow the expected trend that similar features are matched together in the order they appear. This is true even if different numbers of repeating units (M) in the periodic pattern are present. Note in particular that both matches shown in figure 6 show the matched elements in the flow field with the greater number of repeated units (frame B) grouped together (i.e. matched features are not separated on opposite ends of the flow field). This result is a natural outcome of finding the best match of the weighted Gabriel graphs and is not an independent constraint imposed by the matching solution. It also appears to be a robust result as figure 6b includes a 15% amplitude perturbation on vortex position. A solution keeping matched features adjacent in frame B that are also adjacent in frame A is desirable since physically one would expect matched features to have a similar relationship to one another if the two flow fields are similar. As the matching algorithm identifies similar features for the flow fields in these figures, the computed dissimilarity is small, giving DAB = 0.0145 and 0.0108 for the matchings shown in figure 6a and b, respectively.

That the matched pattern in frame B is on the left in figure 6a and on the right in figure 6b is not a concern. The algorithm matches to a repeated pattern in a subset of frame B since it contains more repeated units. The ends of the pattern tend to match best because of end effects in the patterns generated, but which end is matched is random for patterns with no perturbation. As the patterns in figure 6b are perturbed, the arrangement on the right in frame B turns out to be the best match based on the specific arrangements for the cases considered.

Matching for flow fields of different type is illustrated in figure 7. For simplicity, all flow fields in figure 7 have the same number of repeated units (M = 2) and a vortex position perturbation amplitude of 15%, but results for other cases are similar. Figure 7 qualitatively illustrates that flow fields of different type do not match well, as is apparent from the crossing of lines for matched critical points in frames A and B (e.g. the red and blue lines in figure 7a). Naturally, the computed dissimilarity for these cases is much higher as well, giving DAB = 0.168 and 0.104 for the matching shown in figure 7a,b, respectively. Some of the dissimilarity (in terms of the graph edge weights Hij) might be reduced by relaxing the constraint that only critical points of the same type can be matched (which would uncross the red and blue lines in figure 7a), but this would be inconsistent with the flow topologies.

(b). Clustering algorithm

Application of the graph matching algorithm to each of the 10 sets of flow fields results in a matrix of pairwise distances, DM = [DAB], for all pairs of frames in the set. Using this information, clustering should be able to group flow fields from the same flow pattern together and produce six groups corresponding to the six distinct repeating flow patterns used to generate the data. Clustering was performed using R [32, v.3.2.3]. The function hclust in the package stats with complete-linkage was used.

Results of the hierarchical clustering are illustrated graphically using dendrograms, which indicate the breakdown (hierarchy) of relationships between groupings of the items. Dendrograms for low position perturbation magnitude (resulting in a clustering with an adjusted Rand index of 1) and high position perturbation magnitude (producing a reduced adjusted Rand index) are shown in figure 8a,b, respectively. Since complete-linkage is used, the height of each branch measures the maximum dissimilarity (in terms of DAB) between all flow fields represented by the branch. Red boxes are drawn around the branches of the dendrograms that identify groups or clusters of similar flow fields. The groups are identified by cutting the dendrogram tree at an appropriate dissimilarity level to produce six groups, equal to the known number of flow patterns considered.
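The clustering itself was carried out with hclust in R. For readers who want to see the mechanics, the following is a minimal pure-Python sketch of complete-linkage agglomeration that stops once a requested number of groups remains. The function name `complete_linkage` and the toy distance matrix are illustrative assumptions, and tie-breaking may differ from the R implementation.

```python
def complete_linkage(dist, k):
    """Agglomerative clustering with complete linkage, stopped at k clusters.

    dist: symmetric matrix (list of lists) of pairwise dissimilarities.
    Returns (clusters, heights): clusters is a list of sorted index lists,
    heights holds the dissimilarity level of each merge performed.
    """
    if not 1 <= k <= len(dist):
        raise ValueError("k must be between 1 and the number of items")
    clusters = [[i] for i in range(len(dist))]
    heights = []
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: distance between two clusters is the
                # largest dissimilarity between any pair of their members.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        heights.append(d)  # branch height in the dendrogram
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters, heights


if __name__ == "__main__":
    # Toy dissimilarity matrix: items 0,1 are alike, items 2,3 are alike.
    D = [
        [0.00, 0.01, 0.15, 0.17],
        [0.01, 0.00, 0.16, 0.14],
        [0.15, 0.16, 0.00, 0.02],
        [0.17, 0.14, 0.02, 0.00],
    ]
    groups, merge_heights = complete_linkage(D, k=2)
    print(groups, merge_heights)
```

Because each merge height is the maximum pairwise dissimilarity within the merged branch, stopping at k clusters corresponds to cutting the dendrogram just below the next merge height.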

Figure 8.

Dendrograms for hierarchical clustering of flow fields using the distances produced by the matching algorithm. Red boxes indicate the grouping found by the clustering. Results are for position perturbations: (a) 5% perturbation amplitude (adjusted Rand index = 1) and (b) 20% perturbation amplitude (adjusted Rand index = 0.84). (Online version in colour.)

The flow field designations found at the bottom of each tree branch indicate the pattern type (Squid Fin 1, von Kármán 2, etc.) and the number of repeated units contained in the pattern (indicated by the number following M). The last number in the flow field designation is a unique index for identifying different cases of the same type in a set. For clarity, the pattern type for each group is indicated in larger font below each box. Note that both sets of von Kármán and reverse von Kármán vortex streets (denoted ‘1’ and ‘2’) are listed together below the boxes since they have the same pattern, but different scale, so they should be grouped together for correct results.

For small perturbation amplitude (figure 8a), all patterns of a given type are grouped together, resulting in an adjusted Rand index of 1. This includes grouping both sets of von Kármán and reverse von Kármán vortex streets together, indicating scale independence of the results, and grouping together all values of M for a given flow pattern, indicating that the results are independent of the number of periods of the flow pattern.

For large perturbation amplitude (figure 8b), the range of DAB has increased, as expected given the greater degree of variability in the flow field patterns. Due to the greater variability, not all of the flow fields appear on the proper branches, resulting in reduced accuracy of the classification (adjusted Rand index = 0.84). In particular, some of the von Kármán cases are grouped with the Squid Fin 1 cases. As the vortex patterns for these cases are similar, this misclassification under distortion of the vortex positions is not surprising. The similarity of the von Kármán and Squid Fin 1 flow fields is illustrated qualitatively in figures 1 and 5, and also in the dendrogram in figure 8a by the fact that they are joined into a single branch at a relatively low dissimilarity level DAB. This illustrates an additional capability of the presented quantitative classification approach, namely, the ability not only to classify flow fields, but also to determine relative relationships/similarity between different flow field classes.

To evaluate if our approach recovers the six repeating flow patterns, the agreement of the clustering and the known correct grouping was measured using the adjusted Rand index. A value close to one indicates good agreement and a value equal to one indicates perfect agreement. Figure 9 shows the resulting adjusted Rand index for three experiments involving perturbation to the vortex circulations, positions or both, arranged according to perturbation amplitude. The bars in figure 9 indicate the median for the 10 datasets used at each perturbation amplitude, and the error bars demarcate the upper and lower quartiles.
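For reference, the adjusted Rand index can be computed from the contingency table of the two partitions. The sketch below follows the standard pair-counting formula; the function name `adjusted_rand_index` and the example labels are illustrative assumptions and are not taken from the analysis code used here.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same items.

    Uses the pair-counting form: (index - expected) / (max - expected),
    where the counts come from the contingency table of the two labelings.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        raise ValueError("need at least two items")
    # Pairs of items placed together in both partitions.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    # Pairs placed together in each partition separately.
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        # Degenerate case (e.g. both partitions put everything in one group).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    known = [0, 0, 1, 1, 2, 2]
    found = [5, 5, 3, 3, 9, 9]  # same grouping, different label names
    print(adjusted_rand_index(known, found))
```

The index depends only on which items are grouped together, not on the label names, so a clustering that reproduces the known grouping scores one regardless of how the clusters are numbered.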

Figure 9.

Resulting adjusted Rand index for three experiments involving perturbation (%) in (a) vortex position, (b) vortex circulation and (c) both circulation and position. (Online version in colour.)

5. Discussion

The adjusted Rand index in figure 9 illustrates that the matching algorithm is able to identify relationships between the flow patterns in the flow fields. The agreement between the algorithm-identified groupings of flow fields and the known relationships is perfect for small perturbations, and still quite high (greater than 0.75) for all perturbation levels considered. Even at zero perturbation amplitude, an adjusted Rand index of one is significant in that it indicates the algorithm is insensitive to uniform scale changes (both scales of the von Kármán and reverse von Kármán cases are properly classified) and to the number of repeated units in a repeating pattern (as illustrated in figures 5 and 6). The decreasing adjusted Rand index for increased perturbation amplitude is expected because as the perturbation magnitude increases, the flow patterns become distorted, allowing for random similarities between different patterns to appear and for differences between similar patterns to increase. At a high enough perturbation level, some flow fields in the same nominal group become distorted enough that they no longer look the same and should not be grouped together.

Perturbation in position only (figure 9a) has a relatively weak impact on the accuracy of the matching algorithm. Conversely, perturbation to the vortex circulation has a stronger impact on accuracy (figure 9b). Because perturbations to circulation only impact the locations of the saddle points, this suggests that distortions to only one aspect of the flow can significantly influence the matching algorithm. Somewhat counter-intuitively, the results for flow patterns with perturbations to both vortex position and circulation have similar accuracy (when accounting for the error bars) to those with perturbation in position only and in circulation only. This is likely because random perturbations to position and circulation can tend to cancel each other, so that their combined effect is not as strong as may be expected.

Additional results (not shown) indicate that the matching algorithm is relatively insensitive to shifting the phase of the repeated pattern in one of the two compared frames. For example, removing the two left-most critical points (half of a repeated unit for most cases) from frame A in the data with 15% vortex position perturbation gives a median adjusted Rand index of 0.91 with values of 0.98 and 0.83 at the 75th and 25th percentiles, respectively. These results are only slightly degraded from the original data (figure 9a).

An important consideration is that both saddle points and centres are used in the flow field matching process. The positions of centres are relatively robust, but saddle points are strongly influenced by the frame of reference considered, which may influence the weights Hij or even the number of critical points in the vector field. For purposes of comparison, this sensitivity can be remedied by always comparing cases in the same reference frame (e.g. the reference frame moving with the animal for comparing flow fields generated by swimming animals). Alternatively, the analysis can be restricted to centres. This tends to reduce the accuracy of the method because it reduces the amount of information available for comparison in each flow field. For larger flow fields or flow fields with many vortices, it is expected that the impact of restricting to centres only would be smaller.

Over 480 000 frame distance calculations (DAB) were required to obtain the results summarized in figure 9. The graph matching BIPs were written in AMPL [33] 10.0 and solved with CPLEX [34] 12.6.0.0 on Dell R720 computers with Dual Intel Xeon processors with up to 12 cores running at up to 3.5 GHz. Steps 2 and 3 of the matching algorithm are generalizations of the graph isomorphism problem, the computational complexity of which is an open question [35]. The CPU times for finding the matching with directional penalty (pd in step 2) ranged from 0.001 s to 2.598 min. Finding optimal matchings with the Gabriel penalty (pG in step 3) took the bulk of the total time to calculate the frame distances; individual calculations ranged from 0.001 s to 9.537 h with median and mean times of 1.68 and 6.025 s, respectively. The focus of the present work is on developing a robust method. Future work will consider improving computational efficiency.

An alternative method for pairing of critical points in two different flow velocity fields is called feature flow fields (FFF), and was originally introduced by Theisel & Seidel [36] for tracking critical points in time evolving fluid flows. The method involves constructing an interpolation vector field (the FFF) whose streamlines track the expected motion of the critical points in the two flow fields to be compared, allowing corresponding critical points to be identified along a common streamline in the FFF. The method tends to be more computationally efficient than a correspondence analysis, but it is susceptible to (correctable) integration errors [37] and it may completely fail if the flow fields are too far offset from one another. The latter consideration makes FFFs problematic for general comparison of flow fields generated by different systems or that are distorted by flow variations. For this reason, the matching algorithm presented here involves a direct search for the best correspondence between flow features.
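To make the FFF idea concrete, the sketch below uses the form of the feature flow field commonly given for a two-dimensional field v(x, y, t), namely f = (det(v_y, v_t), det(v_t, v_x), det(v_x, v_y)). That formula is not stated in the present text and is an assumption drawn from the FFF literature [36]. The linear blending of two linear fields, and all function names, are likewise hypothetical choices made for illustration.

```python
def det2(a, b):
    """Determinant of the 2x2 matrix with columns a and b."""
    return a[0] * b[1] - a[1] * b[0]


def feature_flow_field(v_x, v_y, v_t):
    """Assumed FFF form: f = (det(v_y, v_t), det(v_t, v_x), det(v_x, v_y)).

    v_x, v_y, v_t are the partial derivatives of the 2D velocity field with
    respect to x, y and the interpolation parameter t at one point.
    """
    return (det2(v_y, v_t), det2(v_t, v_x), det2(v_x, v_y))


def fff_linear_blend(A, c1, c2):
    """FFF of v(x, t) = A (x - c1 - t (c2 - c1)), t in [0, 1].

    This is the linear blend of two linear fields A (x - c1) and A (x - c2),
    whose single critical point moves from c1 (t = 0) to c2 (t = 1).
    """
    d = (c2[0] - c1[0], c2[1] - c1[1])
    v_x = (A[0][0], A[1][0])  # first column of A
    v_y = (A[0][1], A[1][1])  # second column of A
    v_t = (-(A[0][0] * d[0] + A[0][1] * d[1]),
           -(A[1][0] * d[0] + A[1][1] * d[1]))  # equals -A d
    return feature_flow_field(v_x, v_y, v_t)


if __name__ == "__main__":
    centre = [[0.0, -1.0], [1.0, 0.0]]  # rotation-like Jacobian (a centre)
    f = fff_linear_blend(centre, (0.0, 0.0), (0.3, 0.1))
    # Slope of the FFF streamline in (x, y) per unit t.
    print(f[0] / f[2], f[1] / f[2])
```

In this idealized case the FFF direction is proportional to (c2 - c1, 1), so a streamline started at the critical point of the first field at t = 0 arrives at the critical point of the second field at t = 1. For fields that are far offset or nonlinear, the streamline need not connect corresponding critical points, which is the failure mode noted above.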

Because the matching algorithm can consider any possible correspondence between edges in the weighted Gabriel graphs describing two flow patterns, it is guaranteed to find the correct distance measure DAB, regardless of the relative similarity between the flow patterns. This allows the method to be applied for comparing flows that are generated by very different means, such as by two different species of swimming animals or by a mechanical and corresponding biological system. In principle, it would also allow patterns in fluid flows to be compared with other patterns found in nature or in man-made systems, making it a powerful tool for classifying patterns and revealing similarities across systems. Moreover, the method is readily generalizable to three-dimensional flows, with the difference that CW and CCW distinctions are less relevant in three dimensions because they depend on the perspective of the observer. As currently implemented, the matching algorithm is insensitive to uniform scaling differences, but may be sensitive to non-uniform scaling (stretching), if it is severe enough. More general pattern classification may require this capability. Future work will investigate utilization of additional geometric information available in the critical point pattern to account for local stretching of the pattern.

6. Conclusion

A robust, graph-based approach for encoding patterns in vortical fluid flows has been developed by introducing a novel generalization of the Gabriel graph that assigns weights Hij∈[0, 1] to each edge in the complete graph, allowing minor variations in the graph structure to appear as variations in the weights. To emphasize essential features of the flow, graphs were constructed using critical points of the fluid velocity field as graph vertices. Comparisons between flow patterns were made by determining the dissimilarity in the associated graphs (computed by minimizing differences in the matched edge weights) with constraints on matching edge vertices that ensured topological similarity and minimized undesirable geometric variations (e.g. reflections). Clustering of several periodic flow fields based on the dissimilarity computed from the graph matching algorithm showed highly accurate classification of flows, even with large perturbations to the flow structure, and insensitivity of the method to inconsequential structural changes such as uniform change in physical scale or number of periods present in the periodic pattern. The generality of the graph matching method allows it to be applied to flows from different sources without an obvious correspondence between features, such as the wakes of different species of swimming animals.

Data accessibility

The raw data (velocity vector fields and critical point data) used in this article are publicly available on SMU's digital repository (scholar.smu.edu/engineering_mechanical_research/5/).

Authors' contributions

P.S.K., M.H. and E.V.O. developed the algorithm, and contributed to writing the paper and analysing the data. S.H.W. assisted with developing code to analyse the data and with writing the paper. M.Z. assisted with analysing the results and writing the paper.

Competing interests

We declare we have no competing interests.

Funding

This material is based upon work supported by the National Science Foundation under grant nos 1115139 and 1557698. Support of the Lyle School of Engineering is also gratefully acknowledged.

References

  • 1. Williamson CHK, Roshko A. 1988. Vortex formation in the wake of an oscillating cylinder. J. Fluids Struct. 2, 355–381. (doi:10.1016/S0889-9746(88)90058-8)
  • 2. Becker S, Lienhart H, Durst F. 2002. Flow around three-dimensional obstacles in boundary layers. J. Wind Eng. Ind. Aerodyn. 90, 265–279. (doi:10.1016/S0167-6105(01)00209-4)
  • 3. Gostelow JP, Platzer MF, Carscallen WE. 2005. On vortex formation in the wake flows of transonic turbine blades and oscillating airfoils. J. Turbomach. 128, 528–535. (doi:10.1115/1.2184354)
  • 4. Depardon S, Lasserre JJ, Brizzi LE, Borée J. 2006. Instantaneous skin-friction pattern analysis using automated critical point detection on near-wall PIV data. Meas. Sci. Technol. 17, 1659–1669. (doi:10.1088/0957-0233/17/7/004)
  • 5. Ellington CP, van den Berg C, Willmott AP, Thomas ALR. 1996. Leading-edge vortices in insect flight. Nature 384, 626–630. (doi:10.1038/384626a0)
  • 6. Lauder GV, Drucker EG. 2002. Forces, fishes, and fluids: hydrodynamic mechanisms of aquatic locomotion. News Physiol. Sci. 17, 235–240. (doi:10.1152/nips.01398.2002)
  • 7. Spedding GR, Rosén M, Hedenström A. 2003. A family of vortex wakes generated by a thrush nightingale in free flight in a wind tunnel over its entire natural range of flight speeds. J. Exp. Biol. 206, 2313–2344. (doi:10.1242/jeb.00423)
  • 8. Tytell ED, Lauder GV. 2004. The hydrodynamics of eel swimming, I. Wake structure. J. Exp. Biol. 207, 1825–1841. (doi:10.1242/jeb.00968)
  • 9. Bartol IK, Krueger PS, Stewart WJ, Thompson JT. 2009. Hydrodynamics of pulsed jetting in juvenile and adult brief squid Lolliguncula brevis: evidence of multiple jet ‘modes’ and their implications for propulsive efficiency. J. Exp. Biol. 212, 1889–1903. (doi:10.1242/jeb.027771)
  • 10. Stewart WJ, Bartol IK, Krueger PS. 2010. Hydrodynamic fin function of brief squid, Lolliguncula brevis. J. Exp. Biol. 213, 2009–2024. (doi:10.1242/jeb.039057)
  • 11. Flammang BE, Lauder GV, Troolin DR, Tyson S. 2011. Volumetric imaging of shark tail hydrodynamics reveals a three-dimensional dual-ring vortex wake structure. Proc. R. Soc. B 278, 3670–3678. (doi:10.1098/rspb.2011.0489)
  • 12. Bartol IK, Krueger PS, Jastrebsky RA, Thompson JT. 2016. Volumetric flow imaging reveals the importance of vortex ring formation in squid swimming tail-first and arms-first. J. Exp. Biol. 219, 392–403. (doi:10.1242/jeb.129254)
  • 13. Perry AE, Chong MS. 1994. Topology of flow patterns in vortex motions and turbulence. Appl. Sci. Res. 53, 357–374. (doi:10.1007/BF00849110)
  • 14. Tobak M, Peak DJ. 1982. Topology of three-dimensional separated flows. Annu. Rev. Fluid Mech. 14, 61–85. (doi:10.1146/annurev.fl.14.010182.000425)
  • 15. Eiter T, Mannila H. 1997. Distance measures for point sets and their computation. Acta Informatica 34, 109–133. (doi:10.1007/s002360050075)
  • 16. Lavin Y, Batra R, Hesselink L. 1998. Feature comparisons of vector fields using Earth mover's distance. In Proc. of IEEE Visualizations '98, Research Triangle Park, NC, 18–23 October, pp. 103–109. IEEE.
  • 17. Theisel H, Weinkauf T. 2002. Vector field metrics based on distance measures of first order critical points. J. WSCG 10, 121–128.
  • 18. Batra R, Kling K, Hesselink L. 2000. Topology based methods for quantitative comparisons of vector fields. Proc. SPIE, Vis. Data Explor. Anal. VII 3960, 268–279. (doi:10.1117/12.378904)
  • 19. Depardon S, Lasserre JJ, Brizzi LE, Borée J. 2007. Automated topology classification method for instantaneous velocity fields. Exp. Fluids 42, 697–710. (doi:10.1007/s00348-007-0277-3)
  • 20. Ahuja RK, Magnanti TL, Orlin JB. 1993. Network flows: theory, algorithms, and applications. Upper Saddle River, NJ: Prentice-Hall, Inc.
  • 21. Nair AG, Taira K. 2015. Network-theoretic approach to sparsified discrete vortex dynamics. J. Fluid Mech. 768, 549–571. (doi:10.1017/jfm.2015.97)
  • 22. Taira K, Nair AG, Brunton SL. 2016. Network structure of two-dimensional decaying isotropic turbulence. J. Fluid Mech. 795, R2. (doi:10.1017/jfm.2016.235)
  • 23. Scarsoglio S, Iacobello G, Ridolfi L. 2016. Complex networks unveiling spatial patterns in turbulence. Int. J. Bifurcation Chaos 26, 1650223. (doi:10.1142/S0218127416502230)
  • 24. Iacobello G, Scarsoglio S, Kuerten JGM, Ridolfi L. 2018. Spatial characterization of turbulent channel flow via complex networks. Phys. Rev. E 98, 013107. (doi:10.1103/PhysRevE.98.013107)
  • 25. Jaromczyk JW, Toussaint GT. 1992. Relative neighborhood graphs and their relatives. Proc. IEEE 80, 1502–1517. (doi:10.1109/5.163414)
  • 26. Gabriel KR, Sokal RR. 1969. A new statistical approach to geographic variation analysis. Syst. Zool. 18, 259–278. (doi:10.2307/2412323)
  • 27. Matula DW, Sokal RR. 1980. Properties of Gabriel graphs relevant to geographic variation research and the clustering of points in the plane. Geograph. Anal. 12, 205–222. (doi:10.1111/gean.1980.12.issue-3)
  • 28. Su TH, Chang RC. 1990. The k-Gabriel graphs and their applications. In Algorithms (eds T Asano, T Ibaraki, H Imai, T Nishizeki). Lecture Notes in Computer Science, vol. 450. Berlin, Germany: Springer.
  • 29. Bose P, Collette S, Hurtado F, Korman M, Langerman S, Sacristán V, Saumell M. 2013. Some properties of k-Delaunay and k-Gabriel graphs. Comput. Geom. Theory Appl. 46, 131–139. (doi:10.1016/j.comgeo.2012.04.006)
  • 30. Rand WM. 1971. Objective criteria for the evaluation of clustering methods. J. Am. Stat. Assoc. 66, 846–850. (doi:10.1080/01621459.1971.10482356)
  • 31. Hubert L, Arabie P. 1985. Comparing partitions. J. Classif. 2, 193–218. (doi:10.1007/BF01908075)
  • 32. R Core Team. 2018. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
  • 33. Fourer R, Gay DM, Kernighan B. 2002. AMPL: A mathematical programming language, 2nd edn. Independence, KY: Cengage Learning.
  • 34. IBM ILOG. 2017. CPLEX. See https://www.ibm.com/analytics/cplex-optimizer.
  • 35. Garey MR, Johnson DS. 1979. Computers and intractability: a guide to the theory of NP-completeness. New York, NY: W. H. Freeman & Co.
  • 36. Theisel H, Seidel HP. 2003. Feature flow fields. In Proc. Symp. on Data Visualization 2003 (VisSym '03), Grenoble, France, 26–28 May, pp. 141–148. Aire-la-Ville, Switzerland: Eurographics Association.
  • 37. Weinkauf T, Theisel H, Van Gelder A, Pang A. 2011. Stable feature flow fields. IEEE Trans. Vis. Comput. Graph. 17, 770–780. (doi:10.1109/TVCG.2010.93)


Articles from Proceedings. Mathematical, Physical, and Engineering Sciences are provided here courtesy of The Royal Society
