StripePy: fast and robust characterization of architectural stripes

Andrea Raffo; Roberto Rossini; Jonas Paulsen

doi:10.1093/bioinformatics/btaf351

Bioinformatics. 2025 Jun 13;41(6):btaf351. doi: 10.1093/bioinformatics/btaf351


Editor: Peter Robinson

PMCID: PMC12215313 PMID: 40511982

Abstract

Motivation

Architectural stripes in Hi-C and related data are crucial for gene regulation, development, and DNA repair. Despite their importance, few tools exist for automatic stripe detection.

Results

We introduce StripePy, which leverages computational geometry methods to identify and analyze architectural stripes in contact maps from Chromosome Conformation Capture experiments like Hi-C and Micro-C. StripePy outperforms existing tools, as shown through tests on various datasets and a newly developed simulated benchmark, StripeBench, providing a valuable resource for the community.

Availability and implementation

StripePy is released to the public as an open-source, MIT-licensed Python application. StripePy source code is hosted on GitHub at https://github.com/paulsengroup/StripePy and is archived on Zenodo. StripePy can be easily installed from source or PyPI using pip and from Bioconda using conda. Containerized versions of StripePy are regularly published on DockerHub.

1 Introduction

Eukaryotic genomes are hierarchically folded inside the nucleus into a 3D conformation crucial for gene regulation, cell division, and DNA repair (Van Bortle and Corces 2012, Dekker and Misteli 2015, Bonev and Cavalli 2016, Zheng and Xie 2019, Bouwman et al. 2022). At the scale of the nucleus, individual chromosomes form distinct territories (Rabl 1885, Cremer and Cremer 2010), further organized into A (euchromatin) and B (heterochromatin) compartments (Lieberman-Aiden et al. 2009). At the lower levels of the hierarchy, topologically associated domains (TADs) arise from enriched spatial contacts within domains, mediated by CTCF binding at their boundaries (Dixon et al. 2012, Nora et al. 2012).

Chromosome conformation capture sequencing methods like Hi-C and Micro-C have been instrumental in revealing these structural hierarchies across the genome. However, deciphering the resulting patterns would be impossible without computational tools designed to detect and analyze them. The significance of these computational methods is underscored by the numerous tools and techniques developed and made available to the research community for analyzing the genome structure at all levels of the genome hierarchy (Raffo and Paulsen, 2023) and for generating 3D reconstructions (see, e.g. Cifuentes et al. 2024, Chen et al. 2025).

In contrast to other genome architectural features, few algorithms are dedicated to the automatic recognition of “stripes,” which in Hi-C matrices visually consist of vertical (or horizontal) narrow rectangles anchored at the main matrix diagonal. From a biological standpoint, stripes are thought to arise from asymmetric cohesin-mediated loop extrusion (Fudenberg et al. 2016, Chang et al. 2020). This occurs when loop-extruding cohesin is unidirectionally blocked by a CTCF protein bound to its binding site with the N-terminal end pointing toward it, while loop extrusion can proceed unobstructed on the other side, resulting in a “stripe” of enriched Hi-C contacts emanating from CTCF-bound sites in the genome. Because CTCF binding is enriched at TAD borders, stripes are often found at their edges (Chang et al. 2020); this is not always the case, as it was recently noted that stripes can also appear without a TAD being clearly observed (Gupta et al. 2022).

Stripes are increasingly recognized as important features for epigenetic regulation of transcription and enhancer activity (Kim and Shendure 2019, Kraft et al. 2019), for development (Kraft et al. 2019), and for DNA repair (Arnould et al. 2021), highlighting our need to better detect and analyze them. Further, numerous aspects behind loop extrusion are still far from being fully understood, including factors governing the loading of cohesin onto chromatin and the existence of specific targeted loading sites in eukaryotes (Drayton and Hansen 2022).

Existing stripe detection tools are all rooted in the fields of image processing and analysis. The first method, Zebra (Vian et al. 2018), identifies pixel tracks of higher interaction frequency at genomic domain boundaries, requiring manual processing to confirm stripe candidates. Zebra’s original implementation is not publicly available, but a re-implementation was made available by an independent group on GitHub under the name of StripeCaller (https://github.com/XiaoTaoWang/StripeCaller); StripeCaller constrains the stripe width to a single bin and does not assume that stripes are anchored to the main diagonal of the matrix. Another tool published in late 2018 is domainClassifyR (https://github.com/ChristopherBarrington/domainClassifyR). The tool detects architectural stripes based on TAD boundaries by defining a stripe score that is based on the Z-statistic; however, intra-TAD segments remain undetected. Chromosight (Matthey-Doret et al. 2020) exploits template-based pattern recognition by convolving a set of templates over the contact map. A set of criteria is then applied to filter candidate stripes to deal with, e.g. candidate stripes overlapping with too many empty pixels or that are too close to another detected pattern. Chromosight identifies stripes as a single point, without estimating stripe width and height. Stripenn (Sora Yoon et al. 2022) preprocesses the input Hi-C matrix by performing contrast adjustment followed by noise reduction via the Canny edge detection algorithm. A set of custom criteria is introduced to detect and possibly merge vertical lines. Finally, the tool computes two coefficients—median P-value and stripiness—to quantitatively evaluate architectural stripes. Of the four methods available at present, Stripenn is the only one capable of estimating both the width and the height of a stripe, effectively transitioning stripes from a 0- or 1D pattern to a fully realized 2D representation.

We present StripePy, a stripe recognition method based on concepts rooted in geometric pattern recognition, topological data analysis (Carlsson 2009), and simple geometric reasoning. Implemented in Python, StripePy is uniquely capable of reading interactions in both Cooler (.cool and .mcool) and Juicer (.hic) data formats. Beyond the detection of stripes, including their height and width estimates, StripePy provides a collection of descriptors that can be used for subsequent postprocessing and in-depth analyses. One of the computed descriptors is the relative change, which is used to discern candidate stripes that exhibit minimal variation when comparing the average signal within stripes with the average signal in their neighborhood. StripePy thus offers an efficient and user-friendly method to detect architectural stripes, while at the same time computing several descriptors that can be used to inform further downstream analyses.

To assess StripePy’s performance and compare it with existing tools, we developed StripeBench, a novel benchmark consisting of 64 Hi-C contact maps simulated with the computational tool MoDLE (Rossini et al. 2022) at different resolutions, contact densities, and noise levels. These maps come with ground truth annotations. Furthermore, StripeBench defines a set of measures to quantify the performance and compare computational tools in the classification of genomic bins and recognition of stripes. StripeBench is used to compare StripePy against StripeCaller, Chromosight, and Stripenn and assess how these methods behave at increasing levels of resolution, contact density, and noise level.

Finally, the analysis of real contact maps from two cell lines is presented, and the predicted stripes are compared against CTCF chromatin immunoprecipitation sequencing (ChIP-Seq) peaks. These analyses demonstrate that StripePy can increase the number of correctly predicted stripes and detected anchor sites while maintaining high overall evaluation scores, making StripePy a valuable contribution to the community.

2 Materials and methods

2.1 Overview of StripePy

StripePy combines geometric pattern recognition, topological persistence, and simple geometric reasoning to detect stripes in contact maps. StripePy’s CLI consists of three subcommands: call, view, and plot. stripepy call runs the stripe detection algorithm on a contact matrix in Cooler (.cool or .mcool) or Juicer (.hic) format (Durand et al. 2016, Abdennur and Mirny 2020) at a given resolution. While StripePy can process both balanced and unbalanced contact maps, tests detailed in the Supplementary Information, available as supplementary data at Bioinformatics online, reveal that our algorithm performs best when no prior balancing is applied. Our method depends on a set of parameters, with default values specified both in the remainder of this section and in the help message displayed when executing stripepy call --help. The command stripepy call produces a file in Hierarchical Data Format (.hdf5) (Folk et al. 2011) containing a list of rectangular regions corresponding to the candidate stripes, together with a set of descriptors and complementary information. The genomic coordinates of the candidate stripes identified by stripepy call can be extracted from the .hdf5 file using stripepy view, which writes the stripes’ coordinates in BEDPE format to standard output. Finally, stripepy plot can be used to visualize architectural stripes overlaid on top of the Hi-C matrix; it can also generate several graphs showing the general properties of the called stripes. The remainder of this section summarizes the main concepts behind the StripePy algorithm, which are displayed as a graphically simplified pipeline in Fig. 1; the reader is referred to the Supplementary Information, available as supplementary data at Bioinformatics online, for the theoretical and technical details, and to Fig. 7, available as supplementary data at Bioinformatics online, for an extended overview of StripePy.

Figure 1. Short overview of StripePy. In stripepy call, the input contact map is first pre-processed (Step 1), then studied to detect linear patterns (Step 2), which are subsequently enriched by determining the width and height of each stripe through geometric reasoning, as well as peaks (represented by blue dots) in the signal inside the stripes (Step 3); the candidate stripes are then post-processed to remove weak occurrences (Step 4). stripepy view extracts the genomic coordinates of the candidate stripes and outputs them into a .bedpe file. stripepy plot can be used to produce various plots of the steps adopted by stripepy call.

StripePy’s algorithm boils down to four consecutive steps:

In Step 1, StripePy applies a pre-processing step consisting of a log-transformation followed by a rescaling of the matrix entries to [0, 1]; it then extracts a band around the main diagonal (here set to the default value of 5 Mbp, which was selected to upper bound TAD sizes in mammals; Hansen et al. 2018). StripePy then searches for vertical stripes in the lower- and upper-triangular parts of the matrix separately by applying the remaining steps independently to the corresponding sub-matrices.
In Step 2, each of the two triangular matrices is marginalized by summing over the rows to obtain a scalar function of the columns. To mitigate uninformative fluctuations due to columns/rows with a low number of interactions, a constraint using the maximum between the scalar function and its weighted moving average is applied (Raffo and Biasotti 2020, 2021a,b). The obtained profile is then scaled so that it has values in [0, 1]. All local maxima are filtered with topological persistence, a technique from topological data analysis that requires a local maximum to stand significantly higher than its surroundings to be considered “persistent”: in other words, only local maxima that are markedly higher than nearby values are retained, while those corresponding to minor fluctuations are disregarded. This makes it possible to keep the loci (here called seed sites) that exhibit a more marked linear pattern, in a manner similar to the pattern recognition method known as the Hough transform (Hough 1962, Mukhopadhyay and Chaudhuri 2015). The default threshold for topological persistence is set to 0.04, i.e. 4%. This value was selected empirically after testing with contact maps generated by MoDLE (Rossini et al. 2022). The reader is referred to the Supplementary Information, available as supplementary data at Bioinformatics online, for an extended description of how topological persistence works.
In Step 3, each seed site is analyzed independently to estimate its horizontal and vertical domains, which consist of the genomic intervals defining the stripe horizontally and vertically, respectively. These can then be used, e.g. to compute the stripe width and height and to extract the corresponding regions within the contact map. Given a seed site, the horizontal domain is defined by looking in the neighborhood of the local maximum point where the scalar function from Step 2 is monotonically increasing (resp. decreasing) to the left (resp. to the right) of the maximum point, and then by finding the left (resp. right) bin where the increase (resp. decrease) of the scalar function is the steepest; to prevent excessive stretching of the horizontal domain at low resolutions, a maximum stripe width is introduced (default value: 100 kb). The vertical domain of a stripe is obtained by extracting the columns corresponding to the horizontal domain, marginalizing in a similar manner as in Step 2, and then studying this scalar function. Two criteria are provided: (i) applying topological persistence on the rescaled profile to identify persistent peaks in the signal and use the location of the furthest peak as a boundary; the remaining peaks, if any, point at regions of the stripe where the signal sharply increases; (ii) thresholding to a minimum value.
To enrich what is now just a purely geometric structure (seed site + horizontal domain + vertical domain), Step 4 computes a number of complementary descriptors from the contact map, considering the matrix entries either inside the stripe or outside the stripe (in what we call a k-neighbor): minimum, quartiles, maximum, arithmetic mean and standard deviation. We then combine the inner and outer means into the relative change parameter, which quantifies to what extent the signal differs between the inside and the outside of the stripe: it is defined as the difference between mean inner and mean outer signals, divided by the mean outer signal; the mean outer signal is computed in the k-neighbor.
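
Steps 1 and 2 can be illustrated with a short, plain-Python sketch. It is not StripePy's implementation: all function names are hypothetical, `log1p` is assumed as the log-transformation, the weighted moving average of Step 2 is omitted for brevity, and persistence is computed with a standard union-find sweep in which the global maximum is assigned a persistence equal to the range of the profile.

```python
import math


def preprocess(matrix, band_bins):
    """Step 1 sketch: log-transform, rescale to [0, 1], keep a diagonal band."""
    n = len(matrix)
    logged = [[math.log1p(v) for v in row] for row in matrix]  # assumed transform
    peak = max(max(row) for row in logged) or 1.0  # avoid dividing by zero
    return [
        [logged[i][j] / peak if abs(i - j) <= band_bins else 0.0 for j in range(n)]
        for i in range(n)
    ]


def column_profile(matrix, upper=True):
    """Step 2 sketch: sum each column over one triangular part, rescale to [0, 1]."""
    n = len(matrix)
    profile = []
    for j in range(n):
        rows = range(0, j + 1) if upper else range(j, n)
        profile.append(sum(matrix[i][j] for i in rows))
    top = max(profile) or 1.0
    return [p / top for p in profile]


def persistent_maxima(profile, min_persistence):
    """Indices of local maxima whose persistence exceeds min_persistence."""
    if not profile:
        return []
    n = len(profile)
    order = sorted(range(n), key=lambda i: profile[i], reverse=True)
    parent = [None] * n  # None marks bins not yet visited by the sweep
    persistence = {}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in order:  # sweep from the highest value to the lowest
        parent[i] = i
        roots = {find(j) for j in (i - 1, i + 1)
                 if 0 <= j < n and parent[j] is not None}
        if not roots:
            continue  # a new component is born at the local maximum i
        best = max(roots, key=lambda r: profile[r])
        parent[i] = best
        for r in roots - {best}:  # lower peaks die when components merge
            persistence[r] = profile[r] - profile[i]
            parent[r] = best
    top = find(order[0])
    persistence[top] = profile[top] - min(profile)  # assumption for the global max
    return sorted(i for i, p in persistence.items() if p > min_persistence)


profile = [0.0, 0.2, 0.1, 1.0, 0.3, 0.32, 0.3, 0.6, 0.0]
seeds = persistent_maxima(profile, min_persistence=0.04)  # bins 1, 3 and 7
```

In this toy profile the bump at bin 5 rises only 0.02 above its surroundings and is discarded, whereas bins 1, 3, and 7 survive the 0.04 threshold. For a band of 5 Mbp, `band_bins` would be the band width divided by the bin size.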

The .hdf5 file generated from these four steps can be inspected and post-processed using the stripepy view command. Thresholding the relative change amounts to ruling out weaker stripes. After conducting tests on datasets from the same cell lines analyzed in the article, we chose a threshold of 4% for synthetic data and 3% for real contact maps, which resulted in high single-value scores. Changes in protocols, restriction enzymes, and cell lines may necessitate adjustments of this parameter: lowering the relative change threshold retains a greater number of stripes, whereas higher thresholds lead to more restrictive filtering.
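
The relative change and its thresholding can be sketched as follows. The dictionary keys, the function names, the handling of a zero outer mean, and the choice of keeping stripes at or above the threshold are assumptions made for illustration; they do not describe the layout of StripePy's .hdf5 output.

```python
def relative_change(inner_mean, outer_mean):
    """Relative change in percent: (inner - outer) / outer * 100."""
    if outer_mean == 0:
        # Assumption: an empty neighborhood makes any positive signal stand out.
        return float("inf") if inner_mean > 0 else 0.0
    return 100.0 * (inner_mean - outer_mean) / outer_mean


def filter_stripes(stripes, threshold_percent):
    """Keep stripes whose relative change is at least threshold_percent."""
    return [
        s for s in stripes
        if relative_change(s["inner_mean"], s["outer_mean"]) >= threshold_percent
    ]


# Hypothetical candidates: inner mean vs. mean in the k-neighbor.
candidates = [
    {"id": "strong", "inner_mean": 1.5, "outer_mean": 1.0},      # 50%
    {"id": "weak", "inner_mean": 1.03125, "outer_mean": 1.0},    # 3.125%
    {"id": "flat", "inner_mean": 1.0, "outer_mean": 1.0},        # 0%
]
kept_synthetic = [s["id"] for s in filter_stripes(candidates, 4.0)]
kept_real = [s["id"] for s in filter_stripes(candidates, 3.0)]
```

With these made-up values, the 4% threshold keeps only the strongest candidate, while the 3% threshold also retains the weaker one.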

2.2 StripeBench: unified benchmarking of stripe identification algorithms for Hi-C data

A critical challenge in evaluating the performance of any pattern recognition method for Hi-C is the absence of a robust benchmarking system. Despite the wide and ready availability of contact maps under different conditions such as sequencing depth, resolution, and noise level, the absence of controlled experiments, scalability, and standardization hinders the quantitative analysis of—as well as the fair comparison between—existing tools. To overcome these limitations, we introduce StripeBench, a benchmark that includes: a set of 64 simulated contact maps containing realistic TAD and stripe patterns, a standardized “ground truth” baseline for testing, and a collection of metrics for evaluation and comparison. The remainder of this section presents key points and notations related to the benchmark, with a more detailed explanation provided in the Supplementary Information, available as supplementary data at Bioinformatics online.

The synthetic contact maps are obtained through a genome-wide run of MoDLE (Rossini et al. 2022), a computational tool that uses fast stochastic simulation to sample DNA-DNA contacts generated by loop extrusion. Here, loop extrusion is modeled using extrusion barriers whose occupancy is based on RAD21 ChIP-Seq data from the H1-hESC cell line. Simulations are run using default settings except for the following three factors: target contact density (i.e. the sequencing depth), scale parameter (which controls the noise), and resolution. The effect of changing these parameters on the contact maps is illustrated in Fig. 2A. The ground truth consists of the extrusion barriers (here, CTCF binding sites) inputted to MoDLE. Each barrier includes a position, a blocking direction, and an occupancy value. These three parameters concur to determine the location, direction, and strength of the linear pattern observed in the resulting contact maps. The ground truth contains 31 423 barriers at 5 kb, with nearly an equal distribution between lower and upper-triangular occurrences. This number decreases to 30 581, 28 631, and 26 191 at 10, 25, and 50 kb, respectively: this reduction occurs because we discard duplicated barriers that overlap with the same genomic bin.
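
The drop in the number of ground truth anchors at coarser resolutions follows from binning, as the sketch below shows. The tuple layout, the grouping key (chromosome, bin, direction), and the averaging of occupancies within a bin are assumptions for illustration; the barrier values are invented and are not taken from the benchmark.

```python
from collections import defaultdict


def anchors_per_bin(barriers, resolution):
    """Collapse barriers falling in the same bin (and direction) into one anchor.

    barriers: iterable of (chrom, position_bp, direction, occupancy) tuples.
    Returns {(chrom, bin_index, direction): mean occupancy}.
    """
    groups = defaultdict(list)
    for chrom, position, direction, occupancy in barriers:
        groups[(chrom, position // resolution, direction)].append(occupancy)
    return {key: sum(values) / len(values) for key, values in groups.items()}


# Invented barriers on a single chromosome.
barriers = [
    ("chr1", 1_200, "+", 0.75),
    ("chr1", 4_800, "+", 0.5),
    ("chr1", 7_300, "+", 0.875),
    ("chr1", 7_400, "-", 0.5),
    ("chr1", 12_000, "+", 0.5),
]
at_5kb = anchors_per_bin(barriers, 5_000)    # 4 anchors
at_10kb = anchors_per_bin(barriers, 10_000)  # 3 anchors
```

Barriers that sit in distinct 5 kb bins can share a 10 kb bin, so the anchor count can only stay equal or shrink as the bin size grows.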

Figure 2. StripeBench. (A) The dataset is composed of 64 contact maps, which are generated by varying three factors: contact density $δ$, noise level $σ$, and resolution $ρ$; the three columns show the effect of increasing each of these three factors individually. (B) StripeBench’s recognition vs. classification measures. Circled numbers 1–4 point at concepts relevant in recognition: (1–2) show a predicted stripe (semitransparent rectangle) that can be deemed “good”, as it contains a ground truth anchor point (here represented by a dashed blue vertical line); (3) points at a predicted stripe that is not “meaningful”, as it does not contain any ground truth anchor point; (4) shows a ground truth anchor point that is not found, as it is not contained in any predicted stripe. Circled numbers 5–8 refer, respectively, to an example of true positive, false positive, true negative, and false negative in (bin) classification, where the lower-triangular ground truth classification vector $l_{low}^{*}$ and the predicted classification vector ${\hat{l}}_{low}$ are represented by two rectangles in which white means 0 (no anchor point) and vertical colored segments denote 1 (anchor point); blue (resp. orange) refers to anchor points with average extrusion barrier occupancy greater than or equal to (resp. lower than) 0.70. The panel also lists classification and recognition measures.

To evaluate and compare stripe callers from multiple angles, we introduce two families of measures: classification measures target the identification of genomic bins hosting extrusion barriers; recognition measures determine whether a predicted stripe includes a bin hosting an extrusion barrier, i.e. whether the stripe is traversed by a ground truth anchor. Figure 2B illustrates these concepts; see also Kuhn and Johnson (2018).

We adopt the following “base measures” for the classification problem: sensitivity, or True Positive Rate (TPR); specificity, or True Negative Rate (TNR); and Positive Predictive Value (PPV). Similarly, we introduce the following base recognition measures: Anchor Hit Rate (AHR) and Fraction of Good Candidates (FGC), which correspond to TPR and PPV in the context of recognition. Despite their popularity and ease of interpretation, these metrics run the risk of being severely biased toward the majority class when applied to highly imbalanced datasets. As a result, they can be misleading if used in isolation (Gu et al. 2009, Luque et al. 2019, Aguilar-Ruiz and Michalak 2024). To address these concerns and obtain a more balanced evaluation, the following single-value metrics are adopted: balanced accuracy (bACC) and Geometric Mean (GM), which combine sensitivity and specificity; F1-score (F1c) and Fowlkes–Mallows index (FMc), which summarize precision and recall; and the Jaccard Index (JI). The recognition counterparts of F1c and FMc, computed from AHR and FGC, are denoted F1r and FMr.
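
For concreteness, the classification measures can be computed from two binary label vectors as in the sketch below. It uses the standard textbook definitions of these scores; the function name and the convention of returning 0 when a denominator vanishes are assumptions, and the label vectors are invented.

```python
import math


def classification_scores(truth, predicted):
    """Base and single-value classification measures from 0/1 bin labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    def ratio(num, den):
        return num / den if den else 0.0  # assumed convention for empty classes

    tpr = ratio(tp, tp + fn)  # sensitivity (recall)
    tnr = ratio(tn, tn + fp)  # specificity
    ppv = ratio(tp, tp + fp)  # precision
    return {
        "TPR": tpr,
        "TNR": tnr,
        "PPV": ppv,
        "bACC": (tpr + tnr) / 2,                 # arithmetic mean of TPR and TNR
        "GM": math.sqrt(tpr * tnr),              # geometric mean of TPR and TNR
        "F1c": ratio(2 * ppv * tpr, ppv + tpr),  # harmonic mean of PPV and TPR
        "FMc": math.sqrt(ppv * tpr),             # geometric mean of PPV and TPR
        "JI": ratio(tp, tp + fp + fn),           # Jaccard index
    }


# Invented example: 10 bins, 3 true anchors, 3 predicted anchors.
truth = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
scores = classification_scores(truth, predicted)
```

Here two of the three anchors are recovered with one false positive, so TPR and PPV are both 2/3 while TNR is 6/7; the gap between TNR and the other scores already hints at the effect of class imbalance.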

3 Results

3.1 StripePy: improved stripe classification and recognition

We applied StripePy, Chromosight, StripeCaller, and Stripenn—the currently available stripe callers—to the 64 contact maps in StripeBench, obtaining a number of predictions that varies with resolution, contact density, and noise level. Specifically, we obtained: from 7705 to 35 068 stripes with StripePy, from 3500 to 34 912 with Chromosight, from 0 to 33 002 with StripeCaller, and from 0 to 3092 with Stripenn. The number of predicted stripes for each contact map is provided in Table 1, available as supplementary data at Bioinformatics online.

Some representative examples of predicted stripes are provided in Fig. 3. As mentioned before, Chromosight does not provide estimates for the length or width of a stripe; instead, it returns two genomic pairs, each of which is positioned at a distance determined by the chosen resolution. StripeCaller provides an estimate for the length of a stripe, but, similarly to Chromosight, it sets the width to equal the matrix resolution.

Figure 3. Benchmarking pt. 1. (A) Three regions extracted from three MoDLE-simulated contact maps. The upper and lower bands show the anchor points from the ground truth as solid vertical blue lines, and they refer to the upper- and the lower-triangular parts, respectively. The predictions produced by the tools under study are superimposed on the images of the contact maps. (B–M) Boxplots of the classification (TPR, TNR, PPV, bACC, GM, F1c, FMc, and JI) and recognition (AHR, FGC, F1r, and FMr) measures over the StripeBench contact maps. (N–Q) Heatmaps summarizing how StripePy performs compared to Chromosight and StripeCaller in terms of percentages of ground truth anchor points: (classification) correctly labeled as such; (recognition) found inside a candidate stripe.

Due to the low predictive capability of Stripenn on this dataset, we decided to exclude this tool from the comparative analysis. Nevertheless, its classification and recognition scores on each contact map are included, together with the scores of the other stripe callers, in Tables 2–5, available as supplementary data at Bioinformatics online.

3.1.1 StripePy improves overall bin classification

First, we conducted an overall analysis by evaluating each tool’s classification performance on anchor and non-anchor sites using base measures, considering the entire dataset. Results indicate that StripePy markedly outperforms the other tools in TPR, with a median of 0.2794 compared to Chromosight (0.1957) and StripeCaller (0.0753); see Fig. 3B. The tools show high TNR due to the overwhelming number of non-anchor bins (median TNR values for StripePy, Chromosight, and StripeCaller are 0.9756, 0.9833, and 0.9949, respectively, as shown in Fig. 3C). The strong skewness in label distribution has also been shown to negatively affect PPV, as pointed out by Aguilar-Ruiz and Michalak (2024): the median values for StripePy, Chromosight, and StripeCaller are 0.5341, 0.5395, and 0.5314, respectively (see Fig. 3D).

To address the imbalance in positive and negative bins, we selected a weighted accuracy measure (bACC), which balances sensitivity and specificity through an arithmetic mean. StripePy again outperforms the other tools with a median bACC of 0.6272, compared to Chromosight (0.5886) and StripeCaller (0.5349), see Fig. 3E. Applying the geometric mean, which combines TPR and TNR in an alternative fashion, also confirms this trend: median values for StripePy, Chromosight, and StripeCaller are 0.5208, 0.4389, and 0.2736, respectively (Fig. 3F). Examining the harmonic mean of precision and recall (F1c) reinforces StripePy’s improved predictive performance with a median F1c of 0.3554, compared to Chromosight (0.2767) and StripeCaller (0.1291), see Fig. 3G. The alternative combination of precision and recall via the FMc also shows StripePy’s superiority with a median FMc of 0.3765, versus Chromosight (0.3181) and StripeCaller (0.1966), see Fig. 3H.

Using the JI, a summary of classifier performance that considers true positives, false positives, and false negatives, shows that StripePy outperforms the other tools with a median JI of 0.2161, compared to 0.1605 for Chromosight and 0.0690 for StripeCaller, see Fig. 3I.

Regarding interquartile ranges, StripePy provides the lowest values for all classification measures except for TNR (StripePy: 0.0148, Chromosight: 0.0093, StripeCaller: 0.0090, see Fig. 3C) and PPV (StripePy: 0.1307, Chromosight: 0.1252, StripeCaller: 0.1315, see Fig. 3D). The presence of outliers points to possible limit cases where performance drops significantly (see, e.g. the case of TPR in Fig. 3B).

To conclude, StripePy consistently outperforms Chromosight and StripeCaller across various classification metrics, including the base measure TPR and all single-value measures: bACC, GM, F1c, FMc, and JI. Chromosight and StripeCaller call fewer stripes at a comparable PPV (the three medians differ by less than 0.01). In terms of stability, interquartile ranges indicate a generally more stable performance for StripePy. However, unhandled edge cases may still occur, as shown by the presence of outliers.

3.1.2 Inclusion of stripe width estimation further increases StripePy’s performance

To understand the impact of width estimation, we conducted a similar analysis focused on the recognition problem. The base measure AHR, which corresponds to TPR in this task, confirms that StripePy significantly increases the fraction of recognized anchor sites when estimating widths: the median AHR reaches 0.6312, compared to a median TPR of 0.2794 in classification (Fig. 3J). Analogously, width estimation also increases the fraction of predictions that are correct: while StripePy’s median PPV was 0.5341, its counterpart in the recognition task (FGC) soars to 0.9767 (Fig. 3K). As a result, the single-value measures see a substantial rise: the harmonic mean of AHR and FGC (F1r) attains a median value of 0.7699 for StripePy, compared to a median F1c of 0.3554 (Fig. 3L); the median FMr is 0.7881, whereas the corresponding FMc was 0.3765 in classification (Fig. 3M). While StripePy drastically improves its performance, Chromosight and StripeCaller merely carry over their TPR, PPV, F1c, and FMc values to AHR, FGC, F1r, and FMr. This is because these tools have no notion of width, and thus their metrics are not affected by interpreting the problem as recognition rather than classification (Fig. 3J–M).
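
The effect of width estimation on the recognition measures can be reproduced on a toy example. The sketch below assumes that anchors are given as bin indices and stripes as inclusive (start, end) bin intervals; the function name and all values are hypothetical.

```python
import math


def recognition_scores(anchors, stripes):
    """AHR, FGC and their single-value summaries F1r and FMr."""
    found = [a for a in anchors if any(s <= a <= e for s, e in stripes)]
    good = [(s, e) for s, e in stripes if any(s <= a <= e for a in anchors)]
    ahr = len(found) / len(anchors) if anchors else 0.0  # anchors hit by a stripe
    fgc = len(good) / len(stripes) if stripes else 0.0   # stripes containing an anchor
    f1r = 2 * ahr * fgc / (ahr + fgc) if ahr + fgc else 0.0
    fmr = math.sqrt(ahr * fgc)
    return {"AHR": ahr, "FGC": fgc, "F1r": f1r, "FMr": fmr}


anchors = [10, 25, 40]
# One-bin-wide calls: a call one bin off the anchor counts as a miss.
narrow = [(11, 11), (25, 25), (60, 60)]
# The same calls with an estimated width of a few bins.
wide = [(9, 12), (24, 26), (60, 61)]

narrow_scores = recognition_scores(anchors, narrow)  # AHR = FGC = 1/3
wide_scores = recognition_scores(anchors, wide)      # AHR = FGC = 2/3
```

With one-bin-wide stripes, a predicted stripe contains an anchor only if it coincides with it, so recognition reduces to bin classification; allowing a width lets a slightly displaced call still capture its anchor.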

To summarize, estimating the width of a stripe not only provides a more realistic description of the stripe properties but also significantly improves the performance of a stripe caller, reinforcing the point that stripes should not be thought of as simple 1D segments, but rather as narrow rectangles.

3.1.3 StripePy discovers previously detected and novel stripes

We then investigated the degree of overlap between the predictions obtained by StripePy and those produced by Chromosight and StripeCaller.

We compared the anchor sites detected and missed by StripePy with those of Chromosight (Fig. 3N) and StripeCaller (Fig. 3O). The study reveals that, on average, 68.0% (Fig. 3N) to 70.2% (Fig. 3O) of the ground truth anchor points remain undetected by both tools in the pair. The vast majority of anchor points correctly identified by Chromosight and StripeCaller are also found by StripePy: only 4.4% (Fig. 3N) and 2.3% (Fig. 3O) of anchor sites are unique to Chromosight and StripeCaller, respectively. Finally, StripePy recognizes a higher percentage of anchor points, with 18.2% of them detected by StripePy but not by StripeCaller (Fig. 3O).

When leveraging stripe widths, StripePy’s advantage increases significantly. The percentage of anchors undetected by StripePy and Chromosight falls to 33.9% (Fig. 3P), and for StripePy and StripeCaller drops to 35.5% (Fig. 3Q). Only 2.2% of the anchor points are now unique to Chromosight (Fig. 3P), and just 0.7% are unique to StripeCaller (Fig. 3Q). Meanwhile, a substantial proportion of anchor sites are only found by StripePy: 46.0% compared to Chromosight (Fig. 3P) and 52.9% compared to StripeCaller (Fig. 3Q).
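
A pairwise comparison of this kind reduces to set operations on the detected anchors. The following sketch is one possible way to obtain the four percentages for a single contact map; the function name and the input sets are hypothetical and unrelated to the benchmark data.

```python
def overlap_breakdown(anchors, found_a, found_b):
    """Percentages of anchors found by both tools, by one only, or by neither."""
    anchors = set(anchors)
    a = set(found_a) & anchors
    b = set(found_b) & anchors
    total = len(anchors)

    def percent(subset):
        return 100.0 * len(subset) / total if total else 0.0

    return {
        "both": percent(a & b),
        "only_a": percent(a - b),
        "only_b": percent(b - a),
        "neither": percent(anchors - a - b),
    }


# Invented example with 10 ground truth anchors (bin indices 0..9).
breakdown = overlap_breakdown(range(10), found_a={0, 1, 2, 3}, found_b={2, 3, 4})
```

The four categories partition the ground truth, so their percentages sum to 100 for every pair of tools.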

3.1.4 StripePy outperforms existing tools across a diverse set of conditions

Hi-C data are generated with varying noise levels and sequencing depths, which affect the resolution at which the data can be meaningfully studied and significantly impact data analysis, such as the identification of architectural stripes (Lajoie et al. 2015). Unlike other functional genomics assays, Hi-C data require analysis at a user-determined effective resolution, often involving the consideration of multiple bin sizes depending on the pattern under study. However, higher resolution increases technical noise and contributes to data sparseness. These factors are influenced, among other things, by the technology used, such as the digestion strategy, e.g. sequence-specific digestion by one or more restriction enzymes, or non-sequence-specific digestion by MNase (Micro-C) or DNase (DNase Hi-C) (Ma et al. 2015, Ramani et al. 2016, Yardımcı et al. 2019, Krietenstein et al. 2020). To study the effects of contact density, noise, and resolution on the callers, we stratified the dataset by these factors and evaluated how the classification and recognition metrics depend on them.

To this end, we grouped contact maps by specifying one factor at a time. For the sake of clarity, here we limit the analysis to the medians of the classification and recognition scores; boxplots of the scores can be found in Figs 2, 3, and 6, available as supplementary data at Bioinformatics online.
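
Grouping by one factor at a time and taking medians can be sketched with the standard library alone. The record keys and the score values below are invented for illustration and are not benchmark results.

```python
from collections import defaultdict
from statistics import median


def median_by_factor(records, factor, measure):
    """Median of `measure` over contact maps sharing the same level of `factor`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[factor]].append(record[measure])
    return {level: median(values) for level, values in sorted(groups.items())}


# Invented per-map scores; the other two factors vary freely within each group.
records = [
    {"resolution": 5_000, "noise": 1, "TPR": 0.25},
    {"resolution": 5_000, "noise": 2, "TPR": 0.5},
    {"resolution": 5_000, "noise": 3, "TPR": 0.75},
    {"resolution": 10_000, "noise": 1, "TPR": 0.5},
    {"resolution": 10_000, "noise": 2, "TPR": 0.625},
    {"resolution": 10_000, "noise": 3, "TPR": 0.75},
]
tpr_by_resolution = median_by_factor(records, "resolution", "TPR")
tpr_by_noise = median_by_factor(records, "noise", "TPR")
```

The same records feed every stratification; only the grouping key changes, which is why each contact map contributes to one group per factor.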

Upon first investigation, we observe that medians yield rankings in line with the overall analysis: StripePy consistently outperforms other tools across all contact densities, noise levels, and resolutions for the majority of measures. Focusing on base measures, StripePy provides the highest TPR scores (Fig. 1A1, B1, and C1, available as supplementary data at Bioinformatics online), AHR scores (Fig. 1A9, B9, and C9, available as supplementary data at Bioinformatics online), and FGC scores (Fig. 1A10, B10, and C10, available as supplementary data at Bioinformatics online); however, StripePy exhibits a generally lower median performance in correctly labeling negative bins, as displayed by TNR (Fig. 1A2, B2, and C2, available as supplementary data at Bioinformatics online). When it comes to PPV, StripePy achieves median values similar to those of Chromosight and StripeCaller at higher contact density and noise level (Fig. 1A3, B3, available as supplementary data at Bioinformatics online). StripePy’s lead in classification and recognition is confirmed when adopting single-value classification measures (bACC, GM, F1c, FMc, and JI, see Fig. 1A4–A8, B4–B8, and C4–C8, available as supplementary data at Bioinformatics online) and recognition measures (F1r and FMr, see Fig. 1A11–A12, B11 and B12, and C11 and C12, available as supplementary data at Bioinformatics online).

We observe that increasing contact frequency does not always result in higher scores. For example, the medians of TPR (Fig. 1A1, available as supplementary data at Bioinformatics online) and AHR (Fig. 1A9, available as supplementary data at Bioinformatics online) show that increasing the contact density results in a slight decrease in StripePy performance. Conversely, Chromosight experiences a minor increase in its performance, while it is only StripeCaller that progressively increases its medians, although it appears to asymptotically approach a value well below StripePy and Chromosight.

Regarding changes in noise level, we see that the medians of TPR (Fig. 1B1, available as supplementary data at Bioinformatics online), AHR (Fig. 1B9, available as supplementary data at Bioinformatics online), and all single-value metrics (Fig. 1B4–B8 and B11–B12, available as supplementary data at Bioinformatics online) show drops of different magnitudes as noise increases. This trend is accompanied by a rise in the median TNR scores (Fig. 1B2, available as supplementary data at Bioinformatics online). The simultaneous rise in median TNR scores and decrease in TPR scores suggests that all callers tend to reduce the number of detected stripes as noise increases, as confirmed by the data in Table 1, available as supplementary data at Bioinformatics online.

With the sole exception of TNR (Fig. 1C2, available as supplementary data at Bioinformatics online) and AHR (Fig. 1C9, available as supplementary data at Bioinformatics online), higher resolutions lead to a drop in median scores for StripePy and Chromosight; the same applies to StripeCaller on a smaller number of metrics. For example, all callers show decreases in median FMc (Fig. 1C7, available as supplementary data at Bioinformatics online) and FMr (Fig. 1C12, available as supplementary data at Bioinformatics online) when transitioning from 10 to 5 kb. This might be due, among other reasons, to the increase in sparsity that accompanies the increase in resolution.
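The link between resolution and sparsity can be illustrated with simple arithmetic: halving the bin size roughly quadruples the number of cells in a cis contact map, so the same interactions are spread over about four times as many entries. The sketch below uses a hypothetical chromosome length and a hypothetical helper name.

```python
def upper_triangle_cells(chrom_length_bp: int, resolution_bp: int) -> int:
    """Number of cells in the upper triangle (diagonal included) of a cis map."""
    n_bins = -(-chrom_length_bp // resolution_bp)  # ceiling division
    return n_bins * (n_bins + 1) // 2


chrom_length = 100_000_000  # hypothetical 100 Mb chromosome
cells_10kb = upper_triangle_cells(chrom_length, 10_000)
cells_5kb = upper_triangle_cells(chrom_length, 5_000)
ratio = cells_5kb / cells_10kb

print(f"cells at 10 kb: {cells_10kb:,}")
print(f"cells at 5 kb:  {cells_5kb:,}")
print(f"ratio: {ratio:.2f}")
```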

3.1.5 Analyzing statistical significance

We conducted statistical significance tests to assess whether the variations in classification and recognition performance between pairs of tools are statistically significant. Table 6, available as supplementary data at Bioinformatics online, presents the P-values for the Anderson–Darling test (Anderson and Darling 1954), utilizing the implementation provided by SciPy. The tests were conducted on subsets of the StripeBench benchmark, which were obtained by grouping contact maps by specifying one factor at a time, as well as on the entire benchmark. The P-values are clamped to a minimum value of $10^{-4}$. This analysis reveals that, with the sole exception of PPV, the differences between tools are generally statistically significant: in fact, the overwhelming majority of P-values fall below the 0.05 threshold, allowing us to reject the null hypothesis that pairs of tools behave according to the same probability distribution. Notable exceptions occur in the classification measures for StripePy and Chromosight, e.g. for low noise level.
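As an illustration of this kind of test, the sketch below implements a dependency-free two-sample Anderson–Darling statistic (in Pettitt's rank form, assuming no ties) with a permutation P-value. It is not the SciPy implementation used in the study; the function names, the resampling scheme, and the score vectors are assumptions made for the example, and the floor of 1e-4 merely mirrors the clamping mentioned above.

```python
import random


def ad_two_sample(x: list[float], y: list[float]) -> float:
    """Two-sample Anderson-Darling statistic (Pettitt's form, no ties assumed)."""
    n_x, n_total = len(x), len(x) + len(y)
    # Pool the samples, remembering which values come from x.
    pooled = sorted([(v, True) for v in x] + [(v, False) for v in y])
    stat, from_x = 0.0, 0
    for i, (_, is_x) in enumerate(pooled[:-1], start=1):
        from_x += is_x  # number of x values among the i smallest pooled values
        stat += (from_x * n_total - n_x * i) ** 2 / (i * (n_total - i))
    return stat / (n_x * (n_total - n_x))


def permutation_p_value(x, y, n_resamples=9999, seed=0, floor=1e-4):
    """Permutation P-value for the statistic above, clamped from below."""
    rng = random.Random(seed)
    observed = ad_two_sample(x, y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        if ad_two_sample(pooled[: len(x)], pooled[len(x):]) >= observed:
            hits += 1
    # The +1 terms count the observed labeling as one of the permutations.
    return max((hits + 1) / (n_resamples + 1), floor)


# Hypothetical per-map scores for two callers.
scores_a = [0.50 + 0.01 * i for i in range(20)]
scores_b = [0.80 + 0.01 * i for i in range(20)]   # clearly shifted
scores_c = [0.505 + 0.01 * i for i in range(20)]  # interleaved with scores_a

p_diff = permutation_p_value(scores_a, scores_b, n_resamples=999)
p_same = permutation_p_value(scores_a, scores_c, n_resamples=999)
print(f"shifted samples:     p = {p_diff:.4f}")
print(f"interleaved samples: p = {p_same:.4f}")
```

With n resamples, the smallest P-value such a scheme can return is 1/(n + 1), which is one reason reported P-values are often bounded from below.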

3.2 Analyzing human Hi-C matrices with StripePy reveals fast and robust detection of stripes

To investigate how StripePy behaves on real data, we gathered files from the 4DNucleome and ENCODE, and compared the stripes predicted by various available callers at a resolution of 10 kb with the CTCF peaks from the corresponding cell lines.

We selected four contact maps: an in situ Hi-C and a Micro-C map from the H1 human embryonic stem cell line (H1-hESC), and an in situ Hi-C and an intact Hi-C map from the human lymphoblastoid GM12878 cell line. This choice allows us to compare the tools across different cell lines with distinct contact patterns (e.g. sparsity and number of interactions) and technologies, thus providing a comprehensive evaluation of performance under varied conditions. Each interaction map was paired with a CTCF ChIP-Seq dataset. These datasets were used to generate a list of stripe anchor points by taking the midpoint between the start and end coordinates of a CTCF peak. At a resolution of 10 kb, the Hi-C and Micro-C maps for H1-hESC consist of 215,229,365 and 276,917,669 non-zero entries, which sum up to a total of 2,785,926,565 and 1,186,984,847 interactions, respectively (numbers refer to the cis portion of the contact map). This highlights that although the Hi-C map contains more than twice as many interactions as the Micro-C map, it is sparser. As for the intact Hi-C and in situ Hi-C maps for GM12878, the first has 278,180,961 non-zero entries and a total of 1,415,005,237 interactions, while the second has 91,009,553 non-zero entries and 338,111,285 interactions (numbers refer to the cis portion of the contact map at 10 kb). This indicates that the first map is less sparse and has a higher number of interactions than the second one.
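The anchor construction and the sparsity comparison above can be sketched in a few lines. The peak coordinates below are hypothetical, and mapping each midpoint to a 10 kb bin index is an assumption made for illustration; the interaction and non-zero counts are the H1-hESC figures quoted in the text.

```python
def anchor_bins(peaks, resolution=10_000):
    """Map CTCF peaks (chrom, start, end) to sorted, unique (chrom, bin) anchors."""
    anchors = set()
    for chrom, start, end in peaks:
        midpoint = (start + end) // 2  # midpoint of the peak interval
        anchors.add((chrom, midpoint // resolution))
    return sorted(anchors)


def mean_count_per_nonzero(total_interactions: int, nonzero_entries: int) -> float:
    """Average number of interactions stored in each non-zero matrix entry."""
    return total_interactions / nonzero_entries


# Hypothetical peaks; two of them fall into the same 10 kb bin.
peaks = [
    ("chr1", 10_150, 10_450),
    ("chr1", 19_800, 20_400),
    ("chr1", 20_900, 21_300),
    ("chr2", 5_000, 5_600),
]
anchors = anchor_bins(peaks)
print(anchors)

# H1-hESC figures from the text (cis portion, 10 kb).
h1_hic = mean_count_per_nonzero(2_785_926_565, 215_229_365)
h1_microc = mean_count_per_nonzero(1_186_984_847, 276_917_669)
print(f"in situ Hi-C: {h1_hic:.2f} interactions per non-zero entry")
print(f"Micro-C:      {h1_microc:.2f} interactions per non-zero entry")
```

The Hi-C map concentrates more interactions in fewer non-zero entries, which is the sense in which it is both deeper and sparser than the Micro-C map.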

3.2.1 Running StripePy on real data confirms its performance in stripe recognition

Since these datasets are also manageable by Stripenn, we have included this tool in the comparative analysis.

Snapshots of neighborhoods from different chromosomes are provided in Fig. 5, available as supplementary data at Bioinformatics online. Table 7, available as supplementary data at Bioinformatics online, reports the classification and recognition measures for the four maps, together with the number of anchors found (nAF) and the number of stripes predicted (nSP). To ease the comparison, classification and recognition scores are reported as percentages. From plots and statistics, we reach a series of conclusions.

First, Chromosight, StripeCaller, and Stripenn focus on visually prominent segments, while StripePy can also locate milder or hidden patterns associated with ChIP-Seq peaks. An example of this is indicated by the black arrows in Fig. 5A1–A3, available as supplementary data at Bioinformatics online, which shows interactions and stripes for the H1-hESC Micro-C dataset. This point is further highlighted by the number of anchors found (nAF), which is significantly higher for StripePy: for the same contact map, stripes predicted by StripePy contain a total of 32,644 anchor sites, compared to 6257, 5294, and 9263 anchors identified by Chromosight, StripeCaller, and Stripenn, respectively (Table 7, available as supplementary data at Bioinformatics online). The higher number of anchors found is accompanied by a significantly higher number of predictions (nSP): for this contact map, StripePy predicts 26,210 candidates, compared to 11,765, 12,018, and 3864 predictions from Chromosight, StripeCaller, and Stripenn, respectively (Table 7, available as supplementary data at Bioinformatics online).

Moving to the analysis of the classification and recognition measures, we observe that most of the general findings uncovered with the synthetic data from StripeBench are still valid on these maps (Table 7, available as supplementary data at Bioinformatics online). For classification measures, we indeed observe that StripePy is stronger in correctly predicting positives rather than negatives: TPR values are higher in StripePy, while the TNR is generally lower. As for StripeBench, the overwhelming number of non-anchor bins results in an imbalanced classification problem, causing the other methods to focus on the most populous class (i.e. the negatives) at the expense of the anchor sites (i.e. the positive class). When focusing on single-value metrics, StripePy generally achieves the highest scores, making it the best overall classifier. The sole exceptions involve the H1-hESC Micro-C (for bACC and FMc) and GM12878 DpnII in situ Hi-C datasets (for bACC only). A possible explanation for this underperformance is the sparsity of these two maps, which can affect the pseudo-distributions by generating spurious local maxima with little, if any, biological significance. In the most severe case, namely the GM12878 in situ Hi-C dataset, the sparsity is compounded by a low number of interactions, leading to a significant increase in the number of candidate stripes (see Fig. 5D1–D3, available as supplementary data at Bioinformatics online). For base recognition measures, StripePy excels in AHR but lags behind Stripenn with respect to FGC. The latter can identify, however, only between 9.10% (in GM12878 in situ Hi-C) and 17.00% (in H1-hESC Micro-C) of the anchor sites, which contributes to its relatively low overall performance in recognition measures like F1r and FMr. In other words, Stripenn excels in finding a very limited number of stripes which, in turn, are highly likely to contain anchor sites (see Fig. 5A4, B4, C4, and D4, available as supplementary data at Bioinformatics online). Conversely, StripePy has a lower percentage of stripes that contain anchor sites but provides a much more balanced performance in terms of stripe recognition (see Fig. 5A1, B1, C1, and D1, available as supplementary data at Bioinformatics online).
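The trade-off described above, between finding many anchors and predicting only stripes that contain an anchor, can be illustrated with a small sketch. The two fractions computed here are illustrative quantities in the spirit of the discussion, not the study's formal AHR and FGC definitions; the anchor positions, the stripe intervals, and the function name are hypothetical.

```python
from bisect import bisect_left, bisect_right


def coverage_summary(anchors, stripes):
    """Summarize how predicted stripes overlap known anchors.

    anchors: sorted list of anchor bin indices.
    stripes: list of (left, right) inclusive bin intervals.
    """
    found = set()
    stripes_with_anchor = 0
    for left, right in stripes:
        lo = bisect_left(anchors, left)
        hi = bisect_right(anchors, right)
        if hi > lo:  # at least one anchor lies inside [left, right]
            stripes_with_anchor += 1
            found.update(anchors[lo:hi])
    return {
        "anchors_found": len(found),
        "anchor_fraction": len(found) / len(anchors),
        "stripe_fraction": stripes_with_anchor / len(stripes),
    }


anchors = [3, 10, 11, 25, 40, 41, 60, 75, 90, 99]

# A conservative caller: one prediction, which does contain anchors.
conservative = coverage_summary(anchors, [(10, 11)])

# A liberal caller: many predictions, some without any anchor.
liberal = coverage_summary(
    anchors,
    [(2, 4), (9, 12), (20, 22), (24, 26), (39, 42), (50, 52), (59, 61), (74, 76)],
)
print(conservative)
print(liberal)
```

In this toy setting the conservative caller has a perfect stripe fraction but recovers few anchors, while the liberal caller recovers most anchors at the cost of some predictions without an anchor.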

3.3 StripePy exhibits excellent computational performance

While designing and implementing StripePy, special attention was given to computational performance and efficient usage of computational resources. As a result, StripePy is significantly faster than existing tools, being twice as fast as Chromosight and over 66 times faster than Stripenn. Peak memory usage is also much lower than that of other tools. Overall, processing the .mcool matrix for ENCFF993FGR (a Hi-C dataset with close to two billion interactions at 10 kb resolution) using 8 CPU cores takes <35 s and requires below 650 MB of memory. These results were obtained thanks to deliberate decisions while designing the StripePy algorithm, as well as careful implementation taking advantage of various optimization techniques, including parallel programming, asynchronous programming, and exploiting shared memory as much as possible. Furthermore, special care was taken when selecting third-party dependencies. For example, by using hictkpy instead of other Python libraries to fetch interactions from .hic and .mcool files, we were able to take advantage of the library's capabilities to efficiently fetch interactions surrounding the matrix diagonal, greatly reducing peak memory usage while avoiding needlessly fetching interactions not required by StripePy (Rossini and Paulsen 2024). On the algorithmic side, unlike other stripe recognition tools, StripePy does not rely on steps rooted in image processing and analysis. While the use of the global pseudo-distribution and the subsequent identification of vertical linear patterns as maximum points resemble the Hough transform (Hough 1962, Mukhopadhyay and Chaudhuri 2015), a popular pattern recognition technique for curve and surface recognition, avoiding binarization and edge detection speeds up the computation and eliminates the need for additional parameters.
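The two ideas mentioned above, restricting the computation to a band around the diagonal and locating vertical patterns as maxima of a one-dimensional profile, can be illustrated with a deliberately simplified sketch. This is not StripePy's implementation: the profile below is a plain normalized column sum, maxima are found by a naive neighbor comparison, and all names and data are hypothetical.

```python
def band_column_profile(pixels, n_bins, max_distance_bins):
    """Sum counts per column using only pixels within a band above the diagonal.

    pixels: iterable of (row_bin, col_bin, count) from the upper triangle.
    """
    profile = [0.0] * n_bins
    for row, col, count in pixels:
        if 0 < col - row <= max_distance_bins:  # skip far-from-diagonal pixels
            profile[col] += count
    peak = max(profile) or 1.0  # avoid dividing by zero on an empty band
    return [value / peak for value in profile]


def local_maxima(profile):
    """Indices that are strictly larger than both neighbors."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i - 1] < profile[i] > profile[i + 1]
    ]


n_bins = 10
# Uniform background in the upper triangle ...
pixels = [(r, c, 1.0) for r in range(n_bins) for c in range(r + 1, n_bins)]
# ... plus an enriched vertical segment at column 6 ...
pixels += [(r, 6, 9.0) for r in range(2, 6)]
# ... plus a strong pixel far from the diagonal, which the band ignores.
pixels += [(0, 9, 50.0)]

profile = band_column_profile(pixels, n_bins, max_distance_bins=4)
candidates = local_maxima(profile)
print(candidates)
```

No thresholding of the matrix into a binary image and no edge detection are needed in this scheme, which is the property the paragraph above contrasts with image-processing pipelines.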

4 Discussion

Hi-C and related techniques have greatly advanced our understanding of 3D genome organization and its role in gene regulation, DNA replication, and repair (Gupta et al. 2022, Yoon et al. 2022). While computational methods have been crucial to these insights (Di Stefano et al. 2021), tools for detecting architectural stripes remain scarce. Here, we developed StripePy, a new CLI application, which combines concepts from geometric pattern recognition, algebraic topology, and basic geometry to detect architectural stripes from Hi-C data. StripePy can process contact maps in various formats (.cool, .mcool, and .hic) and outputs its findings in both Hierarchical Data Format (.hdf5) and BEDPE format. Unlike most existing solutions, StripePy does not reduce stripe detection to a mere bin classification problem. Instead, it also computes shape descriptors such as the width and height of stripes, and generates statistics that can be used for postprocessing purposes, such as ranking and classification of candidate stripes, e.g. to filter out unrealistic stripes.
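As a rough illustration of how shape descriptors can support postprocessing, the sketch below stores a candidate stripe as a rectangle of bins, filters candidates by height, and serializes the survivors as six-column BEDPE lines. The `Stripe` container, the column order, and the height threshold are assumptions made for this example and do not describe StripePy's actual output layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stripe:
    """Hypothetical container for a candidate stripe (inclusive bin bounds)."""

    chrom: str
    left_bin: int
    right_bin: int
    top_bin: int
    bottom_bin: int

    @property
    def width(self) -> int:
        return self.right_bin - self.left_bin + 1

    @property
    def height(self) -> int:
        return self.bottom_bin - self.top_bin + 1


def to_bedpe(stripe: Stripe, resolution: int) -> str:
    """Serialize a stripe as a tab-separated BEDPE line (0-based, half-open)."""
    fields = (
        stripe.chrom,
        stripe.left_bin * resolution,
        (stripe.right_bin + 1) * resolution,
        stripe.chrom,
        stripe.top_bin * resolution,
        (stripe.bottom_bin + 1) * resolution,
    )
    return "\t".join(str(field) for field in fields)


candidates = [
    Stripe("chr1", 100, 102, 100, 160),  # tall candidate
    Stripe("chr1", 300, 300, 300, 302),  # very short candidate
]

# Keep only candidates spanning at least 10 bins in height (arbitrary cutoff).
kept = [stripe for stripe in candidates if stripe.height >= 10]
lines = [to_bedpe(stripe, resolution=10_000) for stripe in kept]
print("\n".join(lines))
```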

A major challenge in the field is the lack of standardized definitions for architectural elements in Hi-C data, such as TADs, stripes, and individual contacts (Carty et al. 2017, de Wit 2020, Liu et al. 2021, Raffo and Paulsen 2023). To address this, we developed StripeBench, a dataset of 64 simulated contact maps with ground truth annotations and a comprehensive list of metrics, facilitating exhaustive quantitative comparisons between callers. We anticipate that this benchmark will aid in the future development of algorithms tailored for Hi-C and related data.

An additional obstacle with the analysis of Hi-C data lies in the variation in resolutions, noise levels, and sequencing depth that characterize this kind of data, all of which hinder standardized detection of chromatin features. Our benchmark tackles this issue by incorporating datasets with diverse conditions, enabling a more comprehensive evaluation of stripe detection methods.

The quantitative analysis of the benchmarks presented in the previous sections shows that all tools struggle, to different degrees, in accurately identifying true positives in the classification task. This difficulty is mainly caused by the skewed distribution of the ground truth labels, but it also stems from the lack of a precise and unambiguous definition of what a stripe is. While incorporating geometric descriptors proves beneficial, as shown by the increased F1c and Fowlkes–Mallows scores, further work is needed to better define architectural stripes and their functional implications.

Supplementary Material

btaf351_Supplementary_Data

btaf351_supplementary_data.pdf (15.2 MB, pdf)

Acknowledgements

We thank the 4DNucleome Network and the lab of Job Dekker for contributing several of the Hi-C and Micro-C datasets used as part of this study. We thank the ENCODE Consortium and the labs of Erez Aiden and Michael Snyder for contributing ENCODE data used as part of this study. Furthermore, we would like to acknowledge Mr. Bendik Berg for his contributions in packaging StripePy and configuring the framework for unit testing.

Contributor Information

Andrea Raffo, Department of Biosciences, University of Oslo, Oslo 0316, Norway.

Roberto Rossini, Department of Biosciences, University of Oslo, Oslo 0316, Norway.

Jonas Paulsen, Department of Biosciences, University of Oslo, Oslo 0316, Norway.

Author contributions

Andrea Raffo (Conceptualization [lead], Formal analysis [lead], Investigation [lead], Methodology [lead], Software [equal], Writing—original draft [lead], Writing—review & editing [lead]), Roberto Rossini (Conceptualization [supporting], Software [equal], Writing—original draft [equal], Writing—review & editing [equal]), and Jonas Paulsen (Conceptualization [supporting], Funding acquisition [lead], Supervision [lead], Writing—original draft [equal], Writing—review & editing [equal])

Supplementary data

Supplementary data are available at Bioinformatics online.

Conflict of interest: None declared.

Funding

This work was supported by the Norwegian Research Council [Projects 324137 and 343102].

Code availability

StripePy source code is hosted on GitHub at https://github.com/paulsengroup/StripePy and is archived on Zenodo at DOI: 10.5281/zenodo.15310827 (Raffo and Rossini 2025b). StripePy can be easily installed from source or from PyPI (https://pypi.org/project/stripepy-hic) using pip. Furthermore, StripePy is available on Bioconda at https://anaconda.org/bioconda/stripepy-hic and can be installed using conda. Containerized versions of StripePy are regularly published on DockerHub at https://hub.docker.com/r/paulsengroup/stripepy.

The code used for the benchmarks and data analyses presented in this article is hosted on a separate GitHub repository at https://github.com/paulsengroup/2024-stripepy-paper. A copy of the code in this repository is archived on Zenodo at DOI: 10.5281/zenodo.15310693 (Raffo and Rossini 2025a).

Data availability

This study reanalyzed a number of publicly available datasets released by the 4DNucleome (https://data.4dnucleome.org/) (Dekker et al. 2017, Reiff et al. 2022) and ENCODE (https://www.encodeproject.org/) (Dunham et al. 2012, Luo et al. 2020, Hitz et al. 2023, Kagda et al. 2023):

Hi-C and Micro-C datasets: 4DNFI6HDY7WZ (Krietenstein et al. 2020)—H1-hESC (in situ Hi-C; DpnII); 4DNFI9GMP2J8 (Krietenstein et al. 2020)—H1-hESC (Micro-C); ENCFF216QQM—GM12878 (in situ Hi-C; DpnII); ENCFF993FGR—GM12878 (intact MNase Hi-C).
CTCF ChIP-Seq datasets: ENCFF692RPA—H1-hESC; ENCFF796WRU—GM12878.

The results presented throughout the manuscript have been deposited on Zenodo at DOI: 10.5281/zenodo.15308825 (Raffo 2025).

The benchmark developed as part of this publication and used to contrast the stripe callers considered in this manuscript, StripeBench, has also been deposited on Zenodo at DOI: 10.5281/zenodo.14448329 (Raffo 2024).

References

Abdennur N, Mirny LA. Cooler: scalable storage for Hi-C data and other genomically labeled arrays. Bioinformatics 2020;36:311–6.
Aguilar-Ruiz JS, Michalak M. Classification performance assessment for imbalanced multiclass data. Sci Rep 2024;14:10759. 10.1038/s41598-024-61365-z
Anderson TW, Darling DA. A test of goodness of fit. J Am Stat Assoc 1954;49:765–9. 10.1080/01621459.1954.10501232
Arnould C, Rocher V, Finoux A-L et al. Loop extrusion as a mechanism for formation of DNA damage repair foci. Nature 2021;590:660–5. 10.1038/s41586-021-03193-z
Bonev B, Cavalli G. Organization and function of the 3D genome. Nat Rev Genet 2016;17:661–78. 10.1038/nrg.2016.112
Bouwman BAM, Crosetto N, Bienko M. The era of 3D and spatial genomics. Trends Genet 2022;38:1062–75. 10.1016/j.tig.2022.05.010
Carlsson G. Topology and data. Bull Amer Math Soc 2009;46:255–308. 10.1090/S0273-0979-09-01249-X
Carty M, Zamparo L, Sahin M et al. An integrated model for detecting significant chromatin interactions from high-resolution Hi-C data. Nat Commun 2017;8:15454. 10.1038/ncomms15454
Chang L-H, Ghosh S, Noordermeer D. TADs and their borders: free movement or building a wall? J Mol Biol 2020;432:643–52. 10.1016/j.jmb.2019.11.025
Chen Y, Lin Z-B, Wang S-K et al. Reconstruction of diploid higher-order human 3D genome interactions from noisy Pore-C data using Dip3D. Nat Struct Mol Biol 2025:1–26. 10.1038/s41594-025-01512-w
Cifuentes D, Draisma J, Henriksson O et al. 3D genome reconstruction from partially phased Hi-C data. Bull Math Biol 2024;86:33. 10.1007/s11538-024-01263-7
Cremer T, Cremer M. Chromosome territories. Cold Spring Harb Perspect Biol 2010;2:a003889. 10.1101/cshperspect.a003889
de Wit E. TADs as the caller calls them. J Mol Biol 2020;432:638–42. 10.1016/j.jmb.2019.09.026
Dekker J, Belmont AS, Guttman M et al.; 4D Nucleome Network. The 4D nucleome project. Nature 2017;549:219–26. 10.1038/nature23884
Dekker J, Misteli T. Long-range chromatin interactions. Cold Spring Harb Perspect Biol 2015;7:a019356. 10.1101/cshperspect.a019356
Di Stefano M, Paulsen J, Jost D et al. 4D nucleome modeling. Curr Opin Genet Dev 2021;67:25–32. 10.1016/j.gde.2020.10.004
Dixon JR, Selvaraj S, Yue F et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 2012;485:376–80. 10.1038/nature11082
Drayton JA, Hansen AS. Right on target: chromatin jets arise from targeted cohesin loading in wild-type cells. Mol Cell 2022;82:3755–7. 10.1016/j.molcel.2022.09.027
Dunham I, Kundaje A, Aldred SF et al. An integrated encyclopedia of DNA elements in the human genome. Nature 2012;489:57–74. 10.1038/nature11247
Durand NC, Robinson JT, Shamim MS et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst 2016;3:99–101.
Folk M, Heber G, Koziol Q et al. An overview of the HDF5 technology suite and its applications. In: Proceedings of the EDBT/ICDT 2011 Workshop on Array Databases. New York, NY: Association for Computing Machinery, 2011, 36–47. 10.1145/1966895.1966900
Fudenberg G, Imakaev M, Lu C et al. Formation of chromosomal domains by loop extrusion. Cell Rep 2016;15:2038–49. 10.1016/j.celrep.2016.04.085
Gu Q, Zhu L, Cai Z. Evaluation measures of the classification performance of imbalanced data sets. In: Zhihua C, Zhenhua L, Zhuo K, Yong L (eds), Computational Intelligence and Intelligent Systems. Berlin, Heidelberg: Springer, 2009, 461–71. 10.1007/978-3-642-04962-0_53
Gupta K, Wang G, Zhang S et al. StripeDiff: model-based algorithm for differential analysis of chromatin stripe. Sci Adv 2022;8:eabk2246. 10.1126/sciadv.abk2246
Hansen AS, Cattoglio C, Darzacq X et al. Recent evidence that TADs and chromatin loops are dynamic structures. Nucleus 2018;9:20–32. 10.1080/19491034.2017.1389365
Hitz BC, Lee J-W, Jolanki O et al. The ENCODE Uniform Analysis Pipelines. 2023. 10.1101/2023.04.04.535623
Hough PVC. Method and means for recognizing complex patterns. US Patent 3,069,654, December 1962.
Kagda MS, Lam B, Litton C et al. Data Navigation on the ENCODE Portal. 2023. https://arxiv.org/abs/2305.00006
Kim S, Shendure J. Mechanisms of interplay between transcription factors and the 3D genome. Mol Cell 2019;76:306–19. 10.1016/j.molcel.2019.08.010
Kraft K, Magg A, Heinrich V et al. Serial genomic inversions induce tissue-specific architectural stripes, gene misexpression and congenital malformations. Nat Cell Biol 2019;21:305–10. 10.1038/s41556-019-0273-x
Krietenstein N, Abraham S, Venev SV et al. Ultrastructural details of mammalian chromosome architecture. Mol Cell 2020;78:554–65.e7. 10.1016/j.molcel.2020.03.003
Kuhn M, Johnson K. Applied Predictive Modeling. New York: Springer, 2018.
Lajoie BR, Dekker J, Kaplan N. The Hitchhiker’s guide to Hi-C analysis: practical guidelines. Methods 2015;72:65–75. 10.1016/j.ymeth.2014.10.031
Lieberman-Aiden E, van Berkum NL, Williams L et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 2009;326:289–93. 10.1126/science.1181369
Liu N, Low WY, Alinejad-Rokny H et al. Seeing the forest through the trees: prioritising potentially functional interactions from Hi-C. Epigenetics Chromatin 2021;14:41. 10.1186/s13072-021-00417-4
Luo Y, Hitz BC, Gabdank I et al. New developments on the encyclopedia of DNA elements (ENCODE) data portal. Nucleic Acids Res 2020;48:D882–9. 10.1093/nar/gkz1062
Luque A, Carrasco A, Martín A et al. The impact of class imbalance in classification performance metrics based on the binary confusion matrix. Pattern Recognit 2019;91:216–31. 10.1016/j.patcog.2019.02.023
Ma W, Ay F, Lee C et al. Fine-scale chromatin interaction maps reveal the cis-regulatory landscape of human lincRNA genes. Nat Methods 2015;12:71–8.
Matthey-Doret C, Baudry L, Breuer A et al. Computer vision for pattern detection in chromosome contact maps. Nat Commun 2020;11:5795. 10.1038/s41467-020-19562-7
Mukhopadhyay P, Chaudhuri BB. A survey of Hough transform. Pattern Recognit 2015;48:993–1010. 10.1016/j.patcog.2014.08.027
Nora EP, Lajoie BR, Schulz EG et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 2012;485:381–5. 10.1038/nature11049
Rabl C. Über Zelltheilung. Morphol Jahrb 1885;10:214–330.
Raffo A, Biasotti S. Data-driven quasi-interpolant spline surfaces for point cloud approximation. Comput Graph 2020;89:144–55. 10.1016/j.cag.2020.05.004
Raffo A, Biasotti S. Weighted quasi-interpolant spline approximations of planar curvilinear profiles in digital images. Mathematics 2021a;9:3084. 10.3390/math9233084
Raffo A, Biasotti S. Weighted quasi-interpolant spline approximations: properties and applications. Numer Algor 2021b;87:819–47. 10.1007/s11075-020-00989-4
Raffo A, Paulsen J. The shape of chromatin: insights from computational recognition of geometric patterns in Hi-C data. Brief Bioinform 2023;24:bbad302. 10.1093/bib/bbad302
Raffo A. Data analysis results for StripePy manuscript, April 2025. 10.5281/zenodo.15308825
Raffo A. StripeBench: a benchmark to compare architectural stripe callers, December 2024. 10.5281/zenodo.14535653
Raffo A, Rossini R. paulsengroup/2024-stripepy-paper: v1.1.0, April 2025a. 10.5281/zenodo.15310693
Raffo A, Rossini R. paulsengroup/stripepy: v1.1.0, April 2025b. 10.5281/zenodo.15310827
Ramani V, Cusanovich DA, Hause RJ et al. Mapping 3D genome architecture through in situ DNase Hi-C. Nat Protoc 2016;11:2104–21.
Reiff SB, Schroeder AJ, Kırlı K et al. The 4D Nucleome data portal as a resource for searching and visualizing curated nucleomics data. Nat Commun 2022;13:6561. 10.1038/s41467-022-29697-4
Rossini R, Kumar V, Mathelier A et al. MoDLE: high-performance stochastic modeling of DNA loop extrusion interactions. Genome Biol 2022;23:247. 10.1186/s13059-022-02815-7
Rossini R, Paulsen J. Hictk: blazing fast toolkit to work with .hic and .cool files. Bioinformatics 2024;40:btae408. 10.1093/bioinformatics/btae408
Van Bortle K, Corces VG. Nuclear organization and genome function. Annu Rev Cell Dev Biol 2012;28:163–87. 10.1146/annurev-cellbio-101011-155824
Vian L, Pękowska A, Rao SSP et al. The energetics and physiological impact of cohesin extrusion. Cell 2018;173:1165–78.e20. 10.1016/j.cell.2018.03.072
Yardımcı GG, Ozadam H, Sauria MEG et al. Measuring the reproducibility and quality of Hi-C data. Genome Biol 2019;20:57. 10.1186/s13059-019-1658-7
Yoon S, Chandra A, Vahedi G. Stripenn detects architectural stripes from chromatin conformation data using computer vision. Nat Commun 2022;13:1602. 10.1038/s41467-022-29258-9
Zheng H, Xie W. The role of 3D genome organization in development and cell differentiation. Nat Rev Mol Cell Biol 2019;20:535–50. 10.1038/s41580-019-0132-4

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

btaf351_Supplementary_Data

btaf351_supplementary_data.pdf^{(15.2MB, pdf)}

Data Availability Statement

Hi-C and Micro-C datasets: 4DNFI6HDY7WZ (Krietenstein et al. 2020)—H1-hESC (in situ Hi-C; DpnII); 4DNFI9GMP2J8 (Krietenstein et al. 2020)—H1-hESC (Micro-C); ENCFF216QQM—GM12878 (in situ Hi-C; DpnII); ENCFF993FGR—GM12878 (intact MNase Hi-C).
CTCF ChIP-Seq datasets: ENCFF692RPA—H1-hESC; ENCFF796WRU—GM12878.

The results presented throughout the manuscript have been deposited on Zenodo at DOI: 10.5281/zenodo.15308825 (Raffo 2025).

[btaf351-B1] Abdennur N, Mirny LA. Cooler: scalable storage for Hi-C data and other genomically labeled arrays. Bioinformatics 2020;36:311–6. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B2] Aguilar-Ruiz JS, Michalak M. Classification performance assessment for imbalanced multiclass data. Sci Rep 2024;14:10759. 10.1038/s41598-024-61365-z [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B3] Anderson TW, Darling DA. A test of goodness of fit. J Am Stat Assoc 1954;49:765–9. 10.1080/01621459.1954.10501232 [DOI] [Google Scholar]

[btaf351-B4] Arnould C, Rocher V, Finoux A-L et al. Loop extrusion as a mechanism for formation of DNA damage repair foci. Nature 2021;590:660–5. 10.1038/s41586-021-03193-z [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B5] Bonev B, Cavalli G. Organization and function of the 3D genome. Nat Rev Genet 2016;17:661–78. 10.1038/nrg.2016.112 [DOI] [PubMed] [Google Scholar]

[btaf351-B6] Bouwman BAM, Crosetto N, Bienko M. The era of 3D and spatial genomics. Trends Genet 2022;38:1062–75. 10.1016/j.tig.2022.05.010 [DOI] [PubMed] [Google Scholar]

[btaf351-B7] Carlsson G. Topology and data. Bull Amer Math Soc 2009;46:255–308. 10.1090/S0273-0979-09-01249-X [DOI] [Google Scholar]

[btaf351-B8] Carty M, Zamparo L, Sahin M et al. An integrated model for detecting significant chromatin interactions from high-resolution Hi-C data. Nat Commun 2017;8:15454. 10.1038/ncomms15454 [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B9] Chang L-H, Ghosh S, Noordermeer D. TADs and their borders: free movement or building a wall? J Mol Biol 2020;432:643–52. 10.1016/j.jmb.2019.11.025. [DOI] [PubMed] [Google Scholar]

[btaf351-B10] Chen Y, Lin Z-B, Wang S-K et al. Reconstruction of diploid higher-order human 3D genome interactions from noisy Pore-C data using Dip3D. Nat Struct Mol Biol 2025:1–26. 10.1038/s41594-025-01512-w[PMC] [DOI] [PubMed] [Google Scholar]

[btaf351-B11] Cifuentes D, Draisma J, Henriksson O et al. 3D genome reconstruction from partially phased Hi-C data. Bull Math Biol 2024;86:33. 10.1007/s11538-024-01263-7 [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B12] Cremer T, Cremer M. Chromosome territories. Cold Spring Harb Perspect Biol 2010;2:a003889. 10.1101/cshperspect.a003889 [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B13] de Wit E. TADs as the caller calls them. J Mol Biol 2020;432:638–42. 10.1016/j.jmb.2019.09.026 [DOI] [PubMed] [Google Scholar]

[btaf351-B14] Dekker J, Belmont AS, Guttman M et al. ; 4D Nucleome Network The 4D nucleome project. Nature 2017;549:219–26. 10.1038/nature23884 [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B15] Dekker J, Misteli T. Long-range chromatin interactions. Cold Spring Harb Perspect Biol 2015;7:a019356. 10.1101/cshperspect.a019356 [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B16] Di Stefano M, Paulsen J, Jost D et al. 4D nucleome modeling. Curr Opin Genet Dev 2021;67:25–32. 10.1016/j.gde.2020.10.004 [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B17] Dixon JR, Selvaraj S, Yue F et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 2012;485:376–80. 10.1038/nature11082 [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B18] Drayton JA, Hansen AS. Right on target: chromatin jets arise from targeted cohesin loading in wild-type cells. Mol Cell 2022;82:3755–7. 10.1016/j.molcel.2022.09.027 [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B19] Dunham I, Kundaje A, Aldred SF et al. An integrated encyclopedia of DNA elements in the human genome. Nature 2012;489:57–74. 10.1038/nature11247 [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B20] Durand NC, Robinson JT, Shamim MS et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst 2016;3:99–101. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btaf351-B21] Folk M, Heber G, Koziol Q et al. An overview of the HDF5 technology suite and its applications. In: Proceedings of the EDBT/ICDT 2011 Workshop on Array Databases. New York, NY: Association for Computing Machinery, 2011, 36–47. 10.1145/1966895.1966900 [DOI] [Google Scholar]

[btaf351-B22] Fudenberg G, Imakaev M, Lu C et al. Formation of chromosomal domains by loop extrusion. Cell Rep 2016;15:2038–49. 10.1016/j.celrep.2016.04.085

[btaf351-B23] Gu Q, Zhu L, Cai Z. Evaluation measures of the classification performance of imbalanced data sets. In: Zhihua C, Zhenhua L, Zhuo K, Yong L (eds), Computational Intelligence and Intelligent Systems. Berlin, Heidelberg: Springer, 2009, 461–71. 10.1007/978-3-642-04962-0_53

[btaf351-B24] Gupta K, Wang G, Zhang S et al. StripeDiff: model-based algorithm for differential analysis of chromatin stripe. Sci Adv 2022;8:eabk2246. 10.1126/sciadv.abk2246

[btaf351-B25] Hansen AS, Cattoglio C, Darzacq X et al. Recent evidence that TADs and chromatin loops are dynamic structures. Nucleus 2018;9:20–32. 10.1080/19491034.2017.1389365

[btaf351-B26] Hitz BC, Lee J-W, Jolanki O et al. The ENCODE Uniform Analysis Pipelines. 2023. 10.1101/2023.04.04.535623

[btaf351-B27] Hough PVC. Method and means for recognizing complex patterns. US Patent 3,069,654, December 1962.

[btaf351-B28] Kagda MS, Lam B, Litton C et al. Data Navigation on the ENCODE Portal. 2023. https://arxiv.org/abs/2305.00006

[btaf351-B29] Kim S, Shendure J. Mechanisms of interplay between transcription factors and the 3D genome. Mol Cell 2019;76:306–19. 10.1016/j.molcel.2019.08.010

[btaf351-B30] Kraft K, Magg A, Heinrich V et al. Serial genomic inversions induce tissue-specific architectural stripes, gene misexpression and congenital malformations. Nat Cell Biol 2019;21:305–10. 10.1038/s41556-019-0273-x

[btaf351-B31] Krietenstein N, Abraham S, Venev SV et al. Ultrastructural details of mammalian chromosome architecture. Mol Cell 2020;78:554–65.e7. 10.1016/j.molcel.2020.03.003

[btaf351-B32] Kuhn M, Johnson K. Applied Predictive Modeling. New York: Springer, 2018.

[btaf351-B33] Lajoie BR, Dekker J, Kaplan N. The Hitchhiker’s guide to Hi-C analysis: practical guidelines. Methods 2015;72:65–75. 10.1016/j.ymeth.2014.10.031

[btaf351-B34] Lieberman-Aiden E, van Berkum NL, Williams L et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 2009;326:289–93. 10.1126/science.1181369

[btaf351-B35] Liu N, Low WY, Alinejad-Rokny H et al. Seeing the forest through the trees: prioritising potentially functional interactions from Hi-C. Epigenetics Chromatin 2021;14:41. 10.1186/s13072-021-00417-4

[btaf351-B36] Luo Y, Hitz BC, Gabdank I et al. New developments on the encyclopedia of DNA elements (ENCODE) data portal. Nucleic Acids Res 2020;48:D882–9. 10.1093/nar/gkz1062

[btaf351-B37] Luque A, Carrasco A, Martín A et al. The impact of class imbalance in classification performance metrics based on the binary confusion matrix. Pattern Recognit 2019;91:216–31. 10.1016/j.patcog.2019.02.023

[btaf351-B38] Ma W, Ay F, Lee C et al. Fine-scale chromatin interaction maps reveal the cis-regulatory landscape of human lincRNA genes. Nat Methods 2015;12:71–8.

[btaf351-B39] Matthey-Doret C, Baudry L, Breuer A et al. Computer vision for pattern detection in chromosome contact maps. Nat Commun 2020;11:5795. 10.1038/s41467-020-19562-7

[btaf351-B40] Mukhopadhyay P, Chaudhuri BB. A survey of Hough transform. Pattern Recognit 2015;48:993–1010. 10.1016/j.patcog.2014.08.027

[btaf351-B41] Nora EP, Lajoie BR, Schulz EG et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 2012;485:381–5. 10.1038/nature11049

[btaf351-B42] Rabl C. Über Zelltheilung. Morphol Jahrb 1885;10:214–330.

[btaf351-B43] Raffo A, Biasotti S. Data-driven quasi-interpolant spline surfaces for point cloud approximation. Comput Graph 2020;89:144–55. 10.1016/j.cag.2020.05.004

[btaf351-B44] Raffo A, Biasotti S. Weighted quasi-interpolant spline approximations of planar curvilinear profiles in digital images. Mathematics 2021a;9:3084. 10.3390/math9233084

[btaf351-B45] Raffo A, Biasotti S. Weighted quasi-interpolant spline approximations: properties and applications. Numer Algor 2021b;87:819–47. 10.1007/s11075-020-00989-4

[btaf351-B46] Raffo A, Paulsen J. The shape of chromatin: insights from computational recognition of geometric patterns in Hi-C data. Brief Bioinform 2023;24:bbad302. 10.1093/bib/bbad302

[btaf351-B47] Raffo A. Data analysis results for StripePy manuscript, April 2025. 10.5281/zenodo.15308825.

[btaf351-B48] Raffo A. StripeBench: a benchmark to compare architectural stripe callers, December 2024. 10.5281/zenodo.14535653

[btaf351-B49] Raffo A, Rossini R. paulsengroup/2024-stripepy-paper: v1.1.0, April 2025a. 10.5281/zenodo.15310693

[btaf351-B50] Ramani V, Cusanovich DA, Hause RJ et al. Mapping 3D genome architecture through in situ DNase Hi-C. Nat Protoc 2016;11:2104–21.

[btaf351-B51] Reiff SB, Schroeder AJ, Kırlı K et al. The 4D Nucleome data portal as a resource for searching and visualizing curated nucleomics data. Nat Commun 2022;13:6561. 10.1038/s41467-022-29697-4

[btaf351-B52] Rossini R, Kumar V, Mathelier A et al. MoDLE: high-performance stochastic modeling of DNA loop extrusion interactions. Genome Biol 2022;23:247. 10.1186/s13059-022-02815-7

[btaf351-B53] Rossini R, Paulsen J. Hictk: blazing fast toolkit to work with .hic and .cool files. Bioinformatics 2024;40:btae408. 10.1093/bioinformatics/btae408

[btaf351-B54] Raffo A, Rossini R. paulsengroup/stripepy: v1.1.0, April 2025b. 10.5281/zenodo.15310827

[btaf351-B55] Van Bortle K, Corces VG. Nuclear organization and genome function. Annu Rev Cell Dev Biol 2012;28:163–87. 10.1146/annurev-cellbio-101011-155824

[btaf351-B56] Vian L, Pękowska A, Rao SSP et al. The energetics and physiological impact of cohesin extrusion. Cell 2018;173:1165–78.e20. 10.1016/j.cell.2018.03.072

[btaf351-B57] Yardımcı GG, Ozadam H, Sauria MEG et al. Measuring the reproducibility and quality of Hi-C data. Genome Biol 2019;20:57. 10.1186/s13059-019-1658-7

[btaf351-B58] Yoon S, Chandra A, Vahedi G. Stripenn detects architectural stripes from chromatin conformation data using computer vision. Nat Commun 2022;13:1602. 10.1038/s41467-022-29258-9

[btaf351-B59] Zheng H, Xie W. The role of 3D genome organization in development and cell differentiation. Nat Rev Mol Cell Biol 2019;20:535–50. 10.1038/s41580-019-0132-4
