Analytical Science Advances. 2025 Feb 25;6(1):e202400013. doi: 10.1002/ansa.202400013

A new methodology for sub‐femtomolar detection of organic molecules through the combination of surface‐enhanced Raman spectroscopy and a superhydrophobic fluidic concentrator

Victor Fabre 1,2, Franck Carcenac 1, Adrian Laborde 1, Jean‐Baptiste Doucet 1, Christophe Vieu 1, Philippe Louarn 2, Emmanuelle Trevisiol 3,1
PMCID: PMC11878439  PMID: 40041746

Abstract

A specific device that combines (1) surface‐enhanced Raman spectroscopy (SERS) and (2) superhydrophobic surfaces is developed to detect traces of analytes diluted at sub‐femtomolar concentration in water solutions. The first step of the analysis consists in the evaporation of a drop of the solution on the device, designed to concentrate all the analytes on a central functionalized small area (80 µm diameter). This analytical zone is covered with Ag nanoparticles dedicated to enhance Raman signals. In a second step, this zone is scanned pixel by pixel to accumulate around 2200 Raman spectra. The third step is an algorithmic analysis of the pile of spectra to identify Raman peaks that are specific to the targeted molecules. We detail an original analysis method that allows (1) to select spectra that are significantly different from those obtained when a pure solvent is evaporated (control experiment), (2) to classify the spectra by a criterion of similarity and, finally, (3) to select the SERS spectra of the analytes. This method uses hierarchical correlation clustering techniques, the originality being to classify the different spectra on the basis of their peak positions, with all peaks being normalized at the same intensity and bandwidth. The method leads to a convincing identification of spectra of the targeted molecules (i.e. rhodamine B), down to atto‐molar concentrations.


Abbreviations

CCD

charge‐coupled device

DMSO

dimethyl sulphoxide

FDTS

1H,1H,2H,2H‐perfluorodecyltrichlorosilane

HCA

hierarchical clustering algorithm

HMDS

hexamethyldisilazane

PCA

principal components analysis

Rh‐B

rhodamine B

RIE

reactive ion etching

SEM

scanning electron microscope

SERS

surface‐enhanced Raman spectroscopy

SH

superhydrophobic

1. INTRODUCTION

Since the emergence of surface‐enhanced Raman spectroscopy (SERS) in the 1980s, progress has been made in surface engineering to improve detection thresholds and, thus, to detect molecules at concentrations well below the femtomolar range. One way to reach such a detection level is to couple a SERS surface to a superhydrophobic (SH) one. 1 , 2 , 3 , 4 Evaporating a drop containing an analyte on this type of surface makes it possible to pre‐concentrate the analytes and thus improve the limit of detection.

However, the SH/SERS combination shows some limitations that need to be mitigated to exploit its full potential. One is linked to the characteristics of SERS spectra and, more precisely, to their fluctuations. 5 Due to the complexity of the interactions of molecules with a SERS surface, the measured Raman spectra may indeed present slight changes in peak position, shape and intensity. Even more challenging for confident detection of molecules, some peaks may disappear from the spectra, which is often referred to as Raman blinking in the literature. Another limiting point that is scarcely discussed in the SERS literature is the role of the solvent used to prepare dilutions of a molecular compound at very low concentrations. Indeed, even ultrapure water coming from pharmaceutical and semiconductor clean room facilities 6 , 7 , 8 , 9 could contain minerals and organic matter liable to generate peaks at frequencies of interest for Raman identification. A hypothetical solution at zero concentration thus still generates Raman signals, which arise from the molecular vibrations of the species that make up the ‘history’ of the dilution solvent. These two main issues of fluctuations and purity make SERS detection a complex analytical technique for real applications requiring the detection of sub‐femtomolar traces. We propose a method to overcome these difficulties. It is based on an algorithmic analysis of the spectra acquired on SH/SERS surfaces using a hierarchical correlation clustering method.

With the improvement of computer science in both hardware (charge‐coupled device [CCD] sensors, computing power and memory) and software, the ability to record and process data – especially Raman spectra – has increased considerably. SERS, and more generally, Raman spectroscopy, has not escaped the trend of using machine learning algorithms for specific applications. Most of these algorithms can be divided into supervised and unsupervised categories. Supervised algorithms are ‘trained’ on data for which an upstream human analysis has performed a classification. Once the algorithm is trained, an unknown dataset is then analysed by the algorithm and classified based on what it has learned. In the unsupervised category, the algorithms search for structures that allow the data to be gathered into families with the same characteristics. In the case of SERS spectra analysis, the first papers using classification algorithms date back to 2010. This type of algorithm allowed the use of SERS not as a characterization technique but rather as a classification method for specific purposes such as food analysis, forensics, analysis of bacteria, 10 , 11 viruses 12 or medical diagnosis. 13 , 14 , 15

The approach we have selected in our work is to discard supervised algorithms as they do not consider the true complexity of the SERS spectra coming from an unknown sample. Indeed, supervised approaches target very specific detection applications, such as the discrimination between, for example, healthy and cancerous cells. In contrast, what is sought in our work is to highlight information that will then be assessed as relevant by an experienced user for, ideally, identifying specific molecules contained at sub‐femtomolar concentration in the solution. Instead of using a conventional method based on principal components analysis (PCA), the idea of this paper is to introduce, for SERS spectra analysis, a hierarchical correlation classification method, that is, a classification based on the similarity of the spectra among themselves. Beyond the specificity of our application, our choice of not using artificial intelligence algorithms is motivated by a reduction of resources (all spectra are processed with a standard laptop) and the desire to have a thoroughly explicit categorization method. From this point of view, an ‘expert’ system with decisions based on the explicit definition of criteria and thresholds is preferable, and what we propose in the following can be called an explicit expert system.

In this paper, we present the SH/SERS surface used for the experiments. Then, an innovative method based on hierarchical correlation clustering for processing SERS information is described and discussed in light of some experiments targeting the identification of sub‐femtomolar traces of organic molecular compounds.

2. MICROFABRICATED DEVICE FOR COUPLING A SUPERHYDROPHOBIC FLUIDIC CONCENTRATOR AND AN SERS SURFACE

We first developed a specific device for recording SERS spectra of a targeted molecule at very low concentrations. It combines a SH fluidic concentrator and a SERS surface based on Ag nanoparticles. The SH concentrator has been designed to drive the analytes, during the evaporation of a micro‐volume of water‐based solution, onto the top of an analytical pedestal. The top surface of this pedestal is covered by Ag nanoparticles that form nano‐antennas able to enhance Raman signals by plasmonic effects. After the evaporation of the sample and the deposition of the analytes, the focused laser beam of the Raman system scans the entire surface of the pedestal pixel by pixel. Given the diameter of the laser beam (2 µm) and that of the pedestal (80 µm), a typical experiment (evaporation + Raman imaging of the pedestal) generates about 2200 spectra. These spectra need to be treated algorithmically to extract the relevant information they contain. The description of the algorithmic method and the discussion of some applications will be the central focus of this contribution.

Figure 1 shows a global view of the device where the microstructures of the fluidic concentrator can be seen; meanwhile, in parts (A)–(E), the device is represented at different steps of the fabrication process. This gives insight into details on the plasmonic nanostructures responsible for the surface enhancement of Raman signals. In the following, we will refer to ‘SH/SERS’ for denoting the Raman spectra acquired using the device and the methodology of spectroscopic characterization.

FIGURE 1.


Scanning electron microscopy image of the superhydrophobic/surface‐enhanced Raman spectroscopy (SH/SERS) device in wide field: (A) focus on the ‘pedestal’ area after the first etching step; (B) the same area after the ICP‐reactive ion etching (RIE) step creating Si nano‐pillars; (C) focus on the nano‐pillars found on the central pedestal; (D) view of the nano‐pillars over the central pedestal at the end of the process, after the thermal evaporation of 100 nm of silver on 5 nm of titanium and the lift‐off of the 1H,1H,2H,2H‐perfluorodecyltrichlorosilane (FDTS) hydrophobic layer; (E) zoom on some of the nano‐pillars of image (D) covered with silver nanoparticles. Note the bending of the nano‐pillars, which results in a clustering of adjacent nano‐pillars and couples the Ag nanoparticles, leading to possible intense hot spots in SERS.

The SH aspect of the device and its use for concentrating scarce dilute molecules onto the analytical central pedestal have been detailed in a previous work. 16 Here, we briefly describe the fabrication of the full device, including the fluidic concentrator and the SERS surface. The detailed fabrication process can be found in Section 9. Each silicon device occupies a disk of 3 mm in diameter (Figure 1) distributed over a 4″ silicon wafer. Conventional UV photolithography followed by reactive ion etching (RIE) of a silicon wafer allows us to define the first level of the structure. Figure 1A shows the central part of the device where a circular pedestal of 80 µm diameter is present. This pedestal is the hydrophilic central zone of the device on which all the analytes will be concentrated. It is surrounded by five radial guiding lines, which form an angle of 72° among them. All over the surface, silicon micro‐pillars of 3 µm diameter and 6 µm high are arranged in a periodic hexagonal lattice (15 µm pitch) (Figure 1A). These micro‐pillars define the first level of topography conferring to the surface a SH character after the hydrophobic coating of the silicon wafer.

On top of the micro‐pillars and the pedestal, a second step of RIE is performed to create standing silicon nano‐pillars (Figure 1B,C) of 150 nm average diameter and 1.1 µm high. These nano‐pillars contribute to enhancing the super‐hydrophobicity of the surface by their roughening effect, and, more importantly, they will support metallic Ag nanoparticles, which will serve as plasmonic resonators. Indeed, to create the SERS substrate, a bilayer of Ti (5 nm) and Ag (100 nm) is then evaporated under vacuum on top of the layer of nano‐pillars. Finally, to improve the SH property of the surface, a self‐assembled molecular monolayer of 1H,1H,2H,2H‐perfluorodecyltrichlorosilane (FDTS) is deposited in vapour phase and structured by a lift‐off process after a second UV‐photolithography exposure step. This hydrophobic coating is deposited everywhere except on the central pedestal, which is left hydrophilic. This contrast of hydrophobicity induces a change in the evaporation mode as soon as the edges of the sessile droplet encounter the limit of the pedestal (see Ref. [16]). The result is a more homogenous 2D assembling of the analytes over the analytical zone. The surface of the Ag nanoparticles is left pristine (without FDTS) to avoid any unwanted molecules of FDTS in the proximity of the plasmonic antenna, which would have generated parasitic Raman peaks in the spectra.

Parts (D) and (E) of Figure 1 show the configuration of the nano‐antennas obtained at the end of the process. They are constituted of silicon nano‐pillars covered with Ag nanoparticles that tend to bend by deformation of the pillars. The resulting non‐homogeneity, with ‘aggregates’ of Ag nanoparticles separated by more or less narrow gaps, is known to favour electromagnetic enhancement. Depending on the relative positioning and conformation of the aggregates, the plasmonic effect can be locally enhanced; this defines so‐called hot spots that are randomly dispatched on the pedestal surface. Their density contributes to the overall SERS efficiency of the device. Each time some molecules of interest are left in the vicinity of a hot spot, an enhanced Raman signal is produced containing the Raman signature of the molecules to detect in the corresponding Raman spectrum.

3. HIERARCHICAL CORRELATION CLUSTERING

As previously explained, about 2200 spectra from the Raman scan are acquired on the pedestal. At very low concentrations, only a small fraction of these spectra contains relevant information for analysing the traces of the molecular compounds diluted in the water sample that has been evaporated. How can they be selected and how can they be distinguished from spectra of contaminants?

We have developed a dedicated algorithm that helps to cluster the spectra of interest and to differentiate them from spectra obtained by control experiments performed with only pure water (the diluent). It is based on the hierarchical correlation clustering of the spectra. Following the clustering method, 17 the principle is to divide the data into groups of similar objects (or ‘clusters’). In our case, the data are the spectra obtained from the Raman ‘imaging’ (or scan) of the pedestal. Each spectrum is the sequence of counts (denoted xi , 1 ≤ i ≤ n) detected on each of the n acquisition channels of the Raman spectrometer, at a given pixel of the pedestal scan. The degree of similarity between two spectra (X and Y) is quantified using the Pearson correlation coefficient (ρ xy ). It is calculated for all pairs of spectra, which defines the correlation matrix:

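$$\rho_{xy} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y}, \quad \text{with } -1 \le \rho_{xy} \le 1$$

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$

$$\sigma_X^2 = \frac{1}{n}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2, \quad \sigma_Y^2 = \frac{1}{n}\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2$$

$$\operatorname{cov}(X, Y) = \frac{1}{n}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)$$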

where |ρxy| ranges from 0 (no linear correlation) to 1 (perfect correlation or anticorrelation). A high value of this metric can indicate that two spectra may originate from the vibrations of the same molecular substance.

Using Python and the pandas library, all these Raman spectra can be arranged in the form of 2D arrays (DataFrames) with rows indexed by Raman wavenumber and each column corresponding to one of the spectra of the mapping. The name of the column thus refers to a particular experiment and to an address that represents the position of the probed pixel on the surface. The DataFrames are used to compute the Pearson correlation coefficients between all pairs of spectra. This generates an N × N matrix (the ‘correlation matrix’), where N is the number of columns of the DataFrames. This matrix is symmetric, with ones on its diagonal. Several experiments can be treated together, in which case N can be larger than 10,000.
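As an illustration of this layout, the following minimal sketch builds a small DataFrame (rows indexed by Raman shift, one column per spectrum) and computes its Pearson correlation matrix with `DataFrame.corr`. The column names and intensity values are hypothetical stand-ins, not data from the experiments, and the helper name `correlation_matrix` is ours.

```python
import numpy as np
import pandas as pd


def correlation_matrix(spectra: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation between every pair of columns (one column = one spectrum)."""
    return spectra.corr(method="pearson")


# Hypothetical toy data: 3 spectra sampled on 5 Raman shifts.
wavenumbers = np.array([1000.0, 1200.0, 1400.0, 1600.0, 1800.0])
spectra = pd.DataFrame(
    {
        "exp1_px0": [1.0, 2.0, 3.0, 4.0, 5.0],
        "exp1_px1": [2.0, 4.0, 6.0, 8.0, 10.0],  # same shape, rescaled: rho = +1
        "exp1_px2": [5.0, 4.0, 3.0, 2.0, 1.0],   # mirrored shape: rho = -1
    },
    index=pd.Index(wavenumbers, name="raman_shift_cm-1"),
)

rho = correlation_matrix(spectra)  # N x N DataFrame, here N = 3
```

The rescaled spectrum correlates perfectly with the first one, which illustrates that the Pearson coefficient is insensitive to a global intensity factor.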

Figure 2A, as an example, represents 20 random spectra from the DataFrames constructed after evaporation of a 6 µL droplet of 100 nM rhodamine B (Rh‐B) solubilized in ultrapure water; the corresponding spectra are plotted in the usual manner (intensity vs. spectral shift). The associated correlation matrix is shown in Figure 2B. In this particular case, all the spectra look quite similar and are characteristic of the fluorescence band of Rh‐B, which translates into high correlation coefficients, between 0.55 and 1 across the matrix.

FIGURE 2.


(A) Twenty randomly chosen surface‐enhanced Raman spectroscopy (SERS) spectra from the DataFrames constructed after evaporation of a 6 µL droplet of 100 nM of rhodamine B in ultrapure water; (B) the Pearson correlation coefficient matrix computed using the pandas library (Python).

The correlation matrix is the input of the hierarchical data classification. We use a classical agglomerative method that consists of a series of data partitions and the formation of groups of Raman spectra that are gathered based on their mutual correlation. These groups will be called ‘clusters’. At each step of the method, the algorithm calculates the distance between the elements according to a specified metric and gathers the elements closest to each other. The procedure is then represented in the form of a dendrogram, which is a branched structure similar to an evolutionary tree. In this study, we use a metric based on dissimilarity. The dissimilarity matrix between the spectra is obtained by subtracting the element‐wise absolute value of the correlation matrix from a matrix of ones. For two spectra X and Y, the distance can thus be calculated as follows:

$$d(X, Y) = 1 - \left|\rho_{X,Y}\right|$$

The algorithm works as follows. In the beginning, each spectrum constitutes a single cluster. The dissimilarity matrix (N × N) is first used to form the first branches of the dendrogram by connecting the two most similar elements; these branches form a cluster. Once this first cluster is built, a new distance matrix ((N − 1) × (N − 1)) is computed. In this new matrix, only the distances between the remaining elements and the new cluster need to be calculated. There are several methods to calculate the distance of the elements to the cluster. In this work, we use the ‘maximum distance’ (also known as complete linkage). If we denote by X 1, X 2, …, Xn the elements of a cluster C, and by Y an element outside this cluster, then the maximum distance d(Y, C) can be calculated as

$$d(Y, C) = \max_{i} \, d(Y, X_i)$$

We chose this method of distance calculation because it ensures that an element will not be added to a cluster just because it has a significant degree of correlation with only one element in the cluster. This procedure is then repeated until all the spectra are assembled.
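The agglomeration described above can be sketched with SciPy, under the assumption that its `linkage` routine with `method="complete"` is an acceptable stand-in for the ‘maximum distance’ rule; the text does not state which implementation was actually used. The function name `cluster_from_correlation` and the 4 × 4 correlation matrix are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_from_correlation(rho: np.ndarray, similarity_threshold: float) -> np.ndarray:
    """Complete-linkage clustering on d = 1 - |rho|, cut at 1 - similarity_threshold."""
    dissimilarity = 1.0 - np.abs(rho)
    # Enforce an exact zero diagonal and exact symmetry despite rounding errors.
    np.fill_diagonal(dissimilarity, 0.0)
    dissimilarity = (dissimilarity + dissimilarity.T) / 2.0
    condensed = squareform(dissimilarity, checks=False)  # upper triangle as a vector
    tree = linkage(condensed, method="complete")          # "maximum distance" rule
    # Spectra end up in the same cluster only if all their pairwise
    # dissimilarities are at most 1 - similarity_threshold.
    return fcluster(tree, t=1.0 - similarity_threshold, criterion="distance")


# Hypothetical correlation matrix for four spectra: two tight pairs.
rho = np.array(
    [
        [1.00, 0.95, 0.20, 0.10],
        [0.95, 1.00, 0.30, 0.15],
        [0.20, 0.30, 1.00, 0.92],
        [0.10, 0.15, 0.92, 1.00],
    ]
)
labels = cluster_from_correlation(rho, similarity_threshold=0.9)
```

With a similarity threshold of 0.9, spectra 0 and 1 share one label and spectra 2 and 3 share another, because only those pairs are correlated above 0.9.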

An example of a dendrogram is given in Figure 3. A threshold of dissimilarity allows cutting the dendrogram to form the clusters according to their mutual similarities. In the example below, the similarity threshold (1 – dissimilarity) has been set to 0.9, that is, a dissimilarity threshold of 0.1. This threshold cuts the dendrogram at three points and thus defines three clusters (see Figure 4). Cluster 3 has only one element. Indeed, according to the maximum‐distance definition, this spectrum lies at a distance of 0.45 (ρ = 0.55) from Cluster 1 and 0.14 (ρ = 0.86) from Cluster 2, both of which exceed the cut at 0.1.

FIGURE 3.


Example of a dendrogram built using the Pearson correlation coefficient matrix from Figure 2B. The dotted line is a dissimilarity threshold set to 0.1 (similarity of 0.9) that cuts the dendrogram three times, thus defining three clusters.

FIGURE 4.


Representation of the three clusters constructed using the dendrogram of Figure 3 and a dissimilarity threshold set to 0.1; the first cluster contains 6 spectra, the second 13 spectra and the last only 1.

4. ADAPTATION OF THE METHOD TO THE SPECIFICITIES OF SH/SERS SPECTRA

We now discuss the procedure that we have established to retrieve relevant spectra from experiments.

The first step is a pre‐selection consisting of rejecting spectra that present an anomaly or are, a priori, not informative regarding the type of molecules we are looking for. This concerns, in particular, spectra with a single extremely intense peak caused by cosmic rays impacting the CCD camera during the acquisition. Such spikes interfere with the whole analysis and can be considered a form of noise. To do so, we use a Python adaptation by Coca‐Lopez 18 of the R algorithm presented by Whitaker et al. 19 We also reject spectra that do not exhibit enough peaks in the 1000–1800 cm−1 region, in which organic molecules exhibit many peaks: in practice, we retain for analysis only the spectra presenting at least three peaks in this region. To identify the peaks, we use a Python code described in ‘Multiscale peak detection in wavelet space’ by Zhang et al. 20
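The peak‐count criterion can be sketched as follows. This is only an illustration: it uses `scipy.signal.find_peaks` with a prominence filter as a simple stand‐in for the wavelet‐based detector of Zhang et al., and the prominence value, the function names and the synthetic spectra are all assumptions made for the example.

```python
import numpy as np
from scipy.signal import find_peaks


def count_peaks_in_window(shift, intensity, lo=1000.0, hi=1800.0, min_prominence=5.0):
    """Count local maxima inside [lo, hi] cm^-1 (stand-in detector, not the wavelet one)."""
    mask = (shift >= lo) & (shift <= hi)
    peaks, _ = find_peaks(intensity[mask], prominence=min_prominence)
    return len(peaks)


def keep_spectrum(shift, intensity, min_peaks=3, **kwargs):
    """Pre-selection rule from the text: keep spectra with at least three peaks in the window."""
    return count_peaks_in_window(shift, intensity, **kwargs) >= min_peaks


def gaussian(shift, centre, height=50.0, sigma=5.0):
    """Synthetic peak used only to build the toy spectra below."""
    return height * np.exp(-0.5 * ((shift - centre) / sigma) ** 2)


shift = np.linspace(800.0, 2000.0, 1201)  # 1 cm^-1 sampling (hypothetical)
rich = sum(gaussian(shift, c) for c in (1189.0, 1358.0, 1646.0))  # three in-window peaks
poor = gaussian(shift, 900.0) + gaussian(shift, 1358.0)           # only one in-window peak
flat = np.zeros_like(shift)                                       # no peak at all

kept = [keep_spectrum(shift, s) for s in (rich, poor, flat)]
```

Only the first synthetic spectrum passes the filter; the peak placed at 900 cm−1 is ignored because it lies outside the 1000–1800 cm−1 window.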

The second step intends to answer the crucial question of the purity of the solvent and the presence of contaminants. The objective is to select spectra that are representative of the molecules under analysis (e.g. Rh‐B) and significantly distinct from those generated by the analysis of a droplet of ultrapure water. To that end, we build a stack that gathers all the spectra from one or several experiments at non‐zero concentrations of Rh‐B and those obtained from a control experiment, issued from the evaporation of a droplet of solvent only (ultrapure DI water). We then perform the hierarchical clustering and retain as relevant information only the clusters that do not contain any spectrum from the control experiment. This highly selective method allows us to significantly increase our confidence in selecting only relevant spectra that are not suspected of coming from unavoidable contaminants.
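A minimal sketch of this selection rule is given below, assuming that each spectrum carries a cluster label and a tag naming the experiment it comes from. The function name, the `"control"` tag and the `min_size` parameter are hypothetical conveniences introduced for the example.

```python
from collections import defaultdict


def control_free_clusters(labels, origins, control_name="control", min_size=1):
    """Return {cluster label: [spectrum indices]} for clusters with no control spectrum."""
    members = defaultdict(list)
    for index, (label, origin) in enumerate(zip(labels, origins)):
        members[label].append((index, origin))

    selected = {}
    for label, items in members.items():
        if len(items) < min_size:
            continue  # optionally drop clusters that are too small
        if any(origin == control_name for _, origin in items):
            continue  # a single control spectrum disqualifies the whole cluster
        selected[label] = [index for index, _ in items]
    return selected


# Hypothetical labels (from a clustering step) and experiment tags.
labels = [1, 1, 2, 2, 3]
origins = ["1fM", "1aM", "1fM", "control", "1aM"]

all_clean = control_free_clusters(labels, origins)
non_singleton = control_free_clusters(labels, origins, min_size=2)
```

Cluster 2 is rejected because it contains one control spectrum, even though its other member comes from a Rh‐B experiment; this is what makes the rule highly selective.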

This method is illustrated in Figure 5. We seek to detect Rh‐B molecules diluted at different concentrations, ranging from 10−18 to 10−7 M, in ultrapure water. All the spectra from the experiments at the different concentrations are stacked with the 2200 spectra obtained with ultrapure water. The initial DataFrame has about 29,000 entries. After the pre‐selection, this is reduced to 7665 spectra, which then feed the hierarchical classification algorithm.

FIGURE 5.


Workflow of surface‐enhanced Raman spectroscopy (SERS) spectra analysis. Left side: from experiments to cluster selection based on hierarchical correlation clustering. Right side: diagram explaining the process of elimination of contaminant spectra in the data frame.

Figure 6 shows the number of clusters free of spectra from the control experiment, in which there is more than one spectrum (we eliminate singletons) as a function of the similarity threshold. As expected, the number of generated clusters increases with the correlation threshold, which implies that the number of spectra in these clusters globally decreases. This number passes through a maximum for a correlation rate of 0.94 and returns to 0 when the correlation rate is 1 (all clusters being singletons). A threshold value of 0.9 appears to be a good compromise since it leads to clusters that are sufficiently rich in number of spectra without relaxing too much their similarities.
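The kind of threshold scan summarized in Figure 6 can be sketched as follows, under the assumption that SciPy's complete‐linkage implementation is used and with synthetic spectra standing in for the measured ones (two random templates plus noise). The function names and all numerical settings are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def build_tree(spectra: np.ndarray) -> np.ndarray:
    """Complete-linkage tree on d = 1 - |rho|; rows of `spectra` are individual spectra."""
    dissimilarity = 1.0 - np.abs(np.corrcoef(spectra))
    np.fill_diagonal(dissimilarity, 0.0)
    dissimilarity = np.clip((dissimilarity + dissimilarity.T) / 2.0, 0.0, None)
    return linkage(squareform(dissimilarity, checks=False), method="complete")


def count_control_free_clusters(tree, is_control, similarity_threshold):
    """Number of clusters with more than one spectrum and no control spectrum."""
    labels = fcluster(tree, t=1.0 - similarity_threshold, criterion="distance")
    count = 0
    for label in np.unique(labels):
        in_cluster = labels == label
        if in_cluster.sum() > 1 and not is_control[in_cluster].any():
            count += 1
    return count


# Synthetic stand-in data: 15 "sample" and 15 "control" spectra on 200 channels.
rng = np.random.default_rng(0)
sample = rng.normal(size=200) + 0.1 * rng.normal(size=(15, 200))
control = rng.normal(size=200) + 0.1 * rng.normal(size=(15, 200))
spectra = np.vstack([sample, control])
is_control = np.array([False] * 15 + [True] * 15)

tree = build_tree(spectra)  # built once, then cut at several thresholds
thresholds = np.linspace(0.5, 1.0, 11)
counts = [count_control_free_clusters(tree, is_control, thr) for thr in thresholds]
```

Building the tree once and cutting it at several heights avoids recomputing the linkage for every threshold. At a threshold of 1 every cluster is a singleton, so the count falls back to zero, as described in the text.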

FIGURE 6.


(A) Number of clusters free of pure water spectra and containing more than one element as a function of the similarity threshold.

Figure 7 represents a typical cluster composed of seven spectra from 1 pM, 1 fM, 100 aM and 1 aM experiments.

FIGURE 7.


(A) Representation of the spectra of a cluster generated with a correlation threshold of 0.9. The red lines mark the positions of the algorithmically detected peaks. (B) List of the peaks detected in the spectra of the cluster shown in (A). Each peak is denoted by its frequency shift expressed in cm−1.

This method thus does assemble the spectra from different experiments into clusters whose Raman signatures differ from those generated by the control experiment. This was our purpose; however, when inspecting the ‘physical coherence’ of the resulting clusters, that is, when screening their common vibrational signatures, we discover that the method also assembles apparently inconsistent clusters, formed of spectra with uncorrelated peaks. An example is given in Figure 7, which shows a cluster that gathers spectra with peaks that never coincide and, thus, most likely correspond to different molecules. Further analysis has shown that the problem comes from the influence of the baseline on the correlation factor, which becomes predominant in comparison to the influence of the peaks themselves. In other words, the cluster gathers similar spectra; however, their similarity is not due to the peaks themselves but to their overall shape or envelope.

To improve the method, we thus remove this envelope before detecting the peaks of the spectra and classifying the selected spectra. This baseline removal step can therefore be considered a signal pre‐processing 21 step before analysis. In Section S1 of this paper, we detail this baseline issue.

Similarly, we also realize that the correlation can be largely affected by the intensity of the peaks, although this is not necessarily pertinent information due to the large fluctuations of the enhancement factor from one hot spot to another. Indeed, in SH/SERS, the intensity of the peaks is related to the enhancement factor around the probed hot spots, which depends intimately on the nanometre‐scale configuration of the Ag nanoparticles. These configurations are highly variable and, consequently, the local enhancement factors fluctuate strongly. Moreover, SH/SERS detection at very low concentrations approaches conditions where less than a molecular monolayer is deposited on the analytical pedestal. This leads to situations where a single‐molecule regime is attained at a hot spot. In such a situation, the peak intensity at a given pixel position does not reflect the local concentration of molecules but rather characterizes the local plasmonic coupling between Ag nanoparticles, that is, the substrate itself and not the analyte. Due to the high (and uncontrolled) variability of the peak intensities, it is thus possible that two spectra present a low correlation, although they contain peaks at close positions. We thus decide to perform the correlation on simplified spectra that only contain the information on the peak frequency positions (see Section S2 for details). By setting all peak intensities to 1 in the simplified spectrum and by attributing the same width to all the peaks (10 cm−1), we thus eliminate information that is potentially interesting for characterizing the plasmonic effects of the substrate but not useful for our specific analytical purposes.
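The construction of a simplified spectrum can be sketched as follows, assuming a regular 1 cm−1 sampling grid and rectangular gates centred on each detected peak; the exact construction is detailed in Section S2 and may differ. The peak lists used here are hypothetical.

```python
import numpy as np


def simplify_spectrum(shift, peak_positions, width=10.0):
    """Binary spectrum: 1 inside a gate of `width` cm^-1 around each peak, 0 elsewhere."""
    simplified = np.zeros_like(shift, dtype=float)
    for position in peak_positions:
        simplified[np.abs(shift - position) <= width / 2.0] = 1.0
    return simplified


shift = np.arange(1000.0, 1801.0, 1.0)  # assumed 1 cm^-1 grid over the window of interest

a = simplify_spectrum(shift, [1189.0, 1358.0, 1646.0])
b = simplify_spectrum(shift, [1190.0, 1360.0, 1645.0])  # same peaks, shifted by 1-2 cm^-1
c = simplify_spectrum(shift, [1050.0, 1500.0, 1750.0])  # unrelated peak positions

rho_ab = np.corrcoef(a, b)[0, 1]
rho_ac = np.corrcoef(a, c)[0, 1]
```

In this toy case, the two spectra with nearly coincident peaks remain strongly correlated, whereas the pair with unrelated peak positions has a correlation close to zero, regardless of what the original peak heights or baselines were.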

In other words, in the methodology adopted, we need a high‐surface density of hot spots to generate measurable vibrational peaks at very low molecular concentrations that can be discriminated from the baseline and the noise, but we do not exploit any information related to the intensity of the detected peaks as is usually done in analytical Raman techniques. Our hypothesis is therefore that the hierarchical correlation clustering on simplified spectra will classify spectra in clusters exhibiting some similarity in the chemical environment, which has been probed by SH/SERS. This method will serve as a preliminary selection of clusters that gather relevant spectra. After classification, it is necessary to display the original spectra to look at the physical coherence of the generated clusters. The simplified spectra serve as intermediary objects for feeding the classification algorithm with the relevant information.

5. RE‐ADJUSTMENT OF THE CORRELATION THRESHOLD USED IN HIERARCHICAL CLUSTERING ALGORITHM (HCA) WITH SIMPLIFIED SPECTRA

While it is obvious that ρ = 0 marks an absence of correlation, there is no numerical criterion that simply sets what is a ‘good’ correlation between two variables. There is an element of subjectivity in the human assessment of similarity, and |ρ| > 0.7 is classically considered a mark of ‘good’ correlation. The interest of the systematic calculation is to remove this ambiguity by categorizing the variables according to their cross‐correlation and, in a set of complex variables, to classify the spectra into clusters of highest correlation. We can then analyse the relevance of these classifications to see, for example, if similar clusters are reproduced in successive experiments.

When using simplified spectra, simply made of several gate functions at each peak position, the Pearson correlation factor gives information on the correspondence of the peak positions between two spectra. As seen previously, the choice of the Pearson correlation threshold used in the hierarchical clustering algorithm (HCA) is important in determining the number of clusters with more than one element. To determine the ‘right’ correlation threshold required for an efficient classification, we ran a simulation. A total of 30,000 spectra (10,000 with 3 peaks, 10,000 with 4 peaks and 10,000 with 5 peaks, all at random positions) were created from scratch on the interval [1000–1800] cm−1. The Pearson correlation matrix (30,000 × 30,000) is then calculated. Hereafter, an element of the matrix (|ρXiYj|) will be called an observation. The distribution of the absolute value of the Pearson correlation coefficient across the generated matrix, that is, the distribution of all the observations (shown in Figure S3‐1), is maximal at a correlation factor of 0.08 and then rapidly decreases. Another way to observe the decrease in the number of pairs of spectra exhibiting a correlation factor greater than a given value consists in plotting the cumulative number of observations as a function of the Pearson correlation coefficient threshold (Figure S3‐2). This graph shows, for several absolute values of the Pearson correlation coefficient, the cumulated number of observations that exceed this value. The plot indicates, for example, that if a correlation threshold of 0.7 is used, only 0.0045% of all the pairs of spectra are correlated with a larger factor. This percentage rises to 0.03% for a correlation threshold of 0.6. Together, these results show that the correlation factor between randomly generated simplified spectra very rarely exceeds 0.6.

It is also interesting to look at pairs of spectra exhibiting a correlation factor greater than 0.6. This is the purpose of Figure 8, where two pairs of spectra with respective correlation factors of 0.64 and 0.85 are shown. In the first case (Figure 8A), the spectrum labelled ‘427’ exhibits three gates (three peaks), whereas the spectrum labelled ‘1789’ has four gates. Two of the three gates of spectrum ‘427’ are partially aligned with two of the four gates of spectrum ‘1789’, and the remaining gate of spectrum ‘427’ is perfectly aligned with one gate of spectrum ‘1789’. In the second case (Figure 8B) of higher correlation (0.85), the two spectra have three gates, among which one is perfectly aligned, whereas the two others are slightly misaligned. Generalizing from a large number of examples that we have checked, we can conclude that if the classification algorithm of simplified spectra is parametrized with a correlation threshold of 0.6, then simplified spectra having at least three peaks in partial or perfect overlap will be grouped in the same cluster.
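A scaled‐down version of this simulation can be sketched as follows. It uses 300 spectra per peak count instead of 10,000 so that it runs in a moment, and it assumes rectangular gates of 10 cm−1 on a 1 cm−1 grid with uniformly drawn peak centres; these choices, the seed and the function name are assumptions, so the resulting fractions are not expected to reproduce the reported percentages exactly.

```python
import numpy as np


def random_simplified_spectra(n_spectra, n_peaks, shift, rng, width=10.0):
    """Binary spectra with `n_peaks` gates of `width` cm^-1 at uniformly random centres."""
    spectra = np.zeros((n_spectra, shift.size))
    centres = rng.uniform(shift[0], shift[-1], size=(n_spectra, n_peaks))
    for row, row_centres in zip(spectra, centres):
        for centre in row_centres:
            row[np.abs(shift - centre) <= width / 2.0] = 1.0  # row is a view of `spectra`
    return spectra


rng = np.random.default_rng(42)
shift = np.arange(1000.0, 1801.0)  # assumed 1 cm^-1 grid on [1000, 1800] cm^-1

# 300 spectra each with 3, 4 and 5 random peaks (reduced from 10,000 each).
spectra = np.vstack([random_simplified_spectra(300, k, shift, rng) for k in (3, 4, 5)])

rho = np.abs(np.corrcoef(spectra))            # |rho| for all pairs of spectra
pairs = rho[np.triu_indices_from(rho, k=1)]   # each pair counted once, diagonal excluded

# Fraction of pairs ("observations") whose |rho| exceeds each threshold.
fraction_above = {thr: float(np.mean(pairs > thr)) for thr in (0.6, 0.7)}
```

Inspecting `fraction_above` shows how quickly the share of highly correlated random pairs drops between 0.6 and 0.7, which is the behaviour the threshold choice relies on.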

FIGURE 8.


Example of two pairs (A and B) of simplified spectra exhibiting a Pearson correlation coefficient greater than 0.6.

The correlation threshold for clustering the simplified spectra was therefore set to 0.6 for this work. This value was obtained empirically, but it can be interpreted as follows.

For a lower threshold value, we will group spectra having fewer than three peaks in common, whereas if we increase the threshold, two spectra will be declared similar (in the same cluster) only if four peaks or more are in common. The threshold can thus be tuned around 0.6 by the user according to the criterion of similarity they want to employ for the specific purpose of the experiment, either more demanding or more relaxed. In our view, the choice of 0.6 is a good starting compromise between an overly strong correlation requirement (which tends to depopulate the clusters and increase their number) and the assurance that the spectra gathered in the same cluster have at least three vibrational peaks in correspondence. Overall, the method that we have developed and that we use for the rest of this paper is depicted in Figure 9.

FIGURE 9.

General procedure for the retained algorithmic selection of superhydrophobic/surface‐enhanced Raman spectroscopy (SH/SERS) spectra based on hierarchical clustering algorithm (HCA).
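The clustering and selection steps of this procedure can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: it takes a precomputed matrix of Pearson correlation factors, uses complete linkage (the linkage rule is our assumption for the example), and the function names and the toy correlation values are hypothetical.

```python
def cluster_by_correlation(corr, threshold=0.6):
    """Agglomerative clustering of spectra from a correlation matrix.

    Complete linkage (assumed here): two clusters are merged only if every
    cross pair of spectra correlates at or above the threshold. The most
    strongly linked pair of clusters is merged first.
    """
    clusters = [[i] for i in range(len(corr))]
    while True:
        best_link, best_pair = None, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Weakest correlation between members of the two clusters.
                link = min(corr[i][j] for i in clusters[a] for j in clusters[b])
                if link >= threshold and (best_link is None or link > best_link):
                    best_link, best_pair = link, (a, b)
        if best_pair is None:
            return clusters
        a, b = best_pair
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]

def drop_control_clusters(clusters, control_ids):
    """Keep only clusters that contain no spectrum from the control experiment."""
    control = set(control_ids)
    return [c for c in clusters if not control.intersection(c)]

# Toy correlation matrix (invented values): spectra 0-3 come from a sample,
# spectra 4-5 from the pure-solvent control.
n = 6
corr = [[1.0 if i == j else 0.1 for j in range(n)] for i in range(n)]
pairs = {(0, 1): 0.85, (0, 2): 0.64, (1, 2): 0.70,
         (3, 4): 0.80, (4, 5): 0.75, (3, 5): 0.65}
for (i, j), r in pairs.items():
    corr[i][j] = corr[j][i] = r

clusters = cluster_by_correlation(corr, threshold=0.6)
selected = drop_control_clusters(clusters, control_ids=[4, 5])
print(clusters)
print(selected)
```

In this toy case, sample spectrum 3 resembles the control spectra, so the cluster containing it is discarded, and only the cluster made exclusively of sample spectra is retained for inspection.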

6. APPLICATION OF OPTIMIZED HIERARCHICAL CORRELATION CLUSTERING TO RHODAMINE B SH/SERS DETECTION

Figure 10 shows selected spectra, one for each concentration (100 nM to 1 aM), taken from the clusters retained after hierarchical correlation clustering and selection. The associated Raman peaks are given in Figure 10B. The concentration of Rh‐B corresponding to each displayed spectrum is given in the spectrum title.

FIGURE 10.

(A) Examples of superhydrophobic/surface‐enhanced Raman spectroscopy (SH/SERS) spectra of rhodamine B taken from the clusters selected by the algorithm for each concentration investigated (10⁻⁶ to 10⁻¹⁸ mol/L); detected peaks are shown by red dots. (B) List of detected peaks and possible assignments using Refs. [19, 20]. (C) Example of a selected cluster: Cluster 182 contains 10 spectra with different nominal concentrations (100 nM, 10 nM and 1 aM).

All clusters used to create Figure 10 can be found in Section S4. Cluster 182 is displayed as an example in Figure 10C. As expected, the spectra in this cluster have several peaks in common in the frequency window of interest (1000–1800 cm⁻¹), despite very strong variations of the baseline, which sometimes completely mask the Raman peaks. Owing to its aromatic rings, Rh‐B presents a large number of Raman peaks in the 1000–1800 cm⁻¹ region. With 0.6 selected as the correlation threshold, it is possible to check that each spectrum of Figure 10A has at least three peaks that can be assigned to vibrational modes of Rh‐B reported in the literature. For instance, xanthene C–C elongation modes are found at 1189, 1358 and 1646 cm⁻¹, and the diethylamine C–H deformation peak is present at 1465 cm⁻¹. The most remarkable spectrum of Figure 10A is the one recorded during an experiment performed at 10⁻¹⁸ M (1 aM), which exhibits four peaks of Rh‐B: 1280, 1364 (C–C elongation of xanthene), 1465 (diethylamine C–H deformation) and 1646 (C–C elongation of xanthene) cm⁻¹.
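The check that a spectrum carries at least three peaks assignable to Rh‐B can be written as a simple tolerance test. The following is a hedged sketch rather than the validation code of this work: the reference list only gathers the peak positions quoted above, and the function names and the ±8 cm⁻¹ matching tolerance are assumptions made for the example.

```python
# Rh-B peak positions quoted in the text, in cm-1.
RHB_PEAKS = [1189, 1280, 1358, 1465, 1646]

def matched_peaks(detected, reference=RHB_PEAKS, tol=8.0):
    """Return the reference peaks that have a detected peak within +/- tol cm-1.

    The tolerance is an assumed value, not one taken from this work.
    """
    return [ref for ref in reference if any(abs(d - ref) <= tol for d in detected)]

def is_certified(detected, min_matches=3, tol=8.0):
    """A spectrum is kept if at least min_matches reference peaks are found."""
    return len(matched_peaks(detected, tol=tol)) >= min_matches

# Peak positions reported for the 1 aM spectrum of Figure 10A.
atto_molar_peaks = [1280, 1364, 1465, 1646]
print(matched_peaks(atto_molar_peaks))
print(is_certified(atto_molar_peaks))
```

With this tolerance, the peak detected at 1364 cm⁻¹ is matched to the 1358 cm⁻¹ xanthene mode, so the 1 aM spectrum counts four matches.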

This example of Rh‐B detection, using an SH/SERS device and an optimized hierarchical classification algorithm that accounts for the specificities of SERS spectra, illustrates the potential of this methodology for combining a very low limit of detection, discrimination from contamination and confidence in the identification of a targeted molecule.

7. DISCUSSION: TOWARDS AN SH/SERS QUANTITATIVE ANALYTICAL METHOD?

Usually, a quantitative study requires a calibration curve that relates the concentration to the variations of a metric read out from the recorded spectra. In ‘conventional’ Raman spectroscopy, this metric is the intensity of a particular vibrational peak of the targeted molecule, so the baseline extraction of the signal is of primary importance. In our case, we envision a new method of quantitative analysis in SH/SERS that does not rely on peak intensity, which at very low concentration requires generating an average spectrum over a large area of the substrate. The new paradigm we have introduced is based on recording individual Raman spectra, pixel by pixel, and classifying the whole set of spectra to select the most informative ones algorithmically. As explained before, intensity information is lost during this treatment. The metric used for calibration therefore cannot be intensity‐based; it could simply be the total number of spectra contained in the final clusters selected by the algorithm and post‐validated by the user.

This analysis is illustrated in Figure 11, which displays the evolution of the total number of representative Rh‐B spectra (exhibiting at least three peaks of the molecule) included in the selected clusters as a function of the Rh‐B concentration in ultrapure water. As expected, the number of certified Rh‐B spectra decreases when the Rh‐B concentration decreases. However, the evolution with concentration is not monotonic. For concentrations ranging from 100 to 10 nM, there is no significant evolution of the number of certified spectra. Between 10 nM and 100 pM, the number of spectra in the clusters decreases rapidly. Finally, from 100 pM downwards, we observe a plateau in the number of representative spectra, with a slight increase at 100 aM.

FIGURE 11.

Left side: experimental evolution of the total number of certified rhodamine B spectra in selected clusters as a function of rhodamine B concentration in ultrapure water. Right side: speculative evolution of the total number of certified rhodamine B spectra in selected clusters according to the hypothesis of a single‐molecule regime (see text for explanation).
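The counting metric behind this curve is straightforward to compute once the clusters have been selected and inspected. The sketch below only illustrates the bookkeeping: the function name, the data layout and the toy spectrum identifiers are hypothetical, and the numbers are invented rather than taken from Figure 11.

```python
from collections import Counter

def count_certified(selected_clusters, nominal_conc, certified_ids):
    """Count certified spectra in the selected clusters per nominal concentration.

    selected_clusters: list of lists of spectrum ids kept by the clustering step
    nominal_conc: dict mapping spectrum id -> nominal concentration label
    certified_ids: set of spectrum ids validated by direct inspection
    """
    counts = Counter()
    for cluster in selected_clusters:
        for spectrum_id in cluster:
            if spectrum_id in certified_ids:
                counts[nominal_conc[spectrum_id]] += 1
    return counts

# Toy input, invented for the example (not data from this work).
toy_clusters = [["a1", "a2", "b1"], ["b2", "c1"]]
toy_conc = {"a1": "100 nM", "a2": "100 nM", "b1": "10 nM",
            "b2": "10 nM", "c1": "1 aM"}
toy_certified = {"a1", "a2", "b1", "c1"}

counts = count_certified(toy_clusters, toy_conc, toy_certified)
for label, number in counts.items():
    print(label, number)
```

A single cluster may mix several nominal concentrations, as Cluster 182 does, which is why the count is keyed on the spectrum rather than on the cluster.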

In Figure 11, on the right side, we speculate about what could be an SH/SERS quantification curve based on the analysis of the number of certified spectra in the algorithmically selected clusters. Three regimes can be distinguished: the saturation regime at high concentrations, the quantification regime (1 µM to 1 pM) and, finally, the detection regime, when single‐molecule configurations dominate.

We hypothesize that this observation reflects a change of regime in the way molecules are concentrated and adsorbed at the surface of the analytical pedestal of our SH/SERS devices. At very low concentrations, here below 100 pM in the case of Rh‐B, a so‐called single‐molecule regime is reached, meaning that on average less than a monolayer of targeted molecules is deposited on the SERS surface. It is quite intuitive that in this regime, the number of ‘certified’ spectra (spectra that are selected by the algorithm and validated by direct inspection) is mainly governed by the surface density of hot spots or, more precisely, by the probability that a single molecule lies in the proximity of a hot spot. We hypothesize that the number of these random coincidence events does not scale linearly with the initial concentration of the sample, which is why a plateau in the number of certified spectra is reached. In other words, in this regime, the dispersion of the local electromagnetic enhancement factors, the morphological heterogeneity of the hot spots and the diversity of the molecular conformations resulting from the adsorption are ultimately as important as the concentration of the solution. Note also that in these figures, we have reported the nominal concentration resulting from the protocol of successive dilutions, which is affected by uncertainties on the real concentrations of the drops evaporated on the devices.
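To give an order of magnitude for this regime, the number of molecules contained in a drop follows from N = C × V × N_A. The sketch below is only an illustration: the drop volume of 10 µL is an assumption made for the example (it is not stated in this section), and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)

def molecules_in_drop(conc_molar, volume_litres):
    """Number of molecules N = C * V * N_A in a drop of given volume."""
    return conc_molar * volume_litres * AVOGADRO

ASSUMED_DROP_VOLUME_L = 10e-6  # 10 microlitres, an assumed value

for label, conc in [("100 pM", 100e-12), ("1 fM", 1e-15), ("1 aM", 1e-18)]:
    n = molecules_in_drop(conc, ASSUMED_DROP_VOLUME_L)
    print(f"{label}: {n:.3g} molecules in the drop")
```

Under this assumed volume, a 1 aM drop contains only a handful of molecules, far fewer than the roughly 2200 pixels scanned on the pedestal, which is consistent with a count dominated by rare coincidences between a molecule and a hot spot.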

8. CONCLUSION

In this paper, we have developed an algorithmic methodology for treating SERS spectra obtained by scanning, pixel by pixel, the analytical surface of a fluidic SH concentrator device. The analytical zone is covered by integrated silver (Ag) nano‐antennas obtained by thin Ag film deposition over etched silicon nano‐pillars. The evaporation of a sessile droplet of solution over this device drives all the molecules contained in the droplet to an 80 µm diameter circular analytical zone. Then, by probing the entire analytical area, all the hot spots of the surface where the molecules of interest are randomly distributed are recorded. The obtained spectra are then algorithmically selected to reveal the presence of the targeted molecules down to sub‐femtomolar concentrations. The classification algorithm of these so‐called SH/SERS spectra is based on hierarchical clustering. The proposed algorithm classifies the spectra based on the similarity of their vibrational Raman peaks while being insensitive to the fluctuations of the baseline of the spectra. The method gives us confidence in the identification of the targeted molecules down to the atto‐molar range because we only retain spectra that are not classified in a cluster containing any spectrum of a control experiment performed with ultrapure DI water, that is, with a nominal null concentration of targeted molecules.

Altogether, the results presented in this paper show that the method enables the detection of Rh‐B diluted at 10⁻¹⁸ M, with an identification of the molecule based on at least three vibrational peaks and not only one as in previously published works [22]. In summary, we propose a method for titrating a solution at a very low concentration that is not based on peak intensity but on the number of SERS spectra selected by the algorithm and certified by their physical relevance.

9. EXPERIMENTAL SECTION

9.1. Fabrication of a hierarchical superhydrophobic surface‐enhanced Raman scattering device

The device is fabricated on a 4‐in. single‐crystal silicon (001) substrate. Figure 12 represents the different fabrication steps. The first step consists of photolithography, which transfers the patterns of the mask into a photosensitive resin (Figure 12a). To do this, the wafer is first subjected to an O2 plasma (5 min, 800 W) and primed with hexamethyldisilazane, which is an adhesion promoter. A positive tone resist (AZ ECI 300) is then spin‐coated on the substrate to a thickness of 1.1 µm. This resist is exposed to UV illumination through the mask with an MA6 gen 4 optical aligner (dose 150 mJ/cm²), then annealed at 90°C for 60 s. Figure 12b shows the structure after the development of the resin (25 s in MFCD‐26). Figure 12c corresponds to the etching and the removal of the resist layer, which transfers the patterns into the silicon. Using a Bosch etching process (Alcatel AMS4200), an etching depth of 6 µm is obtained in 100 s. This process combines 2.5 s cycles of SF6 (250 sccm) and 1.5 s cycles of C4F8 (250 sccm) and O2 (38 sccm). The RF power is 2500 W, whereas the pulsed RF power is 70 W for 20 ms. The reactor temperature is set to 10°C and the pressure is set to 5 × 10⁻² mbar. The resin is then removed by soaking the wafer in acetone for 2 min and then rinsing it with deionized water. An O2 plasma (5 min, 800 W) is then applied to remove all traces of resist. Figure 12I is a scanning electron microscope (SEM) image of the centre of the surface after the first two steps. We can observe the 80 µm diameter central pedestal, the 3 µm micro‐pillars as well as the guiding lines.
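The etching figures quoted above can be cross‐checked with a short calculation. This is a rough sketch under the assumption that the 2.5 s SF6 steps and the 1.5 s C4F8/O2 steps strictly alternate without overlap or idle time; the function name and the returned keys are hypothetical.

```python
def bosch_summary(depth_um=6.0, total_s=100.0, etch_s=2.5, passivation_s=1.5):
    """Derive cycle count, depth per cycle and mean etch rate.

    Assumes one cycle = one etch step followed by one passivation step.
    """
    cycle_s = etch_s + passivation_s
    n_cycles = total_s / cycle_s
    return {
        "cycles": n_cycles,
        "depth_per_cycle_um": depth_um / n_cycles,
        "rate_um_per_min": depth_um / total_s * 60.0,
    }

summary = bosch_summary()
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

Under this assumption, the 100 s process corresponds to 25 cycles, that is about 0.24 µm etched per cycle and a mean rate of about 3.6 µm/min.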

FIGURE 12.

Schematic diagram of the clean room process used to produce a device combining surface‐enhanced Raman spectroscopy (SERS) nano‐antennas and a superhydrophobic surface.

The Alcatel AMS4200 is also used to create the silicon nano‐pillars through a maskless etching process called ‘black silicon’. The plasma is performed at 20°C, the gas flow rates are set at 38 sccm for SF6 and 45 sccm for O2, and it lasts 5 min with 600 W RF power and 30 W RF power. The pressure is set to 10⁻² mbar. Figure 12II shows the edge of the pedestal after the maskless etching. We can observe nano‐pillars present on the entire surface. The surface is then subjected to an argon plasma (100 W, 60 s) followed by an O2 plasma (800 W, 5 min) to remove any etching contamination. A 100 nm layer of silver is then deposited on a 5 nm Ti bonding layer using the EVA 600 vacuum thermal evaporation machine (Figure 12f).

The resulting structure is not SH at this stage because the surface of the slightly oxidized silicon substrate is hydrophilic. To render the entire surface superhydrophobic except for the central pedestal, an FDTS lift‐off is performed. The idea is to pattern a resist pad on the pedestal, perform the FDTS deposition to make all the remaining surface SH and then remove the resin from the pedestal. To do this, a layer of MicroChemicals AZ499 resist is spray‐coated (SUSS MicroTec automatic spray‐coating system) and baked at 90°C for 2 min. The MA6 gen 4 is used to expose the resist with a dose of 600 mJ/cm² through a new mask aligned with the previous patterns, which defines the resist pad on the central pedestal. The resist is then developed with the EVG 120. The state of the surface is schematized in Figure 12h, where the orange layer represents the resin covering the central pedestal. FDTS is vapour‐deposited using the ORBIS 1000 platform (40°C with a chamber pressure of 40 Torr; deposition time is 5 min). The flow rate of water and FDTS is set at 40 sccm. The wafer is then immersed in dimethyl sulphoxide at a high temperature (80°C) for 4 h to remove the resin.

At the end of the process (Figure 12i), FDTS is present on the entire surface (in red on the image) except on the pedestal. By observing the structure with an SEM (Figure 12III), we can observe silver‐coated silicon nano‐pillars. The nano‐pillars form bundles: they are bent when the resist is removed.

9.2. Sample preparation

Solutions of Rh‐B (purchased from Sigma‐Aldrich) were prepared by cascade dilution (dilution factor 10, using ultrapure water from the clean room facilities) from a mother solution at 1 mM. Solutions from 1 µM to 1 aM were prepared and used for the experiments, using sterile Eppendorf tubes and pipette tips.
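The cascade dilution described above can be laid out numerically. The following minimal sketch lists the nominal concentrations obtained from the 1 mM mother solution with a dilution factor of 10; the function names are hypothetical, and the values are nominal ones that ignore pipetting uncertainties.

```python
from math import log10

def steps_needed(stock_molar, target_molar, factor=10.0):
    """Number of successive dilutions to go from the stock to the target."""
    return round(log10(stock_molar / target_molar) / log10(factor))

def dilution_series(stock_molar, factor, steps):
    """Nominal concentrations after 0, 1, ..., steps successive dilutions."""
    return [stock_molar / factor ** k for k in range(steps + 1)]

STOCK = 1e-3    # 1 mM mother solution
TARGET = 1e-18  # 1 aM, lowest concentration used

n_steps = steps_needed(STOCK, TARGET)
series = dilution_series(STOCK, 10, n_steps)
print(n_steps)
for k, conc in enumerate(series):
    print(f"step {k:2d}: {conc:.0e} M")
```

Reaching 1 aM from 1 mM therefore takes 15 successive tenfold dilutions, so any error in an individual step propagates through the whole chain down to the lowest concentrations.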

9.3. Measurement

The Raman spectrometer is a HORIBA Jobin Yvon instrument from the XploRA range, equipped with an Olympus BXF optical microscope with three objectives (×100 NA 0.9, ×50 NA 0.5 and ×10 NA 0.2). The acquisition is done in backscattering mode, that is, the objective is used both to focus the laser and to collect the scattered light. The Nd:YAG 532 nm laser, with an output power of 10 mW, is focused into a 2 µm diameter spot on the surface through the ×50 objective (numerical aperture NA = 0.5). Neutral density filters are used to reduce the power to 0.1 mW on the sample. The light scattered by the sample is dispersed by a grating with 600 lines per mm onto a CCD camera (1024 × 256 pixels) cooled to −60°C by a Peltier module. This combination of laser and grating allows a single acquisition to cover a spectral window extending from 100 to 5000 cm⁻¹. A single acquisition is made per spectrum, with an integration time of 250 ms. The pixel size of the mapping is 1.5 × 1.5 µm². It is thus possible to map the pedestal by acquiring 2200 spectra in approximately 20 min. The pedestal of each device was mapped under these conditions, except for the 100 and 10 nM concentrations, for which the parameters were, respectively, 250 ms of acquisition with 0.01 mW on the sample and 100 ms of acquisition with 0.1 mW on the sample. Indeed, for concentrations from 100 to 1 nM, the signal emitted by the sample saturated the detector; it was, therefore, necessary to reduce either the exposure time or the laser power. As we acquire 2200 spectra per experiment, we have to use an algorithmic method to retrieve the pertinent information.
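The mapping figures quoted above can be recovered from the geometry of the pedestal. The sketch below is a back‐of‐the‐envelope check with hypothetical function names: it assumes a circular 80 µm zone tiled by square pixels, and it also converts the limits of the spectral window into absolute wavelengths for a 532 nm excitation.

```python
from math import pi

def map_estimate(diameter_um=80.0, pixel_um=1.5, integration_s=0.25):
    """Estimate the number of spectra and the pure exposure time of one map."""
    area_um2 = pi * (diameter_um / 2.0) ** 2
    n_spectra = area_um2 / pixel_um ** 2
    return n_spectra, n_spectra * integration_s

def shift_to_wavelength_nm(shift_cm1, laser_nm=532.0):
    """Absolute wavelength of Stokes light at a given Raman shift.

    Uses 1 cm^-1 = 1e-7 nm^-1.
    """
    return 1.0 / (1.0 / laser_nm - shift_cm1 * 1e-7)

n_spectra, exposure_s = map_estimate()
print(f"{n_spectra:.0f} spectra, {exposure_s / 60:.1f} min of exposure")
print(f"{shift_to_wavelength_nm(100):.1f} nm to {shift_to_wavelength_nm(5000):.1f} nm")
```

The estimate gives a little over 2200 spectra, in line with the number quoted above. The pure exposure time is roughly half of the reported 20 min, the remainder presumably being acquisition overhead such as stage displacement and detector readout.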

CONFLICT OF INTEREST STATEMENT

The authors declare no conflicts of interest.

Supporting Information

Figure S1/Figure S2/Figure S3/Figure S4 – Baseline issue/Baseline removal method/Pearson correlation factors of simplified spectra/Clusters used to create Figure 10 (file type.docx)

ACKNOWLEDGEMENTS

This work was supported by LAAS‐CNRS micro and nanotechnologies platform, a member of the Renatech French National network. The CNES (Centre National d'Etudes Spatiales) is gratefully acknowledged for financial support.

Fabre V, Carcenac F, Laborde A, et al. A new methodology for sub‐femtomolar detection of organic molecules through the combination of surface‐enhanced Raman spectroscopy and a superhydrophobic fluidic concentrator. Anal Sci Adv. 2025;6:e202400013. 10.1002/ansa.202400013

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are available from the corresponding author upon reasonable request.

REFERENCES

  • 1. Gentile F, Coluccio ML, Coppedè N, et al. Superhydrophobic surfaces as smart platforms for the analysis of diluted biological solutions. ACS Appl Mater Interfaces. 2012;4:3213‐3224.
  • 2. Luo X, Pan R, Cai M, et al. Atto‐molar Raman detection on patterned superhydrophilic‐superhydrophobic platform via localizable evaporation enrichment. Sens Actuators B Chem. 2021;326:128826.
  • 3. Yu J, Wu J, Yang H, et al. Extremely sensitive SERS sensors based on a femtosecond laser‐fabricated superhydrophobic/‐philic microporous platform. ACS Appl Mater Interfaces. 2022;14:43877‐43885.
  • 4. De Angelis F, Gentile F, Mecarini F, et al. Breaking the diffusion limit with super‐hydrophobic delivery of molecules to plasmonic nanofocusing SERS structures. Nature Photon. 2011;5:682‐687.
  • 5. Merlen A, Pardanaud C, Gratzer K, et al. Spectral fluctuation in SERS spectra of benzodiazepin molecules: the case of oxazepam. J Raman Spectrosc. 2020;51:2192‐2198.
  • 6. Balazs MK. Measuring and identifying particles in ultrapure water. In: Mittal KL, ed. Particles in Gases and Liquids 1: Detection, Characterization, and Control. Springer; 1989:35‐50. doi: 10.1007/978-1-4613-0793-8_2
  • 7. Kulakov LA, Mcalister MB, Ogden KL, Larkin MJ, O'hanlon JF. Analysis of bacteria contaminating ultrapure water in industrial systems. Appl Environ Microbiol. 2002;68:1548‐1555.
  • 8. Zhang X, Yang Y, Ngo HH, et al. A critical review on challenges and trend of ultrapure water production process. Sci Total Environ. 2021;785:147254.
  • 9. Zhao P, Bai Y, Liu B, Chang H, Cao Y, Fang J. Process optimization for producing ultrapure water with high resistivity and low total organic carbon. Process Saf Environ Prot. 2019;126:232‐241.
  • 10. Jan T‐K, Lin H‐T, Chen H‐P, et al. Cost‐sensitive classification on pathogen species of bacterial meningitis by surface enhanced Raman scattering. In: 2011 IEEE International Conference on Bioinformatics and Biomedicine. IEEE; 2011:390‐393. doi: 10.1109/BIBM.2011.133
  • 11. Huang C‐Y, Tsai T‐H, Wen B‐C, et al. Hybrid SVM/CART classification of pathogenic species of bacterial meningitis with surface‐enhanced Raman scattering. In: 2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE; 2010:406‐409. doi: 10.1109/BIBM.2010.5706600
  • 12. Lim J‐Y, Nam J‐S, Shin H, et al. Identification of newly emerging influenza viruses by detecting the virally infected cells based on surface enhanced Raman spectroscopy and principal component analysis. Anal Chem. 2019;91:5677‐5684.
  • 13. Botta R, Chindaudom P, Eiamchai P, et al. Tuberculosis determination using SERS and chemometric methods. Tuberculosis. 2018;108:195‐200.
  • 14. Kim W, Lee SH, Kim JH, et al. Paper‐based surface‐enhanced Raman spectroscopy for diagnosing prenatal diseases in women. ACS Nano. 2018;12:7100‐7108.
  • 15. Banaei N, Moshfegh J, Mohseni‐Kabir A, Houghton JM, Sun Y, Kim B. Machine learning algorithms enhance the specificity of cancer biomarker detection using SERS‐based immunoassays in microfluidic chips. RSC Adv. 2019;9:1859‐1868.
  • 16. Fabre V, Carcenac F, Laborde A, et al. Hierarchical superhydrophobic device to concentrate and precisely localize water‐soluble analytes: a route to environmental analysis. Langmuir. 2022;38:14249‐14260.
  • 17. Berkhin P. A survey of clustering data mining techniques. In: Kogan J, Nicholas C, Teboulle M, eds. Grouping Multidimensional Data: Recent Advances in Clustering. Springer; 2006:25‐71. doi: 10.1007/3-540-28349-8_2
  • 18. Cocal‐Lopez N. GitHub – Nicocopez/Outlier_detection_for_Spikes_Removal_from_Raman_Spectra: Whitaker and Hayes' Approach to Remove Spikes from Raman Spectra Uses an Outlier Detection Technique. https://github.com/nicocopez/Outlier_detection_for_Spikes_Removal_from_Raman_Spectra/tree/master
  • 19. Whitaker DA, Hayes K. A simple algorithm for despiking Raman spectra. Chemometrics and Intelligent Laboratory Systems. 2018:82‐84.
  • 20. Zhang Z‐M, Tong X, Peng Y, et al. Multiscale peak detection in wavelet space. Analyst. 2015;140:7955‐7964.
  • 21. Giguere S, Carey C, Boucher T, Mahadevan S, Dyar MD. An optimization perspective on baseline removal for spectroscopy. In: Proceedings of the 5th IJCAI Workshop on Artificial Intelligence in Space. 2015.
  • 22. Lee HK, Lee YH, Zhang Qi, et al. Superhydrophobic surface‐enhanced Raman scattering platform fabricated by assembly of Ag nanocubes for trace molecular sensing. ACS Appl Mater Interfaces. 2013;5:11409‐11418.

Articles from Analytical Science Advances are provided here courtesy of Wiley
