Nature Communications
2025 Dec 4;16:10886. doi: 10.1038/s41467-025-65872-z

Bio-inspired cross-modal super-additive plasticity for seamless visual processing-in-sensory and -in-memory

Xiong Xiong 1,, Tianyue Fu 1, Chengru Gu 2, Qijun Li 2, Honggang Liu 2, Xin Wang 1, Jiyang Kang 2, Shiyuan Liu 1, Yufan Wang 3, Dong Li 3, Xiao Wang 3, Anlian Pan 3, Yanqing Wu 1,2,
PMCID: PMC12678436  PMID: 41345098

Abstract

Bio-inspired cross-modal visual perception hardware offers potential for edge intelligence. However, physical implementations of such hardware with conventional optoelectronics typically yield only linear combinations of functions, lacking super-additive integration. Here, inspired by the primary cortex of the biological brain, we design a hardware platform based on a molybdenum disulfide channel for processing-in-sensory and -in-memory. Cross-modal correlated photoelectric signal processing is demonstrated by exploiting electric-field-assisted photogenerated-carrier tunneling in a floating-gate photoelectric device array. The devices exhibit super-additive synergistic behavior of up to 10³ times and significant time-dependent plasticity for visual encoding and perception enhancement. After sensory preprocessing, patterns are accurately routed and recognized by a non-volatile four-transistor ternary content-addressable memory circuit array. The cell maintains a large resistance ratio of 10⁵ and a high lookup durability of 10¹². The hardware platform of cross-modal visual perception empowers seamless visual processing-in-sensory and -in-memory, providing potential for ubiquitous visual edge intelligent systems.

Subject terms: Electrical and electronic engineering, Sensors


Xiong et al. report a floating-gate photoelectric array based on a MoS2 channel layer. By regulating the tunneling efficiency of electric-field-assisted photogenerated carriers, it simulates the cross-modal correlation plasticity observed in the primary cortex of the brain, enabling visual perception hardware for secure image coding.

Introduction

Recent studies have revealed that the primary sensory cortex can respond to stimuli outside its main sensory modality, challenging the longstanding notion that it exclusively processes one sensory mode[1,2]. Cross-modal multisensory information provides richer, more accurate responses, enabling organisms to achieve more efficient perception in shorter time frames and at lower detection thresholds[3]. For instance, animals use correlated visual and auditory perception for more accurate environmental awareness, motion navigation, and predator detection[4–6]. Cross-modal correlated responses also facilitate information re-encoding and integration, as confirmed by classic effects such as the McGurk effect[7], the sound-induced flash illusion[8], and the rubber hand illusion[9]. Although neuroscience has extensively studied cross-modal associated perception, challenges remain in developing the mechanisms and physical implementations of bio-inspired solid-state electronic hardware that capture the super-additive nature of cross-modal perception for high-efficiency artificial intelligence.

Visual information occupies a significant proportion of Internet of Things applications. Traditional information processing techniques face challenges of data-chain redundancy, high energy consumption, and low efficiency[10,11]. Significant efforts have been invested in bionic visual hardware, such as biomimetic visual preprocessing techniques that reduce redundant data and improve downstream recognition efficiency[12–14], high-speed in-situ photoelectric neural network training and inference[15], visual moving-object detection[16,17], and various electronic or optical synapses[18–20]. However, most of these studies focus on single-channel responses to light or electricity, lacking effective processing of cross-modal information. Some research has demonstrated hardware that can be stimulated by multiple information modalities[21–23], such as visual (optical signals) combined with tactile or auditory inputs (usually coupled as electrical signals): for example, nanowire transistors that respond to light and stretching, triboelectric sensors that respond to light and tactile stimuli[24], and perovskite artificial synapses that respond to light and electricity[25]. However, these studies involve only linear combinations of stimuli and do not demonstrate the natural super-additive advantage of bionic cross-channel plasticity. Recent research simulated multisensory integration of optical and tactile perception using digital spiking events from triboelectric sensors and phototransistors[26], but information encoding and functional reconstruction still need further development to enable efficient machine-vision encoding and seamlessly integrated sensing-computation platforms.

In this work, we fabricate a floating-gate photoelectric device array based on a chemical vapor deposition (CVD) monolayer of molybdenum disulfide (MoS2) and gold as the channel and floating-gate components. By regulating the tunneling efficiency of electric-field-assisted photogenerated carriers, we simulate the cross-modal correlation plasticity observed in the primary cortex of the biological brain. The photoelectric sensor devices show over 1000 times super-additive synergistic behavior and significant time-dependent plasticity, which is then used for energy-efficient and secure image coding. Preprocessed visual images can be further routed and searched using a non-volatile four-transistor ternary content-addressable memory (4T-TCAM) cell array based on MoS2 floating-gate transistors (FGTs), which exhibits a resistance ratio of up to 10⁵ between matching and mismatching states and a lookup durability of up to 10¹². The hardware platform of cross-modal visual perception empowers seamless visual processing-in-sensory and -in-memory, inspiring the development of low-power edge computing platforms for artificial intelligence.

Results and discussion

Cross-modal correlation perception and design

Various sensory stimuli provide distinct dimensions of information[27]. When faced with different sensory elements, the brain’s perceptual cortex does not simply add them together but carries out cross-sensory correlation and integration to form a unified experience[27,28]. The well-known McGurk effect exemplifies this phenomenon (Fig. 1a). When the visual presentation of the Ba mouth shape coincides with the auditory transmission of the Ga sound, subjects believe they hear a third sound, Da, due to the correlation of visual and auditory information[7]. Similarly, maintaining the auditory Ba sound while displaying different mouth shapes leads subjects to perceive different sounds: if subjects see the Ba mouth shape while hearing the sound, they think they hear Ba, and if they see the Fa mouth shape, they think they hear Fa. This accomplishes, to some extent, visual encoding of the perceived auditory signal[29].

Fig. 1. Cross-modal correlation perception and design.


a Schematic of the McGurk effect: auditory perception is influenced by visual association, producing an effect similar to information decoding. b Cross-modal associative perception of concepts and features: multi-channel information is coordinated and coupled in the primary cortex, integrating into new perceptions that include the super-additive effect and the time dependence of information coordination. c Solid-state electronic device arrays simulate cross-modal correlation response features to achieve efficient information integration and coding. d Simulating the cross-modal correlation response through the electric-field-assisted photogenerated-carrier tunneling process, where the use of monolayer molybdenum disulfide helps to generate photogenerated carriers efficiently.

With the development of biological understanding, it is now known that these cross-sensory couplings can occur in the primary sensory cortex without the involvement of higher brain regions[30,31]. Research indicates that such cross-modal perception exhibits evident spatiotemporal coupling characteristics, where two signals can produce responses that exceed the sum of their individual responses, achieving super-additive new features beyond threshold[26,32]. This behavior also demonstrates a time-synchronization effect[33]: multiple stimuli fail to produce associated responses when their separation exceeds a certain time range (typically tens to hundreds of milliseconds[8,34]), as depicted in Fig. 1b. Using solid-state electronic device arrays to simulate cross-modal correlation response features such as the McGurk effect, and thereby achieve efficient information integration and coding (as shown in Fig. 1c), is expected to reduce the overall power consumption. The mechanism of electric-field-assisted photogenerated-carrier tunneling through a barrier can emulate this bionic behavior at the device level, as illustrated in Fig. 1d. In the barrier tunneling model, photo-induced non-equilibrium carriers in a semiconductor can only tunnel inefficiently in the absence of an applied electric field because the energy barrier is high and wide. The carrier energy can be estimated from the difference between the photon energy (hν) and the excitation barrier (barrier 1 in Fig. 1d) within the semiconductor bandgap, while a second large barrier (barrier 2 in Fig. 1d) blocks carrier tunneling; barriers 1 and 2 together determine the tunneling probability. Correspondingly, in the absence of light but with an applied electric field, carriers at the Fermi level lack the energy for high-probability tunneling. However, when light and an electric field are applied simultaneously, photo-induced carriers with sufficient energy can tunnel through the electric-field-induced triangular barrier via Fowler-Nordheim (FN) tunneling, significantly increasing the tunneling efficiency. The number of tunneling carriers also correlates positively with the time overlap between the electric field and the illumination, demonstrating strong time-correlation coupling.
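The field-assisted tunneling picture above can be sketched numerically. The toy model below uses the standard Fowler-Nordheim form J ∝ E² exp(−Bφ³ᐟ²/E) with purely illustrative barrier heights and prefactors (not values extracted from these devices) to show why co-stimulation yields a far larger tunneling current than either stimulus alone:

```python
import math

def fn_current(e_field, barrier_ev, a=1e-6, b=6.83e9):
    """Toy Fowler-Nordheim current density J = a*E^2*exp(-b*phi^1.5/E).
    e_field in V/m, barrier in eV; a and b are illustrative constants."""
    if e_field <= 0:
        return 0.0
    return a * e_field**2 * math.exp(-b * barrier_ev**1.5 / e_field)

# Illustrative energies (assumptions, not measured device values):
barrier_dark = 3.0          # eV, full barrier seen by equilibrium carriers
photon_ev = 2.6             # eV, 470 nm photon energy
excitation_ev = 1.8         # eV, "barrier 1": excitation barrier in the bandgap
barrier_lit = barrier_dark - (photon_ev - excitation_ev)  # hot carriers see less

e_gate = 5e8                # V/m, assumed field across the tunnel oxide

i_field_only = fn_current(e_gate, barrier_dark)   # field, no light: tiny
i_light_only = fn_current(0.0, barrier_lit)       # light, no field: ~zero
i_co = fn_current(e_gate, barrier_lit)            # light + field: FN tunneling

ratio = i_co / (i_field_only + i_light_only + 1e-300)
print(f"super-additive ratio ~ {ratio:.1e}")
```

Because the barrier enters the exponent, even a modest photon-assisted reduction of the effective barrier multiplies the tunneling current by many orders of magnitude, which is the qualitative origin of the super-additive response.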

To achieve an efficient photogenerated-carrier system, this work utilizes CVD-grown monolayer molybdenum disulfide, a transition metal dichalcogenide, as the device channel to construct a floating-gate transistor. Monolayer molybdenum disulfide exhibits robust light-matter interaction and a direct bandgap, showcasing exceptional performance in optoelectronic devices[35,36]. Additionally, its ultra-thin body, low-temperature fabrication, and manufacturing scalability confer unique advantages in semiconductor devices, and it is widely studied in logic, memory, and sensing applications[37–43].

Device structure and basic characteristics

We approach the seamless sensing-memory-processing scheme by exploiting monolayer MoS2 floating-gate transistors as the core components. Figure 2a shows the device layout. To achieve monolithic integration of the processing-in-sensor and computing-in-memory basic components, local gate electrodes are predefined on a silicon wafer covered by an insulating oxide layer, and 25 nm of hafnium oxide is then deposited on the gate electrodes as a blocking oxide layer. A patterned 2 nm gold layer is then deposited on the gate region to form the floating gate, followed by 9 nm of hafnium oxide as the tunnel oxide. CVD-grown monolayer MoS2 single crystals of suitable density are transferred onto the substrate as the active channel layer (see Supplementary Fig. 1 for material characterization). The channel region is defined by etching. Channels under a floating gate implement memory devices, while channels beneath a gate electrode without a floating gate implement transistors. Finally, source/drain electrodes and interconnections are patterned. The device fabrication process flow is shown schematically in Supplementary Fig. 2. Figure 2b shows an enlarged image of the MoS2 FGT unit array for processing-in-sensor, Fig. 2c shows a 4T-TCAM unit consisting of two MoS2 transistors and two MoS2 FGTs for computing-in-memory, and Fig. 2d, e show enlarged microscope images of the channel regions of the MoS2 FGT and MoS2 transistor, respectively. The electrical properties of the single basic MoS2 FGT components (schematic in Fig. 2f) are investigated first. The MoS2 FGTs show hysteretic transfer characteristics and non-volatile storage that can be programmed by Vg pulses owing to charge stored in the floating-gate layer; the results are presented in Supplementary Figs. 3 and 4.
The memory behavior of MoS2 FGTs is further demonstrated in Supplementary Fig. 5, and the underlying principle is described in Supplementary Note 2. The statistical electrical variability of separate MoS2 FGTs, with a median program/erase current ratio of about 10⁶, provides a solid foundation for subsequent memory applications, as shown in Fig. 2g. As a memory-based neural component, continuous potentiation and depression of conductance can also be achieved through continuous voltage stimulation; the relevant results are shown in Supplementary Fig. 6. Moreover, a demonstration of the interconnection of 4 × 4 cells is presented in Supplementary Fig. 7.
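The programming window of a floating-gate device can be related to the stored charge through the textbook relation ΔVth = −Q_FG/C_block, with C_block the blocking-oxide capacitance. The sketch below applies this formula; the gate area, HfO2 permittivity, and trapped-charge density are illustrative assumptions, not parameters reported in this work:

```python
# Toy estimate of the threshold-voltage shift from floating-gate charge:
# dVth = -Q_fg / C_block. All numbers are illustrative assumptions.
EPS0 = 8.854e-12            # vacuum permittivity, F/m
eps_hfo2 = 16.0             # assumed HfO2 relative permittivity
t_block = 25e-9             # m, blocking-oxide thickness (from the text)
area = (10e-6) ** 2         # m^2, assumed 10 um x 10 um floating-gate area

c_block = EPS0 * eps_hfo2 * area / t_block   # blocking-oxide capacitance, F
n_trapped = 5e11 * 1e4                       # assumed 5e11 electrons/cm^2, per m^2
q_fg = -1.602e-19 * n_trapped * area         # stored electron charge, C
dvth = -q_fg / c_block                       # positive shift for stored electrons
print(f"C_block = {c_block:.2e} F, dVth = {dvth:+.3f} V")
```

With these placeholder numbers the shift is a fraction of a volt; in the real device the measured hysteresis window (Supplementary Figs. 3 and 4) fixes the actual stored charge.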

Fig. 2. Device structure and basic characteristics.


a Microscope image of the optical sensor array and the 4T-TCAM circuits. Scale bar: 200 μm. Enlarged images of (b) MoS2 floating-gate transistor cells and (c) a 4T-TCAM cell in (a). Scale bar: 50 μm. Magnified channel regions of the (d) MoS2 floating-gate transistor and (e) MoS2 transistor. Scale bar: 20 μm. f Schematic of the basic MoS2 floating-gate transistor component. g Cumulative probability distribution of high-resistance states (HRS, blue symbols) and low-resistance states (LRS, red symbols) for multiple MoS2 FGT devices, with current extracted at Vg = 0 V and Vd = 1 V. μ, mean value; σ, standard deviation. h Optical response characteristics under LED light sources with wavelengths of 470 nm (blue line), 530 nm (cyan line), and 625 nm (red line). The incident light power intensity is 9.5 mW cm⁻². i Schematic of a single TCAM circuit cell based on MoS2 floating-gate transistors. j Repeated reading of the output current (Vbias = 1 V) of the TCAM cell between searching bit 1 and 0 (period of 2 s) with storage bit 0 (black line) and 1 (red line). k Read endurance of the 4T-TCAM with storage bit 0 at Vbias = 2 V, using a 100 ns read pulse width of Vg applied repeatedly to the MoS2 transistor. The red symbols represent matching output, and the black symbols represent mismatching output. After the endurance test, the resistance ratio can still reach 10⁵ after rewriting the storage bit data into the MoS2 FGTs (blue symbols, matching output; olive-green symbols, mismatching output).

Moreover, as a processing-in-sensor unit, the MoS2 FGTs also show optical responses under light-emitting diode (LED) sources with wavelengths of 470 nm, 530 nm, and 625 nm, demonstrating the potential for broadband detection. The photo-response characteristics of the device are shown in Fig. 2h, and further photodetector results are shown in Supplementary Fig. 8. The maximum responsivity and detectivity under a 470 nm LED are 1.7 × 10⁴ A W⁻¹ and 1.36 × 10¹¹ Jones, respectively. The leakage current can be neglected during device operation thanks to the thick blocking oxide layer of the MoS2 FGT device, as shown in Supplementary Fig. 9. Based on MoS2 FGTs, another basic circuit unit, the ternary content-addressable memory (schematic in Fig. 2i), is also studied. As dedicated hardware for in-memory searching and matching, the definitions of the storage states and operation rules for pattern matching, as well as the non-volatility of the 4T-TCAM cell, are described in Supplementary Fig. 10 and Supplementary Note 1. The sequence of search signals is shown in Fig. 2j with a period of 2 s and a constant bias voltage (Vbias) of 1 V. The matched output current is about 10⁻¹⁰ A and the mismatched output is about 10⁻⁵ A when the stored state is bit 1 or 0. The high resistance ratio (Rratio = Rmatch/Rmismatch) of up to 10⁵ between match and mismatch improves the sensing margin in data-intensive applications and compares favorably with the best reported NV-TCAM results[44–46]. As a read-dominant device, the read endurance of the 4T-TCAM is also investigated and presented in Fig. 2k, with stable reading cycles up to 10¹² (Supplementary Fig. 11 for more details). Moreover, the programming and erasing endurance of the stored bits in the TCAM is also characterized in Supplementary Fig. 12.
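The responsivity and detectivity quoted above follow from the standard photodetector figures of merit R = Iph/(P·A) and D* = R·√A/√(2qIdark). The sketch below applies these textbook formulas with placeholder currents and an assumed pixel area; it illustrates the calculation only, not the measured device values:

```python
import math

# Textbook photodetector figures of merit with placeholder numbers
# (illustrative assumptions, not the measured device values).
q = 1.602e-19                   # elementary charge, C
area_cm2 = (20e-4) ** 2         # assumed 20 um x 20 um pixel, in cm^2
p_density = 9.5e-3              # W/cm^2, incident power density as in Fig. 2h
i_ph = 6.5e-7                   # A, assumed photocurrent
i_dark = 1e-11                  # A, assumed dark current

responsivity = i_ph / (p_density * area_cm2)   # R = Iph / (P * A), in A/W
detectivity = responsivity * math.sqrt(area_cm2) / math.sqrt(2 * q * i_dark)
print(f"R = {responsivity:.2e} A/W, D* = {detectivity:.2e} Jones")
```

Floating-gate photogain (one trapped charge modulating many channel carriers) is what pushes the real device far above the responsivity a simple photodiode of this area would give.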

Super-additive electro-optic devices

The monolayer MoS2 FGTs exhibit an excellent photo-response under illumination by a blue light-emitting diode (470 nm), thanks to the direct bandgap of monolayer MoS2 (Supplementary Fig. 13). Unlike conventional image sensor units, whose response does not depend on irradiation time[12], the MoS2 FGT image sensors manifest a light-dosage-dependent output under illumination and exhibit synaptic plasticity features, as shown in Supplementary Figs. 13 and 14. Interestingly, applying a gate voltage pulse synchronously with the light stimulation enhances the photo-response, realizing in-situ image preprocessing without additional hardware cost. Figure 3a compares the real-time monitored output current for different stimuli applied 30 times with a period of 1 s and a duration of 10 ms. The timing diagram of the LED light pulse (power intensity 16.7 mW cm⁻²) and the gate voltage pulse (amplitude −2 V) is shown in Fig. 3b. The output photocurrent under the synergistic light-and-voltage stimulus is much larger than the output current when the optical or electrical signal is applied separately, indicating a super-additive effect in the MoS2 FGTs because the electric field enhances charge injection into the floating gate. The super-additive current ratio, CR = Ielectro-optic/(Ielectric + Ioptic), extracted from Fig. 3a exceeds 100, which enables efficient visual perception. Visualized sensing of the letter P at a power intensity of 16.7 mW cm⁻², based on the 4 × 4 MoS2 FGT array, demonstrates the advantages of super-additive optoelectronic sensing: the bright and dark pixels become gradually distinguishable as the number of light pulses increases.
When the letter P is stimulated with a simultaneous gate voltage pulse at all pixels, significantly enhanced current in bright pixels and only slightly enhanced current in dark pixels improve the signal-to-noise margin ratio of the visual information compared with light stimulation alone, outperforming even 150 light-only stimulations, as illustrated in Fig. 3a. As a result, image sensing of similar quality can be achieved at greatly reduced delay and energy consumption. Figure 3c compares the illumination/dark current, energy consumption per pixel, and latency for 150 optical pulses versus one optical-electrical cooperative pulse. The statistical illumination current (Iph) and dark current (Idark) of sensor pixels, extracted from 20 MoS2 FGTs, indicate that the signal-to-noise margin ratio improves by over 6 times under optical-electrical cooperative stimulation. The energy consumption of the 20 MoS2 FGT devices is calculated as E = Ipeak × Vbias × T, where Ipeak is the transient peak output current for each pulse, Vbias is 50 mV, and T is the stimulation duration (10 ms). The optical-electrical cooperative stimulation consumes only about 0.23 nJ, roughly one-tenth of the optical stimulation. Moreover, the latency is reduced by a factor of 150 for the optical-electrical co-stimulated device. The super-additive electro-optic response of the MoS2 FGTs is also realized under LEDs of different wavelengths, as shown in Supplementary Fig. 15. We further extract the super-additive CR as a function of power intensity, pulse width, Vg pulse amplitude, and Vg base voltage for 1, 15, and 30 stimulations, as shown in Supplementary Fig. 16. The continuously tunable CR of over 1000 enables parallel complex sensing and preprocessing with coupled optical and electrical signals.
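The two metrics used in this analysis, the super-additive current ratio and the per-pulse energy, can be summarized in a short sketch. All currents below are illustrative placeholders chosen only to reproduce the qualitative trends (CR > 100, roughly tenfold energy saving), not measured data:

```python
def current_ratio(i_electro_optic, i_electric, i_optic):
    """Super-additive when CR > 1: co-stimulation beats the linear sum."""
    return i_electro_optic / (i_electric + i_optic)

def pulse_energy(i_peak, v_bias=50e-3, duration=10e-3):
    """Energy per stimulation pulse, E = Ipeak * Vbias * T (joules)."""
    return i_peak * v_bias * duration

# Illustrative peak currents (placeholders, not measurements):
cr = current_ratio(i_electro_optic=4.6e-7, i_electric=1e-9, i_optic=3.1e-9)
e_co = pulse_energy(i_peak=4.6e-7)           # one co-stimulation pulse
e_opt = 150 * pulse_energy(i_peak=3.1e-8)    # 150 optical-only pulses

print(f"CR = {cr:.0f}, E_co = {e_co:.2e} J, E_150_optical = {e_opt:.2e} J")
```

The point of the comparison is that a single co-stimulated pulse replaces ~150 optical pulses: even though its peak current is larger, the total integrated energy and the latency both drop by about an order of magnitude.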

Fig. 3. Photoelectric co-modulated responses of MoS2 FGT sensors.


a Real-time monitored output current under Vg pulses (red line, Vg pulse = −2 V, Vg base = 0 V), LED light (blue line, power intensity 16.7 mW cm⁻² at λ = 470 nm), and synchronous voltage-LED (orange line) stimulation, each applied 30 times. The inset grayscale images present the output current stimulated by the visual image of the letter P. b Timing diagrams of the applied optical and electrical pulses with the same period and duration. The period is 1 s and the duration is 10 ms for all stimulations. c Performance comparison of signal-to-noise margin ratio, energy consumption per pixel (data from 20 devices; error bars represent the standard deviation), and latency for optical-electrical stimulation (1 pulse) and optical-only stimulation (150 pulses).

Cross-modal plasticity and mechanisms

To explore the spatiotemporal dynamics of the MoS2 FGT devices, optical-electrical signals with a relative timing ∆t were applied, and the transient response is presented in Fig. 4a. The optical pulses (power intensity 16.7 mW cm⁻²) and electrical pulses (Vg_pulse = −2 V, Vg_base = 0 V) have the same period of 1 s and duration of 10 ms; ∆t > 0 means the optical pulse arrives before the electrical pulse. A nonzero relative arrival timing ∆t decreases the total time of electric-field-assisted charge injection and thus attenuates the photocurrent enhancement. The energy band diagrams in Fig. 4b–d provide insight into the capture dynamics of photogenerated carriers. Figure 4b shows the energy band diagram under LED illumination only. At zero bias in thermal equilibrium, the Fermi levels EF of the system are aligned. Under illumination with a photon energy of 2.6 eV, photogenerated carriers are excited into the conduction band (electrons) and valence band (holes). The temporarily augmented carrier densities are expressed by quasi-Fermi levels EFn and EFp for electrons and holes, respectively. EFp, for the minority carriers in the n-type MoS2 semiconductor, deviates more from the initial Fermi level[47]. As a result, some photogenerated holes tend to be injected into the floating gate, although the tunneling efficiency is low under this condition owing to the high barrier, as discussed for Fig. 1d, while others are trapped in interface states[48], resulting in a negative shift of the channel threshold voltage. Meanwhile, the photogenerated electrons excited into the conduction band increase the channel conductance. Taken together, the device exhibits enhanced current under illumination as well as residual non-volatility from the carriers injected into the floating gate.
Under continuous light exposure or repeated light pulses, photogenerated carriers are continuously excited across the bandgap; this process can persist until the charge in the floating gate saturates. Figure 4c shows the energy band diagram under negative Vg bias only. Although the tunneling-oxide barrier is lowered by the bias voltage, the tunneling efficiency remains very low because the barrier is still relatively high and the channel lacks high-energy photogenerated carriers. When light and the electric field occur simultaneously, however, the light-excited high-energy carriers can tunnel through the electric-field-induced triangular barrier by FN tunneling, greatly increasing the tunneling efficiency, as shown in Fig. 4d. Since the duration of the electric-field-assisted photogenerated-carrier tunneling process determines the amount of charge injected into the floating gate, and the channel current depends on the channel threshold-voltage shift, the photoelectric coordination time correlates strongly with the final output current. To fully elucidate the relative timing dependence, the excitation current at various ∆t for different numbers of stimulations is extracted to show the spatiotemporal evolution. The current changes by nearly two orders of magnitude within |∆t| of 10 ms and barely changes beyond 10 ms, as shown in Fig. 4e. This trend does not change with the number of stimuli from 1 to 30; increasing the number only raises the overall current level. This behavior mirrors the time dependence of the biological cross-modal correlated response on a similar timescale. Furthermore, analogous to the time-dependent plasticity of the super-additive electro-optic response for negative voltage, the synergistic depression effect of positive voltage exhibits a similar time dependence, resulting in an overall depression trend, as presented in Supplementary Fig. 17.
This depression can also be explained by the electric-field-assisted photogenerated-carrier tunneling model: the reversed electric field assists electrons in tunneling into the floating gate, forming an overall inhibitory trend. By exploiting the cross-modal time-dependent plasticity and light-intensity tunability demonstrated in the individual-device experiments, a bionic pattern-decoding application for the obfuscation of visual information can be realized in the integrated sensor array (Fig. 4f). To improve the transmission efficiency and security of visual information, obfuscated visual information can be transmitted and then decoded at the sensor-array terminal with a corresponding key. The decoding key is provided in advance by the message sender and is composed of Vg pulses with a different ∆t for each pixel. The decoded visual information is then recovered from the pre-blurred information and the corresponding key by the MoS2 FGT sensor array. Numerical simulations of the proposed pattern-decoding process confirm the consistency between the decoded and original information. This approach also allows different visual information to be retrieved from the same input pattern using different keys, thereby enhancing the efficiency of graphics transmission to different terminal devices.
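The time dependence discussed above can be caricatured by the overlap of two equal-width pulses: the charge injected by field-assisted tunneling scales with the time during which light and field coincide. This toy model (which drops to exactly zero at |∆t| = 10 ms, whereas the measured current merely flattens) is purely illustrative:

```python
# Toy model: injected charge scales with the temporal overlap of the
# 10 ms optical and electrical pulses, so the response falls off as |dt|
# approaches the pulse width.
T = 10e-3  # s, pulse duration of both stimuli

def overlap(dt, width=T):
    """Overlap time (s) of two equal-width pulses offset by dt."""
    return max(0.0, width - abs(dt))

for dt_ms in (-10, -5, 0, 5, 10, 20):
    frac = overlap(dt_ms * 1e-3) / T
    print(f"dt = {dt_ms:+3d} ms -> overlap fraction {frac:.2f}")
```

In the decoding scheme, each pixel's key pulse sets its own ∆t, so the same optical input maps to a per-pixel overlap fraction and hence a per-pixel output current: the key literally programs the image.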

Fig. 4. Cross-modal correlation plasticity of MoS2 FGTs for pattern decoding.


a Real-time monitored output current under synchronous optical-electrical stimulation, applied 30 times, with ∆t varied from −10 ms to 10 ms. The period is 1 s and the duration is 10 ms for all optical (power intensity 16.7 mW cm⁻²) and electrical (Vg pulse = −2 V, Vg base = 0 V) pulses. b–d Energy band diagrams presenting the tunneling and capture dynamics of photogenerated carriers under different modal stimulation. EF, Fermi level; EFn, electron quasi-Fermi level; EFp, hole quasi-Fermi level. e The extracted excitation current at various ∆t for different stimulation counts from (a). f Schematic and numerical simulations of an image-decoding application based on the integrated sensor array.

One-shot visual signal processing

In traditional vision processing architectures, cameras capture visual scenes frame by frame, converting optical signals into analog electrical signals in real time, which are then digitized, stored, and processed along the information chain[10] (Fig. 5a). Because the sensor, memory, and processing units are built with different semiconductor manufacturing processes, the hardware is physically separated, which incurs penalties in speed and energy consumption[11]. Redundant data from the sensor adds an exponential burden to subsequent information processing[15], and the memory wall between the memory and processing units[11] leads to high energy consumption and high computational latency. To overcome these drawbacks, a processing-in-sensor array is used to reduce redundant data and perform visual-information decoding via bio-inspired cross-modal correlation, while processing-in-memory units handle the subsequent one-shot processing to break the memory wall between conventional memory and processing units. Figure 5b shows the operational concept of the seamless sensing-storage-processing scheme in this work. Input visual signals stimulate the sensor pixels simultaneously with a gate voltage pulse. After preprocessing by the MoS2 FGT array, the visual information is converted into non-volatile electrical signals. Each pixel then feeds one TCAM cell, and the TCAM array outputs whether the input matches its pre-stored information. The NV-TCAM array shows notable advantages in energy savings and speed for exact-match lookup tasks. We then performed an image recognition simulation with circuit-level analysis using the experimental data; more simulation details and results are presented in Supplementary Fig. 18, Supplementary Table 1, and Supplementary Note 3.
Using the statistics of the resistance distributions of the MoS2 transistors and MoS2 FGTs, a 16 × 4 TCAM array circuit with 16 inputs and 4 outputs was constructed in SPICE, as illustrated in Fig. 5c. Each TCAM cell pre-stores a digital bit in its two MoS2 FGTs, and the two transistor gates connect to the input signal lines (search lines). Each output line connects 16 TCAM cells pre-storing the letter P, K, U, or N, marked with color blocks. Once the visual information is sensed and preprocessed by the sensor array, the preprocessed images are transformed into input signals for the TCAM cells by sense amplifiers, and the TCAM circuits return the matching lookup results, as shown in Fig. 5d. When the preprocessed letter images are input to the TCAM circuits, the output line that exactly matches the stored letter is in a high-resistance state; otherwise it is low resistance. The simulation shows a resistance ratio of over 10⁴ between the all-match and 1-bit-mismatch states for 16-pixel images, which remains 10³ for 64-pixel images. TCAM-based image recognition circuits also allow imprecise image matching by storing bit X to highlight critical information, as shown in Supplementary Fig. 19. Furthermore, efficient information transmission and a multi-terminal secure publishing process are demonstrated by combining the cross-optical-electrical correlated response of the MoS2 FGT sensors with the TCAM circuits, as presented in Fig. 5e. The input visual pattern is obfuscated and then received by various terminals with different keys; terminals without the right key return fuzzy results, while terminals with the correct keys (key 1 and key 2 in Fig. 5e) retrieve their intended results.
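The lookup behavior of the simulated array can be mirrored by a minimal behavioral model of ternary matching, in which a stored 'X' bit (don't care) matches either search value. The 16-bit patterns below are arbitrary stand-ins for the letter images, not the actual encodings used in the simulation:

```python
def tcam_match(stored, search):
    """True when every stored bit equals the search bit or is 'X' (don't care)."""
    return all(s == 'X' or s == q for s, q in zip(stored, search))

# Four stored 16-bit words standing in for the letters P, K, U, N:
stored_rows = {
    "P": "1110101011101000",
    "K": "1010110011001010",
    "U": "1010101010101110",
    "N": "1010111011011010",
}

query = stored_rows["K"]  # a preprocessed 16-pixel image, here exactly 'K'
result = {name: tcam_match(bits, query) for name, bits in stored_rows.items()}
print(result)  # only the row storing K matches

# A stored 'X' masks non-critical pixels, enabling imprecise matching:
assert tcam_match("1X1X", "1010")
```

In the hardware, this bit-by-bit comparison happens in parallel across all 16 cells of every output line in a single lookup, which is what makes the exact-match search one-shot.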

Fig. 5. TCAM array simulations for visual security.


a Conventional vision processing architecture based on silicon technologies. ADC, analog-to-digital converter; CPU, central processing unit. b The one-shot processing proposed in this work, monolithically integrating a processing-in-sensor (PIS) scheme with a computing-in-memory (CIM) architecture for vision security. c Circuit diagram of the simulated 16 × 4 TCAM array. Each column of 16 TCAM cells in the 16 × 4 array stores the letter P (R1), K (R2), U (R3), or N (R4), as marked with color blocks. d Output resistance returned from the TCAM array when the digital visual images P, K, U, and N are input. e Schematic and simulation results with a blurred image and different keys.

In conclusion, we demonstrate bio-inspired cross-modal hardware with super-additive plasticity for a seamless sensing-memory-processing scheme. By regulating the tunneling efficiency of electric-field-assisted photogenerated carriers in the MoS2 FGTs, we emulate the cross-modal correlation plasticity observed in the primary cortex of the biological brain. The bio-inspired MoS2 FGTs show synergistic super-additive behavior of more than 1000 times together with significant time-dependent plasticity, and can be further used for energy-efficient image perception and image security coding. The preprocessed visual images can be routed and searched using a non-volatile 4T-TCAM cell array composed of MoS2 FGTs, with a resistance ratio of up to 10⁵ between the matching and mismatching states of a single cell and a lookup durability of up to 10¹², providing a large sensing margin and high reliability for one-shot processing of high-resolution perception modes. This cross-modal visual perception hardware platform enables seamless visual processing-in-sensory and -in-memory, offering tremendous potential for ubiquitous low-power visual edge intelligence.

Methods

Synthesis of molybdenum disulfide

The monolayer MoS2 film is synthesized on molten soda-lime-silica glass substrates in a two-zone CVD system. The powdered sulfur precursor is placed in Zone I at 230 °C under a constant argon flow. The MoO3 precursor is evaporated at 800 °C in Zone II, where the molten glass substrate is placed downstream. The growth duration is kept at 10 min, after which the furnace cools down naturally. Monolayer MoS2 with an appropriate nucleation density is thereby obtained.

Device fabrication

The schematic of the device fabrication process flow is shown in Supplementary Fig. 2. All lithography steps are performed by electron-beam lithography, and all metal depositions by electron-beam evaporation. The local gate electrodes (20/40 nm Ni/Au) are predefined on the substrate, and 25 nm of hafnium oxide is then deposited by ALD at 200 °C as the blocking oxide layer. A 2-nm-thick gold layer is deposited on the gate region to form the floating gate, followed by another 9 nm of hafnium oxide as the tunnel oxide. The CVD-grown monolayer MoS2 film is transferred onto the substrate as the active channel layer. After the MoS2 transfer, photoresist is used as a mask and the etching region is defined by electron-beam exposure; the exposed resist is removed after development, while the resist over the channel region is retained. The channel region is then defined by Ar/O2 plasma etching. Finally, source/drain electrodes and interconnections are patterned using 20/60 nm Ni/Au. The temperature of the whole process does not exceed 200 °C. Supplementary Fig. 20 shows layout images of an early batch of devices, which included some redundant pad designs.

Electrical measurements

The electrical measurements are carried out in a Lakeshore CPX-VF probe station under 10⁻⁵ Torr at room temperature. An Agilent B1500A semiconductor parameter analyzer is used to perform the measurements. The device is irradiated through fiber-coupled LED light sources, Thorlabs M470F3 (λ = 470 nm), M530F2 (λ = 530 nm), and M625F2 (λ = 625 nm), driven by a T-Cube series LED driver (LEDD1B). The light pulse can be modulated by an external voltage pulse with a delay of <100 μs. Synchronous electrical-optical pulses applied to the devices are timed by the two channels of the Agilent B1525A SPGU module. Millisecond-scale optical/electrical pulses are used to better match the timescale of the emulated biological behaviors, and are also limited by the LED equipment speed.

Simulation

The blurred picture decoding results are simulated based on the experimental results of a single device. The cross-modal time-dependent plasticity and light-intensity tunability used in the simulation are extracted from the experimental results of individual devices, rather than from a fully driven interconnected array. The resistance ratio of the TCAM array is simulated in SPICE, including the resistance variation of the MoS2 FGTs measured from the experimental data. See Supplementary Note 3 for more simulation details.
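As a rough illustration of how measured device variation can be folded into such a match-margin simulation, one could Monte Carlo sample the cell resistances and track the worst observed all-match/1-bit-mismatch ratio. This is a sketch under stated assumptions: the nominal resistances, the log-normal spread of 0.2 decades, and all names below are invented for illustration and are not the measured MoS2 FGT statistics or the authors' SPICE flow.

```python
import random

random.seed(0)

R_OFF = 1e10  # assumed high-resistance (match) state, ohms
R_ON = 1e5    # assumed low-resistance (mismatch) state, ohms

def sample_r(nominal, sigma_decades=0.2):
    # Log-normal device-to-device spread of ~0.2 decades (assumed).
    return nominal * 10 ** random.gauss(0.0, sigma_decades)

def worst_case_ratio(n_cells=16, trials=1000):
    """Worst all-match/1-bit-mismatch match-line resistance ratio seen."""
    worst = float('inf')
    for _ in range(trials):
        # All-match: every cell presents a high-resistance pull-down path,
        # and the paths combine in parallel on the shared match line.
        g_match = sum(1 / sample_r(R_OFF) for _ in range(n_cells))
        # 1-bit mismatch: one cell turns on a low-resistance path instead.
        g_mis = sum(1 / sample_r(R_OFF) for _ in range(n_cells - 1))
        g_mis += 1 / sample_r(R_ON)
        worst = min(worst, (1 / g_match) / (1 / g_mis))
    return worst

print(f"worst-case ratio over 1000 trials: {worst_case_ratio():.0f}")
```

Even in the worst sampled trial, the ratio stays orders of magnitude above unity, which is the kind of margin-versus-variation check a full SPICE simulation performs with the real measured distributions.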


Acknowledgements

This work was supported by the Natural Science Foundation of China (grant nos. 92364203 (Y.W.), 92464303 (Y.W.), 62090034 (Y.W.), 62425402 (Y.W.), 62574006 (X.X.), and 62104012 (X.X.)), the National Key Research and Development Program of China (grant nos. 2022YFB4400100 (X.X.) and 2021YFA1202903 (Y.W.)), the Beijing Natural Science Foundation (grant no. 4242057 (X.X.)), and the Technology Innovation Program of Hunan Province (grant no. 2021RC5008 (Y.W.)). X.X. acknowledges Kun Xiong for discussions and simulation support.

Author contributions

Y.W. and X.X. conceived the project and designed the experiments. X.X. fabricated the devices and performed electrical measurements. T.F. assisted in the device fabrication. C.G., H.L., and J.K. contributed to the MoS2 material synthesis. X.X. and Q.L. performed the simulations. X.W. and S.L. assisted with the setup of the test system. Y.W., D.L., X.W., and A.P. assisted in the partial characterization of the materials. X.X. and Y.W. analyzed the data and wrote the manuscript. All authors contributed to discussions and commented on the manuscript.

Peer review

Peer review information

Nature Communications thanks Seongin Hong and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Data availability

All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Information. The data that support the plots within this paper and other findings of this study are available from the corresponding author upon request. The data generated in this paper are provided in the Source Data file. Source data are provided with this paper.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Xiong Xiong, Email: xiongxiong@pku.edu.cn.

Yanqing Wu, Email: yqwu@pku.edu.cn.

Supplementary information

The online version contains supplementary material available at 10.1038/s41467-025-65872-z.

References

1. Lemus, L., Hernández, A., Luna, R., Zainos, A. & Romo, R. Do sensory cortices process more than one sensory modality during perceptual judgments? Neuron 67, 335–348 (2010).
2. Bizley, J. K., Nodal, F. R., Bajo, V. M., Nelken, I. & King, A. J. Physiological and anatomical evidence for multisensory interactions in auditory cortex. Cereb. Cortex 17, 2172–2189 (2007).
3. Bolognini, N. & Maravita, A. Proprioceptive alignment of visual and somatosensory maps in the posterior parietal cortex. Curr. Biol. 17, 1890–1895 (2007).
4. Munoz, N. E. & Blumstein, D. T. Multisensory perception in uncertain environments. Behav. Ecol. 23, 457–462 (2012).
5. Currier, T. A. & Nagel, K. I. Multisensory control of navigation in the fruit fly. Curr. Opin. Neurobiol. 64, 10–16 (2020).
6. Comer, C. & Baba, Y. Active touch in orthopteroid insects: behaviours, multisensory substrates and evolution. Philos. Trans. R. Soc. B Biol. Sci. 366, 3006–3015 (2011).
7. McGurk, H. & Macdonald, J. Hearing lips and seeing voices. Nature 264, 746–748 (1976).
8. Shams, L., Kamitani, Y. & Shimojo, S. What you see is what you hear. Nature 408, 788 (2000).
9. Botvinick, M. & Cohen, J. Rubber hands ‘feel’ touch that eyes see. Nature 391, 756 (1998).
10. Zhou, F. & Chai, Y. Near-sensor and in-sensor computing. Nat. Electron. 3, 664–671 (2020).
11. Wong, H. S. P. & Salahuddin, S. Memory leads the way to better computing. Nat. Nanotechnol. 10, 191–194 (2015).
12. Zhou, F. et al. Optoelectronic resistive random access memory for neuromorphic vision sensors. Nat. Nanotechnol. 14, 776–782 (2019).
13. Yang, X. et al. A self-powered artificial retina perception system for image preprocessing based on photovoltaic devices and memristive arrays. Nano Energy 78, 105246 (2020).
14. Wang, C.-Y. et al. Gate-tunable van der Waals heterostructure for reconfigurable neural network vision sensor. Sci. Adv. 6, eaba6173 (2020).
15. Mennel, L. et al. Ultrafast machine vision with 2D material neural network image sensors. Nature 579, 62–66 (2020).
16. Zhang, Z. et al. All-in-one two-dimensional retinomorphic hardware device for motion detection and recognition. Nat. Nanotechnol. 17, 27–32 (2022).
17. Zhou, Y. et al. Computational event-driven vision sensors for in-sensor spiking neural networks. Nat. Electron. 6, 870–878 (2023).
18. Pi, L. et al. Broadband convolutional processing using band-alignment-tunable heterostructures. Nat. Electron. 5, 248–254 (2022).
19. Dodda, A. et al. Active pixel sensor matrix based on monolayer MoS2 phototransistor array. Nat. Mater. 21, 1379–1387 (2022).
20. Lee, S., Peng, R., Wu, C. & Li, M. Programmable black phosphorus image sensor for broadband optoelectronic edge computing. Nat. Commun. 13, 1485 (2022).
21. You, J. et al. Simulating tactile and visual multisensory behaviour in humans based on an MoS2 field effect transistor. Nano Res. 16, 7405–7412 (2023).
22. Jiang, C. et al. Mammalian-brain-inspired neuromorphic motion-cognition nerve achieves cross-modal perceptual enhancement. Nat. Commun. 14, 1344 (2023).
23. Yu, J. et al. Bioinspired mechano-photonic artificial synapse based on graphene/MoS2 heterostructure. Sci. Adv. 7, eabd9117 (2021).
24. Liu, L. et al. Stretchable neuromorphic transistor that combines multisensing and information processing for epidermal gesture recognition. ACS Nano 16, 2282–2291 (2022).
25. Chen, G. et al. Temperature-controlled multisensory neuromorphic devices for artificial visual dynamic capture enhancement. Nano Res. 16, 7661–7670 (2023).
26. Sadaf, M. U. K., Sakib, N. U., Pannone, A., Ravichandran, H. & Das, S. A bio-inspired visuotactile neuron for multisensory integration. Nat. Commun. 14, 5729 (2023).
27. Brunel, L., Carvalho, P. F. & Goldstone, R. L. It does belong together: cross-modal correspondences influence cross-modal integration during perceptual learning. Front. Psychol. 6, 358 (2015).
28. Faivre, N., Mudrik, L., Schwartz, N. & Koch, C. Multisensory integration in complete unawareness: evidence from audiovisual congruency priming. Psychol. Sci. 25, 2006–2016 (2014).
29. Mitchel, A. D., Christiansen, M. H. & Weiss, D. J. Multimodal integration in statistical learning: evidence from the McGurk illusion. Front. Psychol. 5, 407 (2014).
30. Ghazanfar, A. A. & Schroeder, C. E. Is neocortex essentially multisensory? Trends Cogn. Sci. 10, 278–285 (2006).
31. Kayser, C. The multisensory nature of unisensory cortices: a puzzle continued. Neuron 67, 178–180 (2010).
32. Meredith, M. A. & Stein, B. E. Spatial determinants of multisensory integration in cat superior colliculus neurons. J. Neurophysiol. 75, 1843–1857 (1996).
33. Laing, M., Rees, A. & Vuong, Q. C. Amplitude-modulated stimuli reveal auditory-visual interactions in brain activity and brain connectivity. Front. Psychol. 6, 1440 (2015).
34. Nozaradan, S., Peretz, I. & Mouraux, A. Steady-state evoked potentials as an index of multisensory temporal binding. Neuroimage 60, 21–28 (2012).
35. Lembke, D., Bertolazzi, S. & Kis, A. Single-layer MoS2 electronics. Acc. Chem. Res. 48, 100–110 (2015).
36. Lopez-Sanchez, O., Lembke, D., Kayci, M., Radenovic, A. & Kis, A. Ultrasensitive photodetectors based on monolayer MoS2. Nat. Nanotechnol. 8, 497–501 (2013).
37. Migliato Marega, G. et al. Logic-in-memory based on an atomically thin semiconductor. Nature 587, 72–77 (2020).
38. Li, W. et al. Approaching the quantum limit in two-dimensional semiconductor contacts. Nature 613, 274–279 (2023).
39. Xiong, X. et al. Demonstration of vertically-stacked CVD monolayer channels: MoS2 nanosheets GAA-FET with Ion>700 µA/µm and MoS2/WSe2 CFET. In 2021 IEEE International Electron Devices Meeting (IEDM) 7.5.1–7.5.4, 10.1109/IEDM19574.2021.9720533 (IEEE, 2021).
40. Xiong, X. et al. Top-gate CVD WSe2 pFETs with record-high Id~594 μA/μm, Gm~244 μS/μm and WSe2/MoS2 CFET based half-adder circuit using monolithic 3D Integration. In 2022 International Electron Devices Meeting (IEDM) 20.26.1–20.26.4, 10.1109/IEDM45625.2022.10019476 (IEEE, 2022).
41. Yu, J. et al. Simultaneously ultrafast and robust two-dimensional flash memory devices based on phase-engineered edge contacts. Nat. Commun. 14, 5662 (2023).
42. Liao, F. et al. Bioinspired in-sensor visual adaptation for accurate perception. Nat. Electron. 5, 84–91 (2022).
43. Pradhan, B. et al. Ultrasensitive and ultrathin phototransistors and photonic synapses using perovskite quantum dots grown from graphene lattice. Sci. Adv. 6, eaay5225 (2020).
44. Xiong, X. et al. Nonvolatile logic and ternary content-addressable memory based on complementary black phosphorus and rhenium disulfide transistors. Adv. Mater. 34, 2106321 (2022).
45. Yang, R. et al. Ternary content-addressable memory with MoS2 transistors for massively parallel data search. Nat. Electron. 2, 108–114 (2019).
46. Ni, K. et al. Ferroelectric ternary content-addressable memory for one-shot learning. Nat. Electron. 2, 521–529 (2019).
47. Kufer, D. & Konstantatos, G. Highly sensitive, encapsulated MoS2 photodetector with gate controllable gain and speed. Nano Lett. 15, 7307–7313 (2015).
48. John, R. A. et al. Optogenetics inspired transition metal dichalcogenide neuristors for in-memory deep recurrent neural networks. Nat. Commun. 11, 3211 (2020).

