Key technology of brain-computer interaction based on speech imagery

Yanpeng LIU; Anmin GONG; Peng DING; Lei ZHAO; Qian QIAN; Jianhua ZHOU; Lei SU; Yunfa FU

2022 Jun 25;39(3):596–611. [Article in Chinese] doi: 10.7507/1001-5515.202107018

PMCID: PMC10950764 PMID: 35788530

Abstract

Speech is an important high-level cognitive behavior in humans, and its production is closely tied to brain activity. Overt speech and imagined speech activate partly overlapping brain areas, which has made speech imagery a new paradigm for brain-computer interaction. A brain-computer interface (BCI) based on speech imagery is spontaneous, requires no training, and is friendly to subjects, so it has attracted the attention of many researchers. However, the technology is still immature in the design of experimental paradigms and the choice of imagery materials, and many issues remain open. To address these problems, this article first describes the neural mechanisms of speech imagery. It then reviews previous speech imagery BCI research and systematically analyzes the mainstream methods and core technologies for experimental paradigms, imagery materials, and data processing. Finally, it discusses the key problems and main challenges that restrict the development of this type of BCI, and considers the future development and applications of speech imagery BCI systems.

Keywords: Brain-computer interaction, Speech imagery, Experimental paradigm, Classification, Decoding

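Introduction

A brain-computer interface (BCI) is a communication or control system in which the messages or commands a user sends to the outside world do not pass through the brain's normal output pathways of peripheral nerves and muscles; instead, external electronic devices such as computers enable communication and control between the brain and the outside world^[1-2]. BCI systems can be divided into spontaneous and evoked types: the former rely on neural activity produced by a specific mental task, whereas the latter rely on activity evoked by an external stimulus. A common mental paradigm (task) for spontaneous BCIs is motor imagery, which requires the subject to mentally simulate the movement of a body part (such as a hand or foot). The mechanism of motor-imagery-based brain-computer interaction has been studied extensively, but this mental task is not friendly to some subjects: about 20% of people cannot produce effective control, a phenomenon known as "BCI illiteracy". Researchers have therefore proposed other mental paradigms (tasks), such as speech imagery, visual imagery^[3], and mental arithmetic^[4], which can likewise be used to study and develop BCI systems. Among these, speech imagery BCI systems have several advantages: they are spontaneous and need no stimulus, they require no training and are friendly to subjects, they can express the user's real intention directly, and they offer a natural way of communicating.

Early research on neural signals during speech production dates back to 1967, when Schafer^[5] found that the same cortical region showed different cortical potentials during the 525 ms preceding the reading of different letters. Hiraiwa et al.^[6] classified the readiness potentials recorded while subjects read five Japanese vowels, and Suppes et al.^[7] found that neural signals recorded during auditory presentation and imagination of different words could be used to classify the words. In subsequent work, more and more researchers analyzed neural signals during speech imagery, which gradually developed into an important BCI paradigm. Research has progressed slowly, from the initial observation that neural signals change differently when different letters are read, to classification of imagined vowels, and then to decoding of continuously read sentences. Multi-class classification of speech imagery and decoding of sentences are still at an early stage and have broad prospects. A speech imagery BCI system can provide a relatively natural way of communicating, which benefits patients with speech disorders, amyotrophic lateral sclerosis, locked-in syndrome, and similar conditions^[8-9].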

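Speech imagery has been reviewed before. Chen et al.^[10] discussed signal acquisition and signal processing for speech imagery brain-computer interaction, but did not summarize the problems with experimental paradigms and imagery materials, and said little about sentence decoding. Schultz et al.^[11] mainly introduced the various biosignals produced during speech and the ways they are recorded, and Cooney et al.^[12] focused on the physiology of speech and its production, but neither covered experimental paradigms or signal processing. Martin et al.^[13-14] described only the decoding of neural signals during speech imagery with electrocorticography (ECoG), and Panachakel et al.^[15] described only electroencephalography (EEG); neither considered the use of multiple types of neural signals in speech imagery. Meanwhile, although a speech imagery task is relatively easy to perform, feature extraction, classification, and human-computer interaction remain difficult, and there is no unified standard for designing experimental paradigms or choosing imagery materials.

To address these problems, this article systematically summarizes the experimental paradigms and imagery materials used for speech imagery, discusses the algorithms used to process speech imagery data, identifies specific problems concerning online systems, experimental paradigms, speech imagery data, and sentence decoding, and considers the future directions and applications of speech imagery BCI systems.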

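1. Neural mechanisms of speech imagery

1.1. Basic physiological processes of speech imagery

Speech imagery means pronouncing words in the mind without producing any actual sound or facial movement. It involves the neural mechanisms of cognition, memory, learning, and thinking. Oppenheim et al.^[16] described speech imagery as a truncated version of overt speech: it activates articulatory features but produces no audible sound. Using functional magnetic resonance imaging (fMRI), Palmer et al.^[17] found that the brain areas activated during overt speech were similar to those activated during speech imagery. Huang et al.^[18], also using fMRI, found that both overt speech and speech imagery activated Broca's area and some other areas, but that Broca's area was activated more strongly during overt speech, and that left-hemisphere activation was especially pronounced during speech imagery. These studies show that the two behaviors activate partly overlapping brain areas, which provides a scientific basis for classifying and decoding neural signals during speech imagery.

Basho et al.^[19] found with fMRI that speech imagery more strongly activated areas such as the left middle temporal gyrus and the superior frontal gyrus, whereas Shuster et al.^[20], measuring the blood oxygen level-dependent (BOLD) signal with fMRI, found that the BOLD response in areas such as the left precentral and postcentral gyri was significantly larger during overt speech than during speech imagery. The two behaviors therefore overlap in the areas they activate but each has its own emphasis, so their neural mechanisms cannot be treated as equivalent.

During speech imagery tasks, Goto et al.^[21] used magnetoencephalography (MEG) to observe event-related desynchronization (ERD) with distinct spatiotemporal characteristics in areas such as the left inferior and middle frontal gyri and the left anterior temporal cortex. Shergill et al.^[22] used fMRI to examine the relationship between brain activity and the rate at which speech imagery is generated, and found that an increased rate was associated with brain-area activation. Spontaneous speech imagery also differs from task-elicited speech imagery: task-elicited speech imagery is associated with increased activation of the left inferior frontal region, whereas this region is not clearly activated during spontaneous speech imagery^[23]. Studying overt speech and speech imagery at the neural level helps in applying the speech imagery paradigm to BCI systems, and findings from ongoing neuroimaging work on speech imagery can guide the design of experimental paradigms and the choice of feature extraction algorithms for speech imagery BCIs.

Neural activity during speech imagery and overt speech overlaps but also differs in part, so conclusions from overt speech experiments cannot simply be applied to the speech imagery paradigm; instead, this mechanism of brain activity should be studied by combining speech imagery with BCI systems. The brain is activated differently when processing words with different meanings, and the neural features present during speech imagery provide a basis for classification and decoding in BCI systems. The speech imagery paradigm has developed precisely because of research on these physiological mechanisms. However, overt speech involves movement of articulators such as the lips and tongue, so it remains to be determined whether classification and decoding of neural signals during speech imagery rely on signals from the brain's processing of speech information or on signals from motor imagery of the articulators.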

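1.2. Acquisition of neural signals

Depending on how signals are acquired, BCI systems are either invasive or non-invasive. Invasive BCI systems require surgery to implant electrodes inside the brain, which greatly reduces the influence of movement and other non-neural artifacts. In invasive speech imagery BCI systems, most studies record ECoG signals, because ECoG has a high signal-to-noise ratio, high temporal and spatial resolution, and is relatively less invasive. However, the subjects of ECoG-based speech imagery BCI studies are mostly epilepsy patients whose electrodes were originally implanted to treat epilepsy rather than for brain-computer interaction, so this approach suits only certain groups of people^[24-25]. Non-invasive BCI systems acquire neural signals without surgery, typically with sensors placed on the scalp. Common non-invasive techniques include EEG, functional near-infrared spectroscopy (fNIRS), and MEG, and this is currently the most widely used way of acquiring neural signals.

At present, both ECoG and EEG are widely used in speech imagery BCI research. ECoG, which allows more precise and faster control, has been applied to one-dimensional cursor control^[25-26] and to sentence decoding^[27-28]. EEG is inexpensive, portable, and easy to use, and has been studied in depth; most non-invasive speech imagery BCI systems therefore record EEG.

In addition, Kaongoen et al.^[29] recorded both scalp EEG and ear-EEG during speech imagery tasks and found no significant difference between the two in classification performance. In future applications of speech imagery BCI systems, the acquisition method should therefore be chosen to suit the subject. Furthermore, invasive acquisition requires surgical implantation of electrodes, while non-invasive acquisition either involves a fitting procedure (as with EEG) or relies on bulky recording equipment (as with MEG), so developing portable acquisition systems is another direction for future research.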

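1.3. Brain region selection in speech imagery BCI systems

In speech imagery BCI systems, the electrodes used for invasive acquisition are designed in advance and are not moved once implanted, whereas with non-invasive acquisition different electrode layouts can be used to identify the brain regions that matter most for classification and decoding. Table 1^[8,30-36] shows the brain regions selected in speech imagery BCI systems: for invasive acquisition the electrodes were implanted in fixed regions, and for non-invasive acquisition the regions listed are those that gave the highest classification accuracy.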

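Table 1. Brain region selection in speech imagery BCI systems.

Reference	Acquisition method	Brain regions
Guenther et al.^[30], 2009	Neurotrophic electrode (invasive)	Left precentral gyrus
Brumberg et al.^[8], 2011	Microelectrode (invasive)	Left precentral gyrus
Pei et al.^[31], 2011	ECoG (invasive)	Frontal, parietal, and temporal regions
Herff et al.^[32], 2015	ECoG (invasive)	Left frontal and temporal lobes
Sereshkeh et al.^[33], 2018	fNIRS (non-invasive)	Left temporal lobe and left temporoparietal region
Lee et al.^[34], 2019	EEG (non-invasive)	Left Broca's and Wernicke's areas
Riaz et al.^[35], 2015	EEG (non-invasive)	Left motor cortex, Broca's area, and Wernicke's area
Koizumi et al.^[36], 2018	EEG (non-invasive)	Prefrontal cortex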

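Table 1 shows that the brain regions selected in speech imagery BCI systems are mostly in the left hemisphere. In addition, Wang et al.^[37] designed BCI systems with two electrode layouts, one recording from the whole brain and one recording only from the left hemisphere, and showed that features of speech imagery could be extracted from left-hemisphere signals alone. More precise localization of the relevant regions allows the electrode layout to be optimized, making speech imagery BCI systems simpler and lighter to use.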

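1.4. Performance of speech imagery BCI systems in different frequency bands

Adult EEG mainly comprises θ waves (4–7 Hz), α waves (8–13 Hz), β waves (14–30 Hz), and γ waves (> 30 Hz), and each EEG rhythm is closely related to specific physiological phenomena in the brain.

In speech imagery tasks, Jahangiri et al.^[38] found in a syllable imagery classification task that the α band gave the best classification performance, followed by the β band; D’Zmura et al.^[39] likewise found that the β band (13–18 Hz) contained rich classification features. Guo et al.^[40] performed time-frequency analysis of speech imagery EEG data and found that the differences in EEG energy changes elicited by silently reading Chinese characters lay mainly in the α and β bands. Sereshkeh et al.^[41] analyzed EEG during word imagery and observed β-band activation in Broca's area and the frontal cortex. Koizumi et al.^[36] found in a speech imagery classification task that the high γ band (60–120 Hz) gave higher classification accuracy than any other band, and that within the 0–60 Hz range accuracy peaked in the low γ band (30–40 Hz). Because EEG and ECoG share a similar physiological basis, ECoG studies of speech imagery have reached similar conclusions. Ikeda et al.^[42] studied vowel imagery with ECoG and found that the β band in Broca's area gave relatively high classification accuracy. Crone et al.^[43] noted that ECoG γ activity (80–100 Hz) can be used to study the neuroanatomy and processing dynamics of human language, and most ECoG speech decoding studies use high-γ signals^[32,44].

In future speech imagery BCI research, researchers can therefore focus on the α, β, and high γ bands of EEG and ECoG. Studying band information during speech imagery tasks helps in choosing an appropriate signal acquisition method and frequency-domain analysis algorithm.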
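As an illustration of how band information might be examined, the following minimal sketch estimates the mean spectral power of a single-channel signal in several bands. The θ, α, and β edges are the ranges quoted above and the 60–120 Hz range is the high γ band reported by Koizumi et al.; the function name `band_powers`, the Hann-windowed periodogram, and the synthetic test signal are assumptions made for this example, not the procedure of any cited study.

```python
import numpy as np

# Band edges in Hz. Theta, alpha and beta follow the ranges quoted in the text;
# "high_gamma" is the 60-120 Hz range reported by Koizumi et al.
BANDS = {"theta": (4, 7), "alpha": (8, 13), "beta": (14, 30), "high_gamma": (60, 120)}


def band_powers(signal, fs, bands=BANDS):
    """Mean periodogram power per band for a 1-D signal sampled at fs Hz."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                       # remove the DC offset
    window = np.hanning(x.size)            # taper to reduce spectral leakage
    spectrum = np.fft.rfft(x * window)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    power = np.abs(spectrum) ** 2 / np.sum(window ** 2)
    # Average the power of all frequency bins that fall inside each band.
    return {name: float(power[(freqs >= lo) & (freqs <= hi)].mean())
            for name, (lo, hi) in bands.items()}


if __name__ == "__main__":
    fs = 500                                # hypothetical sampling rate in Hz
    t = np.arange(0, 2.0, 1.0 / fs)
    rng = np.random.default_rng(0)
    # Synthetic "EEG": a 10 Hz rhythm plus a little noise.
    trial = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
    powers = band_powers(trial, fs)
    print(max(powers, key=powers.get))
```

The sampling rate must exceed twice the upper edge of the highest band, otherwise that band contains no frequency bins and its mean is undefined.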

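1.5. Dynamic EEG features in speech imagery tasks

An event-related potential (ERP) is a potential evoked by a stimulus (as opposed to spontaneous EEG rhythms), or a potential change produced in the brain when a particular mental factor occurs. For non-invasively recorded EEG, classic ERP-based BCI paradigms include the N170 (face recognition) and the P300 (spelling).

ERPs also occur in speech imagery tasks. DaSalla et al.^[45] reported that at the onset of vowel imagery a negative-going wave appeared at electrodes C3, Cz, and C4 (international 10-20 system), followed by a positive wave at about 300 ms, and that these waveforms closely resembled the ERPs produced during overt speech. Yang et al.^[46] found that the ERP waveforms during phoneme imagery had a time course similar to that of intracranial and scalp potentials caused by actual movement of the articulators. Kim et al.^[47] proposed a paradigm combining ERPs with a speech imagery task, using the ERP peak amplitude as a feature to control a smart home.

Although ERPs are widely used in research on brain function and neuroscience, relatively few speech imagery BCI studies involve ERPs, and speech imagery BCI systems that classify on the basis of ERPs are lacking. ERPs do occur in speech imagery tasks, but they differ only slightly between different speech imagery tasks, which makes classification difficult. Distinguishing mental states from such subtle differences therefore calls for new features and algorithms.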

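2. Experimental paradigms and imagery materials for speech imagery BCI systems

The experimental paradigms of speech imagery BCI systems fall into two categories, classification tasks and decoding tasks, as shown in Figure 1. A classification task targets two or more words, syllables, phonemes, or similar units, and aims to assign the neural signal to one of a finite number of classes. A decoding task usually targets sentences, and aims to reconstruct continuous sentence features from the neural signal.

2.1. Experimental paradigm for classification tasks

The experimental paradigm for classification tasks in speech imagery BCI systems resembles that of motor imagery BCI systems. A single trial usually comprises a preparation period, a stimulus/cue period, an imagery period, and a rest period. Figure 1a shows the timing of the classification paradigm, using imagining "yes" as an example.

During the preparation period, the subject is asked to look at the screen, which usually shows a fixation cross ("+"). The purpose is to prevent head movement and keep brain activity at baseline, and also to make it easier for a later asynchronous system to distinguish the imagery state from the idle state^[48].

The stimulus/cue tells the subject which imagery task to perform during the imagery period. Common stimulus/cue materials are single Chinese characters, words, syllables, and phonemes. Depending on how the material is presented, cues are auditory^[34,49], visual^[36,38], or audiovisual^[50]. During the stimulus/cue period, an auditory cue is played through a loudspeaker and a visual cue is displayed on the screen. Visual cues are the more common. When an online human-computer interaction system uses visual cues, the subject can choose freely among several cued materials; with auditory cues, the subject can only follow the cue and cannot choose the imagery material. In addition, auditory stimuli activate brain areas related to speech imagery, a problem that visual cues avoid^[51]. Sereshkeh et al.^[52] presented visual cues in the form of questions whose answer was "yes" or "no". Audiovisual cues combine the two forms; in a study of the four tones of Mandarin syllables, Zhang et al.^[50] reported that audiovisual cues gave higher classification accuracy than visual cues alone.

During the imagery period, the subject is asked to imagine the material presented during the stimulus/cue period, but the literature describes how to perform the speech imagery task in different ways. Examples include reading a character in the mind without moving the lips or making a sound^[37]; imagining silently saying a character in the mind^[53]; and imagining speaking in the first person, so that the speaker feels as if they are talking without any articulatory movement^[34]. Drawing on a broad reading of the literature, the instructions for a speech imagery task can be summarized as follows: the subject should imagine from a first-person perspective, silently reading the cued material in the mind without producing any sound, and keeping the articulators and facial muscles still^[11,34,37,40,53].

Subjects generally perform speech imagery in one of two ways. In the first, they imagine the cued material repeatedly throughout the imagery period^[37,40]. In the second, they hear short periodic sounds, usually beeps or clicks, before or during imagery, which helps establish a rhythm for the imagery to follow^[54-55]. Using speech imagery at different rhythms, D’Zmura et al.^[39] found that this approach increased the degrees of freedom for classification and also produced richer classification features. In some paradigms the imagery period and the stimulus/cue period coincide, that is, the visual stimulus/cue remains present during imagery^[45,56]. Table 2^[39,54-56] shows timing diagrams of imagery periods that include rhythm cues.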

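Table 2. Timing diagrams of the imagery period with rhythm cues.

Reference	Paradigm description	Imagery-period timing diagram
Qureshi et al.^[54], 2018; Mohanchandra et al.^[55], 2016	Short periodic sounds are played before imagery to help establish a rhythm and support the subject's imagery	[timing diagram]
D’Zmura et al.^[39], 2009	While the same material is imagined, the period T is varied, which increases the degrees of freedom for classification and also yields richer classification features	[timing diagram]
Nguyen et al.^[56], 2018	The subject is instructed to perform speech imagery at each beep and to continue at the same rhythm	[timing diagram]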

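During the rest period, the subject does not perform any mental imagery task, and the screen is usually black. AlSaleh et al.^[57] reported that binary classification between the preparation period and the speech imagery period was more accurate than between the rest period and the speech imagery period, because visual attention during the preparation period engages the brain's processing of visual information. The rest period lets the subject rest and avoids fatigue from continuous mental tasks, while the preparation period improves the separability of the imagery and idle states, which further supports the development of asynchronous systems.

This summary of classification paradigms for speech imagery shows that the paradigms designed in different studies are not consistent. A paradigm should not be chosen on accuracy alone; its design should also take into account the subsequent online system and the user's experience, for example by using questionnaires, controlled-variable experiments, and multiple evaluation metrics to find a suitable paradigm. Moreover, although periodic beeps can improve classification performance, sustained auditory stimulation causes auditory fatigue and also affects the neural signals, so the design of speech imagery paradigms still has room for improvement.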

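2.2. Experimental paradigm for decoding tasks

Decoding the neural signals produced during overt speech is a necessary step toward human-computer interaction through speech imagery. Herff et al.^[32] decoded phonemes and words from ECoG signals recorded while subjects read sentences aloud; Anumanchipalli et al.^[27] not only decoded neural signals recorded while sentences were read aloud, but also synthesized speech with their decoder when subjects silently mimed the sentences (making the necessary mouth movements without producing sound). Directly decoding neural signals recorded while sentences are imagined is difficult, so the neural signals recorded while sentences are read aloud must be aligned with the sentence content and used for training in order to decode neural signals.

In the paradigm for decoding sentence-related neural signals, the neural signals recorded while the subject reads the material aloud are aligned with the content read, the aligned information and neural signals are used to train a decoder, and finally the neural signals are decoded while sentences are read or imagined. Figure 1b shows the flow of the decoding paradigm: the subject reads continuous sentences displayed on the screen, and the recorded audio serves as the label and is stored together with the neural signals. The material is presented in one of two ways: either the text scrolls across the screen from right to left at constant speed, or one sentence is displayed at a time. To keep the recording continuous, subjects familiarize themselves with the task beforehand. If the subject has a speech disorder, transfer learning is needed, with the decoder trained on healthy subjects.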

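2.3. Imagined phonemes/syllables

Language is learned step by step, from sounds to characters and from words to sentences, and the choice of speech imagery materials follows the same progression. A phoneme is the most basic unit of sound that distinguishes meaning in a language, and a syllable is a unit of sound composed of phonemes. Early research on how the brain processes speech began with phonemes, so phonemes and syllables were also the first materials considered in the development of speech imagery BCI systems. Table 3^[38-39,45,50,58] lists representative phoneme, syllable, and tone imagery materials and the reasons for choosing them.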

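Table 3. Representative phoneme, syllable, and tone imagery materials.

Material type	Imagined material	Reason for selection	Reference
Phoneme	Vowels /a/ and /u/	They involve different mouth-muscle activity in overt pronunciation	DaSalla et al.^[45], 2009
Phoneme	Vowels /a/, /e/, /i/, /o/, and /u/	They are steady and simple to pronounce and carry no specific meaning	Coretto et al.^[58], 2017
Syllable	Meaningless /ba/ and /ku/	They prevent semantics from influencing differences in classification performance	D’Zmura et al.^[39], 2009
Syllable	Meaningful "ba", "fo", "le", and "ry", standing for the four directions "back", "forward", "left", and "right"	They are cognitively appropriate and distinct from one another, and suit control of external devices such as a cursor or wheelchair	Jahangiri et al.^[38], 2017
Tone	The four tones of Mandarin "ba"	Tone is a variation in the pitch of speech and plays an important role in speech perception and semantic understanding	Zhang et al.^[50], 2020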

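The materials above were chosen for exploratory purposes, and the results were separable whether or not the materials carried meaning. In future studies the chosen materials can therefore be assigned specific meanings, which makes it easier to generate control outputs. Beyond the materials listed in the table, some studies have extended imagery materials to consonants: Yang et al.^[46] chose four vowel phonemes, /a/, /i/, /u/, and /y/, and four consonant phonemes, /m/, /n/, /ŋ/, and /f/, and Brumberg et al.^[8] asked a paralyzed patient to imagine 38 American English phonemes. As research deepens, more separable imagery materials can be found, which is valuable for increasing the control degrees of freedom of speech imagery BCIs.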

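2.4. Imagined Chinese characters/words

Chinese characters and words are chosen as materials either for their meaning or for their structure. Table 4^[31,37,52-54,59-62] lists representative Chinese character and word imagery materials and the reasons for choosing them.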

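Table 4. Representative Chinese character and word imagery materials.

Material type	Imagined material	Reason for selection	Reference
Chinese character	"左" (left) and "壹" (one)	The two characters differ in pronunciation, shape, and meaning, and are common in daily life	Wang et al.^[37], 2013
Word	"yes" and "no"	Answer words, used to control a switch and answer yes/no questions	Sereshkeh et al.^[52], 2019
Word	"go", "back", "left", "right", and "stop"	Direction words, used to control external devices such as a cursor or wheelchair	Qureshi et al.^[54], 2018
Word	High-frequency words commonly used by paralyzed/aphasic patients, such as "ambulance", "help me", and "water"	To provide basic communication for paralyzed/aphasic patients	Lee et al.^[59], 2020
Word	High-pitched "um" and the siren sound "wee-woo"	To distinguish control from non-control states in an online asynchronous system	Song et al.^[60-61], 2017, 2020
Word	Three categories of words: letters, digits, and common everyday objects	To distinguish neural signals of speech imagery with different meanings	Kumar et al.^[62], 2018
Word	Words with a consonant-vowel-consonant (CVC) structure, such as "bet", "can", and "coon"	To study decoding of vowel or consonant components from neural signals	Pei et al.^[31], 2011; Chengaiyan et al.^[53], 2019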

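Table 4 shows that most current research concerns English speech imagery, with relatively little work on imagined Chinese characters. Beyond the materials listed, Guo et al.^[40] chose the four Chinese characters "喝" (drink), "右" (right), "吃" (eat), and "冷" (cold) as imagery materials. Chinese has the largest number of native speakers of any language, so there is strong demand for Chinese speech imagery BCI systems and research on them is of lasting significance.

Besides using a single type of material, some studies compare several types. AlSaleh et al.^[57] chose eleven materials that vary in semantics: the meaningless syllables /ba/ and /ku/, the direction words "left", "right", "up", and "down", the answer words "yes" and "no", and the emotion words "happy", "sad", and "help". Binary classification accuracy against the idle state did not differ between word types. Nguyen et al.^[56] chose the short words "in", "out", and "up", the long words "cooperate" and "independent", and the phonemes /a/, /i/, and /u/, in order to explore the factors that affect speech imagery classification, such as complexity, meaning, and pronunciation. Classification performance was similar among short words and among phonemes, which suggests that pronunciation rather than meaning drives classification. Long words gave a higher Kappa coefficient than short words (on average 0.32 versus 0.25), which suggests that more complex words are easier to distinguish from neural signals. Classifying one short word against one long word also performed well, reaching up to 96.90% binary accuracy, which suggests that differences in complexity improve classification. Such comparisons across types of material provide a reference for choosing imagery materials in future speech imagery research.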
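The Kappa coefficient mentioned above corrects accuracy for chance agreement. As a minimal sketch, assuming the usual Cohen's kappa definition (the source does not state which variant was used), it can be computed from a confusion matrix as follows; the function name and the example matrix are illustrative only.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of trials of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: fraction of trials on the diagonal (plain accuracy).
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical three-class result: 60 of 90 trials correct.
    matrix = [[20, 5, 5],
              [5, 20, 5],
              [5, 5, 20]]
    print(cohen_kappa(matrix))
```

With balanced classes and balanced predictions, kappa reduces to (accuracy − 1/N) / (1 − 1/N) for N classes, which makes results from tasks with different numbers of classes easier to compare than raw accuracy.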

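2.5. Imagined sentences

Characters and words combined according to a certain logic form sentences with specific meanings. If continuous sentences are reconstructed with a classification approach, the coherent meaning of the sentence cannot be expressed, so decoding must take into account the logical relations between preceding and following words.

Dash et al.^[63-64] used five common phrases, "Do you understand me?", "That's perfect.", "How are you?", "I need help.", and "Good-bye.", as imagery materials. Although the materials were phrases, the study was in essence a classification study, assigning the neural signal to one of a finite number of classes. Text materials used in decoding tasks include fairy tales^[27], speeches^[32], and the MOCHA-TIMIT corpus^[28,65]; MOCHA-TIMIT is used most often because its sentences cover essentially all the pronunciation patterns that occur in English. Because sentence decoding has been studied little, the materials chosen so far are limited, but future studies could use sentences common in daily life to help patients with speech disorders achieve simple communication. Articles containing commonly used Chinese characters, such as texts from primary and secondary school Chinese textbooks, could also be used, with mathematical models built for common characters and words.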

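3. Key technologies for data processing

3.1. Feature extraction

Feature extraction is the core of speech imagery BCI technology. In essence it extracts useful information from the recorded neural signals and uses that information to distinguish brain states. Feature extraction algorithms fall roughly into three categories: time-domain, frequency-domain, and spatial-domain methods.

Time-domain methods usually take the mean, variance, kurtosis, and similar statistics of each channel as features, and are commonly used with EEG^[66-67] and fNIRS^[52,68]. Iqbal et al.^[69] found that for EEG recorded during vowel imagery, time-domain features gave better classification accuracy than spatial-domain features. For fNIRS, Hwang et al.^[68] found in a binary word imagery classification task that kurtosis gave the highest average classification accuracy of all time-domain feature types, whereas Sereshkeh et al.^[33] used the mean as the feature for online classification of speech imagery tasks.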
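A minimal sketch of such time-domain features is given below. It computes the mean, population variance, and (non-excess) kurtosis of each channel and concatenates them into one feature vector per trial. The function names and the list-of-lists trial layout are assumptions for illustration; the cited studies may use different normalizations or additional statistics.

```python
from statistics import fmean


def channel_features(samples):
    """Return [mean, population variance, kurtosis] for one channel."""
    if len(samples) < 2:
        raise ValueError("need at least two samples per channel")
    mu = fmean(samples)
    m2 = fmean((x - mu) ** 2 for x in samples)   # second central moment
    m4 = fmean((x - mu) ** 4 for x in samples)   # fourth central moment
    # Non-excess kurtosis (a Gaussian gives 3); undefined for a constant signal.
    kurt = m4 / m2 ** 2 if m2 > 0 else float("nan")
    return [mu, m2, kurt]


def trial_features(trial):
    """Concatenate per-channel features; `trial` is a list of channels."""
    features = []
    for channel in trial:
        features.extend(channel_features(channel))
    return features


if __name__ == "__main__":
    # Hypothetical two-channel trial with a handful of samples per channel.
    trial = [[1.0, 2.0, 3.0, 4.0, 5.0],
             [0.0, 0.0, 2.0, 2.0]]
    print(trial_features(trial))
```

The resulting vector has three entries per channel and can be passed directly to any of the classifiers discussed in Section 3.2.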

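Common frequency-domain methods include power spectral density (PSD)^[31,70], the discrete wavelet transform (DWT)^[64,71], and Mel frequency cepstral coefficients (MFCC)^[72]. MFCC is modeled on the characteristics of human hearing and is widely used in speech recognition, and researchers have found that it also works in speech imagery BCI systems^[73]. Riaz et al.^[35] and Cooney et al.^[74] each compared feature extraction algorithms for speech imagery BCI systems and found that MFCC features gave the best classification results on their data.

The most common spatial-domain method is common spatial patterns (CSP). The algorithm was originally applied to two-class BCI systems; it extracts features by jointly diagonalizing the covariance matrices of the two classes of signals^[40,50,59,75-76].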
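The joint diagonalization behind CSP can be sketched as follows. This is a textbook-style formulation (whitening the composite covariance, then an eigendecomposition of one whitened class covariance) written with NumPy only; the function names, array layout, and log-variance features are assumptions for illustration and do not reproduce the exact pipeline of any cited study.

```python
import numpy as np


def csp_filters(trials_a, trials_b, n_pairs=1):
    """Return CSP spatial filters (one per row) for two classes.

    trials_a, trials_b: arrays of shape (n_trials, n_channels, n_samples).
    """
    def mean_cov(trials):
        covs = []
        for x in trials:
            x = x - x.mean(axis=1, keepdims=True)
            c = x @ x.T
            covs.append(c / np.trace(c))       # trace-normalised covariance
        return np.mean(covs, axis=0)

    cov_a, cov_b = mean_cov(trials_a), mean_cov(trials_b)
    # Whitening matrix P such that P (Ca + Cb) P^T = I.
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whitener = (evecs / np.sqrt(evals)).T
    # Eigenvectors of the whitened class-A covariance diagonalise both classes.
    _, rotation = np.linalg.eigh(whitener @ cov_a @ whitener.T)
    filters = rotation.T @ whitener            # rows ordered by class-A variance, ascending
    keep = list(range(n_pairs)) + list(range(-n_pairs, 0))
    return filters[keep]


def csp_features(trial, filters):
    """Normalised log-variance of the spatially filtered trial."""
    variances = (filters @ trial).var(axis=1)
    return np.log(variances / variances.sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data: class A is strong on channel 0, class B on channel 1.
    a = rng.standard_normal((20, 2, 200)) * np.array([[3.0], [1.0]])
    b = rng.standard_normal((20, 2, 200)) * np.array([[1.0], [3.0]])
    w = csp_filters(a, b)
    print(csp_features(a[0], w), csp_features(b[0], w))
```

The first retained filters minimize variance for class A (and so maximize it for class B), and the last ones do the opposite, which is why the log-variance features separate the two classes. The sketch assumes the composite covariance is full rank.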

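Time-domain and frequency-domain methods consider features of individual channels, whereas spatial-domain methods consider features across multiple channels, so the different feature types are complementary. CSP and various frequency-domain algorithms are the feature extraction methods most often used in speech imagery BCI systems. Garcia-Salinas et al.^[77] used tensor decomposition to combine time-, frequency-, and spatial-domain information for feature extraction; this improved classification accuracy but at a high computational cost, so future feature extraction could use feature selection and fusion algorithms to pick out the most discriminative features. Besides these common algorithms, Riemannian geometry^[78], brain connectivity features^[53], and EEG cortical currents^[79] have also been applied to speech imagery BCI systems.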

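3.2. Classification and decoding

Classification and decoding distinguish brain states by establishing the correspondence between the extracted features and the brain states. Current classification and decoding algorithms for speech imagery BCI systems fall into two groups: classical machine learning and the more recent deep learning.

Common machine learning classifiers include linear discriminant analysis (LDA)^[80-81], the extreme learning machine (ELM)^[29,82], the support vector machine (SVM)^[75,83], and random forest (RF)^[84-85]. Min et al.^[67] classified speech imagery EEG data and found that ELM and its variants outperformed LDA and an SVM with a radial basis function kernel. Matsumoto et al.^[86] used both a relevance vector machine (RVM) and an SVM with Gaussian kernels; with little training data the Gaussian-kernel SVM classified better, so that algorithm suits online systems. Notably, Sereshkeh et al. recorded different neural signals in two studies, EEG, which reflects neuronal electrical activity^[71], and fNIRS, which reflects blood-oxygen metabolism in brain tissue^[33], and chose different algorithms accordingly. For EEG, Sereshkeh et al. compared the classification accuracy of regularized LDA, SVM, naive Bayes (NB), k-nearest neighbors (KNN), and an artificial neural network (ANN, a multilayer perceptron), and found that the ANN was the most accurate. For fNIRS, they reported that regularized LDA was more accurate than SVM (with linear, polynomial, radial basis function, and sigmoid kernels), an ANN (a multilayer perceptron with one hidden layer), and NB. Newer classification approaches such as transfer learning^[87] and adaptive classifiers^[88] have also been applied to speech imagery BCI systems.

In classical machine learning, feature extraction and classification are performed separately; the algorithms chosen for the two steps do not necessarily work best together, and the choice depends heavily on the researcher's experience. Deep learning avoids this problem: in some cases it needs no separate feature extraction and instead learns feature extraction and classification jointly from the data. As a particular kind of machine learning, deep learning has been applied to speech imagery BCI systems for both classification^[89] and decoding^[27]. Decoding continuous sentences from neural signals is a nonlinear transformation and is difficult, but deep learning can extract valuable information directly from complex sequences. As an end-to-end approach it can compensate for missing prior knowledge (for example, which channels are decisive for decoding), and it has greater potential to improve classification and decoding accuracy. Deep learning algorithms commonly used in speech imagery BCI systems include convolutional neural networks (CNN)^[90-91], recurrent neural networks (RNN)^[65,92], deep neural networks (DNN)^[93-94], and long short-term memory networks (LSTM)^[27,89]. Besides deep learning, the Viterbi algorithm^[95], widely used in natural language processing, has also been applied to decoding^[32].

Classical machine learning algorithms have matured over a long period of development, but they are used mainly for classification and must be paired with feature extraction algorithms, which limits them. Deep learning has many advantages but also poses problems in BCI applications: different network architectures must be designed for different data; large data sets are usually needed to train and tune the parameters; and building an online BCI system is difficult.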
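To make the role of the Viterbi algorithm in decoding concrete, the sketch below finds the most likely sequence of hidden states in a small hidden Markov model. It is a generic illustration rather than the decoder of any cited study: the two hidden states stand in for speech units, the observation symbols "x", "y", "z" stand in for discretized neural features, and all probabilities are made-up values.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Most likely hidden-state sequence of an HMM, using log-probabilities."""
    if not observations:
        return []
    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {kk: math.log(v) for kk, v in row.items()} for k, row in table.items()}


if __name__ == "__main__":
    states = ["a", "u"]                      # hypothetical speech units
    start = {"a": math.log(0.6), "u": math.log(0.4)}
    trans = to_log({"a": {"a": 0.7, "u": 0.3}, "u": {"a": 0.4, "u": 0.6}})
    emit = to_log({"a": {"x": 0.5, "y": 0.4, "z": 0.1},
                   "u": {"x": 0.1, "y": 0.3, "z": 0.6}})
    print(viterbi(["x", "y", "z"], states, start, trans, emit))
```

In a sentence-decoding setting the transition probabilities would typically come from a language model and the emission probabilities from a model of the neural features, so that word order constrains the decoded sequence.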

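3.3. Comparison of typical algorithms

Because data acquisition protocols (subjects, experimental paradigms, imagery materials, and so on) differ between studies, algorithm performance on speech imagery data is compared here using studies that share the same data set. Table 5^[35,45,58,66,69,74,77-78,87,89,96-98] compares feature extraction and classification algorithms for speech imagery BCI systems; all of the studies classify EEG signals recorded during speech imagery.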

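Table 5. Comparison of feature extraction and classification algorithms for speech imagery BCI systems.

Reference	Materials	Feature extraction; classification	Accuracy
DaSalla et al.^[45], 2009	Vowels: /a/ and /u/	CSP; SVM	Pairwise binary classification (mean): /a/ vs. idle 72.33%, /u/ vs. idle 78.00%, /a/ vs. /u/ 62.67%
Riaz et al.^[35], 2015	Vowels: /a/ and /u/^[45]	MFCC; KNN	Pairwise binary classification (mean): /a/ vs. idle 75.00%, /u/ vs. idle 93.83%, /a/ vs. /u/ 91.83%
Iqbal et al.^[69], 2016	Vowels: /a/ and /u/^[45]	Mean, standard deviation; linear classifier	Pairwise binary classification (mean): /a/ vs. idle 94.17%, /u/ vs. idle 100.00%, /a/ vs. /u/ 95.00%
Zhao et al.^[66], 2015	Phonemes or syllables: /iy/, /uw/, /piy/, /tiy/, /diy/, /m/, and /n/; words: "pat", "pot", "knew", and "gnaw"	Time-domain statistics such as mean, median, and standard deviation; SVM	Binary classification of phonological categories: mean 55.40%, best 79.16%
Sun et al.^[96], 2016	Phonemes or syllables: /iy/, /uw/, /piy/, /tiy/, /diy/, /m/, and /n/; words: "pat", "pot", "knew", and "gnaw"^[66]	Neural network (NN)	Binary classification of phonological categories: mean 69.80%, best 87.00%
Saha et al.^[89], 2019	Phonemes or syllables: /iy/, /uw/, /piy/, /tiy/, /diy/, /m/, and /n/; words: "pat", "pot", "knew", and "gnaw"^[66]	Channel-wise covariance matrix; CNN, LSTM, deep autoencoder (DAE), extreme gradient boosting (XGBoost)	Binary classification of phonological categories: mean 77.90%, best 85.23%
Bakhshali et al.^[78], 2020	Phonemes or syllables: /iy/, /uw/, /piy/, /tiy/, /diy/, /m/, and /n/; words: "pat", "pot", "knew", and "gnaw"^[66]	Riemannian distance based on correntropy spectral density; KNN	Binary classification of phonological categories: mean 77.39%, best 86.52%
Cooney et al.^[74], 2018	Phonemes or syllables: /iy/, /uw/, /piy/, /tiy/, /diy/, /m/, and /n/; words: "pat", "pot", "knew", and "gnaw"^[66]	MFCC; SVM	11-class classification: mean 20.80%, best 33.33%
Panachakel et al.^[97], 2019	Phonemes or syllables: /iy/, /uw/, /piy/, /tiy/, /diy/, /m/, and /n/; words: "pat", "pot", "knew", and "gnaw"^[66]	DWT; DNN	11-class classification: mean 57.15%, best 84.23%
Coretto et al.^[58], 2017	Five vowels: /a/, /e/, /i/, /o/, and /u/; six Spanish words: "arriba", "abajo", "izquierda", "derecho", "adelante", and "atras" (up, down, left, right, forward, backward)	DWT; RF	Five-class vowels: mean 22.32%; six-class words: mean 18.58%
Garcia-Salinas et al.^[77], 2018	Six Spanish words: "arriba", "abajo", "izquierda", "derecho", "adelante", and "atras"^[58]	Tensor decomposition; SVM	Six-class words: mean 59.70%
Cooney et al.^[87], 2019	Five vowels: /a/, /e/, /i/, /o/, and /u/^[58]	CNN, transfer learning	Five-class vowels: mean 35.68%
Cooney et al.^[98], 2020	Five vowels: /a/, /e/, /i/, /o/, and /u/; six Spanish words: "arriba", "abajo", "izquierda", "derecho", "adelante", and "atras"^[58]	Shallow CNN, deep CNN, EEGNet	Five-class vowels: mean 30.00%; six-class words: mean 24.97%

Note: where the Materials column cites no reference, the study used data collected by its own authors; where a reference is cited, the study used the data from the cited reference.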

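The table shows that each data set has algorithms that suit it, and no single algorithm performs well on all data sets; a well-chosen feature extraction algorithm can achieve classification accuracy comparable to deep learning. The algorithm should therefore be chosen to fit the data: deep learning when data are plentiful, transfer learning when data are scarce, and SVM for online systems. Existing algorithms can also be improved to handle small, noisy, and non-stationary data. In addition, data processing algorithms from speech and language research that suit speech imagery BCI systems can be adopted, such as MFCC (from speech recognition) and the Viterbi algorithm (from natural language processing).

Likewise, no particular combination of feature extraction and classification algorithms performs well on every speech imagery data set. The classic combination is CSP with SVM^[37,45,99], but as algorithms have developed, adaptive methods, Riemannian geometry, and deep learning have also come into wide use.

The great majority of speech imagery BCI studies evaluate algorithms by accuracy alone; only a few use additional metrics such as the Kappa coefficient^[56] or sensitivity and specificity^[100]. Future speech imagery research should therefore evaluate algorithm performance comprehensively with other metrics as well, such as information transfer rate and failure rate^[101].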
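As one example of a metric beyond accuracy, the information transfer rate (ITR) combines the number of classes, the accuracy, and the time per selection. The sketch below assumes the widely used Wolpaw definition, B = log2 N + P log2 P + (1 − P) log2[(1 − P)/(N − 1)] bits per selection; the source does not say which definition it has in mind, and the 10-second trial length in the example is a hypothetical value.

```python
import math


def bits_per_selection(n_classes, accuracy):
    """Wolpaw information per selection, in bits (assumed definition)."""
    if n_classes < 2:
        raise ValueError("need at least two classes")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    bits = math.log2(n_classes)
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        # Errors are assumed to be spread evenly over the other classes.
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits


def itr_bits_per_minute(n_classes, accuracy, seconds_per_selection):
    """Information transfer rate in bits per minute."""
    if seconds_per_selection <= 0:
        raise ValueError("selection time must be positive")
    return bits_per_selection(n_classes, accuracy) * 60.0 / seconds_per_selection


if __name__ == "__main__":
    # Three classes at 64.10% accuracy (the online three-class accuracy quoted
    # in Section 4.1), with a hypothetical 10 s per selection.
    print(itr_bits_per_minute(3, 0.641, 10.0))
```

Because the formula is zero at chance level and grows with both accuracy and the number of classes, it allows two-class and multi-class systems to be compared on a common scale.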

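4. Open problems and future prospects

4.1. Online systems

BCI systems typically develop from offline analysis toward real-time online operation. Early offline analysis explores the feasibility of the speech imagery paradigm and identifies suitable experimental paradigms and imagery materials; as the field develops, suitable data processing algorithms should be chosen and gradually applied to online systems. Real-time online output/control is the gold standard for evaluating a BCI system, and online systems have more practical value, but most current speech imagery BCI systems rely on offline analysis and few operate online in real time.

Online classification systems based on speech imagery tasks have mostly addressed two- or three-class problems^[9,33,102]. The real-time online speech imagery BCI system designed by Sereshkeh et al.^[41] achieved a mean accuracy of 75.90% between imagining "no" and rest and 69.30% between imagining "no" and "yes", and their later online three-class system (imagining "no", imagining "yes", and rest) achieved a mean accuracy of 64.10%^[33]. The real-time online speech imagery BCI system of Chaudhary et al.^[9] exceeded 70.00% accuracy between imagining "no" and "yes". Wang et al.^[102] designed an online BCI training system that combined speech imagery with motor imagery, with online classification accuracies all above 80.00%.

In an online system, data must be processed as they are acquired. Neural signals vary between individuals and are non-stationary, which makes online systems difficult to develop; adaptive algorithms^[103] and transfer learning^[104] can alleviate this problem to some extent. Moreover, because online speech imagery BCI systems have mostly been limited to two or three classes, developing online, real-time multi-class BCI systems would be more valuable.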

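4.2. Experimental paradigms

Speech imagery BCI systems have no fixed experimental paradigm, and some studies still design paradigms in an exploratory way. Standardized paradigms could therefore be designed on the basis of research into neural mechanisms and previous paradigms, which would advance speech imagery BCI systems.

Previous studies have used both meaningless and meaningful speech imagery materials. Meaningless phonemes or syllables differ in place of articulation and mouth movement when pronounced, and these differences can be used to distinguish brain states. Because the words or syllables of some languages (such as English and Japanese) are built from one of five vowels plus consonants, the five vowels are the most commonly used meaningless material. Meaningful materials, besides being separable, have the potential to be used in real life; examples are answer words, direction words, and high-frequency words commonly used by paralyzed/aphasic patients. Future studies should therefore prefer syllables with referential meaning, or words and sentences common in daily life, so that patients with locked-in syndrome or speech disorders can control devices through speech imagery, carry out simple activities, and communicate relatively fluently.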

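4.3. Speech imagery data

Most data in speech imagery research have been collected from healthy subjects, yet the purpose of the speech imagery paradigm is to improve the communication ability of patients with speech disorders, so future studies should include subjects with speech disorders wherever possible. Furthermore, most studies use speech imagery data for classification, and only a few have applied speech imagery to cursor or smart home control. Future research should therefore combine the speech imagery paradigm with practical control, which would give it real applications and also increase subjects' motivation and sense of achievement.

Few speech imagery databases exist; those available cover Spanish^[58] and English^[66]. Although Chinese has the largest number of native speakers of any language, there is no database of imagined Chinese characters. Researchers who are interested in speech imagery BCI systems and able to do so could publish the Chinese character imagery data they collect, to speed up the development of classification and decoding algorithms for Chinese character imagery.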

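4.4. Sentence decoding

People communicate in daily life with sentences, not isolated characters or words, so research on sentence decoding has greater significance and practical value. It gives a fuller picture of how the brain processes language and can also advance classification-based speech imagery BCI systems. Decoding research for BCI systems based on imagined sentences has not yet been widely pursued. Current studies all need neural signals recorded while subjects read sentences aloud for training and decoding; as research advances, training and decoding may become possible from imagery alone, without reading sentences aloud.

Makin et al.^[65] treated the conversion of ECoG signals recorded during sentence reading into text as analogous to machine translation, and decoded subjects' neural signals with a mean word error rate of about 3%. Sun et al.^[28] used their own deep learning architecture to train on and decode ECoG signals recorded while sentences were read aloud and silently mimed, with a best word error rate of 7%. Although decoding is accurate on limited sentence sets, applying the technology to natural communication requires further exploration, for example of how much data is enough for everyday communication and how to obtain enough training data.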
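The word error rate quoted above is conventionally the word-level edit distance between the decoded sentence and the reference sentence, divided by the number of reference words. A minimal sketch under that standard definition follows; the function name and the example sentences are illustrative, and the cited studies may apply their own text normalization before scoring.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference sentence must contain at least one word")
    # prev[j]: edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("help" -> "kelp") and one deletion ("now") in four words.
    print(word_error_rate("i need help now", "i need kelp"))
```

Because insertions are counted, the word error rate can exceed 1 when the decoded sentence is much longer than the reference.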

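The studies above show that sentence-related neural signals are usually decoded from ECoG. Although ECoG has a high signal-to-noise ratio, it requires surgical implantation of electrodes, which restricts such research to specific groups of people. Future work should therefore aim to decode sentences from non-invasively acquired neural signals by optimizing and improving data processing algorithms.

4.5. Analysis of neural signals in multiple states

When recording neural signals for classification, some studies do not confine themselves to speech imagery. Neural signals recorded during overt speech and during auditory stimulation/cueing can also be classified, and classification accuracy in these two states is clearly higher than in the imagery state^[49,64]. Other studies classify neural signals between mental states: the rest, stimulus/cue, imagery, and overt speech periods. Classifying between states makes it possible to monitor the subject's brain activity and so to distinguish control from non-control states in an online asynchronous system^[66,105]. Classifying different speech tasks within each state, and classifying between states, can both advance speech imagery BCI systems. Similarly, Wang et al.^[106] proposed a mental imagery paradigm that combines speech imagery with motor imagery, which improves classification accuracy without increasing the mental burden of the task.

4.6. Decoding speech information from multimodal signals

A speech imagery BCI system can classify and decode speech information from one type of neural signal, or from two types, such as a hybrid of EEG and fNIRS^[52]. Different types of neural signals complement each other and so improve BCI performance.

Speech production involves neural activity in the brain and movement of the tongue and other articulators, and all of these biosignals carry information about speech. Speech information can therefore be decoded not only from neural signals but also from articulator movement and electromyographic (EMG) signals^[107]; Zhao et al.^[66], for example, used several modalities to classify phonemes, syllables, and words. Future speech imagery BCI systems could combine neural signals with physiological signals such as articulator movement, EMG, and facial features (tongue, throat, and lips) to provide more degrees of freedom and higher efficiency, which would especially suit patients who have an articulation disorder but can still move their articulators. Fused multimodal physiological signals are rich in information, but they also make data acquisition more complex, so developing lightweight multimodal acquisition devices is another issue for the future.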

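4.7. Future development and applications of speech imagery BCI systems

Speech imagery BCI systems have broad potential applications in many fields, such as communication restoration, the military, education, and entertainment, and have great research value and development potential. Figure 2 shows the applications of speech imagery BCI systems.

The main future applications are in communication restoration and the military. Classic BCI paradigms for communication restoration are the steady-state visual evoked potential (SSVEP) and the P300, both of which can drive a spelling system that helps patients with speech disorders communicate with the outside world. Both, however, depend on evoking stimuli, which fatigue the subject; the speech imagery paradigm avoids this drawback and can express the intended content directly. As the technology matures, speech imagery could be applied in the military: by acquiring, analyzing, and decoding neural signals, people could communicate with one another without speaking, enabling silent encrypted communication. Speech imagery could also be used to build multi-person coordinated decision fusion systems that draw on the wisdom of the group to improve decision accuracy.

The speech imagery paradigm can support not only communication but also control of devices and the environment, because it offers more commands than the motor imagery paradigm. Beyond conventional cursor and wheelchair control, speech imagery BCI systems could be combined with Internet of Things technology to control smart homes, or with intelligent driving technology to assist driving. Future development could also introduce neurofeedback, visualizing neural features such as regional brain activation to monitor and improve the subject's speech imagery ability^[108].

In education, a speech imagery BCI system could decode the detected neural signals and compare the decoded information with the current learning task, in order to assess learning state and quantify concentration. Applications in this field, however, raise not only technical problems but also ethical ones, such as user privacy and network security.

In security, speech imagery could be applied to brainprint recognition, that is, identification and authentication of individuals from their neural signals. Most brainprint studies are based on the resting state, motor imagery, event-related potentials, or visual evoked potentials^[109], and few use speech imagery, so this technology has broad prospects^[110].

In entertainment, speech imagery BCI systems also have good applications. Speech imagery could be used to build typing games that entertain while helping patients with speech disorders quickly master a speech-imagery-based BCI spelling system. It could also be combined with virtual reality so that game characters are controlled directly by speech imagery, with no additional external controller, for an immersive experience^[111-112].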

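5. Conclusion

Speech imagery BCI technology has entered a stage of rapid development. Diverse research has made the technology increasingly complex and varied, and a unified standard is still far off. This article focused on two core issues, experimental paradigms and data processing, analyzed both systematically, and identified specific problems concerning online systems, experimental paradigms, speech imagery data, and sentence decoding. This analysis can help researchers organize their thinking and offers a useful reference for the further development of speech imagery BCI technology. In the future, developing the speech imagery paradigm to the point of natural interaction will require combining it with psychology, neuroscience, computer science, and other related disciplines, advancing the research through an interdisciplinary approach, and translating the paradigm into industrial applications^[113].

Declarations

Conflict of interest: All authors declare that they have no conflict of interest.

Author contributions: Yanpeng Liu was responsible for the literature search, synthesis, and writing of the manuscript; Anmin Gong revised and supplemented the manuscript; Peng Ding, Lei Zhao, Qian Qian, Jianhua Zhou, and Lei Su handled and addressed the revision comments; Yunfa Fu reviewed and proofread the manuscript.

Funding Statement

National Natural Science Foundation of China (81771926, 61763022, 82172058, 62006246)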

References

1. Wolpaw J R, Birbaumer N, Mcfarland D J, et al. Brain–computer interfaces for communication and control. Clin Neurophysiol. 2002;113(6):767–791. doi: 10.1016/S1388-2457(02)00057-3.
2. Fu Yunfa, Guo Yanlong, Zhang Xiabing, et al. Brain-computer interface: transformative human-computer interaction. Beijing: National Defense Industry Press, 2020. [In Chinese]
3. Li Zhaoyang, Gong Anmin, Fu Yunfa. Recognition of visual imagery of lower-limb movements based on EEG brain networks. Journal of Nanjing University (Natural Science). 2020;56(4):570–580. [In Chinese]
4. Yousefi R, Sereshkeh A R, Chau T. Development of a robust asynchronous brain-switch using ErrP-based error correction. J Neural Eng. 2019;16(6):066042. doi: 10.1088/1741-2552/ab4943.
5. Schafer E W P. Cortical activity preceding speech: Semantic specificity. Nature. 1967;216(5122):1338–1339. doi: 10.1038/2161338a0.
6. Hiraiwa A, Shimohara K, Tokunaga Y. EEG topography recognition by neural networks. IEEE Eng Med Biol. 1990;9(3):39–42. doi: 10.1109/51.59211.
7. Suppes P, Lu Z L, Han B. Brain wave recognition of words. P Natl Acad Sci USA. 1997;94(26):14965–14969. doi: 10.1073/pnas.94.26.14965.
8. Brumberg J S, Wright E J, Andreasen D S, et al. Classification of intended phoneme production from chronic intracortical microelectrode recordings in speech-motor cortex. Front Neurosci-Switz. 2011;5:00065. doi: 10.3389/fnins.2011.00065.
9. Chaudhary U, Xia B, Silvoni S, et al. Brain–computer interface–based communication in the completely locked-in state. PLoS Biol. 2017;15(1):e1002593. doi: 10.1371/journal.pbio.1002593. [Retracted]
10. Chen Fei, Pan Changjie. A review of brain-computer interfaces based on speech imagery. Journal of Signal Processing. 2020;36(6):816–830. [In Chinese]
11. Schultz T, Wand M, Hueber T, et al. Biosignal-based spoken communication: A survey. IEEE-ACM T Audio Spe. 2017;25(12):2257–2271.
12. Cooney C, Folli R, Coyle D. Neurolinguistics research advancing development of a direct-speech brain-computer interface. iScience. 2018;8:103–125. doi: 10.1016/j.isci.2018.09.016.
13. Martin S, Millan J D R, Knight R T, et al. The use of intracranial recordings to decode human language: Challenges and opportunities. Brain Lang. 2016;193(2019):73–83. doi: 10.1016/j.bandl.2016.06.003.
14. Martin S, Iturrate I, Millan J D R, et al. Decoding inner speech using electrocorticography: Progress and challenges toward a speech prosthesis. Front Neurosci-Switz. 2018;12:00422. doi: 10.3389/fnins.2018.00422.
15. Panachakel J T, Ramakrishnan A G. Decoding covert speech from EEG-A comprehensive review. Front Neurosci-Switz. 2021;15:642251. doi: 10.3389/fnins.2021.642251.
16. Oppenheim G M, Dell G S. Motor movement matters: The flexible abstractness of inner speech. Mem Cognition. 2010;38(8):1147–1160. doi: 10.3758/MC.38.8.1147.
17. Palmer E D, Rosen H J, Ojemann J G, et al. An event-related fMRI study of overt and covert word stem completion. Neuroimage. 2001;14(1):182–193. doi: 10.1006/nimg.2001.0779.
18. Huang J, Carr T H, Cao Y. Comparing cortical activations for silent and overt speech using event-related fMRI. Hum Brain Mapp. 2002;15(1):39–53. doi: 10.1002/hbm.1060.
19. Basho S, Palmer E D, Rubio M A, et al. Effects of generation mode in fMRI adaptations of semantic fluency: Paced production and overt speech. Neuropsychologia. 2007;45(8):1697–1706. doi: 10.1016/j.neuropsychologia.2007.01.007.
20. Shuster L I, Lemieux S K. An fMRI investigation of covertly and overtly produced mono- and multisyllabic words. Brain Lang. 2005;93(1):20–31. doi: 10.1016/j.bandl.2004.07.007.
21. Goto T, Hirata M, Umekawa Y, et al. Frequency-dependent spatiotemporal distribution of cerebral oscillatory changes during silent reading: A magnetoencephalograhic group analysis. Neuroimage. 2011;54(1):560–567. doi: 10.1016/j.neuroimage.2010.08.023.
22. Shergill S, Brammer M, Fukuda R, et al. Modulation of activity in temporal cortex during generation of inner speech. Hum Brain Mapp. 2002;16(4):219–227. doi: 10.1002/hbm.10046.
23. Hurlburt R T, Alderson-day B, Kuhn S, et al. Exploring the ecological validity of thinking on demand: Neural correlates of elicited vs. spontaneously occurring inner speech. PloS One. 2016;11(2):e0147932. doi: 10.1371/journal.pone.0147932.
24. Kellis S, Miller K, Thomson K, et al. Decoding spoken words using local field potentials recorded from the cortical surface. J Neural Eng. 2010;7(5):056007. doi: 10.1088/1741-2560/7/5/056007.
25. Leuthardt E C, Gaona C, Sharma M, et al. Using the electrocorticographic speech network to control a brain-computer interface in humans. J Neural Eng. 2011;8(3):036004. doi: 10.1088/1741-2560/8/3/036004.
26. Leuthardt E C, Schalk G, Wolpaw J R, et al. A brain-computer interface using electrocorticographic signals in humans. J Neural Eng. 2004;1(2):63–71. doi: 10.1088/1741-2560/1/2/001.
27. Anumanchipalli G K, Chartier J, Chang E F. Speech synthesis from neural decoding of spoken sentences. Nature. 2019;568(7753):493–498. doi: 10.1038/s41586-019-1119-1.
28.Sun P, Anumanchipalli G K, Chang E F Brain2Char: a deep architecture for decoding text from brain recordings. J Neural Eng. 2020;17(6):066015. doi: 10.1088/1741-2552/abc742. [DOI] [PMC free article] [PubMed] [Google Scholar]
29.Kaongoen N, Choi J, Jo S Speech-imagery-based brain-computer interface system using ear-EEG. J Neural Eng. 2021;18(1):016023. doi: 10.1088/1741-2552/abd10e. [DOI] [PubMed] [Google Scholar]
30.Guenther F H, Brumberg J S, Wright E J, et al A wireless brain-machine interface for real-time speech synthesis. PloS One. 2009;4(12):e8218. doi: 10.1371/journal.pone.0008218. [DOI] [PMC free article] [PubMed] [Google Scholar]
31.Pei X, Barbour D, Leuthardt E C, et al Decoding vowels and consonants in spoken and imagined words using electrocorticographic signals in humans. J Neural Eng. 2011;8(4):046028. doi: 10.1088/1741-2560/8/4/046028. [DOI] [PMC free article] [PubMed] [Google Scholar]
32.Herff C, Heger D, Pesters A D, et al Brain-to-text: decoding spoken phrases from phone representations in the brain. Front Neurosci-Switz. 2015;9:00217. doi: 10.3389/fnins.2015.00217. [DOI] [PMC free article] [PubMed] [Google Scholar]
33.Sereshkeh A R, Yousefi R, Wong A T, et al Online classification of imagined speech using functional near-infrared spectroscopy signals. J Neural Eng. 2018;16(1):016005. doi: 10.1088/1741-2552/aae4b9. [DOI] [PubMed] [Google Scholar]
34.Lee S H, Lee M, Jeong J H, et al Towards an EEG-based intuitive BCI communication system using imagined speech and visual imagery// 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC) Bari: IEEE. 2019:4409–4414. [Google Scholar]
35.Riaz A, Akhtar S, Iftikhar S, et al Inter comparison of classification techniques for vowel speech imagery using EEG sensors// The 2014 2nd International Conference on Systems and Informatics (ICSAI 2014) Shanghai: IEEE. 2015:712–717. [Google Scholar]
36.Koizumi, K, Ueda K, Nakao M Development of a cognitive brain-machine interface based on a visual imagery method// 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) Honolulu: IEEE. 2018:1062–1065. doi: 10.1109/EMBC.2018.8512520. [DOI] [PubMed] [Google Scholar]
37.Wang L, Zhang X, Zhong X, et al Analysis and classification of speech imagery EEG for BCI. Biomed Signal Proces. 2013;8(6):901–908. doi: 10.1016/j.bspc.2013.07.011. [DOI] [Google Scholar]
38.Jahangiri A, Sepulveda F The contribution of different frequency bands in class separability of covert speech tasks for BCIs// 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) Jeju: IEEE. 2017:2093–2096. doi: 10.1109/EMBC.2017.8037266. [DOI] [PubMed] [Google Scholar]
39.D’Zmura M, Deng S, Lappas T, et al Toward EEG sensing of imagined speech// Jacko J A. Human-computer interaction. New trends. HCI 2009. Lecture notes in computer science. Berlin, Heidelberg: Springer. 2009:40–48. [Google Scholar]
40.郭苗苗, 齐志光, 王磊, 等语言脑机接口康复系统中的参数优化研究. 信号处理. 2018;34(8):974–983. [Google Scholar]
41.Sereshkeh A R, Trott R, Bricout A, et al Online EEG classification of covert speech for brain–computer interfacing. Int J Neural Syst. 2017;27(8):1750033. doi: 10.1142/S0129065717500332. [DOI] [PubMed] [Google Scholar]
42.Ikeda S, Shibata T, Nakano N, et al Neural decoding of single vowels during covert articulation using electrocorticography. Front Hum Neurosci. 2014;8:00125. doi: 10.3389/fnhum.2014.00125. [DOI] [PMC free article] [PubMed] [Google Scholar]
43.Crone N E, Hao L, Hart J, et al Electrocorticographic gamma activity during word production in spoken and sign language. Neurology. 2001;57(11):2045–2053. doi: 10.1212/WNL.57.11.2045. [DOI] [PubMed] [Google Scholar]
44.Lotte F, Brumberg J S, Brunner P, et al Electrocorticographic representations of segmental features in continuous speech. Front Hum Neurosci. 2015;9:00097. doi: 10.3389/fnhum.2015.00097. [DOI] [PMC free article] [PubMed] [Google Scholar]
45.DaSalla C S, Kambara H, Sato M, et al Single-trial classification of vowel speech imagery using common spatial patterns. Neural Netw. 2009;22(9):1334–1339. doi: 10.1016/j.neunet.2009.05.008. [DOI] [PubMed] [Google Scholar]
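46. Yang Xiaofang, Jiang Minghu. Research on a brain-computer interface based on imagined articulation of Chinese phonemes. Journal of Chinese Information Processing. 2014;28(5):13–23. doi: 10.3969/j.issn.1003-0077.2014.05.002. [In Chinese]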
47. Kim H J, Lee M H, Lee M. A BCI based smart home system combined with event-related potentials and speech imagery task// 2020 8th International Winter Conference on Brain-Computer Interface (BCI). Gangwon: IEEE. 2020:1–6.
48. Wang L, Liu X, Liang Z, et al. Analysis and classification of hybrid BCI based on motor imagery and speech imagery. Measurement. 2019;147:106842. doi: 10.1016/j.measurement.2019.07.070.
49. Martin S, Brunner P, Iturrate I, et al. Word pair classification during imagined speech using direct brain recordings. Sci Rep-UK. 2016;6:25803. doi: 10.1038/srep25803.
50. Zhang X, Li H, Chen F. EEG-based classification of imaginary Mandarin tones// 2020 42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). Montreal: IEEE. 2020:3889–3892. doi: 10.1109/EMBC44109.2020.9176608.
51. Akbari H, Khalighinejad B, Herrero J L, et al. Towards reconstructing intelligible speech from the human auditory cortex. Sci Rep-UK. 2019;9(1):874. doi: 10.1038/s41598-018-37359-z.
52. Sereshkeh A R, Yousefi R, Wong A T, et al. Development of a ternary hybrid fNIRS-EEG brain–computer interface based on imagined speech. Brain-Computer Interfaces. 2019;6(2):1–13.
53. Chengaiyan S, Retnapandian A S, Anandan K. Identification of vowels in consonant-vowel-consonant words from speech imagery based EEG signals. Cogn Neurodynamics. 2019;14(1):1–19. doi: 10.1007/s11571-019-09558-5.
54. Qureshi M N I, Min B, Park H J, et al. Multiclass classification of word imagination speech with hybrid connectivity features. IEEE T Bio-Med Eng. 2018;65(10):2168–2177. doi: 10.1109/TBME.2017.2786251.
55. Mohanchandra K, Saha S. A communication paradigm using subvocalized speech: Translating brain signals into speech. Augment Hum Res. 2016;1:3. doi: 10.1007/s41133-016-0001-z.
56. Nguyen C H, Karavas G K, Artemiadis P. Inferring imagined speech using EEG signals: a new approach using Riemannian manifold features. J Neural Eng. 2018;15(1):016002. doi: 10.1088/1741-2552/aa8235.
57. AlSaleh M, Moore R, Christensen H, et al. Discriminating between imagined speech and non-speech tasks using EEG// 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). Honolulu: IEEE. 2018:1952–1955. doi: 10.1109/EMBC.2018.8512681.
58. Coretto G A P, Gareis I E, Rufiner H L. Open access database of EEG signals recorded during imagined speech// 12th International Symposium on Medical Information Processing and Analysis (SIPAIM). Tandil: International Society for Optics and Photonics. 2017:1016002.
59. Lee S H, Lee M, Lee S W. EEG representations of spatial and temporal features in imagined speech and overt speech// Palaiahnakote S, Baja G S D, Wang L, et al. Pattern recognition. Auckland: Springer. 2020:387–400.
60. Song Y, Sepulveda F. A novel onset detection technique for brain–computer interfaces using sound-production related cognitive tasks in simulated-online system. J Neural Eng. 2017;14(1):016019. doi: 10.1088/1741-2552/14/1/016019.
61. Song Y, Sepulveda F. Comparison between covert sound-production task (sound-imagery) vs. motor-imagery for onset detection in real-life online self-paced BCIs. J Neuroeng Rehabil. 2020;17(1):1–11. doi: 10.1186/s12984-020-0651-4.
62. Kumar P, Saini R, Roy P P, et al. Envisioned speech recognition using EEG sensors. Pers Ubiquit Comput. 2018;22(1):185–199. doi: 10.1007/s00779-017-1083-4.
63. Dash D, Ferrari P, Heitzman D, et al. Decoding speech from single trial MEG signals using convolutional neural networks and transfer learning// 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). Berlin: IEEE. 2019:5531–5535. doi: 10.1109/EMBC.2019.8857874.
64. Dash D, Ferrari P, Wang J. Decoding imagined and spoken phrases from non-invasive neural (MEG) signals. Front Neurosci-Switz. 2020;14:00290. doi: 10.3389/fnins.2020.00290.
65. Makin J G, Moses D A, Chang E F. Machine translation of cortical activity to text with an encoder-decoder framework. Nat Neurosci. 2020;23(4):575–582. doi: 10.1038/s41593-020-0608-8.
66. Zhao S, Rudzicz F. Classifying phonological categories in imagined and articulated speech// 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). South Brisbane: IEEE. 2015:992–996.
67. Min B, Kim J, Park H J, et al. Vowel imagery decoding toward silent speech BCI using extreme learning machine with electroencephalogram. Biomed Res Int. 2016;2016:2618265. doi: 10.1155/2016/2618265.
68. Hwang H J, Choi H, Kim J Y, et al. Toward more intuitive brain–computer interfacing: classification of binary covert intentions using functional near-infrared spectroscopy. J Biomed Opt. 2016;21(9):091303. doi: 10.1117/1.JBO.21.9.091303.
69. Iqbal S, Shanir P P M, Khan Y U, et al. Time domain analysis of EEG to classify imagined speech// Satapathy S, Raju K, Mandal J, et al. Proceedings of the Second International Conference on Computer and Communication Technologies. Advances in intelligent systems and computing. New Delhi: Springer. 2016:793–800.
70. Tottrup L, Leerskov K, Hadsund J T, et al. Decoding covert speech for intuitive control of brain-computer interfaces based on single-trial EEG: a feasibility study// 2019 IEEE 16th International Conference on Rehabilitation Robotics (ICORR). Toronto: IEEE. 2019:689–693. doi: 10.1109/ICORR.2019.8779499.
71. Sereshkeh A R, Trott R, Bricout A, et al. EEG classification of covert speech using regularized neural networks. IEEE-ACM T Audio Spe. 2017;25(12):2292–2300.
72. Hashim N, Ali A, Mohd-Isa W N. Word-based classification of imagined speech using EEG// Alfred R, Iida H, Ag I A, et al. Computational science and technology. Singapore: Springer. 2018:195–204.
73. Muda L, Begam M, Elamvazuthi I. Voice recognition algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) techniques. J Comput. 2010;2(3):138–143.
74. Cooney C, Folli R, Coyle D. Mel Frequency Cepstral Coefficients enhance imagined speech decoding from EEG// 2018 29th Irish Signals and Systems Conference (ISSC). Belfast: IEEE. 2018:1–7.
75. Lee S H, Lee M, Lee S W. Neural decoding of imagined speech and visual imagery as intuitive paradigms for BCI communication. IEEE T Neur Sys Reh. 2020;28(12):2647–2659. doi: 10.1109/TNSRE.2020.3040289.
76. Blankertz B, Tomioka R, Lemm S, et al. Optimizing spatial filters for robust EEG single-trial analysis. IEEE Signal Proc Mag. 2008;25(1):41–56. doi: 10.1109/MSP.2008.4408441.
77. Garcia-Salinas J S, Villasenor-Pineda L, Reyes-Garcia C A, et al. Tensor decomposition for imagined speech discrimination in EEG// Batyrshin I, Martinez-Villasenor M, Ponce E H. Advances in computational intelligence. Guadalajara: Springer. 2018:239–249.
78. Bakhshali M A, Khademi M, Ebrahimi-Moghadam A, et al. EEG signal classification of imagined speech based on Riemannian distance of correntropy spectral density. Biomed Signal Proces. 2020;59:101899. doi: 10.1016/j.bspc.2020.101899.
79. Yoshimura N, Nishimoto A, Belkacem A N, et al. Decoding of covert vowel articulation using electroencephalography cortical currents. Front Neurosci-Switz. 2016;10:00175. doi: 10.3389/fnins.2016.00175.
80. Jahangiri A, Achanccaray D, Sepulveda F. A novel EEG-based four-class linguistic BCI// 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). Berlin: IEEE. 2019:3050–3053. doi: 10.1109/EMBC.2019.8856644.
81. Deng S, Srinivasan R, Lappas T, et al. EEG classification of imagined syllable rhythm using Hilbert spectrum methods. J Neural Eng. 2010;7(4):046006. doi: 10.1088/1741-2560/7/4/046006.
82. Pawar D, Dhage S. Multiclass covert speech classification using extreme learning machine. Biomed Eng Lett. 2020;10(2):217–226. doi: 10.1007/s13534-020-00152-x.
83. Sarmiento L C, Cortes C J, Bacca J A, et al. Brain computer interface (BCI) with EEG signals for automatic vowel recognition based on articulation mode// 5th ISSNIP-IEEE Biosignals and Biorobotics Conference (2014): Biosignals and Robotics for Better and Safer Living (BRC). Salvador: IEEE. 2014:1–4.
84. Torres-Garcia A A, Reyes-Garcia C A, Villasenor-Pineda L. Toward a silent speech interface based on unspoken speech// Proceedings of Biosignals 2012 (BIOSTEC). Algarve: SciTePress. 2012:370–373.
85. Torres-Garcia A A, Reyes-Garcia C A, Villasenor-Pineda L, et al. Implementing a fuzzy inference system in a multi-objective EEG channel selection model for imagined speech classification. Expert Syst Appl. 2016;59:1–12. doi: 10.1016/j.eswa.2016.04.011.
86. Matsumoto M, Hori J. Classification of silent speech using support vector machine and relevance vector machine. Appl Soft Comput. 2014;20:95–102. doi: 10.1016/j.asoc.2013.10.023.
87. Cooney C, Folli R, Coyle D. Optimizing input layers improves CNN generalization and transfer learning for imagined speech decoding from EEG// 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC). Bari: IEEE. 2019:1311–1316.
88. Jimenez-Guarneros M, Gomez-Gil P. Standardization-refinement domain adaptation method for cross-subject EEG-based classification in imagined speech recognition. Pattern Recogn Lett. 2021;141:54–60. doi: 10.1016/j.patrec.2020.11.013.
89. Saha P, Fels S, Abdul-Mageed M. Deep learning the EEG manifold for phonological categorization from active thoughts// 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton: IEEE. 2019:2762–2766.
90. Cooney C, Korik A, Folli R, et al. Classification of imagined spoken word-pairs using convolutional neural networks// Gernot R M P, Jonas C D, Selina C W. Proceedings of the 8th Graz Brain Computer Interface Conference 2019. Graz: Verlag der Technischen Universität Graz. 2019:338–343.
91. Parhi M, Tewfik A H. Classifying imaginary vowels from frontal lobe EEG via deep learning// 2020 28th European Signal Processing Conference (EUSIPCO). Amsterdam: IEEE. 2021:1195–1199.
92. Saha P, Fels S. Hierarchical deep feature learning for decoding imagined speech from EEG// Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu: AAAI. 2019:10019–10020.
93. Panachakel J T, Ramakrishnan A G, Ananthapadmanabha T V. A novel deep learning architecture for decoding imagined speech from EEG. arXiv preprint arXiv:2003.09374. 2020.
94. Torres J M M, Stepanov E A, Riccardi G. EEG semantic decoding using deep neural networks// Concepts, Actions and Objects Workshop CAOs 2016. Rovereto: Personal Healthcare Agents. 2016:1–2.
95. Okhovvat M, Sharifi M, Bidgoli B M. An accurate Persian part-of-speech tagger. Comput Syst Sci Eng. 2020;35(6):423–430. doi: 10.32604/csse.2020.35.423.
96. Sun P, Qin J. Neural networks based EEG-speech models. arXiv preprint arXiv:1612.05369. 2016.
97. Panachakel J T, Ramakrishnan A G, Ananthapadmanabha T V. Decoding imagined speech using wavelet features and deep neural networks// 2019 IEEE 16th India Council International Conference (INDICON). Rajkot: IEEE. 2019:1–4.
98. Cooney C, Korik A, Folli R, et al. Evaluation of hyperparameter optimization in machine and deep learning methods for decoding imagined speech EEG. Sensors. 2020;20(16):4629. doi: 10.3390/s20164629.
99. Matsumoto M. Silent speech decoder using adaptive collection// Kuflik T, Stock O. Proceedings of the Companion Publication of the 19th International Conference on Intelligent User Interfaces. New York: Association for Computing Machinery. 2014:73–76.
100. Iqbal S, Khan Y U, Farooq O. EEG based classification of imagined vowel sounds// 2015 2nd International Conference on Computing for Sustainable Global Development (INDIACom). New Delhi: IEEE. 2015:1591–1594.
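101. Lyu Xiaotong, Ding Peng, Li Siyu, et al. Human factors engineering of brain-computer interface and its applications: human-centered brain-computer interface design and evaluation methodology. Journal of Biomedical Engineering. 2021;38(2):210–223. doi: 10.7507/1001-5515.202101093. [In Chinese]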
102. Wang L, Huang W, Yang Z, et al. A method from offline analysis to online training for the brain-computer interface based on motor imagery and speech imagery. Biomed Signal Proces. 2020;62:102100. doi: 10.1016/j.bspc.2020.102100.
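103. Xiong Xin, Yang Qiuhong, Zhou Jianhua, et al. EEG artifact processing methods in brain-computer fusion control. Journal of Kunming University of Science and Technology (Natural Science). 2021;46(3):56–70. [In Chinese]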
104. Wan Z, Yang R, Huang M, et al. A review on transfer learning in EEG signal analysis. Neurocomputing. 2021;421:1–14. doi: 10.1016/j.neucom.2020.09.017.
105. Herff C, Heger D, Putze F, et al. Cross-subject classification of speaking modes using fNIRS// International Conference on Neural Information Processing. Berlin, Heidelberg: Springer. 2012:417–424.
106. Wang L, Zhang X, Zhong X, et al. Improvement of mental tasks with relevant speech imagery for brain-computer interfaces. Measurement. 2016;91:201–209. doi: 10.1016/j.measurement.2016.05.054.
107. Denby B, Schultz T, Honda K, et al. Silent speech interfaces. Speech Commun. 2010;52(4):270–287. doi: 10.1016/j.specom.2009.08.002.
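108. Fu Yunfa, Gong Anmin, Nan Wenya. Principles and practice of neurofeedback. Beijing: Publishing House of Electronics Industry, 2021. [In Chinese]
109. Wang Luyun, Kong Wanzeng, Zhang Xinyu, et al. A review of brainprint recognition research. Chinese Journal of Biomedical Engineering. 2017;36(5):602–607. doi: 10.3969/j.issn.0258-8021.2017.05.013. [In Chinese]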
110. Brigham K, Kumar B V K V. Subject identification from electroencephalogram (EEG) signals during imagined speech// 2010 Fourth IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS). Washington: IEEE. 2010:1–8.
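111. Fu Yunfa, Gong Anmin, Chen Chao, et al. Toward practical brain-computer interfaces: narrowing the gap between research and real-world application. Beijing: Science Press, 2022. [In Chinese]
112. Fu Yunfa, Yang Qiuhong, Xu Baolei, et al. Brain-computer interfaces: principles and practice. Beijing: National Defense Industry Press, 2017. [In Chinese]
113. Luo Jiangong, Ding Peng, Gong Anmin, et al. Applications, industrial transformation and commercial value of brain-computer interface technology. Journal of Biomedical Engineering. 2022;39(2):405–415. doi: 10.7507/1001-5515.202108068. [In Chinese]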


[b82] 82.Pawar D, Dhage S Multiclass covert speech classification using extreme learning machine. Biomed Eng Lett. 2020;10(2):217–226. doi: 10.1007/s13534-020-00152-x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b83] 83.Sarmiento L C, Cortes C J, Bacca J A, et al Brain computer interface (BCI) with EEG signals for automatic vowel recognition based on articulation mode// 5th ISSNIP-IEEE Biosignals and Biorobotics Conference (2014): Biosignals and Robotics for Better and Safer Living (BRC) Salvador: IEEE. 2014:1–4. [Google Scholar]

[b84] 84.Torres-Garcia A A, Reyes-Garcia C A, Villasenor-Pineda L Toward a silent speech interface based on unspoken speech//Proceedings of Biosignals 2012 (BIOSTEC) Algarve: SciTePress. 2012:370–373. [Google Scholar]

[b85] 85.Torres-Garcia A A, Reyes-Garcia C A, Villasenor-Pineda L, et al Implementing a fuzzy inference system in a multi-objective EEG channel selection model for imagined speech classification. Expert Syst Appl. 2016;59:1–12. doi: 10.1016/j.eswa.2016.04.011. [DOI] [Google Scholar]

[b86] 86.Matsumoto M, Hori J Classification of silent speech using support vector machine and relevance vector machine. Appl Soft Comput. 2014;20:95–102. doi: 10.1016/j.asoc.2013.10.023. [DOI] [Google Scholar]

[b87] 87.Cooney C, Folli R, Coyle D Optimizing input layers improves CNN generalization and transfer learning for imagined speech decoding from EEG// 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC) Bari: IEEE. 2019:1311–1316. [Google Scholar]

[b88] 88.Jimenez-Guarneros M, Gomez-Gil P Standardization-refinement domain adaptation method for cross-subject EEG-based classification in imagined speech recognition. Pattern Recogn Lett. 2021;141:54–60. doi: 10.1016/j.patrec.2020.11.013. [DOI] [Google Scholar]

[b89] 89.Saha P, Fels S, Abdul-Mageed M Deep learning the EEG manifold for phonological categorization from active thoughts// 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Brighton: IEEE. 2019:2762–2766. [Google Scholar]

[b90] 90.Cooney C, Korik A, Folli R, et al Classification of imagined spoken word-pairs using convolutional neural networks// Gernot R M P, Jonas C D, Selina C W. Proceedings of the 8th Graz Brain Computer Interface Conference 2019. Graz: Verlag der Technischen Universitat Graz. 2019:338–343. [Google Scholar]

[b91] 91.Parhi M, Tewfik A H Classifying imaginary vowels from frontal lobe EEG via deep learning// 2020 28th European Signal Processing Conference (EUSIPCO) Amsterdam: IEEE. 2021:1195–1199. [Google Scholar]

[b92] 92.Saha P, Fels S Hierarchical deep feature learning for decoding imagined speech from EEG// Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu: AAAI. 2019:10019–10020. [Google Scholar]

[b93] 93.Panachakel J T, Ramakrishnan A G, Ananthapadmanabha T V A novel deep learning architecture for decoding imagined speech from EEG. arXiv preprint arXiv. 2020:2003.09374. [Google Scholar]

[b94] 94.Torres J M M, Stepanov E A, Riccardi G EEG semantic decoding using deep neural networks// Concepts, Actions and Objects Workshop CAOs 2016. Rovereto: Personal Healthcare Agents. 2016:1–2. [Google Scholar]

[b95] 95.Okhovvat M, Sharifi M, Bidgoli B M An accurate Persian part-of-speech tagger. Comput Syst Sci Eng. 2020;35(6):423–430. doi: 10.32604/csse.2020.35.423. [DOI] [Google Scholar]

[b96] 96.Sun P, Qin J Neural networks based EEG-speech models. arXiv preprint arXiv. 2016:1612.05369. [Google Scholar]

[b97] 97.Panachakel J T, Ramakrishnan A G, Ananthapadmanabha T V Decoding imagined speech using wavelet features and deep neural networks// 2019 IEEE 16th India Council International Conference (INDICON) Rajkot: IEEE. 2019:1–4. [Google Scholar]

[b98] 98.Cooney C, Korik A, Folli R, et al Evaluation of hyperparameter optimization in machine and deep learning methods for decoding imagined speech EEG. Sensors. 2020;20(16):4629. doi: 10.3390/s20164629. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b99] 99.Matsumoto M Silent speech decoder using adaptive collection// Kuflik T, Stock O. Proceedings of the Companion Publication of the 19th International Conference on Intelligent User Interfaces. New York: Association for Computing Machinery. 2014:73–76. [Google Scholar]

[b100] 100.Iqbal S, Khan Y U, Farooq O EEG based classification of imagined vowel sounds// 2015 2nd International Conference on Computing for Sustainable Global Development (INDIACom) New Delhi: IEEE. 2015:1591–1594. [Google Scholar]

[b101] 101.吕晓彤, 丁鹏, 李思语, 等脑机接口人因工程及应用:以人为中心的脑机接口设计和评价方法. 生物医学工程学杂志. 2021;38(2):210–223. doi: 10.7507/1001-5515.202101093. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b102] 102.Wang L, Huang W, Yang Z, et al A method from offline analysis to online training for the brain-computer interface based on motor imagery and speech imagery. Biomed Signal Proces. 2020;62:102100. doi: 10.1016/j.bspc.2020.102100. [DOI] [Google Scholar]

[b103] 103.熊馨, 杨秋红, 周建华, 等脑机融合控制中脑电伪迹处理方法. 昆明理工大学学报(自然科学版) 2021;46(3):56–70. [Google Scholar]

[b104] 104.Wan Z, Yang R, Huang M, et al A review on transfer learning in EEG signal analysis. Neurocomputing. 2021;421:1–14. doi: 10.1016/j.neucom.2020.09.017. [DOI] [Google Scholar]

[b105] 105.Herff C, Heger D, Putze F, et al Cross-subject classification of speaking modes using fnirs// International Conference on Neural Information Processing. Berlin, Heidelberg: Springer. 2012:417–424. [Google Scholar]

[b106] 106.Wang L, Zhang X, Zhong X, et al Improvement of mental tasks with relevant speech imagery for brain-computer interfaces. Measurement. 2016;91:201–209. doi: 10.1016/j.measurement.2016.05.054. [DOI] [Google Scholar]

[b107] 107.Denby B, Schultz T, Honda K, et al Silent speech interfaces. Speech Commun. 2010;52(4):270–287. doi: 10.1016/j.specom.2009.08.002. [DOI] [Google Scholar]

[b108] 108.伏云发, 龚安民, 南文雅. 神经反馈原理与实践. 北京: 电子工业出版社, 2021.

[b109] 109.汪露雲, 孔万增, 张昕昱, 等脑纹识别研究综述. 中国生物医学工程学报. 2017;36(5):602–607. doi: 10.3969/j.issn.0258-8021.2017.05.013. [DOI] [Google Scholar]

[b110] 110.Brigham K, Kumar B V K V Subject identification from electroencephalogram (EEG) signals during imagined speech// 2010 Fourth IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS) Washington: IEEE. 2010:1–8. [Google Scholar]

[b111] 111.伏云发, 龚安民, 陈超, 等. 面向实用的脑-机接口: 缩小研究与实际应用之间的差距. 北京: 科学出版社, 2022.

[b112] 112.伏云发, 杨秋红, 徐宝磊, 等. 脑机接口原理与实践. 北京: 国防工业出版社, 2017.

[b113] 113.罗建功, 丁鹏, 龚安民, 等脑机接口技术的应用、产业转化和商业价值. 生物医学工程学杂志. 2022;39(2):405–415. doi: 10.7507/1001-5515.202108068. [DOI] [PMC free article] [PubMed] [Google Scholar]
