Shmiel et al. 10.1073/pnas.0509346102.

Supporting Information

Files in this Data Supplement:

Supporting Figure 5
Supporting Figure 6
Supporting Figure 7
Supporting Figure 8
Supporting Table 1
Supporting Text




Supporting Figure 5

Fig. 5. Examples of three drawing components found by data-mining techniques. Each row represents several occurrences of the same component, which are marked by thicker lines. The drawing components were based on direction changes and on curvature changes of the drawing.





Supporting Figure 6

Fig. 6. An example for counting neural components around drawing components. Bottom trace times show the occurrences of a drawing component A. Vertical linelets show the times at which neurons 1.2 and 8.0 generated spikes. For each occurrence of A, we look at a window that started 400 ms before it and lasted for 300 ms (marked in red), and count the occurrences of each possible neural component (that consists of these two neurons) whose first spike is inside this window. Thicker blue and green linelets show the occurrences of a particular neural component in which neuron 8.0 emitted a spike and then, within 38–39 ms, neuron 1.2 emitted a spike. In this illustration, the illustrated neural component repeats four times around A.





Supporting Figure 7

Fig. 7. Dot display showing occurrences of a frequent ISI around occurrences of the drawing component that was shown in Fig. 1a. (Upper) Firing times of unit 8.0. (Lower) Firing times of unit 1.2. Each linelet represents a single spike. Each of the 63 lines in both panels shows the spikes that occurred around one of the 63 occurrences of the drawing component that were preceded at least once by the interval 38–39 ms in the window from 0.4 to 0.1 s before the start of the drawing component. The gray line in each panel represents the average firing rate using bins of 9 ms. (Scale bars: 25 spikes per s.) For all other details about the dot display, see the legend of Fig. 2.





Supporting Figure 8

Fig. 8. Distribution of ISIs in each set of measurements. Shown are the distributions of ISIs in the first and the second sets of measurements. The abscissa shows the bins, and the ordinate shows the total number of ISIs that fell into these bins in the 3 significant days of each set.





Table 1. Minimal widths of teetering windows

First set of measurements                              Second set of measurements
Day          Minimal width of teetering window, ms     Day          Minimal width of teetering window, ms
16 Jul 00    3                                         26 Aug 99    0.5
28 Jun 00    6                                         14 Sep 99    3
27 Jun 00    12                                        22 Aug 99    4





Supporting Text

Our main idea was to look for unknown relations between hand motion and patterns of spikes. In our analysis, we focused on finding relations that involve exact time delays between spikes. The relations were found using various statistical methods including data-mining techniques, and their probability of occurring by chance was evaluated by using 5,000 surrogate spike trains having the same firing-rate undulations as the actual data.

The Experiment

Macaca fascicularis monkeys were trained to hold a two-joint manipulandum and scribble in the horizontal plane. One of 19 hexagonal patches that tiled the working space was randomly selected as a target. When the monkey hit the target, a short beep was sounded, the monkey obtained a juice reward, and the target jumped at random to another location. Each piece of drawing until a reward was considered a single trial. In this way the monkey was encouraged to produce continuous movements. After training, a recording chamber and a head holder were attached to the skull under full surgical anesthesia in aseptic conditions. The chamber position was verified by MRI. Proximal arm regions were identified by cortical microstimulation and the border between M1 and Pre-M by histology. After recovery, daily recordings of cortical activity with eight individually movable microelectrodes commenced. After recording for several weeks, the monkey was trained to alternate between trials of scribbling and trials of a standard center-out motion. Data were initially stored in the computer memory and, after 10 rewards or 1 min with no reward, were transferred to disk files. Each file contained only trials with similar behavior. Single-unit activity was extracted on-line by MSD (Alpha Omega, Nazareth). Only spike shapes that were regarded as well isolated or as a mixture of two spike shapes at most were used. Two sets of experimental days were analyzed. In the first set, spike times were recorded at a time resolution of 1 ms, whereas in the second set the resolution was 0.1 ms. Simultaneously, the manipulandum position was sampled 100 times per s. Monkeys’ handling and treatment conformed to the law, were approved by the institutional ethics committee, and were supervised by a veterinarian.

Here, only data obtained for blocks of scribbling motion were analyzed. The statistics would benefit from having long stretches of stationary recording with multiple neurons. Therefore, we looked for recording days with penetrations in which many cells showed arm-related activity based on intracortical microstimulations (train duration 100–200 ms, 300 pulses per s, each lasting 0.2 ms, amplitude ≤ 50 μA) and/or passive manipulations of arm joints. Only days with at least 400 correct trials were selected. All isolated units were tested for stability of total spike count in a fixed window ranging from 1,000 ms before the reward event until the reward. For each unit, we used only a continuous range of files in which its activity was judged to be stable. The results in this article are based on 6 recording days that best fit the above criteria. These days contained 8–21 isolated units recorded through four to eight electrodes that showed stationary activity in a range of 10–162 files.

Drawing Components and Neural Components

The first goal was finding repeating patterns for each kind of data independently. These patterns are termed drawing components and neural components, respectively.

Drawing Events and Drawing Components.

To find repeated patterns of drawing, we used data-mining techniques (1, 2). For this purpose, the continuous drawing must be converted into a sequence of events (drawing events). In one experimental day there are hundreds of such events, and searching for repeating sequences of events is greatly facilitated by data-mining algorithms. A drawing event was marked as occurring at the time at which a certain drawing property changed from one range of values to another. The property itself may be arbitrarily chosen. For example, an event could be defined as a change in the drawing direction from a range of 0–30° to a range of 30–60°. Other definitions can be based on changes in the curvature or in the velocity of the drawing. Once a set of criteria for identifying drawing events is defined, the drawing data are translated into a sequence of these events along the time axis. Then, data-mining algorithms are applied to detect repeating subsequences in the translated data. The repeating subsequences are called drawing components. Naturally, different definitions of the set of criteria lead to different drawing components. Fig. 5 illustrates three drawing components found in one recording day, where two types of drawing events were used: changes in the direction and changes in the curvature.
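The conversion of a sampled trajectory into direction-change events can be sketched as follows. This is a minimal illustration, not the authors' code: the function name direction_events, the finite-difference estimate of direction, and the convention of stamping an event at the end of the sample step are all assumptions. The 0.01-s default step reflects the 100 samples per s stated for the manipulandum position.

```python
import math

def direction_events(xs, ys, dt=0.01, sector_deg=30.0):
    """Return (time_s, old_sector, new_sector) for every change of direction sector.

    xs, ys: sampled positions; dt: sampling step in seconds (assumed 100 Hz).
    The direction is estimated from successive samples (an assumption).
    """
    n_sectors = int(round(360.0 / sector_deg))
    events = []
    prev_sector = None
    for i in range(1, len(xs)):
        dx, dy = xs[i] - xs[i - 1], ys[i] - ys[i - 1]
        if dx == 0 and dy == 0:
            continue  # no movement: direction is undefined for this step
        angle = math.degrees(math.atan2(dy, dx)) % 360.0
        sector = int(angle // sector_deg) % n_sectors
        if prev_sector is not None and sector != prev_sector:
            events.append((i * dt, prev_sector, sector))
        prev_sector = sector
    return events

# A path that first heads at about 11 degrees and then turns to about 79 degrees.
events = direction_events([0, 1, 2, 2.2, 2.4], [0, 0.2, 0.4, 1.4, 2.4])
```

Events based on curvature or velocity could be produced the same way by swapping the quantity that is quantized.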

In this article, only drawing events that were defined by direction changes and velocity changes were used. All possible drawing directions were quantized into 12 sections of 30° each. Transitions from one section to another were considered as drawing events. In the first set of measurements we also used velocity changes as events by quantizing the drawing velocity into five sections of 10 cm/s each. Repeating sequences of such events that last for at most 0.5 s were considered as drawing components. In the first set of measurements, drawing components were defined as sequences that repeated at least 750 times, whereas in the second set the drawing components were defined as the 15 most frequently repeated sequences. There were 12–25 different drawing components for a given day.
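As a rough illustration of how repeating event sequences lasting at most 0.5 s might be counted, the sketch below tallies contiguous runs of events. It is a simplified stand-in for the data-mining algorithms cited in the text (1, 2), not a reimplementation of them; the function name frequent_sequences, the restriction to contiguous runs, and the max_len cap are assumptions.

```python
from collections import Counter

def frequent_sequences(events, max_duration=0.5, min_count=750, max_len=4):
    """Count contiguous event sequences that span at most max_duration seconds.

    events: list of (time_s, label) sorted by time.
    Returns {sequence_of_labels: count} for sequences seen at least min_count times.
    """
    counts = Counter()
    for i in range(len(events)):
        t0 = events[i][0]
        labels = []
        for j in range(i, min(i + max_len, len(events))):
            if events[j][0] - t0 > max_duration:
                break  # sequence would last longer than allowed
            labels.append(events[j][1])
            if len(labels) >= 2:
                counts[tuple(labels)] += 1
    return {seq: n for seq, n in counts.items() if n >= min_count}

toy = [(0.0, "a"), (0.1, "b"), (1.0, "a"), (1.1, "b"), (2.0, "a"), (2.9, "b")]
found = frequent_sequences(toy, min_count=2)
```

In the toy data the pair ("a", "b") occurs three times, but the last occurrence spans 0.9 s and is therefore not counted. Selecting the 15 most frequent sequences, as in the second set of measurements, would replace the min_count filter with counts.most_common(15).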

Neural Components.

The basic entity of the neural data is a single spike generated at a specific time by a specific neuron. A neural component was defined as a triple (n1, n2, delta) where n1 and n2 are two neurons and delta is a time interval (between spikes generated by these neurons). In this way, each pair of spikes in the neural data could be interpreted as an occurrence of some neural component. For the results given in this article, the total number of time intervals between spikes was limited to 50. In the first set of measurements, time intervals were quantized to 2 ms (i.e., 38 ms and 39 ms were both interpreted as the same interval). In the second set, in which spikes were recorded at a better resolution (0.1 ms), they were quantized to 1 ms. Hence, each pair of neurons yielded 50 distinct neural components.

Spike sorting by shapes that were recorded through a single electrode can lead to confusion. Intracellular properties that may generate precise time intervals can be confused with precise timing that is generated by the organization of activity in the network. Therefore, we considered only neural components consisting of two neurons that were recorded through different electrodes. For example, if we have three electrodes recording spikes from two neurons each, there are 30 – 6 = 24 valid pairs of neurons (note that the pairs <n1, n2> and <n2, n1> are different). Altogether, we have 24 × 50 = 1,200 potential neural components, some of which may never occur. In the days analyzed, there were thousands of such neural components.
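The counting in this example can be checked with a short sketch. The representation of a neuron as an (electrode, unit) tuple and the helper names valid_pairs and neural_component are assumptions made for illustration only.

```python
from itertools import permutations

def valid_pairs(neurons):
    """Ordered pairs of neurons recorded through different electrodes.

    neurons: list of (electrode, unit) tuples (a hypothetical representation).
    """
    return [(a, b) for a, b in permutations(neurons, 2) if a[0] != b[0]]

def neural_component(n1, n2, delta_ms, bin_ms=2):
    """Map a spike pair to the triple (n1, n2, quantized interval index)."""
    return (n1, n2, int(delta_ms // bin_ms))

# Three electrodes with two neurons each, as in the example in the text.
neurons = [(e, u) for e in (1, 2, 3) for u in (0, 1)]
pairs = valid_pairs(neurons)
n_components = len(pairs) * 50  # 50 quantized intervals per ordered pair
```

With 2-ms bins, intervals of 38 ms and 39 ms map to the same component, matching the quantization described for the first set of measurements.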

Relations Between Drawing Components and Pairs of Neurons

Once we had the time occurrences of all of the drawing components and the neural components, we were interested in finding relations between drawing components and pairs of neurons. For each drawing component A and for each possible pair of different neurons <n1, n2>, we counted the occurrences of each neural component around A that consisted of a spike from n1 and a spike from n2. In other words, we were interested in the total occurrences of each relevant time interval between a spike of n1 and a spike of n2. The time regions in which we counted these intervals were determined relative to the occurrences of A by two external parameters Tfrom and Tto. Formally, suppose that during a recording day, A occurred at {T1, T2, T3, . . . , Tn}, then the time regions are [Ti + Tfrom, Ti + Tto], where 1 ≤ i ≤ n.

Finally, we defined the support of a relation to be the total number of occurrences of the most frequent interval. Fig. 6 illustrates the count of a specific time interval around four occurrences of a drawing component where [Tfrom, Tto] was [–0.4 s, –0.1 s]. Note that in practice, the supports of the relations were computed for several [Tfrom, Tto] windows. For the first set of measurements the windows were {[–1.4, –1.1], [–1.2, –0.9], [–1.0, –0.7], …, [–0.2, 0.1]}, whereas for the second set they were {[–1.0, –0.9], [–0.9, –0.8], [–0.8, –0.7], …, [–0.1, 0.0]}. At a later stage, the range with the strongest result was selected for each recording day.
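A minimal sketch of the support computation for one drawing component and one ordered pair of neurons is given below. The function name support, the use of millisecond timestamps, and the treatment of interval boundaries are assumptions; following the legend of Fig. 6, only the first spike of each pair is required to fall inside the window.

```python
from bisect import bisect_left, bisect_right
from collections import Counter

def support(occ_times, spikes1, spikes2, t_from, t_to, bin_ms=2, n_bins=50):
    """Return (most frequent interval bin, its count) around occurrences of A.

    occ_times: occurrence times of the drawing component (ms).
    spikes1, spikes2: sorted spike times of n1 and n2 (ms).
    t_from, t_to: window limits relative to each occurrence (ms).
    """
    counts = Counter()
    for t in occ_times:
        lo = bisect_left(spikes1, t + t_from)
        hi = bisect_right(spikes1, t + t_to)
        for s1 in spikes1[lo:hi]:  # first spike must lie inside the window
            j = bisect_left(spikes2, s1)
            while j < len(spikes2) and spikes2[j] - s1 < n_bins * bin_ms:
                counts[int((spikes2[j] - s1) // bin_ms)] += 1
                j += 1
    if not counts:
        return None, 0
    return counts.most_common(1)[0]

# Toy data: n1 fires 38-39 ms before n2 ahead of each of four occurrences.
occ = [1000, 2000, 3000, 4000]
n1_spikes = [650, 1700, 2750, 3800]
n2_spikes = [688, 1739, 2788, 3839, 3900]
best = support(occ, n1_spikes, n2_spikes, -400, -100)
```

Here the 38-ms and 39-ms intervals fall into the same 2-ms bin (index 19), so the support is 4, analogous to the four repetitions illustrated in Fig. 6.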

Fig. 7 shows appearances of the drawing component that was shown in Fig. 1a as well as the spike activity around these appearances. This figure provides further indications that the relations between the neural interval and the drawing component were not random or due to trivial artifacts. First, the delay between the neural component and the drawing component (blue marks) is not evenly distributed between –0.4 and –0.1 s, as might be expected for chance relations. Second, in Lower the firing rate is stationary. Under this condition, had the red marks been random, the spike density around these marks should have approximated the autocorrelation function, which must be symmetric. However, the small troughs on both sides of the peak (relative refractoriness) are not symmetric. The difference is significant (P = 0.013).

Evaluations Using Teetered Neural Data

Using the computed support for each relation in a recording day, a statistic called the relations score was extracted for this day (details are given in the next section). Intuitively, the relations score gets larger as the likelihood of the support for the relations decreases. Once a relations score was computed for the actual data (denoted by S0), we evaluated its probability of occurring by chance.

Because relations between hand motion and firing rates of neurons have been studied extensively (3, 4), we wanted to test whether S0 was significantly higher than what we would expect from random data that have similar firing rates. To simulate random data that preserve the firing rates of the neurons, we randomly teetered the time of each original spike within a small window of W ms (5). For example, if W = 10 ms and the original time of some spike was 125 ms, its new time after teetering may be any time within [120 ms, 130 ms]. Using this technique, we generated 5,000 surrogate spike trains (when the first set of measurements was analyzed) or 1,000 surrogates (when the second set was analyzed). Each surrogate train was given a relations score (S1, S2, S3, . . ., respectively) following exactly the same procedure as for computing S0 (including reteetering 1,000 times to evaluate the probability of each relation, reselecting the best support among all possible intervals, and reselecting the best time slice [Tfrom, Tto] that leads to the largest relations score). These surrogate scores (5,000 or 1,000 values, depending on the set) were used for estimating the probability [denoted by p(S0)] of getting the value S0 by chance. For example, if only 50 of 5,000 surrogate trains exceeded the relations score of the actual data, then p(S0) was estimated as 50/5,000 = 1%. Table 1 shows the minimal width of the teetering window for each day that led to a significant value of p(S0).
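The teetering step and the estimate of p(S0) can be sketched as follows. This is an illustration under stated assumptions: the jitter is taken to be uniform within a window of W ms centered on the original spike time (consistent with the [120 ms, 130 ms] example, although the distribution is not specified in the text), and the function names teeter and surrogate_p_value are hypothetical.

```python
import random

def teeter(spike_times, w_ms, rng):
    """Move each spike to a random time within a W-ms window centered on it.

    A uniform distribution is assumed here; the text does not specify one.
    """
    half = w_ms / 2.0
    return sorted(t + rng.uniform(-half, half) for t in spike_times)

def surrogate_p_value(s0, surrogate_scores):
    """Fraction of surrogate relations scores that exceed the actual score S0."""
    exceed = sum(1 for s in surrogate_scores if s > s0)
    return exceed / len(surrogate_scores)

rng = random.Random(0)
jittered = teeter([125.0], 10, rng)

# 50 of 5,000 surrogate scores exceed S0, as in the example in the text.
p = surrogate_p_value(10.0, [11.0] * 50 + [1.0] * 4950)
```

Because each spike moves by at most W/2, firing-rate modulations slower than W are largely preserved while precise spike timing within W is destroyed.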

Table 1 lists, for each of the 6 (of 8) recording days that yielded significant results, the minimal width of the teetering window that gave a value of p(S0) <5%. The second set of measurements, where spike times were recorded at a resolution of 0.1 ms (instead of 1 ms), led to much better precision. Note that no significant results were achieved when smaller teetering windows were used.

A significant value for this probability indicates that the relations between drawing components and pairs of neurons (in that day) are damaged as a result of teetering within W ms. From this we can conclude that around the occurrences of similar behavior, pairs of spikes tend to prefer some specific time delays (also called interspike intervals or ISIs).

How are these ISIs distributed? To answer this question, we considered only ISIs that were involved in significant relations (i.e., relations with an estimated probability <0.01). Fig. 8 shows the distribution of all these ISIs in each set of measurements, consisting of 3 days each. Moreover, the number of distinct pairs of neurons that participated in significant relations in each of the sets was 21 and 44, respectively, whereas the total number of all possible pairs was 126 and 608 (fractions of approximately 17% and 7%). Note that because only a minute fraction of the neurons in the cortex is recorded, the recorded neurons would not necessarily all show the true timing accuracy of the brain.

The Computation of Relations Score

Given a teetering window of W ms, the computation of the relations score statistic for a recording day involves the following steps:

A. Generate a set of 1,000 independent teetered neural data (denoted by J1,000) by teetering the actual data within the window W. Observe that the same W was used to create both J1,000 and all surrogates for which we recalculated the relations score.

B. Recognize all potential relations between a drawing component and a pair of neurons.

C. For each relation R do the following:

1. Compute the support of R (denoted by Rsupport).

2. Skip R if its support is less than a predefined noise threshold. This step is carried out to prune noisy relations. Note that the values used for this threshold were 60, 40, 20, 20, 30, and 30 for the 6 significant recording days listed in Table 1 from left to right, respectively (depending on the firing rates of the neurons in that day).

3. Compute the support of R for each neural data in J1,000 (using the same drawing component).

4. Compute the mean μ and the variance σ² of these 1,000 values.

5. Estimate the probability of Rsupport assuming the normal distribution N(μ, σ²).

6. Set Rsurprise to –log2(Prob{Rsupport}).

D. Sort all relations by descending order of their computed Rsurprise.

E. If several relations involve the same two neurons and the same interval between them, delete all of them except the first (the one whose Rsurprise is the highest).

F. Set the relations score to the sum of the Rsurprise values of the first 10 relations.

Note that step E is done to reduce dependencies between the most unlikely relations that are used in step F. For example, if the 10 best relations contain a relation that connects a neural component N to some drawing component C1 as well as a relation that connects the same N to another drawing component C2, we do not consider both relations for computing the relations score. The logic behind this idea is that C1 and C2 may be dependent. For instance, if C1 is based on direction changes and C2 is based on velocity changes, such a dependency is expected from the two-thirds power law.
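Steps C through F can be sketched as follows, under explicit assumptions. The text does not state which probability is taken in step 5; the sketch assumes the upper-tail probability of observing a support at least as large as Rsupport under the fitted normal distribution. The dictionary layout of a relation and the names surprise and relations_score are hypothetical.

```python
import math
from statistics import mean, pvariance

def surprise(r_support, teetered_supports):
    """-log2 of the assumed upper-tail normal probability of r_support."""
    m = mean(teetered_supports)
    var = pvariance(teetered_supports)
    if var == 0:
        return 0.0 if r_support <= m else float("inf")
    z = (r_support - m) / math.sqrt(var)
    p = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(X >= r_support), an assumption
    return -math.log2(p) if p > 0 else float("inf")

def relations_score(relations, noise_threshold, top_k=10):
    """relations: dicts with keys 'neural', 'support', 'teetered' (hypothetical)."""
    scored = [
        (surprise(r["support"], r["teetered"]), r["neural"])
        for r in relations
        if r["support"] >= noise_threshold  # step C.2: prune noisy relations
    ]
    scored.sort(key=lambda item: item[0], reverse=True)  # step D
    seen, kept = set(), []
    for value, neural in scored:  # step E: keep the best relation per neural component
        if neural not in seen:
            seen.add(neural)
            kept.append(value)
    return sum(kept[:top_k])  # step F

rels = [
    {"neural": ("8.0", "1.2", 19), "support": 10, "teetered": [8, 12]},
    {"neural": ("8.0", "1.2", 19), "support": 10, "teetered": [8, 12]},
    {"neural": ("3.0", "5.1", 4), "support": 2, "teetered": [1, 3]},
]
score = relations_score(rels, noise_threshold=5)
```

In the toy input, the two relations that share a neural component contribute only once, and the third is pruned by the noise threshold, so the score equals a single surprise of 1 bit.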

Verifications

Several tests and verifications were made to ensure the reliability of the given results.

• We repeated the entire process described in this article to evaluate p(S0) on sham data. These sham data were built by teetering the actual neural data within 10 ms and then using them as the input for the entire algorithm (including reteetering the sham data 5,000 times, etc.). Note that no significant results were found for any of the recording days in any of the 7 tested [Tfrom, Tto] windows.

• No significant results were found when we invented the occurrence times of each drawing component. For this operation, if a drawing component i repeated Ni times, we selected Ni random times and repeated the entire procedure.

• No significant results were found when the time region around the drawing component was later than the occurrence of the drawing component (i.e., Tfrom > 0).

• We observed that the same significant results can be achieved by using a much smaller number of teetered data. This fact may indicate a small variance in the estimation of p(S0).

1. Mannila, H., Toivonen, H. & Verkamo, A. I. (1995) KDD 210–215.

2. Mannila, H. & Toivonen, H. (1996) KDD 146–151.

3. Evarts, E. V. (1966) J. Neurophysiol. 29, 1011–1027.

4. Georgopoulos, A. P., Kalaska, J. F., Caminiti, R. & Massey, J. T. (1982) J. Neurosci. 2, 1527–1537.

5. Date, A., Bienenstock, E. & Geman, S. (2000) Soc. Neurosci. Abstr. 26, 828.6.