Abstract
We outline here the mathematical expression of a diffusion model for cellphones malware transmitted through Bluetooth channels. In particular, we provide the deterministic formula underlying the proposed infection model, in its equivalent recursive (simple but computationally heavy) and closed form (more complex but efficiently computable) expression.
Introduction
The spreading of malware, i.e., malicious self-replicating codes, has rapidly grown in the last few years, becoming a substantial threat to the wireless devices, and mobile (smart)phones represent nowadays the most appetible present and future target. Papers studying the problem from both theoretical and technical points of view already appeared in literature since 2005 [1]–[9], and nowadays a number of different approaches to modeling the virus diffusion are already available to the community. With the present work we want to contribute to this topic by proposing a more accurate model for the spread of a malware through the Bluetooth channel, providing both a recursive and a combinatorial equivalent deterministic formulation of the described solution.
The Model
The dynamics of the proposed model is the following: at a certain time
, a number
of infected mobiles
come in contact with a number
of clean (non-infected) cellphones
; hereafter we will denote this configuration as
.
All
telephones are in the Bluetooth transmission range of each other and they all have their Bluetooth device on. Each infected mobile tries to establish a connection with another device, clearly not knowing whether it is trying to pair to a clean or to an infected phone. All these connections are established instantaneously at time
. However, for the sake of simplicity we assume that the infected mobiles establish connections following a given sequence, starting from
down to
. In other words,
is the first to try to establish a connection,
is the last one. Moreover, each connection is chosen uniformly at random among all possible available choices. Connections between infected and clean mobiles deterministically result in infection transmission: when a clean mobile gets paired to an infected one, it becomes infected. All these events occur in the time interval
, where
is the minimal time allowing all infected mobiles to establish a connection and eventually transmit the virus: in practice, it may be considered of the order of a few tens of seconds. We assume that in this time interval clean cellphones do not try to establish any connections, e.g., for non-malware purposes. We also assume that in this time interval no other mobile enters the Bluetooth transmission range of the
mobiles and, when a connection between two mobiles is established, the two mobiles remain connected for the whole time interval. Basically, we are assuming that the initial configuration
is given and it does not change in the time interval
. Note that, given the definition of
, new infections do not result in configuration changes in the time interval
.
All the aforementioned assumptions are reasonably realistic, due to the very short time-scale considered.
The task here is to discover the probability that, in this situation, a given clean mobile gets paired to an infected one, and thus it becomes itself infected.
Summarizing, the setup and the constraints of the model are the following:
Setup
infected mobiles
and
clean mobiles
are in a room (i.e., in the Bluetooth transmission range of each other).
Dynamics
Starting from
down to
, each infected mobile tries to connect with a yet unconnected device, regardless of whether it is infected or not.
Constraint #1
Since the connection channel is Bluetooth, once a connection between two mobiles is established, these two devices become unavailable to further connection, or, in other words, each device can have at most one connection to another cellphone.
Constraint #2
For each
, when it is
's turn to choose,
must connect to one of the still available devices, if any.
Let us consider the generic configuration
with
unpaired infected mobiles
and
unpaired clean mobiles
. According to the setup, the first mobile establishing a connection is
. In Fig. 1 a possible evolution is displayed starting from an initial configuration with
infected and
clean mobiles, together with an explanatory description of the occuring dynamics.
Figure 1. An example of model dynamics starting from the initial configuration (7,5).

In red, the pairing that it is established at each step. (a) At time
,
infected mobile phones
and
clean mobiles
are all within their mutual Bluetooth connection range. (b)
chooses a mobile among
; it chooses
establishing connection
. (c) Now it
's turn to choose, and
and
are not available anymore for pairing (marked by a grey circle ○ ). (d)
connects to
through pairing
. (e) The two mobiles
and
become unavailable for pairing, too and the next infected mobile in line
pairs to
via
. (f) Only
and
remain available for pairing with
, which chooses
(connection
). (g) Now the last mobile
must connect to the remaining unpaired clean phones
: it chooses
creating pairing
. (h) There are no more unpaired infected mobiles: the process ends at time
.
Due to the described dynamics, all the infected mobiles succeed in paring, with the exception of at most one
, which can remain unpaired if there are no more available mobiles. This case can only happen when there are more infected mobiles than clean ones, their sum is odd and all the clean mobiles get paired:
![]() |
(†) |
where
is the number of pairings between two infected mobiles. Henceforth, the last choosing infected mobile
cannot find any available device to pair to. In what follows, we will refer to this case as the case
; an example of this situation in the initial configuration
is shown in Fig. 2.
Figure 2. An example of the
situation.

Starting from the initial configuration
,
infects the clean mobile
,
pairs to
,
infects
and, finally,
pairs to
. Here the process ends, because there are no more mobiles available for pairing to
which remains unconnected.
The model is completely described by computing the probability
that a certain clean mobile, for instance
, gets infected in the time interval
.
Although
could be stochastically approximated by running repeated simulations, in the following Sections we will derive two equivalent exact (deterministic) formulæ for
in the aforementioned setup. The former is a simple recursive expression, which follows straightforwardly from the model dynamics, while the latter is its corresponding closed form (thus with no recursion involved), which has a more complex expression and it heavily relies on combinatorics. Other than their alternative mathematical nature, the two formulæ show different behaviours also from a computational point of view, as discussed in a dedicated Section.
The Recursive Formula
Recursively, the probability
of a given susceptible mobile
to get infected starting from a given initial configuration
can be written by the following expression:
![]() |
(1) |
where the trivial conditions
,
and
initialize the recursion, thus covering all possible cases.
Since all clean mobiles share the same probability
of getting infected, without loss of generality we may assume
. The three terms
,
, and
contributing to the general case of
come from the three mutually exclusive cases which can occur starting from the initial configuration
:
establishes a pairing with
. In this case
gets infected and this event occurs with probability
.
establishes a pairing with one of the other
clean mobiles
. This event occurs with probability
and of course
does not get infected by
. However,
may be infected later by the remaining
available infected phones (with only
clean mobiles still available, because one clean mobile has been infected by
), thus falling back to a
configuration.
establishes a pairing with one of the other
unpaired infected mobiles
. This event occurs with probability
and of course
does not get infected by
. However, similarly to the previous situation,
may be infected later by the remaining
unpaired infected phones, thus falling back to a
configuration.
A worked out example illustrating the construction of Eq. 1 is shown in Fig. 3. The formula in Eq. 1 for
relies on a recursive equation of second order with non constant coefficients, for which no general method is known to derive the corresponding non-recursive (closed) expression. Moreover, as detailed in a later Section, calculating
by using Eq. 1 is computationally heavy. However, we will obtain the equivalent time-saving closed form solution in the next Section using combinatorial arguments.
Figure 3. Construction of the general case of the recursive formula Eq. 1.

Starting from the initial configuration
, we want to compute the probability
that a clean mobile (
without loss of generality) gets infected in the proposed model. At time
, the first infected mobile
tries to establish a pairing, and only one of the three following alternatives can occur. In green, the case when
immediately infects
(with probability
) and we are done. In blue, the case when
pairs to one of the remaining another
infected mobiles
with probability
; then
and
becomes unavailable for pairing with the following choosing mobile
, and we are moved into the case of computing the probability that
gets infected when there are
unlinked infected mobiles and
clean ones, i.e.,
. Finally, in orange, the case when
pairs to one of the other
clean mobiles
(with
) with probability
; then
and
becomes unavailable for pairing with the following choosing mobile
, and we are moved into the case of computing the probability that
gets infected when there are
unlinked infected mobiles and
unlinked clean ones, i.e.,
. The general case
is obtained by summing the contributions of all three alternative cases described above.
The Combinatorial Formula
To construct the explicit formula equivalent to Eq. 1, we need to employ a few combinatorial considerations. The key observation is that we can count all wirings (lists of pairings) that can occur at the end of the pairing process. Clearly, the fact that there is an order in setting up the connections between the mobiles heavily influences the probability that a given wiring can occur: in particular, this probability depends on the number
of pairings between infected mobiles (bb-pairings, for short). As background material, we recall some definitions and results from combinatorics in the box in Fig. 4, together with the two following functions:
Figure 4. Basic definitions, examples and facts on dispositions, combinations and permutations.

the Heaviside step function
the Kronecker delta function
As an example, the following indicator function can be written in the two equivalent formulations:
![]() |
where
is the Euclidean remainder function, so
is zero for even
and one for odd
.
Suppose now we are starting from an initial configuration
; then define the following quantities:
: the minimum number of bb-pairings in a wiring;
: the probability that a wiring with exactly
bb-pairings occurs;
: the number of all possible ways to select
bb-pairings;
: the number of all possible wirings with a given list of
bb-pairings when a (generic) clean mobile gets paired;
: the number of all possible wirings with a given list of
bb-pairings and where the clean mobile
is paired;
: the number of all possible wirings with
bb-pairings when a (generic) clean mobile gets paired;
: the number of all possible wirings with
bb-pairings where the clean mobile
is paired;
: in the
case, with
, the number of possible wirings with
unpaired, for
.
In the above notations, the (non recursive) closed form expression equivalent to Eq. 1 for the probability
of a given susceptible mobile
to get infected in a given initial configuration
can be written as follows:
![]() |
(2) |
Eq. 2 has its roots on the following counting argument: the probability that a given clean mobile
gets infected is the sum over all admissible values of
of all possible wirings with
bb-pairings weighted by the probability that a wiring with exactly
bb-pairings occurs:
![]() |
(3) |
where
is the minimum number of
-pairings that can be established in an initial configuration
.
The rationale of summing over the number of
-pairings to compute
relies on the observation that the probability of
of getting infected depends on the number of available infected mobiles that will pair with clean mobiles, that is exactly the number of infected mobiles which are not already paired to another infected mobile, i.e., that are not involved in a bb-pairing.
In particular, the three terms between brackets in Eq. 2 match respectively the three factors in Eq. 3, while the term between double brackets (
to enhance readability) corresponds to
.
In what follows we will show that the expansion of the right-hand member of Eq. 3 coincides with Eq 2. The expansions of all terms will be carried out first by separately considering all occurring cases, and then providing an unique closed form formula (without conditional expressions) by using the Heaviside step and the Kronecker delta functions.
Lemma 1
Given an initial configuration
, the minimum number
of bb-pairings in a wiring is the following:
![]() |
while the maximum number is
.
In fact, while when
it is possible not to have any bb-pairing, when
they cannot be less than
or
respectively when
is even or odd. This is due to the constraint #1 imposing that an infected mobile
must connect to another device whenever available, when it is its turn to choose.
Lemma 2
Given a
configuration, the probability
that a wiring with exactly
bb-pairings between two infected mobiles occurs is the following:
![]() |
In fact, when there are
bb-pairings in the admissible range, all possible wirings depend on the choice of
infected devices
and
clean devices
, i.e.
elements from the original sets of
. The first element has probability
to be chosen, the second
, the third
and so on.
Lemma 3
Given an initial configuration
in the
case with
, then the number
of possible wirings with
unpaired, for
, is:
![]() |
The idea is that all the
infected mobiles
must be part of a bb-pairing, so they must be connected to one of the
. Once they have been chosen, the remaining
bb-pairings must be selected among the mobiles
that are yet unpaired. Both considerations can be exploited in terms of combinations using the definitions and the properties of Fig. 4.
Lemma 4
In the
configuration, the number of all possible ways to select
bb-pairings is:
![]() |
Apart from the
case, selecting
bb-pairings is equivalent to consecutively choosing
unordered pairs
from the original set of
infected mobiles. The first pair can be chosen in
ways, the second pair in
and so on. The division by
is motivated by the fact that the particular ordering in which the
pairs are chosen is irrelevant: the list
is undistinguishable from the list
. The number of these different ordering is precisely
by definition of permutations. In the
case, if
there is only one way to choose
bb-pairings, while if
the unpaired infected mobile can only be
, so from
we have to subtract the case where the only bb-pairing involves
, which is impossible. Finally, in the
case with
the unpaired infected mobile can be any
with
, and the total number of cases (which coincides with the number of cases where
is selected, since all the clean mobiles are connected in these situations) is the sum of all cases with
.
Lemma 5
In the
configuration, with
bb-pairings, the number of all possible cases when a particular
is chosen is:
![]() |
The result follows immediately from the cardinality equations in Fig. 4, in particular from the fact that among all combinations of
objects in groups of
elements, a particular element is selected exactly
times. When
is even and
we follow the convention
for
. In case
, since all the non infected mobiles are selected, the possible ways to select them are exactly their permutations.
This completes the expansion of Eq. 3 into Eq. 2.
Equivalence between the recursive and the closed formula can be proven by showing that Eq. 2 satisfies the recursive relations of Eq. 1. The analytical proof of the equivalence involves working out a large number of cumbersome identities of binomial coefficients and factorials: in the last Section, we will briefly outline a sketch of the proof in the simple case
. Numerically, the differences between the two formulæ are below machine precision for
.
We conclude the Section with the observation that the sum of the total number of cases weighted by their corresponding probabilities adds up correctly to one:
![]() |
because of the following counting lemma.
Lemma 6
In the
configuration with
bb-pairings, the number
of all possible ways to select the remaining clean mobiles for pairing is:
Apart from the
case, when there are
bb-pairings,
infected mobiles remain to be connected with
clean devices. This is equivalent to compute the number of possible sets of
elements from an initial set of
clean mobiles: since here the ordering matters, this is the definition of dispositions (see Fig. 4) of
elements from an original set of
.
Note that, since in the case
all the clean mobiles are selected, the two quantities
and
coincide.
Analytical and Computational Notes
Although defined only for positive integer values of
and
, it is possible to provide a graphical sketch of the shape of the function
by linear interpolation on the non integer real values. In Fig. 5 we show both the tridimensional surface of
and its corresponding contourplot for values of
and
ranging between 1 and 100. Asymptotically, the function
converges to the following limits:
![]() |
(4) |
Figure 5. Tridimensional surface (a) and corresponding levelplot (b) of
for
, linearly interpolated on the real non integer values.

Graphical examples of the behaviour stated in Eq. 4 are provided in Fig. 6, where a few curves of
are plotted when one of the two parameters is kept constant (and equal to 10, 50, 100) and the other ranges between 0 and 100, together with the curve corresponding to
for
. When one of the two parameter is equal to a constant
, the smaller is
, the faster
converges to the limits in Eq. 4.
Figure 6. Plot of curves of
for different configurations
.

In blue, we show three curves of
for constant
(
solid line,
dashed line and
dotted line) and
ranging from 0 to 100. All three curves approach the asymptotic value 0 for increasing
, more rapidly for smaller values of
. In black, we show the symmetric cases obtained keeping
constant (
solid line,
dashed line and
dotted line) and letting
range from 0 to 100. Again, all three curves approach the asymptotic value 1 for increasing
, more rapidly for smaller values of
. The sawtooth shape of the curve
for
is due to the effect of the
case, which induces abrupt differences in
for consecutive values of
(changing from even to odd). Finally, the dotted-dashed red line shows the curve of
for
ranging between 0 and 100: in this case, the curve gets very close to its asymptotic value 0.5 even with small values of
; for instance,
and
.
Apart from its intrinsic theoretical relevance, the non recursive closed formula is essential for numerically compute
. In fact, the computational cost is notably different by using either the recursive formula Eq. 3 or its closed form counterpart Eq. 2: namely, the explicit formula is much faster, as shown by the values reported in Table 1 and the curves plotted in Fig. 7. For the recursive formula the computing time shows an exponentially growing trends for increasing values of
and
, while for the non recursive formula the computing time is very small and minimally growing for
and
ranging between 0 and 100. Actually, the average time over 10 values using a Python implementation of the non recursive formula on a 24 core Intel Xeon E5649 CPU 2.53GHz Linux workstation with 47 GB RAM is 11 milliseconds for
and 60 milliseconds for
, with very limited standard deviation. On the same hardware, a Python implementation of the recursive formula took about 12 milliseconds for
, 2.4 seconds for
, 6 minutes for
and more than 9 hours for
, which was the largest tested value.
Table 1. Computing times (in seconds) required to compute
by the recursive formula in Eq. 1 and the equivalent closed formula in Eq. 2, for different values of the number of infected (I) and susceptible (S).
| I = S | Recursive | Closed Form | ||||
| Min | Mean | Max | Min | Mean | Max | |
| 5 | 0.012 | 0.012 | 0.013 | 0.011 | 0.011 | 0.012 |
| 10 | 0.012 | 0.013 | 0.013 | 0.011 | 0.012 | 0.012 |
| 15 | 0.013 | 0.013 | 0.014 | 0.011 | 0.011 | 0.012 |
| 20 | 0.031 | 0.031 | 0.032 | 0.011 | 0.011 | 0.012 |
| 25 | 0.223 | 0.229 | 0.235 | 0.011 | 0.011 | 0.012 |
| 30 | 2.365 | 2.449 | 2.491 | 0.012 | 0.012 | 0.012 |
| 35 | 26.203 | 26.757 | 27.419 | 0.012 | 0.013 | 0.013 |
| 40 | 361.621 | 362.351 | 362.894 | 0.014 | 0.014 | 0.014 |
| 45 | 3225.718 | 3287.492 | 3333.242 | 0.015 | 0.015 | 0.015 |
| 50 | 34336.694 | 34433.664 | 34555.204 | 0.016 | 0.015 | 0.016 |
| 55 | 0.018 | 0.018 | 0.019 | |||
| 60 | 0.020 | 0.021 | 0.021 | |||
| 65 | 0.023 | 0.023 | 0.023 | |||
| 70 | 0.026 | 0.027 | 0.027 | |||
| 75 | 0.030 | 0.030 | 0.030 | |||
| 80 | 0.035 | 0.035 | 0.035 | |||
| 85 | 0.039 | 0.040 | 0.040 | |||
| 90 | 0.046 | 0.046 | 0.046 | |||
| 95 | 0.052 | 0.052 | 0.052 | |||
| 100 | 0.060 | 0.060 | 0.061 | |||
In particular,
, and only the closed formula was used for
(due to the excessively long runtimes: e.g., computing
by the recursive formula took more than 9 hours). Mean, maximum (Max) and minimum (Min) values for 10 replicates of each experiment are reported. All simulations were run on a 24 core Intel Xeon E5649 CPU 2.53GHz workstation with 47 GB RAM, Linux 2.6.32 (Red Hat 4.4.6), with software written in Python 2.6.6.
Figure 7. Plot of the computing times (in
scale) needed to compute
for different values of
as listed in
Table 1
.

Error bars range between minimum and maximum, while lines connect mean values; all values refer to 10 replicates. Solid line represents computing times obtained by using the recursive formula Eq. 1, while dotted line corresponds to the values produced by using the closed formula Eq. 2.
Proof of Equivalence in the Case
In this Section we show the kind of arguments involved in proving the equivalence between Eq. 1 and Eq. 2 by outlining the main steps of the proof in a simple case, i.e., when there as many infected as clean mobiles, and their numnber is even. Clearly, the general case is computationally far more complex, but it used the same ideas.
Proving the equivalence between the recursive and the combinatorial formula requires substituting the explicit expression for
of Eq. 2 in its three occurrences in Eq. 1. We are assuming
, thus in this case the identity we need to prove reads as follows:
![]() |
or, equivalently:
| (5) |
The expression for
becomes:
![]() |
where the upper bound is
since the right-hand member vanishes for
and the product symbols were eliminated by using the factorial and double factorial notations:
![]() |
.
Analogously, the expansions for
and
become respectively:
![]() |
Then the left-hand member of Eq. 5 reads as follows:
![]() |
which, collecting common factors, reduces to:
![]() |
Now, expanding the double factorial by the identity:
and carrying the terms not involving
outside the summation symbol, the above quantity becomes:
| (6) |
Now, applying the following identity
![]() |
to Eq. 6 with
, we obtain that
as claimed.
Acknowledgments
The authors thank two anonymous referees for their precious suggestions and notes, which helped in greatly improving the paper.
Funding Statement
The authors acknowledge funding by the EU FP7 Project EPIWORK. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript
References
- 1. Wang P, González M, Hidalgo C, Barabási AL (2009) Understanding the spreading patterns of mobile phones viruses. Science 324: 1071–1076. [DOI] [PubMed] [Google Scholar]
- 2.Nguyen HN, Shinoda Y (2009) A novel analytical framework to model malware diffusion in heterogeneous wireless networks. In: Proceedings of IEEE WoWMoM 09. IEEE, 1–10. [Google Scholar]
- 3. Zanero S (2009) Wireless Malware Propagation: A Reality Check. IEEE Security & Privacy 7: 70–74. [Google Scholar]
- 4. Guanhua Y, Eidenbenz S (2009) Modeling Propagation Dynamics of Bluetooth Worms (Extended Version). IEEE Transactions on Mobile Computing 8: 353–368. [Google Scholar]
- 5.Shuai F, Chang-Guang W, Li-Jing B, Qing-Yang H, Jian-Feng M (2009) Modeling and Analysis Malware Spread in Short-Range Wireless Networks. In: Proceedings of WiCom 09. IEEE, 1–4. [Google Scholar]
- 6.Khouzani M (2011) Optimal control of mobile malware epidemics. Ph.D. thesis, University of Pennsylvania. [Google Scholar]
- 7.Ghallali M, El Ouadghiri D, Essaaidi M, Boulmalf M (2011) Mobile phones security: the spread of malware via MMS and Bluetooth, prevention methods. In: Proceedings of MoMM 11. ACM, 256–259. [Google Scholar]
- 8.La Polla M, Martinelli F, Sgandurra D (2012) A Survey on Security for Mobile Devices. IEEE Communications Surveys & Tutorials PP: 1–26. [Google Scholar]
- 9. Ghallali M, El Ouahidi B (2012) Designing a new framework in order to limit the spread of malware in mobile phones. Int J EnCoTe 1: 1–8. [Google Scholar]



















