Abstract
The demand on simulation software in astrophysics has increased dramatically over the last decades, driven by improvements in both observational data and computer hardware. At the same time, computers have become harder to program owing to the introduction of ever more parallelism and hybrid hardware. To keep up with these developments, much of the existing software has to be redesigned. To avoid having to rewrite yet again when the next developments present themselves, the main effort should go into making the software maintainable, flexible and scalable. In this paper, we explain our strategy for coupling elementary solvers and for combining them into a high-performance multi-scale environment in which complex simulations can be performed. The elementary parts remain succinct, while richer functionality is obtained by coupling them at a higher level. The advanced code-coupling strategies we present here allow such a hierarchy and support the development of complex codes. A library of simple elementary solvers subsequently stimulates the rapid development of more complex codes that can co-evolve with the latest advances in computer hardware. We demonstrate how to combine several of these elementary solvers in a hierarchical and generic system, and how the resulting complex codes can be applied to multi-scale problems in astrophysics. Our aim is to achieve the best of several worlds with respect to performance, flexibility and maintainability while reducing development time. We have succeeded in developing the hierarchical coupling strategy and the general framework, but a comprehensive library of minimal fundamental-physics solvers is still unavailable.
This article is part of the theme issue ‘Multiscale modelling, simulation and computing: from the desktop to the exascale’.
Keywords: high-performance computing, Astrophysical Multi-purpose Software Environment, graphics processing unit, multi-physics, multi-scale
1. Introduction
The diversity of computational resources available to astrophysical researchers has increased enormously over the last decade. A little over 10 years ago the graphics processing unit (GPU), in the form of a multi-purpose compute accelerator, appeared on the scientific computing stage [1,2]. Today, we find the GPU in a large portion of the world's most powerful supercomputers, as well as in the workstations of researchers active in a wide range of scientific fields.
About a decade ago, cloud computing appeared on the stage when Amazon introduced its elastic cloud computing product [3]. Today, compute power can be rented from many public cloud providers, including Amazon, Google and Microsoft. These resources give access to the latest generations of GPU devices, FPGAs and multi-core workstations. Cloud resources have made a wide variety of hardware accessible to the general research community, because it can be rented instead of procured. This diversity in hardware is expected to grow enormously in the coming years, particularly since the stalling of Moore's law [4] forces hardware developers to explore new ways to preserve the accustomed speed-up of their designs. Easy access to the latest hardware dramatically increases the demand for advanced software; software development, as a result, becomes more complex and demands greater flexibility [5].
The problem with porting existing code to new hardware lies in its intrinsic complexity and in the lack of knowledge of how these codes work. Such dinosource software [6] often resulted from several generations of families of researchers adding to an existing simulation environment, making it less transparent with each incorporated change and every added novelty. Preparing astronomical simulation packages for the next generation requires drastic design changes. The lack of resources within the scientific community to develop, optimize and maintain software is in sharp contrast to the relative ease with which financial support is acquired for the purchase or development of new hardware.
This discrepancy can only be lifted by making software easier to develop, optimize and maintain. A plea for more modular software [6] emphasizes that such modularity should start with a complete rewrite and update of the astronomical simulation libraries. These libraries should be able to understand the input and output of specialized solvers and methods, and they should be accompanied by a collection of code-coupling strategies. By keeping fundamental parts small, they remain easy to maintain, optimize and replace when new implementations become available. The complexity will be captured in the communication paradigms, the coupling strategies and the optimization of such compounded solvers.
Here we discuss one such coupling strategy, in which elementary solvers can be combined into a more complex compounded solver. With these strategies, we are able to take optimal advantage of modern high-performance computer architectures.
This paper is organized as follows. In §2, we elaborate on the difficulties we face in relation to future hardware and software developments. In §3, we present our approach to some of these problems. We follow that discussion with §4, in which we discuss some of the multi-scale challenges we face, and we provide a coupling-strategy solution in §5. Finally, we summarize in §6.
2. High-performance computing developments
Very few astrophysical researchers have the proper coding skills to use advanced techniques like massive parallelism, multi-threading and advanced vector extensions.1 But there is no way around it; in order to take advantage of the latest hardware, and thereby perform larger and more detailed simulations, software development work is required. Just recompiling old code with a new compiler on modern high-performance hardware is not enough. Modifying dinosource to make it compatible with the latest advantages offered by modern hardware provides short-term relief to a more complex problem. In the end, these modifications result in unreadable and unmaintainable code.2 On top of that, such code hardly ever uses new architectures optimally.
Modern compilers offer enormous advantages regarding automated parallelization, loop unrolling and auto-generated multi-threading, but the result is often inefficient and requires considerable tuning by the developers. This leaves us with code that is unable to use multiple cores and that achieves, at best, marginal speed-ups in a multi-core environment. The only practical solution for non-parallel code in a multi-core environment is to run multiple instances of the same code.3 This breaks down, however, when the parallel environment uses limited-capability cores, such as GPU accelerators. But simply retrofitting an existing code with some parallelization calls is not an ideal option either, because naively parallelized single-core code results in atrocious parallel code.
In these cases, the algorithm and source code have to be re-examined and re-implemented, taking the (parallel) features of current-day technology into account from the start. An example is the Bonsai gravitational tree-code [7], which we redesigned from scratch to work efficiently in parallel on many GPU-equipped nodes [8]. However, when a code is extensive in terms of the number of code lines, and complex in terms of the number of operations, rewriting is hardly a viable option for the domain-specific scientist.
Traditionally, scale-width, or extended support for a broader physical domain, has been realized by encapsulating and augmenting snippets of source code dedicated to the specific hardware. These additions come at the cost of dramatically increased complexity, a higher risk of hard-to-detect bugs and degraded performance. Preserving dinosource for running on modern architectures requires surgical work in the codes' intestines. This is the software equivalent of open-heart surgery; it requires expert knowledge and extreme care.
The main risks in this approach hide in the complexity of the codes. Simpler codes, in terms of addressing fewer physical processes simultaneously and executing fewer tasks, would increase the development rate and reduce the number of bugs. In addition, they would enable future generations of young and less experienced researchers to contribute to these codes in a constructive way. This simplification process can be mediated by separating fundamental tasks into dedicated solvers and by combining these solvers via a code-coupling protocol. Such a structure makes it easier for researchers with a slightly different expertise to benefit from the latest technologies.
By meticulously executing the above simplifications, existing code can be modularized into its fundamental building blocks. This may lead to the Duplo analogy advocated in [6]. The fundamental idea is to subdivide existing numerical solvers into blocks that are designed to solve one single type of physics on a limited scale. These fundamental pieces can later be combined, using some common interface, into a more elaborate structure to solve a wider range of scales or a broader palette of physics. This approach seems to be the most viable solution to the current crisis in simulation software. While separating out fundamental parts, optimizable snippets can be replaced by GPU-enabled routines. Existing software can in this way be optimized for newer hardware, and future implementations can benefit from these building blocks as submodules.
As is usual with complicated problems, the solution sounds nice but unrealistic. Three developments are needed to realize the sketched simulation environment: a wide variety of simple but optimized fundamental solvers operating on a range of scales; a communication protocol and framework to make these elementary solvers operate in unison; and a global environment to bind them all.
3. One code to bind them all
Fundamental to our approach is the separation of codes into their essential operations. Too many codes have grown into complex frameworks following the authors' desire to expand the functionality of the package. In this way, relatively simple gravity solvers have been polluted with stellar physics, background potentials and tidal energy dissipation. At the other side of the spectrum, we find cosmological large-scale structure-formation solvers that incorporate coarse-grained hydrodynamics, radiative processes and a wide variety of sub-grid physics. These codes, consisting of hundreds of thousands of lines of source code, are based on a wide variety of fundamental solvers without explicitly recognizing them as such. The coupling strategies for these solvers are rigidly hard-coded in the source; the initial design choices are petrified, which limits the operation and application range. Such hard-coded coupling not only limits the science, but also limits validation and verification.
On the bright side, there are many examples of relatively simple codes that are designed for one specific task. Even those codes, however, have the tendency to grow and aggregate functionality. It turns out to be very hard to prevent production-quality source code from growing. One solution, as we discuss here, is the strict separation of fundamental solvers into patterns, and the introduction of a flexible coupling strategy to combine the various solvers in a homogeneous and self-consistent framework. We begin by describing the overall structure of the framework in §4, and subsequently discuss one or two of the coupling strategies we have been developing in §5.
Over the last few years, we have been building a simulation environment that supports the aggregate development of simulation codes. We call it the Astrophysical Multi-purpose Software Environment, or AMUSE for short [9–11]. According to our Noah's Ark philosophy for AMUSE, each solver should be fundamental, and there should be at least two codes that solve for the same physics. These codes may operate on different scales. Instead of designing new algorithms to solve new problems, we adopt the philosophy that each of the physical domains can be solved separately and then combined into a consistent solution at discrete intervals in time and space. Validation of such a simulation can subsequently be accomplished by recomputing the model with a different solver for each of the physical domains. In this way, we can study the convergence (or the absence thereof) of the solution with a sequence of numerical implementations.
With AMUSE, a user can combine existing solvers to build new applications that may be combined again to study increasingly complex problems. This enables the growth of multi-physics and multi-scale software in a hierarchical fashion, and allows testing of each intermediate step while the complexity of the software increases. The complexity remains part of the application script, and it will not penetrate into the framework software. In this way, complicated scripts and dedicated codes remain strictly separate.
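As an illustration of this separation between application script and solver, here is a minimal sketch of an AMUSE script, assuming the Python interfaces (new_plummer_model, Hermite, nbody_system, units from amuse.lab) as distributed at amusecode.org; details may differ between versions.

```python
# Minimal AMUSE-style application script: one fundamental gravity solver
# driven from Python. The complexity stays in the script; the solver
# remains a small, replaceable module.
from amuse.lab import Hermite, new_plummer_model, nbody_system, units

# Converter between dimensionless N-body units and physical units.
converter = nbody_system.nbody_to_si(100. | units.MSun, 1. | units.parsec)

stars = new_plummer_model(100, convert_nbody=converter)  # initial conditions
gravity = Hermite(converter)               # fourth-order Hermite solver [30]
gravity.particles.add_particles(stars)

gravity.evolve_model(1. | units.Myr)       # advance the cluster by 1 Myr
print(gravity.particles.center_of_mass())
gravity.stop()                             # shut down the worker process
```

Because each solver runs as a separate worker process behind the same interface, swapping Hermite for another gravity module leaves the rest of the script unchanged.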
These dedicated solvers include the stand-alone packages to solve for the specialized physics on a particular scale. In the AMUSE hierarchical component environment, we provide executables for simulating stellar evolution, gravitational dynamics, (magneto)hydrodynamics, radiative processes, and also have limited support for astrochemistry. A researcher is able to use some or all of these resources simultaneously in an application and data can be exchanged in a deterministic way. With this hybrid simulation environment, it becomes possible to study multi-physics processes operating on a broad range of length and time scales. Ideas from this environment now percolate to a wider community, including oceanography (OMUSE, [12]), water management and climate science (HyMUSE, see [13]). Since this is a topical collection of papers on multi-scale computing, we will discuss some of the multi-scale aspects of AMUSE, rather than dwell on the multi-physical aspects of the framework.
Essential to this multi-scale approach is a method called Bridge [14]. Much of the flexibility of AMUSE stems from this underlying coupling strategy. In the next section, we demonstrate how Bridge is used for multi-scale simulations of dense stellar systems, a problem that has kept astronomers busy since the early 1960s [15–17]. We will show one multi-scale strategy which we call Nemesis. But we start by explaining the fundamentals of Bridge.
4. Multi-scale simulation patterns
Aggregating a selection of relatively simple simulation codes into a consistent unity requires a separate strategy for each of the underlying couplings. For the coupling of codes that are based on the same underlying physics, as we discuss here, we rely on Bridge. Originally, the Bridge scheme was designed for connecting two different scales in gravity. Here we demonstrate how Bridge can be expanded to build a hierarchy of scales from the smallest micro-scale to the largest macro-scale.
(a). The hierarchical multi-scale coupling strategy
To analyse the problem, we work from the scale separation map, introduced in [18]. In figure 1, we show the scale separation map (SSM) for simulating planetary systems and stellar multiplicity in their clustered environment in the Galaxy. Given such a map, the methods and examples discussed in this paper fall under the patterns developed for the heterogeneous multi-scale method (HMM) [19,20]. This method describes how a complex model can be solved with a code that evolves the macroscopic scale using information from codes that resolve the microscopic scale.
Figure 1. Scale separation map for interactions between bodies in the Galaxy. In planetary systems, interactions take place on timescales of days to 1000 years and on scales from 0.3 to about 100 au. For the Solar System, these scales correspond to the orbit of the planet Mercury on the small end and the Oort cloud on the large end. The stars in a cluster interact on timescales ranging from 10^3 to 10^6 years, and typical distance scales range from 1000 au to about 1 parsec. Galactic scales extend to billions of years and some 30 kiloparsec. In red, we added the binary and multiple interactions that operate between planetary systems in star clusters. The arrows indicate how the various systems interact by exchanging particles from one to the other. In figure 2, we present a more visual view of this scale separation map. (Online version in colour.)
In figure 2, we illustrate this range in scales by presenting images of each of these domains, with the approximate scales, scaled to the size of the Earth. The associated timescales have a similar range in terms of their dynamical time-scale. This problem poses one of the greatest numerical multi-scale challenges in astrophysical simulations.
Figure 2. Scales in the local universe, presenting a picture of the Earth (a), the Solar System (c), a globular cluster (b) and a galaxy (d). A more abstract version of the scale-separation map is presented in figure 1. (Online version in colour.)
The macroscopic scale, for example, could be the parental star cluster in which the Solar System was born [21]. The actual planetary systems and stellar binaries are not resolved on that scale but are represented by point masses. In our scale-separation map, the large squares indicate the dimensional range, in size and time, of the various systems in our problem. An important distinction between our application in astrophysics and the classic HMM description is the dynamics of the scale separation. This is illustrated with the arrows, which indicate that elements from one system can become part of the other. For example, if a planet escapes a planetary system, it may become part of the parental star cluster, and eventually become part of the Galaxy at large. The opposite may also happen. A recent example is the intrusion of the interstellar object 'Oumuamua, which had a very close but unbound encounter with the Solar System [22]. This object jumped from the Galactic scale in the SSM to the planetary scale, as indicated by the blue arrow in figure 1. In our framework, we want to account for this scale-traversing behaviour. Another aspect that was absent in the original scale-separation map is the possibility that scales overlap. This is illustrated in figure 1 with the red square, where binaries are part of the planetary scale range as well as of the cluster scale range, making the problem considerably more involved.
The particular problem of simulating the Solar System was addressed in [23] using a second-order Verlet integrator [24], splitting the underlying (Newtonian) Hamiltonian in a manner similar to that by which symplectic integrators used in planetary dynamics are derived. This relatively simple operator-splitting strategy was later expanded to include a variety of solvers [25,26], but the fundamental generalization was introduced in [14], which referred to bridging solvers that operate on different scales. The Bridge scheme was expanded to higher (even) orders in [27] and employed, among others, in [28,29]. We keep 'bridge' as a generic term for this multi-scale coupling strategy.
In Bridge, the coupling of the microscopic system with the macroscopic system is done on discrete time intervals, which we call the Bridge time step. This introduces an intrinsic time scale to the problem, which in principle is independent of the microscopic time scale or the macroscopic time scale. This particular scale, however, has to be either determined before the calculation or derived at run-time from some qualitative expression. The choice of a proper time scale is fundamental to the accuracy, the precision and the efficiency of Bridge.
In general, the microscopic system employs the shortest time scales, whereas the macroscopic system has the longer timescales (figure 2). The resulting Bridge time step is then strategically chosen geometrically between these two timescales. Owing to the overhead in the coupling strategy, Bridge is not optimal for coupling systems that operate on very similar timescales. In the examples provided here, the microscopic system is composed of relatively few objects that require high precision, whereas the macroscopic system has many more objects but is less stringent on the precision. The macroscopic system can subsequently be addressed with a relatively low-order integrator that employs a lower precision to speed up the calculation. Such low-precision, low-order integration strategies are often more efficient in terms of computer time spent per particle, and are therefore excellently suited to addressing the macroscopic system. The microscopic system often requires high precision and accuracy, in part because it has to be evolved over a longer time scale with respect to the characteristic time scale of the system. The desire to use relatively few particles in the microscopic system stems directly from the requirements of high precision and relatively long time scales, both of which lead to high expense in terms of computer time per particle.
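One hedged reading of 'chosen geometrically' is the geometric mean of the two intrinsic timescales,

$$\tau_{\rm bridge} \sim \sqrt{\tau_{\rm micro}\,\tau_{\rm macro}},$$

which balances the coupling error against the cost of frequent synchronization; in practice, the step is tuned per problem, as the error-balancing argument below illustrates.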
In our earlier example of a planetary system in a star cluster, we could envision using a high-order symplectic integrator with shared time steps for the eight planets in the Solar system, and a regular predictor-corrector scheme with a block time-step scheme for addressing the hosting star cluster dynamics. With such a combination of techniques, one would need to parallelize the 1000 planetary systems over a similar number of cores in order to keep up with the integration of the equations of motion of the individual stars in the cluster. The integration error will, in such a case, be dominated either by the lowest order method or by the largest number of intrinsic time steps. The Bridge time step can then be used to balance these two errors and optimize for execution time.
Because Bridge is pivotal to our coupling strategy, we will briefly describe the underlying methodology.
(b). The Bridge integrator
The Bridge integrator can be formulated from a Hamiltonian splitting argument, similar to the derivation of symplectic integrators used in planetary dynamics. The Hamiltonian of a gravitational N-body system with sub-systems A and B is given by the expression
$$H = \sum_{i \in A \cup B} \frac{\mathbf{p}_i^2}{2 m_i} \;-\; \sum_{\substack{i < j \\ i,j \in A \cup B}} \frac{G m_i m_j}{|\mathbf{r}_i - \mathbf{r}_j|}. \tag{4.1}$$
Systems A and B may represent a star cluster and its parent galaxy, respectively. Following [14], the Hamiltonian shown in equation (4.1) can be separated in the following way:
$$H = H_{A+B} + H_{\rm int}, \tag{4.2}$$
where
$$H_{A+B} = H_A + H_B, \qquad H_{\rm int} = -\sum_{i \in A} \sum_{j \in B} \frac{G m_i m_j}{|\mathbf{r}_i - \mathbf{r}_j|}. \tag{4.3}$$
The time evolution of the whole system, for a second-order approximation, can be written as follows
$$e^{\tau H} \simeq e^{(\tau/2) H_{\rm int}}\, e^{\tau H_{A+B}}\, e^{(\tau/2) H_{\rm int}}. \tag{4.4}$$
The operator $e^{\tau H_{\rm int}}$ represents pure momentum kicks, since $H_{\rm int}$ depends only on the positions. During this process, the velocities of the stars in the cluster are updated due to the external force generated by the galaxy, and the velocities of the stars in the galaxy are likewise updated due to the force exerted by the cluster.
Since $H_A$ and $H_B$ are completely independent, the evolution operator $e^{\tau H_{A+B}} = e^{\tau H_A}\, e^{\tau H_B}$ consists of the separate evolution of the two subsystems. For the cluster-in-a-galaxy example, a direct code is used to accurately evolve the stellar cluster while, in parallel, a tree-code is used to follow the evolution of the galaxy. A full time-step in the Bridge integrator then consists of
(i) mutually kicking the sub-systems A and B for τ/2,
(ii) evolving the two sub-systems A and B in isolation for τ using suitable codes, together with an update of their positions, and
(iii) mutually kicking the sub-systems A and B for another τ/2.
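The three sub-steps above translate almost directly into code. The following is a minimal sketch, assuming each code exposes evolve_model(), a model_time attribute, particle positions and velocities, and a get_gravity_at_point() query for its gravitational field, in the spirit of the AMUSE interfaces; the helper names are ours, not the production implementation.

```python
# One second-order Bridge step (kick-drift-kick) for two coupled codes.
# Sketch only: interface names follow AMUSE conventions, but this is not
# the implementation found in amuse.couple.bridge.
def bridge_step(code_a, code_b, tau):
    kick(code_a, code_b, tau / 2)   # (i) half-kick A in B's field ...
    kick(code_b, code_a, tau / 2)   #     ... and B in A's field
    t_end = code_a.model_time + tau
    code_a.evolve_model(t_end)      # (ii) evolve each system in isolation
    code_b.evolve_model(t_end)      #      with its own internal integrator
    kick(code_a, code_b, tau / 2)   # (iii) second mutual half-kick
    kick(code_b, code_a, tau / 2)

def kick(receiver, source, dt):
    # The momentum kicks generated by H_int: accelerate the receiver's
    # particles by the gravity of the source system.
    p = receiver.particles
    ax, ay, az = source.get_gravity_at_point(0, p.x, p.y, p.z)
    p.vx += ax * dt
    p.vy += ay * dt
    p.vz += az * dt
```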
In the classical Bridge scheme, a self-consistent treatment of the system as a whole is achieved using equation (4.4). The Bridge integrator can enable a more efficient calculation of the evolution of a joined system under the following conditions: the first (necessary) requirement is that the time-step allowed by the interaction term Hint is longer than one or both of the internal time-steps of the HA and HB systems (this can happen, but not exclusively so, if the two subsystems are spatially and temporally well separated). Secondly, and this is optional, it may be that the two systems are evolving in a different regime, such that different integrators, geared towards their respective dynamics, can be used. In our example both conditions contribute: the internal dynamical timescale of the cluster is much shorter than the interaction timescale of the cluster-galaxy interactions and the cluster is governed by collisional dynamics while the galaxy experiences collisionless dynamics.
The coupling of codes using Bridge works best when the interacting systems are well separated in spatial and/or temporal scales throughout the simulation. When the various scales start to overlap, a mid-run reorganization may be necessary in order to keep the strict hierarchy on which Bridge is based, efficient, numerically accurate and physically correct. For the application we have in mind, such run-time reorganization is essential, the technical aspects of which are discussed in §5c. An interesting aspect of astronomical simulations is exactly this complicated and non-intuitive changing of dynamical behaviour, making a flexible environment essential. This malleability is reflected in terms of exchanging particles from one system to the other (as is illustrated in figure 1), and in the possibility of, at times, entirely reorganizing the topology. One of these cases is discussed in the next section, where we show how to simulate a cluster with single stars, binaries and planetary systems.
5. A practical application of Bridge
When integrating star clusters, similar to the classic application of Bridge, we fundamentally experience two scales: the microscopic stellar scale and the macroscopic cluster scale. The macroscopic scale tends to be addressed by means of a fourth- or sixth-order Hermite predictor-corrector scheme [30,31]. Two rather different solutions tend to be used for addressing the microscopic scales, dedicated to either the integration of planetary systems or the integration of binary (and multiple) stellar systems.
We designed a number of quite different strategies using Bridge to address both microscopic scales simultaneously. One of them is named Nemesis, which is designed to resolve the interactions we typically find in planetary systems and stellar multiplicity.
(a). The Nemesis strategy
We developed the Nemesis strategy for hierarchical systems in which the separation of scales allows the gravitational forces between the different levels of the hierarchy to be split. The hierarchy is separated into a macroscopic system, which we call the parent, and microscopic systems, which we call children. A parent can have multiple children, but each child has only one parent. The structure is hierarchical in the sense that a child can itself be a parent with multiple children of its own. If a parent itself has no parent, we call it the global parent; otherwise it is a local parent (Japanese is so much easier for such terminology [32]).
In Nemesis we employ Bridge with four extensions:
(i) Topology. Nemesis supports two types of systems: (1) the parent, which provides the frame of reference, and (2) children, each of which is a subsystem of its parent. Each child is represented as a single particle in the parent system.
(ii) Creation/destruction. A child is created for every close encounter between two individual particles in the parent. If a single particle or a child encounters a child, it is absorbed by the larger child. If a child system contains only a single particle, that particle becomes part of the parent and the corresponding integrator is stopped.
(iii) Exchange. Particles can be exchanged between systems. A particle that moves too far from a child will be removed from that system and transferred to the parent; a particle in the parent that approaches a child system will be incorporated into the child and removed from the parent (a sketch of this rule follows after this list).
(iv) Database. A database is maintained to keep track of all the particles in the parent system and those in the child systems. This database also keeps track of the running codes.
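A minimal sketch of the exchange rule (iii), assuming AMUSE-style particle sets with add_particle/remove_particle and an ad hoc threshold radius r_child:

```python
# Promote particles that stray beyond a threshold radius r_child from a
# child system to the parent. The threshold and the centre-of-mass
# criterion are illustrative simplifications of the actual Nemesis rules.
def promote_escapers(parent_code, child_code, r_child):
    com = child_code.particles.center_of_mass()
    escapers = [p for p in child_code.particles
                if (p.position - com).length() > r_child]
    for p in escapers:
        parent_code.particles.add_particle(p)    # hand over to the parent
        child_code.particles.remove_particle(p)
```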
A parent is strictly separated from each of its children, and their mutual influence is taken into account by the top-level integrator, in which every parent feels its children but each child only feels the forces from the particles in the parent. The children themselves are not interconnected, with the consequence that the mutual forces between children are ignored. The secular effects of a parent on its own direct children are taken into account by the N-body solver that is used to integrate the subsystem.
With this strategy, we create a complex geography of interacting codes. In its simplest form, Nemesis requires three different gravity solvers:
— one code to calculate forces and coordinate all communication between the other codes,
— one code for evolving the macroscopic system, i.e. integrating the particles in the parent system, and
— one code for each of the microscopic systems, integrating the particles in each of the children.
In a more elaborate set-up, children themselves could be subdivided into children, etc. These sub-children could be integrated with another method. In principle, each child and each parent can have its own dedicated integration method, depending on the local requirements. In practice, we often opt for a fourth-order Hermite predictor-corrector scheme for the parent, and some high-order symplectic scheme for each of the children.
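The per-subsystem choice of solver can be expressed as a simple dispatch. In the sketch below, KeplerSolver, SymplecticNBody and SixthOrderHermite are placeholder names for whichever codes are available, and the 'kind' label is an assumed attribute, not part of any actual AMUSE module:

```python
# Hypothetical solver dispatch in a Nemesis-like hierarchy; all class
# names are placeholders for available community codes.
def make_integrator(subsystem):
    if len(subsystem) == 2:
        return KeplerSolver(subsystem)      # analytic two-body orbit [33]
    if subsystem.kind == "planetary":
        return SymplecticNBody(subsystem)   # e.g. a Huayno-like code [37]
    return SixthOrderHermite(subsystem)     # few-body stellar dynamics [31]
```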
(b). The parallelization of Nemesis
Between Bridge time steps, each of the child codes runs independently of any of the others and can run concurrently. This is illustrated in figure 3a. The performance of this task-based parallelism is not optimal because of the relatively large communication overhead for each child and their wide range of execution characteristics. (The integration time of a planetary system depends sensitively on the orbital period and eccentricity of the planet closest to the parent star.)
Figure 3. Dividing the work load over codes and nodes using Nemesis. (a) All subsystem codes run and communicate in parallel. (b) An intermediate level is introduced in which subsystems are managed serially to reduce the communication overhead. (Online version in colour.)
We boost the parallel performance by introducing an intermediate level of hierarchy, in which a subset of the children is resolved on a single compute node. This set-up is illustrated in figure 3b, where the intermediate nodes, called I1 to Im, each serially integrate a subset of children (three each in the illustration). The parallelization over these intermediate nodes allows us to optimally use a wide range of architectures, and the choice of how to divide the children among the intermediate nodes may depend on the complexity of the child or the speed of the node. We introduce further optimizations by allowing two-particle children to be solved analytically as Kepler orbits [33].
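The division of children over the intermediate nodes is a load-balancing problem. Below is a greedy sketch; estimated_cost is a placeholder for whatever cost model is appropriate, for example one based on the orbital period of the innermost planet:

```python
# Greedy assignment of child systems to m intermediate nodes (figure 3b):
# hand the most expensive remaining child to the least-loaded node.
def assign_children(children, m_nodes, estimated_cost):
    nodes = [[] for _ in range(m_nodes)]
    loads = [0.0] * m_nodes
    for child in sorted(children, key=estimated_cost, reverse=True):
        i = loads.index(min(loads))       # least-loaded node so far
        nodes[i].append(child)
        loads[i] += estimated_cost(child)
    return nodes
```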
The hierarchical system now operates on five different levels:
— parents, composed of:
(1) the global parent, integrated using a fourth-order Hermite predictor-corrector scheme,
(2) local parents, integrated using a sixth-order Hermite predictor-corrector scheme;
— children, bridged at second order with the parents and composed of:
(1) two-body systems, solved with a Kepler solver,
(2) multi-star systems, integrated with a sixth-order Hermite predictor-corrector scheme with post-Newtonian treatment,
(3) planetary systems, integrated with a high-order symplectic integrator.
We added the various integration schemes one could use for each of these subsystems for illustrative purposes, but in practice we use only three of these. Here we introduced the sixth-order scheme with post-Newtonian treatment (i.e. including general relativistic effects) to indicate that this may be a way to simulate the evolution of black-hole binaries in a dense star cluster.
As it turns out, Nemesis works excellently for simulating star clusters with multiple hierarchies, such as planetary systems with moons, asteroids and comets, but also when the cluster is composed of binaries or higher-order hierarchical stellar systems that interact.
(c). The evolution of planetary systems in a star cluster
Here we present the results of a calculation in which we integrated the multiple hierarchical Nemesis configuration for simulating a star cluster with planetary systems. In figure 4, we show the distribution of planetary masses and orbital separations for the planetary systems in this cluster at an age of about 2 Myr. The cluster originated from the hydrodynamical collapse of a giant molecular cloud, simulated using a smoothed-particle hydrodynamics code, with sink particles used to mimic star formation. The discs around these stars were turned into multiple planets on circular orbits, with their semi-major axes equally spaced in terms of their mutual Hill radii [35], mimicking oligarchic growth [36]. The orbital separations and eccentricities of the planetary orbits evolve due to dynamical encounters among planets and due to encounters with other stars in the cluster.
Figure 4. Semi-major axes of planetary systems as a function of host-star mass in an N-body simulation in which stars are allowed to have planets in orbits according to the description of Jurić & Tremaine [34]. Some of the more massive companions are dynamically captured brown dwarfs or main-sequence stars. The blue symbols indicate the Solar System's planets. (Online version in colour.)
In this particular calculation, we used a fourth-order Hermite code [30] for the global system and the symplectic direct N-body code Huayno [37] for the planetary systems. Each of these codes operates on its internal time scale, but the coupling Bridge time step was fixed to 100 yr.
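In AMUSE terms, such a coupled run can be set up along the following lines. This is a sketch using the amuse.couple.bridge module and the Hermite and Huayno community codes; the initial conditions are stand-ins, and the production script of [38] differs in detail.

```python
# Couple a cluster code and a planetary-system code through Bridge with a
# fixed coupling step of 100 yr, as in the run described in the text.
from amuse.lab import Hermite, Huayno, new_plummer_model, nbody_system, units
from amuse.couple import bridge

cluster_converter = nbody_system.nbody_to_si(1000. | units.MSun, 1. | units.parsec)
cluster_code = Hermite(cluster_converter)         # global (parent) system [30]
cluster_code.particles.add_particles(
    new_plummer_model(1000, cluster_converter))   # stand-in cluster model

planet_converter = nbody_system.nbody_to_si(1. | units.MSun, 10. | units.AU)
planet_code = Huayno(planet_converter)            # one planetary system [37]
planet_code.particles.add_particles(
    new_plummer_model(6, planet_converter))       # stand-in star plus planets

coupled = bridge.Bridge(timestep=100. | units.yr) # fixed Bridge time step
coupled.add_system(cluster_code, (planet_code,))  # cluster feels the planets
coupled.add_system(planet_code, (cluster_code,))  # planets feel the cluster
coupled.evolve_model(2. | units.Myr)
```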
A more detailed analysis of Nemesis is presented in [38], where we simulate a star cluster for 10 Myr. A total of 512 of the 1500 stars have four, five or six planets. Runs were performed on 14 compute nodes with 16 CPU cores each. A total of 357 planets (out of 2522, or approx. 16.5%) were unbound from their parent star and subsequently unbound from the cluster. These planets will move through the Galaxy without orbiting any star. We could introduce the next level of hierarchy for these escaped planets, integrating them in a background potential of the Milky Way Galaxy. This would introduce yet another Bridge hierarchy into the system.
6. Summary
New computing technologies force us to rethink the development of (astrophysical) simulation codes.4 The trend of new technology and more complex simulation models resulting in ever more complex software has to be broken; otherwise, researchers will continue to spend more time on software development than on actual research.
The growing diversity in available hardware architectures makes today an ideal moment to sit back and rethink the requirements for the next-generation research computing infrastructure. The growing complexity of source code also requires us to rethink the software architecture, in particular to keep simulation environments modular and expandable. By rethinking and reimplementing a large number of astronomical source codes with the objective of making them more modular, we can also accomplish the second goal of more satisfactory scaling. The separate modules are excellently suited to working independently, and therefore to being launched on a wide variety of computer systems. In this work, we presented how such modules interact in a gravitational setting using the Bridge and Nemesis strategies. Shifting our focus from algorithms to methods allows us to create individual physics modules and codes that enable smoother software development in the future.
This focus on modules, rather than algorithms, naturally results in a distributed systems approach to the problem, in which each of the many computer systems runs its dedicated optimized version of the code.
Separating the source code in a modular way has yet another advantage: it allows us to address the multi-scale aspects of the problem with different modules. With a self-consistent module-coupling strategy, it then becomes possible to distribute the workload in terms of performance, but also in terms of physical domain, i.e. in scale. The basis of this algorithmic scale-separation stems from the multi-scale nature of the underlying problem, which requires different processes to be addressed with different algorithms. Paramount to this endeavour is the coupling strategy, for which we adopted a fundamental Hamiltonian-splitting method called Bridge. A bonus of this strategy is that it also works for coupling various physical domains, solving the multi-physics coupling simultaneously with the multi-scale problem.
Future developments follow naturally from the analysis presented here. The astronomical community is sufficiently dedicated, and the entire body of astronomical simulation source code sufficiently confined, that with the dedicated effort of a few dozen researchers we should be able to rewrite a large fraction of the common astronomical source code in the modular system we advocate above. All of this could be done in just a couple of months, provided we can agree on a common language, communication protocol and input/output format. Once we achieve the more modular software objective, computational research in astronomy will be able to employ more complex and higher-resolution models than are possible today.
Acknowledgements
We thank Lucie Jílová, Steve McMillan, Joshua Wall and Alfons Hoekstra for discussions, and the referees for a very helpful report. S.P.Z. thanks Norm Murray and CITA for the hospitality during his long-term visit.
Footnotes
1. The lack of software expertise in the scientific community is only emphasized by the use of scripting languages for simulation codes.
2. Commonly called spaghetti code.
3. There is an entire chapter on replica computing in this journal, and how this could be supported by modern supercomputer job-scheduling environments.
4. Data-processing codes face a similar dilemma.
Data accessibility
Our software is available at amusecode.org and https://github.com/treecode.
Authors' contributions
All authors worked intensively on the manuscript and on the development of the codes referred to in the manuscript. All authors have read and approved the manuscript.
Competing interests
The author(s) declare that they have no competing interests.
Funding
This work was supported by The Netherlands Research School for Astronomy (NOVA), NWO (grant no. 621.016.701 [LGM-II]) and by the European Union's Horizon 2020 research and innovation program under grant agreement no. 671564 (COMPAT project).
References
- 1. Belleman RG, Bédorf J, Portegies Zwart SF. 2008 High performance direct gravitational N-body simulations on graphics processing units II: an implementation in CUDA. New Astron. 13, 103–112. (doi:10.1016/j.newast.2007.07.004)
- 2. Nickolls J, Buck I, Garland M, Skadron K. 2008 Scalable parallel programming with CUDA. Queue 6, 40–53. (doi:10.1145/1365490.1365500)
- 3. Amazon. 2006 Amazon EC2. See https://aws.amazon.com/ec2/ (accessed 15 June 2018).
- 4. Moore GE. 1965 Cramming more components onto integrated circuits. Electronics 38, 114–117.
- 5. Portegies Zwart S, Bédorf J. 2015 Using GPUs to enable simulation with computational gravitational dynamics in astrophysics. Computer 48, 50–58. (doi:10.1109/MC.2015.334)
- 6. Portegies Zwart S. 2018 Computational astrophysics for the future. Science 361, 979–980. (doi:10.1126/science.aau3206)
- 7. Bédorf J, Gaburov E, Portegies Zwart S. 2012 A sparse octree gravitational N-body code that runs entirely on the GPU processor. J. Comput. Phys. 231, 2825–2839. (doi:10.1016/j.jcp.2011.12.024)
- 8. Bédorf J, Gaburov E, Fujii MS, Nitadori K, Ishiyama T, Portegies Zwart S. 2014 24.77 Pflops on a gravitational tree-code to simulate the Milky Way galaxy with 18600 GPUs. In Proc. Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC '14), pp. 54–65. Piscataway, NJ: IEEE Press. (doi:10.1109/SC.2014.10)
- 9. Portegies Zwart S et al. 2009 A multiphysics and multiscale software environment for modeling astrophysical systems. New Astron. 14, 369–378. (doi:10.1016/j.newast.2008.10.006)
- 10. Portegies Zwart S, McMillan SLW, van Elteren AK, Pelupessy I, de Vries N. 2013 Multi-physics simulations using a hierarchical interchangeable software interface. Comput. Phys. Commun. 183, 456–468. (doi:10.1016/j.cpc.2012.09.024)
- 11. Pelupessy FI, van Elteren AK, de Vries N, McMillan SLW, Drost N, Portegies Zwart SF. 2013 The astrophysical multipurpose software environment. Astron. Astrophys. 557, A84. (doi:10.1051/0004-6361/201321252)
- 12. Pelupessy I, van Werkhoven B, van Elteren A, Viebahn J, Candy A, Portegies Zwart S, Dijkstra H. 2017 The oceanographic multipurpose software environment (OMUSE v1.0). Geosci. Model Dev. 10, 3167–3187. (doi:10.5194/gmd-10-3167-2017)
- 13. Crommelin et al. 2019.
- 14. Fujii M, Iwasawa M, Funato Y, Makino J. 2007 BRIDGE: a direct-tree hybrid N-body algorithm for fully self-consistent simulations of star clusters and their parent galaxies. Publ. Astron. Soc. Jpn 59, 1095. (doi:10.1093/pasj/59.6.1095)
- 15. von Hoerner S. 1960 Die numerische Integration des N-Körper-Problemes für Sternhaufen. I. Z. Astrophys. 50, 184–214.
- 16. Aarseth SJ, Hoyle F. 1964 An assessment of the present state of the N-body problem. Astrophys. Norv. 9, 313.
- 17. van Albada TS. 1968 Numerical integrations of the N-body problem. Bull. Astron. Inst. Netherlands 19, 479.
- 18. Borgdorff J, Lorenz E, Hoekstra AG, Falcone JL, Chopard B. 2011 A principled approach to distributed multiscale computing, from formalization to execution. In IEEE Seventh Int. Conf. on e-Science Workshops, December 2011. (doi:10.1109/eScienceW.2011.9)
- 19. Karabasov S, Nerukh D, Hoekstra A, Chopard B, Coveney PV. 2014 Multiscale modelling: approaches and challenges. Phil. Trans. R. Soc. A 372, 20130390. (doi:10.1098/rsta.2013.0390)
- 20. Alowayyed S, Groen D, Coveney PV, Hoekstra AG. 2017 Multiscale computing in the exascale era. J. Comput. Sci. 22, 15–25. (doi:10.1016/j.jocs.2017.07.004)
- 21. Portegies Zwart SF. 2009 The lost siblings of the Sun. Astrophys. J. Lett. 696, L13–L16. (doi:10.1088/0004-637X/696/1/L13)
- 22. Portegies Zwart S, Torres S, Pelupessy I, Bédorf J, Cai MX. 2018 The origin of interstellar asteroidal objects like 1I/2017 U1 'Oumuamua. Mon. Not. R. Astron. Soc. 479, L17–L22. (doi:10.1093/mnrasl/sly088)
- 23. Wisdom J, Holman M. 1991 Symplectic maps for the N-body problem. Astron. J. 102, 1528–1538. (doi:10.1086/115978)
- 24. Verlet L. 1967 Computer 'experiments' on classical fluids. I. Thermodynamical properties of Lennard-Jones molecules. Phys. Rev. 159, 98–103. (doi:10.1103/PhysRev.159.98)
- 25. McMillan SLW, Aarseth SJ. 1993 An O(N log N) integration scheme for collisional stellar systems. Astrophys. J. 414, 200–212. (doi:10.1086/173068)
- 26. Iwasawa M, Portegies Zwart S, Makino J. 2015 GPU-enabled particle-particle particle-tree scheme for simulating dense stellar cluster systems. Comput. Astrophys. Cosmol. 2, 6. (doi:10.1186/s40668-015-0010-1)
- 27. Portegies Zwart SF, McMillan SLW, van Elteren AK, Pelupessy FI, de Vries N. 2013 Multi-physics simulations using a hierarchical interchangeable software interface. Comput. Phys. Commun. 184, 456–468. (doi:10.1016/j.cpc.2012.09.024)
- 28. Martínez-Barbosa CA, Jílková L, Portegies Zwart S, Brown AGA. 2017 The rate of stellar encounters along a migrating orbit of the Sun. Mon. Not. R. Astron. Soc. 464, 2290–2300. (doi:10.1093/mnras/stw2507)
- 29. Saladino MI, Pols OR, van der Helm E, Pelupessy I, Portegies Zwart S. 2018 Gone with the wind: the impact of wind mass transfer on the orbital evolution of AGB binary systems. Astron. Astrophys. 618, A50. (doi:10.1051/0004-6361/201832967)
- 30. Makino J, Aarseth SJ. 1992 On a Hermite integrator with Ahmad-Cohen scheme for gravitational many-body problems. Publ. Astron. Soc. Jpn 44, 141–151.
- 31. Nitadori K, Makino J. 2008 Sixth- and eighth-order Hermite integrator for N-body simulations. New Astron. 13, 498–507. (doi:10.1016/j.newast.2008.01.010)
- 32. Hut P, McMillan S, Makino J, Portegies Zwart S. 2010 Starlab: a software environment for collisional stellar dynamics. Astrophysics Source Code Library.
- 33. Kepler J. 1609 Astronomia nova, vol. 1.
- 34. Jurić M, Tremaine S. 2008 Dynamical origin of extrasolar planet eccentricity distribution. Astrophys. J. 686, 603–620. (doi:10.1086/529172)
- 35. Zhou JL, Lin DNC, Sun YS. 2007 Post-oligarchic evolution of protoplanetary embryos and the stability of planetary systems. Astrophys. J. 666, 423–435. (doi:10.1086/509299)
- 36. Kokubo E, Ida S. 1998 Oligarchic growth of protoplanets. Icarus 131, 171–178. (doi:10.1006/icar.1997.5840)
- 37. Pelupessy FI, Jänes J, Portegies Zwart S. 2012 N-body integrators with individual time steps from hierarchical splitting. New Astron. 17, 711–719. (doi:10.1016/j.newast.2012.05.009)
- 38. van Elteren AK, Pelupessy I, McMillan SLW, Portegies Zwart S. 2019 The survivability of planetary systems in young and dense star clusters. Mon. Not. R. Astron. Soc.