Author manuscript; available in PMC: 2024 Jan 4.
Published in final edited form as: Ind Eng Chem Res. 2022 Oct 24;61(43):16128–16140. doi: 10.1021/acs.iecr.2c01636

Simulation-optimization framework for the digital design of pharmaceutical processes using Pyomo and PharmaPy

Daniel Laky a, Daniel Casas-Orozco a, Carl D Laird b, Gintaras V Reklaitis a, Zoltan K Nagy a,*
PMCID: PMC10765421  NIHMSID: NIHMS1954469  PMID: 38179037

Abstract

Model-based process design and optimization in the pharmaceutical industry is an important problem that is challenging both computationally and in the choice of solution implementation. In this work, a framework is presented that directly utilizes a process simulator via callbacks during derivative-based optimization. Translating mechanistic ODEs and PDEs into the robust, fully discretized algebraic formulations required for simultaneous equation-oriented optimization demands considerable experience. The framework allows users with little such experience to obtain mathematically guaranteed optima at a competitive solution time when compared with existing derivative-free and derivative-based frameworks. The effectiveness of the framework, in terms of both accuracy of the optimal solution and computational efficiency, is analyzed on two case studies: (i) an integrated 2-unit reaction synthesis train used for the synthesis of an anti-cancer active pharmaceutical ingredient, and (ii) a more complex flowsheet representing a common synthesis-purification-isolation train of a pharmaceutical manufacturing process.

1. Introduction

Pharmaceutical process design and optimization via digital modeling and simulation presents an effective route to improve manufacturing procedures, while ensuring that critical product guidelines are followed, and process efficiency is optimized. Given the stringent guidelines on safety, efficacy and product quality standards, the need for accurate digital models during the simulation and optimization of pharmaceutical manufacturing processes is imperative. As with other sectors of the process industry, deriving equations and implementing a new digital model for each unique process present a significant technical barrier and an impractical usage of technical staff time in many applications. Thus, general modeling tools, such as AspenPlus1 and gPROMS,2 have met a critical need in the market for process simulation within the general chemical processing and pharmaceutical industries. Specifically for pharmaceutical manufacturing, the new open-source Python tool PharmaPy has been developed to facilitate end-to-end optimal design of batch, continuous, and hybrid operating systems.3 PharmaPy is a process simulation tool that has parameter estimation capabilities and optimization frameworks that are currently being built around the tool, such as the one presented in this work.

However, models that are written or developed to have good numerical behavior during simulation do not always behave well during process optimization. Therefore, choosing the correct tool or framework for optimization in terms of both numerical robustness and ease of user implementation can play an important role in the success of optimal process development. Various optimization strategies have been used in process optimization applications, which can be differentiated based on whether the model equations and optimization problem are solved simultaneously (e.g. by discretizing the model equations and adding them as constraints to the optimization problem), or in a sequential-iterative way, whereby model equations are solved via callbacks within the optimization iterations. Direct multiple-shooting optimization strategies, which combine discretization and sequential callback-based solution of the model, have also been used successfully, especially in the case of solving dynamic optimization and model predictive control problems.4,5 The sequential optimization approaches can be further differentiated by whether derivative-based or derivative-free techniques are employed in solving the optimization problem. In this article three optimization frameworks will be addressed: (i) fully algebraic simultaneous equation-oriented optimization, (ii) derivative-free sequential simulation-optimization, and (iii) derivative-based sequential simulation-optimization. Note that simulation-optimization here indicates that at each iteration of the optimization algorithm, a simulation model is executed with inputs from the current iterate to obtain the numerical information needed to evaluate the objective function and constraints. This action of calling the simulation model directly at each iteration will be referred to as a callback throughout this work.

If available, the use of up-to-date mechanistic models from the literature provides the most accurate basis for building process simulations. However, challenges using fully mechanistic models arise when those models do not exist within current literature due to novelty or lack of mechanistic exploration, or when complexity of the modeling technique required to accurately capture unit operation behavior is prohibitively expensive. A discussion of this practical reality is given in Rogers and Ierapetritou.6 Surrogate-based modeling using fully data-driven models or hybrid mechanistic/data-driven models may be employed to overcome these difficulties. As a particular example, in Boukouvala et al.,7 a continuous tableting process was analyzed with a fully data-driven feeder model, a mechanistic milling model, and a hybrid mechanistic/data-driven hopper model in the same process. In that article, both the absence of a mechanistic model in the literature for the feeder, and the computational complexity of discrete element and finite element methods for other solids processing steps are cited as barriers to using mechanistic models for process optimization.

However, when possible, the use of mechanistic models with simultaneous optimization strategies and fully discretized models as algebraic constraints should not be overlooked, due to their computational efficiency. Simultaneous optimization approaches have recently regained some traction in the pharmaceutical space through design flexibility formulations aligning with design space and probabilistic design space identification.8,9,10 The usage of simultaneous equation-oriented optimization and modeling has distinct advantages by enabling direct access to powerful numerical solvers such as IPOPT11 for local solution of nonlinear optimization problems and BARON12 for global solution of mixed-integer nonlinear programs, as well as others such as KNITRO13 and Couenne.14 These derivative-based solvers mathematically guarantee optimality and interface directly with the algebraic formulations, avoiding the need to simulate or integrate model equations at each solver iteration. However, the user expertise and time required to formulate the discretized, algebraic equation-oriented models from complex ODEs, DAEs, or PDEs are difficult to quantify and can be a significant barrier.

These difficulties with simultaneous optimization encourage the use of a simulator or black box to represent the process model during process optimization. There has long been interest in response surface methodologies, fully data-driven techniques, as well as hybrid mechanistic/data-driven frameworks in process optimization and control. Some recent examples include Bayesian optimization to reduce the number of experiments required to model tablet properties via a wet molding procedure15 and Kriging-based approximation of objective and constraints on several global optimization test problems.16 Multiple linear regression was used to approximate the boundary of the probabilistic design space to find optimal operation of an integrated filtration-washing-drying model.17 Kriging, adaptive sampling, and a proposed feasibility-driven surrogate-based optimization was also used for various global optimization test problems with the intention of furthering pharmaceutical manufacturing modeling capabilities.18 A recent review of surrogate-based modeling and optimization strategies in chemical process engineering was given by McBride and Sundmacher,19 where it was noted that pharmaceutical processes constitute an important application space for surrogate-based modeling. However, the purpose of the present work is to develop a framework around PharmaPy using the simulator directly in the optimization. Therefore, those techniques that employ data-driven, black box or grey-box modeling to reduce the cost of an existing simulation, which are applicable to the general problem of pharmaceutical process design and optimization, are beyond the scope of this work.

Throughout this work, a PharmaPy simulation model will be used as a black box when derivative-free or derivative-based simulation-optimization methodologies are used. A relatively comprehensive review of the use cases of a derivative-free constrained black box optimization software tool used in this work is given by Alarie et al.20 However, providing a solver with derivative information can be beneficial. For instance, fmincon in MATLAB may be used with finite-difference based derivative evaluations to enable a derivative-based simulation-optimization framework.21 However, the subscription fee to access any software tool, such as MATLAB, can inhibit widespread adoption. Therefore, an alternative open-source framework in Python with access to powerful nonlinear solvers like IPOPT would be beneficial as a derivative-based simulation-optimization framework around PharmaPy.

There have also been examples of incorporating rigorous process simulators within process synthesis problems solved using derivative-based simulation-optimization schemes. For example, ASPEN-HYSYS was used as a process simulator along with a MATLAB framework to perform generalized disjunctive programming (GDP) to determine optimal process configuration of the process with callbacks.22 Another example used ASPEN-HYSYS with a logic-based outer approximation23 to determine the optimal configuration for an isobutane alkylation.24 However, the same drawback of using a simulator, ASPEN-HYSYS, or framework, MATLAB, that is not open exists in both these cases. Another example of GDP used a finite-differencing scheme to evaluate relevant derivative values from a PRO/II process simulation, but utilized a derivative-free optimizer while solving the non-linear programming subproblem at each iteration.25

There have been recent developments in Pyomo26 that have allowed for the incorporation of input-output, or black box, models in otherwise fully algebraic mathematical programs. The PyNumero package,27 an extension within Pyomo, along with the availability of the cyipopt package,28 and the ability to obtain gradient approximations via callbacks to PharmaPy allow for derivative-based solution of non-linear programs with no explicit algebraic representation of the process that PharmaPy simulates at each solver iteration. This allows for an effective, open-source, derivative-based simulation-optimization framework built around PharmaPy.

The remainder of this article is organized into three sections. In section 2, we present five solution frameworks for optimizing pharmaceutical process design with a focus on methods that utilize callbacks to a simulator. In this section, we introduce a framework that uses partial algebraic and partial black box modeling to solve a simulation-optimization problem with derivative-based solvers. In section 3, we present two case studies and results when using those methods described in section 2. Then, section 4 offers the conclusions of the work.

2. Methodology

In this section, the simulation-optimization methodology using Pyomo with callbacks to the pharmaceutical process simulator, PharmaPy, will be described in detail. There are also descriptions of four alternative methodologies to solve a general simulation-optimization problem using the same PharmaPy model. As noted in the introduction, these alternatives represent approaches that are currently available and may set a baseline in terms of solution quality and computational time with respect to the Pyomo/PharmaPy framework.

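A general optimization formulation for process design is given by Eq. (1-5):

$$\min_{d,\,x,\,z,\,\theta_p,\,\theta_m} \; f(d, x, z, \theta_p, \theta_m) \tag{1}$$
$$\text{s.t.} \quad h(d, x, z, \theta_p, \theta_m) = 0 \tag{2}$$
$$g(d, x, z, \theta_p, \theta_m) \le 0 \tag{3}$$
$$d \in D, \quad x \in X, \quad z \in Z \tag{4}$$
$$\theta_m \in \Theta_m, \quad \theta_p \in \Theta_p \tag{5}$$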

Where d are design variables (e.g. equipment design), x are state variables, z are control variables, θm are model parameters (e.g. kinetic parameters), and θp are process parameters (e.g. operational decision variables). The process is described by model equations h, which determine the interaction of state variables with system physics. Often, the process has operational or safety constraints, and especially in the pharmaceutical industry, critical quality attributes (CQAs) on the product, which are represented by inequality constraints g. The function f represents the performance metric for the process optimization (e.g. minimize production cost).

Additionally, in pharmaceutical process design, the underlying model equations h often represent a complex system of dynamic, nonlinear equations. In some cases, this complexity constitutes a barrier to using fully algebraic, simultaneous equation-oriented optimization approaches, providing a need to accurately model and simulate the pharmaceutical process in question. To mitigate this modeling hurdle, four of the five methods shown will utilize a PharmaPy model to simulate a portion of a pharmaceutical manufacturing line.

2.1. Reactor Models

PharmaPy is a Python-based, open-source pharmaceutical process simulation tool. There are a variety of unit operations available for small-molecule oral drug substance manufacturing processes with the ability to simulate fully batch, fully continuous, and hybrid operating modes. In this work, case study 1 involves a two-step reaction synthesis where the process is composed of two continuous reactors and was written and simulated using PharmaPy. This process as simulated in PharmaPy represents the model equations to be considered during optimization, as described later in sections 2.5 and 2.6. The synthesis process that case study 1 represents is comprised of both a plug flow reactor (PFR) and a continuous stirred tank reactor (CSTR) as shown in Figure 1 below.

Figure 1: Schematic representation of the two-reactor synthesis of case study 1 API, species E.

A general model for a PFR assuming constant density and limited axial diffusion is represented as a set of partial differential equations below:

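$$\frac{\partial C_j}{\partial t} = -\frac{V_{\mathrm{PFR}}}{\tau_{\mathrm{PFR}}} \frac{\partial C_j}{\partial V} + \sum_i r_i \nu_{i,j} \quad \forall j \in \{1, \ldots, n_{comp}\} \tag{6}$$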

Where Eq. (6) is the mass balance equation. Cj is concentration of species j as a function of position and time, VPFR is the total volume of the PFR, τPFR is the overall residence time in the PFR, ri is reaction rate for reaction i and vi,j is the stoichiometric coefficient for species j in reaction i. Through experimentation, it was found that the heats of reaction for this API are relatively constant in temperature. Thus, in this work, the unit operations are set to be isothermal, removing the need to model energy balances in the set of model equations. During scale-up, this assumption should be revisited.

For CSTRs with assumed good mixing, constant density and isothermal operation as described above, no volume coordinate is necessary resulting in the following set of ordinary differential equations:

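$$\frac{dC_j}{dt} = \frac{1}{\tau_{\mathrm{CSTR}}} \left(C_{j,in} - C_j\right) + \sum_i r_i \nu_{i,j} \quad \forall j \in \{1, \ldots, n_{comp}\} \tag{7}$$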

Where Eq. (7) represents the component mass balances. Cj is concentration of species j as a function of time, τCSTR is the residence time for the CSTR, ri is reaction rate for reaction i and vi,j is the stoichiometric coefficient for species j in reaction i.

In both systems, mass action kinetics are assumed with reaction rates following the Arrhenius rate law equation:

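$$r_i = \bar{k}_i \prod_j C_j^{\alpha_{i,j}} \quad \forall i \in \{1, \ldots, n_{rxn}\} \tag{8}$$
$$\bar{k}_i = A_i \exp\left(-\frac{E_{a,i}}{RT}\right) \quad \forall i \in \{1, \ldots, n_{rxn}\} \tag{9}$$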

Here, for reaction i, k¯i is the overall reaction rate, Ai is the preexponential factor, and Ea,i is the activation energy. The reaction order αi,j corresponds to the stoichiometric coefficient vi,j when species j is a reactant, i.e. when vi,j is less than 0. A detailed description of the reactor module and those equations used for the PFR and CSTR is given in Casas-Orozco et al.3
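As a minimal sketch of how Eq. (7)-(9) fit together, the functions below evaluate the Arrhenius rate constants, the mass-action rates, and the right-hand side of the CSTR balance. This is not PharmaPy code: the function names, the single reaction A + B -> C, and all numerical values are hypothetical, and the reaction orders are assumed to equal the magnitude of the reactant stoichiometric coefficients.

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol K)

def rate_constants(A, Ea, T):
    """Eq. (9): k_i = A_i * exp(-Ea_i / (R T)) for each reaction i."""
    return [a * math.exp(-ea / (R_GAS * T)) for a, ea in zip(A, Ea)]

def reaction_rates(k, conc, orders):
    """Eq. (8): r_i = k_i * prod_j C_j ** alpha_ij."""
    rates = []
    for k_i, alpha_i in zip(k, orders):
        r = k_i
        for c_j, a_ij in zip(conc, alpha_i):
            r *= c_j ** a_ij
        rates.append(r)
    return rates

def cstr_rhs(conc, conc_in, tau, k, orders, stoich):
    """Eq. (7): dC_j/dt = (C_j,in - C_j) / tau + sum_i r_i * nu_ij."""
    r = reaction_rates(k, conc, orders)
    return [
        (c_in - c) / tau + sum(r_i * nu_i[j] for r_i, nu_i in zip(r, stoich))
        for j, (c, c_in) in enumerate(zip(conc, conc_in))
    ]

# Hypothetical single reaction A + B -> C (illustrative values only)
stoich = [[-1, -1, 1]]  # nu_ij, negative for reactants
orders = [[1, 1, 0]]    # alpha_ij, assumed |nu_ij| for reactants and 0 for products
k = rate_constants(A=[1.0e6], Ea=[5.0e4], T=298.15)
dCdt = cstr_rhs([1.0, 0.5, 0.0], [1.0, 0.5, 0.0], tau=600.0, k=k,
                orders=orders, stoich=stoich)
```

With the vessel contents equal to the feed, the flow term vanishes and only the reaction term remains, so the two reactants are consumed at the rate at which the product forms.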

2.2. Crystallization model

PharmaPy also has capabilities for simulation of purification via crystallization and isolation via filtration, among other separations unit operations. In case study 2, a hybrid flowsheet is used to produce paracetamol crystals in the form of a cake to be further processed using drug product manufacturing processes. Case study 2 includes a PFR followed by a holding tank. The holding tank holds material to be processed in batches using cooling crystallization. Ultimately, the slurry produced from the crystallization step is then filtered using a batch filter. A schematic of the process is shown in Figure 2.

Figure 2: Schematic representation of the hybrid process for the synthesis-purification of the API paracetamol, used in case study 2.

In crystallization, particle evolution through various crystal mechanisms within the vessel can be described using a population balance model (PBM). Crystallization is driven by a thermodynamic gradient, which in this work is created using cooling of the reaction mixture. Assuming that batch crystallization is taking place in a perfectly mixed tank, the temperature trajectory is known and well-defined, and crystal growth is one-dimensional and size independent, a general PDE over the crystal size distribution, f, can be written as shown below:

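$$\frac{\partial f}{\partial t} + G \frac{\partial f}{\partial x} = B\,\delta(x - x_0), \quad S > 0 \tag{10}$$
$$\frac{\partial f}{\partial t} - D \frac{\partial f}{\partial x} = 0, \quad S < 0 \tag{11}$$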

Here, G represents the crystal growth rate, B represents the nucleation rate of new crystals, and D represents the dissolution rate. Each of these crystallization kinetic mechanisms is driven by S, the relative supersaturation of the crystallizing species in the liquid phase. When S is greater than zero, the existing crystals will grow, and new crystals may form. When S is less than zero, existing crystals lose size or completely dissolve. In case study 2, S is controlled during batch crystallization by imposing a temperature profile, which allows a design criterion on crystal size to be appropriately met. The internal coordinate of the distribution is x, the crystal size. In Eq. (10), δ is the Dirac delta function, which is non-zero only at the nucleated crystal size, x0. In PharmaPy, simulation of the entire distribution at each time step is accomplished by converting the PDE system in Eq. (10) and (11) to an ODE system by discretizing over the size dimension, x, using the finite volume method and a high-resolution discretization scheme.29 More details on this procedure are given in Casas-Orozco et al.3
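The idea of discretizing the size dimension can be sketched with a first-order upwind finite-volume step for the growth branch, Eq. (10). This is a deliberate simplification: PharmaPy is described as using a high-resolution scheme, whereas the sketch below uses plain upwinding with explicit Euler time stepping. The function name and all numerical values are hypothetical.

```python
def pbm_growth_step(f, G, B, dx, dt):
    """One explicit, first-order upwind finite-volume step of Eq. (10), S > 0.

    f holds cell-averaged number densities on a uniform size grid.
    Stability requires G * dt / dx <= 1.
    """
    n = len(f)
    new_f = [0.0] * n
    for i in range(n):
        inflow = G * f[i - 1] if i > 0 else 0.0  # flux entering from the smaller-size cell
        outflow = G * f[i]                       # flux leaving toward the larger-size cell
        source = B / dx if i == 0 else 0.0       # nucleation deposited in the first cell
        new_f[i] = f[i] + dt * ((inflow - outflow) / dx + source)
    return new_f

# Illustrative values in arbitrary units
n_cells, dx, dt = 50, 1.0, 0.5
G, B = 1.0, 2.0
f = [0.0] * n_cells
f[2] = 10.0  # seed crystals in one size cell
for _ in range(10):
    f = pbm_growth_step(f, G, B, dx, dt)
total_number = sum(f) * dx
```

Because the scheme is written in flux form, the total crystal count changes only through nucleation (and through crystals leaving the largest cell), which gives a simple check: ten seed crystals plus B * dt per step over ten steps.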

The crystallization model must be closed using a material balance on the compounds in the liquid phase. Therefore, an ODE representing a mass balance on each compound can be written as shown below:

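$$\frac{dC_j}{dt} = -\frac{tr}{V_L} \left(\delta_{j,tg} - \frac{C_j}{\rho_L}\right) \quad \forall j \in \{1, \ldots, n_{comp}\} \tag{12}$$
$$\frac{dV_L}{dt} = -\frac{tr}{\rho_L} \tag{13}$$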

Where Cj is the mass concentration of species j, and VL and ρL are the volume and density of the liquid phase, respectively. For the target compound, a transfer term from liquid to solid phase is written as tr, where transfer only happens when the Kronecker delta δj,tg is non-zero, or when species j is the target compound indicated by index tg. The transfer term tr must be defined as follows:

$$tr = \begin{cases} 3 k_v \rho_s G \mu_2, & S > 0 \\ -3 k_v \rho_s D \mu_2, & S < 0 \end{cases} \tag{14}$$

Where ρs and kv are the solid density and shape factor of the crystals, and μ2 is the second moment of the crystal size distribution. In general, the n-th moment of the crystal size distribution is defined below:

$$\mu_n = \int_0^{\infty} x^n f(x)\,dx \tag{15}$$

Where f is once again the crystal size distribution defined by crystal size x.
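On a discretized size grid, the moment integral in Eq. (15) becomes a quadrature. The sketch below uses the trapezoidal rule on a truncated grid; the function name and the exponential test distribution are hypothetical and chosen only because its moments are known analytically (the n-th moment equals n!).

```python
import math

def moment(x, f, n):
    """Trapezoidal approximation of mu_n = int_0^inf x^n f(x) dx on a finite grid."""
    total = 0.0
    for i in range(len(x) - 1):
        left = x[i] ** n * f[i]
        right = x[i + 1] ** n * f[i + 1]
        total += 0.5 * (left + right) * (x[i + 1] - x[i])
    return total

# Hypothetical distribution f(x) = exp(-x) on a grid truncated at x = 40
x = [0.01 * i for i in range(4001)]
f = [math.exp(-xi) for xi in x]
mu0 = moment(x, f, 0)  # analytical value: 1
mu2 = moment(x, f, 2)  # analytical value: 2
```

The grid must extend far enough into the tail that the truncated part of the integral is negligible, which matters more for higher moments.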

2.3. Filtration model

As seen in Figure 2, the process also includes a batch filtration step. This process is modeled as an ODE on filtrate mass considering a pressure drop applied across the slurry given cake and liquid properties of the slurry being filtered. Under constant cake resistivity αcake and filter medium resistance Rf, and considering a filtration cross-sectional area Af, the model for remaining liquid in the filtration chamber, mf, can be written as shown below:

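$$\frac{dm_f}{dt} = \frac{\Delta P}{\mu_L \left( \dfrac{\alpha_{cake} C_s m_f}{A_f^2 \rho_L^2} + \dfrac{R_f}{A_f \rho_L} \right)} \tag{16}$$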

Where ΔP is the pressure drop applied across the slurry, μL and ρL are viscosity and density of the liquid component of the slurry, respectively, and Cs is the mass concentration of solids in the slurry phase.
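When all parameters in Eq. (16) are constant and mf(0) = 0, the ODE is separable and integrates to a quadratic in mf, which gives a convenient check on any numerical integration. The sketch below compares that closed form with a simple midpoint integration. The function names and all parameter values are hypothetical and are not taken from the case study.

```python
import math

def filtration_rhs(m, dP, mu, alpha, Cs, A, rho, Rf):
    """Right-hand side of Eq. (16)."""
    return dP / (mu * (alpha * Cs * m / (A ** 2 * rho ** 2) + Rf / (A * rho)))

def filtrate_mass(t, dP, mu, alpha, Cs, A, rho, Rf):
    """Closed-form solution of Eq. (16) for constant parameters and m(0) = 0.

    Integrating (a*m + b) dm = (dP/mu) dt gives a*m^2/2 + b*m = dP*t/mu.
    """
    a = alpha * Cs / (A ** 2 * rho ** 2)  # cake-resistance coefficient
    b = Rf / (A * rho)                    # filter-medium coefficient
    return (-b + math.sqrt(b * b + 2.0 * a * dP * t / mu)) / a

# Illustrative SI values only
params = dict(dP=1.0e5, mu=1.0e-3, alpha=1.0e10, Cs=100.0,
              A=0.01, rho=1000.0, Rf=1.0e10)

m, dt = 0.0, 0.001
for _ in range(10000):  # explicit midpoint integration to t = 10 s
    k1 = filtration_rhs(m, **params)
    k2 = filtration_rhs(m + 0.5 * dt * k1, **params)
    m += dt * k2
m_exact = filtrate_mass(10.0, **params)
```

The filtration rate falls as the cake builds up, so the accumulated mass grows roughly with the square root of time once cake resistance dominates.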

2.4. Simultaneous Equation-Oriented Optimization

For the first method, fully algebraic simultaneous equation-oriented optimization is performed by modeling the two-reactor flowsheet as a system of differential algebraic equations (DAEs) in Pyomo. Using the equation system above and the pyomo.DAE module,30 a discretized version of the PFR and CSTR models can be constructed for solution in a simultaneous algebraic equation-oriented fashion. First, the PFR must be discretized both spatially and temporally. Since the problem is defined with initial conditions and boundary conditions at the inlet, a backward difference scheme for both time and volume may be used to construct the following set of algebraic equations:

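$$\left[\frac{\partial C_j}{\partial t}\right]_{t_k,V_l} = -\frac{V_{\mathrm{PFR}}}{\tau_{\mathrm{PFR}}} \left[\frac{\partial C_j}{\partial V}\right]_{t_k,V_l} + \sum_i r_i \nu_{i,j} \quad \forall (j,k,l) \tag{17}$$
$$[C_j]_{t_{k+1},V_l} = [C_j]_{t_k,V_l} + \Delta t \left[\frac{\partial C_j}{\partial t}\right]_{t_{k+1},V_l} \quad \forall (j,l),\; k \in \{1, \ldots, n_t - 1\} \tag{18}$$
$$[C_j]_{t_k,V_{l+1}} = [C_j]_{t_k,V_l} + \Delta V \left[\frac{\partial C_j}{\partial V}\right]_{t_k,V_{l+1}} \quad \forall (j,k),\; l \in \{1, \ldots, n_V - 1\} \tag{19}$$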

Here, discretization step sizes for time, Δt with nt steps, and volume, ΔV with nV steps, are introduced from the backward difference scheme. These transformations are automated with the pyomo.DAE package, with more complex collocation options available, such as the Lagrange-Radau and Lagrange-Legendre methods.31 However, for this work backward differences are used.

The initial conditions and boundary conditions for the PFR in this work can be written in discretized form, respectively, as follows:

$$C_j(0, V_l) = 0 \quad \forall l \in \{1, \ldots, n_V - 1\} \tag{20}$$
$$C_j(t_k, 0) = C_{j,\mathrm{PFR},in} \quad \forall k \tag{21}$$

Here, the initial concentration for all species is zero throughout the reactor. For the boundary condition at all times, the first volume element is set to the inlet concentration.
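To illustrate what the backward-difference system in Eq. (17)-(21) looks like once assembled, the sketch below marches it in time for a single species with a hypothetical first-order consumption rate r = k C. Because every difference looks backward, each unknown depends only on the previous time level and the upstream node, so the equations can be solved by one sweep from the inlet. In the simultaneous approach these same relations would instead be posed as constraints for all k and l at once. The function name and parameter values are assumptions for illustration.

```python
import math

def pfr_backward_step(c_old, c_in, k, tau, dt):
    """Advance one time step of Eq. (17)-(19) for one species with r = k * C.

    c_old[l] is the concentration at volume node l; node 0 is the inlet.
    """
    n_v = len(c_old) - 1   # number of volume steps
    beta = dt * n_v / tau  # dt * (V_PFR / tau_PFR) / dV, with dV = V_PFR / n_v
    c_new = [c_in]         # inlet boundary condition, Eq. (21)
    for l in range(1, n_v + 1):
        # Backward differences in t and V, solved for C at (t_{k+1}, V_l)
        c_new.append((c_old[l] + beta * c_new[l - 1]) / (1.0 + beta + dt * k))
    return c_new

# Illustrative values only
n_v, tau, k, dt, c_in = 50, 100.0, 0.01, 1.0, 1.0
c = [0.0] * (n_v + 1)  # empty reactor at t = 0, Eq. (20)
for _ in range(2000):  # march well past the residence time
    c = pfr_backward_step(c, c_in, k, tau, dt)
outlet = c[-1]
ideal = math.exp(-k * tau)  # steady-state outlet of the continuous PFR model
```

At steady state the discrete outlet equals c_in / (1 + k * tau / n_v) ** n_v, which approaches the continuous value exp(-k * tau) only as n_v grows. This is one concrete form of the discretization tuning left to the user.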

For the CSTR model, only temporal discretization is required. Once again using a backward difference scheme, the following algebraic equation set may be formulated:

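$$\left[\frac{dC_j}{dt}\right]_{t_k} = \frac{1}{\tau_{\mathrm{CSTR}}} \left[C_{j,in} - C_j\right]_{t_k} + \sum_i r_i \nu_{i,j} \quad \forall (j,k) \tag{22}$$
$$[C_j]_{t_{k+1}} = [C_j]_{t_k} + \Delta t \left[\frac{dC_j}{dt}\right]_{t_{k+1}} \quad \forall j,\; k \in \{1, \ldots, n_t - 1\} \tag{23}$$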

Here Δt represents the discretization step size with nt steps. In generalized applications, the discretization scheme in the time coordinate for individual units may require unique discretization schemes due to temporal stiffness and unit operation physics. This would require an additional set of algebraic constraints for interpolation from an upstream unit operation to a downstream unit operation, expanding the size of the problem. Given this paradigm, interpolation of inputs from unit to unit is avoided by adopting the most elaborate time discretization scheme for all units in the synthesis flowsheet. Therefore, Δt and nt are equivalent for both the PFR and CSTR. The pyomo.DAE package necessarily handles the sharing of a time domain among variables during discretization.
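The benefit of a shared time grid can be seen in a small sketch of Eq. (22)-(23): when the inlet trajectory is already defined at the same time points as the CSTR state, it can be consumed point by point with no interpolation constraints. The example assumes a single species with a hypothetical first-order rate r = k C; the function name and values are illustrative.

```python
def cstr_backward_euler(c0, c_in_traj, tau, k, dt):
    """Backward-difference trajectory of Eq. (22)-(23) for one species, r = k * C.

    c_in_traj is the inlet concentration on the same time grid as the output.
    """
    traj = [c0]
    for c_in_next in c_in_traj[1:]:
        # C_{k+1} = C_k + dt * [(C_in,k+1 - C_{k+1}) / tau - k * C_{k+1}], solved for C_{k+1}
        traj.append((traj[-1] + dt * c_in_next / tau) / (1.0 + dt / tau + dt * k))
    return traj

# Illustrative values only: constant inlet on a grid of n_t points
n_t, dt, tau, k = 2001, 1.0, 50.0, 0.02
inlet = [1.0] * n_t
traj = cstr_backward_euler(0.0, inlet, tau, k, dt)
```

For a constant inlet the trajectory settles at c_in / (1 + k * tau), the familiar steady state of a first-order reaction in a CSTR.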

There is an intermediate mixing step which occurs as a pure feed of a reactant for the second reaction must be added. This step assumes constant volume, as the additional reactant has negligible flowrate compared to the flowrate from the first reactor. Using these assumptions, the mixer equations for discretized time are shown below:

$$C_{j,\mathrm{MIX}}(t_k) = C_{j,\mathrm{PFR}}(t_k, V_{R01}) \quad \forall (j,k) \tag{24}$$

Here, the mixing unit concentration for species j at time tk is set to the concentration from the upstream unit or source. In this work, the concentration of those species from the final volume element of the PFR and those fed directly into the mixer from a pure source are the values of interest.

With the concentration of the mixer defined, a final connecting constraint for the inlet concentration conditions of the CSTR discretized in time is represented with the following set of equations:

$$C_{j,\mathrm{CSTR},in}(t_k) = C_{j,\mathrm{MIX}}(t_k) \quad \forall (j,k) \tag{25}$$

Using these discretized equations, model equations h from Eq. (2) may be written in the form of an algebraic system of non-linear equations using an algebraic modeling language, such as Pyomo, and passed to a solver, such as IPOPT. Collecting all the necessary discretized equations, a general representation of the flowsheet in Figure 1 with model equations h explicitly defined is shown below:

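$$\left[\frac{\partial C_{j,\mathrm{PFR}}}{\partial t}\right]_{t_k,V_l} = -\frac{V_{\mathrm{PFR}}}{\tau_{\mathrm{PFR}}} \left[\frac{\partial C_{j,\mathrm{PFR}}}{\partial V}\right]_{t_k,V_l} + \sum_i r_{i,\mathrm{PFR}} \nu_{i,j} \quad \forall (j,k,l) \tag{26}$$
$$[C_{j,\mathrm{PFR}}]_{t_{k+1},V_l} = [C_{j,\mathrm{PFR}}]_{t_k,V_l} + \Delta t \left[\frac{\partial C_{j,\mathrm{PFR}}}{\partial t}\right]_{t_{k+1},V_l} \quad \forall (j,l),\; k \in \{1, \ldots, n_t - 1\} \tag{27}$$
$$[C_{j,\mathrm{PFR}}]_{t_k,V_{l+1}} = [C_{j,\mathrm{PFR}}]_{t_k,V_l} + \Delta V \left[\frac{\partial C_{j,\mathrm{PFR}}}{\partial V}\right]_{t_k,V_{l+1}} \quad \forall (j,k),\; l \in \{1, \ldots, n_V - 1\} \tag{28}$$
$$C_{j,\mathrm{PFR}}(0, V_l) = 0 \quad \forall l \in \{1, \ldots, n_V - 1\} \tag{29}$$
$$C_{j,\mathrm{PFR}}(t_k, 0) = C_{j,\mathrm{PFR},in} \quad \forall k \tag{30}$$
$$C_{j,\mathrm{MIX}}(t_k) = C_{j,\mathrm{PFR}}(t_k, V_{R01}) \quad \forall (j,k) \tag{31}$$
$$\left[\frac{dC_{j,\mathrm{CSTR}}}{dt}\right]_{t_k} = \frac{1}{\tau_{\mathrm{CSTR}}} \left[C_{j,\mathrm{CSTR},in} - C_{j,\mathrm{CSTR}}\right]_{t_k} + \sum_i r_{i,\mathrm{CSTR}} \nu_{i,j} \quad \forall (j,k) \tag{32}$$
$$[C_{j,\mathrm{CSTR}}]_{t_{k+1}} = [C_{j,\mathrm{CSTR}}]_{t_k} + \Delta t \left[\frac{dC_{j,\mathrm{CSTR}}}{dt}\right]_{t_{k+1}} \quad \forall j,\; k \in \{1, \ldots, n_t - 1\} \tag{33}$$
$$C_{j,\mathrm{CSTR},in}(t_k) = C_{j,\mathrm{MIX}}(t_k) \quad \forall (j,k) \tag{34}$$
$$[r_{i,\mathrm{PFR}}]_{t_k,V_l} = \bar{k}_i \prod_j [C_{j,\mathrm{PFR}}]_{t_k,V_l}^{\alpha_{i,j}} \quad \forall (i,k,l) \tag{35}$$
$$[r_{i,\mathrm{CSTR}}]_{t_k} = \bar{k}_i \prod_j [C_{j,\mathrm{CSTR}}]_{t_k}^{\alpha_{i,j}} \quad \forall (i,k) \tag{36}$$
$$\bar{k}_i = A_i \exp\left(-\frac{E_{a,i}}{RT}\right) \quad \forall i \tag{37}$$
$$i \in \{1, \ldots, n_{rxn}\}, \quad j \in \{1, \ldots, n_{comp}\}, \quad k \in \{1, \ldots, n_t\}, \quad l \in \{1, \ldots, n_V\} \tag{38}$$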

This fully algebraic representation of model equations h would be passed into the general formulation represented with Eq. (1-5) including algebraic representations of objective function f and constraints g to develop a fully simultaneous, equation-oriented optimization program.

Note that the discretization of temporal and spatial domains shown here is handled automatically by the pyomo.DAE package. However, tuning the number of discretization points, selecting the collocation method, and writing the model algebra in a numerically robust form are tasks left to the user. Throughout the remainder of this work, this method will be referred to as the SO method.

As another important note, Pyomo is a Python-based, open-source modeling and optimization tool, and IPOPT with efficient linear subroutines is available for free to academics, with a subset of those routines available for free for personal use.

For case study 2, a PDE for the PFR, a large-scale PDE for the crystallizer, and an additional ODE for the filtration system are required to construct the system algebraically. The complexity, size, and poor scaling of the PDE system for the batch cooling crystallizer are themselves prohibitive for modeling the system in a simultaneous, equation-oriented setting. Therefore, in case study 2, the simultaneous, equation-oriented method is not employed. Techniques similar to those provided in this section would be useful in formulating such optimization problems algebraically; however, the poor scaling of crystallization processes in this format requires care when discretizing the time and crystal size dimensions. For this method to work, the discretized system must remain numerically stable over the entire search space of a given formulation, and assuring such stability is nontrivial.

2.5. Derivative-Free Sequential Simulation-Optimization

The second and third methods employ black-box solvers where the optimization of the black box is carried out with derivative-free approaches. Several optimization tools are available within SciPy,32 a widely used open-source Python library. In this work, the SciPy implementation of the well-known Nelder-Mead algorithm was utilized.33 Here, state variables are not discretized and represented with explicit algebraic constraints as in the previous section. Also, the formulation described by Eq. (1-5) is no longer sufficient, as the constraints g cannot be considered through an algebraic formulation as in section 2.4. The SciPy Nelder-Mead implementation requires a single scalar value, the objective function value, from each callback. Therefore, constraint violations must be quantified at each simulation callback, and penalty functions are added to the objective function as shown below:

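$$\min_{d,\,z,\,\theta_p,\,\theta_m} \; f(d, \bar{x}, z, \theta_p, \theta_m) + \sum_c \alpha_c p_c \tag{39}$$
$$\text{s.t.} \quad \bar{x} = \text{PharmaPy}(d, z, \theta_m, \theta_p) \tag{40}$$
$$p_c = \max\left(0,\, g_c(d, \bar{x}, z, \theta_m, \theta_p)\right)^2 \quad \forall c \tag{41}$$
$$d \in D, \quad z \in Z \tag{42}$$
$$\theta_m \in \Theta_m, \quad \theta_p \in \Theta_p \tag{43}$$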

Here, the values of the states $\bar{x}$ are generated by a call to the PharmaPy simulation. Using those states, each inequality constraint gc and the objective function f are evaluated. Then, the augmented objective function is formed by adding each constraint violation pc multiplied by its penalty weight αc. Note that this form is known as an exterior penalty function, as it allows some constraint violation to occur. As mentioned previously, SciPy's Nelder-Mead routine only accepts a scalar objective function value, which necessitates the use of this augmented objective. Throughout the remainder of this work, the derivative-free approach using the SciPy Nelder-Mead method will be referred to as DF-NM. A key user input required for this approach is the selection of the penalty weights. In practice, convergence to a feasible final solution may require solving a sequence of problems of the form described by Eq. (39-43) with increasing values of the penalty weights.
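A minimal sketch of the augmented objective in Eq. (39)-(41) is shown below. The wrapper turns a simulator, an objective, and a list of inequality constraints into a single scalar-valued callback. The `simulate` argument stands in for the PharmaPy call; the toy model, the function names, and the penalty weight are hypothetical.

```python
def augmented_objective(f_value, g_values, weights):
    """Eq. (39) with Eq. (41): f + sum_c alpha_c * max(0, g_c) ** 2."""
    penalties = [max(0.0, g) ** 2 for g in g_values]
    return f_value + sum(a * p for a, p in zip(weights, penalties))

def make_callback(simulate, objective, constraints, weights):
    """Wrap a simulator so a scalar-only optimizer sees one number per call."""
    def callback(decision):
        states = simulate(decision)  # stands in for the PharmaPy simulation
        g_values = [g(decision, states) for g in constraints]
        return augmented_objective(objective(decision, states), g_values, weights)
    return callback

# Toy stand-in: the "simulation" returns x = 2 * u; minimize u subject to
# x >= 1, written in the g <= 0 convention as 1 - x <= 0.
toy = make_callback(
    simulate=lambda u: 2.0 * u[0],
    objective=lambda u, x: u[0],
    constraints=[lambda u, x: 1.0 - x],
    weights=[100.0],
)
```

A callable of this shape can be passed directly to `scipy.optimize.minimize` with `method="Nelder-Mead"`. Re-solving with larger entries in `weights`, starting from the previous solution, corresponds to the sequence of problems with increasing penalty weights.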

The third technique is another black-box optimization routine utilizing a mesh adaptive direct search (MADS) algorithm, NOMAD.34 NOMAD offers a Python interface, pyNomad, providing relatively easy incorporation of the PharmaPy simulation. Within NOMAD, automatic constraint handling is offered with either explicit rejection of infeasible points or a less strict progressive barrier approach that tolerates points with small constraint violations. Therefore, no user-made augmented objective-penalty function is required while using NOMAD. Rather, the constraint residuals are evaluated at each callback and automatically considered during the execution of the derivative-free MADS routine. In this work, a stop criterion for the mesh grid size, $\varepsilon_{\text{DF-MADS}} = 10^{-5}$, was specified. It should be noted that although the NOMAD Python interface was used, PharmaPy may also be called from other programming languages, including NOMAD's native C++. Throughout the remainder of this work, the NOMAD method will be referred to as DF-MADS.

2.6. Derivative-Based Sequential Simulation-Optimization

In the final two methods, a derivative-based simulation-optimization framework was utilized for process design. For this purpose, a formulation with callbacks to the PharmaPy simulation without penalty functions was used, shown below:

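$$\min_{d,\,z,\,\theta_p,\,\theta_m} \; f(d, \bar{x}, z, \theta_p, \theta_m) \tag{44}$$
$$\text{s.t.} \quad \bar{x} = \text{PharmaPy}(d, z, \theta_m, \theta_p) \tag{45}$$
$$g(d, \bar{x}, z, \theta_m, \theta_p) \le 0 \tag{46}$$
$$d \in D, \quad z \in Z \tag{47}$$
$$\theta_m \in \Theta_m, \quad \theta_p \in \Theta_p \tag{48}$$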

Here, constraints g may now be explicitly modelled with outputs from the callback to the PharmaPy model. The first derivative-based method uses callbacks to PharmaPy from MATLAB's fmincon algorithm. In this work, the default interior-point algorithm was used with central finite differences and a step size of $\varepsilon_{\text{DB-MATLAB,FD}}$ (chosen for this work to be $10^{-3}$). Given the versatility of Python code within other frameworks, as noted with C++ previously, wrapping callbacks to PharmaPy within fmincon was straightforward. The main drawback of using MATLAB is that MATLAB and its related optimization libraries are behind a paywall. For the remainder of this work, the derivative-based method using MATLAB will be called DB-MATLAB.
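For reference, a central-difference gradient of a scalar callback can be sketched as follows. This only illustrates the formula; it is not how fmincon is implemented, and treating the step as an absolute perturbation of each input is an assumption made here. The function name and test function are hypothetical.

```python
def central_difference_gradient(func, theta, step=1.0e-3):
    """Approximate d(func)/d(theta_i) by (f(theta + h e_i) - f(theta - h e_i)) / (2 h)."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += step    # forward-perturbed input
        down[i] -= step  # backward-perturbed input
        grad.append((func(up) - func(down)) / (2.0 * step))
    return grad

# Hypothetical smooth response standing in for a simulation output
grad = central_difference_gradient(lambda th: th[0] ** 2 + 3.0 * th[1], [1.0, 2.0])
```

Each gradient costs two callbacks per input, so the number of simulations per iteration grows linearly with the number of decision variables.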

A free alternative to MATLAB's fmincon capabilities comes with newer developments in Pyomo facilitated by the PyNumero extension. These extensions allow for callbacks to input-output, or black-box, models while still using the state-of-the-art nonlinear optimization solver IPOPT through the cyipopt interface. Using this functionality, an algebraic model representing Eq. (44) and (46) is written in Pyomo and an input-output model is drafted, noted as the PharmaPy function in Eq. (45). This input-output model has functions for computing the relevant Jacobian and Hessian information at each iteration step. As IPOPT is a derivative-based solver, the accuracy of these derivatives is important for numerical robustness and convergence. In this case, Jacobian values for the input-output model were computed via backward finite differences with absolute and relative step sizes $\varepsilon_{\text{DB-Pyomo,abs}}$ and $\varepsilon_{\text{DB-Pyomo,rel}}$, respectively. A general backward difference scheme is shown below:

$$\frac{df}{d\theta_p} = \frac{f(\theta_p) - f(\theta_p - \Delta\theta_p)}{\Delta\theta_p} \tag{49}$$

Where f is the function and Δθp is the step size. The procedure for obtaining an appropriate finite differencing step size for a given iteration was modelled after a method used in SUNDIALS, shown below:

$$\Delta\theta_{p,i} = \varepsilon_{\text{DB-Pyomo,rel}}\,\theta_{p,i} + \varepsilon_{\text{DB-Pyomo,abs}} \quad \forall i \tag{50}$$

Here, $\theta_p$ is a vector of inputs to the model with length $n_{\text{inputs}}$, $\varepsilon_{\text{DB-Pyomo,abs}}$ is the absolute tolerance (chosen for this work to be $10^{-4}$), $\varepsilon_{\text{DB-Pyomo,rel}}$ is the relative tolerance (chosen for this work to be $10^{-4}$), and $\Delta\theta_{p,i}$ is the step size corresponding to input $\theta_{p,i}$. This step size routine scales with the magnitude of the input variables and performed better than using a flat value for the finite differencing step size.
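The backward difference and step size rule described above can be illustrated with a minimal sketch. The function names and the toy model are hypothetical stand-ins (a real callback would run a PharmaPy flowsheet), and the absolute value in the step size is an added safeguard that makes no difference for inputs scaled between 0 and 1.

```python
from typing import Callable, List, Sequence


def backward_difference_jacobian(
    model: Callable[[Sequence[float]], Sequence[float]],
    theta: Sequence[float],
    eps_rel: float = 1e-4,
    eps_abs: float = 1e-4,
) -> List[List[float]]:
    """Approximate d(outputs)/d(inputs) by backward differences."""
    base = list(model(theta))  # one callback at the current point
    jac = [[0.0] * len(theta) for _ in base]
    for i, theta_i in enumerate(theta):
        # Relative-plus-absolute step size; abs() is an assumption of this sketch
        step = eps_rel * abs(theta_i) + eps_abs
        shifted = list(theta)
        shifted[i] = theta_i - step
        perturbed = model(shifted)  # one additional callback per input
        for j, (f0, f1) in enumerate(zip(base, perturbed)):
            jac[j][i] = (f0 - f1) / step
    return jac


def toy_model(theta: Sequence[float]) -> List[float]:
    """Hypothetical input-output model standing in for a simulator run."""
    x, y = theta
    return [x * y, x + 2.0 * y]


jac = backward_difference_jacobian(toy_model, [0.5, 0.5])
```

With this scheme, each Jacobian evaluation costs one callback per input plus one at the base point, which is why limiting the number of solver iterations matters for the overall solve time.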

The Hessian was not evaluated via finite differences, to limit callbacks to the PharmaPy model; instead, the quasi-Newton BFGS approximation35 was used within IPOPT. With this derivative information defined at each iteration, cyipopt can interface with IPOPT using the Harwell linear subroutines even though a portion of the model is an input-output model and not explicitly defined as algebraic constraints. It should also be noted that using the more sophisticated linear solvers that are compatible with IPOPT greatly increases the chances of convergence, often with fewer iterations and less time per iteration than the default free linear subroutines. In this work, for the simulation-optimization methodology in Pyomo, the MA27 linear solver36 was used, and the algorithm tolerance ($\varepsilon_{\text{DB-Pyomo}}$) was set to $5\cdot10^{-4}$. For the remainder of this work, the derivative-based method using Pyomo will be referred to as DB-Pyomo.
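As a rough illustration, the solver settings described above correspond to the following standard IPOPT options. This is a sketch of one plausible configuration, not the authors' actual script: how the dictionary is handed to the solver interface is left out, and IPOPT's quasi-Newton option is a limited-memory BFGS variant.

```python
# Standard IPOPT option names; the values mirror the settings described in the text.
ipopt_options = {
    "hessian_approximation": "limited-memory",  # quasi-Newton instead of exact Hessian
    "limited_memory_update_type": "bfgs",       # BFGS update formula
    "linear_solver": "ma27",                    # Harwell (HSL) MA27, must be installed
    "tol": 5e-4,                                # DB-Pyomo convergence tolerance
}
```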

A summary of the methods used, with important algorithmic stopping criteria, is shown below in Table 1.

Table 1.

Model characteristics and stopping criteria for each solution methodology.

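| Approach | Model Format | Objective | Constraints | Stopping Criteria |
|---|---|---|---|---|
| SO | Discretized | Explicit | Explicit | $\varepsilon_{\text{SO}} = 10^{-6}$ |
| DF-NM | PharmaPy Sim. | Augmented | Penalties | $\varepsilon_{\text{DF-NM}} = 10^{-4}$ |
| DF-MADS | PharmaPy Sim. | Explicit | Explicit | $\varepsilon_{\text{DF-MADS}} = 10^{-5}$ |
| DB-MATLAB | PharmaPy Sim. | Explicit | Explicit | $\varepsilon_{\text{DB-MATLAB}} = 10^{-6}$ |
| DB-Pyomo | PharmaPy Sim. | Explicit | Explicit | $\varepsilon_{\text{DB-Pyomo}} = 5\cdot10^{-4}$ |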

In addition to these criteria, the derivative-based methods (DB-MATLAB and DB-Pyomo) have finite difference step sizes specified. These values are summarized in Table 2.

Table 2.

Finite difference constants for relevant methodologies.

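| Approach | Constant |
|---|---|
| DB-MATLAB (overall, central difference) | $\varepsilon_{\text{DB-MATLAB,FD}} = 10^{-3}$ |
| DB-Pyomo (absolute, backward difference) | $\varepsilon_{\text{DB-Pyomo,abs}} = 10^{-4}$ |
| DB-Pyomo (relative, backward difference) | $\varepsilon_{\text{DB-Pyomo,rel}} = 10^{-4}$ |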

A generalized workflow for each methodology is shown side-by-side in Figure 3. The SO method uses a standard algebraic formulation and calls directly to IPOPT. The DF methodologies evaluate the penalty-augmented objective function using a callback to PharmaPy, converging by reducing the size of a simplex or mesh search space. Finally, the DB methodologies utilize callbacks to PharmaPy to calculate the function values required to compute finite-difference derivatives at each solver iteration.

Figure 3:


Algorithm workflows for each solution methodology. In DF and DB, the added step of PharmaPy callbacks is required before evaluating the mathematical program at each iteration. Notably, the termination criteria and models are all unique, as described in their respective methodology sections.

In this work, the decision variables in all methods were scaled from physical values to between 0 and 1 to avoid numerical issues with finite differencing and constraint penalty evaluations in those relevant methodologies.

3. Results

In this section, two case studies are explored to assess the performance of the optimization methodologies described in the previous section. Case study one involves a continuous, two-step synthesis of an unnamed API. Case study two consists of a hybrid pharmaceutical manufacturing process comprised of a continuous synthesis step followed by batchwise purification using a cooling crystallizer and filter.

3.1. Case Study 1

Using each of the methods described in the previous section, a case study involving the maximization of the rate of production of an anti-cancer drug produced via a two-unit reactor sequence was undertaken. The reaction system for the active pharmaceutical ingredient (API) can be represented as follows, with flowsheet representation in Figure 1.

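$$A + B \xrightarrow{\bar{k}_1} C + D \tag{51}$$
$$C + D \xrightarrow{\bar{k}_2} E + F \tag{52}$$
$$A + F \xrightarrow{\bar{k}_3} S_1 \tag{53}$$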

Here, species A and B are fed into a PFR, R01, to produce reaction intermediate species C. Species C is then combined with an incoming flow of species D where API species E and subproduct species F are created in a CSTR, R02. A side reaction occurs when excess initial reactant A combines with subproduct F to generate an undesired contaminant species S1. The temperature effect on reaction kinetics is modelled by a standard Arrhenius rate law with activation energies $E_a \in \{2.0, 2.0, 2.0\}$ [kJ mol$^{-1}$] and preexponential factors $A \in \{2.1, 0.07, 0.01\}$ [L mol$^{-1}$ s$^{-1}$] from Eq. (9).
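For illustration, the rate constants implied by these parameters can be evaluated with the textbook Arrhenius form $k = A\exp(-E_a/(RT))$. This is a sketch only: Eq. (9) is not reproduced here and may use a different parameterization, and the temperature below is a hypothetical value chosen for the example.

```python
import math

R_GAS = 8.314462618e-3  # universal gas constant, kJ mol^-1 K^-1


def arrhenius(pre_exp: float, e_a: float, temp_k: float) -> float:
    """Textbook Arrhenius law; assumed form, not necessarily that of Eq. (9)."""
    return pre_exp * math.exp(-e_a / (R_GAS * temp_k))


E_A = [2.0, 2.0, 2.0]        # kJ mol^-1, as listed in the text
PRE_EXP = [2.1, 0.07, 0.01]  # L mol^-1 s^-1, as listed in the text
TEMP_K = 298.15              # hypothetical operating temperature

rate_constants = [arrhenius(a, e, TEMP_K) for a, e in zip(PRE_EXP, E_A)]
```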

A PharmaPy model of the system described in Figure 1 was created for analysis. Here, some design decisions were fixed: the PFR tube diameter was set to 0.5 inches, and the initial contents of both the PFR and CSTR consisted only of solvent, with no reagents. Also, in Figure 1, the decision variables for the system are shown for each relevant component. These decision variables are the inlet concentration of species A and B, the volume and residence time of reactor R01 (PFR), the effective inlet concentration of species D, and the residence time of R02, which determines the volume of R02 at a given flowrate. The bounds and initial values of these variables are summarized in Table 3. It should be noted that only one value is listed for $C_{AB,in}$ as the reacting species were fed in an equimolar ratio.

Table 3.

Bounds and initial values for process decision variables for case study 1.

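| Variable | Lower Bound | Upper Bound | Initial Value |
|---|---|---|---|
| $C_{AB,in}$ | 0.01 mol/L | 1.0 mol/L | 0.505 mol/L |
| $C_{D,in}$ | 0.01 mol/L | 1.0 mol/L | 0.505 mol/L |
| $\tau_{R01}$ | 150 s | 2400 s | 1275 s |
| $\tau_{R02}$ | 150 s | 4800 s | 2475 s |
| $V_{R01}$ | 0.0001 m$^3$ | 0.01 m$^3$ | 0.00505 m$^3$ |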
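The bounds in Table 3 can be mapped to and from the unit interval with a simple linear transformation. The following is a minimal sketch of such a scaling; the dictionary keys and function names are hypothetical, and the linear form is an assumption about how the scaling was applied.

```python
# Bounds from Table 3: name -> (lower, upper)
BOUNDS = {
    "C_AB_in": (0.01, 1.0),      # mol/L
    "C_D_in": (0.01, 1.0),       # mol/L
    "tau_R01": (150.0, 2400.0),  # s
    "tau_R02": (150.0, 4800.0),  # s
    "V_R01": (0.0001, 0.01),     # m^3
}


def to_unit(name: str, value: float) -> float:
    """Map a physical value onto [0, 1] using its bounds."""
    lower, upper = BOUNDS[name]
    return (value - lower) / (upper - lower)


def to_physical(name: str, scaled: float) -> float:
    """Map a scaled value in [0, 1] back to physical units."""
    lower, upper = BOUNDS[name]
    return lower + scaled * (upper - lower)


# A scaled value of 0.5 for every variable gives the midpoint of each range
initial_point = {name: to_physical(name, 0.5) for name in BOUNDS}
```

Evaluating the midpoint of each range reproduces the initial values listed in Table 3.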

It should be further noted that all the input variables were scaled from 0 to 1 to avoid accuracy issues with numerical derivatives from the PharmaPy model. These decision variables act as the inputs to the PharmaPy model, with the outputs being the state variables relevant to the optimization formulation. Also, each method was initialized at the same point, 0.5 for each scaled variable, which corresponds to the initial values shown in Table 3 above. A general optimization formulation for this case study with design and CQA constraints is shown below:

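$$\max_{\tau_{R01},\,\tau_{R02},\,V_{R01},\,C_{AB,in},\,C_{D,in}} C_E \frac{V_{R01}}{\tau_{R01}} \tag{54}$$
$$\text{s.t.}\quad h(d,x,z,\theta_p,\theta_m) = 0 \tag{55}$$
$$C_E \le \alpha C_{sat} \tag{56}$$
$$C_A + C_B + C_C + C_D \le C_E \tag{57}$$
$$\frac{V_{R01}}{\tau_{R01}}\, t_{batch} \le 50\ \text{L} \tag{58}$$
$$\frac{V_{R01}}{\tau_{R01}}\, \tau_{R02} \le 20\ \text{L} \tag{59}$$
$$4.0 \le \frac{C_E}{C_C} \tag{60}$$
$$\theta_p \in \Theta_p \tag{61}$$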

Here Eq. (56) ensures that during operation the maximum concentration of API, species E, is such that it does not prematurely crystallize in reactor R02. An α value of 0.95 was chosen to this end. Eq. (57) ensures that unreacted precursors to the API do not exceed the API concentration. This is to prevent unrealistic separation costs and prevent complications with crystallization dynamics in the following processing step. Eq. (58) ensures that crystallization batch size does not exceed 50 L, with crystallization batch time tbatch set to 3 hours for this case study. Eq. (59) ensures that volume of reactor R02 does not exceed 20 L. It is assumed that the flow in and out of both R01 and R02 is equal during operation to keep operating volume constant. Then, Eq. (60) ensures that conversion of intermediate species C to API is sufficiently large. This constraint is important because, in the real system that motivated this case study, it is known that species C can crystallize under conditions similar to those applicable to the API.

For each case, the representation of these constraints and model equations h is slightly altered. In the SO case, model equations h are a fully discretized DAE system written in Pyomo as shown in section 2.2 in Eqs. (26) to (38). As a note, only active species in each unit are modelled, to avoid numerical issues with trivial "0 = 0" equations that arise in mass balances when a species is not present. With the other four approaches, model equations h are captured as an input-output model represented in PharmaPy. More importantly, with the DF-NM approach, the constraints in Eqs. (56) to (60) are evaluated with violations treated as penalties added to the objective function, with penalty weights α of [$10^{6}$, $10^{8}$, $10^{12}$, $10^{12}$, $10^{8}$] to scale penalties similarly and ensure low constraint violation upon algorithm convergence. The final penalty weights were chosen after executing a series of optimization problems with varying penalty weights to ensure convergence to a feasible optimal solution. In all approaches barring DF-NM, an algebraic representation of the constraints is used, and constraints are handled by the algorithms internally.
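A penalty-augmented objective of the kind used for DF-NM can be sketched as follows. The quadratic penalty form, the sign convention (constraints written as $g_k \le 0$, maximization recast as minimization), and the function name are assumptions made for this example; the weights are the five values quoted above.

```python
from typing import Sequence

# Penalty weights for the five inequality constraints, as quoted in the text
PENALTY_WEIGHTS = [1e6, 1e8, 1e12, 1e12, 1e8]


def penalized_objective(
    objective: float,
    constraint_values: Sequence[float],
    weights: Sequence[float] = PENALTY_WEIGHTS,
) -> float:
    """Value to minimize: -objective + sum_k w_k * max(0, g_k)^2.

    Each g_k is a constraint residual written so that g_k <= 0 is feasible.
    The quadratic penalty is an assumed form for illustration.
    """
    penalty = sum(w * max(0.0, g) ** 2 for w, g in zip(weights, constraint_values))
    return -objective + penalty


feasible = penalized_objective(0.265, [-1.0, -1.0, -1.0, -1.0, -1.0])
violated = penalized_objective(0.265, [1e-3, -1.0, -1.0, -1.0, -1.0])
```

Because the weights differ by several orders of magnitude, even small violations of the volume constraints dominate the objective, which is consistent with the tuning effort noted for this method.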

With the case study defined above and the methods described in detail in section 2, the optimization formulations were solved using a MacBook Pro with a 2.6 GHz quad-core Intel i7 and 16 GB of 2133 MHz LPDDR3 RAM. Results for decision variables, solve time, and number of callbacks to PharmaPy are summarized in Table 4.

Table 4.

Optimization objective, decision, and timing results for each technique for case study 1.

| Variable | SO | DF-NM | DF-MADS | DB-MATLAB | DB-Pyomo |
|---|---|---|---|---|---|
| $C_{AB,in}$ (mol/L) | 0.0878 | 0.0880 | 0.0814 | 0.0834 | 0.0836 |
| $C_{D,in}$ (mol/L) | 0.115 | 0.112 | 0.117 | 0.116 | 0.116 |
| $\tau_{R01}$ (s) | 150.1 | 381.2 | 1956.6 | 1626.7 | 1529.4 |
| $\tau_{R02}$ (s) | 2670.9 | 2968.2 | 4010.9 | 3460.0 | 3446.0 |
| $V_{R01}$ (L) | 0.695 | 1.76 | 9.06 | 7.53 | 7.08 |
| Flow rate (L/h) | 16.669 | 16.667 | 16.668 | 16.667 | 16.667 |
| Product Flow (kg/h) | 0.26530 | 0.26538 | 0.26538 | 0.26537 | 0.26536 |
| Solve time (s) | 4.5 | 762.8 | 881. | 885.1 | 148.8 |
| PharmaPy Callbacks | N/A | 561 | 559 | 621 | 112 |

From comparison of the flowrate and product flow obtained, it is evident that each method produces results of similar quality. Since in this application the flowrate is the limiting factor, only the optimal ratio of reactor R01 volume to R01 residence time will be unique. As shown, that ratio can be achieved via several different residence time and volume pairings. The SO approach provides the lowest solution time, which was expected given the large overhead of calling a simulator at each iteration in the case of all other approaches. However, the total analyst time to implement the SO model for this case, which is not quantified in the table, greatly exceeds that of the other four methods.

Among the methods using callbacks, the DB-Pyomo framework has the most efficient convergence. This method takes only about 150 seconds of computational time with 112 callbacks to PharmaPy. The derivative-free black-box methods and the DB-MATLAB method require over 750 seconds of solve time and over 550 callbacks to PharmaPy, roughly 5 to 6 times the computational time and callbacks required by the proposed framework.
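The ratios quoted above follow directly from the solve times and callback counts reported in Table 4. A short sketch of the arithmetic, with hypothetical variable names:

```python
# (solve time in s, PharmaPy callbacks) for the callback-based methods, from Table 4
RESULTS = {
    "DF-NM": (762.8, 561),
    "DF-MADS": (881.0, 559),
    "DB-MATLAB": (885.1, 621),
    "DB-Pyomo": (148.8, 112),
}

ref_time, ref_calls = RESULTS["DB-Pyomo"]

# Ratio of each method's cost to that of DB-Pyomo
ratios = {
    method: (time / ref_time, calls / ref_calls)
    for method, (time, calls) in RESULTS.items()
    if method != "DB-Pyomo"
}

# Average wall-clock time per callback for every method
seconds_per_callback = {
    method: time / calls for method, (time, calls) in RESULTS.items()
}
```

The time per callback is of similar magnitude for all four methods, which suggests that the number of simulator calls, rather than solver overhead, accounts for most of the difference in solve time.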

The DF-NM method is easy to use in terms of the user time required from installation to implementation; however, penalty weights had to be tuned over several iterations to converge to results of similar quality to the other methods. NOMAD automatically handles the incorporation of constraints, so this extra user input was not required for the DF-MADS method. With DB-MATLAB, controlling Python environments to properly call PharmaPy is straightforward, and resulted in a reliable solution relatively quickly. The major drawbacks of fmincon were that it provided the slowest solution of the five methods and that it requires a MATLAB license, whereas all the other techniques used open-source code modules and tools that are free to academics.

The different approaches have slightly varying values of the decision variables, which is most apparent in the variation in volume and residence time of the PFR reactor, R01. These variations can be explained by the system flowrate, as higher flowrates correspond to higher objective function values. As shown in Figure 4, the solutions from each method, which differ in residence time and reactor volume, all lie on the curve of optimal flowrate. The optimization formulation represented by Eqs. (54) to (61) could be reformulated and solved with $V_{R01}/\tau_{R01}$ as a single decision variable to eliminate multiplicity in the solution. However, this more general formulation with $V_{R01}$ and $\tau_{R01}$ as independent decision variables would allow an extra degree of freedom in the case of a more detailed techno-economic optimization. If the optimization problem, for example, included an equipment cost term in the objective, the design would be more constrained, and these solutions should collapse onto a unique solution with respect to $V_{R01}$. However, in the definition of the problem used in these computations, these solutions are identical in terms of optimal objective value.
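The multiplicity of solutions can be checked numerically from the rounded values in Table 4: each volume and residence time pair gives essentially the same flowrate, which coincides with the limit implied by a 50 L batch collected over 3 hours. The variable names in this sketch are hypothetical.

```python
# method -> (V_R01 in L, tau_R01 in s), from Table 4
SOLUTIONS = {
    "SO": (0.695, 150.1),
    "DF-NM": (1.76, 381.2),
    "DF-MADS": (9.06, 1956.6),
    "DB-MATLAB": (7.53, 1626.7),
    "DB-Pyomo": (7.08, 1529.4),
}

BATCH_VOLUME_L = 50.0  # crystallizer batch size limit
BATCH_TIME_H = 3.0     # crystallization batch time

# Largest flowrate that keeps the collected volume within the batch size limit
flow_limit_l_per_h = BATCH_VOLUME_L / BATCH_TIME_H

# Flowrate through R01 for each reported solution, converted from L/s to L/h
flows_l_per_h = {
    method: volume / tau * 3600.0 for method, (volume, tau) in SOLUTIONS.items()
}
```

The small differences between the computed flowrates come from the rounding of the tabulated volumes and residence times.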

Figure 4:


Family of optimal solutions for various τR01, VR01 pairs with identical flowrates. Each methodology achieves the same flowrate with varying residence time and volume values.

3.2. Case Study 2

A second case study was explored to analyze a more complex flowsheet representing a common synthesis-purification train of pharmaceutical manufacturing processes. This process is shown with a schematic in Figure 2. As shown, the hybrid manufacturing process for paracetamol includes synthesis, crystallization, and filtration. The process is composed of a continuous reactor (PFR), a batch cooling crystallizer, and a batch filter. Paracetamol (C) can be synthesized from p-aminophenol (A) and acetic anhydride (B) while producing a byproduct of acetic acid (D).

$$A + B \xrightarrow{\bar{k}_1} C + D \tag{62}$$

Here, as in case study 1, species A and B are fed in an equimolar ratio to the PFR before being deposited in a holding tank. The temperature effect on reaction kinetics, as estimated in previous work,37 is modeled by a standard Arrhenius rate law with activation energy Ea of 7676 [kJ mol−1] and a preexponential factor A of 0.126 [L mol−1s−1] from Eq. (9).

After paracetamol is synthesized, batches of the reactor output are purified using cooling crystallization as shown in Figure 2. As described in section 2.2, for crystallization, a population balance model is used to account for the crystallization kinetic mechanisms of growth and primary nucleation of paracetamol. Using a one-dimensional finite volume method (1D-FVM), the population balance model is written as a highly nonlinear PDE system within PharmaPy.3 For this reason, the SO method was not employed for case study 2, as capturing crystallization behavior using 1D-FVM while simultaneously solving a PDE from the preceding PFR represents a substantial scaling and numerical barrier for simultaneous, equation-oriented optimization given current capabilities. The crystallization kinetics used for paracetamol are those found previously by Nagy et al.38 and are summarized in Table 5, with temperature-dependent solubility in kg m−3 determined using Eq. (63). For the filter model, PharmaPy uses a balance on the filtrate volume as an ODE as described in previous work.37 Lastly, for the filter parameters, $\alpha_{cake}$ was set to $10^{9}$ m kg$^{-1}$, and $R_f$ was set to $3\cdot10^{9}$ m$^{-2}$ using a filter diameter of 14 inches ($A_f$ of 0.099 m$^2$) based on previous work.39

$$C_{PCM,sat} = 4.442\cdot10^{3} - 30.76\,T + 0.05376\,T^{2} \tag{63}$$
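The solubility correlation can be evaluated with a short function. This sketch assumes the quadratic is read as $4.442\cdot10^{3} - 30.76\,T + 0.05376\,T^{2}$ with T in kelvin and the result in kg m−3; the sign pattern is an assumption, adopted because it gives positive solubilities that increase with temperature over most of the operating range. The function name is hypothetical.

```python
def pcm_solubility(temp_k: float) -> float:
    """Paracetamol saturation concentration in kg m^-3 (assumed sign pattern)."""
    return 4.442e3 - 30.76 * temp_k + 0.05376 * temp_k ** 2


# Evaluate at two temperatures inside the cooling range considered in this case study
solubility_300 = pcm_solubility(300.0)
solubility_330 = pcm_solubility(330.0)
```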

Table 5:

Crystallization kinetics constants for paracetamol found by Nagy et al.38

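| Constant | Value | Unit |
|---|---|---|
| $k_b$ | 16.034 | # s$^{-1}$ (kg m$^{-3}$)$^{-b}$ |
| $b$ | 6.23 | -- |
| $k_g$ | $6.56\cdot10^{-3}$ | μm s$^{-1}$ (kg m$^{-3}$)$^{-d}$ |
| $g$ | 1.54 | -- |
| $k_d$ | $6.56\cdot10^{-3}$ | μm s$^{-1}$ (kg m$^{-3}$)$^{-d}$ |
| $d$ | 1.54 | -- |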

PharmaPy was then used to model the process in Figure 2. Here, the PFR was fixed to a volume of 0.5 L and a tube diameter of 0.5 inches. The initial contents in the reactor were pure solvent with no initial reactants present. The residence time, τR01, and inlet concentration of species A and B, CAB,in, are the two remaining decision variables for R01. The reactor system is run for 5 hours to populate the holding tank with material. For the crystallizer, a two-segment, piecewise-linear cooling profile is imposed to drive the kinetic mechanisms. For this purpose, the midpoint temperature, T1, and final temperature, T2, are decision variables. Also, the batch time for the crystallizer, tbatch, completes the decision variables for the crystallizer. The filter is chosen to have a diameter of 7 inches with the default pressure drop setting in PharmaPy (1 bar), and consequently has no remaining decision variables. A summary of the decision variables and their bounds for case study 2 is shown in Table 6.

Table 6:

Bounds and initial values for process decision variables for case study 2.

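| Variable | Lower Bound | Upper Bound | Initial Value |
|---|---|---|---|
| $\tau_{R01}$ | 600 s | 4500 s | 2550 s |
| $C_{AB,in}$ | 0.5 mol/L | 2.5 mol/L | 1.5 mol/L |
| $T_1$ | 303 K | 343 K | 323. K |
| $T_2$ | 283 K | 323 K | 303. K |
| $t_{batch}$ | 1 h | 6 h | 3.5 h |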

As with case study 1, all the input variables were scaled from 0 to 1 to avoid accuracy issues with numerical derivatives from the PharmaPy model. These decision variables act as the inputs to the PharmaPy model, with the outputs being the state variables relevant to the optimization formulation. Again, each method was initialized at the same point, 0.5 for each scaled variable, which corresponds to the initial values shown in Table 6. The optimization formulation for the paracetamol crystallization case study is shown below:

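$$\max_{\tau_{R01},\,C_{AB,in},\,T_1,\,T_2,\,t_{batch}} \left(C_{PCM,in} - C_{PCM,in}\right) \frac{V_{R01}}{\tau_{R01}} \tag{64}$$
$$\text{s.t.}\quad h(d,x,z,\theta_p,\theta_m) = 0 \tag{65}$$
$$C_{PCM,in} \le \alpha C_{PCM,sat} \tag{66}$$
$$\frac{C_{AB,in} - C_A}{C_{AB,in}} \ge 0.8 \tag{67}$$
$$\frac{\mu_1}{\mu_0} \ge 25\ \mu\text{m} \tag{68}$$
$$\frac{T_0 - T_1}{t_{cooling}} \le 0.5\ {}^{\circ}\text{C min}^{-1} \tag{69}$$
$$\frac{T_1 - T_2}{t_{cooling}} \le 0.5\ {}^{\circ}\text{C min}^{-1} \tag{70}$$
$$\theta_p \in \Theta_p \tag{71}$$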

Here, Eq. (66) ensures that during operation the maximum concentration of API (paracetamol or PCM) is such that it does not prematurely crystallize in reactor R01. As in the first case study, an α value of 0.95 was chosen to this end. Eq. (67) ensures that the conversion of reactants to paracetamol is at least 80 percent. Eq. (68) ensures that the mean crystal size of paracetamol is at least 25 μm. This constraint ensures that the filtration time of the slurry remains close to the batch time of the batch cooling crystallizer. Eqs. (69) and (70) ensure that the cooling rate of the temperature profile using T0, T1, and T2 does not exceed the maximum cooling ramp rate. In this case, T0 is the inlet temperature from the reaction mixture and T1 and T2 are decision variables for the midpoint and endpoint of the piecewise-linear cooling profile, respectively.
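The mean size and cooling rate constraints can be checked with a few lines of arithmetic. In this sketch, the mean size is taken as the ratio of the first to the zeroth moment of a discretized number distribution, and the two cooling segments are assumed to have equal duration (half the batch time each); the temperatures, sizes, and counts are hypothetical values chosen only to exercise the functions.

```python
from typing import Sequence, Tuple


def mean_size(sizes_um: Sequence[float], counts: Sequence[float]) -> float:
    """Number-mean crystal size mu_1 / mu_0 for a discretized distribution."""
    mu0 = sum(counts)
    mu1 = sum(size * count for size, count in zip(sizes_um, counts))
    return mu1 / mu0


def cooling_rates(
    t0: float, t1: float, t2: float, segment_minutes: float
) -> Tuple[float, float]:
    """Cooling rate of each linear segment in degrees per minute."""
    return (t0 - t1) / segment_minutes, (t1 - t2) / segment_minutes


# Hypothetical example: 6 h batch split into two equal 180 min segments
rates = cooling_rates(t0=330.0, t1=310.0, t2=285.0, segment_minutes=180.0)
size = mean_size(sizes_um=[10.0, 30.0, 50.0], counts=[1.0, 2.0, 1.0])

ramp_ok = all(rate <= 0.5 for rate in rates)  # limit of 0.5 degrees per minute
size_ok = size >= 25.0                        # minimum mean size of 25 um
```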

In this case study, the optimal decision variables from each method match almost exactly, except for the midpoint temperature, T1, in the DF-NM method, as summarized in Table 7. The maximum production of paracetamol from the system is very close to 0.67 kg/h for all alternatives, as shown in the table below. With regard to computational efficiency, the DB-Pyomo method once again has the lowest number of callbacks to the PharmaPy model, with 40 percent fewer callbacks and over 20 percent less computational time than any other methodology used. For visualization, the evolution of the target compound solubility profile and crystal size distribution by number is shown for the DB-Pyomo method optimal solution in Figure 5, generated using the plotting capabilities of PharmaPy.

Table 7:

Optimization objective, decision, and timing results for each technique for case study 2.

| Variable | SO | DF-NM | DF-MADS | DB-MATLAB | DB-Pyomo |
|---|---|---|---|---|---|
| $\tau_{R01}$ (s) | - | 600. | 600. | 600. | 600. |
| $C_{AB,in}$ (mol/L) | - | 1.959 | 1.959 | 1.959 | 1.959 |
| $T_1$ (K) | - | 330.5 | 308.3 | 308.3 | 309.2 |
| $T_2$ (K) | - | 285.3 | 285.6 | 285.6 | 285.6 |
| $t_{batch}$ (h) | - | 6.0 | 6.0 | 5.999 | 5.951 |
| Reactor conversion | - | 0.803 | 0.803 | 0.803 | 0.803 |
| Mean size (μm) | - | 25.26 | 25.001 | 25.002 | 25.12 |
| Production (kg/h) | - | 0.66956 | 0.66997 | 0.66994 | 0.66993 |
| Solve time (s) | - | 1109.2 | 1406.3 | 1274.25 | 862.3 |
| PharmaPy Callbacks | - | 667 | 785 | 741 | 400 |

Figure 5:


Dynamic concentration profile of paracetamol (left) and number-based crystal size distribution -CSD- (right) of the optimal solution using the DB-Pyomo method. The CSD for the right plot was sampled every 15 min from the onset of crystallization (480.7 s).

As noted previously, the intensive algebraic modeling required for the SO method is prohibitive both in engineer's time and required expertise. It should also be noted that the penalty weights used in the DF-NM method required more tuning than in the previous case study, whereas the DF-MADS, DB-MATLAB, and DB-Pyomo methods required no additional user modeling time. Therefore, even with a more complex example of a pharmaceutical process to produce solid-form paracetamol as an API, there is benefit in using the DB-Pyomo method with respect to both user modeling burden and computational cost.

There is also the need to compare these methods on the basis of an important non-quantitative feature: ease of user implementation. For the DB-Pyomo method, as it is specifically built around PharmaPy, it is the easiest to implement when considering designing a pharmaceutical manufacturing process in PharmaPy. The second easiest method would be DF-MADS, where most features are automated, thus requiring the least amount of alteration beyond the PharmaPy model definition. Next would be DB-MATLAB, where so long as a license is available, creating a function to solve the design problem and setting up callbacks to PharmaPy are straightforward steps. The least desirable of the callback methods is the DF-NM approach. Here, the determination of effective penalty function weights is nontrivial and may in many cases be subjective in terms of the levels of constraint violation that are accepted for the different constraints/heuristics. Finally, the most challenging to implement is the SO approach. For a process modeler, the body of knowledge required to derive numerically robust discretized algebraic modeling constraints and formulate the simultaneous optimization represents significant barriers and indeed comprises somewhat of an art. From an implementation time perspective, for most cases, creating a reliable and consistent PharmaPy simulation model and using it directly in a suitable optimization framework, will take less time than implementing a robust fully discretized, simultaneous algebraic equation-oriented model and formulating it as a simultaneous optimization problem of the same pharmaceutical manufacturing system. This is seen directly in this work: while case study 1 could be modeled with moderate effort for use within the SO methodology, case study 2 is prohibitively difficult to implement using the SO methodology while retaining the same modeling accuracy that the black box methods provide.

4. Conclusions

The problem of performing process design and optimization in the pharmaceutical industry is an important and challenging one from a user standpoint. The choice of the preferred method that balances user expertise required, numerical accuracy, and computational performance is critical to successfully design optimal pharmaceutical manufacturing processes.

Here, we have presented a new framework that utilizes recent developments in Pyomo through the PyNumero extension to allow callbacks to input-output models while employing the powerful nonlinear solver, IPOPT, through the cyipopt interface. As shown in the case studies, the proposed framework is a computationally competitive alternative to other solution frameworks. Specifically, when compared with using fmincon with callbacks to PharmaPy and with two derivative-free optimization approaches, the easy-to-access Nelder-Mead solver in SciPy and the more sophisticated NOMAD solver, the solution time and the required number of callbacks to PharmaPy were significantly lower with the new framework in both case studies. The discretized simultaneous equation-oriented optimization approach presented the fastest solution time for case study 1 due to the absence of callbacks. However, when the flowsheet becomes more complex and more unit operations are added, as with case study 2, creating robust discretized models in the simultaneous optimization framework becomes increasingly difficult.

In terms of user implementation and the ability of users with modest modeling experience to design optimal pharmaceutical processes, this new framework poses a low-expertise option while still maintaining the advantage of derivative-based optimization. In terms of usability, only the NOMAD software presented a similar level of automation, as using fmincon and SciPy required careful parameter tuning for finite differences and augmenting the objective function with penalties, respectively. The simultaneous approach, while very powerful, presents unique formulation challenges that are only overcome by user expertise. Also, while this work was motivated by the need to solve process optimization problems relevant to pharmaceutical manufacturing, the findings are relevant to any process optimization application in which the model is dynamic and nonlinear and in which there are a significant number of process and quality inequality constraints that must be satisfied.

In terms of robustness of the method, reduction of noise in finite difference techniques is a step that could vastly improve the application of this framework. Numerical integration with small deviations in inputs results in distinct error propagation, more apparent in models that have stiff dynamics. In the case where data-driven modeling is employed, noise is reduced when high amounts of data are available but given the point-calculation nature of a finite difference derivative value, reducing noise with a limited number of simulations is vital. The accuracy and consistency of derivative values on small domains determines whether the method will converge using a solver such as IPOPT. Thus, more creativity or complexity in the finite difference calculation could lead to even quicker convergence for the derivative-based simulation-optimization frameworks.

Acknowledgments

This project was supported by the United States Food and Drug Administration through grant U01FD006738. The views expressed by the authors do not necessarily reflect official policies of the Department of Health and Human Services; nor does any mention of trade names, commercial practices, or organization imply endorsement by the United States Government.

6. References

  • (1).Aspen Plus. Aspen Technology, Inc. USA, 2022. https://www.aspentech.com/en/products/engineering/aspen-plus (accessed 2022-02-01) [Google Scholar]
  • (2).gPROMS. Process Systems Enterprise, 2022. www.psenterprise.com/products/gproms (accessed 2020-03-01)
  • (3).Casas-Orozco D; Laky D; Wang V; Abdi M; Feng X; Wood E; Laird C; Reklaitis GV; Nagy ZK PharmaPy: An object-oriented tool for the development of hybrid pharmaceutical flowsheets. Comput. Chem. Eng, 2021, 153 (10). 10.1016/j.compchemeng.2021.107408 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (4).Bock HG; Plitt KJ A multiple shooting algorithm for direct solution of optimal control problems. In: IFAC 9th Triennial World Congress, Budapest, Hungary, 1984; pp 1603–1608. [Google Scholar]
  • (5).Mesbah A; Huesman AEM; Kramer HJM; Nagy ZK; Van den Hof PMJ Real-Time Control of a Semi-Industrial Fed-Batch Evaporative Crystallizer Using Different Direct Optimization Strategies. AIChE J., 2011, 57 (6), 1557–1569. DOI: 10.1002/aic.12366 [DOI] [Google Scholar]
  • (6).Rogers A; Ierapetritou M Challenges and opportunities in modeling pharmaceutical manufacturing processes. Comput. Chem. Eng, 2015, 81 (10), 32–39. 10.1016/j.compchemeng.2015.03.018 [DOI] [Google Scholar]
  • (7).Boukouvala F; Vasilios N; Ramachandran R; Muzzio FJ; Ierapetritou MG An integrated approach for dynamic flowsheet modeling and sensitivity analysis of a continuous tablet manufacturing process. Comput. Chem. Eng, 2012, 42 (7), 30–47. doi: 10.1016/j.compchemeng.2012.02.015 [DOI] [Google Scholar]
  • (8).Ochoa MP; Deshpande A; García Muñoz S; Stamatis S; Grossmann IE Flexibility analysis for design space definition. In: Proceedings of the International Conference on Foundations of Computer-Aided Process Design, Copper Mountain, Colorado, USA, July 14-18, 2019; García Muñoz S; Laird C; Realff M, Eds.; 47, pp. 323–328. 10.1016/B978-0-12-818597-1.50051-5 [DOI] [Google Scholar]
  • (9).Bano G; Facco P; Ierapetritou M; Bezzo F; Barolo M Design space maintenance by online adaptation in pharmaceutical manufacturing. Comput. Chem. Eng, 2019, 127 (8), 254–271. 10.1016/j.compchemeng.2019.05.019 [DOI] [Google Scholar]
  • (10).Laky D; Xu S; Rodriguez JS; Vaidyaraman S; García Muñoz S; Laird C An optimization-based framework to define the probabilistic design space of pharmaceutical processes with model uncertainty. Processes, 2019, 7. doi: 10.3390/pr7020096 [DOI] [Google Scholar]
  • (11).Wächter A; Biegler LT On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Math. Program, 2006, 106 (5), 25–57. DOI: 10.1007/s10107-004-0559-y [DOI] [Google Scholar]
  • (12).Tawarmalani M, Sahinidis NV, 2002. Convexification and Global Optimization in Continuous and Mixed-Integer Nonlinear Programming. Springer US, Boston, MA. 10.1007/978-1-4757-3532-1 [DOI] [Google Scholar]
  • (13).Byrd RH; Nocedal J; Waltz RA KNITRO: An integrated package for nonlinear optimization. In: Large-Scale Nonlinear Optimization, Springer, 2006; pp 35–59. [Google Scholar]
  • (14).Belotti P; Lee J; Liberti L; Margot F; Wächter A Branching and bounds tightening techniques for non-convex MINLP. Optim. Methods Softw, 2009, 24 (4–5), 597–634. DOI: 10.1080/10556780903087124 [DOI] [Google Scholar]
  • (15).Sano S; Kadowaki T; Tsuda K; Kimura S Application of Bayesian optimization for pharmaceutical product development. J. Pharm. Innov, 2020, 15 (9), 333–343. 10.1007/s12247-019-09382-8 [DOI] [Google Scholar]
  • (16).Wang Z; Ierapetritou M Constrainted optimization of black-box stochastic systems using a novel feasibility enhanced kriging-based method. Comput. Chem. Eng, 2018, 118 (10), 210–223. 10.1016/j.compchemeng.2018.07.016 [DOI] [Google Scholar]
  • (17).Destro F; Hur I; Wang V; Abdi M; Feng X; Wood E; Coleman S; Firth P; Barton A; Barolo M; Nagy ZK Mathematical modeling and digital design of an intensified filtration-washing-drying unit for pharmaceutical continuous manufacturing. Chem. Eng. Sci, 2021, 244 (11). 10.1016/j.ces.2021.116803 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (18).Wang Z; Escotet-Espinoza S; Singh R; Ierapetritou M Surrogate-based optimization for pharmaceutical manufacturing processes. In: Proceedings of the 27th European Symposium on Computer Aided Process Engineering – ESCAPE 27, Barcelona, Spain, October 1-5, 2017; Espuña A; Graells S; Puigjaner L, Eds.; 40, 2797–2802. 10.1016/B978-0-444-63965-3.50468-2 [DOI] [Google Scholar]
  • (19).McBride K; Sundmacher K Overview of surrogate modeling in chemical process engineering, Chem. Ing. Tech 2019, 91 (3), 228–239. DOI: 10.1002/cite.201800091 [DOI] [Google Scholar]
  • (20).Alarie S; Audet C; Gheribi AE; Kokkolaras M; Le Digabel S Two decades of blackbox optimization applications. EURO J. Comput. Optim, 2021, 9:100011. 10.1016/j.ejco.2021.100011 [DOI] [Google Scholar]
  • (21).MATLAB R2021b; The Mathworks Inc., USA, 2021. https://www.mathworks.com/products/matlab.html (accessed 2021–08-20) [Google Scholar]
  • (22).Navarro-Amorós MA; Riuz-Femenia R; Caballero JA Integration of modular process simulators under the generalized disjunctive programming framework for the structural flowsheet optimization. Comput. Chem. Eng, 2014, 67 (8), 13–25. 10.1016/j.compchemeng.2014.03.014 [DOI] [Google Scholar]
  • (23).Türkay M; Grossmann IE Logic-based MINLP algorithms for the optimal synthesis of process networks. Comput. Chem. Eng, 1996, 20 (8), 959–978.
  • (24).García N; Fernández-Torres MJ; Caballero JA Simultaneous environmental and economic process synthesis of isobutane alkylation. J. Clean. Prod, 2014, 81 (10), 270–280. 10.1016/j.jclepro.2014.06.016
  • (25).Corbetta M; Grossmann IE; Manenti F Process simulator-based optimization of biorefinery downstream processes under the generalized disjunctive programming framework. Comput. Chem. Eng, 2016, 88 (5), 73–85. 10.1016/j.compchemeng.2016.02.009
  • (26).Bynum ML; Hackebeil GA; Hart WE; Laird CD; Nicholson BL; Siirola JD; Watson J-P; Woodruff DL Pyomo – Optimization Modeling in Python. 3rd ed.; Springer, 2021.
  • (27).Rodriguez JS; Parker R; Laird CD; Nicholson B; Siirola JD; Bynum M Scalable Parallel Nonlinear Optimization with PyNumero and Parapint. In: Optimization Online, 2021. http://www.optimization-online.org/DB_HTML/2021/09/8596.html (accessed 2021-11-20)
  • (28).Moore J. cyipopt: Cython Interface for the Interior Point Optimizer IPOPT, 2017. https://github.com/mechmotum/cyipopt (accessed 2021-05-01)
  • (29).LeVeque RJ Finite volume methods for hyperbolic problems. Cambridge University Press, New York, 2002.
  • (30).Nicholson B; Siirola JD; Watson JP; Zavala VM; Biegler LT pyomo.dae: a modeling and automatic discretization framework for optimization with differential and algebraic equations. Math. Program. Comput, 2018, 10 (2), 187–223. 10.1007/s12532-017-0127-0
  • (31).Biegler LT Simultaneous Methods for Dynamic Optimization. In Nonlinear Programming: Concepts Algorithms and Applications to Chemical Processes, MOS-SIAM Series on Optimization, 2010; pp 287–323.
  • (32).Virtanen P. et al. SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nature Methods, 2020, 17 (3), 261–272. 10.1038/s41592-019-0686-2
  • (33).Nelder JA; Mead R A simplex method for function minimization. Comput. J, 1965, 7, 308–313.
  • (34).Audet C; Le Digabel S; Montplaisir VR; Tribes C NOMAD version 4: Nonlinear optimization with the MADS algorithm. 2021. https://arxiv.org/abs/2104.11627 (accessed 2021-12-15)
  • (35).Nocedal J; Wright SJ Numerical Optimization, 2nd ed.; Springer, 2006.
  • (36).Rutherford Appleton Laboratory. HSL: A collection of Fortran codes for large-scale scientific computing, 2022, Oxfordshire, UK. http://www.hsl.rl.ac.uk/ (accessed 2021-03-15)
  • (37).Laky DJ; Casas-Orozco D; Destro F; Barolo M; Reklaitis GV; Nagy ZK Integrated synthesis, crystallization, filtration, and drying of active pharmaceutical ingredients: a model-based digital design framework for process optimization and control. In: Fytopoulos A; Ramachandran R; Pardalos PM (eds) Optimization of Pharmaceutical Processes. Springer Optimization and Its Applications, vol 189. Springer, Cham. 2022. 10.1007/978-3-030-90924-6_10
  • (38).Nagy ZK; Fujiwara M; Woo XY; Braatz RD Determination of the kinetic parameters for the crystallization of paracetamol from water using metastable zone width experiments. Industrial and Engineering Chemistry Research, 2008, 47 (4), 1245–1252. 10.1021/ie060637c
  • (39).Destro F; Hur I; Wang V; Abdi M; Feng X; Wood E; Coleman S; Firth P; Barton A; Barolo M; Nagy ZK Mathematical modeling and digital design of an intensified filtration-washing-drying unit for pharmaceutical continuous manufacturing. Chemical Engineering Science, 2021, 244. 10.1016/j.ces.2021.116803
