2. Background Theory

This section aims to provide the user with a basic review of the physics, discretization, and optimization techniques used to solve the frequency domain quasi-static electromagnetics problem. It is assumed that the user has some background in these areas. For further reading see [Nab91].

Important

The theory provided on this page works for the following right-handed coordinate systems:

X = Easting, Y = Northing, Z = Up (standard Cartesian)
X = Northing, Y = Easting, Z = Down (standard magnetotelluric which this code uses!)

2.1. Fundamental Physics

Maxwell’s equations provide the starting point for understanding how electromagnetic fields can be used to image the subsurface structure of the Earth. In the frequency domain, Maxwell’s equations are:

(2.1)\[\begin{split}\begin{align} \nabla \times &\mathbf{E} - i\omega\mu \mathbf{H} = 0 \\ \nabla \times &\mathbf{H} - \sigma \mathbf{E} = \mathbf{s} \end{align}\end{split}\]

where \(\mathbf{E}\) and \(\mathbf{H}\) are the electric and magnetic fields, \(\mathbf{s}\) is some external source and the time dependence \(e^{-i\omega t}\) is suppressed. Symbols \(\mu\), \(\sigma\) and \(\omega\) are the magnetic permeability, conductivity, and angular frequency, respectively. This formulation assumes a quasi-static regime so that the system can be viewed as a diffusion equation (Weaver, 1994; Ward and Hohmann, 1988 in [Nab91]). Even so, some difficulties arise when solving the system:

the curl operator has a non-trivial null space making the resulting linear system highly ill-conditioned

the conductivity \(\sigma\) varies over several orders of magnitude

2.2. Natural Sources: MT and ZTEM

The sources in the magnetotelluric (MT) and Z-axis tipper electromagnetic (ZTEM) methods are modeled as plane waves originating from natural phenomena. These waves can be of very low frequency (< 1 Hz) and very high energy, making it possible to image very deep targets. Because the sources are external, the source term is zero inside the domain of interest, and the fields imposed on the boundaries become very important. For natural source electromagnetic (NSEM) problems, we solve the following system:

(2.2)\[\begin{split}\begin{align} \nabla \times &\mathbf{E} - i\omega\mu \mathbf{H} = 0 \\ \nabla \times &\mathbf{H} - \sigma \mathbf{E} = 0 \\ &\mathbf{E} \big |_{\partial \Omega} = \mathbf{E_0} \end{align}\end{split}\]

where \(\mathbf{E_0}\) is the electric field solution on the boundary \(\partial \Omega\).

Consider the case where the Earth is a uniform half-space with a plane surface. If the source field is assumed to be homogeneous, infinite in dimension, and is located at infinity, then the plane waves impinging on the Earth’s surface travel in the z-direction.

For plane waves polarized such that their electric fields lie along the x direction, the electric field is defined by the following Helmholtz equation:

(2.3)\[\frac{\partial^2 E_x}{\partial z^2} = k^2 E_x \;\;\; \textrm{s.t.} \;\;\; k^2 = -i\mu\omega\sigma\]

and the relationship between \(E_x\) and \(H_y\) is given by:

(2.4)\[i \omega \mu H_y = \frac{\partial E_x}{\partial z}\]

The solution to (2.3) takes the form:

(2.5)\[E_x = Q e^{-kz}\]

where \(Q\) is some constant. Taking the ratio of the electric and magnetic fields measured at the surface gives:

(2.6)\[Z_{xy} = \frac{E_x}{H_y} = \frac{-i\omega \mu}{k} = \sqrt{\dfrac{-i\omega\mu}{\sigma}}\]

This implies that conductivity \(\sigma\) of the Earth can be determined by taking measurements of the field components, and therefore the impedance constitutes the basic MT response function, or data. A 1D layered Earth model can be used to compute the source wave components by iteratively propagating a plane wave from the surface to depth.
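As a minimal sketch of how Eq. (2.6) links the impedance to conductivity, the snippet below evaluates \(Z_{xy}\) for a uniform half-space and then inverts \(|Z_{xy}|^2 = \omega\mu/\sigma\) to recover \(\sigma\). The function names and the use of the free-space permeability are assumptions made for illustration; they are not part of the code described on this page.

```python
import numpy as np

MU_0 = 4e-7 * np.pi  # permeability of free space (H/m), assumed for the half-space


def halfspace_impedance(sigma, freq, mu=MU_0):
    """Z_xy = sqrt(-i*omega*mu/sigma) for a uniform half-space, Eq. (2.6)."""
    omega = 2.0 * np.pi * freq
    return np.sqrt(-1j * omega * mu / sigma)


def conductivity_from_impedance(z_xy, freq, mu=MU_0):
    """Invert |Z_xy|^2 = omega*mu/sigma for the half-space conductivity."""
    omega = 2.0 * np.pi * freq
    return omega * mu / np.abs(z_xy) ** 2


# Hypothetical example: a 0.01 S/m half-space sounded at 1 Hz
z = halfspace_impedance(sigma=0.01, freq=1.0)
sigma_rec = conductivity_from_impedance(z, freq=1.0)
```

With the \(e^{-i\omega t}\) convention and Z = Down, the phase of \(Z_{xy}\) for a half-space is -45 degrees at every frequency; only the amplitude carries the conductivity.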

2.2.1. Magnetotelluric (MT) Data

For a 3-dimensional Earth, the magnetotelluric data are defined by the impedance tensor. The impedance tensor can be defined using the ratios of electric and magnetic field components in both the x and y directions for 2 orthogonal plane wave polarizations: one polarization with the electric field along the x axis and one polarization with the electric field along the y axis. The impedance tensor \(\mathbf{Z}\) is a 2 by 2 matrix:

(2.7)\[\mathbf{Z} = \mathbf{E H}^{-1}\]

such that:

(2.8)\[\begin{split}\begin{bmatrix} Z_{xx} & Z_{xy} \\ Z_{yx} & Z_{yy} \end{bmatrix} = \begin{bmatrix} E_{x}^{(1)} & E_{x}^{(2)} \\ E_{y}^{(1)} & E_{y}^{(2)} \end{bmatrix} \begin{bmatrix} H_{x}^{(1)} & H_{x}^{(2)} \\ H_{y}^{(1)} & H_{y}^{(2)} \end{bmatrix}^{-1}\end{split}\]

where 1 and 2 refer to fields associated with plane waves polarized along two perpendicular directions.
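The relation in Eq. (2.7) and Eq. (2.8) can be sketched numerically as follows. The array layout (rows are the x and y components, columns are polarizations 1 and 2) mirrors Eq. (2.8); the function name and the synthetic field values are assumptions used only to check the algebra.

```python
import numpy as np


def impedance_tensor(E, H):
    """Return Z = E H^{-1} as in Eq. (2.7).

    E, H: 2x2 complex arrays; rows are the (x, y) components and
    columns are the two source polarizations (1, 2), as in Eq. (2.8).
    """
    # Z H = E  is equivalent to  H^T Z^T = E^T, so no explicit inverse is formed
    return np.linalg.solve(H.T, E.T).T


# Synthetic check: pick an impedance and magnetic fields, then build E = Z H
Z_true = np.array([[0.1 + 0.2j, 1.0 - 1.0j],
                   [-1.0 + 1.0j, -0.1 - 0.2j]])
H = np.array([[1.0 + 0.0j, 0.2 - 0.1j],
              [0.1 + 0.3j, 1.0 + 0.0j]])
E = Z_true @ H

Z = impedance_tensor(E, H)
```

Solving the linear system instead of inverting \(\mathbf{H}\) is a numerical convenience; both give the same tensor whenever the two polarizations produce linearly independent magnetic fields.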

Important

For standard MT data, X = Northing, Y = Easting and Z = Down; which this code uses! Thus:

Superscript \(\! ^{(1)}\) refers to fields resulting from a plane wave whose electric field is polarized along the Northing direction. And superscript \(\! ^{(2)}\) refers to fields resulting from a plane wave whose electric field is polarized along the Easting direction.
\(Z_{xy}\) is essentially the ratio of the electric field along the Northing and the magnetic field along the Easting.

2.2.2. ZTEM Data

The Z-Axis Tipper Electromagnetic Technique (ZTEM) (Lo2008) records the vertical component of the magnetic field everywhere above the survey area while recording the horizontal fields at a ground base reference station. In the same manner as demonstrated for MT, transfer functions are computed which relate the vertical fields to the ground based horizontal fields. This relation is given by:

(2.9)\[H_z(r) = T_{zx}(r,r_0)H_x(r_0) + T_{zy}(r,r_0)H_y(r_0)\]

where \(r\) is the location of the vertical field and \(r_0\) is the location of the ground base station. \(T_{zx}\) and \(T_{zy}\) are the vertical field transfer functions, from z to x and z to y respectively. For a 3-dimensional Earth, the transfer function can be defined using the magnetic field components for 2 orthogonal plane wave polarizations: one polarization with the electric field along the x axis and one polarization with the electric field along the y axis. In this case,

(2.10)\[\begin{split}\begin{bmatrix} H_z^{(1)} \\ H_z^{(2)} \end{bmatrix} = \begin{bmatrix} H_x^{(1)} & H_y^{(1)} \\ H_x^{(2)} & H_y^{(2)} \end{bmatrix} \begin{bmatrix} T_{zx} \\ T_{zy} \end{bmatrix}\end{split}\]

where 1 and 2 refer to fields associated with plane waves polarized along two perpendicular directions. Thus the transfer functions are given by:

\[\begin{split}\begin{bmatrix} T_{zx} \\ T_{zy} \end{bmatrix} = \big ( H_x^{(1)} H_y^{(2)} - H_x^{(2)} H_y^{(1)} \big )^{-1} \begin{bmatrix} - H_y^{(1)} H_z^{(2)} + H_y^{(2)} H_z^{(1)} \\ H_x^{(1)} H_z^{(2)} - H_x^{(2)} H_z^{(1)} \end{bmatrix}\end{split}\]
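The closed-form solution of Eq. (2.10) can be sketched in a few lines. Here `Hx` and `Hy` stand for the horizontal fields at the base station \(r_0\) and `Hz` for the vertical field at \(r\), each stored as a length-2 array over the two polarizations; the names and test values are illustrative assumptions.

```python
import numpy as np


def tipper(Hx, Hy, Hz):
    """Solve Eq. (2.10) for (T_zx, T_zy) using the closed-form 2x2 inverse.

    Hx, Hy: horizontal magnetic fields at the base station, one entry per
            polarization (1, 2).
    Hz:     vertical magnetic field at the receiver, one entry per polarization.
    """
    det = Hx[0] * Hy[1] - Hx[1] * Hy[0]
    t_zx = (-Hy[0] * Hz[1] + Hy[1] * Hz[0]) / det
    t_zy = (Hx[0] * Hz[1] - Hx[1] * Hz[0]) / det
    return t_zx, t_zy


# Synthetic check: build Hz from known transfer functions using Eq. (2.9)
T_true = (0.1 + 0.05j, -0.2 + 0.1j)
Hx = np.array([1.0 + 0.0j, 0.0 + 0.1j])
Hy = np.array([0.05 + 0.0j, 1.0 - 0.2j])
Hz = T_true[0] * Hx + T_true[1] * Hy

T_zx, T_zy = tipper(Hx, Hy, Hz)
```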

Important

For standard natural source data, X = Northing, Y = Easting and Z = Down; which this code uses! Thus:

Superscript \(\! ^{(1)}\) refers to fields resulting from a plane wave whose electric field is polarized along the Northing direction. And superscript \(\! ^{(2)}\) refers to fields resulting from a plane wave whose electric field is polarized along the Easting direction.
\(T_{zx}\) is the transfer function related to an incident plane wave whose electric field is polarized along the Northing direction; which produces magnetic fields with components in the Easting direction.

2.3. Octree Mesh

By using an Octree discretization of the earth domain, the areas near sources and likely model location can be given a higher resolution while cells grow larger with distance. In this manner, the necessary refinement can be obtained without excessive computational expense. The figure below shows an example of an Octree mesh, with nine cells, eight of which are the base mesh minimum size.

When working with Octree meshes, the underlying mesh is defined as a regular 3D orthogonal grid where the number of cells in each dimension is \(2^{n_1} \times 2^{n_2} \times 2^{n_3}\). The cell widths for the underlying mesh are \(h_1, \; h_2, \; h_3\), respectively. This underlying mesh is the finest possible, so that larger cells have side lengths which increase by powers of 2. The idea is that if the recovered model properties change slowly over a certain volume, the cells bounded by this volume can be merged into one without losing accuracy in the modeling, and are only refined when the model begins to change rapidly.
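The power-of-2 growth of cell sizes can be illustrated with a short sketch. The notion of an integer "level" (0 for the underlying mesh, increasing as cells are merged) and the function names are assumptions introduced here for illustration.

```python
def octree_cell_widths(h, level):
    """Widths of a cell at a given level; level 0 is the underlying (finest) mesh.

    h: (h1, h2, h3) cell widths of the underlying mesh.
    """
    factor = 2 ** level
    return tuple(factor * hi for hi in h)


def base_cells_per_cell(level):
    """Number of underlying-mesh cells merged into one cell at this level."""
    return 8 ** level


# Hypothetical underlying mesh with 25 m x 25 m x 12.5 m cells
widths = octree_cell_widths((25.0, 25.0, 12.5), level=2)
merged = base_cells_per_cell(2)
```

In this example a level-2 cell measures 100 m x 100 m x 50 m and replaces 64 cells of the underlying mesh.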

2.4. Discretization of Operators

The operators div, grad, and curl are discretized using a finite volume formulation. Although div and grad do not appear in (2.2), they are required for the solution of the system. The divergence operator is discretized using the usual flux-balance approach, which by Gauss’ theorem considers the current flux through each face of a cell. The nodal gradient (which operates on a function with values on the nodes) is obtained by differencing adjacent nodes and dividing by the edge length. The curl operator is discretized similarly to the divergence operator: by Stokes’ theorem, the magnetic field components are summed around the edges of each face. Please see [HHG+12] for a detailed description of the discretization process.

2.5. Forward Problem

To solve the forward problem, we must first discretize and solve for the fields in Eq. (2.2), where \(e^{-i\omega t}\) is suppressed. Using finite volume discretization, the electric fields on cell edges (\(\mathbf{u_e}\)) are obtained by solving the following system at every frequency:

(2.11)\[\big [ \mathbf{C^T \, M_\mu \, C} + i\omega \mathbf{M_\sigma} \big ] \, \mathbf{u_e} = - i \omega \mathbf{s}\]

where \(\mathbf{C}\) is the curl operator and:

\[\begin{split}\begin{align} \mathbf{M_\mu} &= diag \big ( \mathbf{A^T_{f2c} V} \, \boldsymbol{\mu^{-1}} \big ) \\ \mathbf{M_\sigma} &= diag \big ( \mathbf{A^T_{e2c} V} \, \boldsymbol{\sigma} \big ) \\ \end{align}\end{split}\]

where \(\mathbf{V}\) is a diagonal matrix containing all cell volumes, \(\mathbf{A_{f2c}}\) averages from faces to cell centres and \(\mathbf{A_{e2c}}\) averages from edges to cell centres. The magnetic permeabilities and conductivities for each cell are contained within vectors \(\boldsymbol{\mu}\) and \(\boldsymbol{\sigma}\), respectively.

The right-hand side \(\mathbf{s}\) has values \(\mathbf{E_0}\) on the boundary and 0 at inner edges. Values for \(\mathbf{E_0}\) are obtained by solving a set of 1D problems for a given plane wave polarization; either \(\mathbf{E_0} = E_x \, \hat{x}\) or \(\mathbf{E_0} = E_y \, \hat{y}\). For an explanation of the 1D solution, see Ward and Hohmann.

Once the electric field on cell edges has been computed, we must project to the receivers. For E3DMT version 2, straight wires of finite length are used to measure the average electric field along the path of the wire, and closed wire loops are used to measure the average magnetic field perpendicular to the loop.

Electric field measurements (\(E\)) are obtained by integrating the electric field (\(\mathbf{e}\)) along the path of the wire to compute the voltage, then dividing by the length of the wire. In practice, electric field measurements can be approximated accurately by applying a linear projection matrix (\(\mathbf{P_e}\)) to the electric fields computed on cell edges:

\[E = \frac{1}{| \mathbf{r_2 - r_1 }| } \int_{\mathbf{r_1}}^{\mathbf{r_2}} \mathbf{e} \cdot d\mathbf{l} \approx \mathbf{P_e \, u_e}\]

Magnetic field measurements (\(H\)) are obtained by integrating the electric field (\(\mathbf{e}\)) around the path of a closed loop to compute the EMF. The EMF is then divided by \(i\omega \mu_0 A\), where \(A\) is the cross-sectional area of the loop, to represent the quantity in terms of the average magnetic field normal to the receiver. In practice, magnetic field measurements can be approximated accurately by applying a linear projection matrix (\(\mathbf{P_h}\)) to the electric fields computed on cell edges:

\[H = \frac{1}{i\omega \mu_0 A} \int_C \mathbf{e} \cdot d\mathbf{l} \approx \mathbf{P_h \, u_e}\]

To obtain impedance tensor (MT) or ZTEM data, we need the electric and/or magnetic fields for two orthogonal source polarizations; generally one in the x direction and one in the y direction. Let \(\mathbf{s}^{(1)}\) and \(\mathbf{s}^{(2)}\) denote the right-hand sides for source fields generated for each polarization. And let \(\mathbf{u_e}^{(1)}\) and \(\mathbf{u_e}^{(2)}\) denote the corresponding solutions for the electric fields on the edges. Then the average electric field (Ex or Ey) or average magnetic field (Hx, Hy or Hz) for some receiver is given by:

(2.12)\[\begin{split}\begin{align} E^{(j)} &= \mathbf{P_e \, u_e}^{(j)} = -i\omega \mathbf{P_e \, A}(\sigma)^{-1} \, \mathbf{s}^{(j)} \;\;\; \textrm{for} \;\;\; j=1,2 \\ H^{(j)} &= \mathbf{P_h \, u_e}^{(j)} = -i\omega \mathbf{P_h \, A}(\sigma)^{-1} \, \mathbf{s}^{(j)} \;\;\; \textrm{for} \;\;\; j=1,2 \end{align}\end{split}\]

where the matrix

(2.13)\[\mathbf{A}(\sigma) = \mathbf{C^T \, M_\mu \, C} + i\omega \mathbf{M_\sigma}\]

depends on the Earth’s conductivity. If the fields at each observation location are known, MT data can be obtained using Eq. (2.8) and ZTEM data can be obtained using Eq. (2.10). The only thing that is needed is the source term for Eq. (2.11).

2.5.1. Source Term

2.5.1.1. 1D Approach

For this approach, we solve a 1D wave equation of the following form:

(2.14)\[\mathbf{\tilde{A} \tilde{u}_e} = \mathbf{\tilde{q}}\]

where \(\mathbf{\tilde{u}_e}\) is the electric field for the 1D solution polarized along the x or y directions. \(\mathbf{\tilde{A}}\) is an operator of the form:

\[\mathbf{\tilde{A}} = \mathbf{L} + i \omega \mu_0 \tilde{\sigma}\]

such that \(\mathbf{L}\) is the Laplacian operator, \(\mu_0\) is the permeability of free-space and \(\tilde{\sigma}\) is a 1D conductivity model. The right-hand side \(\mathbf{\tilde{q}}\) is a vector of zeros except for \(\tilde{q}_1\). A Dirichlet condition is imposed by setting \(A_{11} = 1\) and \(\tilde{q}_1 = i\omega \mu_0 h^{-1}\); where \(h\) is the layer thickness. Once Eq. (2.14) is solved for a particular frequency, the solution is transferred to the edges of an OcTree mesh. If the electric field is polarized along the x direction, there are no electric fields along y or z; similarly for a solution polarized along the y direction.

Let \(\mathbf{u_s}\) and \(\sigma_s\) be the electric fields and 1D conductivity model transferred to the edges of the OcTree mesh, respectively. Then the source term in Eq. (2.11) is computed for a given frequency and polarization using:

\[\frac{1}{i\omega} \mathbf{A u_s} = \mathbf{s}\]

where \(\mathbf{A}\) is similar to expression (2.13), except the mass matrix \(\mathbf{M_\sigma}\) is formed using the transferred conductivity \(\sigma_s\).

2.5.1.2. 3D Approach

Let \(\sigma_b\) be the 3D background conductivity model, and let \(\mathbf{A}\) be an operator similar to expression (2.13), except the mass matrix \(\mathbf{M_\sigma}\) is formed using the background conductivity. If \(j=1,...,J\) denotes the indices for all internal edges and \(k=1,...,K\) denotes the indices for all top edges, then for each polarization we solve a smaller system:

\[\mathbf{A_{j,j} u_j} = - \mathbf{A_{j,k} b}\]

where \(\mathbf{b}\) is a vector of ones with length \(K\) and \(\mathbf{u_j}\) is the background electric field on internal edges. From this we form a vector \(\mathbf{u_b}\) where:

\(\mathbf{u_b} = 1\) on the top edges

\(\mathbf{u_b} = \mathbf{u_j}\) on internal edges

\(\mathbf{u_b} = 0\) otherwise

Once this is done, the source term in Eq. (2.11) is computed for a given frequency and polarization using:

\[\frac{1}{i\omega} \mathbf{A u_b} = \mathbf{s}\]

2.6. Sensitivity

2.6.1. MT Data

Impedance tensor data are split into their real and imaginary components. Thus the data at a particular frequency for a particular reading is organized in a vector of the form:

(2.15)\[\mathbf{Z} = [Z^\prime_{xx}, Z^{\prime \prime}_{xx}, Z^\prime_{xy}, Z^{\prime \prime}_{xy}, Z^\prime_{yx}, Z^{\prime \prime}_{yx}, Z^\prime_{yy}, Z^{\prime \prime}_{yy}]^T\]

where \(\prime\) denotes real components and \(\prime\prime\) denotes imaginary components. To determine the sensitivity of the data (i.e. (2.15)) with respect to the model (\(\boldsymbol{\sigma}\)), we must compute:

\[\frac{\partial \mathbf{Z}}{\partial \boldsymbol{\sigma}} = \Bigg [ \dfrac{\partial Z_{xx}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial Z_{xx}^{\prime\prime}}{\partial \boldsymbol{\sigma}} , \dfrac{\partial Z_{xy}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial Z_{xy}^{\prime\prime}}{\partial \boldsymbol{\sigma}} , \dfrac{\partial Z_{yx}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial Z_{yx}^{\prime\prime}}{\partial \boldsymbol{\sigma}} , \dfrac{\partial Z_{yy}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial Z_{yy}^{\prime\prime}}{\partial \boldsymbol{\sigma}} \Bigg ]^T\]

where the conductivity model \(\boldsymbol{\sigma}\) is real-valued and

(2.16)\[Z_{xx}^\prime = \textrm{Re} \Bigg [\frac{E_{xx} H_{yy} - E_{xy} H_{yx}}{H_{xx}H_{yy} - H_{xy}H_{yx}} \Bigg ]\]

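Here \(E_{ij}\) and \(H_{ij}\) denote component \(i\) of the field produced by the plane wave whose electric field is polarized along \(j\); for example, \(E_{xy} = E_x^{(2)}\) and \(H_{yx} = H_y^{(1)}\) in the notation of Eq. (2.8). Expression (2.16) can be expanded and expressed explicitly in terms of the real and imaginary components of \(E_{ij}\) and \(H_{ij}\). Similar expressions result for the other elements of (2.15).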

To differentiate (2.16) (or any other element and component of the impedance tensor) with respect to the model, we replace \(E_{ij}\) and \(H_{ij}\) according to Eq. (2.12) and use the chain rule. The final expression contains the derivative of the electric fields on the edges (\(\mathbf{u_e}\)) with respect to the model. This is given by:

(2.17)\[\frac{\partial \mathbf{u_e}}{\partial \boldsymbol{\sigma}} = - i\omega \mathbf{A}^{-1} diag(\mathbf{u_e}) \, \mathbf{A_{e2c}^T V }\]
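Expression (2.17) follows from differentiating \(\mathbf{A}(\sigma)\,\mathbf{u_e} = -i\omega \mathbf{s}\) with respect to \(\boldsymbol{\sigma}\). The sketch below checks it against central finite differences on a small dense toy system. The matrices are random stand-ins (assumptions, not operators from a real OcTree mesh): `K` plays the role of \(\mathbf{C^T M_\mu C}\) and `Ae2c` the role of \(\mathbf{A_{e2c}}\).

```python
import numpy as np

rng = np.random.default_rng(0)
n_edges, n_cells = 6, 3
omega = 2.0 * np.pi * 10.0

# Toy stand-ins for the discrete operators (assumed, for illustration only)
B = rng.standard_normal((n_edges, n_edges))
K = B.T @ B                                             # role of C^T M_mu C
Ae2c = np.abs(rng.standard_normal((n_cells, n_edges)))  # role of A_e2c
V = np.diag(rng.uniform(1.0, 2.0, n_cells))             # cell volumes
s = rng.standard_normal(n_edges) + 0j                   # source vector


def system_matrix(sigma):
    """A(sigma) = K + i*omega*M_sigma with M_sigma = diag(Ae2c^T V sigma), Eq. (2.13)."""
    return K + 1j * omega * np.diag(Ae2c.T @ V @ sigma)


def fields(sigma):
    """Solve A(sigma) u_e = -i*omega*s, Eq. (2.11)."""
    return np.linalg.solve(system_matrix(sigma), -1j * omega * s)


sigma = rng.uniform(0.1, 1.0, n_cells)
u = fields(sigma)

# Analytic derivative, Eq. (2.17): -i*omega * A^{-1} diag(u_e) Ae2c^T V
dudsigma = -1j * omega * np.linalg.solve(system_matrix(sigma), np.diag(u) @ Ae2c.T @ V)

# Central finite differences, one conductivity cell at a time
eps = 1e-6
fd = np.empty_like(dudsigma)
for c in range(n_cells):
    step = np.zeros(n_cells)
    step[c] = eps
    fd[:, c] = (fields(sigma + step) - fields(sigma - step)) / (2.0 * eps)
```

In practice the dense matrix \(\partial \mathbf{u_e} / \partial \boldsymbol{\sigma}\) would not be formed explicitly for a large mesh; the toy system only makes the algebra easy to verify.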

2.6.2. ZTEM Data

ZTEM data are also split into their real and imaginary components. Thus the data at a particular frequency for a particular reading is organized in a vector of the form:

(2.18)\[\mathbf{T} = [T^\prime_{zx}, T^{\prime \prime}_{zx}, T^\prime_{zy}, T^{\prime \prime}_{zy}]^T\]

where \(\prime\) denotes real components and \(\prime\prime\) denotes imaginary components. To determine the sensitivity of the data (i.e. (2.18)) with respect to the model (\(\boldsymbol{\sigma}\)), we must compute:

\[\frac{\partial \mathbf{T}}{\partial \boldsymbol{\sigma}} = \Bigg [ \dfrac{\partial T_{zx}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial T_{zx}^{\prime\prime}}{\partial \boldsymbol{\sigma}} , \dfrac{\partial T_{zy}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial T_{zy}^{\prime\prime}}{\partial \boldsymbol{\sigma}} \Bigg ]^T\]

where the conductivity model \(\boldsymbol{\sigma}\) is real-valued and

(2.19)\[T_{zx}^\prime = \textrm{Re} \Bigg [ \frac{-H_y^{(1)} H_z^{(2)} + H_y^{(2)} H_z^{(1)}}{ H_x^{(1)} H_y^{(2)} - H_x^{(2)} H_y^{(1)}} \Bigg ]\]

which can be expanded and expressed explicitly in terms of the real and imaginary components of \(H_j^{(i)}\). Similar expressions result for the other elements of (2.18).

To differentiate (2.19) (or any other element and component) with respect to the model, we replace \(H_j^{(i)}\) according to Eq. (2.12) and use the chain rule. The final expression contains the derivative of the electric fields on the edges (\(\mathbf{u_e}\)) with respect to the model, which is given by Eq. (2.17).

2.7. Inverse Problem

We are interested in recovering the conductivity distribution for the Earth. However, the numerical stability of the inverse problem is made more challenging by the fact that rock conductivities can span many orders of magnitude. To deal with this, we define the model as the log-conductivity for each cell, i.e.:

\[\mathbf{m} = log (\boldsymbol{\sigma})\]
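Because the sensitivities in the previous section are written with respect to \(\boldsymbol{\sigma}\) while the model is the log-conductivity, one chain-rule step connects the two. Assuming the natural logarithm (the base is not stated here), \(\boldsymbol{\sigma} = e^{\mathbf{m}}\) element-wise, so for any data vector \(\mathbf{d}\):

\[\frac{\partial \mathbf{d}}{\partial \mathbf{m}} = \frac{\partial \mathbf{d}}{\partial \boldsymbol{\sigma}} \, \frac{\partial \boldsymbol{\sigma}}{\partial \mathbf{m}} = \frac{\partial \mathbf{d}}{\partial \boldsymbol{\sigma}} \, \textrm{diag} (\boldsymbol{\sigma})\]

Each column of the conductivity sensitivity is therefore scaled by the conductivity of the corresponding cell.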

The inverse problem is solved by minimizing the following global objective function with respect to the model:

(2.20)\[\phi (\mathbf{m}) = \phi_d (\mathbf{m}) + \beta \phi_m (\mathbf{m})\]

where \(\phi_d\) is the data misfit, \(\phi_m\) is the model objective function and \(\beta\) is the trade-off parameter. The data misfit ensures the recovered model adequately explains the set of field observations. The model objective function adds geological constraints to the recovered model. The trade-off parameter weights the relative emphasis between fitting the data and imposing geological structures.

2.7.1. Data Misfit

Here, the data misfit is represented as the L2-norm of a weighted residual between the observed data (\(d_{obs}\)) and the predicted data for a given conductivity model \(\boldsymbol{\sigma}\), i.e.:

(2.21)\[\phi_d = \frac{1}{2} \big \| \mathbf{W_d} \big ( \mathbf{d_{obs}} - \mathbb{F}[\boldsymbol{\sigma}] \big ) \big \|^2\]

where \(\mathbf{W_d}\) is a diagonal matrix containing the reciprocals of the uncertainties \(\boldsymbol{\varepsilon}\) for each measured data point, i.e.:

\[\mathbf{W_d} = \textrm{diag} \big [ \boldsymbol{\varepsilon}^{-1} \big ]\]
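As a small illustration of Eq. (2.21), the sketch below evaluates the misfit for made-up data. Because \(\mathbf{W_d}\) is diagonal, applying it reduces to an element-wise division by the uncertainties. The function name and the numbers are assumptions chosen so that the arithmetic is easy to follow.

```python
import numpy as np


def data_misfit(d_obs, d_pred, uncertainties):
    """phi_d = 0.5 * || W_d (d_obs - d_pred) ||^2 with W_d = diag(1/uncertainties)."""
    r = (d_obs - d_pred) / uncertainties  # weighted residual
    return 0.5 * float(r @ r)


# Hypothetical data: residuals of -0.1, 0.2 and 0.0 with uncertainties 0.1, 0.2 and 0.5
d_obs = np.array([1.0, 2.0, 3.0])
d_pred = np.array([1.1, 1.8, 3.0])
eps = np.array([0.1, 0.2, 0.5])

phi_d = data_misfit(d_obs, d_pred, eps)
```

The weighted residuals are -1, 1 and 0, so the misfit is 0.5 * (1 + 1 + 0) = 1.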

Important

For a better understanding of the data misfit, see the GIFtools cookbook.

2.7.2. Model Objective Function

Due to the ill-posedness of the problem, no stable solution can be obtained by freely minimizing the data misfit, and thus regularization is needed. The regularization penalizes both roughness and departure from a reference model \(m_{ref}\) supplied by the user. The model objective function is given by:

(2.22)\[\begin{split}\begin{align} \phi_m = \frac{\alpha_s}{2} \!\int_\Omega w_s | m - & m_{ref} |^2 dV + \frac{\alpha_x}{2} \!\int_\Omega w_x \Bigg | \frac{\partial}{\partial x} \big (m - m_{ref} \big ) \Bigg |^2 dV \\ &+ \frac{\alpha_y}{2} \!\int_\Omega w_y \Bigg | \frac{\partial}{\partial y} \big (m - m_{ref} \big ) \Bigg |^2 dV + \frac{\alpha_z}{2} \!\int_\Omega w_z \Bigg | \frac{\partial}{\partial z} \big (m - m_{ref} \big ) \Bigg |^2 dV \end{align}\end{split}\]

where \(\alpha_s, \alpha_x, \alpha_y\) and \(\alpha_z\) weight the relative emphasis on minimizing differences from the reference model and the smoothness along each gradient direction. And \(w_s, w_x, w_y\) and \(w_z\) are additional user defined weighting functions.

An important consideration comes when discretizing the regularization onto the mesh. The gradient operates on cell centered variables in this instance. Applying a short distance approximation is second order accurate on a domain with uniform cells, but only \(\mathcal{O}(1)\) on areas where cells are non-uniform. To rectify this a higher order approximation is used ([HHG+12]). The second order approximation of the model objective function can be expressed as:

\[\phi_m (\mathbf{m}) = \frac{1}{2} \mathbf{\big (m-m_{ref} \big )^T W^T W \big (m-m_{ref} \big )}\]

where the regularizer is given by:

(2.23)\[\begin{split}\begin{align} \mathbf{W^T W} =& \;\;\;\;\alpha_s \textrm{diag} (\mathbf{w_s \odot v}) \\ & + \alpha_x \mathbf{G_x^T} \textrm{diag} (\mathbf{w_x \odot v_x}) \mathbf{G_x} \\ & + \alpha_y \mathbf{G_y^T} \textrm{diag} (\mathbf{w_y \odot v_y}) \mathbf{G_y} \\ & + \alpha_z \mathbf{G_z^T} \textrm{diag} (\mathbf{w_z \odot v_z}) \mathbf{G_z} \end{align}\end{split}\]

The Hadamard product is given by \(\odot\), \(\mathbf{v_x}\) is the volume of each cell averaged to x-faces, \(\mathbf{w_x}\) is the weighting function \(w_x\) evaluated on x-faces and \(\mathbf{G_x}\) computes the x-component of the gradient from cell centers to cell faces. Similarly for y and z.
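To make the structure of Eq. (2.23) concrete, here is a minimal 1D analogue with only the smallness and x-smoothness terms. It assumes uniform cells, so the simple two-point gradient is adequate and the non-uniform-cell issue mentioned above does not arise. All names are hypothetical and the construction is a sketch, not the operator used by the code.

```python
import numpy as np


def regularizer_1d(h, alpha_s, alpha_x, w_s=None, w_x=None):
    """1D analogue of W^T W in Eq. (2.23): smallness plus x-smoothness.

    h: cell widths (used as the 1D 'volumes'); assumed uniform in this sketch.
    """
    n = len(h)
    w_s = np.ones(n) if w_s is None else w_s
    w_x = np.ones(n - 1) if w_x is None else w_x

    # Two-point gradient from cell centres to the n-1 interior faces
    dx = 0.5 * (h[:-1] + h[1:])  # centre-to-centre distances
    G = np.zeros((n - 1, n))
    for i in range(n - 1):
        G[i, i] = -1.0 / dx[i]
        G[i, i + 1] = 1.0 / dx[i]

    v = h                         # cell 'volumes'
    v_x = 0.5 * (h[:-1] + h[1:])  # volumes averaged to interior faces
    return alpha_s * np.diag(w_s * v) + alpha_x * G.T @ np.diag(w_x * v_x) @ G


WtW = regularizer_1d(np.ones(4), alpha_s=1e-2, alpha_x=1.0)

# A constant departure from the reference model has zero gradient,
# so only the smallness term contributes: 1e-2 * 4 cells * 2^2 = 0.16
dm = 2.0 * np.ones(4)
quad = float(dm @ WtW @ dm)
```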

If we require that the recovered model values lie between \(\mathbf{m_L \preceq m \preceq m_H}\) , the resulting bounded optimization problem we must solve is:

(2.24)\[\begin{split}\begin{align} &\min_m \;\; \phi_d (\mathbf{m}) + \beta \phi_m(\mathbf{m}) \\ &\; \textrm{s.t.} \;\; \mathbf{m_L \preceq m \preceq m_H} \end{align}\end{split}\]

A simple Gauss-Newton optimization method is used, where each Gauss-Newton step is computed by solving the resulting system of equations with ipcg (incomplete preconditioned conjugate gradients). For more information refer again to [HHG+12] and references therein.

2.7.3. Inversion Parameters and Tolerances

2.7.3.1. Cooling Schedule

Our goal is to solve Eq. (2.24), i.e.:

\[\begin{split}\begin{align} &\min_m \;\; \phi_d (\mathbf{m}) + \beta \phi_m(\mathbf{m}) \\ &\; \textrm{s.t.} \;\; \mathbf{m_L \preceq m \preceq m_H} \end{align}\end{split}\]

but how do we choose an acceptable trade-off parameter \(\beta\)? For this, we use a cooling schedule. This is described in the GIFtools cookbook. The cooling schedule can be defined using the following parameters:

beta_max: The initial value for \(\beta\)

beta_factor: The factor by which \(\beta\) is decreased for each subsequent solution of Eq. (2.24)

nBetas: The number of times the inversion code will decrease \(\beta\) and solve Eq. (2.24) before it quits

Chi Factor: The inversion program stops when the data misfit \(\phi_d \leq N \times Chi \; Factor\), where \(N\) is the number of data observations
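The way these four parameters interact can be sketched as a simple loop. The sketch assumes that beta_factor lies between 0 and 1 and multiplies \(\beta\) at each stage; the function `solve_for_beta` is a hypothetical stand-in for a full solve of Eq. (2.24) that returns the final data misfit.

```python
def cooling_schedule(solve_for_beta, beta_max, beta_factor, n_betas, chi_factor, n_data):
    """Decrease beta until the target misfit is reached or n_betas values are used.

    solve_for_beta: hypothetical callable returning phi_d for a given beta.
    beta_factor:    assumed to lie in (0, 1); beta is multiplied by it each stage.
    """
    target = n_data * chi_factor
    beta = beta_max
    history = []
    for _ in range(n_betas):
        phi_d = solve_for_beta(beta)
        history.append((beta, phi_d))
        if phi_d <= target:  # Chi Factor stopping criterion
            break
        beta *= beta_factor
    return history


# Toy stand-in: pretend the misfit shrinks in proportion to beta
history = cooling_schedule(lambda beta: 100.0 * beta,
                           beta_max=10.0, beta_factor=0.5,
                           n_betas=10, chi_factor=1.0, n_data=50)
```

In this toy run the misfit falls from 1000 to 31.25 over six values of \(\beta\), at which point it is below the target of 50 and the loop exits.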

2.7.3.2. Gauss-Newton Update

For a given trade-off parameter (\(\beta\)), the model \(\mathbf{m}\) is updated using the Gauss-Newton approach. Because the problem is non-linear, several model updates may need to be completed for each \(\beta\). Where \(k\) denotes the Gauss-Newton iteration, we solve:

(2.25)\[\mathbf{H}_k \, \mathbf{\delta m}_k = - \nabla \phi_k\]

using the current model \(\mathbf{m}_k\) and update the model according to:

(2.26)\[\mathbf{m}_{k+1} = \mathbf{m}_{k} + \alpha \mathbf{\delta m}_k\]

where \(\mathbf{\delta m}_k\) is the step direction, \(\nabla \phi_k\) is the gradient of the global objective function, \(\mathbf{H}_k\) is an approximation of the Hessian and \(\alpha\) is a scaling constant. This process is repeated until any of the following occurs:

The gradient is sufficiently small, i.e.:

\[\| \nabla \phi_k \|^2 < tol \_ nl\]

The largest component of the model perturbation is small in absolute value, i.e.:

\[\textrm{max} ( |\mathbf{\delta m}_k | ) < mindm\]

The maximum number of GN iterations has been performed, i.e.:

\[k = iter \_ per \_ beta\]
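The three stopping conditions above can be collected into one check, sketched below. The argument names follow the tolerances named in the text (tol_nl, mindm, iter_per_beta); the function itself and its return values are assumptions for illustration.

```python
import numpy as np


def gn_stop_reason(grad, delta_m, k, tol_nl, mindm, iter_per_beta):
    """Return the name of the first stopping condition met, or None to continue.

    grad:    gradient of the global objective function at iteration k.
    delta_m: Gauss-Newton step at iteration k.
    """
    if float(grad @ grad) < tol_nl:          # squared gradient norm is small
        return "gradient"
    if np.max(np.abs(delta_m)) < mindm:      # largest step component is small
        return "step"
    if k >= iter_per_beta:                   # iteration limit reached
        return "iterations"
    return None


# Hypothetical values: the gradient is still large but the step has stalled
reason = gn_stop_reason(grad=np.array([1.0, -2.0]),
                        delta_m=np.array([1e-6, -5e-7]),
                        k=2, tol_nl=1e-3, mindm=1e-4, iter_per_beta=5)
```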

2.7.3.3. Gauss-Newton Solve

Here we discuss the details of solving Eq. (2.25) for a particular Gauss-Newton iteration \(k\). Using the data misfit from Eq. (2.21) and the model objective function from Eq. (2.23), we must solve:

(2.27)\[\Big [ \mathbf{J^T W_d^T W_d J + \beta \mathbf{W^T W}} \Big ] \mathbf{\delta m}_k = - \Big [ \mathbf{J^T W_d^T W_d } \big ( \mathbb{F}[\mathbf{m}_k] - \mathbf{d_{obs}} \big ) + \beta \mathbf{W^T W} \big ( \mathbf{m}_k - \mathbf{m_{ref}} \big ) \Big ]\]

where \(\mathbf{J}\) is the sensitivity of the data (\(\mathbf{Z}\) or \(\mathbf{T}\)) to the current model \(\mathbf{m}_k\); see sensitivity section to learn how sensitivities are computed. The system is solved for \(\mathbf{\delta m}_k\) using the incomplete-preconditioned-conjugate gradient (IPCG) method. This method is iterative and exits with an approximation for \(\mathbf{\delta m}_k\). Let \(i\) denote an IPCG iteration and let \(\mathbf{\delta m}_k^{(i)}\) be the solution to (2.27) at the \(i^{th}\) IPCG iteration, then the algorithm quits when:

the system is solved to within some tolerance and additional iterations do not result in significant increases in solution accuracy, i.e.:

\[\| \mathbf{\delta m}_k^{(i-1)} - \mathbf{\delta m}_k^{(i)} \|^2 / \| \mathbf{\delta m}_k^{(i-1)} \|^2 < tol \_ ipcg\]

a maximum allowable number of IPCG iterations has been completed, i.e.:

\[i = max \_ iter \_ ipcg\]
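The two exit conditions can be illustrated with a generic preconditioned conjugate gradient loop. This is a sketch under stated assumptions: a simple Jacobi (diagonal) preconditioner stands in for the preconditioner used by IPCG, the system is a small dense symmetric positive definite matrix, and the function name is hypothetical.

```python
import numpy as np


def pcg_step_tolerance(A, b, tol_ipcg=1e-10, max_iter_ipcg=50):
    """Jacobi-preconditioned CG that stops on a small relative change in the iterate.

    Stops when ||x_prev - x||^2 / ||x_prev||^2 < tol_ipcg or after
    max_iter_ipcg iterations. Returns the iterate and the iteration count.
    """
    x = np.zeros_like(b)
    r = b - A @ x
    m_inv = 1.0 / np.diag(A)  # Jacobi preconditioner (stand-in, assumed)
    z = m_inv * r
    p = z.copy()
    rz = r @ z
    for i in range(1, max_iter_ipcg + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x_new = x + alpha * p
        r = r - alpha * Ap
        denom = x @ x  # zero on the first pass, so the test is skipped there
        change = (x - x_new) @ (x - x_new)
        converged = denom > 0.0 and change / denom < tol_ipcg
        x = x_new
        if converged:
            break
        z = m_inv * r
        rz_new = r @ z
        if rz_new == 0.0:  # residual is exactly zero; nothing left to correct
            break
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, i


A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x, n_iter = pcg_step_tolerance(A, b)
```

Stopping on the change between successive iterates, rather than on the residual, mirrors the first criterion above: the loop ends once further iterations no longer alter the step appreciably.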