Shear properties of MgO inferred using neural networks

Shear properties of mantle minerals are vital for interpreting seismic shear wave speeds and therefore inferring the composition and dynamics of a planetary interior. Shear wave speed and elastic tensor components, from which the shear modulus can be computed, are usually measured in the laboratory mimicking the Earth’s (or a planet’s) internal pressure and temperature conditions. A functional form that relates the shear modulus to pressure (and temperature) is fitted to the measurements and used to interpolate within and extrapolate beyond the range covered by the data. Assuming a functional form provides prior information, and the constraints on the predicted shear modulus and its uncertainties might depend largely on the assumed prior rather than the data. In the present study, we propose a data-driven approach in which we train a neural network to learn the relationship between the pressure, temperature and shear modulus from the experimental data without prescribing a functional form a priori. We present an application to MgO, but the same approach works for any other mineral if there are sufficient data to train a neural network. At low pressures, the shear modulus of MgO is well-constrained by the data. However, our results show that different experimental results are inconsistent even at room temperature,


A. Rijal et al.: Shear properties of MgO inferred using neural networks
[...] strain, Vinet). This approach relates the bulk modulus information to the shear modulus assuming that the uncertainties in the latter are related to and can be derived from the former, but the relation is semi-empirical. In addition to the assumptions about uncertainties, this relationship also implies that the scaling of the actual value of the shear modulus would be the same for different minerals.
In Rijal et al. (2021), we demonstrated the use of the mixture density network (Bishop, 1995) for inferring the relationship between pressure, temperature and volume and thus bulk modulus and thermal expansivity. In this neural-network-based approach, the inferred properties are learned from the available experimental data, rather than prescribing a priori a functional form to explain the data. Our study showed that fixing a functional form provides a priori information to the inversion of experimental data, which can bias the uncertainty quantification. In the present study, we apply a similar approach to infer the shear properties, in particular the shear modulus and wave speed, of MgO with uncertainties, using experimental shear modulus data collated from various studies. Hence, the inferred pressure-temperature-shear modulus (P-T-G) relationship (as well as its uncertainties) is entirely data driven without any prior assumption about the functional form to describe the dependence of the shear modulus on pressure and temperature.

MgO shear modulus data
Experimental shear modulus data for MgO are available from various measurement techniques. In this study, we collate data (Fig. 1) with uncertainties from Brillouin scattering (BS), ultrasonic interferometry (US), rectangular parallelepiped resonance (RPR), inelastic X-ray scattering (IXS) and inelastic neutron scattering (INS) methods. A list of experimental methods from which we collected data is given in Appendix A. Readers are referred to Marquardt and Thomson (2020) for a technical review of BS, US and IXS and to Sumino et al. (1976) and [...] for RPR techniques. Nevertheless, it is important to mention that BS and US (on polycrystalline materials) provide a sample's aggregate shear wave speed (V_s) that can be used to extract the shear modulus if the density (ρ) of the same sample is known. However, not all data considered in this study provide V_s and ρ information for the same MgO sample. For example, Murakami et al. (2012) use density and velocity at 300 K from their previous study (Murakami et al., 2009) on polycrystalline samples, and Zha et al. (2000) use single-crystal (SC) data for velocity and polycrystalline (PC) samples for density. Although the densities measured on polycrystalline samples by these two different studies agree with each other, the shear modulus values are inconsistent (Appendix B).
Figure 1. [...] Sangster et al. (1970), Sumino et al. (1983), Isaak et al. (1989), [...], [...], Zha et al. (2000), Jacobsen et al. (2002), Li et al. (2006), Fukui et al. (2008), Murakami et al. (2009), Kono et al. (2010), Murakami et al. (2012), Finkelstein et al. (2018), and Fan et al. (2019). Note that uncertainties in collected experimental data are not plotted because most of them are smaller than the size of the plotting symbol. There are 311 data points in total, and we generate a dataset to train the neural networks by sampling from the uncertainty ranges of these data.
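The two data-preparation steps mentioned above can be sketched as follows: converting an aggregate shear wave speed and a density into a shear modulus through G = ρ V_s^2, and drawing perturbed copies of a data point from its reported uncertainties. This is only a minimal illustration. The Gaussian sampling distribution, the function names and the numerical values (rough ambient-condition numbers) are assumptions, since the text does not state how the uncertainty ranges were sampled.

```python
import random

def shear_modulus_gpa(v_s_km_s, rho_g_cm3):
    """G = rho * V_s**2. With rho in g/cm^3 and V_s in km/s the result is in GPa."""
    return rho_g_cm3 * v_s_km_s ** 2

def sample_point(p, t, g, sig_p, sig_t, sig_g, n, rng):
    """Draw n noisy (P, T, G) copies of one measurement.

    Assumption: independent Gaussian errors with the reported 1-sigma widths.
    """
    return [
        (rng.gauss(p, sig_p), rng.gauss(t, sig_t), rng.gauss(g, sig_g))
        for _ in range(n)
    ]

rng = random.Random(0)
# Illustrative (not measured) ambient values: V_s = 6.05 km/s, rho = 3.58 g/cm^3.
g0 = shear_modulus_gpa(6.05, 3.58)
# Hypothetical 1-sigma uncertainties: 0.1 GPa in P, 5 K in T, 1 GPa in G.
samples = sample_point(0.0, 300.0, g0, 0.1, 5.0, 1.0, 1000, rng)
```

Repeating the sampling for every data point would give an enlarged training set whose scatter reflects the reported experimental uncertainties.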
With RPR and BS in a single crystal, it is also possible to determine the elastic tensor of a sample (e.g. Isaak et al., 1989; [...]). The elastic tensor components can be used to compute the shear modulus using an averaging scheme, such as Reuss or Voigt. We collected the shear modulus of MgO given by Isaak et al. (1989) (RPR, Hashin-Shtrikman average), Sumino et al. (1983) (RPR, Voigt-Reuss-Hill average), [...] (BS SC, Voigt-Reuss-Hill average), and [...] (BS SC, Voigt-Reuss-Hill average). The shear moduli of MgO based on measurements of its acoustic phonon dispersion curves using high-energy-resolution inelastic X-ray scattering (Fukui et al., 2008; Finkelstein et al., 2018) and inelastic neutron scattering (Sangster et al., 1970) are also included in this study.
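For a cubic crystal such as MgO, the Voigt, Reuss and Voigt-Reuss-Hill averages mentioned above have simple closed forms in terms of the three independent elastic constants C11, C12 and C44. The sketch below implements these textbook expressions. The function name and the input values (approximate ambient-condition constants in GPa) are illustrative assumptions and are not taken from the collated dataset.

```python
def cubic_shear_moduli(c11, c12, c44):
    """Return (Voigt, Reuss, Hill) shear moduli for a cubic crystal.

    Voigt assumes uniform strain, Reuss assumes uniform stress,
    and Hill is the arithmetic mean of the two bounds.
    """
    g_voigt = (c11 - c12 + 3.0 * c44) / 5.0
    g_reuss = 5.0 * (c11 - c12) * c44 / (4.0 * c44 + 3.0 * (c11 - c12))
    g_hill = 0.5 * (g_voigt + g_reuss)
    return g_voigt, g_reuss, g_hill

# Approximate, illustrative elastic constants in GPa.
g_v, g_r, g_h = cubic_shear_moduli(297.0, 95.0, 156.0)
```

The Reuss value is always at or below the Voigt value, so the spread between them gives a feel for how much the choice of averaging scheme alone can shift a reported shear modulus.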
As shown in Fig. 1, most experimental data come from measurements where the temperature is below 1800 K. Although the maximum temperature in the dataset is 2700 K, there are only a few measurements on a polycrystalline sample at that temperature (Murakami et al., 2012), and there exists a central temperature gap of around 900 K in which there are no data. Along the pressure axis, most measurements fall below approximately 30 GPa. There are some data up to 128 GPa but only at ambient temperature on a polycrystalline sample (Murakami et al., 2009). The total data shown in Fig. 1 come from various experimental techniques which probe elastic properties at different frequencies. One can train a separate neural network using data from each type of experiment if sufficient data are available covering the P-T conditions of the lower mantle. This would be helpful to compare, for example, shear properties measured by ultrasonic techniques at megahertz frequencies with Brillouin measurements at gigahertz and IXS data at terahertz frequencies. In addition, one can also separate SC data from PC samples. But currently, even when combining datasets from different experiments and sample types, we do not have sufficient P-T coverage for the whole lower mantle. For this reason, in this study we shall use all the data shown in Fig. 1 to infer the complete uncertainty estimate of the shear modulus.

Methodology
P, T and G measurement errors, which may include systematic errors due to instrument calibrations, different averaging schemes and inconsistent samples for measuring velocity, are the sources of uncertainties in experimental P-T-G data. In addition, inconsistencies between different studies have been highlighted in earlier papers, which further contribute to the uncertainty (e.g. Li et al., 2006; Kono et al., 2010; Fan et al., 2019). Hence, we work within a probabilistic setting to infer the P-T-G relationship, which allows us to answer the following question: what is the full range of shear modulus uncertainty at a given pressure and temperature, based on the available experimental data? We use a neural-network-based approach, specifically a mixture density network (MDN) (Bishop, 1995), to solve the probabilistic inverse problem. A detailed description of the MDN approach to estimate material properties of minerals of the lower mantle is given in Rijal et al. (2021). Briefly, a solution to the probabilistic inverse problem of finding the P-T-G relationship is given by a posterior probability density function (pdf) for G at a given P and T. The pdf p(G|P, T) can be approximated using an MDN as shown in Appendix C. To address the non-uniqueness of the regression problem, we train a number of independent MDNs (10^3) and combine their predicted pdf's (see details in Rijal et al., 2021). The number of hidden nodes and number of Gaussian kernels of each MDN are drawn randomly from 12-24 and 3-5, respectively.
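To make the form of p(G|P, T) concrete, the following sketch shows how raw network outputs can be mapped to the weights, means and standard deviations of a Gaussian mixture, and how the pdf's of several networks can be combined. It follows the standard parameterization of Bishop (softmax for the weights, exponential for the widths), which we assume here. The equal-weight average over ensemble members, the function names and all numbers are illustrative assumptions; the combination actually used is described in Rijal et al. (2021).

```python
import math

def mixture_params(raw_weights, raw_means, raw_log_sigmas):
    """Map raw network outputs to valid Gaussian-mixture parameters."""
    m = max(raw_weights)                       # subtract max for numerical stability
    exps = [math.exp(r - m) for r in raw_weights]
    total = sum(exps)
    weights = [e / total for e in exps]        # softmax: positive, sums to 1
    sigmas = [math.exp(s) for s in raw_log_sigmas]  # exponential keeps widths positive
    return weights, list(raw_means), sigmas

def mixture_pdf(g, weights, means, sigmas):
    """Evaluate the weighted sum of Gaussian kernels at shear modulus g."""
    return sum(
        w * math.exp(-0.5 * ((g - mu) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, mu, s in zip(weights, means, sigmas)
    )

def ensemble_pdf(g, members):
    """Assumption: combine independent MDNs by an equal-weight average of their pdf's."""
    return sum(mixture_pdf(g, *m) for m in members) / len(members)

# Two hypothetical networks, each with three kernels, evaluated at one (P, T).
members = [
    mixture_params([0.0, 1.0, -1.0], [128.0, 131.0, 135.0], [0.0, 0.5, 0.2]),
    mixture_params([0.3, 0.3, 0.3], [130.0, 132.0, 134.0], [0.1, 0.1, 0.1]),
]
density_at_131 = ensemble_pdf(131.0, members)
```

Because every member is a normalized pdf, the equal-weight average is again a normalized pdf.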
We follow a standard approach of training neural networks, whereby training, monitoring and test sets are randomly generated from the total dataset. These sub-sets have similar pressure and temperature distributions. Although the total data contain similar numbers of SC and PC measurements, not all the referenced studies (in Fig. 1) equally contribute to the three sub-sets because of limited data. Training (≈ 70 %) and monitoring (≈ 20 %) sets are used to train the MDN and restrict the overfitting of the trained MDN, respectively. The MDN is constructed and trained using TensorFlow 1.13.1 (Abadi et al., 2015). The test set (≈ 10 %), which is not seen by the network during training, is used to test the prediction performance of the trained MDN. We compute the error in the monitoring set after each training iteration and stop the training if this error starts to increase. This technique is known as early stopping. In addition, we standardize these datasets to bring all three variables (i.e. P, T and G) to a common scale. The training time of each MDN depends on its architecture. To give an estimate, an MDN with 22 hidden nodes and 3 Gaussian kernels trained for 16 120 iterations (until the early stopping kicks in) took about 40 min on a CPU with two cores and 250 GB memory. However, some networks took less than 3 min to train.
Figure 2. Prediction performance of the trained MDN as a function of pressure using the test set. The mean of the posterior pdf's on the shear modulus predicted by the MDN is subtracted from the actual shear modulus values of the test set (i.e. target values) to compute the variation in the shear modulus. The mean variation is shown as circles, and the size of uncertainty (1 standard deviation) is given by grey error bars. One could also represent the same information by using the log-likelihood function given by the MDN instead. The dashed cyan line refers to a perfectly resolved shear modulus. Hence, the closer the data plot to the line, the better the resolving power of the neural network. The differences between target and predicted shear modulus values are mostly located close to the cyan line, although intermediate- and high-pressure predictions are more uncertain and are located away from it. The range of temperature of the test data is given by the colour bar on the right. Note some error bars are smaller than or comparable to the plotting symbol.
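The data handling described in this section (a random 70/20/10 split, standardization and early stopping) can be sketched in a few lines. This is a minimal illustration rather than the training code used in the study: the function names, the `patience` parameter, the use of training-set statistics for the scaling and the synthetic pressures are all assumptions.

```python
import random

def split_indices(n, rng, frac_train=0.7, frac_monitor=0.2):
    """Shuffle indices 0..n-1 and cut them into training, monitoring and test parts."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(frac_train * n)
    n_mon = round(frac_monitor * n)
    return idx[:n_train], idx[n_train:n_train + n_mon], idx[n_train + n_mon:]

def fit_scaler(values):
    """Mean and (population) standard deviation used for standardization."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return mean, std

def standardize(values, mean, std):
    """Shift and scale values to zero mean and unit standard deviation."""
    return [(v - mean) / std for v in values]

def early_stop(monitor_errors, patience=1):
    """Return the iteration with the lowest monitoring error seen before stopping.

    Training stops once the error has failed to improve for `patience`
    consecutive iterations (hypothetical parameter).
    """
    best_err, best_it = float("inf"), -1
    for it, err in enumerate(monitor_errors):
        if err < best_err:
            best_err, best_it = err, it
        elif it - best_it >= patience:
            break
    return best_it

rng = random.Random(1)
pressures = [rng.uniform(0.0, 130.0) for _ in range(311)]  # synthetic stand-in data
train_idx, mon_idx, test_idx = split_indices(len(pressures), rng)
p_mean, p_std = fit_scaler([pressures[i] for i in train_idx])
p_train = standardize([pressures[i] for i in train_idx], p_mean, p_std)
stop_at = early_stop([5.0, 4.0, 3.0, 3.5, 2.0])
```

With `patience=1` the loop stops at the first increase of the monitoring error, which mirrors the criterion stated in the text; a larger patience would tolerate short-lived increases.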
The mean and the variance (or the standard deviation) of the posterior pdf (Bishop, 1994) for the shear modulus predicted by the MDN are used to evaluate the prediction performance. These moments (i.e. mean and standard deviation) for each input of the test set are compared with the actual shear modulus from the test set. Although this approach ignores the information provided by a complete posterior pdf, it is a practical way to evaluate the network performance (e.g. de Wit et al., 2013). Figure 2 shows the resolving power of the trained MDN for various P and T conditions. Shear moduli that fall along or close to the dashed cyan line in the figure are well-resolved predictions. The MDN resolves the shear modulus better in the region of low P (below ≈ 30 GPa) than at intermediate and high P. This is in line with the sparse data (cf. Fig. 1) and increased experimental uncertainty towards high P. Furthermore, for measurements conducted at 300 K, constraints on the MgO shear modulus above 30 GPa are entirely due to the experimental data of Zha et al. (2000) and Murakami et al. (2009). At pressures around 50 GPa, however, these two datasets suggest two distinct trends (Appendix B and Fig. 3a). Due to these different trends, the standard deviation of the posterior pdf shows increasing uncertainties with pressure (Fig. 2). Moreover, because of this inconsistency, the targets are not aligned with the mean of the posterior pdf, demonstrating that the shear modulus is only moderately constrained at pressures greater than 50 GPa (large uncertainties). Figure 3 shows the pdf's for the shear modulus predicted by the trained MDN as a continuous function of pressure along a 300 K isotherm. The plot also compares the pdf's with the shear modulus given by the finite-strain equation of Stixrude and Lithgow-Bertelloni (2005, 2011) and ab initio calculations of Wentzcovitch et al. (2010b), denoted as SLB0511 and W10, respectively.
Whilst SLB0511 use fitting parameters (G_0 and G'_0) given by [...], which are based on experiments up to a pressure of 18.6 GPa, W10 use theoretical calculation of the elastic tensor by Karki et al. (1999). For the 300 K isotherm, the width of the pdf increases, i.e. uncertainty increases, with increasing pressure. This is because the experimental data become sparse, and the experimental measurements are more uncertain in the high-pressure region. Predictions by SLB0511 and W10 are consistent with each other up to approximately 75 GPa. They start to diverge as pressure increases because the predictions from SLB0511 are based on fitting low-pressure experimental data and the W10 study is based on ab initio calculations at high pressure.
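The mean and standard deviation used above to evaluate the prediction performance follow directly from the mixture parameters. The sketch below uses the standard moment formulas for a Gaussian mixture; the function name and the two-kernel example are illustrative assumptions.

```python
import math

def mixture_mean_std(weights, means, sigmas):
    """First two moments of a Gaussian mixture.

    mean = sum_i w_i * mu_i
    var  = sum_i w_i * (sigma_i**2 + (mu_i - mean)**2)
    """
    mean = sum(w * mu for w, mu in zip(weights, means))
    var = sum(w * (s ** 2 + (mu - mean) ** 2)
              for w, mu, s in zip(weights, means, sigmas))
    return mean, math.sqrt(var)

# Hypothetical bimodal pdf: two equally weighted kernels 20 GPa apart.
mean_g, std_g = mixture_mean_std([0.5, 0.5], [120.0, 140.0], [2.0, 2.0])
```

In this example the mean (130 GPa) lies between the two kernels where the density is low, and the standard deviation (about 10.2 GPa) is dominated by the separation of the kernels rather than by their widths. This is one way in which summarizing a multimodal pdf by two moments discards information.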

P-T-G relationship
For 300 K and above approximately 50 GPa (Fig. 3a), the trained MDN shows a peak in probability density functions favouring the Murakami et al. (2009) data because these are the only high-pressure measurements. However, the MDN still assigns some probability towards smaller values of the shear modulus, resulting in broader pdf's (Fig. 3a, c). The widening of the pdf's is due to different data trends from Murakami et al. (2009) and Zha et al. (2000) around 50 GPa pressure, leading to bimodal pdf's. However, the effect of experimental data of Zha et al. (2000), which have a maximum pressure of 55 GPa, decreases towards high pressures as shown by a smaller probability mass associated with them. The effect is more clearly understood when comparing panels (a) and (c) with (b) and (d) in Fig. 3. The latter panels show the pdf's predicted by another MDN trained without the data of Zha et al. (2000). Now, at 50 GPa, the pdf is narrower and unimodal, suggesting a significant reduction in uncertainty. However, after removing Zha et al. (2000), two different shear modulus trends appear around 25 GPa, leading to a discontinuity (in Fig. 3d). The discontinuity or "softening" of the shear modulus shows that the high-pressure Brillouin-scattering experimental data in polycrystalline material (Murakami et al., 2009) are not compatible with the remaining low-pressure data from Brillouin and ultrasonic measurements on single crystals and polycrystalline samples, respectively.
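The link between inconsistent data trends and bimodal pdf's can be illustrated by counting the local maxima of a Gaussian mixture on a grid. The sketch below is purely schematic: the kernel parameters are hypothetical numbers chosen to mimic two competing trends versus a single trend and are not values predicted by the trained MDN.

```python
import math

def mixture_pdf(g, weights, means, sigmas):
    """Evaluate a Gaussian mixture density at g."""
    return sum(
        w * math.exp(-0.5 * ((g - mu) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, mu, s in zip(weights, means, sigmas)
    )

def count_modes(weights, means, sigmas, lo, hi, n=2001):
    """Count interior local maxima of the mixture density on a uniform grid."""
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    dens = [mixture_pdf(g, weights, means, sigmas) for g in grid]
    return sum(
        1 for i in range(1, n - 1)
        if dens[i] > dens[i - 1] and dens[i] >= dens[i + 1]
    )

# Two competing trends (hypothetical): a minor kernel well below the main one.
modes_with_conflict = count_modes([0.3, 0.7], [250.0, 290.0], [8.0, 8.0], 200.0, 340.0)
# A single trend (hypothetical): kernels that overlap almost completely.
modes_single_trend = count_modes([0.5, 0.5], [288.0, 292.0], [8.0, 8.0], 200.0, 340.0)
```

When the kernels are separated by much more than their widths, the density keeps two peaks and its overall spread grows, whereas overlapping kernels merge into one narrower peak.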
Besides ambient temperature, we plot (Fig. 4) one isotherm (2000 K) and an isobar (0 GPa) where comparison with theoretical data of W10 is possible. More isotherms and isobars are shown in Appendix D. The 0 GPa isobar in Fig. 4 shows that the pdf's for the shear modulus are tightly constrained by experimental data at this pressure. The EOS of SLB0511 at low temperature follows the maximum likelihood given by the pdf's. However, at high temperature, it is mostly located on the lower bound of our pdf's. The EOS of W10 falls on the edge or outside of our pdf's, suggesting the theoretical calculations are not fully consistent with experimental measurements. Near 2000 K, there are only a few experimental measurements at low pressure. Thus, our pdf's show increasing uncertainty as the pressure increases. In that region, the MDN is not constrained by currently available experimental data and is forced to extrapolate, which is not advisable. In this study, we plot the shear modulus from W10 only in the region where the quasiharmonic approximation is valid according to Wentzcovitch et al. (2010a). However, more recent studies have shown a complex picture of the limit of quasiharmonic approximation (e.g. Giura et al., 2019; Calandrini et al., 2021), and a detailed discussion on the limit is out of the scope of this study.

P-T-V_s relationship
We take the density with uncertainties from our previous study on the volumetric properties of MgO (Rijal et al., 2021) to compute shear wave speed from the shear modulus (using uncertainty propagation). In particular, the mean and 1 standard deviation of the shear modulus pdf's of the present study are considered together with those from density pdf's of the previous study. We are aware that one can also directly train a network with P-T-V_s data from different experiments by excluding elastic tensor component measurements which require density to compute shear wave speed. However, excluding these data reduces the available experimental data by about 26 %. Standard deviations of the shear modulus and V_s are shown along different isotherms and an isobar in Fig. 5 together with those of SLB0511 and W10 for comparison. Excluding Zha et al. (2000) (Fig. 5b, d) removes the probability mass associated with smaller shear modulus values, which ultimately decreases the standard deviation at high pressures. The inconsistent shear modulus values between Murakami et al. (2009) and other studies, as mentioned in Fig. 3d, affect the shear wave speed as shown by its softening around 25 GPa along the 300 K isotherm in Fig. 5d.
Figure 4. [...] Stixrude and Lithgow-Bertelloni (2005, 2011) and Wentzcovitch et al. (2010b). While the former study closely follows the maximum likelihood predicted by neural networks along the isobar, the latter study falls outside the experimental uncertainty range. The predictions from W10 are shown only for pressures and temperatures where the quasiharmonic approximation is valid (Wentzcovitch et al., 2010a). Circles represent a sub-set of the total data whose pressure (or temperature) falls within the range shown in the colour bar on the side panel. Along the 2000 K isotherm, the neural network predictions go beyond the range of the experimental data at high pressure, as reflected in the widening of pdf's when the pressure increases.
Along the 0 GPa isobar, the V_s given by SLB0511 largely agrees with the mean value predicted by neural networks. However, at high temperatures, it falls outside the standard deviation range given by the MDN. The V_s values along other isotherms and isobars are shown in Appendix E.
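For reference, the uncertainty propagation from shear modulus and density to shear wave speed can be written out to first order. The derivation below assumes that the uncertainties in G and ρ are small and uncorrelated; this is an assumption made here for illustration and is not stated explicitly in the text.

```latex
V_s = \sqrt{\frac{G}{\rho}}
\quad\Longrightarrow\quad
\ln V_s = \tfrac{1}{2}\left(\ln G - \ln \rho\right),
\qquad
\left(\frac{\sigma_{V_s}}{V_s}\right)^{2}
\approx \frac{1}{4}\left[
\left(\frac{\sigma_{G}}{G}\right)^{2}
+ \left(\frac{\sigma_{\rho}}{\rho}\right)^{2}
\right].
```

Under these assumptions the relative uncertainty in V_s is about half the relative uncertainty in G whenever the density is much better constrained than the shear modulus.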

Discussion
Theoretical computations provide mineral elastic properties across the lower mantle's pressure and temperature conditions (e.g. Karki et al., 1999; Matsui et al., 2000; Wentzcovitch et al., 2010b). However, computations usually report uncertainties to be within a few percent, if reported at all, and accurate experimental measurements are required to benchmark these calculations (Marquardt and Thomson, 2020). On the experimental side, measurements are largely reported at low pressures, mostly below 30 GPa. A finite-strain equation that controls the functional form is thus generally used to handle the extrapolation away from the measured shear moduli. In the absence of further knowledge, it is reasonable to assume a functional form based on, for example, an elasticity theory. But if MgO's true elastic behaviour is not compatible with this assumption, then we make inadequate predictions about its properties.
The finite-strain equation of Stixrude and Lithgow-Bertelloni (2005, 2011), whose G_0 and G'_0 are constrained by experimental measurements at low pressures, is mostly consistent with the posterior mode given by our MDN. However, it deviates from the maximum likelihood of the pdf's at the pressure range 30 to 65 GPa (Fig. 3b), as well as at pressures larger than 110 GPa (Fig. 3a) along 300 K isotherms. It also diverges at temperatures greater than 1500 K along a 0 GPa isobar (Fig. 4). While the EOS of SLB0511 falls within the pdf predicted by our MDN, the MDN gives a more complete picture of the level of certainty based directly on the data consistency. This in turn enables us to assess the level of confidence we can place on interpretations of seismic data. A more robust comparison of our results with SLB0511 and W10 would require quantification of uncertainties in their predicted shear properties. For that, correlations between fitting parameters are needed, which are not provided in SLB0511. In addition, the residual standard deviation for such an explicit equation would depend on the choice of its mathematical form. Therefore, we proposed a data-driven approach for both approximating functional forms between variables (i.e. equations of state) and quantifying their uncertainties. Theoretical calculations by Karki et al. (1999) report qualitative uncertainties in their calculation that are within a few percent. Taking that into account, W10 would still place high confidence in the low-probability region of the pdf predicted by neural networks, for example along a 0 GPa isobar and along a 300 K isotherm at pressures approaching the core-mantle boundary.
Experiments that are performed on polycrystalline materials (e.g. Murakami et al., 2009) provide an average sample velocity. However, the interpretation of "the average" can be difficult (Marquardt and Thomson, 2020). This is not a problem as far as the bulk modulus of MgO is concerned because for cubic crystals, the common assumptions of uniform stress (i.e. Reuss scheme) and uniform strain (i.e. Voigt scheme) are equivalent. However, this is not the case for the shear modulus. Moreover, the velocity measured in a polycrystalline material is sensitive to grain size, shape and orientation (e.g. Yeheskel et al., 2005; Marquardt et al., 2011; Marquardt and Thomson, 2020). Along a 300 K isotherm, once the data from Zha et al. (2000) are removed, there are only polycrystalline measurements at pressures larger than 25 GPa (Murakami et al., 2009). These measurements clearly show a different trend (Fig. 3d) when compared with the shear modulus determined from other single- and polycrystalline studies at pressure below 25 GPa. Furthermore, including data from Zha et al. (2000) provides a larger standard deviation in the shear modulus which ultimately translates into larger uncertainties in the shear wave speed. For example, at 300 K and at 135 GPa, 1 standard deviation in the shear modulus is approximately ±14 %. Together with about ±1 % density uncertainty (Rijal et al., 2021), the shear wave speed uncertainty is approximately ±7 % under those conditions. This is larger than the shear wave speed variations reported in seismic tomographic models of the lower mantle, although they only capture the long-wavelength structures (e.g. Ishii and Tromp, 1999; Trampert et al., 2004; Simmons et al., 2010; Moulik and Ekström, 2014; French and Romanowicz, 2014; Koelemeijer et al., 2015; Lei et al., 2020). Hence, even at room temperature, current experimental constraints on shear properties of MgO are uncertain at high pressures. However, when data from Zha et al. (2000) are removed, the shear wave speed uncertainty reduces by a factor of approximately 3.5 under the same conditions. This is the case for MgO, which is arguably the best-studied mantle mineral, and it would be worth evaluating the uncertainties for other minerals.
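The figures quoted above (about ±14 % in the shear modulus and ±1 % in density giving roughly ±7 % in shear wave speed) can be checked with a short calculation. The sketch assumes first-order propagation through V_s = sqrt(G/ρ) with uncorrelated relative uncertainties; the function name is hypothetical.

```python
import math

def vs_relative_sigma(rel_sigma_g, rel_sigma_rho):
    """First-order relative uncertainty of V_s = sqrt(G / rho).

    Assumption: the uncertainties in G and rho are small and uncorrelated.
    """
    return 0.5 * math.sqrt(rel_sigma_g ** 2 + rel_sigma_rho ** 2)

# Values quoted in the text for 300 K and 135 GPa.
rel_vs = vs_relative_sigma(0.14, 0.01)
```

The result is close to 0.07, in line with the approximately ±7 % stated in the text, and it shows that the density term contributes almost nothing at this level of shear modulus uncertainty.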
To quantify the uncertainties in shear wave speeds from the MDN-predicted pdf's of the shear modulus and density, we chose a pragmatic approach. We simply extracted the mean and standard deviation from the pdf's. However, one can also take the most probable Gaussian kernel, i.e. the kernel with the largest weight (in Fig. C1). Additionally, similarly to the mean and standard deviation, other moments (Bishop, 1994) of probability density functions can be easily extracted if desired. Besides that, as discussed in Rijal et al. (2021), neural networks show increasing uncertainty if we need to extrapolate substantially outside the prior data range (i.e. training data). SLB0511 and W10 results closely follow the MDN extrapolation (Fig. 4, right panel). However, all the extrapolated pdf's might be heavily dependent on the network architecture, and we therefore advise against extrapolation at pressures above approximately 65 GPa at all temperatures apart from 300 K.
Figure 5. [...] Zha et al. (2000). W10 data are shown only up to the limit where quasiharmonic approximation is valid (Wentzcovitch et al., 2010a). At 300 K, the uncertainties at high pressures are entirely due to Murakami et al. (2009) when excluding Zha et al. (2000). The sudden change in the velocity curvature near 25 GPa is due to inconsistency between data from Murakami et al. (2009) and other studies.
With the MDN approach, we quantify uncertainties in the shear modulus (in Figs. 3 and 4 and Appendix D) with the help of posterior probability density functions. The MDN predicts complete uncertainties from the data with which it is trained, in the sense that it accounts for all experimental errors and uncertainties, interpolation uncertainties including data sparsity and inconsistency, and uncertainties in the inverse problem itself. One can in principle add theoretically computed shear modulus data to the existing experimental data. This would provide additional constraints on the shear properties of MgO, especially in the regions where the experimental measurements are not yet feasible. However, there are regions (Figs. 3, 4 and 5) where the shear modulus uncertainty based on the combined data would likely be larger than those uncertainties based on experiments alone. This is because current theoretical studies fall on the edge or outside of the pdf's from experimental data.
Furthermore, the MDN-based approach can be extended to model material properties that have more complicated dependencies on pressure and temperature, such as the transition from a high- to low-spin state of iron in (Fe,Mg)O ferropericlase. A reduction in the bulk modulus of ferropericlase on transition from the high-spin to low-spin state has been reported by various studies (e.g. Lin et al., 2006; Speziale et al., 2007; Crowhurst et al., 2008; Marquardt et al., 2009; Wentzcovitch et al., 2009). A single finite-strain equation cannot model the high- and low-spin states simultaneously. As a result, some studies (e.g. Fei et al., 2007) have used two such EOSs to fit the high- and the low-spin data of ferropericlase separately, which may not capture its properties in the mixed-spin region. Others have considered the electronic contribution, in addition to the elastic contribution, to the total free energy of a material (Sturhahn et al., 2005; Chen et al., 2012). Although the pressure range of the spin transition is a function of temperature and iron content (e.g. Sturhahn et al., 2005; Lin et al., 2007; Solomatova et al., 2016), it varies between studies (Marquardt et al., 2018). Our neural-network-based approach has an advantage that it does not require a predetermined functional form or a solid solution model to capture material properties across iron spin transition in ferropericlase. By simply appending composition (e.g. mol % Fe) as an extra dimension in the input data, it is in principle straightforward to model such properties, which otherwise would be difficult to represent with finite-strain equations.

Conclusions
1. The shear modulus of MgO is constrained by experiments at low-P and low-T conditions.
2. In general, at low pressures, shear moduli based on an explicit finite-strain equation whose fitting parameters (i.e. G_0 and G'_0) are constrained by the same experimental measurements are consistent with the maximum likelihood given by neural networks. However, there are several regions where the finite-strain equation diverges from the pdf predicted by experimental data, mainly under P-T conditions not used to constrain it. In such regions and under P-T conditions where extrapolation is necessary, constraints on the shear modulus are largely based on the functional form of the EOS and not the measurements.
3. Comparisons with MDN-predicted pdf's show that an explicit finite-strain equation represents one possible solution within the range of uncertainties, which is sometimes, although not always, the most likely value of the pdf's.
4. Data-driven approaches identify inconsistent data. Brillouin-scattering experiments on polycrystalline MgO are currently the only available measurement type that spans the lower mantle's temperature and pressure conditions. However, these measurements follow a different trend from the remaining low-pressure experimental data from Brillouin scattering on single crystals and ultrasonic measurements on polycrystals. Even at ambient temperature, these different experimental datasets are inconsistent.
5. From a purely data-driven point of view, our pdf's show a large uncertainty in the shear properties of MgO, especially for pressures larger than about 30 GPa.
6. There are P-T regions in the mantle in which shear properties of MgO are constrained neither by training data nor by the test dataset. Hence, we would not recommend extrapolating the shear modulus using MDNs in such regions outside of the prior experimental data.
7. The MDN approach provides realistic estimates of the uncertainties in the pressure-temperature range where measurements have been taken, which should be considered a lower bound if one extrapolates shear elastic properties outside of this region using, for example, a finite-strain formalism. Although the formalism will appear better-constrained, it could potentially be biased as some of our examples have shown in this study.
8. Currently, MgO is the mineral with the most data in the lower mantle. Therefore, one can expect larger uncertainties for other minerals.

Appendix C: Architecture of the mixture density network
Figure C1. A mixture density network (figure modified after Bishop, 1994; Rijal et al., 2021) to approximate the (c) posterior probability density function for G at a given P and T (inputs). The posterior is approximated using a combination of (a) a conventional two-layer feedforward neural network and (b) a Gaussian mixture model (GMM) that consists of Gaussian functions. α represents weights and biases of the feed-forward network. h_j and y_k represent hidden nodes and the outputs of the feed-forward network, respectively. The mean, standard deviation and weight of each Gaussian kernel are computed from y_k. Then, a weighted sum of these Gaussians in the GMM approximates an arbitrary posterior probability density function.
Appendix D: P-T-G
Figure D1. P-T-G relationship of MgO predicted by neural networks along 1000 and 1500 K isotherms and 20 and 50 GPa isobars. Previously published studies (Stixrude and Lithgow-Bertelloni, 2005, 2011; Wentzcovitch et al., 2010b) are also shown for comparison. Circles are the data that belong to temperature and pressure ranges shown by colour bars in each panel.
Appendix E: P-T-V_s
Figure E1. P-T-V_s relationship of MgO predicted by neural networks along 1000 and 1500 K isotherms and 20 and 50 GPa isobars. Previously published studies (Stixrude and Lithgow-Bertelloni, 2005, 2011; Wentzcovitch et al., 2010b) are also shown for comparison.
Code availability. The code used in this paper is freely available by contacting the corresponding author.
Data availability. The data used in this paper were collected from already published literature which is referenced in the caption of Fig. 1. Nevertheless, the dataset is also available by contacting the corresponding author.
Author contributions. AR contributed in terms of data curation, methodology, software, validation, visualization, formal analysis, investigation and writing - original draft. LC contributed in terms of conceptualization, supervision, funding acquisition, and writing - review and editing. JT contributed in terms of supervision and writing - review and editing. HM and JMJ contributed in terms of writing - review and editing.
Competing interests. The contact author has declared that none of the authors has any competing interests.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.