J. Phys. Soc. Jpn. 86, 104004 (2017) [7 Pages]

Cu Diffusion in Amorphous Ta2O5 Studied with a Simplified Neural Network Potential

Kiyoyuki Terakura
JPSJ News Comments 14,  11 (2017).

+ Affiliations
1Department of Materials Engineering, The University of Tokyo, Bunkyo, Tokyo 113-8656, Japan2Research Center for Computational Design of Advanced Functional Materials, National Institute of Advanced Industrial Science and Technology, Tsukuba, Ibaraki 305-8568, Japan3Center for Materials Research by Information Integration, Research and Services Division of Materials Data and Integrated System, National Institute for Materials Science, Tsukuba, Ibaraki 305-0047, Japan

Understanding atomistic details of diffusion processes in amorphous structures is becoming increasingly important due to the recent advances in various information and energy devices. Atomistic simulations based on the density functional theory (DFT) represent a powerful approach; however, the development of a method characterized by both high reliability and computational efficiency remains a challenge. In this study, a simple neural network (NN) interatomic potential is constructed from the results of DFT simulations to investigate the diffusion of a single Cu atom in amorphous Ta2O5. The proposed technique is as accurate as the DFT in predicting hopping paths and the corresponding barrier energies in a given amorphous structure. Although the developed NN-based approach exhibited some limitations since it was constructed specifically for Cu, the obtained results showed that the NN potential was able to satisfactorily describe the Cu diffusion behavior. Thus, the Cu diffusion activation energy calculated at low temperatures (between 500 and 800 K) using kinetic Monte Carlo simulations and the NN potential matched the experimental data reasonably well.

©2017 The Author(s)
This article is published by the Physical Society of Japan under the terms of the Creative Commons Attribution 4.0 License. Any further distribution of this work must maintain attribution to the author(s) and the title of the article, journal citation, and DOI.
1. Introduction

Amorphous oxides (including a-Ta2O5, a-HfO2, and a-TiO2) are widely used in various energy and electronic devices,1) whose operating mechanism is often related to the atomic diffusion inside the amorphous phase. For example, in the resistance switches27) the rate of diffusion in the insulating amorphous oxide layer has a significant effect on the device performance characteristics, such as turn-on voltage, endurance, and switching speed.6) Moreover, several amorphous electrolytes (such as LiPON and Li3PO4) also play key roles in the operation of solid-state thin-film batteries because their charge and discharge rates are determined by the lithium diffusion rate.810) Another example is the solid electrolyte interphase in Li batteries, which serves as a protective amorphous layer for the anode and highly affects the battery performance.11,12) Thus, the elucidation of atomic diffusion in such amorphous layers is very important for the development of electronic devices with superior properties.

In recent years, computational simulations have become powerful tools for understanding the mechanism of atomic diffusion in various media. The computational approaches that are most frequently used for this purpose are represented by nudged elastic band (NEB)10,1315) and molecular dynamics (MD).1621) Both approaches require the force field provided by either first-principles calculations or classical interatomic potentials. Density functional theory (DFT)-based NEB22) method and ab initio MD calculations can provide reliable descriptions of diffusion mechanisms in solids at an atomic level. However, the application of DFT is limited by its high computational demand, which may be critical for studying atomic diffusion in amorphous solids because modeling realistic amorphous structures typically requires the use of relatively large supercells, and the complexity of local atomic environment significantly exceeds those in the crystal. On the other hand, the simulations using classical interatomic potential with physically motivated simple forms (such as Lennard-Jones, Morse, Tersoff,23) EAM,24) and ReaxFF25) ones) allow relatively fast determination of atomic migration trajectories (as compared to DFT) since it does not take into account electronic states. However, its accuracy remains limited because of the simple formula utilized for describing interatomic interactions.

In order to achieve sufficient accuracy combined with high computational speed during simulations of microscopic atomic movements, various approaches have been proposed. For example, several researchers tried to fit the interatomic potentials used in classical MD simulations with the results of ab initio calculations.26,27) Despite the improvement of the interatomic potential accuracy, its degree was significantly restricted by the simplicity of the potential formula. As a result, the obtained accuracy was not sufficient for many applications, including modeling atomic diffusion in amorphous structures.

Recently, a new method, which involves the construction of interatomic potentials via machine learning techniques to determine the relationship between the studied structure and its corresponding energy, has attracted much attention as a promising way of achieving high computational accuracy and speed. Usually, the results of first-principles calculations are used as the “training” data for the learning process. After sufficient training, the method becomes capable of predicting structural energies that were not included in the training data set. Such machine learning-based interatomic potentials were found to be much more accurate than conventional interatomic potentials with simple formulas and thus were utilized for simulating various materials. The techniques used for this purpose included neural networks (NNs),2837) kernel regression,38,39) Gaussian process regression,40) and linear regression.41,42)

Among these methods, NNs appear to be especially promising, as has been demonstrated by Behler and coworkers.30,33,43,44) They evaluated the performance of the NN potentials constructed for Si,30) C,35) Cu,31) ZnO,33) TiO2,43) H2O dimers,45) Cu clusters supported on Zn oxide,32) and Au/Cu nanoparticles with water molecular.44,46) However, the reported potentials were complicated and contained several thousands of various parameters, which required a relatively large amount of training data, in particular for the systems including more than one atomic species.

In this study, we have developed a new NN potential to examine the diffusion behavior of a single Cu atom in amorphous Ta2O5. Although the described system contains three different atomic elements, the utilized NN potential is much simpler than those proposed previously for ternary systems.32) We investigate the Cu diffusion in the amorphous Ta2O5 because this is a very important process in the operation of Cu/Ta2O5/Pt atomic switch. Considering the facts that Cu filaments grows/shrinks in Ta2O5 during the operations and that the network of Ta–O polyhedrons is formed in amorphous Ta2O5, it is believed that the single Cu migration via interstitial sites must be a primary process during the operation.1,47,48) More complicated processes involving other Cu, Ta, and/or O atoms may play roles at certain stages during the operations, but the consideration of such processes is out of scope of the present study. The outline of this paper is as follows. Section 2 describes the details of constructing the NN potential. The potential energy surface of Cu, diffusion paths, and barrier energies obtained using the NN potential (which are compared with the results of DFT calculations) are presented in Sect. 3. In Sect. 4, the diffusion behavior of Cu is further examined by calculating the diffusion network, diffusion coefficients, and effective diffusion activation energy. The obtained conclusions are provided in Sect. 5.

2. Method

Owing to the high accuracy of high-dimensional NN potentials, they have been extensively used in atomic modeling.34,35,49) However, their application may be restricted by the high computational demand of training data, which usually require many hours of DFT calculations. For this reason, a simple NN potential is proposed in this work to investigate the diffusion of a single atom in amorphous materials.

The main idea of the simplified NN potential utilized in this study is to express the total energy of the structure containing a relaxed amorphous Ta2O5 matrix and a single Cu atom, \(E_{\text{Ta2O5+Cu}}\), as the sum of the energies of the pure Ta2O5 matrix, \(E_{\text{Ta2O5}}\), and dissolution energy of the Cu atom, \(E_{\text{dissolution}}\). The second term \(E_{\text{dissolution}}\) can be further subdivided into two parts: 1) the energy change when the Cu atom is inserted into the frozen host matrix, \(E_{\text{insert}}\), and 2) the energy change due to the host matrix relaxation (i.e., the displacements of nearby Ta and O atoms caused by the insertion of Cu atom), \(E_{\text{relax}}\): \begin{align} E_{\text{Ta$_{2}$O$_{5}$+Cu}}&=E_{\text{Ta$_{2}$O$_{5}$}}+E_{\text{dissolution}}\notag\\ &=E_{\text{Ta$_{2}$O$_{5}$}}+E_{\text{insert}}+E_{\text{relax}}. \end{align} (1) The value of the first term, \(E_{\text{Ta2O5}}\), can be directly obtained by performing DFT calculations, while the NN approach is focused on predicting the value of \(E_{\text{dissolution}}\). The value of \(E_{\text{insert}}\) is determined from the atomic arrangements around the inserted Cu atom before the relaxation, while that of \(E_{\text{relax}}\) is determined from the matrix structure after the relaxation. In this work, however, we implicitly assumed the positional shifts of atoms in the host Ta2O5 matrix in structure optimization were not significant. In this case, the magnitudes of both \(E_{\text{insert}}\) and \(E_{\text{relax}}\), and thus that of \(E_{\text{dissolution}}\) can be approximated as a function of atomic arrangements around the inserted Cu atom before the relaxation. Fortunately, this assumption appears to be valid for NEB calculations since the atomic displacements in the host matrix are usually smaller than 0.5 Å. Then we can forget about the host matrix structure after the relaxation and consider only the initial host structure and the Cu insertion position explicitly. It should be noted that although the matrix relaxation was not actually preformed in calculations using the constructed NN potential, the energy after matrix relaxation was directly predicted, because \(E_{\text{relax}}\) calculated with in the DFT was taken into account in the training of NN. In another word, we expect that the effects of Ta2O5 relaxation after the Cu insertion were automatically taken into account in the NN potential.

The use of a simplified NN potential considerably reduces the complexity of the computational procedures utilized in previous works because it contains only one NN, while the other high-dimensional NN potentials consist of multiple NNs corresponding to different species. The number of parameters in a simplified NN potential is equal to several hundred, which is much smaller than that in a typical NN potential. The details of the utilized simplified NN potential will be provided in the following sections.

NN input

The symmetry functions proposed by Behler et al. were used to describe the atomic environment of Cu.50,51) They provide a unique description of the atomic positions and contain a constant number of functions independent of the number of atoms in the supercell, which make them suitable input coordinates for the atomic NNs. In the present work, the following two kinds of symmetry functions (radial and angular ones) were used: \begin{align} G^{1}&=\sum_{i}e^{-\eta R_{i}^{2}}f_{c}(R_{i}), \end{align} (2) \begin{align} G^{2}&=2^{1-\zeta}\sum_{i}\sum_{j\neq i}(1+\lambda\cos\theta_{ij})^{\zeta}\notag\\ &\quad\times e^{-\eta(R_{i}^{2}+R_{j}^{2})}f_{c}(R_{i})f_{c}(R_{j}). \end{align} (3) Here \(R_{i}\) is the interatomic distance between the insertion position of Cu and atom i, and \(\theta_{ij}\) is the angle formed by the Cu vertex and atoms i and j. \(f_{c}\) is the cutoff function that defines the local environment sphere: \begin{equation} f_{c}= \begin{cases} 0.5\times\left[\cos\left(\dfrac{\pi R_{i}}{R_{c}}\right)+1\right] &\text{for $R_{i}\leq R_{c}$}\\ 0 &\text{for $R_{i}>R_{c}$}\end{cases}, \end{equation} (4) where \(R_{c}\) is the cutoff distance. Generally speaking, the cutoff distance should be large enough to include the interatomic interactions sufficiently. In the present study, the cutoff distance \(R_{c}\) is set to be 8 Å according to the convergence test (for details, see Table S2 in Supplemental Material).52) The detailed description of the utilized symmetry functions can be found in Refs. 50 and 51.

NN output

The NN defines a functional relation between the position of a Cu atom in the amorphous matrix (which is described by the symmetry functions in the nodes of the input layer) and the dissolution energy of the Cu atom \(E_{\text{dissolution}}\), which is obtained in the node of the output layer. After the value of \(E_{\text{dissolution}}\) is calculated, the total energy of the structure can be easily computed by adding the energy of the pure amorphous matrix \(E_{\text{Ta2O5}}\). As we mentioned previously, in the training procedure of the present NN, the pre-relaxation structures were used as the input, while the corresponding energies after matrix relaxation were used as the target. Thus, the effects of Ta2O5 relaxation after Cu insertion were automatically incorporated into the training process of NN. Similar to the high-dimensional NN potential, the force component acting on the Cu atom \(F_{\alpha}\) with respect to the coordinate \(\alpha=x\), y, or z can be evaluated according to the chain rule: \begin{equation} F_{\alpha}=-\frac{\partial E}{\partial\alpha}=-\sum_{\mu}\frac{\partial E}{\partial G_{\mu}}\times\frac{\partial\mu}{\partial\alpha}. \end{equation} (5)

Training data

Vienna ab initio simulation package (VASP)53,54) was utilized to model the amorphous Ta2O5 structure and perform all the required DFT calculations for training and validating the NN potential. The projector augmented wave (PAW) method was adopted to treat atomic core electrons, while the generalized gradient approximation (PW91) was adopted to describe the electron–electron interactions.55,56) In addition, a plane-wave basis set with a cutoff energy of 500 eV was used.

A model for the amorphous Ta2O5 matrix57) was constructed using the melt quenching method under the conditions specified in the Ref. 5. The training data utilized for training the NN potential were generated as follows. First, one Cu atom was randomly inserted into the amorphous Ta2O5, and the resulting structure was optimized while keeping the Cu atom position fixed. During optimization, \(1{\times} 1{\times} 2\) k-points were used for all structures, and the force tolerance was equal to 0.05 eV/Å. At last, the Cu dissolution energy \(E_{\text{dissolution}}\) at this position can be obtained by subtracting the energy of pure Ta2O5 matrix from the final total energy.

3. Performance of NN Potential
Energy calculations

To construct the simplified NN potential, 2000 training data were obtained via DFT. The resulting data was randomly split into a training set (90%) and an independent testing set (10%). The training set was used to fit the parameters of the NN, which consisted of two hidden layers with 9 nodes each, one input layer with 36 nodes, and one output node. The NN parameters (corresponding to the number of hidden layers and number of nodes in each layer) were determined by performing a special testing procedure.58)

The optimization of NN potential parameters was done iteratively with the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm. In each iteration during the optimization, the root-mean-square errors (RMSEs) in both the training set and testing set were monitored. The training process was truncated when testing RMSE stops decreasing. The learning curve of such NN potential was shown in Supplemental Material (Fig. S2).59) The RMSEs for energies after training process were 23 and 39 meV per structure for the reference training and independent testing set, respectively. The mean absolute errors (MAEs) obtained for the training and test structures were 16 and 21 meV/structure, respectively. According to the results presented in Fig. 1, the energies of the training and test structures calculated using the NN potential matched the values obtained via DFT simulations (the corresponding correlation coefficients were equal to 0.998 and 0.997, respectively). In contrast with the high-dimensional NN potentials, whose remaining total energy errors are typically under 5 meV/atom, the accuracy of simplified NN potential appears to be worse. However, it should be pointed out that the RMSEs in the present study are given as values per whole supercell, while they are given as the values per atom in the high-dimensional NN potentials. So the comparison of the potential accuracy between the simplified and high-dimensional NN potentials is not straightforward. On the other hand, the fitting accuracy of NN can be constantly improved by increasing training data at the expense of additional DFT calculations. Considering the fact that the satisfactory accuracy was obtained for diffusion barriers (which will be shown in Sect. 3.2), we decided to perform the following investigation based on the present NN potential.

Figure 1. (Color online) A comparison between the total energies of the training set and testing set structures obtained using the DFT and NN approaches. The inset contains the MAE distributions (\(|\mathrm{E}_{\text{NNP}}-\mathrm{E}_{\text{DFT}}|\)) calculated for these two data sets.

It should be noted that the calculation of the total energy of a single geometric configuration, which was conducted using the current implementation of the NN potential model on a single CPU core (Intel® Xeon® W3565 processor), took only 0.001 s (for comparison, performing the same task using the VASP code and 12 CPU cores of the same type took about 1 h). Hence, the computational speed obtained using the NN potential was \(10^{7}\) times greater than that observed during DFT simulations.

NEB calculations

To test the reliability of the NN potential for describing the atomic diffusion process, the former was used to predict a long path of the Cu diffusion in the amorphous Ta2O5 matrix and the corresponding energy profile via NEB simulations. During this procedure, the overall motion of the Cu atom was described by several hops between the equilibrium sites. The locations of images after optimization (representing the diffusion path) are shown in Fig. 2(a). The obtained NN potential was in good agreement with the DFT results. The lengths of the diffusion paths defined as the sums of distances between the adjacent Cu transient positions were equal to 28.6 and 28.9 Å for the NN potential and DFT simulations, respectively. The corresponding energy variations along the long-range diffusion path, which are plotted against the diffusion length in Fig. 2(b), also demonstrate a good agreement between the two utilized methods. The largest mismatch between the data obtained via the NN potential and DFT calculations was about 0.15 eV, which corresponded to the saddle point of the highest barrier. Apart from the inherent inaccuracy of the NN potential method, a large discrepancy observed for this point might also result from the NEB images that missed the saddle point of the minimum energy path, which represented a common problem for NEB calculations and could be fixed by using a climbing image NEB method.60) It is worth noting that the range of Cu–O coordination number is wider in the training structures than the transient images and the range of Cu–Ta coordination number is the same between them (for details, see Sect. B of Supplemental Material).61) This suggests that the training dataset includes a variety of local environments sufficient to describe the transient image structures, which may be one of the reasons why the simplified NN potential method can achieve a good accuracy in the NEB calculation.

Figure 2. (Color online) A comparison between the NEB calculation results obtained for the selected diffusion pathway of a Cu atom in amorphous bulk Ta2O5 using the DFT and NN potential approaches. (a) Transient images of the Cu atom obtained during the NEB calculations, from which the locations of diffusion pathways can be acquired. The convex polyhedrons represent Ta–O species in the amorphous matrix, while O and Ta atoms are not shown for clarity. (b) Energy profiles computed along the corresponding diffusion paths.

Advantages and disadvantages of simplified NN potential

The advantages of the simplified NN potential are as follows: 1) it is based on the multi-layer perceptron NN, which can be easily constructed using various open-source codes; and 2) the number of data sets required for training the NN is about 1–2 orders of magnitude smaller than that utilized for a typical ternary high-dimensional NN potential.

However, the proposed simplified method also has limitations. First, it cannot be extended to MD techniques because of the absence of atomic descriptions (especially forces acting on atoms) in the host matrix. Second, although the NN potential can be trained in a particular amorphous network, it is often desirable to transfer it to another amorphous matrix. However, it was found that the transferability of the simplified potential was not very high (except for the cases with a very similar matrix structure) concerning the energy of the system including a Cu atom inserted at a randomly chosen point.62) For the calculation of metastable Cu positions, migration paths and migration energy barriers, the transferability seems much better, but in some cases the errors are considerably large.62) We expect that the transferability of the proposed method can be significantly improved by training the NN potential with a wide variety of the local environments of Cu atoms (which would require the use of various amorphous host matrices) and/or larger cutoff radii of the symmetry functions (this hypothesis will be tested in future studies).

On the basis of the above, we think that the NN potential method proposed in this paper can be complementary to the high-dimensional NN potential in studying atomic diffusion. Highly reliable description of atomic diffusion behavior is possible by a well-constructed high-dimensional NN potential, while at least rough overview can be obtained by the present NN potential with much lower computational cost.

4. Cu Diffusion in a-Ta2O5
Metastable interstitial positions

Determining potential equilibrium sites in amorphous bulk materials is a challenging task because of structural disorder. To avoid bias, it is important to locate all interstitial sites without making assumptions about their structure. Owing to the high computation speed of the NN potential, all possible Cu positions in the amorphous Ta2O5 matrix structure were investigated in this work. First, Cu atoms were inserted at each of the points defining the \(50{\times} 50{\times} 50\) uniform grid in the matrix supercell, and their positions were optimized using the BFGS algorithm. After optimization, many of the final Cu positions were close both spatially and energetically. This phenomenon represented an artifact originated from the lax force convergence criterion, which terminated Cu optimization slightly before reaching energy minimum. Therefore, hierarchical clustering analysis was performed for the optimized interstitial positions using the distances between two Cu atoms as the metric63) (it should be noted that the described technique was used in previous studies to search for equilibrium sites in amorphous bulk materials64,65)). As a result, the calculated interstitial positions were successively merged until all the distances between clusters became longer than the cutoff distance of 0.5 Å (during this procedure, 29 different clusters were obtained). For each cluster, the energy variance of all configurations was below 0.04 eV, and the lowest energy configuration was used to represent the metastable Cu interstitial position.

Diffusion network

According to the results obtained in the previous section, 29 metastable interstitial positions of Cu atoms were determined using the NN potential. Assuming that a Cu atom can hop between the two adjacent interstitial positions with a distance between them less than 5 Å, 67 diffusion paths have been obtained (during the search, the paths that can be considered combinations of two or more individual pathways were eliminated). After that, the activation energies of the resulting pathways were determined using the NEB method (the spatial locations of all the obtained diffusion paths are depicted in Fig. 3).

Figure 3. (Color online) Equilibrium sites (blue spheres) and diffusion pathways (blue lines) determined for the Cu atoms in the bulk amorphous Ta2O5 structure. The polyhedrons represent Ta–O polyhedrons, while O and Ta atoms have been removed for clarity.

For each individual diffusion path, two different activation energies corresponding to the two opposite hopping directions were computed. In general, for the path connecting sites i and j with the saddle point energy \(E_{ij}\) (which represents the maximum energy of the NEB energy profile), the activation energy of Cu diffusion from i to j is equal to \(E_{ij}\)\(E_{i}\) (here \(E_{i}\) is the energy of site i), while the activation energy of the diffusion in the opposite direction equals \(E_{ij}\)\(E_{j}\). The calculated barrier energies of the diffusion paths span across a wide range between 0.01 and 1.79 eV with an average value equal to 0.55 eV, while their lengths vary from 0.91 to 4.98 Å. Figure 4 displays the distribution of barrier energies determined for single hops. Although most of the pathways exhibit relatively small barrier energies of less than 0.2 eV, it does not imply that the corresponding Cu diffusion activation energies are very low because a particular Cu atom may not be able to escape from the narrow region between the sites connected via the paths with such small energies. For this reason, the periodical diffusion paths (which are defined as the pathways connecting equilibrium sites and their equivalents in the adjacent supercells) have been sampled. As a result, the shortest pathways (determined from the obtained number of nearest-neighbor hops) and the second shortest ones were selected. The barrier energy of a periodical path was equivalent to the largest barrier energy calculated for individual hops. The results of the statistical analysis summarized in Fig. 4 reveal that the obtained barrier energies of the periodical paths vary from 0.68 to 1.79 eV. Interestingly, the lowest value (0.68 eV) corresponds to the activation energy of Cu diffusion in the diffusion network obtained via kinetic Monte Carlo (KMC) simulations, which will be described in detail in the next subsection.

Figure 4. (Color online) Distributions of the barrier energies calculated for all diffusion pathways of atomic Cu in bulk amorphous Ta2O5. The black columns correspond to individual hops (single NEB calculations), and the red columns represent the periodical paths, which connect equilibrium sites with their equivalent sites in the adjacent supercells. The frequencies were normalized with respect to the total number of diffusion pathways.

KMC simulations of Cu diffusion

The diffusivity and diffusion activation energy of Cu atoms were examined by performing KMC simulations. During the utilized procedure, the probability for the Cu atom hopping from interstitial site i to site j was determined using the harmonic transition state theory: \begin{equation} \mathrm{p}=k\exp\left(-\frac{\Delta E_{ij}}{k_{\text{B}}T}\right), \end{equation} (6) where the prefactor k represents the jump frequency (equal to about \(10^{13}\) s−1), and \(\Delta E_{ij}\) is the hopping barrier between sites i and j. At the beginning of the simulations, Cu atoms were distributed randomly among the interstitial sites according to the Boltzmann-weighted energy relationship. Afterwards, the initial state was equilibrated by running 5000 MC events per atom. After reaching equilibrium, additional 10 million MC events were run for 1000 atoms, and the self-diffusivity of Cu atoms was evaluated from their respective trajectories using the Einstein expression: \begin{equation} \mathrm{D}=\lim_{t\to\infty}\frac{1}{6t}\langle|\boldsymbol{{r}}(t)-\boldsymbol{{r}}(0)|^{2}\rangle. \end{equation} (7) Figure 5(a) describes the self-diffusivity of Cu atoms in a-Ta2O5 estimated at temperatures ranging from 500 to 1400 K. After fitting the obtained temperature dependence of the Cu self-diffusivity with the Arrhenius expression \(\mathrm{D}=\mathrm{D}_{0}\exp(-E_{\text{eff}}/k_{\text{B}}T)\), where D0 and \(E_{\text{eff}}\) were the diffusion prefactor and effective activation energy, respectively, slight differences between the activation energies calculated for the low-temperature (500–800 K) and high-temperature (900–1500 K) regions were observed. The activation energy at low temperature is 0.67 eV, which is 0.2 eV lower than the high temperature activation energy 0.87 eV. In addition, the obtained low-temperature activation energy values matched the lowest barrier energies of the periodical paths obtained in the previous subsection (4.2), suggesting that at low temperatures, one or several periodical paths with the lowest barrier energies determined the hopping behavior of Cu atoms, while the contribution of the pathways with higher barriers [which were calculated via Eq. (6)] increased rapidly with increasing temperature.

Figure 5. (Color online) (a) Cu self-diffusivity in the amorphous bulk Ta2O5 structure calculated from the results of KMC simulations. The two black lines represent the linear fits of the simulation data obtained at high temperatures (900–1400 K) and low temperatures (500–800 K). (b) The comparison of Cu diffusivity between experimental observations and KMC simulation. The data of experiments 1 and 2 are taken from Refs. 6 and 7, respectively.

Lastly, we compared the diffusion behaviors extracted from KMC simulation with the experimental observations. Two independent experiments about the Cu diffusion in amorphous Ta2O5 thin film have been reported.6,7) The diffusivities evaluated from the experiments are shown in Fig. 5(b), as well as the KMC simulation results in the same temperature range. In the first experiment, the Cu diffusivity at 200–500 °C was measured with secondary ion mass spectroscopy.6) By fitting the measured data to the Arrhenius relation, the activation energy for Cu diffusion is estimated to be 0.64 eV, while the room temperature diffusivity is estimated to be \(4.9\times 10^{-21}\) cm2/s. In the other experiment, the Cu diffusivity in amorphous Ta2O5 film deposited by electron-beam was evaluated to be \(10^{-13}\) cm2/s from cyclic voltammogram measurements.7) The discrepancy of about eight orders of magnitude in the room temperature diffusivity may be ascribed to the difference in the sample thickness and the structure of the deposited Ta2O5 film.7) Though the large discrepancy between the two experiments makes the quantitative comparison with our simulation results difficult, we may be able to say that our KMC results are reasonable. Firstly, the low temperature activation energy obtained from KMC simulation, 0.67 eV, is close to the one estimated from the first experiment, 0.64 eV. Secondly, the room temperature diffusivity extrapolated from the KMC results, \(5.7\times 10^{-17}\) cm2/s, is in between the two experimental estimates.

5. Conclusion Remarks

In this work, a simplified NN potential was developed to investigate the diffusion behavior of a single Cu atom in amorphous Ta2O5 structure. In contrast to the NN potentials that were used for describing binary and ternary systems in previous studies, the proposed technique was found to be much simpler, easier to use, and required much less training data. The reliability of the utilized NN potential was validated via energy predictions and NEB calculations. Thus, its RMSE for energy predictions was equal to 23 meV/structure for the training set and 39 meV/structure for the testing set, and the NEB calculation results obtained for selected diffusion pathways were in good agreement with the DFT simulation data.

Using the developed NN potential, the equilibrium positions and diffusion paths of Cu atoms in the amorphous Ta2O5 structure were located and characterized. The obtained energy barriers of single-hop paths ranged between 0.01 and 1.79 eV. The lowest energy barrier determined for the periodical pathways connecting equilibrium interstitial sites and their equivalent sites in the adjacent supercells was 0.68 eV. This magnitude matched the Cu diffusion activation energy calculated from the Arrhenius plot of the low-temperature self-diffusivity values generated via KMC simulations (0.67 eV) as well as the number obtained from experimental observations (0.64 eV). The conducted KMC simulations also revealed that the Cu diffusion activation energy obtained at high temperatures (0.87 eV) was larger than the magnitude calculated at low temperatures. Though the proposed simplified NN potential has disadvantages such as the inability to perform molecular dynamics simulations, it can be used as an important complementary technique, especially for studying atomic diffusion processes.


This work is partially supported by JSPS KAKENHI Grant Number JP16H01535, the Support Program for Starting up Innovation Hub funded by the Japan Science and Technology Agency (JST), and the Core Research for Evolutional Science and Technology program of the Japan Science and Technology Agency (CREST–JST). Some computations were performed using the facilities of the Supercomputer Center of the Institute of Solid State Physics, University of Tokyo.


  • 1 T. Hasegawa, K. Terabe, T. Tsuruoka, and M. Aono, Adv. Mater. 24, 252 (2012). 10.1002/adma.201102597 CrossrefGoogle Scholar
  • 2 J. J. Yang, M. D. Pickett, X. Li, D. A. A. Ohlberg, D. R. Stewart, and R. S. Williams, Nat. Nanotechnol. 3, 429 (2008). 10.1038/nnano.2008.160 CrossrefGoogle Scholar
  • 3 J. Yao, L. Zhong, D. Natelson, and J. M. Tour, Sci. Rep. 2, 242 (2012). 10.1038/srep00242 CrossrefGoogle Scholar
  • 4 Y. Yang, P. Gao, S. Gaba, T. Chang, X. Pan, and W. Lu, Nat. Commun. 3, 732 (2012). 10.1038/ncomms1737 CrossrefGoogle Scholar
  • 5 B. Xiao, T. Gu, T. Tada, and S. Watanabe, J. Appl. Phys. 115, 034503 (2014). 10.1063/1.4861724 CrossrefGoogle Scholar
  • 6 N. Banno, T. Sakamoto, N. Iguchi, H. Sunamura, K. Terabe, T. Hasegawa, and M. Aono, IEEE Trans. Electron Devices 55, 3283 (2008). 10.1109/TED.2008.2004246 CrossrefGoogle Scholar
  • 7 T. Tsuruoka, I. Valov, S. Tappertzhofen, J. van den Hurk, T. Hasegawa, R. Waser, and M. Aono, Adv. Funct. Mater. 25, 6374 (2015). 10.1002/adfm.201500853 CrossrefGoogle Scholar
  • 8 A. Urban, D.-H. Seo, and G. Ceder, npj Comput. Mater. 2, 16002 (2016). 10.1038/npjcompumats.2016.2 CrossrefGoogle Scholar
  • 9 Z. Deng, Y. Mo, and S. P. Ong, NPG Asia Mater. 8, e254 (2016). 10.1038/am.2016.7 CrossrefGoogle Scholar
  • 10 Y. A. Du and N. A. W. Holzwarth, Phys. Rev. B 76, 174302 (2007). 10.1103/PhysRevB.76.174302 CrossrefGoogle Scholar
  • 11 H. Wu, G. Chan, J. W. Choi, I. Ryu, Y. Yao, M. T. McDowell, S. W. Lee, A. Jackson, Y. Yang, L. Hu, and Y. Cui, Nat. Nanotechnol. 7, 310 (2012). 10.1038/nnano.2012.35 CrossrefGoogle Scholar
  • 12 P. Verma, P. Maire, and P. Novák, Electrochim. Acta 55, 6332 (2010). 10.1016/j.electacta.2010.05.072 CrossrefGoogle Scholar
  • 13 Z. Wang, T. Gu, T. Kadohira, T. Tada, and S. Watanabe, J. Chem. Phys. 128, 014704 (2008). 10.1063/1.2814245 CrossrefGoogle Scholar
  • 14 T. Kouno and S. Ogata, J. Phys. Soc. Jpn. 77, 054708 (2008). 10.1143/JPSJ.77.054708 LinkGoogle Scholar
  • 15 M. Nakayama, M. Kimura, R. Jalem, and T. Kasuga, Jpn. J. Appl. Phys. 55, 01AH05 (2016). 10.7567/JJAP.55.01AH05 CrossrefGoogle Scholar
  • 16 E. H. Brandt, J. Phys.: Condens. Matter 1, 10003 (1989). 10.1088/0953-8984/1/50/003 CrossrefGoogle Scholar
  • 17 Y. Deng, C. Eames, J.-N. Chotard, F. Lalère, V. Seznec, S. Emge, O. Pecher, C. P. Grey, C. Masquelier, and M. S. Islam, J. Am. Chem. Soc. 137, 9136 (2015). 10.1021/jacs.5b04444 CrossrefGoogle Scholar
  • 18 G. Pranami and M. H. Lamm, J. Chem. Theory Comput. 11, 4586 (2015). 10.1021/acs.jctc.5b00574 CrossrefGoogle Scholar
  • 19 F. Shimojo and M. Aniya, J. Phys. Soc. Jpn. 74, 1224 (2005). 10.1143/JPSJ.74.1224 LinkGoogle Scholar
  • 20 F. Shimojo and M. Aniya, J. Phys. Soc. Jpn. 72, 2702 (2003). 10.1143/JPSJ.72.2702 LinkGoogle Scholar
  • 21 O. Kamishima, Y. Iwai, T. Hattori, K. Kawamura, and J. Kawamura, J. Phys. Soc. Jpn. 79, 33 (2010). 10.1143/JPSJS.79SA.33 LinkGoogle Scholar
  • 22 Berne, G. Cicootti, and D. Coker, Classical and Quantum Dynamics in Condensed Phase Simulations (World Scientific, Singapore, 1998). CrossrefGoogle Scholar
  • 23 J. Tersoff, Phys. Rev. B 39, 5566 (1989). 10.1103/PhysRevB.39.5566 CrossrefGoogle Scholar
  • 24 M. S. Daw and M. I. Baskes, Phys. Rev. B 29, 6443 (1984). 10.1103/PhysRevB.29.6443 CrossrefGoogle Scholar
  • 25 T. P. Senftle, S. Hong, M. M. Islam, S. B. Kylasa, Y. Zheng, Y. K. Shin, C. Junkermeier, R. Engel-Herbert, M. J. Janik, H. M. Aktulga, T. Verstraelen, A. Grama, and A. C. T. van Duin, npj Comput. Mater. 2, 15011 (2016). 10.1038/npjcompumats.2015.11 CrossrefGoogle Scholar
  • 26 S. Tsuneyuki, Y. Matsui, H. Aoki, and M. Tsukada, Nature 339, 209 (1989). 10.1038/339209a0 CrossrefGoogle Scholar
  • 27 J. R. Hill and J. Sauer, J. Phys. Chem. 98, 1238 (1994). 10.1021/j100055a032 CrossrefGoogle Scholar
  • 28 K. V. J. Jose, N. Artrith, and J. Behler, J. Chem. Phys. 136, 194111 (2012). 10.1063/1.4712397 CrossrefGoogle Scholar
  • 29 J. Behler, S. Lorenz, and K. Reuter, J. Chem. Phys. 127, 014705 (2007). 10.1063/1.2746232 CrossrefGoogle Scholar
  • 30 J. Behler and M. Parrinello, Phys. Rev. Lett. 98, 146401 (2007). 10.1103/PhysRevLett.98.146401 CrossrefGoogle Scholar
  • 31 N. Artrith and J. Behler, Phys. Rev. B 85, 045439 (2012). 10.1103/PhysRevB.85.045439 CrossrefGoogle Scholar
  • 32 N. Artrith, B. Hiller, and J. Behler, Phys. Status Solidi 250, 1191 (2013). 10.1002/pssb.201248370 CrossrefGoogle Scholar
  • 33 N. Artrith, T. Morawietz, and J. Behler, Phys. Rev. B 83, 153101 (2011). 10.1103/PhysRevB.83.153101 CrossrefGoogle Scholar
  • 34 J. Behler, R. Martoňák, D. Donadio, and M. Parrinello, Phys. Rev. Lett. 100, 185501 (2008). 10.1103/PhysRevLett.100.185501 CrossrefGoogle Scholar
  • 35 R. Z. Khaliullin, H. Eshet, T. D. Kühne, J. Behler, and M. Parrinello, Nat. Mater. 10, 693 (2011). 10.1038/nmat3078 CrossrefGoogle Scholar
  • 36 R. Z. Khaliullin, H. Eshet, T. D. Kühne, J. Behler, and M. Parrinello, Phys. Rev. B 81, 100103 (2010). 10.1103/PhysRevB.81.100103 CrossrefGoogle Scholar
  • 37 J. Behler, J. Chem. Phys. 145, 170901 (2016). 10.1063/1.4966192 CrossrefGoogle Scholar
  • 38 M. Rupp, A. Tkatchenko, K.-R. Müller, and O. A. von Lilienfeld, Phys. Rev. Lett. 108, 058301 (2012). 10.1103/PhysRevLett.108.058301 CrossrefGoogle Scholar
  • 39 T. Suzuki, R. Tamura, and T. Miyazaki, Int. J. Quantum Chem. 117, 33 (2017). 10.1002/qua.25307 CrossrefGoogle Scholar
  • 40 A. P. Bartók, M. C. Payne, R. Kondor, and G. Csányi, Phys. Rev. Lett. 104, 136403 (2010). 10.1103/PhysRevLett.104.136403 CrossrefGoogle Scholar
  • 41 A. Seko, A. Takahashi, and I. Tanaka, Phys. Rev. B 90, 024101 (2014). 10.1103/PhysRevB.90.024101 CrossrefGoogle Scholar
  • 42 A. Seko, A. Takahashi, and I. Tanaka, Phys. Rev. B 92, 054113 (2015). 10.1103/PhysRevB.92.054113 CrossrefGoogle Scholar
  • 43 N. Artrith and A. Urban, Comput. Mater. Sci. 114, 135 (2016). 10.1016/j.commatsci.2015.11.047 CrossrefGoogle Scholar
  • 44 N. Artrith and A. M. Kolpak, Comput. Mater. Sci. 110, 20 (2015). 10.1016/j.commatsci.2015.07.046 CrossrefGoogle Scholar
  • 45 T. Morawietz, V. Sharma, and J. Behler, J. Chem. Phys. 136, 064103 (2012). 10.1063/1.3682557 CrossrefGoogle Scholar
  • 46 N. Artrith and A. M. Kolpak, Nano Lett. 14, 2670 (2014). 10.1021/nl5005674 CrossrefGoogle Scholar
  • 47 T. Tsuruoka, K. Terabe, T. Hasegawa, and M. Aono, Nanotechnology 21, 425205 (2010). 10.1088/0957-4484/21/42/425205 CrossrefGoogle Scholar
  • 48 T. Tsuruoka, K. Terabe, T. Hasegawa, and M. Aono, Nanotechnology 22, 254013 (2011). 10.1088/0957-4484/22/25/254013 CrossrefGoogle Scholar
  • 49 G. C. Sosso, G. Miceli, S. Caravati, F. Giberti, J. Behler, and M. Bernasconi, J. Phys. Chem. Lett. 4, 4241 (2013). 10.1021/jz402268v CrossrefGoogle Scholar
  • 50 J. Behler, J. Chem. Phys. 134, 074106 (2011). 10.1063/1.3553717 CrossrefGoogle Scholar
  • 51 S. Kondati Natarajan, T. Morawietz, and J. Behler, Phys. Chem. Chem. Phys. 17, 8356 (2015). 10.1039/C4CP04751F CrossrefGoogle Scholar
  • 52 (Supplemental Material) The convergence test of the cutoff distance of the symmetry function is availiable online. Google Scholar
  • 53 G. Kresse and J. Furthmüller, Comput. Mater. Sci. 6, 15 (1996). 10.1016/0927-0256(96)00008-0 CrossrefGoogle Scholar
  • 54 G. Kresse and J. Furthmüller, Phys. Rev. B 54, 11169 (1996). 10.1103/PhysRevB.54.11169 CrossrefGoogle Scholar
  • 55 G. Kresse and D. Joubert, Phys. Rev. B 59, 1758 (1999). 10.1103/PhysRevB.59.1758 CrossrefGoogle Scholar
  • 56 Y. Wang and J. P. Perdew, Phys. Rev. B 44, 13298 (1991). 10.1103/PhysRevB.44.13298 CrossrefGoogle Scholar
  • 57 (Supplemental Material) A detailed description of amorphous Ta2O5 structure is available online. Google Scholar
  • 58 (Supplemental Material) A detail description about how the neural network parameters are determined is available online. Google Scholar
  • 59 (Supplemental Material) The learning curve of the neural network potential is shown online as Fig. S2. Google Scholar
  • 60 G. Henkelman, B. P. Uberuaga, and H. Jónsson, J. Chem. Phys. 113, 9901 (2000). 10.1063/1.1329672 CrossrefGoogle Scholar
  • 61 (Supplemental Material) The Cu–O and Cu–Ta coordination numbers of the training structures and transient images are shown online as Fig. S1. Google Scholar
  • 62 (Supplemental Material) A detailed description of transferability of neural network potential is available online. Google Scholar
  • 63 P. Langfelder, B. Zhang, and S. Horvath, Bioinformatics 24, 719 (2008). 10.1093/bioinformatics/btm563 CrossrefGoogle Scholar
  • 64 G. A. Tritsaris, K. Zhao, O. U. Okeke, and E. Kaxiras, J. Phys. Chem. C 116, 22212 (2012). 10.1021/jp307221q CrossrefGoogle Scholar
  • 65 S. Hao and C. Wolverton, J. Phys. Chem. C 117, 8009 (2013). 10.1021/jp311982d CrossrefGoogle Scholar

Cited by

View all 21 citing articles

no accessAnalysis of Kohn–Sham Eigenfunctions Using a Convolutional Neural Network in Simulations of the Metal–Insulator Transition in Doped Semiconductors

094001, 10.7566/JPSJ.90.094001

Drawing Phase Diagrams of Random Quantum Systems by Deep Learning the Wave Functions

022001, 10.7566/JPSJ.89.022001

Machine Learning Forces Trained by Gaussian Process in Liquid States: Transferability to Temperature and Pressure

044601, 10.7566/JPSJ.88.044601

free accessCollaboration between Third and Fourth Paradigms Reveals Complex Atomic Processes

11, 10.7566/JPSJNC.14.11