We verified that the deep learning method named reading periodic table, introduced in ref. Deep Learning Model for Finding New Superconductors [T. Konno et al., arXiv:1812.01995], which utilizes deep learning to read the periodic table and learn the laws of the elements, is applicable not only to superconductors, for which it was originally developed, but also to other materials problems, by demonstrating band gap estimations. We then extended the method to learn the laws better by directly learning, at the learning representation level, the cylindrical periodicity between the rightmost and leftmost columns of the periodic table, that is, by treating these two columns as adjacent. Thus, while the original method handles the table as is, the extended method treats the periodic table as if its two edges were connected. This is achieved using novel layers named periodic convolution layers, which can handle inputs exhibiting periodicity and may also be applied to other problems, such as those in computer vision and time-series analysis, whose data possess some periodicity. In the reading periodic table method, no material feature or descriptor is required as input. We demonstrated two types of deep learning estimation: estimating the existence of a band gap, and estimating the value of the band gap given that its existence in the material is known. Finally, we discuss the limitations of the dataset and of the model evaluation method. We may be unable to distinguish good models under the random train-test split scheme; thus, we must prepare an appropriate dataset in which the training and test data are temporally separated. The code and the data are open.
Machine learning (ML) methods have been employed in the search for inorganic materials,^{1}^{,}^{2}^{)} and useful libraries have been developed.^{3}^{,}^{4}^{)} In the case of organic materials, machine learning methods have been studied, typically by employing graph structures,^{5}^{,}^{6}^{)} which are relevant to problems in computer science. Researchers have invested more effort into searching for organic materials than for inorganic ones. However, the search for inorganic materials still has a wide scope. Density functional theory (DFT),^{7}^{–}^{9}^{)} a first-principles computational approach, requires expensive computational resources, is difficult to apply to highly correlated systems, and often requires ordered crystal structures, although progress is being made to address these issues. Machine learning and DFT can complement each other, and both must be investigated further. Deep learning^{10}^{–}^{12}^{)} has achieved advances in image recognition,^{13}^{)} machine translation,^{14}^{)} image generation,^{15}^{)} natural language inference,^{16}^{,}^{17}^{)} raw audio generation,^{18}^{)} and imperfect information games.^{19}^{)} Deep learning is now finding increasing applications in mathematics and physics as well, in the fields of quantum systems,^{20}^{)} particle physics,^{21}^{)} the Gauss–Manin connection,^{22}^{)} neural networks and quantum field theory,^{23}^{)} holographic QCD,^{24}^{)} conformal field theory,^{25}^{)} black hole metrics,^{26}^{)} supergravity,^{27}^{)} Seiberg–Witten curves,^{28}^{)} Calabi–Yau manifolds,^{29}^{,}^{30}^{)} etc. Deep learning possesses superior capabilities over prior machine learning methods, such as support vector machines (SVM),^{31}^{,}^{32}^{)} random forest,^{33}^{)} and k-means.^{34}^{–}^{36}^{)}
A deep learning method for finding new superconductors was introduced in our previous study^{1}^{)} (although not technically deep learning, the critical temperature of superconductors was also studied using the random forest method, a machine learning technique, in Ref. 37). In that study,^{1}^{)} a deep learning model was trained to read the periodic table and learn the laws of the elements in order to identify novel materials not present in the data. This technique does not require the calculation or input of any feature or material descriptor. The method was named reading periodic table. This deep learning model outperformed random forest.
Nevertheless, two problems remained. First, the reading periodic table method does not use any feature specific to superconductors; it uses only the periodic table. Although the method was demonstrated to be useful for superconductor problems, it remained unclear whether it could be used to solve other material-related problems. If it can, this will open significant opportunities; thus, we verify this in the present paper. The second problem concerns the cylindrical periodicity between the left and rightmost columns of the table. The periodic table is effectively cylindrical, in contrast to its two-dimensional (2D) appearance: the left and rightmost columns are adjacent. Our previous method was not designed to let the deep learning model read the periodic table in this cylindrical form at the level of the training representation. If the layers are sufficiently deep, neural networks may learn the laws arising from the cylindrical periodicity even without any specific modification to the original reading periodic table method. However, they may learn them insufficiently well, and because deep learning is a black-box model, it is unclear whether the model actually learns the laws represented by the cylindrical periodicity. Hence, it is better to use a learning representation that allows the deep learning model to unambiguously learn these laws. We therefore extended our previous method^{1}^{)} to reflect the cylindrical periodicity at the learning representation level. We implemented this functionality as a new layer named the periodic convolution layer, which processes the periodic table as if the table were cylindrically rolled up, even though the input is the ordinary two-dimensional periodic table. This method is named reading periodic table with cylindrical periodicity.
The periodic convolution layer can also be used for other problems related to computer vision, time-series data, and so on, provided that the data under examination possess some periodicity.
To address the two aforementioned problems, we demonstrate that the extended method can predict band gaps. The band gap is a fundamental material property that forms the basis for the distinction among conductors, semiconductors, and insulators; it influences thermal and electrical conductivities, the functioning of light-emitting and laser diodes, etc. In addition, its structure is relevant to some of the topological matter that has recently been reported and is receiving considerable attention. To design functional materials, knowledge of the band gap is important. ML-based band-gap estimation has already been performed in earlier studies.^{38}^{–}^{43}^{)} We demonstrate that the proposed method of reading periodic table with cylindrical periodicity has estimation capabilities comparable to those of the SVM classification and regression studied in Ref. 43. We also highlight a problem with the random train-test split model evaluation scheme, namely that we may not be able to identify good models using it.
The main contributions of this paper are as follows. In the previous study, we introduced the method named reading periodic table, which uses deep learning to read the periodic table and estimate the critical temperature of superconductors. In this paper, we extended the method so that deep learning also learns the cylindrical periodicity directly, i.e., by accounting for the fact that the right and leftmost columns of the table are adjacent. We then demonstrated that the method is widely applicable to material-related problems beyond superconductors through two band gap estimations. This verification is necessary for future applications.
The learning representation of reading periodic table allows the deep learning model to read the periodic table and learn the laws of the elements to identify novel materials. The method is designed to learn the relative positions of the elements on the periodic table, which is difficult with a 118-dimensional one-hot representation. Our method allows us to extract the laws represented by the periodic table. No material feature or descriptor is required; instead, the features are automatically determined from the periodic table via deep learning. Furthermore, it is not necessary to input the crystal structure of the materials, which is both an advantage and a disadvantage. The advantage is as follows. Because we do not need to input spatial information, only the chemical formula is required to estimate material properties. We do not have to calculate the spatial structure of new materials via first-principles calculations, which incur high computational cost, before inputting it into the machine learning model. Moreover, we could not acquire spatial information for the experimental data on band gaps. Databases of experimentally measured spatial information are available, such as the Crystallography Open Database (COD),^{44}^{–}^{46}^{)} an open-access collection of the crystal structures of organic, inorganic, and metal–organic materials with approximately 460,000 entries as of 2020. However, there are no such databases for specific problems like band gaps, superconductors, and so forth. Determining the spatial structures of the experimental band gap data from such a database would take significant effort and time; in these databases, the same chemical formula can correspond to different spatial structures, and to determine the spatial structure of a specific material, we would need to check the original papers.
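As a sketch of how a chemical formula alone can be turned into the method's input, the snippet below places composition counts onto a small periodic-table grid split into channels. The element coordinates, grid shape, and helper names here are our own illustrative assumptions, not the authors' exact layout:

```python
import re
import numpy as np

# Hypothetical (channel, row, column) coordinates on the periodic table;
# channels separate orbital blocks as in Fig. 1. Only a few elements are
# listed here for illustration.
ELEMENT_POSITION = {
    "H":  (0, 0, 0),   # s block, period 1, group 1
    "O":  (1, 1, 15),  # p block, period 2, group 16
    "Si": (1, 2, 13),  # p block, period 3, group 14
}

def parse_formula(formula):
    """Parse a simple formula such as 'SiO2' into {'Si': 1, 'O': 2}."""
    counts = {}
    for symbol, num in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(num) if num else 1)
    return counts

def formula_to_table(formula, shape=(4, 7, 18)):
    """Place composition counts on a multi-channel periodic-table grid."""
    table = np.zeros(shape)
    for symbol, count in parse_formula(formula).items():
        ch, row, col = ELEMENT_POSITION[symbol]
        table[ch, row, col] = count
    return table
```

Because the grid encodes where each element sits, neighboring cells correspond to chemically similar elements, which is exactly what a convolution can exploit.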
The disadvantage of not inputting the crystal structure is that the capability of the machine learning model could be improved by using complete spatial-structure information, since such information plays an important role in physics. Although it is not visually apparent, the left and rightmost columns of the periodic table are adjacent; that is, the table possesses cylindrical periodicity. Thus, for the above reasons, we extend the method so that deep learning can also learn the laws represented by the cylindrical periodicity more directly at the learning representation level, as illustrated in Fig. 1. We achieve this functionality using a new layer named the periodic convolution layer, as illustrated in Fig. 2. The layer can be applied to other problems whose input data possess some periodicity.
Figure 1. (Color online) Reading periodic table with cylindrical periodicity. The composition values of a hypothetical material AB_{2}C_{6}D are entered in the periodic table, which is divided into four separate tables according to the electron orbitals. Among the four tables, for example, in the second table from the top, all values except those of the p-block elements are 0. The tables are then processed as if they were rolled up in the horizontal direction. This allows the deep learning model to learn, at the learning representation level, that the left and rightmost columns of the table are adjacent.
Figure 2. (Color online) Periodic convolution layer. Top: schematic of the periodic convolution layer that reads the periodic table with cylindrical periodicity. Bottom: the interior of the periodic convolution layer. When the learning representation of the 2D periodic table is input to the layer, the layer treats the 2D periodic table with four orbitals as if the left and rightmost columns were adjacent and the tables were cylindrically rolled up. The output of the layer is also in 2D form.
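The rolling-up described above amounts to circular padding along the column axis before an ordinary convolution. A minimal illustration of that padding step with NumPy (the toy array and sizes are our own, not the model's actual dimensions):

```python
import numpy as np

# A toy 3x5 "periodic table": 3 rows, 5 columns.
table = np.arange(15).reshape(3, 5)

# Pad one column on each side in 'wrap' mode: the leftmost padded column
# is a copy of the rightmost real column and vice versa, so a kernel
# sliding over the padded array sees the two edges as adjacent.
padded = np.pad(table, ((0, 0), (1, 1)), mode="wrap")
```

A standard convolution applied to `padded` (with no further horizontal padding) then behaves as if the table were a cylinder.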
For both estimation tasks, we used the absolute representation in the method of reading periodic table with cylindrical periodicity, where the learning rate is
In the absolute representation, H_{2}O is represented as H: 2, O: 1, whereas in the relative representation, it is H: 2/3, O: 1/3.
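A minimal sketch of this normalization, assuming the relative representation simply divides each count by the total number of atoms (the helper name `to_relative` is ours):

```python
def to_relative(counts):
    """Normalize absolute composition counts so they sum to one,
    e.g. {'H': 2, 'O': 1} -> {'H': 2/3, 'O': 1/3}."""
    total = sum(counts.values())
    return {element: n / total for element, n in counts.items()}
```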
Basic periodic block. To explain the model structure, we first describe the basic periodic block. Let a basic periodic block be denoted by basic periodic block [input channel, inside channel, output channel, kernel size], which is composed of the following layers: a periodic convolution [input channel, output channel = inside channel, kernel size], followed by batch normalization, a ReLU, a periodic convolution [input channel = inside channel, output channel, kernel size], batch normalization, and a final ReLU. The block also has a skip connection from its input to just before the last ReLU.
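The block above can be sketched as a simplified NumPy forward pass. This is an illustration under stated assumptions, not the released implementation: batch normalization is reduced to inference-style per-channel normalization without learned parameters, weights are supplied explicitly, and the skip addition assumes equal input and output channel counts:

```python
import numpy as np

def periodic_conv(x, weight):
    """Cross-correlate x (C_in, H, W) with weight (C_out, C_in, k, k),
    padding with zeros along rows but circularly along columns, so the
    left and rightmost columns of the table are treated as adjacent."""
    c_out, c_in, k, _ = weight.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (0, 0)))                 # zero-pad rows
    xp = np.pad(xp, ((0, 0), (0, 0), (p, p)), mode="wrap")   # wrap columns
    _, H, W = x.shape
    out = np.zeros((c_out, H, W))
    for o in range(c_out):
        for i in range(H):
            for j in range(W):
                out[o, i, j] = np.sum(weight[o] * xp[:, i:i + k, j:j + k])
    return out

def batch_norm(x, eps=1e-5):
    """Per-channel normalization (inference-style sketch, no learned scale)."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def relu(x):
    return np.maximum(x, 0.0)

def basic_periodic_block(x, w1, w2):
    """periodic conv -> batch norm -> ReLU -> periodic conv -> batch norm,
    add the skip connection from the block input, then the final ReLU."""
    h = relu(batch_norm(periodic_conv(x, w1)))
    h = batch_norm(periodic_conv(h, w2))
    return relu(h + x)  # skip connection joins just before the last ReLU
```

In a practical deep learning framework, the same effect is obtained by circularly padding the width dimension before each standard convolution.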
The structure of the neural network is as follows. First, we have the periodic convolution [input channel = 4, output channel = 10, kernel size =
Here, we demonstrate two types of estimation: (1) estimating the existence of a band gap, i.e., predicting whether a material is a non-metal, and (2) estimating the value of the band gap given that its existence in the material is known in advance. We used the dataset employed in the previous SVM study^{43}^{)} after making corrections and removing inappropriate data, so that our results can be compared with the SVM-based results. It is noteworthy that the previous study used an existing method, SVM, whereas we devised a novel method.
We use the experimental data of 3734 gapped materials, of which 2473 are unique compositions. For the binary classification, we also use first-principles calculation data of 2450 non-gapped materials from the Materials Project.^{47}^{)} The numbers are balanced to avoid possible bias caused by an imbalance between the numbers of gapped and non-gapped materials, and only unique compositions are used. For the regression of band gaps, we use only the experimental data; all 3734 gapped materials are used, in accordance with previous studies.
We randomly split the dataset into training and test data in a ratio of 80 to 20. The results of the binary classification are summarized in Table I, and the ROC curve is illustrated in Fig. 3.
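The random 80:20 split can be sketched as follows (the function name and seed are illustrative):

```python
import random

def random_split(data, test_fraction=0.2, seed=0):
    """Shuffle a dataset and split it into training and test parts."""
    items = list(data)
    random.Random(seed).shuffle(items)
    n_test = int(round(len(items) * test_fraction))
    return items[n_test:], items[:n_test]  # (train, test)
```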
Figure 3. (Color online) ROC curve for the binary classification of band-gap existence for the test data.

Our model exhibits good scores that are comparable to those obtained in the previous study, in which accuracy = 0.92, precision = 0.89, recall = 0.95, f1 = 0.92, and area under the curve (AUC) = 0.97. We now compare our results with those of the previous study. The AUCs are the same. Among the four basic metrics (accuracy, precision, recall, and f1), the previous model is better only in terms of recall, and the accuracies are the same. Our model is better in terms of precision and f1. f1 is particularly important because it is an integrated measure of precision and recall.
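For reference, the four basic metrics quoted above can be computed from a confusion matrix as follows (a generic sketch, not the authors' evaluation code):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and f1 from binary labels (1 = gapped)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        # f1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
    }
```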
The dataset used had 3734 compositions. We did not remove identical compositions with different band gaps, in accordance with the previous study, as these may differ in crystal structure, which could not be determined from the available data. We again randomly split the dataset into training and test data in a ratio of 80 to 20. The results of the band-gap regression (for the test data) are summarized in Table II, and the scatter plot is shown in Fig. 4.
Figure 4. (Color online) Scatter plot for the band-gap regression: predicted vs true band-gap values for the test data.

In this study, we extended the deep-learning-based reading periodic table method, which reads the periodic table and learns the laws of the elements. We solved two problems. First, we determined whether the deep learning method of reading periodic table^{1}^{)} can be used not only for superconductors but also for other material-related problems. Second, we made direct use of the cylindrical periodicity between the left and rightmost columns of the periodic table.
Through these extensions, the reading periodic table method can now also learn, at the learning representation level, the laws represented by the cylindrical periodicity of the periodic table, i.e., the fact that the left and rightmost columns of the table are adjacent. We implemented this functionality through a new layer named the periodic convolution layer. Even if the input to the layer is an ordinary 2D periodic table, the layer treats the table as if it were rolled up horizontally, and the output is also in a 2D form with arbitrary depth so that succeeding layers can handle it in the same manner. The periodic convolution layer can also be applied to other problems related to computer vision, time-series data, and so on, provided that the input data have some periodicity.
We demonstrated the applicability of the method to other material-related problems by solving two problems related to the band gap. The deep learning method satisfactorily provides a binary prediction of band-gap existence and a regression of the band-gap values. Training and test data obtained by randomly splitting the dataset were used to compare our results with those of previous studies. However, because test data obtained in this way are inevitably similar to the training data, estimation on such test data may become very easy for machine learning. To evaluate models more accurately, we require an appropriately prepared dataset. The data should be divided temporally, such that the data up to a certain year constitute the training data and the data after that year constitute the test data, as discussed in Ref. 1. This matches the situation in which machine learning models will actually be used for material search, and model evaluation should be performed under similar conditions. However, because we could not determine when the data were obtained, we could not temporally separate the dataset in this study. Contrary to the naive intuition of a machine learning novice, the preparation of an appropriate dataset is very difficult and is among the most crucial contributions to the progress of the field. We stress that the development of such a dataset remains a challenge for the machine learning community; with an appropriately prepared dataset, we would be better able to distinguish good models and to build them.
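The temporal split discussed above can be sketched as follows, assuming each record carries its year of acquisition (the record layout and names are illustrative):

```python
def temporal_split(records, cutoff_year):
    """Split (year, sample) records so that training data precede the
    cutoff year and test data follow it, mimicking how a trained model
    would actually be used to search for not-yet-known materials."""
    train = [r for r in records if r[0] <= cutoff_year]
    test = [r for r in records if r[0] > cutoff_year]
    return train, test
```

Unlike a random split, this evaluation cannot be inflated by near-duplicates of training materials appearing in the test set.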
The data used for the band-gap existence binary classification, the data used for the band-gap regression, the code for the periodic convolution layer, and the code for the neural network can be found at the link given in Ref. 49, because links cannot be embedded directly in this journal. Readers may use the code and the data provided that they cite this paper. Furthermore, the code that transforms chemical formulas into the reading periodic table data format is available under the same condition. See the link given in Ref. 49 for the detailed conditions.
Acknowledgments
The author thanks Hodaka Kurokawa for his suggestion to tackle band gap estimation and for finding the data, Ya Zhuo and Jakoah Brgoch for communication, the conference Deep Learning and Physics 2019 at Yukawa Institute of Theoretical Physics, and Yoshiaki Shimada, who is not an author, for very fruitful discussions through which the idea of rolling up the periodic table came up at National Institute of Information and Communications Technology Open Summit 2019.
References
 1 T. Konno, H. Kurokawa, F. Nabeshima, Y. Sakishita, R. Ogawa, I. Hosako, and A. Maeda, arXiv:1812.01995.
 2 W. F. Schneider and H. Guo, J. Phys. Chem. Lett. 9, 569 (2018). 10.1021/acs.jpclett.8b00009
 3 S. P. Ong, W. D. Richards, A. Jain, G. Hautier, M. Kocher, S. Cholia, D. Gunter, V. L. Chevrier, K. A. Persson, and G. Ceder, Comput. Mater. Sci. 68, 314 (2013). 10.1016/j.commatsci.2012.10.028
 4 L. Ward, A. Agrawal, A. Choudhary, and C. Wolverton, npj Comput. Mater. 2, 16028 (2016). 10.1038/npjcompumats.2016.28
 5 Z. Zhang, P. Cui, and W. Zhu, arXiv:1812.04202.
 6 J. Zhou, G. Cui, Z. Zhang, C. Yang, Z. Liu, and M. Sun, arXiv:1812.08434.
 7 A. J. Cohen, P. Mori-Sánchez, and W. Yang, Chem. Rev. 112, 289 (2012). 10.1021/cr200107z
 8 R. O. Jones, Rev. Mod. Phys. 87, 897 (2015). 10.1103/RevModPhys.87.897
 9 N. Mardirossian and M. Head-Gordon, Mol. Phys. 115, 2315 (2017). 10.1080/00268976.2017.1333644
 10 K. P. Murphy, Machine Learning: A Probabilistic Perspective (MIT Press, London, 2012).
 11 I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT Press, London, 2016) Vol. 1.
 12 A. Zhang, Z. C. Lipton, M. Li, and A. J. Smola, Dive into Deep Learning (Online Book, 2019) http://www.d2l.ai.
 13 A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems (2012) Vol. 25, p. 1097.
 14 A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, Attention is all you need, in Advances in Neural Information Processing Systems (2017) Vol. 30.
 15 I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, Generative adversarial nets, in Advances in Neural Information Processing Systems (2014) Vol. 27, p. 2672.
 16 M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, arXiv:1802.05365.
 17 J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, arXiv:1810.04805.
 18 A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. Senior, and K. Kavukcuoglu, arXiv:1609.03499.
 19 A. Blair and A. Saffidine, Science 365, 864 (2019). 10.1126/science.aay7774
 20 T. Ohtsuki and T. Mano, J. Phys. Soc. Jpn. 89, 022001 (2020). 10.7566/JPSJ.89.022001
 21 D. Bourilkov, arXiv:1912.08245.
 22 K. Heal, A. Kulkarni, and E. C. Sertöz, arXiv:2007.13786.
 23 J. Halverson, A. Maiti, and K. Stoner, arXiv:2008.08601.
 24 K. Hashimoto, H.-Y. Hu, and Y. You, arXiv:2006.00712.
 25 H. Chen, Y. He, S. Lal, and M. Z. Zaz, arXiv:2006.16114.
 26 Y.-K. Yan, S. Wu, X. Ge, and Y. Tian, arXiv:2004.12112.
 27 C. Krishnan, V. Mohan, and S. Ray, arXiv:2002.12927.
 28 Y. He, E. Hirst, and T. Peterken, arXiv:2004.05218.
 29 Y.-H. He and A. Lukas, arXiv:2009.02544 [hep-th].
 30 H. Erbin and R. Finotello, arXiv:2007.15706.
 31 B. E. Boser, I. M. Guyon, and V. N. Vapnik, Proceedings of the Fifth Annual Workshop on Computational Learning Theory, COLT '92, 1992, p. 144. 10.1145/130385.130401
 32 C. Cortes and V. Vapnik, Mach. Learn. 20, 273 (1995). 10.1007/BF00994018
 33 L. Breiman, Mach. Learn. 45, 5 (2001). 10.1023/A:1010933404324
 34 H. Steinhaus, Bull. Acad. Pol. Sci., Cl. III 4, 801 (1957) [in French].
 35 J. MacQueen, Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1967, p. 281.
 36 E. W. Forgy, Biometrics 21, 768 (1965).
 37 V. Stanev, C. Oses, A. G. Kusne, E. Rodriguez, J. Paglione, S. Curtarolo, and I. Takeuchi, npj Comput. Mater. 4, 29 (2018). 10.1038/s41524-018-0085-8
 38 T. Gu, W. Lu, X. Bao, and N. Chen, Solid State Sci. 8, 129 (2006). 10.1016/j.solidstatesciences.2005.10.011
 39 Z. Zhaochun, P. Ruiwu, and C. Nianyi, Mater. Sci. Eng. B 54, 149 (1998). 10.1016/S0921-5107(98)00157-3
 40 P. Dey, J. Bible, S. Datta, S. Broderick, J. Jasinski, M. Sunkara, M. Menon, and K. Rajan, Comput. Mater. Sci. 83, 185 (2014). 10.1016/j.commatsci.2013.10.016
 41 J. Lee, A. Seko, K. Shitara, K. Nakayama, and I. Tanaka, Phys. Rev. B 93, 115104 (2016). 10.1103/PhysRevB.93.115104
 42 G. Pilania, A. Mannodi-Kanakkithodi, B. Uberuaga, R. Ramprasad, J. Gubernatis, and T. Lookman, Sci. Rep. 6, 19375 (2016). 10.1038/srep19375
 43 Y. Zhuo, A. M. Tehrani, and J. Brgoch, J. Phys. Chem. Lett. 9, 1668 (2018). 10.1021/acs.jpclett.8b00124
 44 S. Gražulis, A. Daškevič, A. Merkys, D. Chateigner, L. Lutterotti, M. Quiros, N. R. Serebryanaya, P. Moeck, R. T. Downs, and A. Le Bail, Nucleic Acids Res. 40, D420 (2011). 10.1093/nar/gkr900
 45 S. Gražulis, D. Chateigner, R. T. Downs, A. F. T. Yokochi, M. Quirós, L. Lutterotti, E. Manakova, J. Butkus, P. Moeck, and A. Le Bail, J. Appl. Crystallogr. 42, 726 (2009). 10.1107/S0021889809016690
 46 R. T. Downs and M. Hall-Wallace, Am. Mineral. 88, 247 (2003).
 47 A. Jain, S. P. Ong, G. Hautier, W. Chen, W. D. Richards, S. Dacek, S. Cholia, D. Gunter, D. Skinner, G. Ceder, and K. A. Persson, APL Mater. 1, 011002 (2013). 10.1063/1.4812323
 48 (Supplemental Material) More discussion on regression and classification is provided online.
 49 Web [https://github.com/tomo835g/Superconductors].