J. Phys. Soc. Jpn. 91, 091013 (2022) [12 Pages]
SPECIAL TOPICS: Hyper-Ordered Structures: Recent Progress and Future Perspectives

Persistent Homology Analysis for Materials Research and Persistent Homology Software: HomCloud

1 Cyber-Physical Engineering Informatics Research Core (Cypher), Okayama University, Okayama 700-8530, Japan
2 National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Ibaraki 305-8568, Japan
3 Kyoto University Institute for Advanced Study, WPI-ASHBi, Kyoto University, Sakyo, Kyoto 606-8501, Japan
4 Center for Advanced Intelligence Project, RIKEN, Chuo, Tokyo 103-0027, Japan

This paper introduces persistent homology, which is a powerful tool to characterize the shape of data using the mathematical concept of topology. We explain the fundamental idea of persistent homology from scratch using some examples. We also review some applications of persistent homology to materials research and software for persistent homology data analysis. HomCloud, one of the persistent homology software packages, is especially featured in this paper.

©2022 The Author(s)
This article is published by the Physical Society of Japan under the terms of the Creative Commons Attribution 4.0 License. Any further distribution of this work must maintain attribution to the author(s) and the title of the article, journal citation, and DOI.
1. Introduction

This paper introduces the concept of persistent homology (PH), a data analysis technique based on the mathematical concept of topology. PH utilizes topological structures such as connected components, holes, rings, and voids to characterize the shape of data at multiple scales.

We often have the opportunity to analyze the geometric structure of materials science data. From atomic-scale data obtained by molecular dynamics simulations to larger-scale data obtained by various types of microscopy, materials science provides many spatial structures. It is a typical problem to investigate how such structures are correlated with physical parameters and the physical properties of the materials.

To investigate these problems, we need to quantify the geometric structures. PH enables us to summarize the shape of data quantitatively using mathematics. From the viewpoint of materials informatics, PH provides descriptors of the shape of data.

The output of PH is a persistence diagram or a persistence barcode. The diagram and the barcode carry the same information and differ only in how it is visualized. Therefore we will always use the persistence diagram in this paper.

This paper is organized as follows: Section 2 introduces the history and recent research trends in PH. Section 3 describes the theoretical foundations of PH, which will help readers to understand how PH extracts the geometric features of data. Section 4 introduces the applications of PH to materials research, which will help readers to understand what kind of data is suitable for PH. Section 5 introduces the PH software, especially HomCloud, which is mainly developed by Obayashi, one of the authors. Section 6 summarizes this review paper and gives some concluding remarks.

2. A Brief History of PH

The concept of PH first appeared in the paper1) by Edelsbrunner et al. in 2002. They studied topological persistence in a growing sequence of simplicial complexes in \(\mathbf{R}^{3}\). Then, Zomorodian and Carlsson2) introduced a mathematical framework of PH based on finitely-generated graded modules over polynomial rings \(\mathbf{k}[t]\) with one variable t, where \(\mathbf{k}\) is a field. This framework generalized the original concept of PH and enlarged the applicability of PH into wider classes of geometric problems. Furthermore, they clarified the so-called structure theorem of PH, i.e., under a certain finiteness condition, any PH can be uniquely decomposed into a direct sum of intervals (see Sect. 3.2). This property provides a natural visualization of PH, called persistence diagrams, by plotting points with birth and death endpoints of intervals on the plane. The persistence diagrams show topological summaries of our input data in a multi-scale way.

For PH to be applied to practical data analysis, it must be clarified whether the persistence diagram is stable with respect to small perturbations. This property is important because practical data often contain noise or errors, and the output of PH should be stable under such small perturbations in order to capture the essential structure of the data. This question was first settled by Cohen-Steiner et al.,3) who proved that persistence diagrams have this stability property. Several generalizations of this result have been reported since.4,5)

The structure and stability theorems explained here are the most important properties in applying PH to practical problems. The former provides a compact descriptor showing the topological multi-scale structure of data, and the latter guarantees its stability with respect to small perturbations. In fact, PH has recently been applied in various fields of science, including sensor networks,6) materials science,7,8) biological evolution,9) biomolecular structural analysis,10) brains,11) and cosmology.12) In those applications, the ability of PH to characterize topological structures in a multi-scale way plays an important role in providing new insights. In this paper, we focus on the applications to materials science and show some of those successful examples.

We also remark that these various applications strongly motivate further mathematical studies of PH. For example, some methods for inverse problems and machine learning with PH have been developed with practical applications to materials science in mind.13,14) For details of the recent development of PH-based machine learning and its applications, we refer to the survey paper.15)

3. Foundation of PH

This section describes how to characterize the shape of data quantitatively in the form of persistence diagrams. PH is available for various kinds of data such as pointclouds and bitmaps, but we mainly consider pointclouds as input data in this paper.

A pointcloud is a finite set of points. A typical pointcloud is a set of atomic configuration data. We can analyze such data using PH. The outline of PH is explained by the examples in Figs. 1 and 2.


Figure 1. Filtration and PD (a) Input data (b) Pointcloud and discs with \(r = b_{1}\) (c) \(r=b_{2}\) (d) \(r=d_{2}\) (e) \(r=d_{1}\) (f) PD.


Figure 2. PDs for regular tetrahedral and octahedral points. (a) PD1 for tetrahedral points (b) PD2 for tetrahedral points (c) a regular tetrahedron (d) PD1 for octahedral points (e) PD2 for octahedral points (f) a regular octahedron.

The pointcloud in Fig. 1(a) has no ring or hole since all points are separated, but it looks like it has two rings due to the proximity of the points. To construct topological structures on the data, we put discs of the same radius on all points, and holes appear as in Figs. 1(b)–1(e). We can count the number of holes in these figures.

The problem here is to determine the radius of discs. The number of holes changes when the radius changes. The fundamental idea of PH is to consider the changing process instead of fixing the radius. When the radius r gradually becomes larger from zero, holes appear and disappear as in Figs. 1(b)–1(e). The increasing process of the shapes is called a filtration, and the theory of PH makes pairs of appearance and disappearance of holes. In Fig. 1, \((b_{1}, d_{1})\) and \((b_{2}, d_{2})\) are paired. The radius of appearance is called a birth time, the radius of disappearance is called a death time, and the pair of birth and death times is called a birth-death pair. The set of birth-death pairs with multiplicity is called a persistence diagram (PD). The PD is often visualized by a scatter plot or 2D histogram [Fig. 1(f)]. We remark that the birth times and death times are sometimes squared according to conventions in computational geometry and topological data analysis.

PH is applicable to data of any dimension. For 3D data, we use balls instead of discs. In homology theory, 3D geometric objects have three types of homology information, and each type is characterized by dimension. 0D homology has information about connectivity, 1D homology has information about holes or rings, and 2D homology has information about cavities or bubbles. Corresponding 0D, 1D, and 2D PDs are available. Since we consider rings in Fig. 1, the PD in Fig. 1(f) is a 1D PD.
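The 0D case is simple enough to sketch directly. The following minimal Python example is an illustration only and is not taken from any PH software: it assumes the union-of-balls model with unsquared radii, so that every connected component is born at r = 0 and two components merge when r reaches half the distance between their closest points. The function name `zero_dim_persistence` and the sample coordinates are hypothetical.

```python
import itertools
import math


def zero_dim_persistence(points):
    """Death radii of the finite 0D birth-death pairs (all births are 0).

    Two components merge when the balls around their closest points
    touch, that is, at r = distance / 2.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    deaths = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # this edge merges two components
            parent[root_i] = root_j
            deaths.append(dist / 2)
    return deaths  # n - 1 values; one component never dies


# Hypothetical three-point example on a line.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
deaths = zero_dim_persistence(points)
```

For these three points the two finite pairs die at r = 0.5 and r = 2.0, and the remaining component persists for all r.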

One advantage of PH is its mathematical background. The structure theorem of PH gives the algorithm to compute the diagram, and the theorem also ensures the uniqueness of the diagram. It means that the same input gives the same diagram, unlike the Monte-Carlo method or stochastic gradient descent. The stability theorem of PH ensures that the small change of the input data causes only small changes of output PD. These theorems play an important role in reliable data analysis.

To understand PDs, we examine some PDs for typical pointclouds. Figures 2(a) and 2(b) show the PDs of regular tetrahedral points [Fig. 2(c)] where a is the distance between two points. PD1 for regular tetrahedral points has three birth-death pairs, and all of them are \((a/2, a/\sqrt{3})\). PD2 has one birth-death pair \((a/\sqrt{3}, a\sqrt{3/8})\). Three pairs in PD1 correspond to triangles of the tetrahedron, and one pair in PD2 corresponds to the void at the center of the points. \(a/2\) is half the length of the edge, \(a/\sqrt{3}\) is the circumradius of the triangle, and \(a\sqrt{3/8}\) is the circumradius of the tetrahedron.

Readers may wonder why PD1 has three pairs, not four. The reason comes from the mathematical theory of homology. We will discuss the background in Sect. 3.2.

We can apply the same idea to the regular octahedral points. Figures 2(d) and 2(e) show the PDs for the pointcloud. PD1 for the regular octahedral points has seven pairs at \((a/2, a/\sqrt{3})\). These pairs correspond to the triangles of the octahedron. The PD1 does not have pairs corresponding to the eighth triangle and the square for the same reason as the tetrahedron. PD2 has one birth-death pair at \((a/\sqrt{3}, a/\sqrt{2})\) corresponding to the void at the center of the octahedron.
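The birth and death values quoted for the tetrahedral and octahedral points can be checked numerically. The sketch below is only a sanity check under stated assumptions: it uses one particular choice of coordinates for each solid, and the helper `centroid_to_vertex` (a hypothetical name) relies on the fact that for a regular triangle, tetrahedron, or octahedron the circumcenter coincides with the centroid.

```python
import math


def centroid_to_vertex(vertices):
    """Distance from the centroid to the first vertex.

    This equals the circumradius only for regular figures, where the
    circumcenter and the centroid coincide.
    """
    dim = len(vertices[0])
    centroid = [sum(v[k] for v in vertices) / len(vertices) for k in range(dim)]
    return math.dist(centroid, vertices[0])


# Regular tetrahedron (one convenient coordinate choice).
tetra = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
a = math.dist(tetra[0], tetra[1])          # edge length
tetra_birth1 = a / 2                       # edges appear
tetra_death1 = centroid_to_vertex(tetra[:3])   # triangular face is filled
tetra_death2 = centroid_to_vertex(tetra)       # central void is filled

# Regular octahedron (one convenient coordinate choice).
octa = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
a_oct = math.dist(octa[0], octa[2])        # edge length
octa_death1 = centroid_to_vertex([octa[0], octa[2], octa[4]])
octa_death2 = centroid_to_vertex(octa)
```

The computed values agree with \(a/\sqrt{3}\) for the triangles, \(a\sqrt{3/8}\) for the tetrahedron, and \(a/\sqrt{2}\) for the octahedron.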

The PDs of the hcp crystalline structure are in some sense the combination of the PDs for tetrahedral points and octahedral points. PD1 for the hcp has only birth-death pairs at \((a/2, a/\sqrt{3})\) corresponding to the triangles, and PD2 has birth-death pairs at \((a/\sqrt{3}, a\sqrt{3/8})\) and \((a/\sqrt{3}, a/\sqrt{2})\) corresponding to the tetrahedral sites and octahedral sites in the hcp structure. The fcc crystalline structure also has tetrahedral sites and octahedral sites, and the PDs for the fcc points are exactly the same as those for hcp.

Geometric representations

The concept of PH is applicable to various data. We now describe some geometric representations to understand available data for PH.

One important geometric representation is called a simplicial complex. A simplicial complex is composed of points, line segments, triangles, tetrahedrons, and higher dimensional counterparts.

Formally, an n-dimensional simplicial complex is a finite set of k-simplices for \(k=0,\ldots,n\), where each k-simplex is represented by \(k+1\) vertices. A 0-simplex is a point, a 1-simplex is a line segment, a 2-simplex is a triangle, and a 3-simplex is a tetrahedron.

For an n-simplex σ and \(k < n\), a k-simplex included in σ is called a face of σ if all vertices of the k-simplex are also vertices of σ.

A finite set of simplices X is called a simplicial complex if X satisfies the following two conditions:

(1) If \(\sigma\in X\), any face of σ is contained in X.

(2) The intersection of two simplices in X is either empty or a common face of both.

Condition (2) means that two simplices are glued together by a lower-dimensional simplex.

A purely combinatorial description of a simplicial complex is called an abstract simplicial complex. For a finite set V, a family Σ of subsets of V is called an abstract simplicial complex if the following two conditions hold:

(1) For any \(v\in V\), \(\{v\}\in \Sigma\).

(2) If \(\sigma\in \Sigma\) and \(\tau\subset \sigma\), then \(\tau\in \Sigma\).

Of course, we can regard a simplicial complex as an abstract simplicial complex. An abstract simplicial complex is helpful to represent a geometric object on a computer.
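To illustrate how an abstract simplicial complex can be held on a computer, the following sketch stores each simplex as a Python `frozenset` of vertex labels and checks the two defining conditions. It is an illustration only; the function name is hypothetical, and it assumes that only nonempty subsets are required to belong to the family.

```python
from itertools import combinations


def is_abstract_simplicial_complex(vertices, family):
    """Check the two conditions of an abstract simplicial complex.

    Assumption: the empty set is not treated as a simplex, so only
    nonempty proper subsets are tested in the second condition.
    """
    vertex_set = set(vertices)
    simplices = {frozenset(s) for s in family}
    # Every simplex must be a subset of the vertex set V.
    if any(not s <= vertex_set for s in simplices):
        return False
    # Condition 1: every vertex appears as a singleton.
    if any(frozenset([v]) not in simplices for v in vertex_set):
        return False
    # Condition 2: every nonempty proper subset of a simplex is a simplex.
    for simplex in simplices:
        for size in range(1, len(simplex)):
            for face in combinations(simplex, size):
                if frozenset(face) not in simplices:
                    return False
    return True


# Boundary of a triangle: three vertices and three edges.
hollow_triangle = [{0}, {1}, {2}, {0, 1}, {1, 2}, {0, 2}]
# A triangle listed without its edges violates the second condition.
missing_edges = [{0}, {1}, {2}, {0, 1, 2}]

ok = is_abstract_simplicial_complex([0, 1, 2], hollow_triangle)
bad = is_abstract_simplicial_complex([0, 1, 2], missing_edges)
```

Here `ok` is true and `bad` is false, because the family `missing_edges` contains a triangle whose edges are absent.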

One important simplicial complex is the alpha complex.16) We can construct an alpha complex with radius parameter r from a pointcloud as a subset of the Delaunay triangulation using the Voronoi diagram. An alpha complex with parameter r has the same topological information as the union-of-discs model shown in Fig. 1. Therefore we usually use an alpha complex to represent the union-of-discs model such as Figs. 1(b)–1(e).

We explain the construction of an alpha complex using the example shown in Fig. 3. Figure 3(a) shows a pointcloud and its Voronoi diagram, (b) shows the overlay image of (a) and discs, and (c) shows the alpha complex of the pointcloud. The triple intersection P in (b) corresponds to the triangle \(P'\) in (c), and the double intersections in (b) (shown in dotted lines) correspond to edges in (c). Since there exists no triple intersection at Q in (b), the alpha complex has no triangle at \(Q'\) in (c). Figures 3(b) and 3(c) have the same topological information. For example, both (b) and (c) have one hole (Q and \(Q'\)) and two connected components. The nerve theorem from algebraic topology ensures that the union of discs and the alpha complex give the same topological information. A filtration by alpha complexes is called an alpha filtration.


Figure 3. Construction of an alpha complex (a) Pointcloud and Voronoi diagram (b) Overlay image of discs and Voronoi diagram (c) Alpha complex.

We can extend the idea of the alpha complex to 3D or higher dimensional spaces by considering quadruple intersections and more.

Other important simplicial complexes for data analysis are the clique complex and the Vietoris–Rips complex. We can define a clique complex from an undirected graph. For a graph \(G = (V, E)\), a set of vertices \(C = \{v_{0},\ldots, v_{k}\}\) is called a clique if any pair \((v_{i}, v_{j})\) with \(i\neq j\) is adjacent, that is, the graph G has an edge between \(v_{i}\) and \(v_{j}\). The set of all cliques is called a clique complex. It is easy to show that the clique complex satisfies the conditions of an abstract simplicial complex, and we write it as \(X(G)\). By using clique complexes, we can construct a filtration from a weighted graph. For a weight \(w: E\to \mathbb{R}\) and \(a\in \mathbb{R}\), \(G_{a} = (V_{a}, E_{a})\) is a subgraph of G, where:

\begin{equation} \begin{split} V_{a} &= V,\\ E_{a} &= \{e \in E \mid w(e) \leq a \}. \end{split} \tag{1} \end{equation}

\(X(G_{a})\) grows monotonically as a increases, so \(\{X(G_{a})\}_{a\in \mathbb{R}}\) gives a filtration of clique complexes.

By considering a finite number of points V in a metric space, we can construct a filtration called a Vietoris–Rips filtration using the complete graph on V and the weight function \(w(\{v_{i}, v_{j}\}) = d(v_{i}, v_{j})\), where \(d(v_{i}, v_{j})\) is the distance between two points.
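The construction can be sketched in a few lines of Python. The example below is a brute-force illustration, not an efficient implementation: it enumerates every vertex subset up to a chosen size and keeps those that are cliques of \(G_{a}\), with the edge weight taken to be the Euclidean distance itself. The function name `rips_complex`, the parameter `max_dim`, and the unit-square data are assumptions made for this illustration.

```python
import math
from itertools import combinations


def rips_complex(points, a, max_dim=2):
    """Simplices (as index tuples) of the clique complex X(G_a).

    G_a has an edge {i, j} whenever d(points[i], points[j]) <= a.
    Only simplices of dimension <= max_dim are enumerated.
    """
    n = len(points)

    def adjacent(i, j):
        return math.dist(points[i], points[j]) <= a

    simplices = []
    for size in range(1, max_dim + 2):
        for candidate in combinations(range(n), size):
            # A vertex subset is a simplex iff all its pairs are adjacent.
            if all(adjacent(i, j) for i, j in combinations(candidate, 2)):
                simplices.append(candidate)
    return simplices


# Corners of a unit square: sides have length 1, diagonals sqrt(2).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
small = rips_complex(square, 1.0)  # 4 vertices and 4 sides
large = rips_complex(square, 1.5)  # diagonals enter, so triangles appear
```

At a = 1.0 the complex is the hollow square (8 simplices), which has one ring. At a = 1.5 the diagonals are included, all four triangles become cliques (14 simplices), and the ring is filled. Because every vertex subset is tested, the cost grows quickly with the number of points.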

Vietoris–Rips filtrations are sometimes used for PH data analysis. In the study of molecular phylogenetics by Chen et al.,9) Vietoris–Rips filtrations are constructed from a genomic distance between genomic sequences.

One advantage of Vietoris–Rips filtrations is that they can be used as long as pairwise distances are available. Their disadvantages are the high computation cost and the lack of the direct geometric correspondence that alpha filtrations provide.

We also use cubical complexes to analyze pixel and voxel data. A cubical complex consists of points, line segments, squares, cubes, and higher-dimensional counterparts. We can construct a filtration of cubical complexes by considering a level function on pixels. The concept is directly applicable to gray-scale bitmap data. This type of filtration is called a “level-set filtration”. Some studies17–20) apply a cubical filtration to 2D or 3D binary bitmap data using a distance transform.
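As a rough illustration of a level-set filtration, the sketch below thresholds a small gray-scale bitmap at increasing levels and counts the connected components of the pixels at or below each level. It does not compute a persistence diagram; it only shows how the shape grows with the level. The 4-neighbor connectivity, the function name, and the pixel values are assumptions made for this example, and actual software may adopt a different convention.

```python
def count_components(image, level):
    """Number of 4-connected components of pixels with value <= level."""
    rows, cols = len(image), len(image[0])
    seen = set()
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] > level or (r, c) in seen:
                continue
            # New component found: flood-fill it with a stack.
            count += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and image[ny][nx] <= level):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return count


# Hypothetical 3x3 gray-scale bitmap with a bright middle column.
image = [
    [0, 5, 1],
    [4, 5, 2],
    [3, 5, 1],
]
levels = (0, 1, 2, 4, 5)
counts = [count_components(image, level) for level in levels]
```

For this bitmap the component counts at levels 0, 1, 2, 4, and 5 are 1, 3, 2, 2, and 1: components appear at local minima and merge as the level rises, which is the information a 0D PD of the level-set filtration records as birth-death pairs.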

Mathematical foundation

PH is based on the theory of homology, a part of topology theory. Since this paper is intended for readers who are interested in the application of PH, we do not attempt to fully formulate the mathematics of PH. Instead, we will illustrate the mathematical ideas using some examples.

First, we consider the problem of counting the number of rings in a tetrahedral skeleton. The tetrahedron [Fig. 4(a)] seems to have four rings, but it looks like three rings when we see the tetrahedron from above [Fig. 4(b)]. This is because the outer ring D in Fig. 4(c) looks like the combination of the three rings A, B, and C in some sense.


Figure 4. (a) Tetrahedron (b) Tetrahedron viewed from above (c) Four rings in the tetrahedron (d) A ring C in (c) is filled with a triangle.

The theory of homology justifies this intuition using linear algebra. In homology theory, each ring is regarded as a vector, and we can justify the equality \(A + B + C = D\) by assuming the rule that adding the same edge twice equals zero. An easy way to justify the rule is to consider a \(\mathbb{Z}/2\mathbb{Z}\)-vector space. The equality means that the four rings \(A, B, C, D\) are not linearly independent. We can consider the linear space of all rings, and we can count the number of linearly independent rings by computing the dimension of the linear space. In Figs. 4(a) and 4(b), the tetrahedron skeleton has three linearly independent rings. This is why PD1 for regular tetrahedral points has three birth-death pairs at \((a/2, a/\sqrt{3})\), not four. The same is true for the regular octahedral points.

Next, we consider the counting of holes in Fig. 4(d). The figure seems to have two holes since the hole C in Fig. 4(c) is filled with a triangle and the rest are A and B. In other words, we can count the number of holes by \(z - b\), where z is the number of linearly independent rings, and b is the number of linearly independent rings filled by triangles. We can realize the idea by canceling out the ring C in an algebraic way called the quotient. By quotient, we can compute the linear space of all unremoved rings. In fact, \(z - b\) is the dimension of the quotient linear space. Of course, the same idea is available to count the number of cavities.
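The counting by \(z - b\) can be reproduced with elementary linear algebra over \(\mathbb{Z}/2\mathbb{Z}\). The sketch below is an illustration under stated assumptions: each column of a boundary matrix is encoded as an integer bitmask, addition mod 2 is the XOR operation, and z and b are obtained from matrix ranks (z is the number of edges minus the rank of the edge boundary matrix, b is the rank of the triangle boundary matrix). All function and variable names are hypothetical.

```python
from itertools import combinations


def rank_gf2(columns):
    """Rank over Z/2Z of a matrix whose columns are integer bitmasks."""
    pivots = {}  # highest set bit -> reduced column with that leading bit
    for col in columns:
        while col:
            high = col.bit_length() - 1
            if high not in pivots:
                pivots[high] = col
                break
            col ^= pivots[high]  # adding the same column twice cancels
    return len(pivots)


# Skeleton of a tetrahedron: 4 vertices and all 6 edges.
vertices = range(4)
edges = list(combinations(vertices, 2))
edge_index = {edge: i for i, edge in enumerate(edges)}

# The boundary of an edge is the sum of its two endpoints.
edge_boundaries = [(1 << u) | (1 << v) for u, v in edges]


def triangle_boundary(triangle):
    """Bitmask over edges: the boundary of a triangle is its three edges."""
    mask = 0
    for edge in combinations(triangle, 2):
        mask |= 1 << edge_index[edge]
    return mask


# z: number of linearly independent rings in the skeleton.
z = len(edges) - rank_gf2(edge_boundaries)

# b: number of independent rings filled by triangles (one filled face).
b_one = rank_gf2([triangle_boundary((0, 1, 2))])
holes_one_filled = z - b_one

# Filling all four faces: their boundaries are linearly dependent.
all_faces = list(combinations(vertices, 3))
b_all = rank_gf2([triangle_boundary(t) for t in all_faces])
```

For the tetrahedron skeleton this gives z = 3; filling one face leaves 3 - 1 = 2 holes. The four face boundaries have rank 3 rather than 4, which is the linear dependence \(A + B + C = D\) discussed above.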

Finally, we explain the idea of PH using the example in Fig. 5. In this figure, the numbers of holes are \(0, 1, 2, 1, 0\) from left to right. The sets of linearly independent rings that are not filled with triangles are \(\{\}, \{A\}, \{B', C'\}, \{B''\}, \{\}\). To understand the relationship between these sets, we consider a change of basis. By changing the basis from \(\{B', C'\}\) to \(\{B' + C', C'\}\), we obtain the following relationship called interval decomposition, where \(*\) means the vanishing of the hole:

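\begin{equation} \begin{array}{ccccccc} A & \to & B' + C' & \to & A'' = B'' & \to & * \\ & & C' & \to & * & & \end{array} \tag{2} \end{equation}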
In the above relationship, \(A''= B''\) is justified by the canceling-out rule. We can say that the hole A appears at Fig. 5(b) and disappears at Fig. 5(e), and another hole \(C'\) appears at Fig. 5(c) and disappears at Fig. 5(d). In this way, we can pair the appearance and disappearance of the holes mathematically. The theory of PH ensures that we can always find such proper bases. This is the fundamental idea of PH.
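Finding such bases can be automated by reducing the boundary matrix over \(\mathbb{Z}/2\mathbb{Z}\), which is the standard column-reduction approach in the PH literature. The sketch below is a minimal illustration and not the implementation used by any particular software. The input format (a list of `(time, simplex)` entries in which every face precedes the simplices containing it), the function name, and the small triangle filtration are assumptions for this example and do not correspond to Fig. 5.

```python
def persistence_pairs(filtration):
    """Finite birth-death pairs by column reduction over Z/2Z.

    filtration: list of (time, simplex); each simplex is a sorted tuple
    of vertex labels, and every face appears before its cofaces.
    Returns {dimension: [(birth, death), ...]}; pairs with zero
    lifetime are omitted.
    """
    index = {simplex: i for i, (_, simplex) in enumerate(filtration)}
    reduced = {}  # lowest entry (largest row index) -> reduced column
    pairs = {}
    for j, (death, simplex) in enumerate(filtration):
        column = set()
        if len(simplex) > 1:
            # Boundary mod 2: all faces obtained by dropping one vertex.
            column = {index[simplex[:k] + simplex[k + 1:]]
                      for k in range(len(simplex))}
        # Add earlier reduced columns until the lowest entry is new.
        while column and max(column) in reduced:
            column ^= reduced[max(column)]  # symmetric difference
        if column:
            i = max(column)
            reduced[i] = column
            birth, creator = filtration[i]
            if birth < death:
                dim = len(creator) - 1
                pairs.setdefault(dim, []).append((birth, death))
        # An empty column means simplex j creates a new homology class.
    return pairs


# Hypothetical filtration: a triangle boundary closes at time 2
# and is filled at time 3.
filtration = [
    (0, ("a",)), (0, ("b",)), (0, ("c",)),
    (1, ("a", "b")), (1, ("b", "c")),
    (2, ("a", "c")),
    (3, ("a", "b", "c")),
]
pairs = persistence_pairs(filtration)
```

In this example the ring is born when the third edge closes the loop at time 2 and dies when the triangle fills it at time 3, giving the 1D pair (2, 3). Two of the three connected components die at time 1, and the one component that never dies is not reported because only finite pairs are returned.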


Figure 5. Increasing sequence of simplicial complexes.

4. Applications to Materials Research

PH, especially for point clouds, is currently used as a tool to extract structures in various materials. In Ref. 21, two of the authors gave some case studies showing how PH works for representative structures found in material systems. Since then, many interesting applications have been investigated, including granular materials, network formers, and polymers. These systems share several properties that explain why PH feature extraction is effective, and these properties were not discussed in the introduction of the previous case studies. Here, we focus on applications to point clouds and discuss both the validity of using PDs as descriptors for these materials and the validity of their geometric interpretation. We then present some notable recent results for each of these materials.

PD as a descriptor of disordered system

PH has the potential to become a universal tool for representing complex structures in material science. One of the central issues in materials science is to express the difference in physical properties by the difference in structure. This approach has been very effective for regularly arranged structures, such as crystals, and extremely disordered structures, such as gases. Apart from these two extremes, however, it is difficult to take such an approach because of the difficulty in quantitatively evaluating complex structures. For example, there can be structures partially ordered in some regions while disordered in other regions, or structures in an intermediate stage of ordering. To quantify them, the method must be able to express the diversity of the structure. Since PD can be regarded as a kind of distribution function, it can represent those features.

When extracting a characteristic shape from a disordered configuration of particles, we may face the problem of the arbitrariness of the threshold, as mentioned in Sect. 3 as the problem of determining the radius of the disk. For example, in order to introduce hydrogen bonds, metal complexes, polyhedra of metal clusters, etc., it is necessary to determine an appropriate threshold. Usually, thresholds are set based on physical and chemical properties, but PH provides an alternative solution to this problem. PH evaluates the degree of robustness of the thresholds by sweeping them. Holes with a small \(d-b\), that is, holes that disappear soon after they appear as the radius parameter increases, are not detected unless the threshold is finely tuned, and vice versa. In other words, holes far from the diagonal \(b=d\) in the PD can be extracted even if some numerical noise is added. Therefore, we can assume that such holes have important information for the disordered configuration.
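In practice, this criterion amounts to ranking birth-death pairs by their lifetime \(d-b\). The following short sketch selects the pairs whose lifetime exceeds a chosen cutoff. The function name, the cutoff, and the pair values are made up purely for illustration and do not come from any measurement or simulation.

```python
def robust_pairs(diagram, min_lifetime):
    """Birth-death pairs whose lifetime d - b is at least min_lifetime,
    sorted from the most persistent to the least persistent."""
    selected = [(b, d) for b, d in diagram if d - b >= min_lifetime]
    return sorted(selected, key=lambda pair: pair[1] - pair[0], reverse=True)


# Made-up persistence diagram: two pairs near the diagonal, two far from it.
diagram = [(0.10, 0.12), (0.20, 0.90), (0.35, 0.40), (0.30, 1.10)]
robust = robust_pairs(diagram, min_lifetime=0.5)
```

Only the two pairs far from the diagonal remain in `robust`. The cutoff itself is a choice made by the analyst and should be reported together with the results.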

Comparison with conventional methods

To gain a better understanding of the uniqueness of PH, it will be helpful to compare it with conventional methods used in materials science such as Voronoi analysis,22) ring statistics,23) bond orientational order parameter,24) and radial distribution function.25) The Voronoi analysis and the ring statistics provide discrete variable indicators. Due to this discreteness, two configurations close to each other may be assigned different indices. In PH, the stability theorem allows us to avoid this problem.

The bond orientational order parameter is a continuous variable indicator. Together with the two mentioned above, this indicator expresses the degree of disorder in terms of similarity to the crystalline structure. When the structure of interest is relatively close to the crystal, these quantities work for interpreting the structure. However, they are not suitable for representing disordered shapes far from the crystalline structure. In contrast, PH provides interpretations that are not based on similarity to crystals.

The radial distribution function is a quantity that does not rely on similarity to crystals and is applicable to disordered structures. In fact, it can completely express the disordered structure of a simple liquid. However, it is not suitable for expressing complex shapes composed of many particles, which can be expressed by PH.

As can be seen from these comparisons, PH is suitable for describing shapes consisting of a large number of particles embedded in a structure that is far from the crystal structure. The degree of disorder can then be expressed as a distribution function. Material systems that benefit from applying PH have these characteristics.

An object with disordered structure

We can get an overview of the systems in which PH is currently successfully applied by analyzing the reasons for the realization of disordered structure from the viewpoint of material science. There are three possible reasons for the disordered structure: 1. because of the property of the microscopic constituents of the material, 2. because of the property of the process of preparing the material, 3. because the material is composed of macroscopic components with heterogeneity and friction. Here we will describe them.

Soft matter, such as polymers, often has disordered structures on the atomic and molecular scales. In molecular systems that constitute soft materials, there are intermolecular potential and intramolecular potential such as a torsional potential. The magnitude of these energies is of the same order as the thermal energy at room temperature. As a result, both single molecules and molecular assemblies are allowed to have various structures in soft matter and take quite disordered structures.

The solid state with a disordered structure, called glass or amorphous, is obtained by quenching the material while maintaining the disorder that appeared in the liquid state. Many multi-component systems reach the glassy state by this procedure. In addition to soft matter, alloys and ceramics can also achieve a glassy disordered structure.

Granular materials consisting of macroscopic particles also often have disordered structures. Unlike atoms or molecules, granular particles vary in size and interactions. Therefore, when they are densely packed, they are not necessarily arranged regularly like crystals. Even if the particles have uniform properties in a simulation, they may not form crystals and may instead solidify in a disordered structure due to friction. Therefore, in both experiments and simulations, granular systems often show a disordered structure.

For these reasons, these material systems often have structures that are difficult to quantify, and PH is needed to quantify them.

Material systems with geometric interpretation

We can interpret the shape extracted by PH if the physical process of creating the point cloud is governed by geometry. In other words, the validity of interpreting the point cloud as a complex consisting of an expanding sphere accompanying each point must be justified on the basis of materials science. For a mono-disperse system, the construction is always justified, but for a multi-component system, it is not always justified. It is non-trivial that each particle has an intrinsic quantity corresponding to its radius. Only if this property is present can we define a physically interpretable contact between a particle and another adjacent particle. In a system with N kinds of particles, there are \(N(N+1)/2\) kinds of inter-particle distances. Since there must be N radii of particles, the degree of indefiniteness is \(N(N+1)/2 - N = N(N-1)/2\). The radius is well defined only when this indefiniteness is not present.

Since the configuration of the system is determined by the inter-particle potential between the particles, let us consider the conditions under which the potential has the distance corresponding to the radius. For example, consider a binary system (\(N=2\)) with a Lennard-Jones (LJ) potential \(U_{AB}(r)=\epsilon_{AB}[(\sigma_{AB}/r)^{12}-(\sigma_{AB}/r)^{6}]\), where A and B represent the type of particle. The length of the interaction between component A and component B is determined by \(\sigma_{AB}\). As mentioned above, there are three length parameters, \(\sigma_{AA},\sigma_{AB},\sigma_{BB}\), corresponding to the type of distance between the particles. If \(\sigma_{AB}\) is expressed as the arithmetic mean of \(\sigma_{AA}\) and \(\sigma_{BB}\), we can introduce the radius as \(\sigma_{A}=\sigma_{AA}\) and \(\sigma_{B}=\sigma_{BB}\). Here, the arithmetic-mean condition has eliminated the indefiniteness of the radius. The same applies to the system with N kinds of particles. Especially in the case of LJ, the equation given by this arithmetic mean is called the Lorentz–Berthelot combining rule.26) Thus, the combining rule guarantees the validity of considering the radius of the particle and makes the virtual contact in the alpha shape physically meaningful.
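The arithmetic-mean condition can be tested mechanically for a given set of length parameters. The sketch below is an illustration under stated assumptions: it takes the per-species radius to be \(r_{A} = \sigma_{AA}/2\) (a normalization chosen here so that the contact distance equals the sum of two radii, whereas the text above uses \(\sigma_{A} = \sigma_{AA}\) as the size parameter), and it then checks whether every cross term satisfies \(\sigma_{AB} = r_{A} + r_{B}\). The function name and the numerical values are hypothetical.

```python
def additive_radii(sigma, tol=1e-9):
    """Per-species radii if the length parameters are additive, else None.

    sigma: dict mapping a pair of species labels (X, Y) to sigma_XY,
    including the like pairs (X, X).
    Assumption: the radius of species X is sigma_XX / 2.
    """
    species = sorted({label for pair in sigma for label in pair})
    radii = {s: sigma[(s, s)] / 2 for s in species}
    for (x, y), value in sigma.items():
        # Arithmetic-mean condition: sigma_XY = (sigma_XX + sigma_YY) / 2.
        if abs(value - (radii[x] + radii[y])) > tol:
            return None
    return radii


# Hypothetical binary system that satisfies the arithmetic mean.
additive = {("A", "A"): 1.0, ("A", "B"): 1.1, ("B", "B"): 1.2}
# Hypothetical binary system whose cross term breaks it.
non_additive = {("A", "A"): 1.0, ("A", "B"): 0.9, ("B", "B"): 1.2}

radii = additive_radii(additive)
no_radii = additive_radii(non_additive)
```

For the first parameter set the radii 0.5 and 0.6 are recovered. For the second the function returns `None`, which signals that no consistent set of radii exists under this assumption.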

Many polymer systems currently analyzed might be considered to satisfy the combining rule. The interaction between granular particles always satisfies it, whether in an experimental or simulated system, because the particles are macroscopic. The details of each case will be discussed later. In the case of network formers, radius and contact can be reasonably introduced by another justification. This will also be discussed later.

Achievements in material systems

In the following, we will introduce some notable recent results of PH in material systems. Most of the analyses use point clouds generated by particle simulations, and some use data obtained by the Reverse Monte Carlo method based on experimental measurements.

Granular

Granular systems have been studied extensively since the very early days of PH's application to material systems. Unlike the atomic and molecular scales, granular particles are macroscopic objects, and the radius and shape are well defined without ambiguity. In fact, the interactions between granular particles are usually represented by the contact force such as repulsive Hertzian or harmonic. These potentials satisfy the combining rule, which guarantees that PH works effectively. Concerning extraction of structure, local structures close to crystalline structure and deformation with geometric constraints have been studied.8)

In addition to the extraction of structure, PH also helps us to quantify the force chain network. Due to the contact force, the mechanical properties at the particle scale are characterized by a force chain network. This is a property unique to granular systems. The bulk mechanical properties are then found to be related to the Betti number of the force chain network. The work by Kondic et al.27) is one of the very early studies of PH in material systems, which found that friction and polydispersity are expressed as differences in the force chain network and that the force chain network changes qualitatively at the jamming transition point.

PH is also useful for expressing the process dependency of the force chain network. The static mechanical properties of granular materials depend on the preparation process. This subject is one of the central issues in research on granular systems and is often addressed in TDA. In Ref. 28, it was found that PD1 can represent the history dependence of the tapping operation of the granular system even for systems with the same density. This is in contrast to liquids, where density is a state quantity that represents the physical properties of the system. In addition to the static properties, the transient properties are also characterized by PH. Changes in the force chain network associated with impact have been quantified for both the granular system29) and the suspension system.30)

Network former

Network formers, such as silica (SiO2), are another class of materials where PH is actively applied. In the case of network formers, the combining rule may not hold for the interaction parameters.31) However, there is another reason that justifies interpreting the extracted shapes. Since the types of particles that can be nearest neighbors of each particle are limited, a radius can be introduced for each particle species. In the realized particle configuration, the indefiniteness in determining the radius does not cause a practical problem because the type of particle with the closest coordination to each particle is restricted.

As an example, let us consider silica glass. For the silica at low temperatures, silicon (Si) is always at the center of a tetrahedral structure consisting of oxygen (O), and no two silicons are directly next to each other. Therefore, the types of inter-particle distances are limited to Si–O and O–O. The radius of the oxygen can be determined from the nearest neighbor distance of O–O, and the radius of Si can be determined from the nearest neighbor distance of Si–O. Consequently, we can interpret the shape realized by silica through the union of balls that appeared in PH. If there are additional elements, there is no guarantee that the interpretation will work. However, we may ensure the validity of the interpretation by considering additives as secondary effects added to the backbone structure of silica.

PH was first applied to a network former by Hiraoka et al.7) There, the origin of the medium-range order (MRO) of silica glass was discussed based on two results. These results might be considered a prototype of the subsequent research on the use of PH in materials. We will describe them in detail.

The first result is a relationship between PD and the first sharp diffraction peak (FSDP). Glassy systems are believed to show MRO instead of the long-range order for the crystalline system. As for the silica glass, MRO is detected by FSDP of the structure factor \(S(q)\). In fact, FSDP was observed for the configuration data obtained by the molecular dynamics simulation of silica (Fig. 6 bottom right). PD1 obtained from the same configuration data shows curves \(C_{P}\) (red box), \(C_{O}\) (green box), and \(B_{O}\) (blue box) in Fig. 6 left panel. Then, it was found that the histogram of wave-number converted from d for these curves has the same support around the FSDP (Fig. 6 upper right). From this, the length scale of the MRO was concluded to be the size of the ring associated with the curve of PD1.
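
One plausible form of the conversion from death d to wave-number is sketched below. It assumes that d is a squared radius, as in an alpha filtration, so that the ring has a characteristic diameter \(\ell = 2\sqrt{d}\), and that the wave-number is \(q = 2\pi/\ell\). Both the formula and the function name are assumptions made for illustration. The exact conversion should be taken from Ref. 7.

```python
import math


def death_to_wavenumber(death):
    """Convert a death value (assumed to be a squared radius) to a wave-number."""
    diameter = 2.0 * math.sqrt(death)   # assumed length scale of the ring
    return 2.0 * math.pi / diameter     # q = 2*pi / length


# Histogramming these values for the pairs on one curve gives a distribution
# that can be compared with the position of the FSDP.
wavenumbers = [death_to_wavenumber(d) for d in (1.0, 4.0)]
```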


Figure 6. (Color) PD for the silica glass obtained from the molecular dynamics simulation (left panel). FSDP of the structure factor \(S(q)\) (right bottom panel) and the histogram of death d for each curve as a function of wave-number (right top panel). Red, green, and blue colors correspond to \(C_{P}\), \(C_{O}\), and \(B_{O}\), respectively.

The second result is an expression of the geometric constraints that appear as curves in the silica PD. A curve in the PD sometimes represents the geometric constraint for the point cloud data. \(C_{P}\) and the lower curve in Fig. 6 correspond to the constraint of Si–O bond length in the large ring and tetrahedral structures of SiO4, respectively. In contrast, the geometric constraint represented by \(C_{O}\) is more complicated. Figure 7 expresses the geometric constraint of \(C_{O}\). By extracting the three oxygen atoms that generate \(C_{O}\) and making a triangle by them, the shortest side \(L_{1}\), the second shortest side \(L_{2}\) of the triangle, and angle θ between them (Fig. 7 upper left) are drawn as a three-dimensional scatter plot for each triangle (Fig. 7 upper right). The scatter plots from the directions of A, B, and C in the figure are also depicted in the lower left, center, and right panels of Fig. 7, respectively. The points of \((L_{1}, L_{2},\theta)\) are distributed around a surface. This feature is not seen when \(L_{1}\), \(L_{2}\), and θ are independent, which means that the distribution represents a geometric constraint.
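
The descriptor \((L_{1}, L_{2}, \theta)\) can be computed from the coordinates of the three oxygen atoms as in the following sketch. The function name is ours. The angle between the two shortest sides is obtained from the law of cosines, since it is the angle opposite the longest side.

```python
import math


def triangle_descriptor(p, q, r):
    """Return (L1, L2, theta): the two shortest sides and the angle between them."""
    l1, l2, l3 = sorted([math.dist(p, q), math.dist(q, r), math.dist(r, p)])
    # The two shortest sides meet at the vertex opposite the longest side l3.
    cos_theta = (l1 ** 2 + l2 ** 2 - l3 ** 2) / (2.0 * l1 * l2)
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp against rounding
    return l1, l2, theta


# A 3-4-5 right triangle: the two shortest sides enclose a right angle.
l1, l2, theta = triangle_descriptor((0, 0, 0), (3, 0, 0), (0, 4, 0))
```

Collecting these triples over all triangles that generate a curve gives the three-dimensional scatter plot discussed above.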


Figure 7. (Color) The schematic of three oxygen atoms that generate \(C_{O}\) and definition of \(L_{1}\), \(L_{2}\), and θ (upper left panel). The bird's-eye view of scatter plot \((L_{1},L_{2},\theta)\) (upper right panel). The identical plots from the direction A (lower left panel), B (lower center panel), and C (lower right panel). The color represents the distance from the center of curve \(C_{O}\).

After the research, a detailed interpretation of the geometric constraints has been investigated using a simplified model.32) Moreover, it was found that PD has the ability to estimate the glass transition temperature using the support vector machine.33) PH has also been applied to experimentally obtained configuration data for various network formers.34,35) Amorphous ice is another example of a network former, where the medium-range order of the hydrogen-bonding network is quantified by PH.36)

Polymer

Polymers are another material where PH is currently being actively applied. As mentioned earlier, soft matter systems are very compatible with PH analysis because of the need to extract characteristic shapes from the disordered configuration. In many cases, coarse-grained models are used in molecular simulations of polymers. In general, there is no guarantee that the combining rule holds for the interaction potential of coarse-grained models. For example, the combining rule breaks down when charged particles or heterogeneous particles are treated as a single particle in coarse-graining. However, since the system currently under analysis does not meet these conditions, such contribution is expected to be small. Therefore, the PH interpretation can be applied to data obtained by coarse-grained molecular dynamics.

In the case of polymer systems, the correlation with material properties is actively discussed rather than the extraction of the structure itself, compared to the two examples mentioned above. For example, correlations with dielectric constant,37) shear response,38) and crazing process39) were discovered. As a study that focuses on the structure itself, a method to quantify the threading of ring polymers has been proposed.40)

Other systems

In addition to these representative systems, PH has also been applied to other material systems. In Pd40Ni40P20 bulk metallic glasses, the medium-range ordered structure was extracted by applying PH to the configuration data of both the total system and each component.41) For a simulated toy model of amorphous solids, the effectiveness of PH in describing the structural changes during plastic deformation was evaluated based on machine learning,42) and the yielding was associated with a decrease in the number of robust holes introduced by PH.43) In experiments, PH was used to express the local structure of a two-dimensional binary colloidal configuration confined at the gas-liquid interface.44) In addition to point clouds, PH has been applied to the visualization of energy landscapes45) and the characterization of spatial patterns such as phase separation structures of magnetic materials46) and polymers.47)

Perspectives for material science

We have presented some recent applications of PH, mainly to granular materials, network formers, and polymers. It is expected that PH will be applied to other material systems in the future. For example, soft matter other than polymers, such as liquid crystals, emulsions, vesicles, and colloidal gels, will be promising applications of PH because of the need to extract characteristic structures embedded in disordered configurations. In addition to PH for point clouds, it is also promising to use PH for the quantification of systems with bubbles, fillers, and porous shapes because their shapes are dominated by holes.

5. Software

For the applications of PH, software is important. The development of algorithms and software has progressed in parallel with the development of theories. The paper by Edelsbrunner et al.1) showed an algorithm to compute a PD, and the algorithm has been refined theoretically and practically by subsequent research.2,48–52) Parallel and distributed algorithms have also been studied.53,54)
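
To give a feeling for such an algorithm, the following is a textbook sketch of the standard column reduction of a boundary matrix with \(\mathbb{Z}/2\) coefficients. It is not the optimized variant used in any of the software discussed here, and the function name and data layout are our own choices for illustration. Simplices are indexed in filtration order, and each column is the set of indices of the faces of that simplex.

```python
def reduce_boundary_matrix(columns):
    """Return (birth, death) index pairs from a filtered boundary matrix over Z/2.

    columns[j] is the set of row indices in the boundary of the j-th simplex,
    with simplices numbered in the order they enter the filtration.
    """
    reduced = [set(c) for c in columns]
    low_to_col = {}   # lowest nonzero row index -> column that owns it
    pairs = []
    for j, col in enumerate(reduced):
        # Add earlier columns (mod 2) until the lowest index is unique or col is zero.
        while col and max(col) in low_to_col:
            col ^= reduced[low_to_col[max(col)]]
        if col:
            low = max(col)
            low_to_col[low] = j
            pairs.append((low, j))   # simplex `low` creates a class, simplex j kills it
    return pairs


# Filled triangle: vertices 0, 1, 2; edges 3 = {0,1}, 4 = {1,2}, 5 = {0,2}; face 6.
triangle = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
pairs = reduce_boundary_matrix(triangle)
```

Replacing each index by the filtration value of the corresponding simplex turns the index pairs into birth-death pairs. Here the pair (5, 6) is the ring created by the last edge and filled by the face.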

Many researchers on PH have developed various data analysis software using PH. Table I shows the list of software.

Table I. List of software.

The developers have their interests and analysis targets. For example, Ripser focuses on the efficient algorithm to compute PDs for Vietoris–Rips filtrations. A benchmark62) showed that Ripser is one of the fastest software compared to competitors. PHAT and Dipha are developed by the same people as Ripser, and they also have good performance.

Gudhi is more interested in computational geometry. Gudhi collaborates with CGAL (Computational Geometry Algorithms Library, https://www.cgal.org/), a famous computational geometry library, and works with CGAL on the development. Gudhi has various representations of geometric objects.

R-TDA provides an interface from the R language. In fact, R-TDA uses Gudhi, PHAT, and Dionysus as backends. R-TDA is a bridge between R and the TDA world.

Each software has its advantages and features.

HomCloud

In this subsection, we introduce our software, HomCloud. Obayashi, one of the authors of this paper, mainly develops HomCloud.

HomCloud is free software, and you can download HomCloud from https://homcloud.dev. You can freely use, copy, modify, and redistribute the software under GPL (https://www.gnu.org/licenses/gpl-3.0.html).

HomCloud focuses on applications, mainly to materials science. We use HomCloud to analyze atomic configurations given by molecular dynamics simulations and reverse Monte Carlo, as well as pixel and voxel data given by electron microscopy. Of course, HomCloud can also be applied to data outside materials science. HomCloud has already been used in various scientific studies, including materials science,17,34–38,40,63–69) geology,20,70) structural biology,71) and medical image analysis.72,73)

HomCloud has useful functionalities such as visualization, inverse analysis, and machine learning. In particular, the inverse analysis of HomCloud, which detects a ring or a cavity corresponding to each birth-death pair, is the most advanced among the available software.

HomCloud has two types of interface: a command-line interface and a Python interface. Since Python has a rich scientific computing ecosystem, you can combine the output of HomCloud with the ecosystem using the Python interface.

HomCloud is available on Windows, Linux, and macOS, including Apple Silicon Mac. You can also use HomCloud on Google Colaboratory, which allows you to write and execute Python code in the web browser. You can try HomCloud on Colaboratory without installing HomCloud on your machine.

To improve the software quality, the developers of HomCloud follow several practices. One practice is dogfooding; that is, the developers use HomCloud for daily data analysis. Close communication between developers and users is effective for software improvement, and dogfooding is one extreme form of it. Another practice is continuous integration. After code is uploaded to the code repository, HomCloud is automatically built and tested. Continuous integration is essential to keep the quality and portability of the software.

HomCloud internally uses various third-party components to reduce the development cost. For example, HomCloud uses PHAT, Ripser, and Dipha to compute PDs. These are known for their good performance; therefore we use them. CGAL is also used to compute alpha filtrations. Python's standard scientific computing libraries, such as NumPy, SciPy, and Matplotlib are also used.

Installing HomCloud

Since HomCloud is written in Python, you need to install Python before installing HomCloud. After installing Python, you can easily install HomCloud using the pip command, which is the de facto standard package management system for Python. We recommend pip to install HomCloud on Linux, Intel macOS, and Windows. Another installation option is conda (https://conda.io). Conda is an open-source package management and environment management system mainly developed by Anaconda Inc. (https://www.anaconda.com), and it is widely used among data scientists. HomCloud provides conda packages for Windows, Linux, and Apple Silicon Mac. We recommend conda for Apple Silicon Mac since conda-forge has the most extensive set of packages for Apple Silicon Mac. The installation manual is available at https://homcloud.dev/install-guide/index.en.html.

Basic input/output

HomCloud accepts the following data.

2D/3D pointcloud (alpha filtrations are used)

Bitmap data of any dimension, both binary and grayscale (cubical filtrations are used)

Distance matrix (Vietoris–Rips filtrations are used)

Abstract simplicial complex with weights

We can assign an initial radius to each particle for a 3D pointcloud to reflect the size of the particle using weighted alpha shape.74) This functionality is useful to reflect the physical radii such as ionic radii. We can also use periodic boundary conditions for 3D pointcloud.
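
In one common formulation of the weighted alpha filtration (stated here as a sketch; the symbols \(x_i\), \(r_i\), and \(\alpha\) are our notation), the ball around a point \(x_i\) with initial radius \(r_i\) at filtration parameter \(\alpha\) is

\[
B_i(\alpha) = \{\, x \in \mathbb{R}^{3} : \|x - x_i\|^{2} - r_i^{2} \le \alpha \,\},
\]

so the ball has radius \(\sqrt{\alpha + r_i^{2}}\), which equals the initial radius \(r_i\) at \(\alpha = 0\). Setting all \(r_i = 0\) recovers the unweighted case, in which \(\alpha\) is the squared radius.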

The output of HomCloud is a PD. You can plot a histogram by HomCloud. The backend of plotting is Matplotlib, so you can create fancy figures by utilizing Matplotlib. You can also output the list of birth and death times.

Inverse analysis

HomCloud has advanced inverse analysis features. Figure 8 shows the outline of the functionality. In a PD, each birth-death pair corresponds to a ring or a cavity. Identifying this structure is very helpful for analyzing PDs, but the identification is not easy since many candidates exist. Mathematical optimization is used to select the tightest structure from the candidates.13,75–79) HomCloud already implements some of these methods.77,79) We can apply different methods depending on the purpose, input type, and performance. The inverse analysis is available for all types of input, but which method to use depends on the data type.
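
Schematically, and in the spirit of the optimal-cycle literature, such an optimization can be written as

\[
\text{minimize } \|z\|_{1} \quad \text{subject to } z = z_{0} + \partial w,
\]

where \(z_{0}\) is a given cycle representing the birth-death pair, \(w\) ranges over chains of one dimension higher, and \(\partial\) is the boundary operator. The notation here is ours and the form is only illustrative. The methods implemented in HomCloud use related formulations whose precise statements are given in the cited references.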


Figure 8. (Color) Inverse analysis.

We can easily visualize the result of inverse analysis using HomCloud. We can also output the geometric information of the result in several forms. We can compare the output with other information for further investigation.

Machine learning

HomCloud supports machine learning using PDs. Machine learning can find hidden patterns that are common to many PDs. Since many machine learning methods require feature vectors or Gram matrices as input data, we need to convert PDs into vectors or matrices.15) HomCloud supports the persistence image80) as a vectorization method. Persistence image vectors are computed from the histograms of diagrams. Intuitively, the values of the bins of the histogram are used as vector elements. The following techniques are used to improve the performance of learning.

2D Gaussian filter to reflect the adjacency of bins

Weight function to reflect the importance of birth-death pairs depending on the distance from the diagonal

The advantage of persistence image is its simplicity. We can intuitively understand the method. We can also convert a vector into a histogram in reverse. The advantage is especially useful if we use linear machine learning models such as linear regression, logistic regression, and principal component analysis. Since the models give vectors with the same dimension as input vectors, we can visualize the learned results in histograms. The histograms show which birth-death pairs are important. After identifying important birth-death pairs, we can apply inverse analysis to map the learned results onto the original data.14)
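
The vectorization described above can be sketched in plain Python as follows. This is a simplified illustration and not HomCloud's implementation: the arctangent weight, the parameter names, and the zero padding at the edges of the histogram are assumptions made for this example.

```python
import math


def persistence_image(pairs, lo, hi, bins, sigma=1.0, c=1.0, p=1.0):
    """Vectorize a PD: weighted 2D histogram followed by Gaussian smoothing."""
    width = (hi - lo) / bins
    # grid[i][j]: i indexes death bins, j indexes birth bins.
    grid = [[0.0] * bins for _ in range(bins)]
    for birth, death in pairs:
        i = math.floor((death - lo) / width)
        j = math.floor((birth - lo) / width)
        if 0 <= i < bins and 0 <= j < bins:
            # Weight grows with the distance from the diagonal (assumed form).
            grid[i][j] += math.atan(c * max(death - birth, 0.0) ** p)

    # Truncated, normalized 1D Gaussian kernel; sigma is measured in bins.
    radius = max(1, int(3 * sigma))
    kernel = [math.exp(-k * k / (2.0 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]

    def smooth(row):
        # 1D convolution with zero padding outside the histogram range.
        return [sum(kernel[k + radius] * row[n + k]
                    for k in range(-radius, radius + 1) if 0 <= n + k < len(row))
                for n in range(len(row))]

    grid = [smooth(row) for row in grid]               # smooth along the birth axis
    cols = [smooth(list(col)) for col in zip(*grid)]   # smooth along the death axis
    grid = [list(row) for row in zip(*cols)]
    return [value for row in grid for value in row]    # flatten to a vector


vector = persistence_image([(0.2, 0.8)], lo=0.0, hi=1.0, bins=4, sigma=0.5)
```

Because the vector is just a flattened histogram, a coefficient vector learned by a linear model can be reshaped to `bins` by `bins` and displayed on the same birth-death axes.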

Other functionalities

HomCloud has some other utilities. For example, HomCloud supports bottleneck and Wasserstein distances to compare PDs using HERA.81) You can plot the slice of a PD histogram using HomCloud.
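
For reference, the standard definitions of these distances between two diagrams \(X\) and \(Y\) are

\[
d_{B}(X, Y) = \inf_{\gamma} \sup_{x \in X} \|x - \gamma(x)\|_{\infty},
\qquad
W_{q}(X, Y) = \Bigl( \inf_{\gamma} \sum_{x \in X} \|x - \gamma(x)\|_{\infty}^{q} \Bigr)^{1/q},
\]

where \(\gamma\) ranges over bijections between \(X\) and \(Y\) after both diagrams are augmented with the points on the diagonal, so that a birth-death pair may be matched to the diagonal. The symbols \(\gamma\) and \(q\) are our notation.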

HomCloud examples

As explained above, HomCloud has a Python interface and command-line interface. Comprehensive tutorials (both in English and Japanese) are available on HomCloud's website (https://homcloud.dev/basic-usage.en.html). You can learn how to analyze pointclouds, 2D or 3D bitmap data and distance matrices in those tutorials using Jupyter notebook (https://jupyter.org/). Tutorials for both the Python interface and command-line interface are available. We also prepare a tutorial about PH with machine learning.

Now we introduce the Python interface. You can try the code in Jupyter Notebook or in other development environments such as VSCode.

First, HomCloud, NumPy, and Matplotlib are imported.

The input pointcloud data should be a NumPy 2D array. The following code reads the data from a text file.

The text file should be as follows. These numbers are the values of the X, Y, and Z coordinates of each point. The X, Y, and Z values are separated by spaces or tabs.

5.043 -16.116 -4.787
7.184 -16.066 -3.850
-8.529 -1.029 -0.326
:    :    :

The PDs are computed as follows:

The PDs are saved into a file “pointcloud.pdgm”. You can load the information of the 1D PD by the following code:

Since “pointcloud.pdgm” has the information of PDs of all dimensions, hc.PDList("pointcloud.pdgm") returns a PDList object, and we get a PD object by pdlist.dth_diagram(1).

The following code will give you the list of birth times as a NumPy array.

Of course, the deaths attribute is also available to get death times. You can construct a histogram of birth-death pairs and plot it using HomCloud.

You can specify the range and the number of bins as above. (-0.2, 3.0) is the range of birth and death times, and 256 is the number of bins in both axes. We can also modify the color bar of the plot. Since HomCloud uses Matplotlib as a backend, you can create good-looking figures by calling the Matplotlib functions.

You can compute an optimal volume, one important feature of HomCloud, by the following code:

The first line picks up a birth-death pair nearest to \((0.72, 2.1)\) and the second line computes the optimal volume of the pair. You can get access to the coordinates of points on the corresponding ring or cavity as follows.

You can also get access to the edges of a ring or the faces of a cavity using the boundary method. HomCloud can also visualize the rings and cavities as in Fig. 8.
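
The steps of this subsection can be gathered into one function, as in the following sketch. It assumes that the HomCloud Python interface provides PDList.from_alpha_filtration with the save_to and save_boundary_map options, the births and deaths attributes, histogram and plot, nearest_pair_to, optimal_volume, boundary_points, and boundary, as in the HomCloud tutorials. These names and options should be checked against the documentation of the installed version. The function name analyze_pointcloud and the input file name are ours.

```python
def analyze_pointcloud(path, pdgm_path="pointcloud.pdgm"):
    """Compute the PDs of a 3D pointcloud and the optimal volume of one pair."""
    # Imported inside the function so that the module loads without HomCloud.
    import numpy as np
    import homcloud.interface as hc

    # One row per point: whitespace-separated X, Y, and Z coordinates.
    pointcloud = np.loadtxt(path)

    # Alpha filtration; the boundary map is kept for the inverse analysis below.
    hc.PDList.from_alpha_filtration(
        pointcloud, save_to=pdgm_path, save_boundary_map=True
    )

    # The file holds PDs of all dimensions; take the 1D PD.
    pd1 = hc.PDList(pdgm_path).dth_diagram(1)
    births, deaths = pd1.births, pd1.deaths   # NumPy arrays

    # Histogram over (-0.2, 3.0) with 256 bins per axis and a log-scale color bar.
    pd1.histogram((-0.2, 3.0), 256).plot(colorbar={"type": "log"})

    # Inverse analysis of the birth-death pair nearest to (0.72, 2.1).
    pair = pd1.nearest_pair_to(0.72, 2.1)
    volume = pair.optimal_volume()
    return births, deaths, volume.boundary_points(), volume.boundary()
```

A call such as analyze_pointcloud("pointcloud.txt") requires HomCloud, NumPy, and Matplotlib to be installed, and the histogram is shown with the usual Matplotlib functions.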

Performance remarks

Since HomCloud uses efficient software such as PHAT or Ripser as a backend, HomCloud has good performance. You can analyze 1,000,000 points in 3D space or \(300\times 300\times 300\) voxel data in five or ten minutes with a recent PC with 32 GB memory. Larger data requires a larger computer. Some benchmark results are available at https://homcloud.dev/benchmarks.html.

Future plans

Our developer team plans to improve HomCloud continuously. New features will be implemented in parallel with theoretical research. Performance improvements and UI enhancements will be made as needed. We will also implement new methods proposed in previous research. We currently plan to implement vectorization and kernel methods for machine learning other than the persistence image.

6. Concluding Remarks

PH gives a way to summarize the shape of data quantitatively, and it is helpful for analyzing materials data from the micro scale to the macro scale. PH is now developing rapidly, from theory and software to applications in various fields including materials science. The collaboration between theory and applications of PH is quite active.

One example of collaboration is the study of multi-parameter persistence. In PH, scale information is encoded in a filtration, enabling us to extract multi-scale geometric structures. If we want to integrate additional information such as the noise reduction level or temporal information, it is natural to introduce another filtration axis. Carlsson and Zomorodian82) first proposed multi-parameter persistence and the rank invariant to characterize it. The paper also showed the theoretical difficulty of multi-parameter persistence. Mathematicians are now tackling this problem to give a better characterization,83–85) and in the future, they will provide feedback for applications.

Software is essential for connecting theory to applications. A lot of PH software has been developed according to the theoretical interests and applications of developers. One of the software, HomCloud, is introduced in this paper. HomCloud has some advanced features such as inverse analysis and machine learning support and has been applied to data analysis of materials.

This paper introduces PH for readers interested in the application to materials science and gives an intuitive explanation of PH. Some review papers and textbooks,62,86–91) including a textbook in Japanese,92) have been written for readers who are interested in further mathematical details and algorithms.

Acknowledgments

This work was partially sponsored by JSPS KAKENHI JP19H00834, JP20H05884, JP18H01188, JST Presto JPMJPR1923, JST CREST Mathematics (JPMJCR15D3), JST MIRAI Program (JPMJMI18G3), and Council for Science, Technology and Innovation (CSTI), Cross-ministerial Strategic Innovation Promotion Program (SIP) and “Materials Integration” for revolutionary design system of structural materials (Funding agency: JST).


References

  • 1 H. Edelsbrunner, D. Letscher, and A. Zomorodian, Proc. 41st Annual Symposium on Foundations of Computer Science, 2000, p. 454. 10.1109/SFCS.2000.892133
  • 2 A. Zomorodian and G. Carlsson, Discrete Comput. Geom. 33, 249 (2005). 10.1007/s00454-004-1146-y
  • 3 D. Cohen-Steiner, H. Edelsbrunner, and J. Harer, Discrete Comput. Geom. 37, 103 (2007). 10.1007/s00454-006-1276-5
  • 4 F. Chazal, D. Cohen-Steiner, M. Glisse, L. J. Guibas, and S. Y. Oudot, Proc. 25th Annual Symposium on Computational Geometry, 2009, p. 237. 10.1145/1542362.1542407
  • 5 U. Bauer and M. Lesnick, Proc. 30th Annual Symposium on Computational Geometry, 2014, p. 355. 10.1145/2582112.2582168
  • 6 V. de Silva and R. Ghrist, Algebr. Geom. Topol. 7, 339 (2007). 10.2140/agt.2007.7.339
  • 7 Y. Hiraoka, T. Nakamura, A. Hirata, E. G. Escolar, K. Matsue, and Y. Nishiura, Proc. Natl. Acad. Sci. U.S.A. 113, 7035 (2016). 10.1073/pnas.1520877113
  • 8 M. Saadatfar, H. Takeuchi, V. Robins, N. Francois, and Y. Hiraoka, Nat. Commun. 8, 15082 (2017). 10.1038/ncomms15082
  • 9 J. M. Chan, G. Carlsson, and R. Rabadan, Proc. Natl. Acad. Sci. U.S.A. 110, 18566 (2013). 10.1073/pnas.1313480110
  • 10 B. Wang and G.-W. Wei, J. Comput. Phys. 305, 276 (2017). 10.1016/j.jcp.2015.10.036
  • 11 C. Giusti, E. Pastalkova, C. Curto, and V. Itskov, Proc. Natl. Acad. Sci. U.S.A. 112, 13455 (2015). 10.1073/pnas.1506407112
  • 12 R. Van De Weygaert, G. Vegter, H. Edelsbrunner, B. J. Jones, P. Pranav, C. Park, W. A. Hellwing, B. Eldering, N. Kruithof, E. P. Bos et al., Transactions on Computational Science XIV (Springer, New York, 2011) p. 60.
  • 13 E. G. Escolar and Y. Hiraoka, Optimal Cycles for Persistent Homology Via Linear Programming (Springer Japan, Tokyo, 2016) p. 79.
  • 14 I. Obayashi, Y. Hiraoka, and M. Kimura, J. Appl. Comput. Topol. 1, 421 (2018). 10.1007/s41468-018-0013-5
  • 15 C. S. Pun, K. Xia, and S. X. Lee, arXiv:1811.00252.
  • 16 H. Edelsbrunner and E. P. Mücke, ACM Trans. Graph. 13, 43 (1994). 10.1145/174462.156635
  • 17 M. Kimura, I. Obayashi, Y. Takeichi, R. Murao, and Y. Hiraoka, Sci. Rep. 8, 3553 (2018). 10.1038/s41598-018-21867-z
  • 18 V. Robins, M. Saadatfar, O. Delgado-Friedrichs, and A. P. Sheppard, Water Resour. Res. 52, 315 (2016). 10.1002/2015WR017937
  • 19 A. Herring, V. Robins, and A. Sheppard, Water Resour. Res. 55, 555 (2019). 10.1029/2018WR022780
  • 20 A. Suzuki, M. Miyazawa, J. M. Minto, T. Tsuji, I. Obayashi, Y. Hiraoka, and T. Ito, Sci. Rep. 11, 17948 (2021). 10.1038/s41598-021-97222-6
  • 21 T. Nakamura, Y. Hiraoka, A. Hirata, E. G. Escolar, and Y. Nishiura, Nanotechnology 26, 304001 (2015). 10.1088/0957-4484/26/30/304001
  • 22 Y. Cheng and E. Ma, Prog. Mater. Sci. 56, 379 (2011). 10.1016/j.pmatsci.2010.12.002
  • 23 R. Zallen, The Physics of Amorphous Solids (Wiley, New York, 2008).
  • 24 P. J. Steinhardt, D. R. Nelson, and M. Ronchetti, Phys. Rev. B 28, 784 (1983). 10.1103/PhysRevB.28.784
  • 25 J.-P. Hansen and I. R. McDonald, Theory of Simple Liquids (Elsevier, Amsterdam, 1990).
  • 26 J. Delhommelle and P. Millié, Mol. Phys. 99, 619 (2001). 10.1080/00268970010020041
  • 27 L. Kondic, A. Goullet, C. O’Hern, M. Kramar, K. Mischaikow, and R. Behringer, Europhys. Lett. 97, 54001 (2012). 10.1209/0295-5075/97/54001
  • 28 S. Ardanza-Trevijano, I. Zuriguel, R. Arévalo, and D. Maza, Phys. Rev. E 89, 052212 (2014). 10.1103/PhysRevE.89.052212
  • 29 M. X. Lim and R. P. Behringer, Europhys. Lett. 120, 44003 (2018). 10.1209/0295-5075/120/44003
  • 30 Pradipto and H. Hayakawa, Phys. Rev. Fluids 6, 033301 (2021). 10.1103/PhysRevFluids.6.033301
  • 31 D. Coslovich and G. Pastore, J. Phys.: Condens. Matter 21, 285107 (2009). 10.1088/0953-8984/21/28/285107
  • 32 D. Ormrod Morley, P. S. Salmon, and M. Wilson, J. Chem. Phys. 154, 124109 (2021). 10.1063/5.0040393
  • 33 G. Kusano, K. Fukumizu, and Y. Hiraoka, J. Mach. Learn. Res. 18, 6947 (2017).
  • 34 Y. Onodera, S. Kohara, P. S. Salmon, A. Hirata, N. Nishiyama, S. Kitani, A. Zeidler, M. Shiga, A. Masuno, H. Inoue, S. Tahara, A. Polidori, H. E. Fischer, T. Mori, S. Kojima, H. Kawaji, A. I. Kolesnikov, M. B. Stone, M. G. Tucker, M. T. McDonnell, A. C. Hannon, Y. Hiraoka, I. Obayashi, T. Nakamura, J. Akola, Y. Fujii, K. Ohara, T. Taniguchi, and O. Sakata, NPG Asia Mater. 12, 85 (2020). 10.1038/s41427-020-00262-z
  • 35 C. Koyama, S. Tahara, S. Kohara, Y. Onodera, D. R. Småbråten, S. M. Selbach, J. Akola, T. Ishikawa, A. Masuno, A. Mizuno, J. T. Okada, Y. Watanabe, Y. Nakata, K. Ohara, H. Tamaru, H. Oda, I. Obayashi, Y. Hiraoka, and O. Sakata, NPG Asia Mater. 12, 43 (2020). 10.1038/s41427-020-0220-0
  • 36 S. Hong and D. Kim, J. Phys.: Condens. Matter 31, 455403 (2019). 10.1088/1361-648X/ab3820
  • 37 Y. Shimizu, T. Kurokawa, H. Arai, and H. Washizu, Sci. Rep. 11, 2274 (2021). 10.1038/s41598-021-80975-5
  • 38 Y. Yoshimoto, S. Sugiyama, S. Shimada, T. Kaneko, S. Takagi, and I. Kinefuchi, Macromolecules 54, 958 (2021). 10.1021/acs.macromol.0c02278
  • 39 T. Ichinomiya, I. Obayashi, and Y. Hiraoka, Phys. Rev. E 95, 012504 (2017). 10.1103/PhysRevE.95.012504
  • 40 F. Landuzzi, T. Nakamura, D. Michieletto, and T. Sakaue, Phys. Rev. Res. 2, 033529 (2020). 10.1103/PhysRevResearch.2.033529
  • 41 S. Hosokawa, J.-F. Bérar, N. Boudet, W.-C. Pilgrim, L. Pusztai, S. Hiroi, K. Maruyama, S. Kohara, H. Kato, H. E. Fischer et al., Phys. Rev. B 100, 054204 (2019). 10.1103/PhysRevB.100.054204
  • 42 J. W. Rocks, S. A. Ridout, and A. J. Liu, APL Mater. 9, 021107 (2021). 10.1063/5.0035395
  • 43 T. Shirai and T. Nakamura, J. Phys. Soc. Jpn. 88, 074801 (2019). 10.7566/JPSJ.88.074801
  • 44 V. Lotito and T. Zambelli, Langmuir 34, 7827 (2018). 10.1021/acs.langmuir.8b01411
  • 45 J. M. Carr, D. Mazauric, F. Cazals, and D. J. Wales, J. Chem. Phys. 144, 054109 (2016). 10.1063/1.4941052
  • 46 T. Yamada, Y. Suzuki, C. Mitsumata, K. Ono, T. Ueno, I. Obayashi, Y. Hiraoka, and M. Kotsugi, Vac. Surf. Sci. 62, 153 (2019) [in Japanese]. 10.1380/vss.62.153
  • 47 Y. Mototake, S. Yamanaka, T. Aoyagi, T. Ohnishi, and K. Fukumizu, Proc. Int. Symposium on Nonlinear Theory and Its Applications, 2020.
  • 48 C. Chen and M. Kerber, Proc. 27th European Workshop on Computational Geometry, 2011, Vol. 11, p. 197.
  • 49 V. de Silva, D. Morozov, and M. Vejdemo-Johansson, Inverse Probl. 27, 124003 (2011). 10.1088/0266-5611/27/12/124003
  • 50 V. de Silva, D. Morozov, and M. Vejdemo-Johansson, Discrete Comput. Geom. 45, 737 (2011). 10.1007/s00454-011-9344-x
  • 51 G. Henselman and R. Ghrist, arXiv:1606.00199.
  • 52 J.-D. Boissonnat and C. Maria, J. Appl. Comput. Topol. 3, 59 (2019). 10.1007/s41468-019-00025-y
  • 53 U. Bauer, M. Kerber, and J. Reininghaus, Topological Methods in Data Analysis and Visualization III (Springer, Cham, 2014) p. 103.
  • 54 U. Bauer, M. Kerber, and J. Reininghaus, Proc. 16th Workshop on Algorithm Engineering and Experiments (ALENEX), 2014, p. 31. 10.1137/1.9781611973198.4
  • 55 K. Mischaikow and V. Nanda, Discrete Comput. Geom. 50, 330 (2013). 10.1007/s00454-013-9529-6
  • 56 U. Bauer, M. Kerber, J. Reininghaus, and H. Wagner, J. Symb. Comp. 78, 76 (2017), Algorithms and Software for Computational Topology. 10.1016/j.jsc.2016.03.008
  • 57 U. Bauer, J. Appl. Comput. Topol. 5, 391 (2021). 10.1007/s41468-021-00071-5
  • 58 M. Čufar, J. Open Source Softw. 5, 2614 (2020). 10.21105/joss.02614
  • 59 N. Saul and C. Tralie, Scikit-TDA: Topological Data Analysis for Python (2019).
  • 60 G. Tauzin, U. Lupo, L. Tunstall, J. B. Pérez, M. Caorsi, A. M. Medina-Mardones, A. Dassatti, and K. Hess, J. Mach. Learn. Res. 22, 39 (2021).
  • 61 J. Binchi, E. Merelli, M. Rucco, G. Petri, and F. Vaccarino, Electron. Notes Theor. Comput. Sci. 306, 5 (2014), Proceedings of the 5th International Workshop on Interactions between Computer Science and Biology (CS2Bio’14). 10.1016/j.entcs.2014.06.011
  • 62 N. Otter, M. A. Porter, U. Tillmann, P. Grindrod, and H. A. Harrington, EPJ Data Sci. 6, 17 (2017). 10.1140/epjds/s13688-017-0109-5
  • 63 A. Hirata, T. Wada, I. Obayashi, and Y. Hiraoka, Commun. Mater. 1, 98 (2020). 10.1038/s43246-020-00100-3
  • 64 Y. Onodera, S. Kohara, S. Tahara, A. Masuno, H. Inoue, M. Shiga, A. Hirata, K. Tsuchiya, Y. Hiraoka, I. Obayashi, K. Ohara, A. Mizuno, and O. Sakata, J. Ceram. Soc. Jpn. 127, 853 (2019). 10.2109/jcersj2.19143
  • 65 Y. Onodera, Y. Takimoto, H. Hijiya, T. Taniguchi, S. Urata, S. Inaba, S. Fujita, I. Obayashi, Y. Hiraoka, and S. Kohara, NPG Asia Mater. 11, 75 (2019). 10.1038/s41427-019-0180-4
  • 66 M. Murakami, S. Kohara, N. Kitamura, J. Akola, H. Inoue, A. Hirata, Y. Hiraoka, Y. Onodera, I. Obayashi, J. Kalikka, N. Hirao, T. Musso, A. S. Foster, Y. Idemoto, O. Sakata, and Y. Ohishi, Phys. Rev. B 99, 045153 (2019). 10.1103/PhysRevB.99.045153
  • 67 M. Cramer Pedersen, V. Robins, K. Mortensen, and J. J. K. Kirkensgaard, Proc. R. Soc. A 476, 20200170 (2020). 10.1098/rspa.2020.0170
  • 68 I. Ando, Y. Mugita, K. Hirayama, S. Munetoh, M. Aramaki, F. Jiang, T. Tsuji, A. Takeuchi, M. Uesugi, and Y. Ozaki, Mater. Sci. Eng. A 828, 142112 (2021). 10.1016/j.msea.2021.142112
  • 69 E. Minamitani, T. Shiga, M. Kashiwagi, and I. Obayashi, arXiv:2107.05865.
  • 70 A. Suzuki, M. Miyazawa, A. Okamoto, H. Shimizu, I. Obayashi, Y. Hiraoka, T. Tsuji, P. Kang, and T. Ito, Comput. Geosci. 143, 104550 (2020). 10.1016/j.cageo.2020.104550
  • 71 T. Ichinomiya, I. Obayashi, and Y. Hiraoka, Biophys. J. 118, 2926 (2020). 10.1016/j.bpj.2020.04.032
  • 72 K. Koseki, H. Kawasaki, T. Atsugi, M. Nakanishi, M. Mizuno, E. Naru, T. Ebihara, M. Amagai, and E. Kawakami, npj Syst. Biol. Appl. 6, 40 (2020). 10.1038/s41540-020-00160-8
  • 73 A. Oyama, Y. Hiraoka, I. Obayashi, Y. Saikawa, S. Furui, K. Shiraishi, S. Kumagai, T. Hayashi, and J. Kotoku, Sci. Rep. 9, 8764 (2019). 10.1038/s41598-019-45283-z
  • 74 H. Edelsbrunner, Technical Report, Champaign, IL, U.S.A. (1992).
  • 75 A. Tahbaz-Salehi and A. Jadbabaie, IEEE 47th Conference on Decision and Control, 2008, p. 4170. 10.1109/CDC.2008.4738751
  • 76 T. K. Dey, A. N. Hirani, and B. Krishnamoorthy, SIAM J. Comput. 40, 1026 (2011). 10.1137/100800245
  • 77 I. Obayashi, SIAM J. Appl. Algebra Geom. 2, 508 (2018). 10.1137/17M1159439
  • 78 T. K. Dey, T. Hou, and S. Mandal, in Computational Topology in Image Context, ed. R. Marfil, M. Calderón, F. Díaz del Río, P. Real, and A. Bandera (Springer, Cham, 2019) p. 123.
  • 79 I. Obayashi, arXiv:2109.11711.
  • 80 H. Adams, T. Emerson, M. Kirby, R. Neville, C. Peterson, P. Shipman, S. Chepushtanova, E. Hanson, F. Motta, and L. Ziegelmeier, J. Mach. Learn. Res. 18, 1 (2017).
  • 81 M. Kerber, D. Morozov, and A. Nigmetov, J. Exp. Algorithmics 22, 1 (2017). 10.1145/3064175 CrossrefGoogle Scholar
  • 82 G. Carlsson and A. Zomorodian, Discrete Comput. Geom. 42, 71 (2009). 10.1007/s00454-009-9176-0 CrossrefGoogle Scholar
  • 83 E. G. Escolar and Y. Hiraoka, Discrete Comput. Geom. 55, 100 (2016). 10.1007/s00454-015-9746-2 CrossrefGoogle Scholar
  • 84 W. Kim and F. Mémoli, J. Appl. Comput. Topol. 5, 533 (2021). 10.1007/s41468-021-00075-1 CrossrefGoogle Scholar
  • 85 H. Asashiba, E. G. Escolar, K. Nakashima, and M. Yoshiwaki, arXiv:1911.01637. Google Scholar
  • 86 H. Edelsbrunner and J. Harer, Computational Topology: An Introduction (American Mathematical Society, Providence, RI, 2010). Google Scholar
  • 87 R. Ghrist, Elementary Applied Topology (Createspace, North Charleston, SC, 2014) 1.0 ed. Google Scholar
  • 88 G. Carlsson, Bull. Am. Math. Soc. 46, 255 (2009). 10.1090/S0273-0979-09-01249-X CrossrefGoogle Scholar
  • 89 H. Edelsbrunner and J. Harer, Persistent Homology — A Survey (American Mathematical Society, Providence, RI, 2008) Contemp. Math., Vol. 453, p. 257. CrossrefGoogle Scholar
  • 90 M. Buchet, Y. Hiraoka, and I. Obayashi, in Persistent Homology and Materials Informatics, ed. I. Tanaka (Springer Singapore, Singapore, 2018) p. 75. CrossrefGoogle Scholar
  • 91 M. Vejdemo-Johansson, Sketches of a Platypus: A Survey of Persistent Homology and Its Algebraic Foundations (American Mathematical Society, Providence, RI, 2014) Contemp. Math., Vol. 620. Google Scholar
  • 92 Y. Hiraoka, Tanpakushitu Kouzou to Topology: Persistent Homology Gun Nyumon (Structure of Protein and Topology: Introduction to Persistent Homology) (Kyoritsu, Tokyo, 2013) [in Japanese]. Google Scholar

Author Biographies


Ippei Obayashi was born in Wakayama Prefecture, Japan in 1980. He obtained his B.Sc. (2004), M.Sc. (2006), and Ph.D. (2010) degrees from Kyoto University. He was a postdoctoral researcher at Kyoto University for the CREST mathematics program (2010–2015), an assistant professor (2015–2018) and an associate professor (2018) at AIMR, Tohoku University, and a research scientist at RIKEN AIP (2018–2021). He has been a professor at Okayama University since 2021. He has worked on the theory and applications of dynamical systems and topological data analysis. He is now especially interested in persistent homology, one of the main tools of topological data analysis. He has also developed HomCloud, persistent homology-based data analysis software.

Takenobu Nakamura was born in Yamaguchi Prefecture, Japan in 1978. He obtained his B.Sc. degree (2002) from Keio University and his M.Sc. (2004) and Ph.D. (2007) degrees from the University of Tokyo. He was a postdoctoral researcher at AIST for the CREST program (2007–2012), a postdoctoral researcher at FZ Jülich in Germany (2012), and an assistant professor at AIMR, Tohoku University (2012–2017), and has been a senior researcher at AIST since 2017. He has worked on the theory and simulation of non-equilibrium soft matter and the application of topological data analysis to chemical physics.

Yasuaki Hiraoka was born in Oita Prefecture, Japan in 1978. He obtained his B.Eng. (2000), M.Sc. (2002), and D.Sc. (2005) degrees from Osaka University, Japan. He worked at Hiroshima University (2006–2011), Kyushu University (2011–2015), and Tohoku University (2015–2018) as an assistant professor, associate professor, and professor, and now works at Kyoto University (Director of Center for Advanced Study, Professor, and Deputy Director of ASHBi). He has been working on applied mathematics, especially topological data analysis and applied topology.