XRP Network and Proposal of Flow Index

XRP is a modern crypto-asset (crypto-currency) developed by Ripple Labs, which has been increasing its financial presence. We study its transaction history, available as ledger data. An analysis of its basic statistics, correlations, and network properties is presented. Motivated by the behavior of some nodes with histories of large transactions, we propose a new index: the "Flow Index." The Flow Index is a pair of indices suitable for characterizing the transaction frequencies of a node as a source and as a destination. Using this Flow Index, we study the global structure of the XRP network and construct a bow-tie/walnut structure.


Introduction
The world of crypto-assets is dynamic and complex. (We use the term "crypto-asset" instead of "crypto-currency" throughout this paper; see [1].) The presence of XRP in the financial market has steadily increased since its inception in early 2013. Understanding the nature of this world is important.
Ever since its inception, all transaction data (except for a few, which we elaborate on in the next section) have been stored and made available through various media, giving researchers ample opportunity to study their intriguing properties, as is the case for Bitcoin [2][3][4].
Traditional monetary transactions through financial institutions, such as banks, are crucial for analyzing and understanding inter-firm and firm-household relationships. However, the availability of such data is quite limited because of privacy concerns, except for a few rare cases [5].
In this study, we present a basic analysis of the XRP world observed through ledger data, which is the transaction record comprising the amount of XRP, source account, destination account, and the day and time (in coordinated universal time (UTC)) of transactions. (For a readable introduction to XRP, see [6].) Accounts are just 33-character codes such as "rfceigRxmgAjWR6LH1L7YsooWKMqM5Pr6," and no other information on the owner (name, address, etc.) is available. Although this makes interpreting the results of the analysis rather difficult, the importance and availability of the data make it a worthwhile endeavor.
In Section 2, we describe our ledger data and its basic properties, including the distribution of the number of transactions, properties of the time series with the day-of-the-week analysis, and correlations. Section 3 is devoted to analyzing the XRP network of the transactions, wherein nodes are accounts and edges are transactions. Section 4 describes the new proposal of the Modified Inverse Herfindahl-Hirschman Index and Flow Index. Section 5 is devoted to studying the global structure of the XRP network, similar to bow-tie/walnut decomposition, using the Flow Index. Section 6 offers concluding remarks.

Data and its Basic Statistics
The ledger data we analyze cover the 2,463 days from 1/2/2013 to 9/30/2019. The first 32,570 ledgers were lost because of "a mishap in 2012" [7].

Data Selection
From this data set, we extract data with the following criteria.
(1) The ledger contains transactions between 1,525 currencies and crypto-assets. Most of these are from XRP to XRP; however, some are from XRP to others, others to XRP, and others to others. A yearly breakdown of the XRP-XRP transactions and the rest is provided in Fig. 1. We use XRP-XRP transactions only.
(2) The data set contains "partial payments" [8], wherein the actual transferred amount differs from the transaction amount because of the payment of transfer fees. In reality, they are rare: 0.018% of all the XRP-XRP transactions, with a minimum of 0% in 2013 and a maximum of 0.041% in 2019. The distribution of "Amount" and "Delivered amount" is provided in Fig. 2, wherein we observe some patterns of quantization of the delivered amount and proportionality between the amount and the delivered amount. We drop these "partial payment" transactions from the following analysis.
After filtering, we arrive at the data of the sizes listed in Table II.
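The two selection criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout and the field names (`source_currency`, `destination_currency`, `amount`, `delivered_amount`) are hypothetical and are not taken from the ledger format, and a partial payment is identified here simply as a record whose delivered amount differs from its stated amount.

```python
from typing import Dict, List


def select_xrp_payments(transactions: List[Dict]) -> List[Dict]:
    """Keep XRP-to-XRP transactions that are not partial payments."""
    selected = []
    for tx in transactions:
        # Criterion (1): both sides of the transaction are XRP.
        if tx["source_currency"] != "XRP" or tx["destination_currency"] != "XRP":
            continue
        # Criterion (2): drop partial payments, identified here (by assumption)
        # as records whose delivered amount differs from the stated amount.
        if tx["delivered_amount"] != tx["amount"]:
            continue
        selected.append(tx)
    return selected


# Made-up records, for illustration only.
sample = [
    {"source_currency": "XRP", "destination_currency": "XRP",
     "amount": 100.0, "delivered_amount": 100.0},
    {"source_currency": "XRP", "destination_currency": "USD",
     "amount": 5.0, "delivered_amount": 5.0},
    {"source_currency": "XRP", "destination_currency": "XRP",
     "amount": 50.0, "delivered_amount": 49.5},
]
kept = select_xrp_payments(sample)
```

Only the first sample record survives both filters; the second is not XRP-XRP and the third is treated as a partial payment.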

Data Distributions
Fig. 2. Distribution of "Amount" and "Delivered amount" of the XRP-XRP transactions whose "Amount" is not equal to "Delivered amount." The dashed diagonal has a gradient equal to one, on which the amount is equal to the delivered amount.

The cumulative distribution function (CDF) of the annual XRP transaction amounts is plotted in Fig. 3, wherein the dashed straight line has a gradient of −1. The data for amounts above $10^{5}$ XRP fit this line well, except for the first year, 2013. This means that the distribution of the transaction amount has a power-law fat tail, proportional to ${\rm XRP}^{-1}$ in the CDF and ${\rm XRP}^{-2}$ in the probability distribution function (PDF). The absolute value of this exponent of the CDF tail is called the "Pareto index." The value found here, a Pareto index of 1, is known to be a phase transition point between an "Oligopoly" phase and a "Pseudo Equality" phase [9,10]: When the fat tail is thinner and the Pareto index is larger than 1, the entries with the highest ranks (the largest, second largest, and so forth) have zero shares in the limit of an infinite number of entries. In contrast, when the Pareto index is smaller than 1, the top-ranking ones have finite shares even when an infinite number of entries exists. The top 10 transactions (in amount) are listed in Table III. As shown in Table II, the total traded amount is ∼ $1.1651 \times 10^{12}$ XRP. Therefore, the top two transactions in this table account for 17% of the total amount traded in all ∼ 45 million transactions, which is indeed a large share.
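As a rough illustration of how a Pareto index and the share of the top entries can be estimated from a list of amounts, the following Python sketch applies the Hill (maximum-likelihood) estimator to synthetic data. It is not the fitting procedure used for Fig. 3, which is not specified in the text; the function names, the synthetic sample, and the choice of `x_min` are assumptions made for this example.

```python
import math
import random


def hill_pareto_index(values, x_min):
    """Hill estimate of the CDF tail exponent, using values >= x_min."""
    tail = [v for v in values if v >= x_min]
    if not tail:
        raise ValueError("no values at or above x_min")
    log_sum = sum(math.log(v / x_min) for v in tail)
    return len(tail) / log_sum


def top_share(values, k):
    """Share of the total carried by the k largest values."""
    ordered = sorted(values, reverse=True)
    return sum(ordered[:k]) / sum(ordered)


random.seed(0)
# Synthetic amounts with P(X > x) = 1/x for x >= 1, i.e. Pareto index 1.
synthetic = [1.0 / (1.0 - random.random()) for _ in range(20000)]
estimate = hill_pareto_index(synthetic, x_min=1.0)
share = top_share(synthetic, k=2)
```

On real data the estimate depends on the chosen `x_min`; the text above suggests amounts above $10^{5}$ XRP as the range where the power law holds.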
This criticality of the Pareto index being equal to one is most easily understood by referring to the case of the size of firms in a country. Empirical analysis of firms in developed countries (such as Japan, France, Germany, and the United Kingdom) showed that the firm size (number of employees, amount of sales, or income) distribution has a Pareto index very close to one over many years. This is in contrast to the distribution of personal income, whose Pareto index varies around 2 (say, 1.5-2.5), depending on the economic situation of the year. For firms, business competition drives the Pareto index downward (fatter tail), as big firms attempt to dominate the market. In contrast, various political pressures and measures by the central bank and the ministries against monopoly and oligopoly act in the opposite direction. The current author argued that the balancing critical point is at a Pareto index equal to one [9]. However, there is a big difference between this argument on firm size and the current XRP transactions. The former is "stock," while the latter (the transaction amount we analyze) is "flow." Moreover, the XRP world is free from a central governing organization, and there is no measure against monopoly and oligopoly. Thus, the reason behind the current finding remains a mystery for the moment.

Fig. 4 shows the daily amount of transactions. Fig. 5 shows the daily number of users (blue, orange, and green lines show the number of sources, destinations, and either sources or destinations, respectively). We observe that both the amount and the number of users are highly volatile and that most users trade both as a source and as a destination.

Time Series
The left panel of Fig. 6 shows the day-of-the-week analysis of the time series. (Saturdays and Sundays in UTC are approximately weekends in most economically active regions, as Pacific Standard Time (PST) in the United States is UTC-8 and Japan Standard Time (JST) is UTC+9.) The absolute values of the Fourier components in the right panel present a clear peak at a period of seven days. The daily number of users also shows a reduction on weekends. However, the daily total amount of transactions does not show a clear periodicity. This may be because of its high volatility. This weekly periodicity is weaker in other years. A similar behavior was found in the analysis of the volume and number of Bitcoin transactions [2].
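The detection of a weekly period from Fourier components can be sketched as follows. This is a minimal illustration on a synthetic daily series, not the analysis behind the figure: the function name, the plain discrete Fourier transform, and the weekend dip of the sample series are all assumptions made for the example.

```python
import cmath


def dominant_period(series):
    """Period (in samples) of the largest non-constant Fourier component."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]  # remove the constant component
    best_k, best_amp = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, v in enumerate(centered))
        if abs(coeff) > best_amp:
            best_k, best_amp = k, abs(coeff)
    return n / best_k


# Synthetic example: 52 weeks of daily counts with a dip on two "weekend" days.
daily_counts = [600.0 if t % 7 in (5, 6) else 1000.0 for t in range(364)]
period = dominant_period(daily_counts)
```

For a highly volatile series, such as the daily total amount, the same peak may be buried under the noise, in line with the observation above.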

Correlation
There is a correlation that obeys an interesting phenomenological law. Fig. 7 shows the correlation between the daily number of users and the daily total amount of transactions, in which different colors show different years. The dashed line has gradient = 1.5, which means that

$$A \propto U^{1.5}, \qquad (1)$$

where $U$ denotes the daily number of users and $A$ the daily total amount of transactions. We observe yearly development toward larger numbers of users and larger amounts of transactions roughly along this line. A close examination of the annual data reveals that the distribution splits into a group above and another below this line in 2013 and 2014, respectively; however, the two converge in later years. This power law is curious, calling for the modeling of agents in this XRP world: Since Eq. (1) implies

$$A/U \propto U^{0.5}, \qquad (2)$$

the average amount of transactions per user grows with the number of users. This points to an interesting characteristic of the herding behavior in XRP trade: on a day of high activity, with a number of users larger than usual, the average amount of transactions increases. For example, if the number of users becomes 10 times as large, the average amount of transactions becomes $\sqrt{10} \simeq 3.2$ times as large.
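The exponent of such a power law is commonly estimated by a least-squares fit on log-log axes, and the factor $\sqrt{10}$ follows directly from it. The sketch below illustrates both steps on synthetic data lying exactly on a line of gradient 1.5; the function name and the sample values are assumptions of this example, and the fit is not claimed to be the procedure used for Fig. 7.

```python
import math


def fit_power_law_exponent(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx


# Synthetic daily data: total amount proportional to (number of users)^1.5.
users = [100, 300, 1000, 3000, 10000]
amounts = [2.0 * u ** 1.5 for u in users]
exponent = fit_power_law_exponent(users, amounts)

# Growth of the average amount per user when the number of users grows 10-fold.
average_gain = 10 ** (exponent - 1.0)
```

With an exponent of 1.5, `average_gain` equals $10^{0.5}$, which is the factor of about 3.2 quoted above.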
Other correlations between the number of destinations, number of sources, and number of transactions show no such behavior. All three correlations follow linear proportionality (power exponents are close to one).

XRP Network
Let us examine the network(s) that these transactions form, where nodes are accounts and edges are transactions. The CDFs of the in-degree and out-degree are plotted in Fig. 8. We observe that, except for 2015-2017, they also have a fat tail with Pareto index 1. Again, this is an interesting finding, awaiting deeper insight and/or modelling.
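As a sketch of how in-degrees, out-degrees, and their cumulative distributions can be obtained from a list of transactions, consider the following Python fragment. Two choices here are assumptions rather than statements about the analysis above: repeated transactions between the same ordered pair of accounts are counted as one edge, and the cumulative distribution is taken as the fraction of values at or above x, the form in which a tail of Pareto index 1 appears as a line of gradient −1.

```python
from collections import defaultdict


def degrees(edges):
    """Return (in_degree, out_degree) dicts, counting distinct counterparties."""
    senders = defaultdict(set)    # destination -> set of sources
    receivers = defaultdict(set)  # source -> set of destinations
    for source, destination in edges:
        receivers[source].add(destination)
        senders[destination].add(source)
    in_degree = {node: len(s) for node, s in senders.items()}
    out_degree = {node: len(r) for node, r in receivers.items()}
    return in_degree, out_degree


def ccdf(values):
    """Empirical P(X >= x) at each distinct x, in ascending order of x."""
    n = len(values)
    ordered = sorted(values)
    result = []
    for i, v in enumerate(ordered):
        if i == 0 or v != ordered[i - 1]:
            result.append((v, (n - i) / n))
    return result


# Made-up edge list: account "a" pays "b" twice, "c" once; "d" pays "b" once.
edges = [("a", "b"), ("a", "b"), ("a", "c"), ("d", "b")]
in_degree, out_degree = degrees(edges)
```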

Nodes with Large Transaction History
As noted above, the transactions cover a vast range, from $10^{-6}$ XRP (the minimum unit) to ∼ $10^{11}$ XRP (Fig. 3). As dealing with them all is unproductive, we introduce a threshold for their biggest transaction. Let us first examine nodes with transactions equal to or greater than $10^{7}$, $10^{8}$, and $10^{9}$ XRP at least once during the entire period (2013-2019). The network sizes they form are listed in Table IV, and the corresponding networks are visualized in Fig. 9. In denoting the nodes, we use the set of 1,136 nodes in the ≥ $10^{7}$ category and name each node "bn" plus its number (0001-1136) in the set. (Hereafter, we shall call those 1,136 nodes "big nodes," and the 94 nodes with threshold = $10^{9}$ XRP "huge nodes.") For example, "bn0001" is the first node in the set of big nodes.
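The threshold selection can be sketched as below. Several details are assumptions made for illustration, since the text does not fix them: both the source and the destination of a transaction at or above the threshold are selected, the numbering follows the sorted order of the account strings, and the edges are all directed pairs with at least one transaction (of any amount) between selected nodes.

```python
def select_big_nodes(transactions, threshold):
    """Nodes taking part in at least one transaction of amount >= threshold.

    `transactions` is an iterable of (source, destination, amount) tuples.
    Returns (labels, edges): labels maps account -> "bnNNNN", and edges is the
    set of directed (source, destination) pairs among the selected nodes.
    """
    transactions = list(transactions)
    selected = set()
    for source, destination, amount in transactions:
        if amount >= threshold:
            selected.update((source, destination))
    # Hypothetical numbering: sorted order of the account strings.
    labels = {acct: f"bn{i:04d}"
              for i, acct in enumerate(sorted(selected), start=1)}
    edges = {(s, d) for s, d, _ in transactions
             if s in selected and d in selected}
    return labels, edges


# Made-up transactions: only the first one reaches the 10^7 XRP threshold.
txs = [("x", "y", 2e7), ("y", "z", 5.0), ("x", "y", 1.0), ("w", "x", 3.0)]
labels, edges = select_big_nodes(txs, threshold=1e7)
```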

Some Notable Big Nodes
Some of the nodes that made these huge transactions have rather notable transaction histories, some of which are listed below.

Pair Nodes. This is a pair of nodes between which a large amount of XRP was transferred, with no other notable activity. An example is the pair (bn0347, bn0864), which accounts for the top two entries in Table III. Within a minute, $2\times10^{11}$ XRP was transferred from the former to the latter. The former had no other activity, while the latter had numerous transactions whose amounts are negligible compared to these transactions. Their transaction histories are provided in Fig. 10, where blue dots show the day and amount of transactions as destination (receipt of XRP), red dots show those of transactions as source, and green lines show the balance, assuming it is zero initially.

Bridge Nodes. This type of node receives a large amount of XRP and sends it to another node, with no other notable activity. The node bn0530, the third in Table III, is an example of this case; its transaction history is plotted in Fig. 11.

In characterizing these nodes by the amount and frequency of their transactions, it is important to note that some nodes make both these huge transactions and many transactions of small amounts. A good example is the node bn0864 in Fig. 10: This node made several transactions of small amounts as both a destination and a source. However, they are negligible compared to the two large transactions on the same day, totaling $2.0\times10^{11}$ XRP, which are the only significant transactions in characterizing the transaction behavior of this node. As this example makes clear, a simple count of the number of transactions (as a source or destination) or the total number of transactions cannot be considered a good measure of a node's activity. What counts is the number of "significant" (in amount) transactions.

Table IV. Sizes of the networks formed by the nodes selected with each threshold.
Threshold (XRP)   # Nodes   # Edges
10^7              1,136     5,187
10^8              262       685
10^9              94        170

Herfindahl-Hirschman Index
The Herfindahl-Hirschman Index [11] (hereafter abbreviated to "HH Index") is used in several areas of data analysis to quantify how numbers are distributed among the components of a list. Consider a list ℓ of N non-negative numbers whose sum is equal to 1:

$$\ell = \{\ell_1, \ell_2, \dots, \ell_N\}, \qquad \sum_{i=1}^{N} \ell_i = 1. \qquad (3)$$

(One might think of this list as a list of shares: For example, the first entry $\ell_1$ is the share of the first firm in the sales of a certain good, $\ell_2$ the share of the second firm, and so on.) Its HH Index H(ℓ) is defined as follows:

$$H(\ell) = \sum_{i=1}^{N} \ell_i^{2}, \qquad (4)$$

and satisfies 0 < H(ℓ) ≤ 1.
Here are some examples:

$$\ell = \{1, 0, \dots, 0\}: \quad H(\ell) = 1, \qquad (5)$$

$$\ell = \{\underbrace{1/m, \dots, 1/m}_{m}, 0, \dots, 0\}: \quad H(\ell) = 1/m. \qquad (6)$$

As presented here, the HH Index H(ℓ) is a measure of the concentration of the values in ℓ: If they are concentrated in just one component, H(ℓ) = 1. The less concentrated they are, the smaller H(ℓ) is. The inverse HH Index, 1/H(ℓ), may be used as a measure of the effective number of entries, as 1/H(ℓ) = m in the latter case (6). However, it has one undesirable property. In the next subsection, we describe this property and propose a modification to overcome it.
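The definition and the two limiting cases discussed above can be checked with a short Python sketch. The function names are chosen for this example only; the index itself is the standard sum of squared shares.

```python
def normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def hh_index(shares):
    """Herfindahl-Hirschman Index: sum of squared shares (shares sum to 1)."""
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(s * s for s in shares)


# Fully concentrated list: a single non-zero entry.
concentrated = hh_index([1.0, 0.0, 0.0, 0.0])

# m = 4 equal entries: the inverse HH Index returns the number of entries.
uniform = hh_index(normalize([1, 1, 1, 1]))
effective_number = 1.0 / uniform
```

Here `concentrated` equals 1 and `effective_number` equals 4, in agreement with the two cases in the text.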

Modified Inverse Herfindahl-Hirschman Index
Let us examine the behavior of the inverse HH Index for a generalization of (6).
$$\ell(r) = \frac{1}{m+r}\,\{\underbrace{1, \dots, 1}_{m},\, r,\, 0, \dots, 0\}$$

with 0 ≤ r ≤ 1, which has

$$H(\ell(r)) = \frac{m + r^{2}}{(m+r)^{2}}.$$

The inverse is plotted in Fig. 12 as a function of m + r (≡ x). As shown there, 1/H(ℓ(r)) = m at integer values of x = m (r = 0), as noted above. However, it flattens as r → 1, and the derivative with respect to x is discontinuous at x = m. Essentially, the inverse HH Index is much less sensitive to the reduction in the distribution of numbers (r decreases starting at r = 1 for fixed m − 1) than to its expansion (r increases starting at r = 0 for a fixed m). It also deviates in certain ways from the dashed diagonal line f(x) = x, while having a measure closer to this line is desirable.

Fig. 12. Behavior of the inverse HH Index 1/H(ℓ(r)) as a function of x = m + r.
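The flattening and the discontinuity of the derivative can be made explicit with a short calculation. It assumes that the generalized list consists of m equal entries plus one entry that is r times as large, normalized to sum to 1; this form is inferred from the properties described in the text (value m at r = 0, flattening as r → 1) and should be read as an assumption.

```latex
\begin{align*}
\ell(r) &= \frac{1}{m+r}\,\{\underbrace{1,\dots,1}_{m},\, r,\, 0,\dots,0\},
  \qquad 0 \le r \le 1,\\[4pt]
\frac{1}{H(\ell(r))} &= \frac{(m+r)^{2}}{m+r^{2}},\\[4pt]
\frac{d}{dr}\,\frac{1}{H(\ell(r))}
  &= \frac{2(m+r)(m+r^{2}) - 2r\,(m+r)^{2}}{(m+r^{2})^{2}}
   = \frac{2m\,(m+r)\,(1-r)}{(m+r^{2})^{2}}
   = \begin{cases}
       2 & (r = 0),\\
       0 & (r = 1).
     \end{cases}
\end{align*}
```

Since x = m + r, the slope in x equals the slope in r within each unit interval. The curve therefore arrives at every integer x with slope 0 and leaves it with slope 2, which is the discontinuity of the derivative noted above.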
Because of this property, one may choose n to be a large number for analysis. In the following, we use n = 20 because the difference observed in the right panel of Fig. 13 is at most ∼ 1.4%.

Flow Index
The modified inverse HH Index defined above is useful for quantifying a node's transaction history. Let us denote the time series of the daily outflow by $f_{\rm out}$ and the daily inflow by $f_{\rm in}$ for this discussion. All the components of $f_{\rm out}$ and $f_{\rm in}$ are positive. Their "normalized" (in the sense that the total of all components is equal to 1, as in Eq. (3)) versions are denoted $\tilde{f}_{\rm out}$ and $\tilde{f}_{\rm in}$. In a case with no flow, for example, $f_{\rm out} = \{\}$ (an empty set), we define $\tilde{f}_{\rm out} = \{\}$ and $M_n(f_{\rm out}) = 0$, and so on.
Here we are dealing with the flows aggregated daily. Alternatively, one may deal with tick data of the flows. The difference is that if a node makes several large transactions within a short period of time, treating them as one transaction is most appropriate. Daily aggregation takes care of them unless several transactions are made in a time window that includes 0:00 UTC. For this reason, the daily aggregation is chosen in this study.
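Daily aggregation of a node's flows can be sketched as follows. The tuple layout of the transactions, the function name, and the sample dates and accounts are assumptions of this example; only the grouping by UTC calendar day reflects the procedure described above.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def daily_flows(transactions, node):
    """Aggregate a node's flows by UTC calendar day.

    `transactions` is an iterable of (timestamp, source, destination, amount)
    with timezone-aware datetimes. Returns (f_out, f_in): lists of positive
    daily totals in date order; days without flow are simply absent.
    """
    out_by_day = defaultdict(float)
    in_by_day = defaultdict(float)
    for timestamp, source, destination, amount in transactions:
        day = timestamp.astimezone(timezone.utc).date()
        if source == node:
            out_by_day[day] += amount
        if destination == node:
            in_by_day[day] += amount
    f_out = [out_by_day[d] for d in sorted(out_by_day)]
    f_in = [in_by_day[d] for d in sorted(in_by_day)]
    return f_out, f_in


# Made-up history: two large transfers 50 s apart, then a small one days later.
t0 = datetime(2013, 1, 2, 12, 0, 0, tzinfo=timezone.utc)
txs = [
    (t0, "A", "B", 1.0e11),
    (t0 + timedelta(seconds=50), "A", "B", 1.0e11),
    (t0 + timedelta(days=3), "C", "B", 10.0),
]
f_out, f_in = daily_flows(txs, "B")
```

The two large transfers fall on the same UTC day and are merged into a single inflow entry, as intended; two transfers straddling 0:00 UTC would not be merged.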
Take the node bn0864 shown in the right panel of Fig. 10. This node fits the case discussed above: two transactions of $1.0\times10^{11}$ XRP each were made within 50 s. Daily aggregation treats them as one $2.0\times10^{11}$ XRP transaction. The node also received many small inflows over 17 days. Therefore, its effective inflow history is best summarized as "(very close to) just one occasion." Its modified inverse HH Index is, in fact, $M_{20}(f_{\rm in}) = 1.00005$, quantifying this fact. This is not the end of the story. This node made payments on two days (two red dots in the right panel of Fig. 10), and its modified inverse HH Index for the outflow is $M_{20}(f_{\rm out}) = 1.20427$. However, the amount of outflow was negligible compared to the inflow. We need to discount the outflow relative to the inflow for this node.
To do so for all nodes, we introduce the following quantity, the Flow Index, a pair of indices $(A_1, A_2)$ for the outflow and the inflow, respectively:
a very satisfactory result. One may think of modifying the above by using the total volume of flows instead of maximum values. However, that does not work. This can be further explored by our readers.

Global Structure of the XRP Network
The left panel of Fig. 14 is a scatter plot of the 1,136 big nodes (green, threshold = $10^{7}$ XRP) and the 94 huge nodes (blue, threshold = $10^{9}$ XRP) (see Table IV) on the Flow Index plane. We observe that the nodes are distributed somewhat widely in the lower-left part of the Flow Index plane. This implies that nodes with a small number of effective transactions (as counted by the Flow Index) tend to trade unevenly as destination and as origin. In contrast, those with a higher number of transactions are located close to the diagonal, meaning that they tend to trade as destinations and origins evenly. This tendency is true for both big nodes and huge nodes.
The right panel of Fig. 14 shows the details of the left panel. We observe the existence of nodes that mostly participate in one mode. Motivated by this type of distribution, we classify the nodes with $A_1 \le 0.5$ (red rectangle) as "OUT" components, as they are mostly on the destination side. This means they are at the final goal of XRP when viewed as part of the whole network. We found 193 such nodes. Similarly, we classify the nodes with $A_2 \le 0.5$ (purple rectangle) as "IN" components; there are 52 of them.
Using these criteria, we can draw bow-tie/walnut-like structures [12], as shown in Fig. 15. This characterizes the global structure of the XRP network, which forms the basis for understanding the dynamics and development of this complex structure.
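The classification rule of this section can be written as a small function. This is a sketch under assumptions: the two arguments are the two components of the Flow Index in the order used above, the label "OTHER" for the remaining nodes is hypothetical, and the order of the two tests (which decides the case where both components are at or below the cutoff) is a choice made here, as the text does not specify it.

```python
def classify_node(a1, a2, cutoff=0.5):
    """Classify a node from its Flow Index pair (a1, a2)."""
    if a1 <= cutoff:
        return "OUT"    # mostly on the destination side
    if a2 <= cutoff:
        return "IN"     # mostly on the source side
    return "OTHER"      # hypothetical label for the remaining nodes


# Made-up Flow Index values, for illustration only.
examples = {
    "node_1": classify_node(0.0, 1.0),
    "node_2": classify_node(3.2, 0.1),
    "node_3": classify_node(5.0, 4.7),
}
```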

Concluding Remarks
In this study, we presented an analysis of XRP transaction records from ledger data. These data are huge and complex: In addition to the number of transactions, the distribution of traded amount, frequency of transactions, and so on cover huge ranges, some of which span 17 orders of magnitude ($10^{-6}$ to $10^{11}$ XRP). Notable empirical findings include the power-law distribution of several quantities with a Pareto index close to one, and a power-law correlation between the daily number of transactions and the daily amount of transactions. The former remains a puzzle: While a Pareto index equal to one is known to be the beginning of the monopoly/oligopoly phase, we have a reasonable explanation behind it only for stock quantities, such as the number of employees at firms. The current transaction amount between nodes is a flow. The latter, the power-law correlation, may be explained in terms of "herding behavior." Proper modelling based on a deeper analysis of the current data should lead to an explanation of this correlation. These subjects are worth more extensive exploration in the future.
The XRP network, a directed network with nodes as transaction accounts and edges as transactions, is another main subject of this research. To concentrate on the central structure of this network, we placed a threshold on the maximum transaction amount of each node. We selected nodes that made transactions of at least $10^{7}$ XRP at least once and called them big nodes. To examine the transaction frequency while considering the huge range of transaction amounts each node makes, we defined a new index called the "Flow Index," borrowing and extending the idea of the Herfindahl-Hirschman Index. We further introduced a classification of nodes using the Flow Index and arrived at a view of the entire network as a bow-tie/walnut-like structure.
We believe this work establishes a foundation for not only the XRP network but also other dynamic networks of transactions. Further research on this network should reveal details of the activities and clarification of each node's characteristics on the structure of the bow-tie/walnut-like decompositions.

Note added in proof: Toward the end of writing this manuscript, the author learned of a new paper by Fujiwara and Islam [13], where they examined the Bitcoin network formed by "regular users." This approach is complementary to our current analysis. While the latter chooses to select users based on the number of transactions, the former analyzes the frequency of transactions. The current author believes that a new approach based on both ways of thinking and picking up good features from both is waiting for us in the near future.