1 Introduction

The study of complex networks has recently attracted increasing interest in the scientific community, as it allows us to model and understand a large number of real-world systems [4]. This is particularly relevant given the growing amount of available data describing the interactions and dynamics of real-world systems. Typical examples of complex networks include metabolic networks [8], protein interactions [7], brain networks [17] and scientific collaboration networks [11].

One of the key problems in network science is that of identifying the most relevant nodes in a network. This importance measure is usually called the centrality of a vertex [9]. A number of centrality indices have been introduced in the literature [2, 46, 10, 14], each of them capturing different but equally significant aspects of vertex importance. Commonly encountered examples are the degree, closeness and betweenness centrality [5, 6, 10]. A closely related problem is that of measuring the centrality of an edge [3, 9]. Most edge centrality indices are developed as variants of vertex centrality ones. A common way to define an edge centrality index is to apply the corresponding vertex centrality to the line graph of the network being studied. Recall that, given a graph \(G = (V,E)\), the line graph \(\mathcal {L}(G) = (V',E')\) is a dual representation of G where each node \(uv \in V'\) corresponds to an edge \((u,v) \in E\), and there exists an edge between two nodes of \(\mathcal {L}(G)\) if and only if the corresponding edges of G share a vertex. By measuring the vertex centrality on \(\mathcal {L}(G)\), one can map it back to the edges of G to obtain a measure of edge centrality. However, as observed by Koschützki et al. [9], this approach does not yield the same result as the direct definition of the edge centrality on G. Moreover, the size of the line graph can be quadratic in the size of the original graph, thus making it hard to scale to large networks when the chosen centrality measure is computationally demanding.
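The line graph construction recalled above can be illustrated with a short sketch. The function names `line_graph` and `edge_degree_centrality` and the toy edge list are illustrative assumptions, not part of the original work; the code simply links two edges whenever they share an endpoint.

```python
from itertools import combinations

def line_graph(edges):
    """Line graph of an undirected simple graph, as an adjacency dict.

    Each node of the line graph is an edge (u, v) of the original graph;
    two such nodes are adjacent iff the original edges share a vertex.
    """
    nodes = [tuple(sorted(e)) for e in edges]
    adj = {e: set() for e in nodes}
    for e, f in combinations(nodes, 2):
        if set(e) & set(f):
            adj[e].add(f)
            adj[f].add(e)
    return adj

def edge_degree_centrality(edges):
    """Degree of each edge of G, measured as a vertex degree in the line graph."""
    return {e: len(nbrs) for e, nbrs in line_graph(edges).items()}

# Hypothetical toy graph: a star centred on vertex 0 with a pendant edge (1, 4).
edges = [(0, 1), (0, 2), (0, 3), (1, 4)]
dc = edge_degree_centrality(edges)
```

In this sketch the degree of an edge \((u,v)\) in the line graph equals \(d(u)+d(v)-2\), which is why the pairwise loop, although simple, becomes expensive when the original graph contains high-degree vertices.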

In this paper, we introduce a novel edge centrality measure rooted in quantum information theory. More specifically, we propose to measure the importance of an edge in terms of its contribution to the Von Neumann entropy of the network [13]. This can be measured in terms of the Holevo quantity, a well-known quantum information theoretical measure that has recently been applied to the analysis of graph structure [15, 16]. We also show how to approximate this quantity in the case of large networks, where computing the exact value of the Von Neumann entropy is not feasible. This in turn highlights a strong connection between the Holevo edge centrality and the negative degree centrality on the line graph. Finally, we perform a series of experiments to evaluate the proposed edge centrality measure on real-world as well as synthetic graphs, and we compare it against a number of widely used alternative measures.

The remainder of the paper is organised as follows: Sect. 2 reviews the necessary quantum information theoretical background and Sect. 3 introduces the proposed edge centrality measure. The experimental evaluation is presented in Sect. 4 and Sect. 5 concludes the paper.

2 Quantum Information Theoretical Background

2.1 Quantum States and Von Neumann Entropy

In quantum mechanics, a system can be either in a pure state or a mixed state. Using the Dirac notation, a pure state is represented as a column vector \(\left| \psi _i\right\rangle \). A mixed state, on the other hand, is an ensemble of pure quantum states \(\left| \psi _i\right\rangle \), each with probability \(p_i\). The density operator of such a system is a positive unit-trace matrix defined as

$$\begin{aligned} \rho = \sum _i p_i \left| \psi _i\right\rangle \left\langle \psi _i\right| . \end{aligned}$$
(1)

The Von Neumann entropy [12] S of a mixed state is defined in terms of the trace and logarithm of the density operator \(\rho \)

$$\begin{aligned} S(\rho ) = -{{\mathrm{tr}}}(\rho \ln \rho )=-\sum _i \lambda _i \ln (\lambda _i) \end{aligned}$$
(2)

where \(\lambda _1,\ldots ,\lambda _n\) are the eigenvalues of \(\rho \). If \(\left\langle \psi _i\right| \rho \left| \psi _i\right\rangle =1\), i.e., the quantum system is in a pure state \(\left| \psi _i\right\rangle \) with probability \(p_i=1\), then the Von Neumann entropy \(S(\rho ) = -{{\mathrm{tr}}}(\rho \ln {\rho })\) is zero. On the other hand, a mixed state always has a non-zero Von Neumann entropy associated with it.
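As a small numerical illustration of Eq. 2, the following sketch computes the Von Neumann entropy from the eigenvalues of a density matrix. It assumes NumPy is available; the function name `von_neumann_entropy` and the tolerance `tol` (used to apply the convention \(0 \ln 0 = 0\)) are choices made here for illustration.

```python
import numpy as np

def von_neumann_entropy(rho, tol=1e-12):
    """S(rho) = -sum_i lambda_i ln(lambda_i), treating 0 ln 0 as 0."""
    eigvals = np.linalg.eigvalsh(rho)      # rho is Hermitian
    eigvals = eigvals[eigvals > tol]       # drop numerically zero eigenvalues
    return float(-np.sum(eigvals * np.log(eigvals)))

# A pure state |psi><psi| and the maximally mixed state on two levels.
psi = np.array([1.0, -1.0]) / np.sqrt(2.0)
pure = np.outer(psi, psi)
mixed = np.eye(2) / 2.0

s_pure = von_neumann_entropy(pure)
s_mixed = von_neumann_entropy(mixed)
```

Under these assumptions the pure state yields an entropy of zero up to floating-point error, while the maximally mixed two-level state yields \(\ln 2\).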

2.2 A Mixed State from the Graph Laplacian

Let \(G=(V,E)\) be a simple graph with n vertices and m edges. We assign the vertices of G to the elements of the standard basis of a Hilbert space \(\mathcal {H}_{G}\), \(\{\mathinner {|{1}\rangle },\mathinner {|{2}\rangle },...,\mathinner {|{n}\rangle }\}\). Here \(\mathinner {|{i}\rangle }\) denotes a column vector with 1 in the i-th position and 0 elsewhere. The graph Laplacian of G is the matrix \(L=D-A\), where A is the adjacency matrix of G and D is the diagonal matrix with elements \(d(u) = \sum _{v=1}^n A(u,v)\). For each edge \(e_{i,j}\), we define a pure state

$$\begin{aligned} \left| e_{i,j}\right\rangle :=\frac{1}{\sqrt{2}}(\left| i\right\rangle -\left| j\right\rangle ). \end{aligned}$$
(3)

Then we can define the mixed state \(\{\frac{1}{m},\left| e_{i,j}\right\rangle \}\) with density matrix

$$\begin{aligned} \rho (G):=\frac{1}{m}{\displaystyle \sum \limits _{\{i,j\}\in E}} \left| e_{i,j}\right\rangle \left\langle e_{i,j}\right| =\frac{1}{2m}L(G). \end{aligned}$$
(4)

Let us define the Hilbert spaces \(\mathcal {H}_{V}\cong \mathbb {C}^{V}\), with orthonormal basis \(\mathbf {a}_{v}\), where \(v\in V\), and \(\mathcal {H}_{E}\cong \mathbb {C}^{E}\), with orthonormal basis \(\mathbf {b}_{u,v}\), where \(\{u,v\}\in E\). It can be shown that the graph Laplacian corresponds to the partial trace of a rank-1 operator on \(\mathcal {H}_{V}\otimes \mathcal {H}_{E}\) which is determined by the graph structure [1]. As a consequence, the Von Neumann entropy of \(\rho (G)\) can be interpreted as a measure of the amount of entanglement between a system corresponding to the vertices and a system corresponding to the edges of the graph [1].
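The identity \(\rho (G) = L(G)/(2m)\) of Eq. 4 can be checked numerically. The sketch below assumes NumPy and indexes vertices from 0 rather than 1; the helper names `density_matrix` and `laplacian` and the example graph are illustrative only.

```python
import numpy as np

def density_matrix(n, edges):
    """rho(G) = (1/m) * sum over edges {i,j} of |e_ij><e_ij| (vertices 0..n-1)."""
    rho = np.zeros((n, n))
    for i, j in edges:
        e = np.zeros(n)
        e[i], e[j] = 1.0 / np.sqrt(2.0), -1.0 / np.sqrt(2.0)   # Eq. 3
        rho += np.outer(e, e)
    return rho / len(edges)

def laplacian(n, edges):
    """Graph Laplacian L = D - A of an undirected simple graph."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

# Hypothetical example: a 4-vertex graph with 4 edges.
n, edges = 4, [(0, 1), (1, 2), (2, 3), (0, 2)]
rho = density_matrix(n, edges)
L = laplacian(n, edges)
```

Each edge state contributes \(1/2\) to the two diagonal entries of its endpoints and \(-1/2\) to the corresponding off-diagonal entries, so the sum over edges reproduces \(L/2\) before the \(1/m\) normalisation, and the resulting matrix has unit trace.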

2.3 Holevo Quantity of a Graph Decomposition

Given a graph G, we can define an ensemble in terms of its subgraphs. Recall that a decomposition of a graph G is a set of subgraphs \(H_1,H_2,...,H_k\) that partition the edges of G, i.e., \(\bigcup _{i=1}^k H_i=G\) and, for all \(i \ne j\), \(E(H_i)\cap E(H_j)=\emptyset \), where E(G) denotes the edge set of G. Notice that isolated vertices do not contribute to a decomposition, so each \(H_i\) can always be seen as a subgraph that contains all the vertices. If we let \(\rho (H_1),\rho (H_2),...,\rho (H_k)\) be the mixed states of the subgraphs, the probability of \(H_i\) in the mixture \(\rho (G)\) is given by \(|E(H_i)|/|E(G)|\). Thus, we can generalise Eq. 4 and write

$$\begin{aligned} \rho (G)=\sum _{i=1}^k \frac{|E(H_{i})|}{|E(G)|}\rho (H_i). \end{aligned}$$
(5)

Consider a graph G and its decomposition \(H_1,H_2,...,H_k\) with corresponding states \(\rho (H_1),\rho (H_2),...,\rho (H_k)\). Let us assign \(\rho (H_1),\rho (H_2),...,\rho (H_k)\) to the elements of an alphabet \(\{a_1,a_2,...,a_k\}\). In quantum information theory, the classical concepts of uncertainty and entropy are extended to deal with quantum states, where uncertainty about the state of a quantum system can be expressed using the density matrix formalism. Assume a source emits letters from the alphabet and that the letter \(a_i\) is emitted with probability \(p_i = |E(H_i)|/|E(G)|\). An upper bound to the accessible information is given by the Holevo quantity of the ensemble \(\{p_i,\rho (H_i)\}\):

$$\begin{aligned} \chi (\left\{ p_i,\rho (H_i)\right\} ) = S\left( \sum \limits _{i=1}^k p_i\rho (H_i)\right) -\sum \limits _{i=1}^k p_i S(\rho (H_i)) \end{aligned}$$
(6)
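A minimal sketch of Eq. 6 for an arbitrary edge decomposition follows. It assumes NumPy, 0-based vertex labels and a decomposition passed as a list of edge lists; the names `rho_from_edges`, `entropy` and `holevo` are hypothetical helpers introduced here.

```python
import numpy as np

def rho_from_edges(n, edges):
    """Density matrix L/(2m) of the graph on n vertices with the given edges."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    return L / (2.0 * len(edges))

def entropy(rho, tol=1e-12):
    """Von Neumann entropy from the eigenvalues of rho (0 ln 0 taken as 0)."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > tol]
    return float(-np.sum(lam * np.log(lam)))

def holevo(n, parts):
    """Holevo quantity (Eq. 6) of a decomposition given as a list of edge lists."""
    m = sum(len(p) for p in parts)
    probs = [len(p) / m for p in parts]            # p_i = |E(H_i)| / |E(G)|
    states = [rho_from_edges(n, p) for p in parts]
    mixture = sum(p * r for p, r in zip(probs, states))   # Eq. 5
    return entropy(mixture) - sum(p * entropy(r) for p, r in zip(probs, states))

# Path 0-1-2-3 decomposed into its three single edges.
n = 4
all_edges = [(0, 1), (1, 2), (2, 3)]
chi = holevo(n, [[e] for e in all_edges])
s_total = entropy(rho_from_edges(n, all_edges))
chi_trivial = holevo(n, [all_edges])
```

When every part is a single edge, each \(\rho (H_i)\) is a pure state and the Holevo quantity reduces to \(S(\rho (G))\); for the trivial decomposition consisting of G alone it vanishes.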

3 Holevo Edge Centrality

We propose to measure the centrality of an edge as follows. Let \(G = (V,E)\) be a graph with \(|E| = m\), and let \(H_e\) and \(H_{\overline{e}}\) denote the subgraphs over edge sets \(\{e\}\) and \(E \setminus \{e\}\), respectively. Note that \(S(\rho (H_e)) = 0\) and

$$\begin{aligned} \frac{m-1}{m} \rho (H_{\overline{e}})+\frac{1}{m}\rho (H_e) = \rho (G). \end{aligned}$$
(7)

Then the Holevo quantity of the ensemble \(\{((m-1)/m,H_{\overline{e}}),(1/m,H_e)\}\) is

$$\begin{aligned} \chi \left( \left\{ \left( \frac{m-1}{m},H_{\overline{e}}\right) ,\left( \frac{1}{m},H_e\right) \right\} \right) = S\left( \rho (G)\right) -\frac{m-1}{m}S\left( \rho (H_{\overline{e}})\right) \end{aligned}$$
(8)

Definition 1

For a graph \(G = (V,E)\), the Holevo edge centrality of \(e \in E\) is

$$\begin{aligned} HC(e) = \chi \left( \left\{ \left( \frac{m-1}{m},H_{\overline{e}}\right) ,\left( \frac{1}{m},H_e\right) \right\} \right) \end{aligned}$$
(9)

When ranking the edges of a graph G, the scaling factor \((m-1)/m\) is constant for all the edges and thus can be safely ignored. The Holevo edge centrality of an edge e is then a measure of the difference in Von Neumann entropy between the original graph and the graph where e has been removed. In other words, it can be seen as a measure of the contribution of e to the Von Neumann entropy of G. From a physical perspective, this can also be interpreted as the variation of the entanglement between a system corresponding to the vertices and a system corresponding to the edges of the graph (see the interpretation of the graph Laplacian in Sect. 2).
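Definition 1 can be turned into a direct, if naive, computation: one eigendecomposition for G and one for each edge-deleted subgraph. The sketch below assumes NumPy, 0-based vertex labels and at least two edges; `graph_entropy` and `holevo_edge_centrality` are names chosen here, not the authors' implementation.

```python
import numpy as np

def graph_entropy(n, edges, tol=1e-12):
    """Von Neumann entropy of rho(G) = L(G) / (2m) for vertices 0..n-1."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    lam = np.linalg.eigvalsh(L / (2.0 * len(edges)))
    lam = lam[lam > tol]
    return float(-np.sum(lam * np.log(lam)))

def holevo_edge_centrality(n, edges):
    """HC(e) = S(rho(G)) - ((m-1)/m) * S(rho(H_ebar)) for every edge e (Eq. 8)."""
    m = len(edges)
    s_g = graph_entropy(n, edges)
    hc = {}
    for k, e in enumerate(edges):
        rest = edges[:k] + edges[k + 1:]          # edge set E \ {e}
        hc[e] = s_g - (m - 1) / m * graph_entropy(n, rest)
    return hc

# Two small examples: a path on four vertices and a triangle.
path = [(0, 1), (1, 2), (2, 3)]
hc_path = holevo_edge_centrality(4, path)
triangle = [(0, 1), (1, 2), (0, 2)]
hc_tri = holevo_edge_centrality(3, triangle)
```

By symmetry, the two end edges of the path receive the same value and all three edges of the triangle are tied. Since each edge requires its own eigendecomposition, this exact computation scales poorly with m, which motivates the approximation discussed next.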

3.1 Relation with Degree Centrality

In this subsection we investigate the nature of the structural characteristics encapsulated by the Holevo edge centrality. Let \(G=(V,E)\) be a graph with n nodes, and let \(I_n\) be the identity matrix of size n. We rewrite the Shannon entropy \(-\sum _i\lambda _i \ln (\lambda _i)\) using the second order polynomial approximation \(k\sum _i\lambda _i(1-\lambda _i)\), where the value of k depends on the dimension of the simplex. We obtain

$$\begin{aligned} S(\rho (G)) = -{{\mathrm{tr}}}\left( \rho (G) \ln \rho (G)\right) \approx \frac{|V|\ln (|V|)}{|V|-1}{{\mathrm{tr}}}\left( \rho (G)(I_n - \rho (G))\right) \end{aligned}$$
(10)

By noting that \(\rho (G) = L(G)/(2m)\) and using some simple algebra, we can rewrite Eq. 10 as

$$\begin{aligned} S(\rho (G)) \approx \frac{|V|\ln (|V|)}{|V|-1}\left( 1 - \frac{1}{4m^2} \sum _{v\in V} \left( d^2(v)+d(v) \right) \right) \end{aligned}$$
(11)

where d(v) denotes the degree of the vertex v. This in turn allows us to approximate Eq. 9 as

$$\begin{aligned} HC(e) = S(\rho (G)) - S(\rho (H_{\overline{e}})) \approx -\frac{|V|\ln (|V|)}{|V|-1}\frac{d(u)+d(w)}{2m^2} \end{aligned}$$
(12)

where \(e=(u,w)\), we omitted the scaling factor \((m-1)/m\) and we made use of the fact that \(1/(4m^2) \approx 1/(4(m-1)^2)\).
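The algebra behind Eqs. 11 and 12 can be spelled out as follows. This is a sketch that relies only on \(\rho (G) = L(G)/(2m)\), \({{\mathrm{tr}}}(\rho (G)) = 1\), the fact that the entries of A are 0 or 1, and the identity \(\sum _{v} d(v) = 2m\); the symbol \(d_{\overline{e}}(v)\), denoting the degree of v in \(H_{\overline{e}}\), is introduced here for convenience.

$$\begin{aligned} {{\mathrm{tr}}}\left( \rho (G)(I_n - \rho (G))\right)&= 1 - \frac{1}{4m^2}{{\mathrm{tr}}}\left( L^2\right) , \\ {{\mathrm{tr}}}\left( L^2\right)&= \sum _{u,v} L(u,v)^2 = \sum _{v\in V} d^2(v) + \sum _{u \ne v} A(u,v) = \sum _{v\in V}\left( d^2(v)+d(v)\right) . \end{aligned}$$

Removing \(e=(u,w)\) lowers both \(d(u)\) and \(d(w)\) by one, and each such change reduces \(d^2+d\) by \(d^2 + d - (d-1)^2 - (d-1) = 2d\), so that

$$\begin{aligned} \sum _{v\in V}\left( d^2(v)+d(v)\right) - \sum _{v\in V}\left( d_{\overline{e}}^2(v)+d_{\overline{e}}(v)\right) = 2\left( d(u)+d(w)\right) . \end{aligned}$$

Multiplying by \(-\frac{|V|\ln (|V|)}{|V|-1}\frac{1}{4m^2}\), and treating \(1/(4(m-1)^2)\) as equal to \(1/(4m^2)\), gives the right-hand side of Eq. 12.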

Fig. 1. The Holevo edge centrality and its quadratic approximation on a barbell graph. Here the edge thickness is proportional to the value of the centrality. In (a) the blue edges have a higher centrality than the red edges, but in (b) all these edges (blue) have the same degree centrality. (Color figure online)

Equation 12 shows that the quadratic approximation of the Holevo centrality is (almost) linearly correlated with the negative edge degree centrality (see Sect. 4). This in turn gives us an important insight into the nature of the Holevo edge centrality. However, the quadratic approximation captures only part of the structural information encapsulated by the exact centrality measure. In particular, Passerini and Severini [13] suggested that those edges that create longer paths, nontrivial symmetries and connected components result in a larger increase of the Von Neumann entropy. Therefore, such edges should have a high centrality value, higher than what the degree information alone would suggest.

Figure 1 shows an example of such a graph, where the central bridge has a high value of the exact Holevo edge centrality, but a relatively low value of the approximated edge centrality. In Fig. 1(b), the blue edges all have the same degree centrality, i.e., they are all adjacent to four other edges. However, from a structural point of view, the removal of the edges connecting the two cliques at the ends of the barbell graph would have a higher impact, as it would disconnect the graph. As shown in Fig. 1(a), the Holevo centrality captures this structural difference, i.e., the weight assigned to the two bridges (blue) is higher than that assigned to the edges in the cliques (red).
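To make the degree-only nature of the approximation concrete, the sketch below evaluates the right-hand side of Eq. 12 from vertex degrees alone. It uses only the Python standard library; the function name `approx_holevo_centrality`, the choice of taking \(|V|\) as the number of non-isolated vertices, and the toy graph (two triangles joined by a bridge, not the barbell graph of Fig. 1) are assumptions made for illustration.

```python
import math
from collections import Counter

def approx_holevo_centrality(edges):
    """Quadratic approximation of Eq. 12, computed from vertex degrees only."""
    deg = Counter()
    for u, w in edges:
        deg[u] += 1
        deg[w] += 1
    n, m = len(deg), len(edges)          # assumes no isolated vertices
    c = n * math.log(n) / (n - 1)
    return {(u, w): -c * (deg[u] + deg[w]) / (2 * m * m) for u, w in edges}

# Two triangles joined by the bridge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
approx = approx_holevo_centrality(edges)
lowest = min(approx, key=approx.get)
```

In this toy graph the bridge has the largest degree sum and therefore the lowest approximate value, regardless of the fact that its removal disconnects the graph. Any two edges with the same \(d(u)+d(w)\) are tied, which is exactly the information loss discussed above.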

4 Experimental Evaluation

In the previous sections we have derived an expression for the Holevo edge centrality, both exact and approximated. Here, we first evaluate this measure on a number of standard networks, and we compare it against other well-known edge centralities. We also analyse the behaviour of the proposed centrality measure when graphs undergo structural changes.

4.1 Experimental Setup

We perform our experiments on two well-known real-world networks, the Florentine families graph and the Karate club network. We then consider the following edge centrality measures:

Degree Centrality: The centrality of an edge e is computed as the degree of the corresponding vertex in the line graph. The idea underpinning the vertex degree centrality is that the importance of a node is proportional to the number of connections it has to other nodes. This is the simplest edge centrality measure, and also the one with the lowest computational complexity.

Betweenness Centrality: The centrality of an edge e is the sum of the fraction of all-pairs shortest paths that pass through e, i.e., \(EBC(e) = \sum _{u,v \in V} \frac{\sigma (u,v|e)}{\sigma (u,v)}\) where V is the set of nodes, \(\sigma (u,v)\) and \(\sigma (u,v|e)\) denote the number of shortest paths between u and v and the number of shortest paths between u and v that pass through e, respectively [3]. An edge with a high betweenness centrality has a large influence on the transfer of information through the network and thus it can be seen as an important bridge-like connector between two parts of a network. Note that the implementation we use does not rely on the line graph, but measures the centrality of an edge directly on the original graph.

Flow Centrality: This centrality measure is also known as random-walk betweenness centrality [10]. While the betweenness centrality measures the importance of an edge e in terms of shortest paths between pairs of nodes that pass through e, the flow centrality is proportional to the expected number of times a random walk passes through the edge e when going from u to v. Similarly to the betweenness centrality, here we measure the flow centrality directly on the original graph.
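As an illustration of how the edge betweenness defined above can be computed directly on the original graph, the following sketch uses a Brandes-style accumulation over breadth-first searches. It is not the implementation used in the experiments, which the text does not specify; the function name `edge_betweenness`, the restriction to unweighted undirected graphs, and the convention of counting each unordered pair \(\{u,v\}\) once without normalisation are assumptions made here.

```python
from collections import deque, defaultdict

def edge_betweenness(edges):
    """Unnormalised edge betweenness of an undirected, unweighted graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    ebc = {tuple(sorted(e)): 0.0 for e in edges}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = defaultdict(float)
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes towards s.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                ebc[tuple(sorted((v, w)))] += share
                delta[v] += share
    # Each unordered pair was visited once from each of its endpoints.
    return {e: val / 2.0 for e, val in ebc.items()}

bc_path = edge_betweenness([(0, 1), (1, 2), (2, 3)])
bc_cycle = edge_betweenness([(0, 1), (1, 2), (2, 3), (0, 3)])
```

On the four-vertex path the middle edge lies on four of the six shortest paths and the end edges on three each, while on the four-cycle every edge obtains the same value by symmetry.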

4.2 Edge Centrality in Real-World Networks

In order to compare the Holevo edge centrality with the measures described in the previous subsection, we compute, for each network, the correlation between the Holevo quantity and the alternative measures. Figure 2 shows the value of these centralities on the Florentine families graph and the Karate club network. In these plots, the thickness of an edge is proportional to the magnitude of the centrality index. Figure 3, on the other hand, shows the correlation matrix between the different centralities. Here DC, BC, FC and HC denote the degree, betweenness, flow and Holevo centrality, respectively.

Fig. 2. Edge centralities on the Florentine families network (a–d) and the Karate club network (e–h). A thicker edge indicates a higher value of the centrality.

Fig. 3. Correlation matrices for the centrality measures on the Florentine families network and the Karate club network. DC, BC, FC, and HC denote the degree, betweenness, flow and Holevo centralities, respectively.

Fig. 4. Toy example showing the difference in the structural information captured by the degree and Holevo centralities. (Color figure online)

The Holevo centrality is always strongly negatively correlated with the degree centrality. This is in accordance with the properties discussed in Sect. 3. However, there are some significant differences. In general, the Holevo centrality is higher on edges that connect low degree nodes. In this sense, it can be seen as a measure of peripherality, rather than centrality. However, when two edges have the same degree centrality, edges that would disconnect the network or break structural symmetries are assigned a higher weight, as Fig. 1 shows. Similarly, in Fig. 4(a) the three edges highlighted in blue have the same degree centrality, but the same edges in Fig. 4(b) have different Holevo centralities. In fact, the removal of the red edge does not result in significant structural changes, while the removal of one of the blue edges increases the length of the tail.

Fig. 5. Perturbation process: on the left, adjacency matrix and plot of the starting graph; in the middle, the edited graph; on the right, the differences between initial and modified graph are highlighted. (Color figure online)

4.3 Robustness Analysis

We then investigate the behaviour of the Holevo edge centrality when the graph undergoes structural perturbations. To this end, given an initial graph, we gradually add or delete edges according to an increasing probability p. Figure 5 shows an instance of the noise addition process. Starting from a randomly generated graph, we compute the Holevo edge centrality for all its edges. Then, we perturb the graph structure with a given probability p and recompute the Holevo edge centrality for all the graph edges. We compute the correlation between the Holevo centrality of the edges of the original graph and that of its noisy counterpart. More specifically, we measure the correlation between the centralities of the edges that belong to the intersection of their edge sets. In other words, we analyse how the centrality changes during the perturbation process, with respect to the starting state.

Since we are interested in the variation of the Holevo centrality as the graph structure changes, we use three different random graph models to generate the initial graph: (1) the Erdös-Rényi model, (2) the Watts-Strogatz model and (3) the Preferential Attachment model. For each model, we generate a starting graph with the same number of nodes n and we create 100 noisy instances as p varies from 0.01 to 0.3. We perform the same experiment for both the Holevo centrality and the betweenness centrality.
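The perturbation and comparison steps can be sketched as follows. The text does not fix the details of the noise model or of the correlation coefficient, so this sketch assumes that every vertex pair is toggled (an edge is deleted if present, added if absent) independently with probability p, and that the Pearson coefficient is used. The names `perturb`, `pearson`, `centrality_correlation` and `line_graph_degree` are hypothetical, and the line graph degree serves only as a stand-in for whichever edge centrality is being studied.

```python
import random
from itertools import combinations

def perturb(n, edges, p, rng):
    """Toggle each vertex pair independently with probability p (assumed model)."""
    noisy = {tuple(sorted(e)) for e in edges}
    for pair in combinations(range(n), 2):
        if rng.random() < p:
            noisy ^= {pair}              # delete the edge if present, else add it
    return noisy

def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) for constant input."""
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def centrality_correlation(centrality, n, edges, p, rng):
    """Correlation of a centrality over the edges shared by G and its noisy copy."""
    original = {tuple(sorted(e)) for e in edges}
    noisy = perturb(n, edges, p, rng)
    shared = sorted(original & noisy)
    c0, c1 = centrality(sorted(original)), centrality(sorted(noisy))
    return pearson([c0[e] for e in shared], [c1[e] for e in shared])

def line_graph_degree(edges):
    """Stand-in centrality: degree of each edge in the line graph."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return {(u, v): deg[u] + deg[v] - 2 for u, v in edges}

rng = random.Random(0)
edges = [(0, 1), (0, 2), (0, 3), (1, 4)]
unchanged = perturb(5, edges, 0.0, rng)
flipped = perturb(5, edges, 1.0, rng)
corr_no_noise = centrality_correlation(line_graph_degree, 5, edges, 0.0, rng)
```

Any function mapping an edge list to a dictionary keyed by sorted vertex pairs can be passed as `centrality`, so the same loop could be repeated over increasing values of p and averaged over several noisy instances.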

Figure 6 shows the average correlation as we perturb the graph structure, for both the Holevo and betweenness centrality. As expected, in both cases the correlation decreases as the similarity between the original graph and the edited one decreases. However, while the correlation for both centrality measures decreases rapidly in the case of Erdös-Rényi graphs, on scale-free graphs the correlation for our centrality measure decreases linearly with the value of p, while that for the betweenness centrality drops significantly more quickly. On the other hand, we observe the opposite behaviour on small-world graphs. This can be explained by noting that in small-world graphs there exist multiple alternative paths between every pair of nodes, and thus the betweenness centrality is less affected by structural modifications. In scale-free graphs, by contrast, most shortest paths pass through a hub, and thus adding a random edge can create shortcuts that greatly affect the value of the betweenness centrality. The Holevo centrality, however, assigns large weights to long tails and leaves, which are less affected by the structural noise.

Fig. 6. Average correlation between the centrality of the edges of the original graph and those of increasingly noisy versions of it. The different columns refer to different starting graphs: (a) Erdös-Rényi, (b) Watts-Strogatz and (c) Preferential Attachment.

5 Conclusion

In this paper we have introduced a novel edge centrality measure based on the quantum information theoretical concept of Holevo quantity. We measured the importance of an edge in terms of the difference in Von Neumann entropy between the original graph and the graph where that edge has been removed. We showed that by taking a quadratic approximation of the Von Neumann entropy we obtain an approximated value of the Holevo centrality that is proportional to the negative degree centrality. We performed a series of experiments on both real-world and synthetic networks and we compared the proposed centrality measure to widely used alternatives. Future work will investigate higher order approximations of this centrality measure as well as the possibility of defining network growth models based on the Holevo quantity.