POTION : Optimizing Graph Structure for Targeted Diffusion
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
POTION : Optimizing Graph Structure for Targeted Diffusion ∗ †1 Sixie Yu , Leo Torres‡2 , Scott Alfeld§3 , Tina Eliassi-Rad ¶2 , and Yevgeniy Vorobeychik ‖1 1 Washington University in St. Louis 2 Northeastern University 3 Amherst College arXiv:2008.05589v4 [cs.SI] 17 Feb 2021 Abstract at particular subsets of critical devices [13], which should The problem of diffusion control on networks has been be accounted for when modeling attacking behavior. extensively studied, with applications ranging from Congestion cascades of ground traffic or flight networks marketing to controlling infectious disease. However, are other examples, where the goal of resilience may be in many applications, such as cybersecurity, an attacker to ensure that cascades concentrate on a subset of high- may want to attack a targeted subgraph of a network, capacity nodes that can handle them, limiting the impact while limiting the impact on the rest of the network on the rest of the network [11, 10]. In another domain, in order to remain undetected. We present a model medical treatments for certain diseases such as cancer POTION in which the principal aim is to optimize may leverage a molecular signaling network, with the graph structure to achieve such targeted attacks. We goal of targeting just the pathogenic portion of it, while propose an algorithm POTION-ALG for solving the limiting the deleterious effects on the rest [44]. model at scale, using a gradient-based approach that We study the problem of targeted diffusion in which leverages Rayleigh quotients and pseudospectrum theory. an attacker1 can modify the graph structure G = In addition, we present a condition for certifying that a (V, E) to achieve two goals: 1) maximize the diffusion targeted subgraph is immune to such attacks. Finally, we spread to a target subgraph GS , and 2) minimize demonstrate the effectiveness of our approach through the impact on the remaining graph G \ GS . We experiments on real and synthetic networks. capture the first goal by maximizing a utility function that incorporates spectral information of the adjacency 1 Introduction matrix of G, specifically its largest (in magnitude) Many diverse phenomena that propagate through a eigenvalue, eigenvector centrality, and the normalized network, such as epidemic spread, cascading failures, cut of the target subgraph. The second goal is achieved and chemical reactions, can be modeled by network by limiting the modifications made outside of the target diffusion models [4, 6, 24, 46, 43]. The problem of subgraph. We present a scalable algorithmic framework controlling diffusion has, as a result, received much for solving this problem. Our framework leverages a attention in the literature, with primary focus on two combination of gradient ascent with the use of Rayleigh mechanisms for control: the choice of initial nodes to quotients and pseudospectrum theory, which yields start the spread [15, 8, 48], and the modification of differentiable approximations of our objective and allows network structure [17, 38, 47, 39]. To date, most work us to avoid projection steps that would otherwise be on diffusion control (either promotion or inhibition) has costly and imprecise. Moreover, we derive a condition considered diffusion over the entire network. However, that enables us to certify if a network is robust against in many problems, the focus is instead on diffusion that a broad class of targeted diffusion attacks. Finally, we is targeted to a particular subgraph of the network. For demonstrate the effectiveness of our approach through example, in cybersecurity, diffusion commonly represents extensive experiments. malware spread, but malware attacks are often targeted In summary, our contributions are: ∗ The first two authors contributed equally to the paper. 1. We propose POTION (oPtimizing graph structures † sixie.yu@wustl.edu. fOr Targeted diffusION): a model for targeted ‡ leo@leotrs.com diffusion attack by optimizing graph structures. § salfeld@amherst.edu ¶ t.eliassirad@northeastern.edu ‖ yvorobeychik@wustl.edu 1 The attacker is the agent who initiates diffusion. Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
2. We present POTION-ALG : an efficient algorithm to edge perturbation. All of these prior efforts focus on the optimize POTION by leveraging Rayleigh quotient impact either at the network level or at the node-level and pseudospectrum theory. properties, while our focus is on the impact of diffusion dynamics on a targeted subgraph of the network. 3. We describe a condition for certifying that a targeted subgraph is immune to such attacks. 3 POTION : Proposed Model 4. We demonstrate the effectiveness and efficiency of We present a model for targeted diffusion through graph POTION and POTION-ALG on synthetic and real- structure optimization. We refer to the agent who world networks; and against baseline and competing initiates diffusion as the attacker. We use cybersecurity methods.2 as a running example. Here the attacker initiates the diffusion (e.g., the spread of malware) on a network of 2 Related Work computers. We define the impact of the diffusion as Various dynamical processes can be modeled as diffusion the number of infected nodes (e.g., compromised with dynamics on networks, including the spread of infectious malware). The attacker has two objectives: 1) she wishes diseases [4, 6], cascading failures in infrastructure to maximize the impact of the diffusion on a targeted set networks [24, 46], and information spread (e.g., rumors, of nodes (e.g., computing nodes with access to critical fake news) on social networks [20, 21]. One line of assets), and 2) to limit the impact on non-targeted nodes research assesses the impact of cascading failures. Yang to ensure stealth [13]. et al. [46] simulated cascading failures to quantify Let G = (V, E) be a connected, weighted or unweighted, the vulnerability of the power grid in North America. undirected graph with no self-loops. Let n = |V| be the Fleurquin et al. [11] studied the impact of flight delays number of nodes in G and A be its adjacency matrix. as a cascading failure diffusing through the network. Throughout this paper, the eigenvalues of A are ranked Motter and Lai [24] investigated the cascading failures in descending order λ1 (A) ≥ · · · ≥ λn (A). Suppose the on a network due to the malfunction of a single node. attacker targets a subgraph GS where S ⊆ V is the node Another line of research concerns diffusion control, for set of GS . Let S 0 = V \ S, and its induced subgraph example, selecting a set of nodes such that if the GS 0 . Throughout the paper we assume GS is connected, diffusion originated from them, it reaches as many and denote its adjacency matrix by AS . To achieve her nodes as possible [15, 8, 48]; or modifying network objectives, the attacker modifies the structure of G. The structures to increase or limit some diffusion [38, 31]. modified graph and targeted subgraph are represented by However, these lines of research do not differentiate G̃ and G̃S , respectively. Formally, the attacker’s action between targeted and non-targeted nodes. Ho et al. [14] is to add a perturbation ∆ ∈ Rn×n to A, which results studied targeted diffusion controlled by changing nodal in the perturbed adjacency matrix à = A + ∆. The status. We focus on the problem where an attacker adjacency matrix of G̃S is denoted by ÃS . manipulates underlying network structures in order to achieve targeted diffusion. 3.1 Diffusion Dynamics The status of a node is Another relevant research thread is network design, modeled by the well-known SIS (Susceptible-Infected- which is the problem of modifying network structure Susceptible) diffusion dynamics, where it alternates to induce certain desirable outcomes. Some prior work between “infected” and “susceptible”. 3 Due to the [38, 47, 36] considered the containment of spreading malware spread by infected neighbors, a susceptible node dynamics by adding or removing nodes or edges from becomes infected with probability β. An infected node the network, while others [42, 32, 17, 7, 39] considered becomes susceptible again (e.g., malware is removed) limiting the spread of infectious disease by minimizing with probability δ. Following Chakrabarti et al.[6], this the largest eigenvalue of the network. Kempe et al. [16] process is modeled by a nonlinear dynamical system. studied modifying network structure to induce certain Let πi be the probability of node i becoming infected outcomes from a game-theoretic perspective, but they (e.g., compromised with malware) in the steady state did not consider diffusion dynamics. Others have studied of this dynamical system, with π the vector of these the problem of manipulating node centrality measures probabilities. A key result in [6] is that when λ1 (A) < (e.g., eigenvector or PageRank centrality) [2, 1] or node δ/β the system converges to the steady state π = 0, similarity measures (e.g., Katz similarity) [49] through which implies that the diffusion process quickly dies out. 3 Due to brevity, a discussion on generalization of our approach 2 The code to replicate the experimental results is at https: to other diffusion dynamics is at https://arxiv.org/abs/2008. //github.com/marsplus/POTION. 05589. Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
However, when λ1 (A) ≥ δ/β the system converges to contributed by the infected neighbors of node i (which another steady state π =6 0. We leverage this connection increases Pit ), while the second term is the force due between graph structure, dynamical model of epidemic to i’s self recovery (which decreases Pit ). Rewriting in spread, and the epidemic threshold, in constructing our matrix notation yields: threat model, as discussed next. dP t = β à − δI P t , (3.3) dt 3.2 Threat Model Maximizing the Impact on GS : To maximize the impact of diffusion on GS , the which gives a linear approximation to the non-linear attacker has two goals: 1) ensure that epidemics starting dynamical system proposed in [6]. The steady state in GS spread rather than die out, and 2) ensure that π must satisfy β à − δI π = 0, which is equivalent epidemics starting outside GS are likely to reach it. We to Ãπ = (δ/β)π. Suppose λ1 (Ã) = δ/β, and π is the capture the first goal by maximizing the largest (in corresponding eigenvector. Let ṽ1 be the P unit eigenvector modulus) eigenvalue of GS , λ1 (ÃS ), which corresponds associated with λ1 (Ã). Let σ(S) = j∈S ṽ1 [j] be the to the epidemic threshold of the targeted subgraph.4 The eigenvector centrality of S. Noting that π may differ second goal is captured by maximizing the normalized from ṽ1 by up to a multiplicative constant c, the impact cut of GS , φ(S), where S is the set of nodes in GS and on GS 0 can be approximated as: S 0 are the nodes in the remaining graph. The normalized X X cut is formally defined as follows: (3.4) I(GS 0 ) = πj ≈ c ṽ1 [j] = c 1 − σ(S) , j∈S 0 j∈S 0 1 1 (3.1) φ(S) = cut(S, S 0 ) + , vol(S) vol(S 0 ) where the last equality is because S and S 0 are disjoint 0 where cut(S, S ) is the sum of the weights on the edges and ṽ1 is an unit vector. Thus, minimizing the impact across S and S 0 (unit weights for unweighted graphs), on GS 0 is approximately equivalent to maximizing the and vol(S) (resp. vol(S 0 )) is the sum of degrees of the eigenvector centrality of S. nodes in S (resp. S 0 ). The formal rationale for using Recall that to have an epidemic spread, one needs the normalized cut is based on Meila et al. [22], which λ1 (Ã) ≥ δ/β. Here, we assumed λ1 (Ã) = δ/β. In showed that increasing φ(S) increases the probability Section 6, we demonstrate that our analysis yields an that a random walker transitions from S 0 to S, if we approach that is effective even when this assumption assume that GS is smaller than GS 0 . fails to hold (i.e., when λ1 (Ã) > δ/β). Limiting the Impact on GS 0 : Another important Now we focus on limiting the impact on the spectrum objective of the attacker is to limit the impact on GS 0 , of A. 5 Let > 0 be the attacker’s budget. Formally, the non-targeted part of the graph. We capture this goal this notion is captured through the following constraints: in two different ways. First, by limiting the likelihood (3.5) |λi (Ã) − λi (A)| ≤ , i = 1, . . . , n. of the epidemic spreading to GS 0P , which we define as minimizing the impact I(GS 0 ) = i∈S 0 πi . Second, by In summary, the principal aims to (i ) maximize the limiting the impact on the spectrum of A. impact on GS through maximizing λ1 (ÃS ) while (ii ) We now demonstrate that minimizing I(GS 0 ) is approx- limiting the impact on GS 0 by maximizing the eigenvec- imately equivalent to minimizing the eigenvector cen- tor centrality σ(S), and satisfying Eq. (3.5). Formally, trality of S 0 . Let P t be the global configuration of the the principal aims to solve the following optimization graph at time step t, where Pit is the probability that problem: node i is infected (e.g., compromised with malware). Fol- (3.6) lowing Mieghem et al. [23] , ignoring higher-order terms max α1 λ1 (ÃS ) + α2 σ(S) + α3 φ(S) à and taking the time step to be infinitesimally small, the ( ) t |λi (Ã) − λi (A)| ≤ , i = 1, . . . , n, dynamics of Pi is modeled as the following: s.t. à ∈ P = à , dPit X à = Ã> , Ãii = 0, ∀i = 1, . . . , n (3.2) = β Ãij Pjt − δPit . dt where the relative importance of the terms is balanced by j∈V the nonnegative constants α1 , α2 , α3 , and the restrictions à = Ã> and Ãii = 0, ∀i = 1, . . . , n ensure that à is a Here, we can think of the two terms on the right side valid adjacency matrix. as two competing forces. The first term is the force 5 In cybersecurity, there are natural interpretations of an 4 If GS is not connected, we may replace λ1 (ÃS ) by the largest attack’s stealth. For further details, see the extended version eigenvalue of the largest connected component of GS . at https://arxiv.org/abs/2008.05589. Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
4 POTION-ALG : Proposed Algorithm and does not provide us with the necessary gradient To solve the optimization problem in Eq. (3.6), a natural information. Instead, we use the power method [12] to approach would be to use a form of projected gradient compute λ1 (ÃS ). Let vS be the eigenvector associated ascent. There are, however, two major hurdles to this with the largest eigenvalue λ1 (ÃS ). Using Rayleigh basic approach: 1) the objective function involves terms quotients [40], we can compute λ1 (ÃS ) as follows: that do not have an explicit functional representation in the decision variables, and 2) the projection step is quite (4.8a) vS = arg max x> ÃS x expensive, as it involves projecting into a spectral norm kxk2 =1 ball, which entails an expensive SVD operation [18]. We (4.8b) λ1 (ÃS ) = vS> ÃS vS . address these challenges in Algorithm 1, which is our gradient-based solution to the attacker’s optimization problem as described in Eq. (3.6). Thus, when vS is known, the computation of λ1 (ÃS ) Algorithm 1 POTION-ALG reduces to matrix multiplications. In addition, ÃS 1: Input: A, , {ηi }i=1 . {ηi }i=1 is a schedule of step sizes is usually sparse, so we can leverage sparse matrix 2: Initialize: i = 1, Ã1 = A, B1 = 0 . Bi : the amount of multiplication to speed up the computation. budget used just before step i 3: while True do The remaining challenge is that vS is an optimal solution 4: Set ∆i to the gradient of α1 λ1 (ÃS )+α2 σ(S)+α3 φ(S) of an optimization problem, and we need an explicit w.r.t. to Ãi derivative of it. Fortunately, our problem has a special 5: Set the diagonal entries of ∆i to zeros structure that we exploit to obtain an approximation 6: if k∆i k = 0 then . a local optimum is found of the derivative of vS . From our experiments we find 7: return Ãi that G̃S is nearly always connected. This means that 8: end if the largest eigenvalue of ÃS is simple. In addition, due 9: if Bi + ||ηi ∆i ||2 ≤ then . one-step look ahead to the Perron–Frobenius theorem, the absolute value 10: Ãi+1 = Ãi +ηi ∆i , Bi+1 = Bi +kηi ∆i k2 , i = i+1 of the largest eigenvalue is strictly greater than the 11: else 12: return Ãi absolute values of others, i.e., |λ1 (ÃS )| > |λk (ÃS )| for 13: end if all k 6= 1. Under these conditions, we can use the 14: end while power method to estimate vS by repeating the formula: (t+1) (t) (t) ṽS = ÃS ṽS /kÃS ṽS k2 . The `2 -norm distance A key step of Algorithm 1 is line 4, where we compute the between ṽS and vS decreases in a rate O(ρk ) [12], k gradient of the attacker’s utility function with respect where ρ < 1. In our experiments we found k = 50 to Ã. This gradient involves terms that do not have an is enough to give a high-quality estimation for a graph explicit functional form in terms of the decision variable, with 986 nodes. Intuitively, we are using a sequence and we deal with each of these in turn. of differentiable operations to approximate the argmax operation. Therefore the computation of ∇Ã λ1 (ÃS ) can First, consider the gradient of the normalized cut φ(S) be handled by PyTorch. w.r.t. Ã. Let xS be the characteristic vector of S, that is 1 iff i ∈ S. Let D̃ be the diagonal degree matrix We use the same machinery to compute ∇Ã σ(S). First, xS [i] = P D̃ii = j Ãij , and let L̃ = Ã − D̃ be the Laplacian we write σ(S) in matrix notation: matrix. Using D̃ and L̃ to express vol(S) and cut(S, S 0 ), (4.9) σ(S) = v > xS , respectively, we have: where v is the unit eigenvector associated with λ1 (Ã). ! 1 1 Then we apply the power method to compute v. Finally, (4.7) φ(S) = x>S L̃xS > + > . σ(S) is just a linear function of v. All of these operations xS D̃xS xS 0 D̃xS 0 are differentiable, and the computation of ∇Ã σ(S) is handled by PyTorch. Clearly, Eq. (4.7) is a differentiable function of Ã. Com- puting its gradient ∇Ã φ(S) can then be handled by au- We next address the challenge imposed by the constraints tomatic differentiation tools such as PyTorch [28]. (3.5), which can result in a computationally challenging projection step which can also significantly harm solution Next, we compute the gradient of λ1 (ÃS ) w.r.t. Ã. quality. We address this challenge as follows. Given a A standard way to compute λ1 (ÃS ) is by using SVD. real symmetric matrix X, let kXk2 denote its spectral However, this is both prohibitively expensive (O(n3 )), norm. To satisfy Eq. (3.5), we use the following result Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
from pseudospectrum theory (see [41], Theorem 2.2): After running Algorithm 1, we obtain a perturbed matrix à with fractional entries. For unweighted graphs, a (4.10) λi (Ã)−λi (A) ≤ , i = 1, . . . , n ⇐⇒ kÃ−Ak2 ≤ rounding heuristic is needed to convert à to a valid adjacency matrix. Let D = {(i, j)|Ãi,j 6= Ai,j } be the set of candidate edges that will be added or deleted Since ∆ = Ã−A is real and symmetric, we have k∆k2 = from G. For each edge (i, j) ∈ D define the score max{|λ1 (∆)|, |λn (∆)|} and −λn (∆) = λ1 (−∆), which s(i,j) = |Ãi,j − Ai,j |. Intuitively, s(i,j) indicates the leads to: impact that adding or deleting the edge has on the (4.11) principal’s utility. Next, we iteratively modify G, by à satisfies Eq. (3.5) ⇐⇒ max{|λ1 (∆)|, |λ1 (−∆)|} ≤ . adding or deleting edges in D, starting with the one with the largest s(i,j) . The modification process stops when the budget is exhausted, which results in the desired binary adjacency matrix. For weighted graphs, This equivalence allows the attacker to check let C = maxi,j Aij and normalize each entry by C, that is whether she is within budget simply by evalu- Aij /C. We run Algorithm 1 on the normalized adjacency ating max{|λ1 (∆)|, |λ1 (−∆)|}, i.e., computing the largest eigenvalue of a real symmetric matrix, which matrix, which results in Ã. The desired adjacency matrix can be computed efficiently using, e.g., the power is obtained by multiplying each Ãij by C, C Ãij . If method [12]. integer weights are desired (e.g., the number of packages transmitted between two computers), a final rounding Our algorithm leverages this connection as follows. step is applied. Our experimental results show that the Line 9 in Algorithm 1 is a one step look-ahead, which rounding heuristic is effective in practice. ensures that the perturbation ∆i is only added to Ãi when there is enough budget. Recall from Section 3.2 5 Certified Robustness that k∆i k2 = max{|λ1 (∆i )|, |λ1 (−∆i )|}. Thus this This section addresses the following question: what step requires us to compute λ1 (∆i ) and λ1 (−∆i ), using are the limits on the attacker’s ability to successfully again the power method. Line 10 tracks the amount of accomplish her attack? More precisely, we now seek to budget used so far. We now show that the output of identify necessary conditions on the attack budget so Algorithm 1 always returns a feasible solution. Suppose the attack succeeds; conversely, we can view a given Algorithm 1 terminates after k > 1 iterations. This graph to be certified to be robust to attacks that use a means Bk + kηk ∆k k2 > and Bk ≤ . In other words smaller budget than the one required. Pk−1 Bk = i=1 kηi ∆i k2 ≤ . Note that the total amount of perturbation added to A is ∆ = i=1 ηi ∆i . The Let TargetDiff(S, G, ) be an instance of the targeted Pk−1 triangle inequality implies k∆k2 ≤ . diffusion problem with target subset S, underlying graph G and budget . The attacker is successful on an instance For each iteration of Algorithm 1, the most computa- TargetDiff(S, G, ) if she is able to modify G into G̃ tionally expensive components are the power method within budget such that I(G̃S ) > I(GS ). We now and matrix multiplication. Let m be the number of derive a necessary condition for successful attacks, in nonzeros in Ãi ; if the graph is unweighted then m is the form of a lower bound on . the number of edges at this iteration. By leveraging the sparseness exhibited in Ãi , the power method runs in To derive the necessary condition on , we use our O(m) and the matrix multiplications cost O(mn). Thus, experimental observation that in successful attacks the the time complexity of each iteration is O(mn), which degrees of nodes in the targeted subgraph GS always significantly improves the O(n3 ) time complexity of SVD increase. This is intuitive: a denser subgraph GS will that would otherwise be needed. tend to increase the propensity of the diffusion (e.g., of malware) to spread within it, which is one of our explicit Recall that our model for targeted diffusion is applicable objectives. Let di (resp. d˜i ) be the degree of node i to both weighted and unweighted graphs. For weighted before (resp. after) graph modification. We assume if graphs, the attacker modifies the weights on existing an attack is successful, the degrees of nodes in GS are edges. For unweighted graphs, the attacker adds new increased, i.e., d˜i ≥ di for i ∈ S. edges or deletes existing edges from the graph. The main difference between the two settings is that the Now, observe that computing the exact value of I(GS ) latter needs a rounding heuristic to convert a matrix is intractable, since the exact computation of πi is with fractional entries to a binary adjacency matrix. We prohibitive (see, e.g., [23], Section IV.B). Mieghem discuss this heuristic below. et al. [23] proposed a simple yet effective estimator Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
for πi to be 1 − δ/(βdi ). The estimator works in 6 Experiments and Discussion the regime δ/β ≤ dmin , where dmin is the minimum This section presents experimental results on three real- degree of G. Consequently, an estimator for I(GS ) is world datasets: an email network, an airport network, ˆ S) = P I(G i∈S 1 − δ/(βdi ). We focus on the setting and a brain network. 7 where the estimation error is bounded by a small number, ˆ S ) − I(GS )| ≤ τ . Note that τ can be estimated i.e., |I(G For each network we run POTION with hyper- from historical diffusion data. The formal statement of parameters α1 = α2 = α3 = 1/3, which encodes that the necessary condition is in Theorem 5.1. 6 the attacker’s objectives are equally important. To study how the attacker’s effectiveness changes with re- Theorem 5.1. Given an instance TargetDiff(S, G, ), spect to her budget, we set = γλ1 (A) and vary γ from ˆ S) = P I(GS ) is estimated by I(G i∈S 1 − δ/(βdi ). 10% to 50%. A single initially infected node is selected Suppose we have an upper bound |I(G ˆ S ) − I(GS )| ≤ τ , uniformly at random. the degrees of nodes in S are increased, i.e., d˜i ≥ Recall we use G and G̃ to denote the original and the di for i ∈ S, and δ/β ≤ dmin . In order to have modified graphs, respectively. We simulate the spreading I(G̃S ) − I(GS ) > 2τ , the budget must satisfy: dynamics 2000 times on both G and G̃. For unweighted 1/2 graphs the recovery rate δ and transmission rate β are 2 r ! 2 P P |S| d i∈S i i∈S d i set to 0.24, 0.06, resp.; for weighted graphs we set (5.12) ≥ − . n |S| |S|2 δ = 0.24 and β = 0.2. The spreading dynamics converges exponentially fast to the steady state: empirically, we found 30 time steps to be enough to reach the steady The quantity inside the square root is always nonnegative state in most cases. When the simulation finishes, we due to Jensen’s inequality. The lower bound involves extract the number of nodes that are “infected”. We only structural properties of the graph (node degrees and use Ioriginal and Imodified to represent the fractions of the size of S) and thus can be easily computed given an infected nodes on G and G̃, resp. arbitrary graph. As mentioned above, we can view this lower bound as a robustness certificate, or guarantee for We use two other algorithms as baselines for comparison, the given graph. It guarantees, in particular, that when which we call deg and gel. The two algorithms work the budget is below the lower bound, the total probability by alternating between modifying GS and modifying of “infection” (e.g., malware infection) in GS cannot be GS 0 until the budget is spent. When modifying GS , increased by more than 2τ . In the special case of perfect deg chooses the edge (i, j) with the maximum value of estimation (τ = 0), it implies impossibility of increasing di + dj . Algorithm gel is based on [37], and chooses the the susceptibility of GS to targeted diffusion. edge (i, j) with maximum eigenscore, defined as u[i]v[j], where u, v are the left and right principal eigenvectors of The proof of Theorem 5.1 does not depend on the specific A, respectively. This edge is chosen from among those objective function proposed in this paper. Consequently, edges that are absent (if the graph is unweighted) or the certificate is not specific to our particular objective present (if the graph is weighted). When modifying GS 0 , function. Further, the lower bound is independent of the these baselines choose an existing edge of to remove (if values of δ and β, as long as δ/β ≤ dmin . unweighted) or decrease its weight (if weighted) with the highest value of di + dj or u[i]v[j], respectively. We briefly discuss the settings where the robustness guarantee is most applicable. First, the estimation Unweighted Graphs: We consider (the largest for the infected ratio on GS is accurate, i.e., |I(G ˆ S) − connected component of) an email network [19] that has I(GS )| ≤ τ and τ is small. According to Mieghem et al. 986 nodes. An edge (i, j) indicates that there were email [23], this usually happens on graphs with small degree exchanges between nodes i and j. This data set contains variation. Another setting is where the degrees of nodes ground-truth labels to indicate which community a node in GS increase as a result of the attack, which is both belongs to. We pick a community with 15 nodes as natural and empirically founded, as we mentioned earlier. S; the results for communities with other sizes are We provide experimental results on synthetic networks similar. to verify the robustness guarantee in Section 6. The overall effectiveness of our approach is shown in Figure 2, top. The difference Imodified − Ioriginal of the 6 Due to brevity the proof is in Appendix C at https://arxiv. 7 Due to brevity, additional results on real and synthetic org/abs/2008.05589. networks are at https://arxiv.org/abs/2008.05589. Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
impact on the modified and original graphs is shown for 15% BA Imodified Ioriginal GS (red line) and GS 0 (purple line), respectively. As γ 10% BTER gets larger, the impact on GS increases, while the impact WS on GS 0 is under control, which demonstrates that the 5% proposed approach is highly effective at both increasing 0% the impact of diffusion on the targeted subgraph, and at 0.0 0.2 0.4 the same time preventing the impact on the remaining graph. Figure 1: Certified robustness results. Dashed lines mark Weighted Graphs: We consider an airport network the lower bounds from Eq. (5.12). Solid lines represent and a brain network. The airport network [25] was infectious ratios within targeted subgraphs. collected from the website of Bureau of Transportation Statistics of the U.S., where the nodes represent all of Email Airport Brain the 1572 airports in the U.S. and the weights on edges % encode the number of passengers traveled between two 5.0 % 5% 0% 5% 0% 0% 0% 0% 0% 5.0 -0. 0. 0. 5. 10. 15. 0. 10. airports in 2010. We scaled the weights on the airport % Imodified Ioriginal % % % 5.0 0.0 0.0 5.0 0.0 network to [0, 1]. The targeted set S was chosen by first % sampling a node i uniformly at random, and then setting S to be i and all its neighbors. We report experimental 5% 0% 5% 0% -0. 0. 0. 0. results for an S with 60 nodes. The brain network [9] 5% 0% 5% -0. 0. 0. consists of 638 nodes where each node corresponds to a region in human brain. An edge between nodes i 0.1 0.2 0.3 0.4 0.5 0.1 0.2 0.3 0.4 0.5 0.1 0.2 0.3 0.4 0.5 and j indicates that the two regions have co-activated on some tasks. The weight on the edge quantifies the GS GS 0 POTION deg gel strength of the co-activation estimated by the Jaccard index. The weights on edges lie in [0, 1]. The 638 regions are categorized into four areas: default mode, visual, Figure 2: POTION effectively achieves targeted diffusion fronto-parietal, and central. Each area is responsible for (top) in GS (red line) without affecting GS 0 (purple); higher some functionality of human. We select 100 nodes from is better. Comparison against deg and gel baselines in GS the central area as the targeted set S. The results for (middle; higher is better) and GS 0 (bottom; lower is better). the airport (resp. brain) network are at the center (resp. right) column of Figure 2. The overall trend is similar to that of the email network. lower bounds on the budget computed using Eq. (5.12). Note that when the budget is less than the lower bound, Comparison against Baselines: The comparisons the differences are close to zero, which means that the against the baselines are shown in Figure 2, middle and network is robust against targeted diffusion. bottom rows. The moddle row shows the infectious Running Time: The running time of Algorithm 1 on ratios within the targeted subgraphs. It is clear that our the three real-world networks is showed in Figure 3. Each algorithm is more effective at increasing the infectious point in the figure is the average running time over 10 ratios than the baselines. The bottom row shows the trials. Intuitively, as the budget γ increases the attacker infectious ratios within the non-targeted subgraphs. The needs to search a larger space, therefore the running magnitudes of the differences are negligible, although in time increases. The numbers of nodes and edges of the some cases our algorithm is significantly better than the three networks are in Table 1. baselines (e.g., on airport network when γ = 0.4). Airport Verify the Certified Robustness: We run experi- 150 Brain Email seconds ments on synthetic networks to verify the certified ro- 100 bustness; the synthetic networks include Barabási-Albert (BA) [5], Watts-Strogatz [45], and Block Two-level Erdős- 50 Rényi (BTER) networks [33]. We use the same exper- imental setup as described above. Fig. 1 shows the 0.1 0.2 0.3 0.4 0.5 difference of infectious ratios on the modified and origi- nal graphs (within targeted subgraphs), as a function of Figure 3: Running time on the real-world networks. the attacker’s budget . The vertical dashed lines are the Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
Email Airport Brain of infectious diseases and its applications. Charles Griffin & Company Ltd, 1975. #nodes 986 1572 638 #edges 16064 17214 18625 [5] Albert-László Barabási and Réka Albert. Emer- gence of scaling in random networks. Science, 286 Table 1: Statistics of the real-world networks. (5439):509–512, 1999. [6] Deepayan Chakrabarti, Yang Wang, Chenxi Wang, 7 Conclusion Jure Leskovec, and Christos Faloutsos. Epidemic Diffusion control on network has attracted much atten- thresholds in real networks. ACM Trans. Inf. Syst. tion, however, most studies focus on diffusion over the Secur., 10(4):1:1–1:26, 2008. entire network. We address the problem of targeted dif- [7] Chen Chen, Hanghang Tong, B. Aditya Prakash, fusion attack on networks. We present a combination Charalampos E. Tsourakakis, Tina Eliassi-Rad, of modeling and algorithmic advances to systematically Christos Faloutsos, and Duen Horng Chau. Node address this problem. On the modeling side, we present immunization on large graphs: Theory and algo- a novel model called POTION that optimizes graph rithms. TKDE, 28(1):113–126, 2016. structure to affect such targeted diffusion attacks, which preserves structural properties of the graph. On the [8] Wei Chen, Yajun Wang, and Siyu Yang. Efficient algorithmic side, we design an efficient algorithm named influence maximization in social networks. In KDD, POTION-ALG by leveraging Rayleigh quotients and pages 199–208. ACM, 2009. pseudospectrum theory, which is scalable to real-world graphs. We also derive a condition to certify whether [9] Nicolas A Crossley, Andrea Mechelli, Petra E Vértes, a network is robust against a broad class of targeted Toby T Winton-Brown, Ameera X Patel, Cedric E diffusion. Our experiments on both synthetic and real- Ginestet, Philip McGuire, and Edward T Bullmore. world networks show that the model is highly effective Cognitive relevance of the community structure of in implementing the targeted diffusion attack. the human brain functional coactivation network. PNAS, 110(28):11583–11588, 2013. Acknowledgement [10] Ernesto Estrada. ‘Hubs-repelling’ Laplacian and re- SY and YV were partially supported by the National lated diffusion on graphs/networks. Linear Algebra Science Foundation (grants IIS-1903207 and IIS-1910392) Appl., 2020. and Army Research Office (grants W911NF1810208 and W911NF1910241). LT and TER were supported in part [11] Pablo Fleurquin, José J Ramasco, and Victor M by the National Science Foundation (IIS-1741197) and by Eguiluz. Systemic delay propagation in the US the Combat Capabilities Development Command Army airport network. Scientific Reports, 3:1159, 2013. Research Laboratory (under Cooperative Agreement Number W911NF-13-2-0045). The authors would like to [12] Gene Golub and Charles Van Loan. Matrix com- thank the anonymous reviewers and Chloe Wohlgemuth putations. Johns Hopkins Studies in Mathematical for their helpful comments. Sciences, 1996. [13] Nika Haghtalab, Aron Laszka, Ariel D. Procaccia, References Yevgeniy Vorobeychik, and Xenofon Koutsoukos. Monitoring stealthy diffusions. Knowledge and [1] Victor Amelkin and Ambuj K. Singh. Fighting Information Systems, 2017. opinion control in social networks via link recom- mendation. In KDD, pages 677–685. ACM, 2019. [14] Christopher Ho, Mykel J. Kochenderfer, Vineet Mehta, and Rajmonda S. Caceres. Control of [2] Konstantin Avrachenkov and Nelly Litvak. The epidemics on graphs. In CDC, pages 4202–4207. effect of new links on google pagerank. Stochastic IEEE, 2015. Models, 22(2):319–331, 2006. [15] David Kempe, Jon M. Kleinberg, and Éva Tardos. [3] Lars Backstrom and Jure Leskovec. Supervised Maximizing the spread of influence through a social random walks: predicting and recommending links network. In KDD, pages 137–146. ACM, 2003. in social networks. In KDD, pages 635–644, 2011. [16] David Kempe, Sixie Yu, and Yevgeniy Vorobeychik. [4] Norman TJ Bailey et al. The mathematical theory Inducing equilibria in networked public goods games Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
through network structure modification. In AAMAS, [29] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. pages 611–619, 2020. Deepwalk: Online learning of social representations. In KDD, pages 701–710, 2014. [17] Long T. Le, Tina Eliassi-Rad, and Hanghang Tong. MET: A fast algorithm for minimizing propagation [30] B. Aditya Prakash, Deepayan Chakrabarti, Nicholas in large graphs with small eigen-gaps. In SDM, Valler, Michalis Faloutsos, and Christos Faloutsos. pages 694–702. SIAM, 2015. Threshold conditions for arbitrary cascade models on arbitrary networks. Knowl. Inf. Syst., 33(3): [18] Stamatios Lefkimmiatis, John Paul Ward, and 549–575, 2012. Michael Unser. Hessian schatten-norm regulariza- tion for linear inverse problems. IEEE Trans. Image [31] Victor M. Preciado, Michael Zargham, Chinwendu Process., 22(5):1873–1888, 2013. Enyioha, Ali Jadbabaie, and George J. Pappas. Optimal vaccine allocation to control epidemic [19] Jure Leskovec, Jon M. Kleinberg, and Christos outbreaks in arbitrary networks. In CDC, pages Faloutsos. Graph evolution: Densification and 7486–7491. IEEE, 2013. shrinking diameters. ACM Trans. Knowl. Discov. Data, 1(1):2, 2007. [32] Sudip Saha, Abhijin Adiga, B. Aditya Prakash, and Anil Kumar S. Vullikanti. Approximation [20] Jure Leskovec, Mary McGlohon, Christos Faloutsos, algorithms for reducing the spectral radius to Natalie S. Glance, and Matthew Hurst. Patterns of control epidemic spread. In SDM, pages 568–576. cascading behavior in large blog graphs. In SDM, SIAM, 2015. pages 551–556. SIAM, 2007. [33] Comandur Seshadhri, Tamara G Kolda, and Ali [21] Jure Leskovec, Lars Backstrom, and Jon M. Klein- Pinar. Community structure and scale-free collec- berg. Meme-tracking and the dynamics of the news tions of erdős-rényi graphs. Phys. Rev. E, 85(5): cycle. In KDD, pages 497–506. ACM, 2009. 056109, 2012. [22] Marina Meila and Jianbo Shi. Learning segmen- [34] Jimeng Sun, Huiming Qu, Deepayan Chakrabarti, tation by random walks. In NIPS, pages 873–879. and Christos Faloutsos. Neighborhood formation MIT Press, 2000. and anomaly detection in bipartite graphs. In ICDM. IEEE, 2005. [23] Piet Van Mieghem, Jasmina Omic, and Robert E. Kooij. Virus spread in networks. IEEE/ACM Trans. [35] Hanghang Tong, Christos Faloutsos, and Jia-Yu Netw., 17(1):1–14, 2009. Pan. Fast random walk with restart and its applications. In ICDM, pages 613–622. IEEE, 2006. [24] Adilson E Motter and Ying-Cheng Lai. Cascade- based attacks on complex networks. Phys. Rev. E, [36] Hanghang Tong, B. Aditya Prakash, Charalampos E. 66(6):065102, 2002. Tsourakakis, Tina Eliassi-Rad, Christos Faloutsos, and Duen Horng Chau. On the vulnerability of [25] Tore Opsahl. US airport network traffic data in large graphs. In ICDM, pages 1091–1096. IEEE 2010. https://bit.ly/3dlpN3a, 2010. Computer Society, 2010. [26] Lawrence Page, Sergey Brin, Rajeev Motwani, and [37] Hanghang Tong, B. Aditya Prakash, Tina Eliassi- Terry Winograd. The pagerank citation ranking: Rad, Michalis Faloutsos, and Christos Faloutsos. Bringing order to the web. Technical report, Gelling, and melting, large graphs by edge manipu- Stanford InfoLab, 1999. lation. In CIKM, pages 245–254. ACM, 2012. [27] Romualdo Pastor-Satorras, Claudio Castellano, [38] Hanghang Tong, B. Aditya Prakash, Tina Eliassi- Piet Van Mieghem, and Alessandro Vespignani. Rad, Michalis Faloutsos, and Christos Faloutsos. Epidemic processes in complex networks. Reviews Gelling, and melting, large graphs by edge manipu- of modern physics, 87(3):925, 2015. lation. In CIKM, pages 245–254. ACM, 2012. [28] Adam Paszke, Sam Gross, Soumith Chintala, Gre- [39] Leo Torres, Kevin S Chan, Hanghang Tong, gory Chanan, Edward Yang, Zachary DeVito, Zem- and Tina Eliassi-Rad. Node immunization with ing Lin, Alban Desmaison, Luca Antiga, and Adam non-backtracking eigenvalues. arXiv preprint Lerer. Automatic differentiation in pytorch, 2017. arXiv:2002.12309, 2020. Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
[40] Lloyd N Trefethen and David Bau III. Numerical Appendix linear algebra, volume 50. SIAM, 1997. A Generalization to Other Diffusion [41] Lloyd N Trefethen and Mark Embree. Spectra and Dynamics pseudospectra: the behavior of nonnormal matrices In this section we discuss generalization of the targeted and operators. Princeton University Press, 2005. diffusion model, i.e., Eq. (3.6), to other common diffusion dynamics. The fundamental question is: does the heuris- [42] Piet Van Mieghem, Dragan Stevanović, Fernando tic encoded by the model apply to other scenarios with Kuipers, Cong Li, Ruud Van De Bovenkamp, Daijie different diffusion dynamics (e.g., SIR or SEIR)? Liu, and Huijuan Wang. Decreasing the spectral radius of a graph by link removals. Phys. Rev. E, First, the feasible region of the model is independent of 84(1):016101, 2011. the diffusion dynamics, as it is only related to the spectral properties of the underlying graph. Thus, the structural [43] Tan Van Vu and Yoshihiko Hasegawa. Diffusion- (i.e., spectra, degree sequence, and triangle distribution) dynamics laws in stochastic reaction networks. Phys. preserving properties of the diffusion model generalize to Rev. E, 99:012416, 2019. other diffusion dynamics. Next, recall that the objective [44] Zhihui Wang and Thomas S. Deisboeck. Dynamic function of the model is the following targeting in cancer treatment. Frontiers in Physiol- α1 λ1 (ÃS ) + α2 σ(S) + α3 φ(S). ogy, 10:1–9, 2019. The third term is the normalized cut, which only depends [45] Duncan J Watts and Steven H Strogatz. Collective on structural properties of the underlying graph, so dynamics of small-world networks. Nature, 393 it generalizes to any other diffusion dynamics. The (6684):440, 1998. first term also generalizes to many common diffusion [46] Yang Yang, Takashi Nishikawa, and Adilson E dynamics, including SIR and SEIR, as their epidemic Motter. Small vulnerable sets determine large thresholds are known to also be determined by the largest network cascades in power grids. Science, 358(6365): eigenvalue of the underlying adjacency matrix [30]. The eaan3184, 2017. only exception is the second term σ(S), that is, limiting the impact on non-targeted subset through maximizing [47] Sixie Yu and Yevgeniy Vorobeychik. Removing the eigencentrality of the targeted subset. This is because malicious nodes from networks. In AAMAS, pages the rationale of maximizing σ(S) depends on the steady 314–322, 2019. state of the diffusion dynamics. Here, the steady state is where in the long run a constant (in average) fraction [48] Haifeng Zhang, Yevgeniy Vorobeychik, Joshua of infected nodes exist. However, both SIR and SEIR Letchford, and Kiran Lakkaraju. Data-driven agent- have been shown without a steady state, as in the long based modeling, with application to rooftop solar run all nodes will be in the recovered state (i.e., immune adoption. JAAMAS, 30(6):1023–1049, 2016. to the diffusion) [27]. [49] Kai Zhou, Tomasz P. Michalak, Marcin Waniek, Ta- Finally, the certified robustness in Section 5 generalizes lal Rahwan, and Yevgeniy Vorobeychik. Attacking to other diffusion dynamics, as its proof only depends on similarity-based link prediction in social networks. the spectral properties of the underlying graph. In AAMAS, pages 305–313, 2019. B Degree Sequence and Triangles We now show that satisfying the restrictions Eq (3.5) implies that certain structural properties of the graph will be perturbed by only a small amount. Indeed, the principal’s action has mild impact on the degree sequence of G. Let d = A1 be the vector whose i-th entry is the degree of the i-th node in the original graph, and similarly, let d˜ = Ã1 be the degree sequence after the perturbation. Proposition B.1. The degree sequence of G before and after the perturbation satisfies: √ (2.13) kd˜ − dk2 ≤ n. Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
Proof. And thus (2.14) n n (a) √ √ kd˜ − dk2 = kÃ1 − A1k2 = k∆1k2 = k∆(1/ n)1 nk2 X X 2|T̃ − T | ≈ λi (A)2 ηi ≤ λi (A)2 ηi √ i i ≤ n max k∆xk2 kxk=1 n (2.20) X (b) √ ≤ λi (A)2 = nk∆k2 i √ ≤ n, = Tr(A2 ). where (a) is due to the fact that à − A = ∆, and (b) comes Since A is symmetric and binary, we have Tr(A2 ) = 2m, from the definition of spectral norm. where m is the number of edges in G. A direct corollary of Proposition B.1 concerns the average In what follows we present experimental results to show degree of G. that the spectra and degree sequences do not change a lot due to the targeted diffusion. The spectra and Corollary B.1. The average degree of G after the pertur- degree sequences of G and G̃ are showed in Figure 4, bation is within of the average degree before the perturbation: in which the top row presents the spectra with the eigenvalues as ranked in descending order, and the bottom row presents the degree sequences. The three (2.15) davg (G; Ã) − davg (G; A) ≤ . columns (from left to right) correspond to the email network, the airport network, and the brain network, > ˜ > respectively. The parameter γ is set to 0.5, the most Proof. Note that davg (G̃) = 1 n d and davg (G) = 1 n d . powerful principal. Thus we have: The Email Network: From Figure 4 (top row), the ˜ − 1> d/n = (1/n) 1> (d˜ − d) 1> d/n eigenvalues with large value admit the largest deviation, (2.16) while the bottom row of that figure shows that the ≤ (1/n)k1k2 · kd˜ − dk2 degree sequence is not significantly affected by the = . targeted diffusion. In fact, the change to the original degree sequence is mild, and a student’s t-test cannot differentiate the modified degree sequence from the original one (p-value=0.081). Next, we perform a similar analysis for the number of triangles before and after the perturbation. The Airport and The Brain Networks: The airport network is directed and each node pair is Proposition B.2. Assume G is unweighted with m edges associated with two edges in opposite directions. We and T triangles. Suppose the number of triangles after the convert the network to an undirected one by substituting perturbation is T̃ . Then we have an undirected edge for the two edges. The weight of the undirected edge is the sum of the weights on the two (2.17) |T − T̃ | ≤ m, edges. The spectra and degree sequences of G and G̃ are showed in the last two columns of Figure 4. As we where the estimate is correct up to a first order approximation. can see from Figure 4 (top row), the graph spectrum is again nearly preserved, except the eigenvalues with small Proof. Since G is unweighted, we have T = Tr A3 /6, where Tr is the trace operator. The restrictions Eq. (3.5) values admit some deviation. Similarly, the modified guarantee that we can write λi (Ã) = λi (A) + ηi , where degree sequences cannot be differentiated from the ηi ∈ [−1, 1]. Thus, original one by student’s t-tests (airport: p-value=0.4969, (2.18) brain: p-value=0.9919). n X n X n X 6T̃ = λi (Ã3 ) = λi (Ã)3 = (λi (A) + ηi )3 . C Proof of Theorem 5.1 i i i Proof. From the discussion in the main paper, an instance Expanding the cube and neglecting the terms of higher order TargetDiff(S, G, ) can be encoded by the following meta in ηi , we have model: (2.19) n n n max I(G̃S ) − I(GS ) à X X X 6T̃ ≈ λi (A)3 + 3 λi (A)2 ηi = 6T + 3 λi (A)2 ηi . (3.21) i i i s.t. à ∈ P. Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
eigenvalues eigenvalues eigenvalues modified modified modified of two convex sets. 50 original 5 original 4 original 0 2 0 0 The objective function is concave in d˜S , since it is twice 0 500 1000 0 500 1000 1500 0 200 400 600 differentiable on the feasible region M and the Hessian matrix eigen-dimension eigen-dimension eigen-dimension is negative definite; the Hessian matrix is a diagonal matrix 300 1500 with the i-th diagonal element being −2/d˜3i . Thus, Eq. (3.23) modified modified 40 modified #nodes #nodes 200 1000 #nodes original original original is a convex optimization problem. Note that the Slater’s 100 500 20 condition is satisfied (e.g., with d˜S = dS ), which indicates 0 0 0 that strong duality holds. Thus, the KKT conditions are 0 100 200 300 0 2 4 0.0 2.5 5.0 7.5 10.0 degree degree 1e7 degree satisfied at any primal and dual optimal solutions. Figure 4: Spectra (top) and degree sequences (bottom) For convenience, in what follows we use di (resp. d˜i ) to for the left: the email network; middle: the airport represent the degree of a node i ∈ S before (resp. after) network; and right: the brain network. The hyper- graph modification. The Lagrange function of Eq. (3.23) is: (3.24) parameters are set to α1 = α2 = α3 = 1/3. X 1 1 L(d˜S , λ, β) = − + λ n2 − kd˜S − dS k22 i∈S di d˜i As discussed in [23] (see Section IV.B), computing the exact + β > d˜S − dS , value of I(GS ) is intractable, since the exact computation of πi is challenging. Our model Eq. (3.6) can be thought where λ ≥ 0 and β 0 are Lagrangian multipliers. Recall of as a tractable proxy to the meta model. An estimation ˆ S) = P that the degrees of nodes in S are increased, i.e., d˜i ≥ di to I(GS ) is given in [23], i.e., I(G i∈S 1 − δ/(βdi ), for all i ∈ S. Note that for a node i ∈ S such that d˜i = di , where di is the degree of node i in G. The estimator works in we let the corresponding βi = 0. Thus, by complementary the region δ/β ≤ dmin , where dmin is the minimum degree slackness, we have βi = 0 for all i ∈ S. The gradient of of G. When the estimation is reasonably good, that is ˆ S ) − I(GS )| ≤ τ , we have the following relation: L(d˜S , λ, β) w.r.t. d˜i becomes: |I(G (3.22) ∂L 1 1 I(G̃S ) − I(GS ) > 2τ =⇒ (I( ˆ G̃S ) + τ ) − (I(G ˆ S ) − τ ) > 2τ = 2 − 2λd˜i + βi = 2 − 2λd˜i . ∂ d˜i d˜i d˜i ˆ G̃S ) − I(G =⇒ I( ˆ S ) > 0. Setting the gradient to zero leads to: Thus, in what follows we focus on deriving the necessary 1/3 ˆ G̃S ) − I(G condition for I( ˆ S ) > 0, which directly translates 1 d˜i = , ∀i ∈ S. to the necessary condition for I(G̃S ) − I(GS ) > 2τ . 2λ Suppose there exists an adjacency matrix Ã∗ ∈ P such that Since the optimal solution exists, we have λ 6= 0. By I(G̃S ) − I(GS ) > 2τ . This indicates that the corresponding complementary slackness we have λ n2 − kd˜S − dS k22 = 0, instance TargetDiff(S, G, ) is successful. Consequently, it which indicates: follows ˆ G̃S )− I(G I( ˆ S ) > 0. Recall that I( ˆ G̃S )− I(G ˆ S) = δ P that 1 1 ˜S ∈ R|S| represent the degree n2 = kd˜S − dS k22 . β ( i∈S di − d˜i ). Let d Expand the above equation: sequence of nodes in S. Due to Proposition B.1 we have kd˜S − dS k22 ≤ kd˜ − dk22 ≤ n2 . Consider the optimization n2 = kd˜S − dS k22 problem in Eq. (3.23), where the objective function is X 2 ˆ G̃S ) − I(G I( ˆ S ) (up to a multiplicative factor). The last = d˜i − di constraint follows from the assumption that for a successful (3.25) i∈S instance the degrees of nodes in S increase. The fact that X 1 1/3 !2 ˆ G̃S ) − I(G I( ˆ S ) > 0 implies that the optimal solution of = − di . Eq. (3.23) exists and the associated objective value is greater i∈S 2λ than zero. 1 1/3 Substitute 2λ with a variable x and re-arrange the above equation: X 1 1 max − d2i − n2 P P 2 di d˜S i∈S di d˜i x2 − i∈S x+ i∈S = 0. (3.23) |S| |S| s.t. kd˜S − dS k22 ≤ n2 According to vieta theorem, a necessary condition that we d˜S dS . can solve for x ∈ R from the above equation is: ! 2 i∈S di 2 2 2 P P i∈S di − n Denote the feasible region of the above optimization problem −4 ≥ 0, as M. Note that M is a convex set since it is the intersection |S| |S| Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
25% 4% 20% which leads to: (1/3, 1/3, 1/3) (1/3, 0, 1/3) Ioriginal Ioriginal Ioriginal 0% 3% (1/3, 0, 1/3) 15% (1/3, 0, 0) 0 -25% 2% S S S r P 2 P 2 !1/2 10% |S| i∈S di i∈S di -50% 1% Imodified Imodified Imodified ≥ − . -75% (1/3, 0, 1/3) 0% 5% n |S| |S|2 (0, 0, 1/3) 0 S S S -100% -1% 0% 10% 20% 30% 40% 50% 10% 20% 30% 40% 50% 10% 20% 30% 40% 50% 5.00% 0.40% (1/3, 1/3, 1/3) 6.00% (1/3, 0, 1/3) Ioriginal Ioriginal Ioriginal 0.00% (1/3, 0, 1/3) 4.00% (1/3, 0, 0) 0.20% 0 2.00% S S S -5.00% D Additional Results on Real Networks 0.00% Imodified Imodified Imodified 0.00% -10.00% (1/3, 0, 1/3) -2.00% (0, 0, 1/3) 0 S S S We run three experiments to show the effectiveness of -15.00% -0.20% -4.00% 10% 20% 30% 40% 50% 10% 20% 30% 40% 50% 10% 20% 30% 40% 50% each term in the objective function of POTION . The 5% 1% 5% results are showed in Figure 5. The three rows (from Ioriginal Ioriginal Ioriginal (1/3, 1/3, 1/3) (1/3, 0, 1/3) 0% 0% (1/3, 0, 1/3) 4% (1/3, 0, 0) 0 S S S 3% top to bottom) correspond to experimental results on -5% 0% 2% Imodified Imodified Imodified -10% (1/3, 0, 1/3) -0% the email, airport and brain networks. (0, 0, 1/3) 1% -15% -1% 0% 0 S S S 10% 20% 30% 40% 50% 10% 20% 30% 40% 50% 10% 20% 30% 40% 50% The first column corresponds to the first experiment, which is to show that maximizing λ1 (ÃS ) leads to higher infected ratios. The hyper-parameters are set to Figure 5: Experiments showing the model’s effectiveness. α1 = 1/3, α2 = 0, and α3 = 1/3 (the hyper-parameters Top: the email network; Middle: the airport network; do not need to sum to one). The labels of the y-axis Bottom: the brain network. The three columns (from left S S to right) show the effectiveness of: 1) maximizing λ1 (ÃS ); 2) become Imodified − Ioriginal , which highlights that the infected ratios are for the targeted subgraph GS (the Maximizing the eigenvector centrality of S; 3) maximizing higher the better). Note that α2 is set to zero in order to the normalized cut of S. avoid the coupling between the eigenvector centrality of S and λ1 (ÃS ). From the plot it is clear that maximizing λ1 (ÃS ) is important to increase the infected ratios within qualitatively resemble real networks [45]. BTER are GS (the blue line). Note that solely maximizing the generative network models that can be calibrated to normalized cut of S may backfire (the red line), as match real-world networks, in particular, to reproduce a large portion of edges are deleted from GS when γ the community structures [33]. increases. The experimental setup is similar to the setup for the The second column is to show the effectiveness of limiting email network, except for a few changes. First, the the impact on GS 0 by maximizing the eigenvector S0 S0 experimental results for each class of the synthetic centrality of S. The y-axis represents Imodified − Ioriginal , networks are averaged over 30 randomly generated which highlights that the infected ratios are for the network topologies. Another difference lies in how the non-targeted subgraph GS 0 (the lower the better). The targeted set S is selected. For each randomly generated plot shows that the impact on GS 0 is well limited; the network, the targeted set S is selected as the node whose effectiveness is most significant when γ > 40%. degree is the 90 percentile of the degree sequence, and The last column is to show the effectiveness of maximiz- its neighbors. Some statistics of the synthetic networks ing the normalized cut of S. We set α2 = 0 to avoid are summarized in Table 2. Recall that δ = 0.24 and the effect of maximizing the eigenvector centrality of β = 0.06. The experimental results are showed in S. Observe that maximizing the normalized cut of S is Figure 6. The conclusion derived from Figure 6 is similar effective in increasing the infected ratio within GS only to that of the email network. It is worth pointing out that for the email network. This suggests that on weighted maximizing the normalized cut of S is effective on BA graphs normalized cut might not be a good heuristic to networks, while for other network it may backfire. increase the centrality of S. E Additional Results on Synthetic BA Watts-Strogatz BTER Networks |S| 17.5 12 20.03 In this section we show experimental results on synthetic dmin 9.86 10 11.69 density 0.02 0.03 0.03 unweighted graphs with 375 nodes. We focus on average degree 9.87 10 11.5 three classes of networks: Barabasi-Albert (BA), Watts- average clustering coeff. 0.08 0.35 0.05 Strogatz, and BTER [33]. BA is characterized by its power-law degree distribution [5]. Watts-Strogatz Table 2: Statistics of synthetic networks. is well-known for its local clustering in a way as to Copyright © 2021 by SIAM Unauthorized reproduction of this article is prohibited
You can also read