A new multilevel algorithm for balanced partition problem on large scale directed graphs
Advances in Aerodynamics volume 3, Article number: 23 (2021)
Abstract
Graph partition is a classical problem in combinatorial optimization and graph theory, with many applications such as scientific computing, VLSI design and clustering. In this paper, we study the partition problem on large scale directed graphs under a new objective function, which gives a new instance of the graph partition problem. We first formulate a model of this problem, then design an algorithm based on the multilevel strategy and the recursive partition method, and finally carry out extensive simulation experiments. The experimental results verify the stability of our algorithm and show that it performs as well as METIS in terms of maximum load. In addition, our algorithm is better than METIS on unbalanced ratio.
Introduction
Graph partition is a classical problem in combinatorial optimization and graph theory. Given a graph G and a parameter k, the aim is to divide the vertex set of G into k parts so as to optimize given objective functions. If we require the number (or total weight) of vertices in all parts to be the same or as close as possible, the problem is called the balanced graph partitioning problem (BGP). BGP is a standard special case of the graph partition problem, and it has many applications in scientific computing, VLSI and chip design, image processing and clustering. Andreev and Räcke [1] showed that BGP is NP-hard even for 2-partition, and that it admits no constant-factor approximation algorithm. In particular, BGP does not admit a constant-factor approximation algorithm unless NP = P, even for trees and grids [2]. In addition, other variants of the graph partition problem with practical backgrounds have also received extensive attention, such as the hypergraph partition problem [3, 4], the balanced connected graph partition problem [5, 6], and the path-partition problem [7]. Recently, Buluç et al. [8] surveyed algorithm design and applications for the graph partition problem. Although there is no constant-factor approximation algorithm for BGP, many heuristic algorithms have been developed because of its wide applications. First, using a local search strategy, Kernighan and Lin [9] presented an efficient heuristic algorithm for 2-BGP with time complexity \(O(n^{2}\log n)\). Then, Fiduccia and Mattheyses [10] developed a linear-time heuristic algorithm. The spectral method [11] is another important approach to BGP. It divides the given graph into two parts using the eigenvalues and eigenvectors of its adjacency matrix or Laplacian matrix. At present, there are many graph partition algorithms based on the spectral method [12, 13], which can solve 2-BGP directly or general k-BGP iteratively.
On the other hand, as problem scales and computing power increase, the graphs to be partitioned are becoming larger and larger, with the number of vertices reaching 100,000,000 or more. It is therefore impractical to use the above algorithms for large scale graph partition problems, and researchers have proposed multilevel methods and streaming algorithms instead. The main idea of the multilevel method is first to convert the original graph into a small graph by repeated contraction, then to divide the contracted graph into k parts, and finally to map the partition of the contracted graph back to the original graph and refine it. The popular graph partition software packages METIS [14] and KaHIP [15] are based on this method. The main idea of a streaming algorithm is to assign the vertices of the graph to suitable parts one by one, guided by a specific potential function. Streaming algorithms are fast and memory-saving, which makes them well suited to large-scale graph partition problems. The graph partition software FENNEL [16] is based on a streaming algorithm.
Although many theoretical results and algorithms for graph partition have been obtained, some problems remain unexplored. The first is partition on directed graphs. Most previous work is on undirected graphs, but for some practical applications, such as multi-subject coupling problems, the corresponding models should be directed graphs. Therefore, it is necessary to study partition on directed graphs. The second concerns the objective function. In the past, researchers usually considered vertex weights and edge weights separately, that is, they optimized an edge-weight objective function under vertex-weight constraints. There are few works on objective functions that combine the two weights. Based on these two points, we study the partition problem on directed graphs with a combined weight function.
The paper is organized as follows. Section 2 presents some basic concepts of graph theory and the mathematical model of the problem. In Section 3, we introduce the main idea and the process of our algorithm. The experimental results are presented in Section 4, where we verify the stability of our algorithm, determine some parameters, and compare our algorithm with METIS. Finally, the conclusion and future work are given in Section 5.
Basic conceptions and mathematical modeling
In this section, we introduce some concepts in graph theory and formulate the mathematical program for the new balanced graph partition problem.
An (undirected) graph G is an ordered pair (V(G),E(G)) consisting of a set V(G) of vertices and a set E(G) of edges. Each edge of G is an unordered pair of vertices. If an edge e joins vertices u and v, then u and v are called the ends of e. A directed graph D is an ordered pair (V(D),A(D)) consisting of a set V(D) of vertices and a set A(D) of arcs (directed edges). Each arc of D is an ordered pair of vertices. If an arc a joins vertex u to vertex v, then u is the tail of a, v is the head of a, and u and v are the ends of a. If we regard each edge e=uv of a graph as two arcs (u,v) and (v,u), then the graph becomes a directed graph. Thus, undirected graphs can be considered as a special class of directed graphs. For any vertex v in D, \(A_{D}^{-}(\{v\})\) denotes the set of arcs whose head is v, and \(A_{D}^{+}(\{v\})\) denotes the set of arcs whose tail is v. Furthermore, for any vertex subset X, \(A_{D}^{-}(X)\) (\(A_{D}^{+}(X)\)) is the set of arcs whose heads (tails) are in X but whose tails (heads) are not in X. A set M of independent arcs (arcs with no common ends) in a digraph D is called a matching. Given a matching M of D, a vertex v is called matched (by M) if v is an end of some arc of M; otherwise, v is called unmatched. A matching M of D is maximal if, for any arc a not in M, M∪{a} is not a matching of D.
Given a directed graph D=(V,A) with a weight function w on V∪A, a k-partition P is a decomposition (V_{1},V_{2},…,V_{k}) of the vertex set V such that V_{i}≠∅ for any 1≤i≤k, V_{i}∩V_{j}=∅ for any 1≤i<j≤k, and V_{1}∪V_{2}∪⋯∪V_{k}=V. Given a specific k-partition P, for any part j, we define its load
\[L^{P}_{j}=w(V_{j})+w(A_{D}^{-}(V_{j})),\]
where \(w(V_{j})=\sum \limits _{v\in V_{j}} w(v)\) and \(w(A_{D}^{-}(V_{j}))=\sum \limits _{a\in A_{D}^{-}(V_{j})}w(a)\). Let \(L^{P}_{M}\) and \(L^{P}_{m}\) be the maximum load and the minimum load among all parts in P, that is,
\[L^{P}_{M}=\max _{1\le j\le k}L^{P}_{j},\qquad L^{P}_{m}=\min _{1\le j\le k}L^{P}_{j}.\]
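As an illustration, the loads of a partition can be computed in one pass over the vertices and arcs. The following is a minimal sketch, assuming that the load of a part is its total vertex weight plus the total weight of the arcs entering it from other parts; the function name `part_loads` and the dictionary-based graph representation are hypothetical and not taken from the paper.

```python
from collections import defaultdict

def part_loads(vertex_weight, arc_weight, part_of):
    """Return {part: load}, assuming load = vertex weight of the part
    plus the weight of arcs whose head is in the part and tail is outside."""
    loads = defaultdict(float)
    for v, w in vertex_weight.items():
        loads[part_of[v]] += w
    for (u, v), w in arc_weight.items():
        if part_of[u] != part_of[v]:
            loads[part_of[v]] += w  # arc (u, v) enters the part of its head v
    return dict(loads)

# Toy directed graph (illustrative data only).
vertex_weight = {"a": 4.0, "b": 3.0, "c": 5.0}
arc_weight = {("a", "b"): 1.0, ("b", "c"): 2.0, ("c", "a"): 0.5}
part_of = {"a": 0, "b": 0, "c": 1}

loads = part_loads(vertex_weight, arc_weight, part_of)
max_load, min_load = max(loads.values()), min(loads.values())
print(loads, max_load, min_load)
```

Here the arc (a,b) lies inside part 0 and contributes nothing, while (c,a) and (b,c) are charged to the parts of their heads.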
Thus, we model the balanced graph partition problem as the following unconstrained two-objective program,
\[\min _{P\in \mathcal {P}} L^{P}_{M},\qquad \min _{P\in \mathcal {P}} \rho ^{P},\]
where \(\mathcal {P}\) is the set of all k-partitions of D and ρ^{P} is the unbalanced ratio of the partition P.
As mentioned in Section 1, our problem differs from the one solved by METIS in two respects. First, METIS only deals with undirected graphs, whereas our problem is defined on directed graphs. Second, the objectives are different. The optimization problem of METIS is as follows,
\[\min \sum _{e\in E_{C}} w(e)\quad \text {s.t.}\quad \max _{1\le j\le k} w(V_{j})\le \rho \,\frac {w(V)}{k},\]
where E_{C} is the set of edges whose ends are in distinct parts, and ρ≥1 is the unbalanced ratio of the vertex weights. That is to say, the model of METIS considers vertices and edges separately, whereas we consider them together.
Algorithm
Since the graphs we deal with are very large (up to 100,000,000 vertices) and the number of parts is also large (up to 100,000), our algorithm combines the classical multilevel method with the recursive partition method.
Multilevel stage
The multilevel method is currently the most popular approach to partitioning large scale graphs. It contains three phases: iterative contraction, initial partition and modification, and back mapping. We describe each phase in detail below.
PHASE 1: Iterative Contraction. In this phase, we construct a sequence of directed graphs (D_{0},D_{1},…,D_{m}) with |V(D_{i+1})|<|V(D_{i})| for 0≤i≤m−1, where D_{0} is the original directed graph. To do this, we apply the standard strategy to the current graph D_{i}: we compute a maximal matching M_{i} and contract every arc of M_{i} into a new vertex to obtain the next graph D_{i+1}. In detail, for any arc a=(u,v) of M_{i}, contraction removes a and a^{′}=(v,u) and identifies u and v as a new vertex x, so that x is incident with those arcs (other than a and a^{′}) that were originally incident with u or v or both. The weight of the new vertex x is the sum of the weights of u and v, and the weight of each new arc (x,y) is w(u,y) if only u was joined to y, w(v,y) if only v was joined to y, and w(x,y)=w(u,y)+w(v,y) if both were; arcs (y,x) are treated in the same way.
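The contraction step described above can be sketched as follows. This is a minimal illustration under the stated rules (vertex weights are added, parallel arcs are merged by adding their weights, and arcs inside a contracted pair are dropped); the function name `contract`, the dictionary representation and the naming of a merged vertex by the pair (u, v) are assumptions made for the example.

```python
def contract(vertex_weight, arc_weight, matching):
    """Contract every arc (u, v) of `matching`.

    Returns (new vertex weights, new arc weights, old-to-new vertex map)."""
    merged = {}
    for u, v in matching:
        merged[u] = merged[v] = (u, v)      # new vertex x, named by the pair
    for v in vertex_weight:
        merged.setdefault(v, v)             # unmatched vertices keep their name

    new_vw = {}
    for v, w in vertex_weight.items():
        new_vw[merged[v]] = new_vw.get(merged[v], 0) + w

    new_aw = {}
    for (s, t), w in arc_weight.items():
        x, y = merged[s], merged[t]
        if x != y:                          # drops a = (u, v) and a' = (v, u)
            new_aw[(x, y)] = new_aw.get((x, y), 0) + w
    return new_vw, new_aw, merged

# Toy example: contract the arc (1, 2).
vertex_weight = {1: 2, 2: 3, 3: 4}
arc_weight = {(1, 2): 1, (2, 1): 1, (1, 3): 2, (2, 3): 5, (3, 1): 1}
new_vw, new_aw, merged = contract(vertex_weight, arc_weight, [(1, 2)])
print(new_vw)
print(new_aw)
```

In the example, both 1 and 2 have an arc to vertex 3, so the new arc from the merged vertex to 3 receives the sum of the two weights.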
This phase ends when one of the following occurs: (i) the number of vertices of the current graph is less than ck, where k is the number of parts of the partition and c=90 is the contracted parameter chosen by the experiments in Section 4; (ii) the contraction ratio |V(D_{i+1})|/|V(D_{i})| is larger than 80%, that is, |M_{i}|<20%·|V(D_{i})|. To compute the maximal matching, we use one of the following two randomized methods.
Random Maximum Weight Matching (RMWM). This classical method is used in METIS [14] and other multilevel algorithms [15]. It works as follows. The vertices of the graph are visited in random order. If the chosen vertex u is already matched, or all of its in-neighbors are matched, we move on to the next vertex. Otherwise, u is matched with the unmatched in-neighbor v that maximizes the weight of the arc (v,u), that is,
\[v=\arg \max \{w(v',u) : v' \text { is an unmatched in-neighbor of } u\}.\]
When all vertices are chosen, we can obtain a maximal matching.
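The RMWM procedure can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation; the function name `random_max_weight_matching`, the `seed` parameter and the dictionary representation of arcs are assumptions.

```python
import random

def random_max_weight_matching(vertices, arc_weight, seed=None):
    """Visit vertices in random order; match each unmatched vertex u with the
    unmatched in-neighbor v that maximizes w(v, u). Returns a list of arcs."""
    rng = random.Random(seed)
    in_neighbors = {v: [] for v in vertices}
    for (s, t), w in arc_weight.items():
        if s != t:
            in_neighbors[t].append((w, s))

    order = list(vertices)
    rng.shuffle(order)
    matched, matching = set(), []
    for u in order:
        if u in matched:
            continue
        candidates = [(w, v) for w, v in in_neighbors[u] if v not in matched]
        if not candidates:
            continue                      # all in-neighbors already matched
        _, v = max(candidates, key=lambda c: c[0])
        matching.append((v, u))
        matched.update((u, v))
    return matching

vertices = [1, 2, 3, 4]
arc_weight = {(1, 2): 5, (2, 3): 1, (3, 4): 5, (4, 1): 2}
matching = random_max_weight_matching(vertices, arc_weight, seed=0)
print(matching)
```

The result is maximal: if an arc (v,u) had both ends unmatched at the end, then v was also unmatched when u was visited, so u would have been matched at that point.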
Random Maximum Ratio Matching (RMRM). This matching is motivated by the new objective functions. RMRM differs from RMWM only in how the partner of a vertex u is chosen among its in-neighbors. Since the objective function considers the weights of vertices and arcs together, u is matched with the unmatched in-neighbor v that maximizes the ratio of arc weight to vertex weight, that is,
PHASE 2: Initial Partition and Modification. After iterative contraction, the final graph D_{m} has at most ck vertices, so we can quickly obtain a good initial partition by a greedy strategy. In detail, we use a best fit decreasing (BFD) algorithm similar to the one used for the bin-packing problem. First, we set every part P_{j}=∅ for j=1,2,…,k and sort the vertices in decreasing order of vertex weight. At each stage, if we put the current vertex v into the j-th part, then the load of the j-th part will become
and the load of any other part i (≠j) will become
Thus, we put v into the part for which the resulting maximum load is minimum. When all the vertices have been visited, the initial partition P is obtained.
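The greedy placement can be sketched as follows. This is a simplified illustration with several assumptions that are not taken from the paper: loads are recomputed from scratch for every tentative placement instead of being updated incrementally, only arcs between already placed vertices are counted, and ties in the maximum load are broken in favor of the lighter receiving part. The names `partial_loads` and `greedy_initial_partition` are hypothetical.

```python
def partial_loads(vertex_weight, arc_weight, part_of, k):
    """Loads of the k parts, counting only vertices and arcs already placed."""
    loads = [0.0] * k
    for v, p in part_of.items():
        loads[p] += vertex_weight[v]
    for (s, t), w in arc_weight.items():
        if s in part_of and t in part_of and part_of[s] != part_of[t]:
            loads[part_of[t]] += w
    return loads

def greedy_initial_partition(vertex_weight, arc_weight, k):
    part_of = {}
    # Visit vertices in decreasing order of vertex weight.
    for v in sorted(vertex_weight, key=vertex_weight.get, reverse=True):
        best_key, best_part = None, None
        for j in range(k):
            part_of[v] = j                          # tentative placement
            loads = partial_loads(vertex_weight, arc_weight, part_of, k)
            key = (max(loads), loads[j])            # tie-break is an assumption
            if best_key is None or key < best_key:
                best_key, best_part = key, j
        part_of[v] = best_part                      # keep the best placement
    return part_of

vertex_weight = {"a": 5.0, "b": 4.0, "c": 3.0, "d": 2.0}
arc_weight = {("a", "b"): 1.0, ("c", "d"): 1.0}
initial = greedy_initial_partition(vertex_weight, arc_weight, k=2)
initial_loads = partial_loads(vertex_weight, arc_weight, initial, 2)
print(initial, initial_loads)
```

Recomputing the loads for every candidate part is far too slow for large graphs; it is used here only to keep the sketch short.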
The aim of modification is to turn the initial partition into a local optimum. The main strategy is local search, that is, we iteratively move a vertex of the maximum-load part into another part to reduce the maximum load. In detail, in the current iteration, we first choose a part P_{j} with the maximum load. Then, for any vertex v in P_{j}, we calculate its in-arc-weight \(w^{-}_{i}(v)\) and out-arc-weight \(w^{+}_{i}(v)\) with respect to each part P_{i} (1≤i≤k) as follows,
\[w^{-}_{i}(v)=\sum _{u\in P_{i},\,(u,v)\in A} w(u,v),\qquad w^{+}_{i}(v)=\sum _{u\in P_{i},\,(v,u)\in A} w(v,u).\]
Now, if we move vertex v from part P_{j} into part P_{i}, then the load of any part other than P_{i} and P_{j} does not change, and the new loads \(L^{\prime }_{j}\) and \(L^{\prime }_{i}\) become
For every pair (v,P_{i}), we can thus calculate the maximum load and the sum of loads of the partition obtained by this move, which we call the swapped partition.
If some swapped partition has a maximum load less than that of the current partition, we replace the current partition with the swapped partition of minimum maximum load and repeat this operation. Otherwise, if some swapped partition has the same maximum load as the current partition but a smaller sum of loads, we replace the current partition with the swapped partition of minimum sum of loads and repeat this operation. Otherwise, the current partition is a local optimum, and the modification process is finished.
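The effect of a single move on the two affected loads can be evaluated without recomputing the whole partition. The sketch below assumes that the load of a part is its vertex weight plus the weight of the arcs entering it from other parts, and derives the update from that assumption; it is not a transcription of the paper's formulas. The names `part_loads` and `move_loads` are hypothetical, and a practical implementation would use adjacency lists rather than scanning all arcs.

```python
def part_loads(vertex_weight, arc_weight, part_of, k):
    loads = [0.0] * k
    for v, w in vertex_weight.items():
        loads[part_of[v]] += w
    for (s, t), w in arc_weight.items():
        if part_of[s] != part_of[t]:
            loads[part_of[t]] += w
    return loads

def move_loads(vertex_weight, arc_weight, part_of, loads, v, i):
    """New loads (L'_j, L'_i) after moving v from j = part_of[v] to i != j."""
    j = part_of[v]
    w_in, w_out = {}, {}     # w_in[p] = in-arc-weight, w_out[p] = out-arc-weight of v w.r.t. part p
    for (s, t), w in arc_weight.items():
        if t == v and s != v:
            w_in[part_of[s]] = w_in.get(part_of[s], 0.0) + w
        elif s == v and t != v:
            w_out[part_of[t]] = w_out.get(part_of[t], 0.0) + w
    total_in = sum(w_in.values())
    # P_j loses v and the arcs entering v from outside P_j,
    # and gains the arcs from v into the rest of P_j.
    new_j = loads[j] - vertex_weight[v] - (total_in - w_in.get(j, 0.0)) + w_out.get(j, 0.0)
    # P_i gains v and the arcs entering v from outside P_i,
    # and loses the arcs from v into P_i, which become internal.
    new_i = loads[i] + vertex_weight[v] + (total_in - w_in.get(i, 0.0)) - w_out.get(i, 0.0)
    return new_j, new_i

vertex_weight = {"a": 4.0, "b": 3.0, "c": 5.0, "d": 2.0}
arc_weight = {("a", "b"): 1.0, ("b", "a"): 2.0, ("b", "c"): 2.0,
              ("c", "a"): 0.5, ("d", "b"): 1.0, ("b", "d"): 3.0}
part_of = {"a": 0, "b": 0, "c": 1, "d": 2}
loads = part_loads(vertex_weight, arc_weight, part_of, 3)
new_j, new_i = move_loads(vertex_weight, arc_weight, part_of, loads, "b", 1)
print(loads, new_j, new_i)
```

The local search then compares the maximum load, and on ties the sum of loads, over all candidate pairs (v,P_{i}) using these updated values.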
PHASE 3: Back Mapping. A complete back mapping would map the partition of D_{i+1} back to D_{i} and refine the partition of D_{i} to a local optimum, recursively for i=m−1,m−2,…,0. However, since the original graph is huge and the number of parts is large, we map the partition of D_{m} directly back to a partition of D_{0} in order to save memory and reduce the running time.
Recursive partition stage
As stated in the previous subsection, iterative contraction ends when the number of vertices of the contracted graph D_{m} is less than 90k, where k is the number of parts of the desired partition. This implies that if k is large, D_{m} is also large, which can result in poor performance and long running time. We use the recursive partition strategy to avoid this.
The main idea of the recursive partition method is as follows. At the beginning, we factorize k into several small numbers, say k=k_{1}k_{2}⋯k_{t} with k_{i}≤20. This can usually be done, because in practice k is often chosen to be a number with many factors. In the first step, we use the multilevel method to obtain a k_{1}-partition P of the original graph. Since k_{1} is small, we can guarantee good performance and short running time. Based on the partition P, the whole graph is decomposed into k_{1} subgraphs, each induced by a part of P. Note that the weights of arcs in the subgraphs are the same as in the original graph, but the weight of every vertex v needs to be changed as follows,
\[w'(v)=w(v)+\sum _{(u,v)\in A,\ u\notin P[v]} w(u,v),\]
where P[v] is the part of P to which v belongs. The purpose of changing the vertex weights is to ensure that the objective values of the subgraphs sum up to that of the whole graph. In the second step, we divide every subgraph into k_{2} parts and obtain k_{1}k_{2} new subgraphs by decomposing all old subgraphs. Continuing in this way, in the last step we have k_{1}k_{2}⋯k_{t−1} subgraphs and compute a k_{t}-partition of every subgraph. That is, we obtain a partition of the original graph with k_{1}k_{2}⋯k_{t}=k parts.
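The decomposition into subgraphs with adjusted vertex weights can be sketched as follows. The sketch assumes that the weight of each arc entering a part from outside is added to the weight of its head, so that the load is preserved when the arc is cut away; the function name `induced_subgraphs` and the data layout are hypothetical.

```python
def induced_subgraphs(vertex_weight, arc_weight, part_of):
    """Split a partitioned digraph into one subgraph per part.

    Arcs inside a part are kept unchanged; the weight of every arc that
    enters a part from outside is added to the weight of its head."""
    sub_vw, sub_aw = {}, {}
    for v, w in vertex_weight.items():
        sub_vw.setdefault(part_of[v], {})[v] = w
    for (s, t), w in arc_weight.items():
        if part_of[s] == part_of[t]:
            sub_aw.setdefault(part_of[s], {})[(s, t)] = w
        else:
            sub_vw[part_of[t]][t] += w      # external in-arc absorbed by its head
    return sub_vw, sub_aw

vertex_weight = {"a": 4.0, "b": 3.0, "c": 5.0}
arc_weight = {("a", "b"): 1.0, ("b", "c"): 2.0, ("c", "a"): 0.5}
part_of = {"a": 0, "b": 0, "c": 1}
sub_vw, sub_aw = induced_subgraphs(vertex_weight, arc_weight, part_of)
print(sub_vw)
print(sub_aw)
```

In this toy example the total vertex weight of each subgraph (7.5 and 7.0) equals the load of the corresponding part in the original graph under the assumed load definition.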
Which recursive partition strategy should be chosen? Our experiments in the next section show that there is little difference between the strategies. Thus, if k is a power of some integer b≤20, that is, k=b^{t}, we factorize k as b×b×⋯×b.
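A possible way to find such a factorization is sketched below. The function name `power_factorization` and the choice of the largest admissible base (which gives the fewest stages) are assumptions; the paper does not state which base is chosen when several are possible.

```python
def power_factorization(k, max_base=20):
    """Return [b, b, ..., b] with b <= max_base and b**t == k, or None.

    The largest admissible base is tried first (an assumption)."""
    for b in range(min(k, max_base), 1, -1):
        t, value = 0, 1
        while value < k:
            value *= b
            t += 1
        if value == k:
            return [b] * t
    return None

print(power_factorization(1000))
print(power_factorization(10000))
print(power_factorization(30))
```

For k=1000 this yields three stages of 10-partition, and it returns `None` for values such as k=30 that are not a power of any integer b≤20.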
Experimental results
In this section, the experiments are divided into two parts: algorithm design and comparison with another algorithm. In the design part, we test the performance of the two random matching methods, verify the stability of the random method, and determine the contracted parameter c and the recursive partition strategy. In the comparison part, we compare our algorithm with the k-way partition algorithm in METIS on unbalanced ratio, maximum load and running time.
The directed graphs used in the experiments fall into two classes, theoretical and practical models. We use grid graphs as representatives of the theoretical model; a grid graph can also be regarded as the inner dual graph of a square grid in the plane. We consider grid graphs of three sizes, namely Grid1 with 1,000,000 vertices and 3,996,000 arcs, Grid2 with 10,890,000 vertices and 43,546,800 arcs, and Grid3 with 100,000,000 vertices and 399,600,000 arcs. Each vertex has a random weight between 120 and 150, and the weight of every arc is about 1/20 of the weight of its end. For practical models, we use 8 graphs from 3D finite element meshes, two of them from METIS and the others from real examples. The characteristics of all graphs are shown in Table 1. All the experiments were performed on a Dell T7610 graphics workstation with an Intel Xeon 2.6 GHz CPU (6 cores) and 32 GB of 1866 MHz DDR3 memory.
Matching comparison
The aim of this subsection is to test the performance of the two matching contraction methods, RMWM and RMRM, described in Subsec. 3.1. We run the experiment on five graphs: Grid1, Grid2, MDual, FEM1 and FEM3. The small-scale graphs (Grid1, MDual and FEM1) are partitioned into 100 and 1000 parts, and the large-scale graphs (Grid2 and FEM3) into 1000 and 10000 parts, where the contracted parameter is c=90 and the recursive partition strategies are 10^{2}, 10^{3} and 10^{4}. Because the algorithm is randomized, we run each partition 10 times and compare the average and maximum values of the unbalanced ratio ρ and the max-load L_{M}. The experimental and comparative results are given in Table 2 and Figs. 1 and 2.
Figure 1 shows that the unbalanced ratios of RMWM are better than those of RMRM, except for the maximum unbalanced ratio of the 100-partition on MDual. Figure 2 shows that, in terms of max-load, RMWM also performs better than RMRM, although the gap is very small and the maximum ratio is less than 1.012. Hence, we use RMWM in the following experiments.
Stability verification
In this subsection, we test the stability of the algorithm, that is, we determine whether randomness causes a large deviation in the output. The same graphs and numbers of parts as in the previous subsection are used. We compare the experimental results in three aspects: unbalanced ratio, max-load and running time. The details are given in Table 3.
From Fig. 3, we can see that the gap between the best and the worst result is very small and does not exceed 0.70%. Furthermore, the unbalanced ratio in every test case is quite small, less than 2.00% except for the worst result of the 10000-partition on FEM3. Figure 4 illustrates the max-load and the running time, where the baseline is the average value. For each example, the worst max-load is almost equal to the best one; the difference in running time is also very small, with a maximum ratio of about 1.10. Hence, the randomness of our algorithm does not cause much deviation, and the algorithm is very stable.
Determining parameters
Our algorithm has one parameter and one strategy to be determined. First, we determine the contracted parameter c introduced in Subsec. 3.1 by comparing the results for c=50, 70, 90, 110, 130, 150. The experiment was conducted on three representative graphs, Grid2, MDual and FEM3, with the same numbers of parts and the same recursive partition strategies as in Subsec. 4.1. The comparison results are illustrated in Figs. 5, 6 and 7.
Figure 5 shows the unbalanced ratios for different values of the contracted parameter c. Figures 6 and 7 show the ratios of the maximum load and of the running time, respectively, for the other parameter values to those for c=90. From these figures, we can see that the unbalanced ratio generally decreases as the contracted parameter increases, whereas the max-load and the running time often rise. Overall, good performance occurs when the parameter is 70, 90 or 110. Thus, we choose c=90.
For the recursive partition strategy, we factorized k in different ways and ran the corresponding experiments, and found little difference between the results. The deviations of the unbalanced ratio and of the max-load ratio are at most 0.5% and 0.2%, respectively. Hence, we choose the simplest strategy, that is, we write k as a power of some integer b≤20. For example, if k=1000, our algorithm runs in three stages, each of which computes a 10-partition.
Comparison with METIS
In this subsection, we compare the performance of our algorithm (Graph_Partition) with the k-way partition in METIS through experiments on the 11 graphs of Table 1. Since METIS can only deal with undirected graphs, we transform each directed graph in Table 1 into an undirected graph by setting the weight of every edge uv to \(\frac {w(u,v)+w(v,u)}{2}\). The resulting undirected graphs are then partitioned by the k-way partition. Finally, we calculate the unbalanced ratio and max-load of each graph with respect to the resulting partition. The experimental results are given in Table 4, and the comparison is shown in Figs. 8, 9 and 10. Note that since the graph Grid3 is huge (100,000,000 vertices and 399,600,000 arcs), METIS does not return a feasible result for it.
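The transformation to an undirected graph can be sketched as follows. The function name `symmetrize` is hypothetical, and the sketch assumes that a missing reverse arc counts as weight 0 and that vertex identifiers are comparable, so that each edge can be stored under a single sorted key.

```python
def symmetrize(arc_weight):
    """Turn arc weights into undirected edge weights (w(u,v) + w(v,u)) / 2.

    Assumption: a missing reverse arc contributes weight 0."""
    edge_weight = {}
    for (u, v), w in arc_weight.items():
        if u == v:
            continue                                  # ignore loops
        key = (u, v) if u <= v else (v, u)            # one key per undirected edge
        edge_weight[key] = edge_weight.get(key, 0.0) + w / 2.0
    return edge_weight

arc_weight = {(1, 2): 4.0, (2, 1): 2.0, (2, 3): 3.0}
edge_weight = symmetrize(arc_weight)
print(edge_weight)
```

The undirected weights are only passed to the undirected partitioner; the unbalanced ratio and max-load are still evaluated with the original arc weights.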
Figure 8 shows the unbalanced ratios of the partitions produced by the two algorithms. For each graph, the unbalanced ratio for a small number of parts is better than that for a large number of parts, which is natural. Most of the unbalanced ratios of our algorithm are less than 2%, while most of the results of METIS are between 6% and 9%. Clearly, our algorithm is better than METIS on unbalanced ratio. All unbalanced ratios on the graph Copter are worse, because the average degree of Copter is much larger than that of the other graphs.
Figures 9 and 10 show the ratios of the max-load and the running time of our algorithm to those of METIS. Figure 9 shows that most of the max-load ratios are between 0.94 and 1.06, which implies that there is little difference between the two algorithms in terms of maximum load. Moreover, the ratio increases with the number of parts, mainly because we do not use multilevel modification in the back mapping phase; this is a key direction of our future work. From Fig. 10, we can see that for small k our algorithm often runs longer than METIS, whereas for large k it often takes less time than METIS. This difference is related to the number of iterations and the average number of vertices in each part.
Conclusions and future work
In this paper, we consider the balanced partition problem on large scale directed graphs. First, we present a new mathematical model with new objective functions for this problem. Then, we combine the multilevel strategy and the recursive partition method to design an algorithm for it. Finally, through a large number of experiments, we determine the parameters, verify the stability of the algorithm, and compare it with the k-way partition in METIS in three aspects: unbalanced ratio, maximum load and running time. The experimental results show that, compared with METIS, our algorithm is better in unbalanced ratio and has the same quality in maximum load. Furthermore, our algorithm can deal with some huge graphs for which METIS cannot return a feasible result.
There are two possible directions for future work. The first is to add modification to the back mapping phase, that is, to map the partition of D_{m} back to that of D_{0} level by level and refine the partition at each level to a local optimum. The second is to ensure the connectivity of each part. Furthermore, finding a new efficient graph contraction method is also meaningful work.
Availability of data and materials
The data used or analysed during the current study are available from the corresponding author on reasonable request.
Abbreviations
VLSI: Very large scale integration
BGP: Balanced graph partitioning
RMWM: Random maximum weight matching
RMRM: Random maximum ratio matching
BFD: Best fit decreasing
References
1. Andreev K, Räcke H (2006) Balanced graph partitioning. Theory Comput Syst 39(6):929–939.
2. Feldmann AE (2013) Fast balanced partitioning is hard even on grids and trees. Theor Comput Sci 485:61–68.
3. Chekuri C, Xu C (2018) Minimum cuts and sparsification in hypergraphs. SIAM J Comput 47(6):2118–2156.
4. Xu B, Yu X, Zhang X, Zhang ZB (2014) An SDP randomized approximation algorithm for max hypergraph cut with limited unbalance. Sci China Math 57(12):2437–2462.
5. Chen G, Chen Y, Chen ZZ, Lin G, Liu T, Zhang A (2020) Approximation algorithms for the maximally balanced connected graph tripartition problem. J Comb Optim. https://doi.org/10.1007/s10878-020-00544-w.
6. Wu D, Zhang Z, Wu W (2016) Approximation algorithm for the balanced 2-connected k-partition problem. Theor Comput Sci 609:627–638.
7. Chen Y, Goebel R, Lin G, Liu L, Su B, Tong W, Xu Y, Zhang A (2019) A local search 4/3-approximation algorithm for the minimum 3-path partition problem. In: Proceedings of International Workshop on Frontiers in Algorithmics, 14–25. https://doi.org/10.1007/978-3-030-18126-0_2.
8. Buluç A, Meyerhenke H, Safro I, Sanders P, Schulz C (2016) Recent advances in graph partitioning. In: Kliemann L, Sanders P (eds) Algorithm Engineering. Lecture Notes in Computer Science, vol 9220, 117–158. Springer, Cham. https://doi.org/10.1007/978-3-319-49487-6_4.
9. Kernighan BW, Lin S (1970) An efficient heuristic procedure for partitioning graphs. Bell Syst Tech J 49(2):291–307.
10. Fiduccia CM, Mattheyses RM (1982) A linear-time heuristic for improving network partitions. In: Proceedings of the 19th Design Automation Conference, 175–181. https://doi.org/10.1109/dac.1982.1585498.
11. Chung FRK (1997) Spectral Graph Theory (CBMS Regional Conference Series in Mathematics, No. 92). American Mathematical Society, Providence.
12. Von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416.
13. Naumov M, Moon T (2016) Parallel spectral graph partitioning. NVIDIA Tech Rep NVR-2016-001:1–30.
14. Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput 20(1):359–392.
15. Sanders P, Schulz C (2011) Engineering multilevel graph partitioning algorithms. In: Proceedings of the 19th European Symposium on Algorithms, 469–480. https://doi.org/10.1007/978-3-642-23719-5_40.
16. Tsourakakis C, Gkantsidis C, Radunovic B, Vojnovic M (2014) Fennel: Streaming graph partitioning for massive scale graphs. In: Proceedings of the 7th ACM International Conference on Web Search and Data Mining, 333–342. https://doi.org/10.1145/2556195.2556213.
Acknowledgements
The authors would like to thank China Aerodynamics Research and Development Center for providing the practical graph examples.
Funding
This work has been supported by National Numerical Windtunnel Project (No. NNW2019ZT5B16), National Natural Science Foundation of China (Nos. 11871256, 12071194), and the Basic Research Project of Qinghai (No. 2021ZJ703).
Author information
Contributions
All authors contributed equally to this work. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Li, X., Pang, Y., Zhao, C. et al. A new multilevel algorithm for balanced partition problem on large scale directed graphs. Adv. Aerodyn. 3, 23 (2021). https://doi.org/10.1186/s42774-021-00074-x
DOI: https://doi.org/10.1186/s42774-021-00074-x
Keywords
 Graph partition problem
 Large scale graphs
 Directed graphs
 Multilevel strategy