Memetic Multilevel Hypergraph Partitioning
Abstract
Hypergraph partitioning has a wide range of important applications such as VLSI design or scientific computing. With focus on solution quality, we develop the first multilevel memetic algorithm to tackle the problem. Key components of our contribution are new effective multilevel recombination and mutation operations that provide a large amount of diversity. We perform a wide range of experiments on a benchmark set containing instances from application areas such VLSI, SAT solving, social networks, and scientific computing. Our new algorithm computes the best result on almost all instances.
1 Introduction
Given an undirected hypergraph , the way hypergraph partitioning problem is to find disjoint subsets of its vertex set, , called blocks, such that the blocks have roughly equal size and an objective function involving the cut hyperedges is minimized, e.g. , the sum of the weights of those hyperedges that connect multiple blocks. The hypergraph partitioning problem has many important applications in practice such as scientific computing or VLSI design [44]. In particular VLSI design is a field where small improvements can lead to significant savings [60]. Hence, our focus in this work is on solution quality.
It is well known that the hypergraph partitioning problem (HGP) is NPhard [40] so that mostly heuristic algorithms are used in practice. A successful heuristic to partition large hypergraphs is the multilevel approach [16]. Here, the hypergraph is recursively contracted to obtain smaller hypergraphs which should reflect the same basic structure as the input. After applying an initial partitioning algorithm to the smallest hypergraph, contraction is undone and, at each level, a local search method is used to improve the partitioning induced by the coarser level. The intuition behind this approach is that a good partition at one level of the hierarchy will also be a good partition on the next finer level. Hence, depending on the definition of the neighborhood, local search algorithms are able explore local solution spaces very effectively in this setting. However, they are also prone to get stuck in local optima [36]. The multilevel paradigm helps to some extent, since local search has a more global view on the problem on the coarse levels and a very finegrained view on the fine levels of the multilevel hierarchy. In addition, as with many other metaheuristics, multilevel HGP gives better results if several repeated runs are made with some measures taken to diversify the search.
Still even a large number of repeated executions can only scratch the surface of the huge space of possible partitionings. In order to explore the global solution space extensively we need more sophisticated metaheuristics. This is where memetic algorithms (MAs), i.e. , genetic algorithms combined with local search [37], come into play. Memetic algorithms allow for effective exploration (global search) and exploitation (local search) of the solution space. The general idea behind genetic algorithms is to use mechanisms inspired by biological evolution such as selection, mutation, recombination and survival of the fittest. A genetic algorithm (GA) starts with a population of individuals (in our case partitions of the hypergraph) and evolves the population over several generational cycles (rounds). In each round, the GA uses a selection rule based on the fitness of the individuals of the population to select good individuals and combines them to obtain improved offspring [29]. When an offspring is generated an eviction rule is used to select a member of the population to be replaced by the new offspring. For an evolutionary algorithm it is of major importance to preserve diversity in the population [10], i.e., the individuals should not become too similar in order to avoid a premature convergence of the algorithm. This is usually achieved by using mutation operations and by using eviction rules that take similarity of individuals into account.
Several genetic and memetic hypergraph partitioning algorithms have already been proposed in the literature [6, 7, 8, 15, 38]. However none of them is considered to be truly competitive with stateoftheart tools [19]. We believe that this is due to the fact that all of them employ flat (i.e., nonmultilevel) partitioning algorithms to drive the exploitation of the local solution space.
Our main contribution in this paper therefore is a technique that integrates a memetic algorithm with a multilevel hypergraph partitioner. To this end, we present sophisticated recombination and mutation operators as well as a replacement rule that uses a problem specific similarity measure. In contrast to previous work [6, 7, 8, 15, 38], which only considered small and outdated [1, 3] ACM/SIGDA benchmark instances [46] (dating back to the late 1980s), we perform extensive experiments on a large benchmark set containing hypergraphs from several application areas. Our experiments indicate that our algorithm is able to compute partitions of very high quality, scales well to large networks, and performs better than KaHyPar, which seems to be the current method of choice among the available hypergraph partitioning tools unless speed is more important than quality [30]. In a setting where competing algorithms get the same fairly large amount of time to compute a solution, our new algorithm computes the best result on out of the benchmark instances. This is in contrast to previous nonmultilevel evolutionary algorithms for the problem which are not considered to be competitive with stateoftheart tools [19].
2 Preliminaries
Notation and Definitions.
An undirected hypergraph is defined as a set of vertices and a set of hyperedges/nets with vertex weights and net weights , where each net is a subset of the vertex set (i.e., ). The vertices of a net are called pins. We extend and to sets, i.e., and . A vertex is incident to a net if . denotes the set of all incident nets of . The set denotes the neighbors of . The size of a net is the number of its pins. We use to denote the number of pins in . A way partition of a hypergraph is a partition of its vertex set into blocks such that , for and for . We use to refer to the block of vertex . We call a way partition balanced if each block satisfies the balance constraint: for some parameter . Given a way partition , the number of pins of a net in block is defined as . For each net , denotes the connectivity set of . The connectivity of a net is the cardinality of its connectivity set: . A net is called cut net if . The way hypergraph partitioning problem is to find an balanced way partition of a hypergraph that minimizes an objective function over the cut nets for some . Several objective functions exist in the literature [5, 40]. The most commonly used cost functions are the cutnet metric and the connectivity metric , where is the set of all cut nets [24, 26]. Optimizing both objective functions is known to be NPhard [40]. In this paper, we use the connectivitymetric . Contracting a pair of vertices means merging into . The weight of becomes . We connect to the former neighbors of by replacing with in all nets and remove from all nets . Uncontracting a vertex reverses the contraction.
2.1 Related Work
Overview.
Driven by applications in VLSI design and scientific computing, HGP has evolved into a broad research area since the 1990s. We refer to [5, 11, 44, 54] for an extensive overview. In the following, we focus on issues closely related to the contributions of our paper. Memetic algorithms (MAs) were introduced in [42] and formalized in [45] as an extension to the concept of genetic algorithms (GAs) [31]. While GAs effectively explore the global solution space, MAs additionally allow for exploitation of the local solution space by incorporating local search methods into the genetic framework. We refer to [43] for an introduction to memetic algorithms. While several genetic and memetic flat (i.e., nonmultilevel) hypergraph partitioning algorithms have been proposed in the literature, none of them is considered to be truly competitive with stateoftheart tools [19]. Wellknown multilevel HGP software packages with certain distinguishing characteristics include PaToH [17] (originating from scientific computing), hMetis [35, 36] (originating from VLSI design), KaHyPar [2, 30, 52] (general purpose, level), Mondriaan [57] (sparse matrix partitioning), MLPart [4] (circuit partitioning), Zoltan [25], Parkway [55], and SHP [34] (distributed), UMPa [56] (directed hypergraph model, multiobjective), and kPaToH (multiple constraints, fixed vertices) [9].
Evolutionary Hypergraph Partitioning.
Saab and Rao [47] present an evolutionbased approach for solving the a way multiobjective, multiconstraint hypergraph partitioning problem. Since the algorithm only works with one individual, it does not use any recombination operators. Instead, the solution initially generated via bin packing is evolved using a randomized algorithm that moves vertices to different blocks if their gain is greater than some random value. Hulin [33] provides a GA that uses a coding scheme specifically tailored to circuit bipartitioning along with crossover and mutation operations that respect the coding. Bui and Moon [15] present a steadystate MA for ratio cut bipartitioning of hypergraphs, which uses a weak variation of the FM algorithm [28] as local search engine. To improve the performance of the crossover operation, a preprocessing step reindexes the vertices by the visiting order of a weighted depth first search on the cliquerepresentation [32] of the hypergraph. Areibi [6] present a memetic algorithm that combines a GA with a modified version of Sanchis’ way FM algorithm [48]. Areibi and Yang [7] enhance the MA presented in [6] with a preprocessing step that clusters and contracts vertices to reduce the complexity of the hypergraphs. Furthermore, the initial population contains both random as well as good solutions generated using the GRASP heuristic [27]. Armstrong et al. [8] propose a way MA that performs crossover, mutation and local search on multiple individuals in parallel. The traditional FM algorithm [28] and Sanchis’ way FM version [48] are used for local search. Kim et al. [38] present a steadystate MA for hypergraph bipartitioning, which uses a modified FM algorithm that works with lockgains [39]. Note that none of these algorithms makes use of the multilevel paradigm.
Evolutionary Graph Partitioning.
We refer to the survey of Kim et al. [37] for a general overview and more material on genetic approaches for graph partitioning. Soper et al. [53] provide the first algorithm that combined an evolutionary search algorithm within a multilevel graph partitioner. Here, crossover and mutation operators compute edge biases based on the input individuals. A similar approach based on perturbations of edge weights has been used by Delling et al. [23]. Benlic et al. [13] provide a multilevel memetic algorithm for balanced graph partitioning. PROBE [18] is a metaheuristic which can be viewed as a genetic algorithm without selection. It outperforms other metaheuristics, but it is restricted to the case and . KaHIP [51] contains KaFFPaE [50], which has a general recombine operator framework based on a multilevel algorithm.
2.2 way Hypergraph Partitioning using KaHyPar
Since our memetic algorithm builds on top of the KaHyPar framework, we briefly review its core components. While traditional multilevel HGP algorithms contract matchings or clusterings and therefore work with a coarsening hierarchy of levels, KaHyPar instantiates the multilevel paradigm in the extreme level version, removing only a single vertex between two levels. Vertex pairs to be contracted are determined using the heavyedge rating function , where . The coarsening process stops as soon as the number of vertices drops below a certain threshold or no more contractions are possible. The framework currently contains two coarsening algorithms. The first algorithm [52] contracts vertices in decreasing rating score order using a priority queue to store and update the ratings. The second algorithm [2] immediately contracts each vertex with its highestrated neighbor in random order. After coarsening, a portfolio of simple algorithms is used to create an initial partition of the coarsest hypergraph. During uncoarsening, strong localized local search heuristics based on the FM algorithm [28, 48] are used to refine the solution by moving vertices to other blocks in the order of improvements in the optimization objective. The framework provides a recursive bisection algorithm to optimize the cutnet metric (KaHyParR [52]) as well as a direct way algorithm to optimize the metric (KaHyParK [2]). Recently, Heuer and Schlag [30] integrated an improved coarsening scheme into KaHyParK that incorporates global information about the structure of the hypergraph into the coarsening process. It uses community detection in a preprocessing step and prevents intercommunity contractions during coarsening. This version is referred to as KaHyParCA. Unless mentioned otherwise, we use the default configurations provided by the authors^{1}^{1}1https://github.com/SebastianSchlag/kahypar/tree/master/config.
3 Memetic Multilevel Hypergraph Partitioning
We now explain the components of our memetic multilevel hypergraph partitioning algorithm. Given a hypergraph and a time limit , the algorithm starts by creating an initial population of individuals, which in our case correspond to balanced way partitions of . The population size is determined dynamically by first measuring the time spend to create one individual. Then is chosen such that the time to create individuals is a certain percentage of the total running time : , where is a tuning parameter. The lower bound on the population size is chosen to ensure a certain minimum of diversity, while the upper bound is used to ensure convergence. In contrast to previous approaches [6, 8, 15, 33, 38] the population is not filled with randomly generated individuals, but highquality solutions computed by KaHyParCA.
To judge the fitness of an individual we use the connectivity of its partition . The initial population is evolved over several generational cycles using the steadystate paradigm [22], i.e., we generate only one offspring per generation. Our twopoint and multipoint recombination operators described in Section 3.1 improve the average quality of the population by effectively combining different solutions to the HGP problem. In order to sufficiently explore the global search space and to prevent premature convergence, it is important to keep the population diverse [10]. This becomes even more relevant in our case, since with KaHyParCA we use powerful heuristics to exploit the local solution space. Previous work on evolutionary algorithms for HGP [7, 8, 15, 33, 38] used simple mutations that change the block of each vertex uniformly at random with a small probability. In contrast to these simple, problem agnostic operators, we propose mutation operators based on Vcycles [59] that exploit knowledge of the problem domain and create offspring solutions in the vicinity of the current population. Furthermore in Section 3.3 we propose a replacement strategy which considers fitness and similarity to determine the individual to be evicted from the population.
3.1 Recombination Operators
The evolutionary algorithms for HGP presented in Section 2.1 use simple multipoint crossover operators which split the parent partitions into several parts and then combine these parts to form new offspring (see Figure 1 (a)). Since these operators do not take the structure of the hypergraph into account, offspring solutions may have considerably worse fitness than their parents. By generalizing the recombine operator framework presented in [50] from graphs to hypergraphs, our twopoint recombine operators described in this section assure that the fitness of the offspring is at least as good as the best of both parents. The edge frequency based multipoint recombination operator described afterwards gives up this property, but still generates good offspring.
TwoPoint Recombine.
The operator starts with selecting parents for recombination using binary tournament selection (without replacement) [14]. Two individuals and are chosen uniformly at random from and the individual with better fitness (i.e., lower objective) becomes the first parent . This process is then repeated to determine the second parent . A tournament size of two is chosen to keep the selection pressure low and to avoid premature convergence, since all our individuals already constitute highquality solutions. Both individuals/partitions are then used as input of a modified multilevel partitioning scheme as follows: During coarsening, two vertices and are only allowed to be contracted if both parents agree on the block assignment of both vertices, i.e., if . This is a generalization from multilevel evolutionary GP, i.e. [50], where edges running between two blocks are not eligible for contraction and therefore remain in the graph. In other words, our generalization allows two vertices of the same cut net to be contracted as long as the input individuals agree that they belong to the same block. For HGP, this restriction ensures that cut nets remain in the coarsened hypergraph and maintain their connectivity regarding both partitions. This modification is important for our optimization objective, because it allows us to use the partition of the better parent as initial partition of the offspring. Since we can skip the initial partitioning phase and therefore do not need a sufficiently large number of vertices in the coarsest hypergraph to compute a good initial partition [36], we alter the stopping criterion of the coarsening phase such that it stops when no more contractions are possible. The high quality solution of the coarsest hypergraph contains two different classes of vertices: Those for which both parent partitions agree on a block assignment and those for which they don’t (see Figure 1 (b) for an example). During the uncoarsening phase, local search algorithms can then use this initial partitioning to (i) exchange good parts of the solution on the coarse levels by moving few vertices and (ii) to find the best block assignment for those vertices, for which the parent partitions disagreed. Since KaHyPar’s refinement algorithms guarantee nondecreasing solution quality, the fitness of offspring solutions generated using this kind of recombination is always at least as good as the better of both parents.
EdgeFrequency MultiRecombine.
The operator described previously is restricted to recombine partitions to improved offspring of nondecreasing quality. Sanders and Schulz [50] specifically restrict their operators to this case, arguing that in the course of the algorithm a series of twopoint recombine operations to some extend emulates a multipoint recombine. We here present a reasonable multipoint recombine operation to partially evaluate this hypothesis in our experimental evaluation. Our recombine operator uses the concept of (hyper)edge frequency [60] to pass information about the cut nets of the best individuals in the population on to new offspring. The frequency of a net hereby refers to the number of times it appears in the cut in the best solutions: . We use , which is a common value in evolutionary algorithms [23]. Our multirecombine operator then uses this information to create a new individual in the following way. The coarsening algorithm is modified to prefer to contract vertex pairs which share a large number of small, lowfrequency nets. This is achieved by replacing the standard heavyedge rating function of KaHyPar with the rating function [60] shown in Eq. 1:
(1) 
This rating function disfavors the contraction of vertex pairs incident to cut nets with high frequency, because these nets are likely to appear in the cut of high quality solutions. The tuning parameter is used as a damping factor. After coarsening stops, we compute an initial partition of the coarsest hypergraph using KaHyPar’s initial partitioning algorithms and refine it during the uncoarsening and local search phase.
3.2 Mutation Operations and Diversification
We define two mutation operators based on Vcycles. All operators are applied to a random individual of the current population. The main idea of Vcycle based mutation operators is to reuse an already computed partition as input for the multilevel approach and to iterate coarsening and local search phases several times using different seeds for randomization. This approach has been applied successfully in evolutionary GP [50], therefore we also adopt it for HGP. Similar to the recombine operator described in Section 3.1, the quality of the solution is maintained by only contracting vertex pairs belonging to the same block (). By distinguishing two possibilities for initial partitioning, we define two different mutation operators: The first one uses the current partition of the individual as initial partition of the coarsest hypergraph and guarantees nondecreasing solution quality. The second one employs KaHyPar’s portfolio of initial partitioning algorithms to compute a new solution for the coarsest hypergraph. During uncoarsening, local search algorithms improve the solution quality and thereby further mutate the individual. Since the second operator computes a new initial partition which might be different from the original partition of , the fitness of offspring generated by this operator can be worse than the fitness of .
3.3 Replacement Strategy
All recombination and mutation operators create one new offspring . In order to keep the population diverse, we evict the individual most similar to the offspring among all individuals whose fitness is equal to or worse than . Previous work on bipartitioning [15, 38] used the Hamming distance as a metric to measure the similarity between partitions. We propose a more sophisticated similarity measure that takes into account the connectivity of each cut net . For each individual, we compute the multiset , where is the multiplicity (i.e. number of occurrences) of . Thus each cut net is represented times in . The difference of two individuals and is the computed as , where is the symmetric difference.
4 Experimental Evaluation
System and Methodology.
We implemented the memetic algorithm described in the previous section using the latest version of the KaHyPar framework. The code is written in C++ and compiled using g++5.2 with flags O3 mtune=native march=native. We refer to the algorithm presented in this paper as EvoHGP. All experiments comparing EvoHGP with competing algorithms are performed on a cluster with nodes, where each node has two Intel Xeon E52670 OctaCore (Sandy Bridge) processors clocked at GHz, GB main memory, MB L3 and 8x256 KB L2Cache and runs RHEL 7.4. To analyze the influence of EvoHGP’s algorithmic components in Section 4.1, we use an Ubuntu 14.04.5 machine consisting with four Intel Xeon E54640 OctaCore processors clocked at GHz with GB main memory, MB L3 and 8x256 KB L2Cache.
We compare EvoHGP with two different configurations of KaHyParCA [30]. In any case, the objective of the partitioner is the connectivity metric. KaHyParCA was chosen for comparison, because it was shown to provide the best solution quality across a broad range of instances [30], performing better than stateoftheart tools such as the direct way and the recursive bisection version of hMetis 2.0 (p1) [35, 36] and PaToH 3.2 [17]. The first configuration corresponds to repeated runs of KaHyParCA using different random seeds for each run. Since it is known that global search strategies are more effective than plain restarts [49], we augment KaHyParCA with Vcycles (in a similar fashion as the first mutation operator) using a maximum number of Vcycle iterations per partitioner call. This new enhanced version of KaHyParCA constitutes the second configuration and is referred to as KaHyParCAV.
All three algorithms get eight hours time per test instance to compute a solution and we perform three repetitions with different seeds for each test instance and algorithm. Due to the large amount of computing time necessary to perform these experiments, we always partition instances in parallel on a single node. We use the arithmetic mean when averaging over solutions of the same instance and the geometric mean when averaging over different instances in order to give every instance a comparable influence on the final result. In order to compare EvoHGP with the different versions of KaHyParCA, we present two kinds of plots: Convergence plots [50] show the evolution of solution quality over time normalized by instance size, while performance plots [52] are used to compare the best solutions of all algorithms on a perinstance basis.
Convergence Plots.
We start by explaining how to compute the data for a single instance , i.e., a way partition of a hypergraph . Whenever an algorithm computes a partition that improves the solution quality, it reports a pair (, ), where the timestamp is the currently elapsed time. For repetitions with different seeds , these sequences of pairs are merged into one sequence of triples , which is sorted by the timestamp . Since we are interested in the evolution of the solution quality, we compute the sequence representing eventbased average values. We start by computing the average connectivity and the average time using the first pair (, ) of all sequences and insert into . We then sweep through the remaining entries of . Each entry corresponds to a partition computed at timestamp using seed that improved the solution quality to . For each entry we therefore replace the old connectivity value of seed that took part in the computation of with the new value , recompute and insert a new pair into . therefore represents the evolution of the average solution quality for instance over time. In a final step, we create the normalized sequence , where each entry in is replaced by where and is the average time that KaHyParCA needs to compute a way partition of . Average values over multiple instances are then obtained as follows: All sequences of pairs are merged into a sequence of triples , which is then sorted by . The final sequence presenting eventbased geometric averages values is then computed as follows: We start by computing the average normalized time and the geometric mean connectivity over all instances using the first value of all and insert into . We then sweep through the remaining entries of . For each entry , we replace the old connectivity value of that took part in the computation of with the new value , recompute and insert into . The sequence therefore represents the evolution of the solution quality averaged over all instances and repetitions.
Performance Plots.
These plots relate the smallest minimum connectivity of all algorithms to the corresponding connectivity produced by each algorithm on a perinstance basis. For each algorithm, these ratios are sorted in increasing order. The plots use a cube root scale for both axes to reduce right skewness [20] and show on the yaxis to highlight the instances were each partitioner performs badly. A point close to one indicates that the partition produced by the corresponding algorithm was considerably worse than the partition produced by the best algorithm. A value of zero therefore indicates that the corresponding algorithm produced the best solution. Thus an algorithm is considered to outperform another algorithm if its corresponding ratio values are below those of the other algorithm. In order to include instances with a cut of zero into the results, we set the corresponding cut values to one for ratio computations.
Benchmark Instances.
We evaluate our algorithm on a representative subset of hypergraphs from the benchmark set of Heuer and Schlag [30]^{2}^{2}2The benchmark set was downloaded from http://algo2.iti.kit.edu/schlag/sea2017/., which contains instances from four benchmark sets: the ISPD98 VLSI Circuit Benchmark Suite [3], the DAC 2012 RoutabilityDriven Placement Contest [58], the University of Florida Sparse Matrix Collection [21], and the international SAT Competition 2014 [12]. Sparse matrices are translated into hypergraphs using the rownet model [17], i.e., each row is treated as a net and each column as a vertex. SAT instances are converted to three different representations: For literal hypergraphs, each boolean literal is mapped to one vertex and each clause constitutes a net [44], while in the primal model each variable is represented by a vertex and each clause is represented by a net. In the dual model the opposite is the case [41]. The latter two models are more common in the SAT solving community than the literal model proposed in [44]. All hypergraphs have unit vertex and net weights. An overview of our benchmark sets is given in Tables 2 and 3 in Appendix 0.C. To compare EvoHGP with the best competing algorithms, all hypergraphs are partitioned into blocks with . For each hypergraph and each value of , a way partition is considered to be one test instance, resulting in a total of instances. In order to save running time, we choose a subset of 25 hypergraphs shown Table 2, , and to evaluate the impact of different algorithmic components of our algorithm (recombine and mutation operations) before we run our algorithms on the large benchmark set.
4.1 Influence of Algorithmic Components
All configurations determine their population size dynamically such that of the total time is spent to create the initial population. According to the results of Wichlund and Aas [60],
the damping factor used for edge frequency calculations is set to . We use a naming scheme to refer to different configurations of our algorithm. All configuration names start with EvoHGP followed by abbreviations for the added recombine and mutation operations (multiple abbreviations used to add multiple operations). Abbreviation refers to EvoHGP using twopoint recombine operations, refers to EvoHGP using multirecombine operations, and finally adds mutation operations with a mutation chance of . Whenever a mutation operation is performed, both operators have a 50 percent change of being chosen. Figure 2 compares different configurations of EvoHGP. Of all configurations, EvoHGP+, which relies only on multipoint recombine operations, performs worst. Comparing its performance with EvoHGP+ (which uses twopoint recombine operations), we can see that it is indeed beneficial to guarantee nondecreasing solution quality for combine operations. However, due to the fact that the strong multilevel local search engine KaHyParCA computes high quality solutions, we see that a significant amount of mutations is necessary to ensure diversity in the population. While EvoHGP++ ( mutation chance performed best for evolutionary graph partitioning in [50]) performs only marginally better than EvoHGP+, increasing the mutation rate to (EvoHGP++) improves the overall performance of the algorithm. Moreover, we see that combining twopoint recombine operators with edgefrequency based multicombines (EvoHGP++) also performs better than each of the recombine operators alone. This can be explained by the fact that multirecombines also act as mutation operator in that they don’t guarantee nondecreasing performance. Since EvoHGP++ and EvoHGP++ show the best convergence behavior, we restrict ourselves to these configurations for all remaining experiments performed in the paper.
4.2 Evaluation
We now switch to our large benchmark set to evaluate the performance of the different algorithms under consideration. Table 1 as well as Figure 3 and Figure 4 compare the performance of our memetic algorithms with repeated executions of KaHyParCA and KaHyParCAV. The remaining convergence and performance plots can be found in Appendix 0.A and Appendix 0.B. When looking at convergence plots, note that KaHyParCAV starts later than all other algorithms and has an initially better solution quality. This is due to the fact it uses up to Vcycles before reporting the first solution. The improvements of our memetic algorithms increase with increasing . This is expected as the search space of possible partitionings increases with the number of blocks. Looking at Table 1, we see that both algorithms on average outperform the best partitioner currently available (KaHyParCA), culminating in an improvement of for . EvoHGP++ is further able to improve upon the new Vcycling version KaHyParCAV for all values of and performs better on average than KaHyParCAV for . Our variant using the edgefrequency multirecombine operations does not achieve this: the convergence plots in Figure 3 indicate that EvoHGP++ can only improve upon KaHyParCA and KaHyParCAV for large values of . This can be explained by the fact that edge frequency becomes more meaningful in this case, because there is more variability in the cut nets of the best individuals.
Looking at the performance plot that compares the strongest nonevolutionary algorithm with the strongest memetic configuration in Figure 4, we see that EvoHGP++ performs significantly better than KaHyParCAV. It produces the best partitions for of the , while KaHyParCAV only computed the best partitions for 101 instances. Note that for some instances, both partitioners computed the same solution. Additional performance plots for each application area in Appendix 0.B show that this improvement in performance is consistent throughout the complete spectrum of instances.
This shows that even a large number of repeated executions helps only partially to explore the huge space of possible partitionings. By combining effective exploration (global search) with exploitation (in our case using powerful level HGP algorithms) our memetic algorithms can effectively help to break out of local minima and hence explore the global solution space more extensively.
5 Conclusion and Future Work
EvoHGP is the first multilevel memetic algorithm to tackle the balanced hypergraph partitioning problem. Key components of our contribution are new effective multilevel recombine and mutation operations that incorporate information about the best solutions in the coarsening process and provide a large amount of diversity. Experiments comparing EvoHGP with a Vcycling version of KaHyParCA indicate that our evolutionary algorithm computes by far the best solutions on almost all instances. This confirms our conjecture that previous attempts to solve the HGP problem using memetic algorithms failed to be competitive with stateoftheart tools because only flat partitioning algorithms where used to drive the exploitation phase. We therefore believe that EvoHGP is helpful in a wide area of application areas in which solution quality is of major importance. In the future, it would be interesting to apply EvoHGP in such application areas and to try other domain specific recombine operators that offer more specific knowledge of the application domain. In addition, it may be worth to investigate sharedmemory parallelization as in [8] or a distributed memory parallelization based on islands as in [50]. Lastly, we plan to integrate our algorithm in the KaHyPar framework.
References

[1]
S. N. Adya, M. C. Yildiz, I. L. Markov, P. G. Villarrubia, P. N. Parakh, and
P. H. Madden.
Benchmarking for largescale placement and beyond.
IEEE Transactions on ComputerAided Design of Integrated
Circuits and Systems, 23(4):472487, April 2004.
 [2] Y. Akhremtsev, T. Heuer, P. Sanders, and S. Schlag. Engineering a direct kway hypergraph partitioning algorithm. In 19th Workshop on Algorithm Engineering and Experiments, (ALENEX), pages 28–42, 2017.
 [3] C. J. Alpert. The ISPD98 Circuit Benchmark Suite. In Proceedings of the 1998 International Symposium on Physical Design, pages 80–85. ACM, 1998.
 [4] C. J. Alpert, J.H. Huang, and A. B. Kahng. Multilevel Circuit Partitioning. IEEE Transactions on ComputerAided Design of Integrated Circuits and Systems, 17(8):655–667, 1998.
 [5] C. J. Alpert and A. B. Kahng. Recent Directions in Netlist Partitioning: a Survey. Integration, the VLSI Journal, 19(1–2):1 – 81, 1995.
 [6] S. Areibi. An Integrated Genetic Algorithm With Dynamic Hill Climbing for VLSI Circuit Partitioning. In Genetic and Evolutionary Computation Conference (GECCO), pages 97–102, 2000.
 [7] S. Areibi and Z. Yang. Effective Memetic Algorithms for VLSI Design = Genetic Algorithms + Local Search + MultiLevel Clustering. Evolutionary Computation, 12(3):327–353, 2004.
 [8] E. Armstrong, G. W. Grewal, S. Areibi, and G. Darlington. An investigation of parallel memetic algorithms for VLSI circuit partitioning on multicore computers. In Proceedings of the 23rd Canadian Conference on Electrical and Computer Engineering, CCECE, pages 1–6. IEEE, 2010.
 [9] C. Aykanat, B. B. Cambazoglu, and B. Uçar. Multilevel Direct Kway Hypergraph Partitioning with Multiple Constraints and Fixed Vertices. Journal of Parallel and Distributed Computing, 68(5):609–625, 2008.
 [10] T. Bäck. Evolutionary algorithms in theory and practice : evolution strategies, evolutionary programming, genetic algorithms. PhD thesis, 1996.
 [11] D. A. Bader, H. Meyerhenke, P. Sanders, and D. Wagner, editors. Proc. Graph Partitioning and Graph Clustering  10th DIMACS Implementation Challenge Workshop, volume 588 of Contemporary Mathematics. AMS, 2013.
 [12] A. Belov, D. Diepold, M. Heule, and M. Järvisalo. The SAT Competition 2014. http://www.satcompetition.org/2014/, 2014.
 [13] U. Benlic and J. K. Hao. A Multilevel Memetic Approach for Improving Graph Partitions. IEEE Transactions on Evolutionary Computation, 15(5):624–642, 2011.
 [14] T. Blickle and L. Thiele. A Comparison of Selection Schemes used in Evolutionary Algorithms. Evolutionary Computation, 4(4):361–394, 1996.
 [15] T. N. Bui and B. R. Moon. A Fast and Stable Hybrid Genetic Algorithm for the RatioCut Partitioning Problem on Hypergraphs. In Proceedings of the 31st Conference on Design Automation, pages 664–669, 1994.
 [16] A. Buluç, H. Meyerhenke, I. Safro, P. Sanders, and C. Schulz. Recent Advances in Graph Partitioning. In Algorithm Engineering  Selected Results and Surveys, pages 117–158. Springer, 2016.
 [17] Ü. V. Catalyürek and C. Aykanat. HypergraphPartitioningBased Decomposition for Parallel SparseMatrix Vector Multiplication. IEEE Transactions on Parallel and Distributed Systems, 10(7):673–693, Jul 1999.
 [18] P. Chardaire, M. Barake, and G. P. McKeown. A PROBEBased Heuristic for Graph Partitioning. IEEE Transactions on Computers, 56(12):1707–1720, 2007.
 [19] J. Cohoon, J. Kairo, and J. Lienig. Evolutionary Algorithms for the Physical Design of VLSI Circuits, pages 683–711. Springer, 2003.
 [20] N. J. Cox. Stata tip 96: Cube roots. Stata Journal, 11(1):149–154(6), 2011. URL: http://www.statajournal.com/article.html?article=st0223.
 [21] T. A. Davis and Y. Hu. The University of Florida Sparse Matrix Collection. ACM Transactions on Mathematical Software, 38(1):1:1–1:25, 2011.
 [22] K. A. De Jong. Evolutionary computation  a unified approach. MIT Press, 2006.
 [23] D. Delling, A. V. Goldberg, I. Razenshteyn, and R. F. Werneck. Graph Partitioning with Natural Cuts. In Proceedings of the 25th International Parallel and Distributed Processing Symposium, pages 1135–1146, 2011.
 [24] M. Deveci, K. Kaya, B. Uçar, and Ü V. Çatalyürek. Hypergraph partitioning for multiple communication cost metrics: Model and methods. Journal of Parallel and Distributed Computing, 77:69–83, 2015.
 [25] K. D. Devine, E. G. Boman, R. T. Heaphy, R. H. Bisseling, and Ü. V. Catalyürek. Parallel Hypergraph Partitioning for Scientific Computing. In 20th International Conference on Parallel and Distributed Processing, IPDPS, pages 124–124. IEEE, 2006.
 [26] W.E. Donath. Logic partitioning. Physical Design Automation of VLSI Systems, pages 65–86, 1988.
 [27] T. A. Feo, M. G. C. Resende, and S. H. Smith. A Greedy Randomized Adaptive Search Procedure for Maximum Independent Set. Operations Research, 42(5):860–878, 1994.
 [28] C.M. Fiduccia and R.M. Mattheyses. A lineartime heuristic for improving network partitions. In 19th Conference on Design Automation, pages 175–181, June 1982.
 [29] David E. Goldberg. Genetic algorithms in search, optimization, and machine learning. AddisonWesley, 1989.
 [30] T. Heuer and S. Schlag. Improving Coarsening Schemes for Hypergraph Partitioning by Exploiting Community Structure. In 16th International Symposium on Experimental Algorithms, (SEA), page 21:1–21:19, 2017.
 [31] J. H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, 1975. second edition, 1992.
 [32] T. C. Hu and K. Moerder. Multiterminal Flows in a Hypergraph. In T.C. Hu and E.S. Kuh, editors, VLSI Circuit Layout: Theory and Design, chapter 3, pages 87–93. IEEE Press, 1985.
 [33] M. Hulin. Circuit partitioning with genetic algorithms using a coding scheme to preserve the structure of a circuit, pages 75–79. Springer, 1991.
 [34] I. Kabiljo, B. Karrer, M. Pundir, S. Pupyrev, A. Shalita, A. Presta, and Y. Akhremtsev. Social Hash Partitioner: A Scalable Distributed Hypergraph Partitioner. pages 1–23, 2017. arXiv:1707.06665.
 [35] G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar. Multilevel Hypergraph Partitioning: Applications in VLSI Domain. IEEE Transactions on Very Large Scale Integration VLSI Systems, 7(1):69–79, 1999.
 [36] G. Karypis and V. Kumar. Multilevel way Hypergraph Partitioning. In Proceedings of the 36th ACM/IEEE Design Automation Conference, pages 343–348. ACM, 1999.
 [37] J. Kim, I. Hwang, Y. H. Kim, and B. R. Moon. Genetic Approaches for Graph Partitioning: A Survey. In Proceedings of the 13th Annual Genetic and Evolutionary Computation Conference (GECCO), pages 473–480. ACM, 2011.
 [38] J.P. Kim, Y.H. Kim, and B.R. Moon. A Hybrid Genetic Approach for Circuit Bipartitioning, pages 1054–1064. Springer, 2004.
 [39] Y.H. Kim and B. R. Moon. LockGain Based Graph Partitioning. Journal of Heuristics, 10(1):37–57, 2004.
 [40] T. Lengauer. Combinatorial Algorithms for Integrated Circuit Layout. John Wiley & Sons, Inc., 1990.
 [41] Z. Mann and P. Papp. Formula partitioning revisited. In Daniel Le Berre, editor, POS14. Fifth Pragmatics of SAT workshop, volume 27 of EPiC Series in Computing, pages 41–56. EasyChair, 2014.
 [42] P. Moscato. On Evolution, Search, Optimization, Genetic Algorithms and Martial Arts: Towards Memetic Algorithms. Technical Report C3P Report 826, California Institute of Technology, 1989.
 [43] P. Moscato and C. Cotta. A Modern Introduction to Memetic Algorithms, pages 141–183. Springer US, 2010.
 [44] D. A. Papa and I. L. Markov. Hypergraph Partitioning and Clustering. In T. F. Gonzalez, editor, Handbook of Approximation Algorithms and Metaheuristics. Chapman and Hall/CRC, 2007.
 [45] N. J. Radcliffe and P. D. Surry. Formal Memetic Algorithms. In T. C. Fogarty, editor, Evolutionary Computing, AISB Workshop, volume 865 of Lecture Notes in Computer Science, pages 1–16. Springer, 1994.
 [46] K. Roberts and Preas B. MCNC. Technical report, Physical Design Workshop, 1987.
 [47] Y. Saab and V. B. Rao. An EvolutionBased Approach to Partitioning ASIC Systems. In D. E. Thomas, editor, Proceedings of the 26th ACM/IEEE Design Automation Conference, pages 767–770. ACM Press, 1989.
 [48] L. A. Sanchis. Multipleway Network Partitioning. IEEE Transactions on Computers, 38(1):62–81, 1989.
 [49] P. Sanders and C. Schulz. Engineering Multilevel Graph Partitioning Algorithms. In Algorithms – ESA, volume 6942 of LNCS, pages 469–480. Springer, 2011.
 [50] P. Sanders and C. Schulz. Distributed Evolutionary Graph Partitioning. In 12th Workshop on Algorithm Engineering and Experimentation (ALENEX), pages 16–29, 2012.
 [51] P. Sanders and C. Schulz. Think Locally, Act Globally: Highly Balanced Graph Partitioning. In 12th International Symposium on Experimental Algorithms (SEA), volume 7933 of LNCS, pages 164–175. Springer, 2013.
 [52] S. Schlag, V. Henne, T. Heuer, H. Meyerhenke, P. Sanders, and C. Schulz. way Hypergraph Partitioning via Level Recursive Bisection. In 18th Workshop on Algorithm Engineering and Experiments (ALENEX), pages 53–67, 2016.
 [53] A. J. Soper, C. Walshaw, and M. Cross. A Combined Evolutionary Search and Multilevel Optimisation Approach to GraphPartitioning. Journal of Global Optimization, 29(2):225–241, 2004.
 [54] A. Trifunovic. Parallel Algorithms for Hypergraph Partitioning. PhD thesis, University of London, 2006.
 [55] A. Trifunovic and W. J. Knottenbelt. Parallel Multilevel Algorithms for Hypergraph Partitioning. Journal of Parallel and Distributed Computing, 68(5):563 – 581, 2008.
 [56] Ü. V. Çatalyürek and M. Deveci and K. Kaya and B. Uçar. UMPa: A multiobjective, multilevel partitioner for communication minimization. In Bader et al. [11], pages 53–66.
 [57] B. Vastenhouw and R. H. Bisseling. A TwoDimensional Data Distribution Method for Parallel Sparse MatrixVector Multiplication. SIAM Review, 47(1):67–95, 2005.
 [58] N. Viswanathan, C. Alpert, C. Sze, Z. Li, and Y. Wei. The DAC 2012 Routabilitydriven Placement Contest and Benchmark Suite. In 49th Annual Design Automation Conference, DAC, pages 774–782. ACM, 2012.
 [59] C. Walshaw. Multilevel Refinement for Combinatorial Optimisation Problems. Annals of Operations Research, 131(14):325–372, 2004.
 [60] S. Wichlund and E. Aas. On Multilevel Circuit Partitioning. In Proceedings of theIEEE/ACM International Conference on ComputerAided Design (ICCAD), pages 505–511, 1998.
Appendix 0.A Additional Convergence Plots
Appendix 0.B Additional Performance Plots
Appendix 0.C Benchmark Hypergraphs
hypergraph  hypergraph  

ISPD98  SAT14Primal  
ibm06  32498  34826  128182  6s153  85646  245440  572692 
ibm07  45926  48117  175639  aaai10planningipc5  53919  308235  690466 
ibm08  51309  50513  204890  atco_enc2_opt1_05_21  56533  526872  2097393 
ibm09  53395  60902  222088  dated1011u  141860  629461  1429872 
ibm10  69429  75196  297567  hwmcc10timeframe  163622  488120  1138944 
SAT14Dual  SPM  
6s133  140968  48215  328924  laminar_duct3D  67173  67173  3833077 
6s153  245440  85646  572692  mixtank_new  29957  29957  1995041 
6s9  100384  34317  234228  mult_dcop_01  25187  25187  193276 
dated1011u  629461  141860  1429872  RFdevice  74104  74104  365580 
dated1017u  1070757  229544  2471122  vibrobox  12328  12328  342828 
SAT14Literal  
6s133  96430  140968  328924  
6s153  171292  245440  572692  
aaai10planningipc5  107838  308235  690466  
atco_enc2_opt1_05_21  112732  526872  2097393  
dated1011u  283720  629461  1429872 
hypergraph  hypergraph  

DAC2012  SAT14Primal  
superblue19  522 482  511 685  1 713 796  AProVE0727  7 729  29 194  77 124 
superblue13  630 802  619 815  2 048 903  countbitssrl032  18 607  55 724  130 020 
superblue14  698 339  697 458  2 280 417  6s184  33 365  97 516  227 536 
superblue3  917 944  898 001  3 109 446  6s9  34 317  100 384  234 228 
ISPD98  6s133  48 215  140 968  328 924  
ibm09  53 395  60 902  222 088  6s153  85 646  245 440  572 692 
ibm11  70 558  81 454  280 786  atco_enc1_opt2_10_16  9 643  152 744  641 139 
ibm10  69 429  75 196  297 567  aaai10planningipc5  53 919  308 235  690 466 
ibm12  71 076  77 240  317 760  hwmcc10timeframe  163 622  488 120  1 138 944 
ibm13  84 199  99 666  357 075  itox_vc1130  152 256  441 729  1 143 974 
ibm14  147 605  152 772  546 816  dated1011u  141 860  629 461  1 429 872 
ibm15  161 570  186 608  715 823  atco_enc1_opt2_05_4  14 636  386 163  1 652 800 
ibm16  183 484  190 048  778 823  manolpipec8nidw  269 048  799 867  1 866 355 
ibm18  210 613  201 920  819 697  atco_enc2_opt1_05_21  56 533  526 872  2 097 393 
ibm17  185 495  189 581  860 036  dated1017u  229 544  1 070 757  2 471 122 
SAT14Dual  ACG205p0  324 716  1 390 931  3 269 132  
AProVE0727  29 194  7 729  77 124  ACG205p1  331 196  1 416 850  3 333 531 
countbitssrl032  55 724  18 607  130 020  SPM  
6s184  97 516  33 365  227 536  powersim  15 838  15 838  67 562 
6s9  100 384  34 317  234 228  ascaida  31 379  26 475  106 762 
6s133  140 968  48 215  328 924  hvdc1  24 842  24 842  159 981 
6s153  245 440  85 646  572 692  Ill_Stokes  20 896  20 896  191 368 
atco_enc1_opt2_10_16  152 744  9 643  641 139  mult_dcop_01  25 187  25 187  193 276 
aaai10planningipc5  308 235  53 919  690 466  lp_pds_20  108 175  33 798  232 647 
hwmcc10timeframe  488 120  163 622  1 138 944  lhr14  14 270  14 270  307 858 
itox_vc1130  441 729  152 256  1 143 974  c61  43 618  43 618  310 016 
dated1011u  629 461  141 860  1 429 872  ckt11752_dc_1  49 702  49 702  333 029 
manolpipeg10bid_i  792 175  266 405  1 848 407  RFdevice  74 104  74 104  365 580 
manolpipec8nidw  799 867  269 048  1 866 355  light_in_tissue  29 282  29 282  406 084 
atco_enc2_opt1_05_21  526 872  56 533  2 097 393  Andrews  60 000  60 000  760 154 
dated1017u  1 070 757  229 544  2 471 122  2D_54019_highK  54 019  54 019  996 414 
ACG205p0  1 390 931  324 716  3 269 132  case39  40 216  40 216  1 042 160 
ACG205p1  1 416 850  331 196  3 333 531  denormal  89 400  89 400  1 156 224 
SAT14Literal  2cubes_sphere  101492  101492  1647264  
AProVE0727  15 458  29 194  77 124  av41092  41 092  41 092  1 683 902 
countbitssrl032  37 213  55 724  130 020  Lin  256 000  256 000  1 766 400 
6s184  66 730  97 516  227 536  cfd1  70 656  70 656  1 828 364 
6s9  68 634  100 384  234 228  mc2depi  525 825  525 825  2 100 225 
6s133  96 430  140 968  328 924  poisson3Db  85 623  85 623  2 374 949 
6s153  171 292  245 440  572 692  rgg_n_2_18_s0  262 144  262 141  3 094 566 
atco_enc1_opt2_10_16  18 930  152 744  641 139  cnr2000  325 557  247 501  3 216 152 
aaai10planningipc5  107 838  308 235  690 466  
hwmcc10timeframe  327 243  488 120  1 138 944  
itox_vc1130  294 326  441 729  1 143 974  
dated1011u  283 720  629 461  1 429 872  
atco_enc1_opt2_05_4  28 738  386 163  1 652 800  
manolpipeg10bid_i  532 810  792 175  1 848 407  
manolpipec8nidw  538 096  799 867  1 866 355  
atco_enc2_opt1_05_21  112 732  526 872  2 097 393  
dated1017u  459 088  1 070 757  2 471 122  
ACG205p0  649 432  1 390 931  3 269 132  
ACG205p1  662 392  1 416 850  3 333 531 