Enhanced Logic Rewriting with Intra- and Inter-graph Parallelism
Abstract
Logic rewriting, a critical and time-consuming task in synthesis, is widely employed in the integrated circuit (IC) design flow because it offers strong optimization capability and is independent of technology. However, existing solutions either employ locks to guarantee safe inter-node parallelism (at the cost of limiting parallelism), or parallelize the sub-procedures of rewriting for individual nodes without adequately considering logic sharing (at the cost of decreasing quality). In this paper, we present DACPara 2.0, a fast, enhanced, and easily extensible parallel framework for high-quality logic rewriting. Our key insight is that large-scale circuits with different characteristics should be processed with different parallel mechanisms, which enables significant improvements in parallelism and scalability. In this spirit, for circuits with complex logic, we propose a divide-and-conquer parallel approach to exploit intra-graph parallelism (i.e., parallelism among nodes and their sub-procedures within the same And-Inverter Graph (AIG)), which separates three substages and redesigns them using dynamic global information. In this process, the nodes in an AIG are processed bottom-up in a level-wise parallel fashion. For heavily pipelined industrial designs, where each pipeline stage is represented as different copies of the same design, we propose a conflict-free sub-AIG parallel approach featuring a fanout-based partitioning strategy to exploit inter-graph parallelism (i.e., parallelism between independent sub-AIGs). Experiments show that DACPara 2.0, using 40 physical CPU cores, achieves 52.86×/42.25× speedup in rewriting/total runtime compared to logic rewriting in ABC, and 3.27×/2.61× speedup over the state-of-the-art CPU parallel method, with comparable quality of results. In addition, for all large-scale circuits with complex logic, DACPara 2.0 achieves a 0.4% improvement in quality compared to the state-of-the-art GPU-accelerated method.
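To make the bottom-up, level-wise schedule mentioned above concrete, the following is a minimal sketch under stated assumptions and not the DACPara 2.0 implementation. It assumes a hypothetical AIG encoding in which a dictionary maps each AND node to its two fanins, nodes absent from the dictionary are primary inputs, and complemented edges are ignored. The helper names (`compute_levels`, `level_batches`, `process_level_wise`) and the `visit` callback are illustrative only. The sketch shows just the ordering: nodes on the same level have no fanin dependency on one another, so each level can be handed to a pool of workers before moving to the next.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def compute_levels(fanins):
    """Return {node: level}; primary inputs are level 0, AND nodes 1 + max fanin level."""
    levels = {}
    for root in fanins:
        stack = [root]  # iterative DFS avoids recursion limits on deep graphs
        while stack:
            node = stack[-1]
            if node in levels:
                stack.pop()
            elif node not in fanins:  # primary input (assumed encoding)
                levels[node] = 0
                stack.pop()
            else:
                pending = [f for f in fanins[node] if f not in levels]
                if pending:
                    stack.extend(pending)
                else:
                    levels[node] = 1 + max(levels[f] for f in fanins[node])
                    stack.pop()
    return levels


def level_batches(fanins):
    """Group AND nodes by level, lowest level first."""
    by_level = defaultdict(list)
    for node, lvl in compute_levels(fanins).items():
        if node in fanins:
            by_level[lvl].append(node)
    return [sorted(by_level[lvl]) for lvl in sorted(by_level)]


def process_level_wise(fanins, visit, workers=4):
    """Apply `visit` to every AND node, one level at a time, in parallel within a level."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in level_batches(fanins):
            # All fanins of this batch were handled in earlier levels.
            results.extend(pool.map(visit, batch))
    return results


# Hypothetical four-node AIG over primary inputs a, b, c.
example_fanins = {
    "n1": ("a", "b"),
    "n2": ("b", "c"),
    "n3": ("n1", "n2"),
    "n4": ("n3", "c"),
}
example_batches = level_batches(example_fanins)
example_order = process_level_wise(example_fanins, lambda node: node)
```

In this example `n1` and `n2` share a level and could be visited concurrently, while `n3` and `n4` must wait for the levels below them. An actual rewriting pass also modifies the graph as it proceeds, which is why the approach described above relies on dynamic global information; the static levels computed here do not capture that.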