This paper focuses on automated synthesis of divide-and-conquer parallelism, which is a common parallel programming skeleton supported by many cross-platform multithreaded libraries. The challenges of producing (manually or automatically) a correct divide-and-conquer parallel program from a given sequential code are two-fold: (1) assuming that individual worker threads execute a code identical to the sequential code, the programmer has to provide the extra code for dividing the tasks and combining the computation results, and (2) sometimes, the sequential code may not be usable as is, and may need to be modified by the programmer. We address both challenges in this paper. We present an automated synthesis technique for the case where no modifications to the sequential code are required, and we propose an algorithm for modifying the sequential code to make it suitable for parallelization when some modification is necessary. The paper presents theoretical results for when this modification is efficiently possible, and experimental evaluation of the technique and the quality of the produced parallel programs.
(An anonymous extended version including all proofs has been submitted as "Supplementary Material.)