Cristin-resultat-ID: 1875023
Sist endret: 20. januar 2021, 01:59
Resultat
Mastergradsoppgave
2020

Multi-GPU Maximum Flow Using the Groute Framework

Bidragsytere:
  • Vegard Karstang

Utgiver/serie

Utgiver

NTNU Open

Om resultatet

Mastergradsoppgave
Under utgivelse/in press
Publiseringsår: 2020
Antall sider: 29

Beskrivelse Beskrivelse

Tittel

Multi-GPU Maximum Flow Using the Groute Framework

Sammendrag

The maximum flow problem is a core graph problem and algorithms for solving this problem has evolved since Ford and Fulkerson published their solution in 1956. It has applications in several graph flow problems, graph cuts, scheduling, and many more. These applications led to the development of preflowpush algorithms. These have several variations with different complexities, and are amenable to parallelization. Early efforts in parallelizing preflow-push algorithms needed locks to prevent errors, while recent research has led to lock-free algorithms that are much faster. Multi-GPU systems are becoming more common and powerful. This has allowed the research into the usage and programming of such systems to increase. We introduce what is to our knowledge the first implementation of a maximum flow algorithm targeting multiple GPUs, and identify and discuss the suitability of implementing maximum flow on multiple GPUs. Our implementation is based on the asynchronous multithreaded algorithm by Hong and He, and we use the asynchronous multi-GPU framework Groute by Ben-Nun, Sutton, Pai, and Pingali as a base. We use the distributed worklist supplied by Groute to manage our list of active items, and CUDA managed memory for host to device, and device to device communication. The implementation is benchmarked on an NVIDIA DGX-2 machine using genrmf graphs which are commonly used for benchmarking max-flow algorithms. Our multi-GPU version may solve bigger problems compared to single GPU implementations due to the increased memory capacity. Even with this increase in memory capacity and the large amount of available threads it does not beat a simple single threaded CPU implementation. The implementation may also produce wrong results. We propose ways to combat both the run time issues, and the correctness issues. We nevertheless managed to develop and implement a multi-GPU max flow algorithm which ran correctly on most of the benchmarks we tested. Our results illustrate how challenging implementing irregular algorithms efficiently on multiple GPUs.

Bidragsytere

Anne Cathrine Elster

Bidragsyterens navn vises på dette resultatet som Anne C. Elster
  • Tilknyttet:
    Veileder
    ved Institutt for datateknologi og informatikk ved Norges teknisk-naturvitenskapelige universitet

Vegard Karstang

  • Tilknyttet:
    Forfatter
1 - 2 av 2