Historikk

Cristin-resultat-ID: 915144

Sist endret: 29. november 2013, 09:03

NVI-rapporteringsår: 2012

Resultat

Vitenskapelig artikkel

2013

Efficient local search on the GPU-Investigations on the vehicle routing problem

Christian Ferdinand Schulz

Tidsskrift Tidsskrift

Journal of Parallel and Distributed Computing

ISSN 0743-7315
e-ISSN 1096-0848

NVI-nivå 2

Finn i kanalregisteret

Om resultatet Om resultatet

Vitenskapelig artikkel

Publiseringsår: 2013

Publisert online: 2012

Trykket: 2013

Volum: 73

Hefte: 1

Sider: 14 - 31

Lenker Lenker

original online (doi)

https://doi.org/10.1016/j.jpdc.2012.02.020

Importkilder Importkilder

Scopus-ID: 2-s2.0-84869496633

Isi-ID: 000311921300003

Klassifisering Klassifisering

Vitenskapsdisipliner

Informasjons- og kommunikasjonsvitenskap

Emneord

Optimering • GPGPU

Beskrivelse Beskrivelse

Engelsk

Tittel

Efficient local search on the GPU-Investigations on the vehicle routing problem

Sammendrag

We study how to implement local search efficiently on data parallel accelerators such as Graphics Processing Units. The Distance-constrained Capacitated Vehicle Routing Problem, a computationally very hard discrete optimization problem with high industrial relevance, is the selected vehicle for our investigations. More precisely, we investigate local search with the Best Improving strategy for the 2-opt and 3-opt operators on a giant tour representation. Resource extension functions are used for constant time move evaluation. Using CUDA, a basic implementation called The Benchmark Version has been developed and deployed on a Fermi architecture Graphics Processing Unit. Both neighborhood setup and evaluation are performed entirely on the device. The Benchmark Version is the initial step of an incremental improvement process where a number of important implementation aspects have been systematically studied. Ten well-known test instances from the literature are used in computational experiments, and profiling tools are used to identify bottlenecks. In the final version, the device is fully saturated, given a large enough problem instance. A speedup of almost an order of magnitude relative to The Benchmark Version is observed. We conclude that, with some effort, local search may be implemented very efficiently on Graphics Processing Units. Our experiments show that a maximum efficiency, however, requires a neighborhood cardinality of at least one million. Full exploration of a billion neighbors takes a few seconds and may be deemed too expensive with the current technology. Reduced neighborhoods through filtering is an obvious remedy. Experiments on simple models of neighborhood filtering indicate, however, that the speedup effect is limited on data parallel accelerators. We believe these insights are valuable in the design of new metaheuristics that fully utilize modern, heterogeneous processors.

Vis fullstendig beskrivelse

Bidragsytere Bidragsytere

Christian Ferdinand Schulz

Forfatter
ved Mathematics and Cybernetics ved SINTEF AS

1 - 1 av 1