Research Papers

Speeding Up Particle Trajectory Simulations Under Moving Force Fields using Graphic Processing Units

Robert Patro

Department of Computer Science, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742; rob@cs.umd.edu

John P. Dickerson

Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742; johnd@umiacs.umd.edu

Sujal Bista

Department of Computer Science, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742; sujal@cs.umd.edu

Satyandra K. Gupta

Department of Mechanical Engineering, Institute for Systems Research, University of Maryland, College Park, MD 20742; skgupta@umd.edu

Amitabh Varshney

Department of Computer Science, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742; varshney@cs.umd.edu

J. Comput. Inf. Sci. Eng 12(2), 021006 (May 22, 2012) (8 pages) doi:10.1115/1.4005718 History: Received June 14, 2011; Revised October 10, 2011; Published May 21, 2012; Online May 22, 2012

In this paper, we introduce a graphic processing unit (GPU)-based framework for simulating particle trajectories under both static and dynamic force fields. By exploiting the highly parallel nature of the problem and making efficient use of the available hardware, our simulator exhibits a significant speedup over its CPU-based analog. We apply our framework to a specific experimental simulation: the computation of trapping probabilities associated with micron-sized silica beads in optical trapping workbenches. When evaluating a large number of trajectories (4096), we observe a speedup of approximately 356 times for the GPU-based simulator over its CPU-based counterpart.

Copyright © 2012 by American Society of Mechanical Engineers

Figures

Figure 1

The CUDA architecture provides a logical hierarchy of parallelism that maps well to the hardware. The computational kernel is run in parallel by a large number of threads, which are grouped into one-, two-, or three-dimensional thread blocks. Threads within a block may communicate using shared memory or coordinate execution using synchronization primitives. The blocks are likewise grouped into one-, two-, or three-dimensional grids. Each thread is able to access its own grid, block, and thread identifiers.
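The indexing scheme described in this caption can be illustrated with a small host-side emulation. The sketch below is not the paper's code: it assumes a one-dimensional grid of one-dimensional blocks, and the names global_thread_index, blocks_needed, and launch_1d are hypothetical helpers introduced only for illustration. It shows how a thread could combine its block and thread identifiers into one flat index, for example to select which trajectory it simulates.

```python
def global_thread_index(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Flat index of a thread in a 1-D grid of 1-D blocks (assumed layout)."""
    return block_idx * block_dim + thread_idx


def blocks_needed(num_items: int, block_dim: int) -> int:
    """Smallest number of blocks whose threads cover num_items work items."""
    return (num_items + block_dim - 1) // block_dim


def launch_1d(num_items: int, block_dim: int, kernel) -> int:
    """Sequentially emulate a 1-D kernel launch; returns the grid size used.

    On a GPU the two loops would run in parallel. Here they only demonstrate
    which (block, thread) pair handles which work item.
    """
    grid_dim = blocks_needed(num_items, block_dim)
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            i = global_thread_index(block_idx, block_dim, thread_idx)
            if i < num_items:  # threads past the end of the data do nothing
                kernel(i)
    return grid_dim
```

The bounds check matters whenever the number of work items is not a multiple of the block size, since the last block then contains threads with no item to process.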

Figure 2

An overview diagram of our system showing the distribution of work and the flow of data between the host and the device (GPU). Currently, the host is used only to load the initial simulation parameters and set the initial particle positions. The device performs the simulation of the particle trajectories and returns the final positions to the host, which then simply computes the fraction of trapped particles corresponding to each initial position.
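The final host-side step described in this caption, computing the fraction of trapped particles for each initial position, might look like the following sketch. It is an illustration under stated assumptions rather than the authors' implementation: the trapping criterion (a final position within a fixed distance of the laser focus), the data layout, and the names trapped_fraction, focus, and trap_radius are all hypothetical.

```python
import math


def trapped_fraction(final_positions, focus, trap_radius):
    """Fraction of trajectories that end inside the trap, per start position.

    final_positions maps each initial (x, y) position to the list of final
    (x, y) positions returned by the device for that start position.
    A particle counts as trapped if it ends within trap_radius of focus
    (an assumed criterion, not taken from the paper).
    """
    fractions = {}
    for start, finals in final_positions.items():
        trapped = sum(
            1
            for (x, y) in finals
            if math.hypot(x - focus[0], y - focus[1]) <= trap_radius
        )
        fractions[start] = trapped / len(finals)
    return fractions
```

Because this reduction touches each final position only once, it is cheap compared with the trajectory simulation itself, which is consistent with leaving it on the host.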

Figure 3

This plot shows the probability of trapping a particle under the force exerted by a stationary laser with its focus at (0,0), and was generated using the GPU implementation detailed in this paper.

Figure 4

This plot shows the absolute difference between the trapping probabilities computed by the CPU and GPU implementations for a particle under the force exerted by a stationary laser with its focus at (0,0).

Figure 5

This plot shows the probability of trapping a particle under the force exerted by a laser moving at a constant velocity of 0.65 μm m s−1 in the direction (1,0), and was generated using the GPU implementation detailed in this paper.

Figure 6

This plot shows the probability of trapping a particle under the force exerted by a laser moving at a constant velocity of 0.325 μm m s−1 in the direction (0,−1), and was generated using the GPU implementation detailed in this paper.

Figure 7

The running time as a function of the number of trajectories computed, for both the CPU and GPU simulators. Both axes use a logarithmic scale. The CPU simulator exhibits the linear scaling we expect, while the GPU running time increased by less than a factor of 2 between 8 and 4096 trajectories.
