Research Papers

Massively Parallel Discrete Element Method Simulations on Graphics Processing Units

John Steuben

Computational Multiphysics Systems Laboratory,
U.S. Naval Research Laboratory,
Washington, DC 20375
e-mail: john.steuben.ctr@nrl.navy.mil

Graham Mustoe

College of Engineering and Computer Science,
Colorado School of Mines,
Golden, CO 80401
e-mail: gmustoe@mines.edu

Cameron Turner

Associate Professor
Department of Mechanical Engineering,
Clemson University,
Clemson, SC 29634
e-mail: cturne9@clemson.edu

1Corresponding author.

Contributed by the Computers and Information Division of ASME for publication in the JOURNAL OF COMPUTING AND INFORMATION SCIENCE IN ENGINEERING. Manuscript received June 18, 2014; final manuscript received May 23, 2016; published online August 19, 2016. Editor: Bahram Ravani.

J. Comput. Inf. Sci. Eng. 16(3), 031001 (Aug 19, 2016) (8 pages); Paper No: JCISE-14-1212; doi: 10.1115/1.4033724

This paper outlines the development and implementation of large-scale discrete element method (DEM) simulations on graphics processing hardware. These simulations, as well as the topic of general-purpose graphics processing unit (GPGPU) computing, are introduced and discussed. We then cover the software design choices and architecture used to realize a GPGPU-enabled DEM simulation, which are driven primarily by the massively parallel nature of this computing technology. Two further enhancements to this simulation are then addressed: a more advanced sliding friction model and a thermal conduction model. This discussion also highlights some of the finer points of GPGPU computing, particularly the issues of parallelization, synchronization, and approximation. Qualitative comparison studies between simple and advanced sliding friction models demonstrate the effectiveness of the advanced friction model. A test problem and an application problem in the area of wind turbine blade icing demonstrate the capabilities of the thermal model. We conclude with remarks regarding the simulations developed, the future work needed, and the general suitability of GPGPU architectures for DEM computations.

Copyright © 2016 by ASME




Fig. 1

Elastic interaction of particles in a DEM simulation. Arrows indicate particle velocity, and the shaded areas where particles overlap indicate a repulsive force between particles due to elastic deformation.
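The overlap-based repulsion described in this caption can be illustrated with a minimal sketch. It assumes a linear spring law, in which the force magnitude is a stiffness times the overlap depth and acts along the line joining the particle centers. The function name `normal_contact_force` and the `stiffness` parameter are hypothetical and chosen for illustration; the contact law actually used in the paper is not stated on this page.

```python
import math


def normal_contact_force(pos_i, pos_j, radius_i, radius_j, stiffness):
    """Return the repulsive force on particle i due to particle j.

    Assumption (not taken from the paper): a linear spring law,
    |F| = stiffness * overlap, directed from j toward i.
    """
    # Vector from the center of j to the center of i.
    delta = [a - b for a, b in zip(pos_i, pos_j)]
    dist = math.sqrt(sum(c * c for c in delta))

    # Overlap is positive only when the spheres interpenetrate.
    overlap = radius_i + radius_j - dist
    if overlap <= 0.0 or dist == 0.0:
        # No contact (or coincident centers, where the normal is undefined).
        return [0.0] * len(delta)

    # Unit normal pointing from j to i; the force pushes i away from j.
    normal = [c / dist for c in delta]
    return [stiffness * overlap * n for n in normal]


if __name__ == "__main__":
    # Two unit-radius particles whose centers are 1.5 apart overlap by 0.5.
    print(normal_contact_force((0.0, 0.0), (1.5, 0.0), 1.0, 1.0, 100.0))
```

Because the force on each particle depends only on the positions and radii of the two particles in a pair, evaluations for different pairs are independent of one another, which is the property that makes this kind of calculation a natural fit for massively parallel hardware.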

Fig. 2

Flow diagram of the OpenCL DEM simulation. Note the three distinct phases.

Fig. 3

Local coordinate system (left) and friction force F_s versus slip distance δ_s (right).

Fig. 4

Results of the first trial run. The frames show the simulation at t = 10, 30, 90, 120, 160, and 180 s since simulation start.

Fig. 5

Results of the proof-of-concept test run. Note the temperature scale at right.

Fig. 6

Turbine blade geometry

Fig. 7

Results of GPGPU-enabled thermal DEM. Particle flow is from left to right. The formation of a wake (left) is seen on the leading edge of the turbine blade (wire frame). The thermal gradient in the boundary layer is seen in the cross section view (right).


