This paper outlines the development and implementation of large-scale discrete element method (DEM) simulations on graphics processing hardware. We first introduce DEM simulations and general-purpose graphics processing unit (GPGPU) computing. We then describe the software design choices and architecture used to realize a GPGPU-enabled DEM simulation, which are driven primarily by the massively parallel nature of this computing technology. Two enhancements to this simulation are then addressed: a more advanced sliding friction model and a thermal conduction model. This discussion also highlights some of the finer points of GPGPU computing, particularly the issues of parallelization, synchronization, and approximation. Qualitative comparisons between the simple and advanced sliding friction models demonstrate the effectiveness of the advanced model. A test problem and an application problem in the area of wind turbine blade icing demonstrate the capabilities of the thermal model. We conclude with remarks on the simulations developed, the future work needed, and the general suitability of GPGPU architectures for DEM computations.