Research Papers

Large Scale Finite Element Analysis Via Assembly-Free Deflated Conjugate Gradient

Praveen Yadav

Department of Mechanical Engineering,
Madison, WI 53706

Krishnan Suresh

Department of Mechanical Engineering,
Madison, WI 53706
e-mail: suresh@engr.wisc.edu

Contributed by the Computers and Information Division of ASME for publication in the JOURNAL OF COMPUTING AND INFORMATION SCIENCE IN ENGINEERING. Manuscript received June 27, 2014; final manuscript received September 10, 2014; published online October 7, 2014. Editor: Bahram Ravani.

J. Comput. Inf. Sci. Eng. 14(4), 041008 (Oct 07, 2014) (9 pages) Paper No: JCISE-14-1225; doi: 10.1115/1.4028591 History: Received June 27, 2014; Revised September 10, 2014

Large-scale finite element analysis (FEA) with millions of degrees of freedom (DOF) is becoming commonplace in solid mechanics. The primary computational bottleneck in such problems is the solution of large linear systems of equations. In this paper, we propose an assembly-free version of the deflated conjugate gradient (DCG) method for solving such equations, where neither the stiffness matrix nor the deflation matrix is assembled. While assembly-free FEA is a well-known concept, the novelty pursued in this paper is the use of assembly-free deflation. The resulting implementation is particularly well suited for large-scale problems and can be easily ported to multicore central processing unit (CPU) and graphics processing unit (GPU) architectures. For demonstration, we show that one can solve a 50 × 10^6 degree-of-freedom system on a single GPU card equipped with 3 GB of memory. The second contribution is an extension of the “rigid-body agglomeration” concept used in DCG to a “curvature-sensitive agglomeration.” The latter exploits classic plate and beam theories for efficient deflation of highly ill-conditioned problems arising from thin structures.
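The idea of deflating without assembling either matrix can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses a 1D bar fixed at its left end with identical (congruent) two-node elements, so each DOF is a scalar and the only rigid-body mode per agglomeration group is a translation. The product K u is accumulated element by element, and the deflation matrix W is represented only by a node-to-group map. All names (`apply_K`, `solve_dense`, `dcg`, `group_of`) are hypothetical, and the recurrence is the standard deflated conjugate gradient iteration written in plain Python for readability rather than speed.

```python
import math


def apply_K(u, k=1.0):
    """Assembly-free product y = K u for a 1D bar fixed at its left end.

    Every element has the same stiffness k, so no element matrix is stored.
    """
    n = len(u)
    y = [0.0] * n
    y[0] += k * u[0]                 # element joining the fixed wall to DOF 0
    for e in range(1, n):            # element e joins DOF e-1 and DOF e
        f = k * (u[e - 1] - u[e])
        y[e - 1] += f
        y[e] -= f
    return y


def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (small coarse system only)."""
    m = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented copy
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, m):
            factor = M[r][c] / M[c][c]
            for j in range(c, m + 1):
                M[r][j] -= factor * M[c][j]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = sum(M[r][j] * x[j] for j in range(r + 1, m))
        x[r] = (M[r][m] - s) / M[r][r]
    return x


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dcg(b, n_groups, tol=1e-10, max_iter=1000):
    """Deflated CG for K x = b; W is never stored, only the node-to-group map."""
    n = len(b)
    group_of = [i * n_groups // n for i in range(n)]   # contiguous groups

    def restrict(v):                 # W^T v: sum the entries of each group
        c = [0.0] * n_groups
        for i, g in enumerate(group_of):
            c[g] += v[i]
        return c

    def prolong(c):                  # W c: copy each group value to its nodes
        return [c[g] for g in group_of]

    # Small coarse matrix E = W^T K W, built column by column from matvecs.
    E = [[0.0] * n_groups for _ in range(n_groups)]
    for j in range(n_groups):
        unit = [1.0 if g == j else 0.0 for g in range(n_groups)]
        col = restrict(apply_K(prolong(unit)))
        for i in range(n_groups):
            E[i][j] = col[i]

    x = prolong(solve_dense(E, restrict(b)))           # start with W^T r = 0
    r = [bi - yi for bi, yi in zip(b, apply_K(x))]
    mu = solve_dense(E, restrict(apply_K(r)))
    p = [ri - wi for ri, wi in zip(r, prolong(mu))]
    rr = dot(r, r)
    b_norm = math.sqrt(dot(b, b))
    for it in range(max_iter):
        if math.sqrt(rr) <= tol * b_norm:
            return x, it
        Kp = apply_K(p)
        alpha = rr / dot(p, Kp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ki for ri, ki in zip(r, Kp)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        mu = solve_dense(E, restrict(apply_K(r)))      # deflate the new residual
        p = [beta * pi + ri - wi for pi, ri, wi in zip(p, r, prolong(mu))]
        rr = rr_new
    return x, max_iter


if __name__ == "__main__":
    load = [0.0] * 31 + [1.0]        # unit tip load on a 32-DOF bar
    x, iters = dcg(load, n_groups=4)
    print("iterations:", iters, "tip displacement:", x[-1])
```

In this sketch the only dense object is the small coarse matrix, whose size equals the number of groups. For 3D elasticity each group would carry several rigid-body modes rather than one, and the element loop and the restriction and prolongation steps are the parts one would expect to map onto CPU threads or GPU kernels.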

Copyright © 2014 by ASME




Fig. 1 A two-level geometric multigrid
Fig. 2 (a) Finite element mesh and (b) agglomeration of mesh nodes into four groups
Fig. 3 Example of “thick” solids
Fig. 4 Examples of “thin” solids
Fig. 5 Curvature effects in thin structures
Fig. 6 Congruency in a finite element mesh
Fig. 7 Most of the distinct elements are localized
Fig. 8 Partitioning mesh-nodes into groups. (a) Finite element mesh. (b) Partitioning into 32 groups. (c) Partitioning into 64 groups.
Fig. 9 SpMV implementation in GPU
Fig. 10 GPU implementation of prolongation
Fig. 11 GPU implementation for restriction
Fig. 12 A beam geometry and its mesh
Fig. 13 Assembly-free SpMV on the CPU with and without exploiting element-congruency
Fig. 14 (a) Knuckle geometry and loading. (b) Voxel mesh with 3.16 × 10^6 DOF.
Fig. 15 Static displacement and stress for knuckle
Fig. 16 Visual representation of 100 and 1000 agglomeration groups
Fig. 17 Convergence of DCG versus Jacobi-PCG
Fig. 18 Loading on a thin plate
Fig. 19 Convergence of DCG versus Jacobi-PCG for thin plate
Fig. 20 CUDA profile for RBM deflation
Fig. 21 Structural problem over a Thomas engine
Fig. 22 Deflection from a 50 × 10^6 DOF system


