QMCPACK Release v4.0.0 - 2025-02-05
Notes
This major release changes the default driver behavior and significantly expands GPU support, including fully GPU-accelerated LCAO/Gaussian basis set wavefunctions for both molecular and solid-state systems. It also brings improved GPU configuration options, a new fast spin-orbit implementation based on exact spin integration, a stochastic reconfiguration-based wavefunction optimizer for large parameter sets, self-healing wavefunction optimization, an expanded implementation of the determinant localization approximation (DLA), and a new walker logging capability, among many other changes. High-performance CPU execution is fully supported from laptops through to the largest CPU clusters. High-performance GPU execution on NVIDIA, Intel, and AMD GPUs is also fully supported, from single GPUs up to the largest supercomputers. All users and developers are encouraged to check the extensive list of updates. In most cases, small updates to old input files will be required to run with v4.0.0, which now also checks inputs more strictly for validity. NEXUS workflow scripts will need the fewest changes. See the sections "QMCPACK's Performance Portable Implementation" and "Updating input files for batched drivers" in the manual, e.g., https://qmcpack.readthedocs.io/en/develop/performance_portable.html
- IMPORTANT: the default drivers are now the batched versions in both QMCPACK and NEXUS. As detailed in the user guide, small updates to inputs may be needed for the new batched versions. In addition to performance portable CPU and GPU support, the new drivers check for unknown or inconsistent input settings. The batched implementation is sufficient to cover at least 95% of prior QMCPACK publications. Less-used features may not have been ported. Additional porting and optimizations will be guided by user feedback. If you run into difficulties, please request support so that the documentation can be updated to provide more guidance. To recover legacy v3 behavior in QMCPACK, set the driver_version parameter to legacy; see https://qmcpack.readthedocs.io/en/develop/input_overview.html#driver-version-parameter . For NEXUS, put driver = 'legacy' within generate_qmcpack sections.
- A single QMC_GPU option replaces the CMake options ENABLE_CUDA, ENABLE_ROCM, QMC_CUDA2HIP, ENABLE_SYCL and ENABLE_OFFLOAD. The option is explained in detail in the user guide. For example, set QMC_GPU to "openmp;cuda" for NVIDIA, "openmp;hip" for AMD, and "openmp;sycl" for Intel GPUs. When not building from scratch, the cached entries for the replaced options in CMakeCache.txt need to be removed. #5267
- Adopted Code of Conduct #4922
- GPU accelerated LCAO calculations with Gaussian basis sets for isolated molecules through periodic solids. E.g. via #5021, #4808
- GPU acceleration of real-to-real spline wavefunctions (SplineR2R) #5198
- Updated build recipes for ALCF Aurora and Polaris #5279, OLCF Frontier #5284, NERSC Perlmutter #5281.
- Fast implementation of spin orbit by exact spin integration #5119
- Simple implementation of Stochastic Reconfiguration optimization scheme #5017
- Improved regularization in Stochastic Reconfiguration optimization and added documentation #5157
- Implement determinant localization approximation (DLA) with T-move (TM) #5103
- Converter for determinants coming from PySCF CAS-CI or CAS-SCF calculations #5005
- Fast, efficient implementation of SOC #4933
- New Self-Healing Overlap Estimator #4991
- New walker logging capability - write per walker data during QMC #5019
- Walkers have unique walker_ids to enable complete walker history analysis #5089, #5063
- 1-body reduced density matrix spinor support #4807
- Support for backflow optimization has been removed as part of refactoring and cleaning the codebase. QMC runs using backflow wavefunctions are still supported. This feature is expected to eventually be reimplemented. Users needing backflow optimization can use previously released versions of QMCPACK or help work towards its reimplementation in the modern code. #4688
- Samples input tag supported in batched drivers #4224
- Update Eref during warm up in the batched driver #4906
- Expanded SYCL implementation, e.g. batched determinant support #5043
- All examples updated to specify driver_version where needed #5271
- Specify recompute period in performance tests so mixed and full precision runs are directly comparable #5248
- Added AMD rocTX support #5199
- Allow exact spin integration for SOECP alongside wave function optimization #5173
- Print block and warmup timings #5058
- Print more timing information during optimization #4960
- Allow setting the number of grid points in the short range Ewald summation of Coulomb interaction #4928
- For hybrid representation spinor wavefunctions, skip incorrect norm check #5287
- Increased input consistency checking #5209
- Checks for NaNs during trial wavefunction ratio and gradient handling #4804
- Additional zero protection in multideterminant runs. #4775
- Prevent SOECP exact spin evaluation with multideterminant wave functions due to incomplete code paths #5111
- Improved optimizer robustness - accept more eigenvalues to fix occasional optimizer failures #4917
- Documentation on OneShiftOnly updated #5155
- Documentation and test for eigensolver option (used in wavefunction optimization). #5006
- Documentation explaining how to choose MPI ranks #4931
- Documentation on composing orbital rotations #4755
- Documentation on orbital rotation #4729
- Documentation for source code/developers via new doxygen cmake target (use 'make doxygen') #4700
- Build script for ORNL Baseline #5004
- Build script for NREL Kestrel #5095
- Build script for Improv at ANL LCRC #4994
- convert4qmc compatible with DIRAC versions > 22 #5196
- convertpw4qmcpack prints a completion message #5246
- qdens: Increased precision in XSF format output #5233
- qmc-fit: help keys are auto-populated correctly #5124
- Fixed AFQMC compilation with CUDA 12.x #4776
- Bug fix: Full precision batched drivers now recompute Slater matrices every 10 blocks instead of never #5249
- Bug fix: Proper workspace array determination for LAPACK::geev() #5194
- Bug fix: Fix TMDLA in batched DMC driver runs #5208
- Bug fix: Fix DLA+TMv1 #5113
- Bug fix: spinors with orbital optimization #4923
- Bug fix: complex hybrid representation #4939
- Bug fix: In TrialWaveFunction mw_evalGrad for spinor wave functions #4911
- Bug fix: T-move in batched DMC driver #4902
- Bug fix: Fix indexing inside SplineX2X when outputting fewer orbitals than it holds #4871
- Bug fix: Backend changes to correct PBC ACForces #4855
- Bug fix: Fix rare bounds error in spline Jastrow #4828
- Bug fix: Fix incorrect expressions for Hamiltonian and overlap matrices with complex wavefunctions #4821
- Bug fix: Fix wrong PhaseDiff and protect NaN for DMCBatched #4763
- Many smaller fixes including AFQMC maintenance, added checks, tests and cleanup.
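
As an illustration of the QMC_GPU change listed above, the following minimal sketch builds a CMake argument list that sets the new option and also asks CMake to drop the retired options from an existing cache using its standard `-U` flag. The helper name `cmake_args` and the vendor keys are hypothetical and not part of QMCPACK; only the option names and QMC_GPU values are taken from these notes, and the user guide remains the authoritative reference.

```python
# Hypothetical helper (not part of QMCPACK): assemble CMake arguments for the
# single QMC_GPU option introduced in v4.0.0.

# QMC_GPU values quoted in the release notes, keyed by an illustrative vendor name.
QMC_GPU_BY_VENDOR = {
    "nvidia": "openmp;cuda",
    "amd": "openmp;hip",
    "intel": "openmp;sycl",
}

# CMake options that QMC_GPU replaces, as listed in the release notes.
RETIRED_OPTIONS = (
    "ENABLE_CUDA",
    "ENABLE_ROCM",
    "QMC_CUDA2HIP",
    "ENABLE_SYCL",
    "ENABLE_OFFLOAD",
)


def cmake_args(vendor):
    """Return CMake arguments that unset retired options and set QMC_GPU."""
    key = vendor.strip().lower()
    if key not in QMC_GPU_BY_VENDOR:
        raise ValueError("unknown GPU vendor: " + repr(vendor))
    args = []
    for name in RETIRED_OPTIONS:
        # "-U <name>" removes a matching entry from an existing CMakeCache.txt.
        args += ["-U", name]
    args.append("-DQMC_GPU=" + QMC_GPU_BY_VENDOR[key])
    return args


if __name__ == "__main__":
    # Print the arguments one would append to a "cmake <source dir>" invocation.
    print(cmake_args("nvidia"))
```

Passing the arguments as a list (for example to `subprocess.run`) avoids having to quote the semicolon in the QMC_GPU value, which a shell would otherwise treat as a command separator.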
NEXUS
- Nexus: NumPy 2 support #5215
- Nexus: Extensive examples for specifying estimators #5214
- Nexus: Implementation of Grand Canonical Twist Averaging (GCTA) with (spin)-adapted Fermi levels #5029
- Nexus: Support for GCTA with SOC calculations #5098
- Nexus: Support QE 7.2 DFT+U+V Hubbard format with nearest neighbors #5230
- Nexus: Capability to run self-consistent DFT+U+V in QE > 7.1 #4528
- Nexus: Support for supercell twists in PySCF workflows #5073
- Nexus: Spinor workflows capability #4787
- Nexus: Handle J3 terms in spin-orbit calculations #5184
- Nexus: Added dependency versions requirements files #5256
- Nexus: Support samples tag with batched drivers #5134
- Nexus: Print warning for the Nexus user before Quantum ESPRESSO wavefunction rsync (can be slow) #4984
- Nexus: Implement alternative to deprecated load_source #4964
- Nexus: Updates for QMCPACK batched input generation #4867
- Nexus: All inputs specify drivers where needed #5278
- Nexus: Set correct cusp in SOC Jastrow #4868
- Nexus: Support for Inti at ORNL #5102
- Nexus: Support LLNL machines Lassen and Ruby #5097
- Nexus: Support for NREL Kestrel #5096
- Nexus: Support for ANL LCRC machine Improv #4983
- Nexus: Updated support for SNL machines #4916
- Nexus: Updated PBS job states for ALCF Polaris #4987
- Nexus: Bug fix for SOC J3 terms #5200
- Nexus: Bug fix for 1RDM input generation #5067
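
For inputs written by hand rather than generated through NEXUS, the legacy fallback described in the IMPORTANT note can be requested by adding a driver_version parameter to the QMCPACK XML input. The sketch below does this with Python's standard library. It assumes that the parameter lives inside the `<project>` element, which should be confirmed against the linked driver-version section of the manual. The function name `set_driver_version` and the example input are hypothetical.

```python
import xml.etree.ElementTree as ET


def set_driver_version(xml_text, version="legacy"):
    """Return xml_text with a driver_version parameter set in <project>.

    Assumption: driver_version is a <parameter> child of the <project>
    element; check the QMCPACK manual for the authoritative placement.
    """
    root = ET.fromstring(xml_text)
    project = root.find("project")
    if project is None:
        raise ValueError("no <project> element found in input")
    for param in project.findall("parameter"):
        if param.get("name") == "driver_version":
            # Parameter already present: overwrite its value.
            param.text = version
            break
    else:
        # Parameter absent: add it.
        param = ET.SubElement(project, "parameter", {"name": "driver_version"})
        param.text = version
    return ET.tostring(root, encoding="unicode")


# Hypothetical minimal input used only for illustration.
example = '<simulation><project id="vmc" series="0"/></simulation>'
updated = set_driver_version(example)

if __name__ == "__main__":
    print(updated)
```

Within NEXUS itself no XML editing is needed, since these notes state that driver = 'legacy' can be given within generate_qmcpack sections.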