Releases

For questions please use the QMCPACK Google Group.

A list of releases made from GitHub can be browsed at https://github.com/QMCPACK/qmcpack/releases. Source code, including for the current development version, is available at https://github.com/QMCPACK/qmcpack

The latest nightly test status can be browsed at http://cdash.qmcpack.org

QMCPACK Release v4.0.0 - 2025-02-05

Notes

This major release includes an important change in the default driver behavior, significantly expanded GPU support including fully GPU accelerated LCAO/Gaussian-basis set wavefunction support for both molecular and solid-state systems, improved GPU configuration options, a new fast spin-orbit implementation based on exact spin integration, a stochastic reconfiguration-based wavefunction optimizer for large parameter sets, self-healing wavefunction optimization, expanded implementation of the determinant localization (DLA) approach, and a new walker logging capability, among many others. High-performance CPU execution is fully supported on laptops through to the largest CPU clusters. High-performance GPU execution on NVIDIA, Intel, and AMD GPUs is also fully supported on single GPUs up to the largest supercomputers. All users and developers are encouraged to check the extensive list of updates. In most cases small updates to old input files will be required to run with v4.0.0, which now also checks inputs more strictly for validity. NEXUS workflow scripts will need the fewest changes. See the sections "QMCPACK's Performance Portable Implementation" and "Updating input files for batched drivers" in the manual, e.g., https://qmcpack.readthedocs.io/en/develop/performance_portable.html

  • IMPORTANT: the default drivers are now the batched versions in both QMCPACK and NEXUS. As detailed in the user guide, small updates may be needed to inputs for the new batched versions. In addition to performance portable CPU and GPU support, the new drivers check for unknown or inconsistent input settings. The batched implementation is sufficient to cover at least 95% of prior QMCPACK publications. Less utilized features may not have been ported. Additional porting and optimizations will be guided by user feedback. If you run into difficulties, please request support so that documentation can be updated to provide more guidance. To recover legacy v3 behavior in QMCPACK, set the driver_version parameter to legacy https://qmcpack.readthedocs.io/en/develop/input_overview.html#driver-version-parameter . For NEXUS put driver = 'legacy' within generate_qmcpack sections.
  • A single QMC_GPU option replaces the CMake options ENABLE_CUDA, ENABLE_ROCM, QMC_CUDA2HIP, ENABLE_SYCL and ENABLE_OFFLOAD. This option is explained in detail in the user guide. For example, set QMC_GPU to "openmp;cuda" for NVIDIA, "openmp;hip" for AMD, and "openmp;sycl" for Intel GPUs. When not building from scratch, the cached entries for the replaced options in CMakeCache.txt need to be removed. #5267
  • Adopted Code of Conduct #4922
  • GPU accelerated LCAO calculations with Gaussian basis sets for isolated molecules through periodic solids. E.g. via #5021, #4808
  • GPU acceleration of real-to-real spline wavefunctions (SplineR2R) #5198
  • Updated build recipes for ALCF Aurora and Polaris #5279, OLCF Frontier #5284, NERSC Perlmutter #5281.
  • Fast implementation of spin orbit by exact spin integration #5119
  • Simple implementation of Stochastic Reconfiguration optimization scheme #5017
  • Improved regularization in Stochastic Reconfiguration optimization and added documentation #5157
  • Implement determinant locality approximation (DLA) with T-move (TM) #5103
  • Converter for determinants coming from PySCF CAS-CI or CAS-SCF calculations #5005
  • Fast, efficient implementation of SOC #4933
  • New Self-Healing Overlap Estimator #4991
  • New walker logging capability - write per walker data during QMC #5019
  • Walkers have unique walker_ids to enable complete walker history analysis #5089, #5063
  • 1-body reduced density matrix spinor support #4807
  • Support for backflow optimization has been removed as part of refactoring and cleaning the codebase. QMC runs using backflow wavefunctions are still supported. This feature is expected to eventually be reimplemented. Users needing backflow optimization can use previously released versions of QMCPACK or help work towards its reimplementation in the modern code. #4688
  • Samples input tag supported in batched drivers #4224
  • Update Eref during warm up in the batched driver #4906
  • Expanded SYCL implementation, e.g. batched determinant support #5043
  • All examples updated to specify driver_version where needed #5271
  • Specify recompute period in performance tests so mixed and full precision runs are directly comparable #5248
  • Added AMD rocTX support #5199
  • Allow exact spin integration for SOECP alongside wave function optimization #5173
  • Print block and warmup timings #5058
  • Print more timing information during optimization #4960
  • Allow setting the number of grid points in the short range Ewald summation of Coulomb interaction #4928
  • For hybrid representation spinor wavefunctions, skip incorrect norm check #5287
  • Increased input consistency checking #5209
  • Checks for NaNs during trial wavefunction ratio and gradient handling #4804
  • Additional zero protection in multideterminant runs. #4775
  • Prevent SOECP exact spin evaluation with multideterminant wave functions due to incomplete code paths #5111
  • Improved optimizer robustness - accept more eigenvalues to fix occasional optimizer failures #4917
  • Documentation on OneShiftOnly updated #5155
  • Documentation and test for eigensolver option (used in wavefunction optimization). #5006
  • Documentation explaining how to choose MPI ranks #4931
  • Documentation on composing orbital rotations #4755
  • Documentation on orbital rotation #4729
  • Documentation for source code/developers via new doxygen cmake target (use 'make doxygen') #4700
  • Build script for ORNL Baseline #5004
  • Build script for NREL Kestrel #5095
  • Build script for Improv at ANL LCRC #4994
  • convert4qmc compatible with DIRAC versions > 22 #5196
  • convertpw4qmcpack prints a completion message #5246
  • qdens: Increased precision in XSF format output #5233
  • qmc-fit: help keys are auto-populated correctly #5124
  • Fixed AFQMC compilation with CUDA 12.x #4776
  • Bug fix: Full precision batched drivers recompute Slater matrices every 10 blocks vs never #5249
  • Bug fix: Proper workspace array determination for LAPACK::geev() #5194
  • Bug fix: Fix TMDLA in batched DMC driver runs #5208
  • Bug fix: Fix DLA+TMv1 #5113
  • Bug fix: spinors with orbital optimization #4923
  • Bug fix: complex hybrid representation #4939
  • Bug fix: In TrialWaveFunction mw_evalGrad for spinor wave functions #4911
  • Bug fix: T-move in batched DMC driver #4902
  • Bug fix: Fix indexing inside SplineX2X when outputting fewer orbitals than it holds #4871
  • Bug fix: Backend Changes to correct PBC ACForces #4855
  • Bug fix: Fix rare bounds error in spline Jastrow #4828
  • Bug fix: Fix incorrect Expressions for Hamiltonian and Overlap Matrices with Complex Wavefunctions #4821
  • Bug fix: Fix wrong PhaseDiff and protect NaN for DMCBatched #4763
  • Many smaller fixes including AFQMC maintenance, added checks, tests and cleanup.
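
As an illustration of the QMC_GPU change described above, the following minimal Python sketch maps a GPU vendor to the QMC_GPU value quoted in the notes and filters the replaced options out of the text of an existing CMakeCache.txt. The QMC_GPU values and option names come from the notes above; the function names and the line-based filtering are hypothetical helpers, not part of QMCPACK or CMake.

```python
# QMC_GPU values per vendor, as quoted in the release notes above.
QMC_GPU_BY_VENDOR = {
    "nvidia": "openmp;cuda",
    "amd": "openmp;hip",
    "intel": "openmp;sycl",
}

# CMake options that QMC_GPU replaces, as listed in the release notes above.
RETIRED_OPTIONS = (
    "ENABLE_CUDA",
    "ENABLE_ROCM",
    "QMC_CUDA2HIP",
    "ENABLE_SYCL",
    "ENABLE_OFFLOAD",
)


def qmc_gpu_flag(vendor):
    """Return the -D argument for a vendor (hypothetical helper)."""
    return "-DQMC_GPU=" + QMC_GPU_BY_VENDOR[vendor.lower()]


def strip_retired_entries(cache_text):
    """Drop cache lines of the form NAME:TYPE=VALUE for retired options.

    Assumption: entries follow the usual CMakeCache.txt layout, one per line.
    """
    kept = []
    for line in cache_text.splitlines():
        # The option name is everything before the first ':' (or '=').
        name = line.split(":", 1)[0].split("=", 1)[0].strip()
        if name not in RETIRED_OPTIONS:
            kept.append(line)
    return "\n".join(kept)
```

When the flag is typed directly in a shell, the value needs quoting because of the semicolon, e.g. -DQMC_GPU="openmp;cuda". Deleting the build directory and configuring from scratch avoids the cache cleanup entirely.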

NEXUS

  • Nexus: NumPy 2 support #5215
  • Nexus: Extensive examples for specifying estimators #5214
  • Nexus: Implementation of Grand Canonical Twist Averaging (GCTA) with (spin)-adapted Fermi levels #5029
  • Nexus: Support for GCTA with SOC calculations #5098
  • Nexus: Support QE 7.2 DFT+U+V Hubbard format with nearest neighbors #5230
  • Nexus: Capability to run self-consistent DFT+U+V in QE > 7.1 #4528
  • Nexus: Support for supercell twists in PySCF workflows #5073
  • Nexus: Spinor workflows capability #4787
  • Nexus: Handle J3 terms in spin-orbit calculations #5184
  • Nexus: Added dependency versions requirements files #5256
  • Nexus: Support samples tag with batched drivers #5134
  • Nexus: Print warning for the Nexus user before Quantum ESPRESSO wavefunction rsync (can be slow) #4984
  • Nexus: Implement alternative to deprecated load_source #4964
  • Nexus: Updates for QMCPACK batched input generation #4867
  • Nexus: All inputs specify drivers where needed #5278
  • Nexus: Set correct cusp in SOC Jastrow #4868
  • Nexus: Support for Inti at ORNL #5102
  • Nexus: Support LLNL machines Lassen and Ruby #5097
  • Nexus: Support for NREL Kestrel #5096
  • Nexus: Support for ANL LCRC machine Improv #4983
  • Nexus: Support for SNL machines update #4916
  • Nexus: Updated PBS job states for ALCF Polaris #4987
  • Nexus: Bug fix for SOC J3 terms #5200
  • Nexus: Bug fix for 1RDM input generation #5067

QMCPACK Release v3.17.1 - 2023-08-25

Notes

This minor release is recommended for all users and includes a couple of build fixes and a NEXUS improvement.

  • Improved HDF5 detection. Fixes cases where HDF5 was not identified by CMake, including on FreeBSD (thanks @yurivict for the report). #4708
  • Fix for building with BUILD_UNIT_TESTS=OFF. #4709
  • Add timer for orbital rotations. #4706

NEXUS

  • NEXUS: Support for spinor inputs. #4707

QMCPACK Release v3.17.0 - 2023-08-18

Notes

This is a recommended release for all users. Thanks to everyone who contributed directly, reported an issue, or suggested an improvement. There are many quality of life improvements, bug fixes throughout the application, and updates to the associated testing. As previously announced, the legacy CUDA support (QMC_CUDA=1) is removed in this version. For GPU support, users should transition to the offload code which is more capable and fully usable in production on NVIDIA GPUs.

This version is intended for long-term support of v3 of QMCPACK. Development effort is now focused towards v4. Contributions of tests, fixes, and features from users and developers are still welcome to v3 for a potential future release. However, these will not be ported towards v4 by the core QMCPACK developers without prior arrangement. Please discuss options with QMCPACK developers.

  • Simplified checkpointing and enabled it in the batched drivers. Users now need only specify checkpoint={-1,0,N} to checkpoint between blocks. #4646
  • NERSC Perlmutter build recipe. #4698
  • qmc-fit: Now supports parameter fitting with jackknife for e.g. DFT+U, EXX scans #4475 and for equations of state and Morse fits #4518
  • Improved error checking including NaN checks to protect against potentially unreliable compilers and libraries, #4697, and checks on GPU matrix inversion #4693
  • Significant advances in orbital optimization capability, focusing on LCAO wavefunctions. Development is ongoing for multideterminant support and for spline wavefunctions. See e.g. the Be atom orbital optimization test #4626, #4619, reading and writing of orbital rotation parameters #4580, support for disabled/frozen parameters #4581.
  • Magnetization Density Estimator for non-collinear wavefunctions #4531
  • Pathak-Wagner regularizer for forces #4477
  • The legacy CUDA implementation, the version built with QMC_CUDA=1, has been removed from the codebase. #4431, #4632, #4499, #4442
  • For increased performance with current AMD GPU support, new QMC_DISABLE_HIP_HOST_REGISTER option is enabled by default for ROCm/HIP builds. #4674
  • Bugfix: J1Spin indexing was wrong #4612
  • Bugfix: 1RDM estimator data written to stat.h5 was incorrect #4568
  • Introduced ENABLE_PPCONVERT option and skip ppconvert compilation when cross compiling. #4601
  • Faster builds compared to v3.16.0 due to code refactoring #4682
  • Many refinements throughout the codebase, cleanup, improved testing.
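
To make the checkpoint={-1,0,N} setting mentioned above more concrete, here is a small Python sketch of one plausible reading. The meaning assigned to each value (-1 disables checkpointing, 0 writes only at the end of the QMC section, N > 0 writes every N blocks and again at the end) is an assumption that should be checked against the manual, and the function name is hypothetical.

```python
def checkpoint_blocks(checkpoint, total_blocks):
    """Return the 1-based block indices after which a checkpoint is written.

    Assumed semantics (verify against the QMCPACK manual):
      -1 -> no checkpoint files
       0 -> one checkpoint at the end of the section
       N -> a checkpoint every N blocks, plus one at the end
    """
    if checkpoint < -1:
        raise ValueError("checkpoint must be -1, 0, or a positive integer")
    if checkpoint == -1:
        return []
    if checkpoint == 0:
        return [total_blocks]
    blocks = list(range(checkpoint, total_blocks + 1, checkpoint))
    # Make sure the final block is always covered.
    if not blocks or blocks[-1] != total_blocks:
        blocks.append(total_blocks)
    return blocks
```

For example, under these assumptions a 10-block run with checkpoint=4 would write after blocks 4, 8 and 10.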

NEXUS

  • Nexus: Equilibration detection algorithm is now deterministic #4557
  • Nexus: Support for Kagayaki cluster at JAIST #4598
  • Nexus: GPU support fix for NERSC/Perlmutter #4699
  • Nexus: Use simplices in convex_hull to support newer scipy versions #4671
  • Nexus: Add pdos flag for Projwfc #4655
  • Nexus: Adding crowds_serialize_walkers tag to dmc input list #4651
  • Nexus: Qdens handles batched driver input/output #4645
  • Nexus: Fix namelist read for Projwfc input #4644

Known problems

  • When offload builds are compiled with CUDA toolkit versions above 11.2 using LLVM, multideterminant tests and functionality will fail, seemingly due to an issue with the toolkit. This is discussed in llvm/llvm-project#54633 . All other functionality appears to work as expected. As a workaround, the CUDA toolkit 11.2 can be used. The actual NVIDIA drivers can be more recent.

QMCPACK Release v3.16.0 - 2023-01-31

Download QMCPACK v3.16.0

Notes

This release contains important bug fixes as well as feature improvements. It is a recommended release for all users. Thanks to everyone who reported an issue or suggested an improvement. See GitHub for the full list of merged pull requests and closed issues.

This release is expected to be the last including the legacy CUDA implementation, the version built with QMC_CUDA=1. Users should transition to the batched drivers which support greater functionality as well as both CPU and GPU execution. Users should adopt these drivers now and report any issues. The new drivers can be requested with the driver_version input parameter, see https://qmcpack.readthedocs.io/en/develop/performance_portable.html . In a subsequent release, the non-batched CPU drivers will also be removed leaving only the performance portable batched drivers. This will result in a single implementation of most functionality, improving overall usability and maintainability.
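
The following Python sketch shows one way to generate a project element that requests a driver version. The element layout (a parameter named driver_version inside the project element) and the value "batch" are assumptions based on the linked manual pages and should be verified there; "legacy" is the value named elsewhere in these notes. The function name and the series value are hypothetical.

```python
import xml.etree.ElementTree as ET

# "legacy" appears in these notes; "batch" is assumed from the manual.
ALLOWED_DRIVER_VERSIONS = ("legacy", "batch")


def project_element(run_id, driver_version):
    """Build a <project> element carrying a driver_version parameter."""
    if driver_version not in ALLOWED_DRIVER_VERSIONS:
        raise ValueError("unexpected driver_version: " + driver_version)
    project = ET.Element("project", {"id": run_id, "series": "0"})
    param = ET.SubElement(project, "parameter", {"name": "driver_version"})
    param.text = driver_version
    return ET.tostring(project, encoding="unicode")


if __name__ == "__main__":
    print(project_element("vmc_run", "batch"))
```

Generating the element rather than editing it by hand keeps the setting explicit in every input, which helps when the same workflow must run against both driver families.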

  • Important bugfix to NLPP integration grid rotations and update to all relevant deterministic test values. See issue #4362 for full discussion and visualization. Found and corrected by @markdewing, this bug has existed since the earliest days of QMCPACK. The stochastic rotations used to randomly reorient the integration grids for the non-local pseudopotentials would not cover the full sphere unless they had many points and sufficient symmetry, as was the case for the QMCPACK default. However, calculations with custom integration grids with only a few points (small nrule) could show errors or excess statistical noise in the non-local part of the pseudopotential energy. Standard calculations and tests on carbon diamond, lithium hydride, and hydrocarbon molecules were not affected due to QMCPACK's conservative defaults. Tests updated in #4383
  • NLPP grid randomization can be disabled for debugging and greater reproducibility #4394
  • Two-body Jastrow support for true 2D calculations #4289 (contributed by @Paul-St-Young)
  • Fix for very large calculations requesting too large grids in CUDA spline implementation #4421 (contributed by @pwang234)
  • Bugfix for memory errors in the batched OpenMP offload implementation that occurred when the number of splines is not a perfectly aligned size (multiple of 8 in single precision or 4 in double precision). #4408
  • Updates to test tolerances for many build types and platforms to improve reliability of deterministic tests. Goal: ctest -L deterministic should pass on all platforms. Please report any failures.
  • Improved CMake configuration including detecting use of parallel HDF5 in non-MPI builds #4420 and detection of missing OpenMP support #4422
  • Optimization of spinor wavefunctions with spin-orbit and pseudopotentials re-enabled #4418
  • QMCPACK output now indicates status of QMC_COMPLEX #4412
  • Initial work for eventual GPU offloading of Gaussian basis wavefunctions for molecules and solids #4407
  • Bugfix to support one-body Jastrow functions where only a subset of elements is given #4405
  • Electron coordinates are printed in case a NaN is detected #4401
  • To evade support problems for complex reductions in OpenMP offload compilers, real builds no longer reference any complex reductions #4379
  • Enabled HIP as language in CMake (requires >= 3.21) #3646. When using HIP targeting AMD GPUs, replace HIP_ARCH with CMAKE_HIP_ARCHITECTURES if HIP_ARCH was used to specify the GPU architectures.
  • Refinements to SYCL usage, e.g., #4384, #4382, #4380
  • Many expanded tests including for NLPP parameter derivatives #4394, more boundary conditions in distance tables #4374, for reptation Monte Carlo observables #4327, and orbital rotations #4304
  • Many updates to HDF5 usage including adoption of HDF5 1.10 API #4352 and related cleanup, e.g. #4300
  • Initial Perlmutter CPU build recipe #4398
  • Initial ALCF Sunspot build recipe including offloading to Intel Ponte Vecchio/Xe HPC GPU #4391
  • Better support for FreeBSD #4416
  • Minimum supported Intel classic compiler version is 2021.1. #4389
  • Ongoing improvement to orbital optimization and rotation, e.g. #4288, #4402
  • Ongoing code cleanup, e.g. #4276, #4275, #4273
  • Updated bmpi3 MPI "wrapper"
  • Various other small bug fixes and quality of life improvements. See the full list of merged PRs on GitHub for details.
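
The alignment condition mentioned in the OpenMP offload bugfix above (a multiple of 8 in single precision or 4 in double precision) can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and rounding up to the next multiple is shown for illustration, not as a description of what QMCPACK does internally.

```python
# Multiples quoted in the release note: 8 for single, 4 for double precision.
ALIGNMENT = {"single": 8, "double": 4}


def is_aligned(num_splines, precision):
    """True when the spline count is a multiple of the alignment size."""
    return num_splines % ALIGNMENT[precision] == 0


def padded_spline_count(num_splines, precision):
    """Round the spline count up to the next aligned size (illustrative)."""
    lanes = ALIGNMENT[precision]
    # Ceiling division using integer arithmetic only.
    return -(-num_splines // lanes) * lanes
```

A count such as 10 is unaligned in both precisions, which is the kind of case the fix addresses.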

Known problems

  • When offload builds are compiled with CUDA toolkit versions above 11.2 (tested 11.3-11.8) using LLVM15, multideterminant tests and functionality will fail, seemingly due to an issue with the toolkit. This is discussed in llvm/llvm-project#54633 . All other functionality appears to work as expected. As a workaround, the CUDA toolkit 11.2 can be used. The actual NVIDIA drivers can be more recent.
  • CUDA toolkit version 12.0 is not compatible with LLVM OpenMP offload llvm/llvm-project#60296

NEXUS

  • Nexus: Support for use of templates for job submission scripts #4344
  • Nexus: twist_info.dat files now added to results directory for easier analysis of twist average quantities #4302
  • Nexus: Initial support for Polaris at ALCF #4354
  • Nexus: Initial support for Perlmutter at NERSC #4356
  • Nexus: Support for gpusharing keyword for legacy CUDA #4403
  • Nexus: Support for handling multiple pickle protocols #4385
  • Nexus: CPU/GPU flags for batched code #4341
  • Nexus: Jastrow factors can be read from existing files #4339
  • Nexus: Fix VASP POSCAR write #4331
  • Nexus: Better handling of VASP pseudopotentials #4330

Known problems

  • The new QE7.1 DFT+U input style is not yet supported #4100

QMCPACK Release v3.15.0 - 2022-09-29

Download QMCPACK v3.15.0

Notes

This is a recommended release for all users. There are many quality of life improvements, bugfixes throughout the application, and updates to the associated testing. Thanks to everyone who reported an issue or suggested an improvement.

We are working to make the performance portable "batched drivers" the default in an upcoming version. These support execution on CPUs and multiple GPU architectures with high performance. Most standard QMC calculations and many observables are already supported. Because some changes to the input files will be required, we recommend trying these drivers now and reporting any issues.

  • Important bug fix to excited states in splines when spin-up/down sets are built from the same spin species and occupation is specified on the first sposet #4158
  • The Quantum ESPRESSO converter, pw2qmcpack, is now supported via a plugin activated with -DQE_ENABLE_PLUGINS=pw2qmcpack on the QE CMake configure line, see https://qmcpack.readthedocs.io/en/develop/installation.html#quantum-espresso-7-0 . The latest QE 7.1 and earlier 7.0 are supported, and new versions should be automatically compatible.
  • Substantial improvements to the performance portable / batched implementation. Using LLVM 15.0, high performance production calculations can be performed on NVIDIA GPUs for several wavefunction types, in addition to running on all CPU systems.
  • As introduced in v3.14.0, the optional project data input parameter driver_version specifies whether legacy or batched drivers are used. In future versions of QMCPACK this tag will be required to avoid ambiguity and allow e.g. the batched VMC driver to be obtained via vmc in addition to vmc_batch. See https://qmcpack.readthedocs.io/en/develop/methods.html#transition-from-classic-drivers
  • Non-local pseudopotential energy contributions are consistently included in the objective function used for optimization, improving convergence and achievable wavefunction quality e.g. #4177
  • Support for multistep wavefunction optimization, specifying different parameter sets to be frozen at each step. #4169
  • Parameter filtration during optimization based on statistical uncertainties #4126
  • Support for 2D HEG calculations (e.g. #4084 )
  • Pseudopotential non-local channel can be specified in input #4032
  • Use of deprecated CUDA texture API removed for greater compatibility #4022
  • Optimization has been removed from the legacy CUDA code. New calculations needing GPU support should use the batched drivers and their GPU capabilities for optimization. #4138
  • Initial version of determinant update in SYCL for Intel architectures (e.g. #4118 )
  • Updated walker counts in several of the performance tests. Due to the changed but more representative workloads, new performance timings should not be compared with older runs (e.g. #4112 )
  • Maximum system sizes run in the performance tests can be specified in CMake via QMC_PERFORMANCE_NIO_MAX_ATOMS, QMC_PERFORMANCE_C_GRAPHITE_MAX_ATOMS, and QMC_PERFORMANCE_C_MOLECULE_MAX_ATOMS (e.g. #4134 )
  • Readability refinements in the output, e.g. #4149
  • UHF/UKS support in PySCF converter #4089
  • Example installation scripts for more machines placed in config directory, including Archer2 and Polaris.
  • Many improvements in testing including additional tests, better reliability, and bug fixes.
  • Minimum version of CMake is now v3.17.0 for CPU builds. For GPU builds, more recent versions may be required. Use of the latest CMake version is generally recommended.
  • Minimum CUDA version is 11.0 #3957
  • Minimum version of GCC is now v9.
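
The minimum tool versions listed above for v3.15.0 (CMake 3.17.0 for CPU builds, CUDA 11.0, GCC 9) can be encoded in a small pre-build check. The sketch below is a hypothetical helper, not part of the QMCPACK build system, and it assumes purely numeric dotted version strings.

```python
# Minimum versions taken from the v3.15.0 notes above.
MINIMUM_VERSIONS = {
    "cmake": (3, 17, 0),
    "cuda": (11, 0),
    "gcc": (9,),
}


def parse_version(text):
    """Convert a dotted numeric version such as '3.17.0' to a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(tool, version_text):
    """Compare a found version with the documented minimum for a tool."""
    found = parse_version(version_text)
    required = MINIMUM_VERSIONS[tool]
    # Pad the shorter tuple with zeros so '3.17' compares equal to '3.17.0'.
    width = max(len(found), len(required))
    found += (0,) * (width - len(found))
    required += (0,) * (width - len(required))
    return found >= required
```

GPU builds may need newer CMake versions than the CPU minimum encoded here, as the notes point out.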

NEXUS

  • Nexus: support for the current batched driver style. Example inputs for batched runs using trial wavefunctions from QE are included in examples/qmcpack/rsqmc_quantum_espresso #4246
  • Nexus: add override_vp_parameters element #4245
  • Nexus: fix convert4qmc hdf5 issue #4243
  • Nexus: extend angular channels for pseudopotentials up to l_max=21 #4148
  • Nexus: Pass PYTHONPATH recorded at cmake step to nxs-test to ensure tests run #3935
  • Nexus: Support for VASP keywords to version 6.3 #4056
  • Nexus: Adding docs for limiting the number of simultaneously submitted jobs to a queue #4133