Releases

For questions please use the QMCPACK Google Group.

A list of releases made from GitHub can be browsed at https://github.com/QMCPACK/qmcpack/releases. Source code, including for the current development version, is available at https://github.com/QMCPACK/qmcpack

The latest nightly test status can be browsed at http://cdash.qmcpack.org

QMCPACK Release v3.17.1 - 2023-08-25

Notes

This minor release is recommended for all users and includes a couple of build fixes and a NEXUS improvement.

  • Improved HDF5 detection. Fixes cases where HDF5 was not identified by CMake, including on FreeBSD (thanks @yurivict for the report). #4708
  • Fix for building with BUILD_UNIT_TESTS=OFF. #4709
  • Add timer for orbital rotations. #4706

NEXUS

  • NEXUS: Support for spinor inputs. #4707

QMCPACK Release v3.17.0 - 2023-08-18

Notes

This is a recommended release for all users. Thanks to everyone who contributed directly, reported an issue, or suggested an improvement. There are many quality of life improvements, bug fixes throughout the application, and updates to the associated testing. As previously announced, the legacy CUDA support (QMC_CUDA=1) is removed in this version. For GPU support, users should transition to the offload code which is more capable and fully usable in production on NVIDIA GPUs.

This version is intended for long-term support of v3 of QMCPACK. Development effort is now focused towards v4. Contributions of tests, fixes, and features from users and developers are still welcome to v3 for a potential future release. However, these will not be ported towards v4 by the core QMCPACK developers without prior arrangement. Please discuss options with QMCPACK developers.

  • Simplified checkpointing and enabled it in the batched drivers. Users now only need to specify checkpoint={-1,0,N} to checkpoint between blocks. #4646
  • NERSC Perlmutter build recipe. #4698
  • qmc-fit: Now supports parameter fitting with jackknife for e.g. DFT+U and EXX scans #4475, and for equation-of-state and Morse fits #4518
  • Improved error checking including NaN checks to protect against potentially unreliable compilers and libraries, #4697, and checks on GPU matrix inversion #4693
  • Significant advances in orbital optimization capability, focusing on LCAO wavefunctions. Development is ongoing for multideterminant support and for spline wavefunctions. See e.g. the Be atom orbital optimization test #4626, #4619, reading and writing of orbital rotation parameters #4580, and support for disabled/frozen parameters #4581.
  • Magnetization Density Estimator for non-collinear wavefunctions #4531
  • Pathak-Wagner regularizer for forces #4477
  • The legacy CUDA implementation, the version built with QMC_CUDA=1, has been removed from the codebase. #4431, #4632, #4499, #4442
  • For increased performance with current AMD GPU support, the new QMC_DISABLE_HIP_HOST_REGISTER option is enabled by default for ROCm/HIP builds. #4674
  • Bugfix: J1Spin indexing was wrong #4612
  • Bugfix: 1RDM estimator data written to stat.h5 was incorrect #4568
  • Introduced ENABLE_PPCONVERT option and skip ppconvert compilation when cross compiling. #4601
  • Faster builds compared to v3.16.0 due to code refactoring #4682
  • Many refinements throughout the codebase, cleanup, improved testing.
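
As a rough illustration of the jackknife resampling mentioned for qmc-fit above, the following Python sketch computes a delete-one jackknife error bar for an arbitrary statistic. It is not taken from qmc-fit: the function name jackknife and the sample values are hypothetical, and the statistic used here is a plain mean, where a real fit would pass a function returning a fitted parameter.

```python
import math
import statistics


def jackknife(samples, statistic):
    """Delete-one jackknife: return (estimate, standard error) of statistic(samples)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    # Recompute the statistic n times, each time leaving one sample out.
    leave_one_out = [statistic(samples[:i] + samples[i + 1:]) for i in range(n)]
    jk_mean = sum(leave_one_out) / n
    # Jackknife variance: (n - 1) / n times the summed squared deviations
    # of the leave-one-out values from their mean.
    variance = (n - 1) / n * sum((v - jk_mean) ** 2 for v in leave_one_out)
    return statistic(samples), math.sqrt(variance)


# Made-up sample values, for illustration only.
energies = [-1.02, -0.98, -1.05, -0.97, -1.01, -0.99]
estimate, error = jackknife(energies, statistics.fmean)
```

For the mean, the jackknife error reduces to the usual standard error of the mean, which makes this case a convenient sanity check before passing a nonlinear statistic.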

NEXUS

  • Nexus: Equilibration detection algorithm is now deterministic #4557
  • Nexus: Support for Kagayaki cluster at JAIST #4598
  • Nexus: GPU support fix for NERSC/Perlmutter #4699
  • Nexus: Use simplices in convex_hull to support newer scipy versions #4671
  • Nexus: Add pdos flag for Projwfc #4655
  • Nexus: Adding crowds_serialize_walkers tag to dmc input list #4651
  • Nexus: Qdens handles batched driver input/output #4645
  • Nexus: Fix namelist read for Projwfc input #4644

Known problems

  • When offload builds are compiled with CUDA toolkit versions above 11.2 using LLVM, multideterminant tests and functionality will fail, seemingly due to an issue with the toolkit. This is discussed in llvm/llvm-project#54633 . All other functionality appears to work as expected. As a workaround, the CUDA toolkit 11.2 can be used. The actual NVIDIA drivers can be more recent.

QMCPACK Release v3.16.0 - 2023-01-31

Download QMCPACK v3.16.0

Notes

This release contains important bug fixes as well as feature improvements. It is a recommended release for all users. Thanks to everyone who reported an issue or suggested an improvement. See GitHub for the full list of merged pull requests and closed issues.

This release is expected to be the last including the legacy CUDA implementation, the version built with QMC_CUDA=1. Users should transition to the batched drivers which support greater functionality as well as both CPU and GPU execution. Users should adopt these drivers now and report any issues. The new drivers can be requested with the driver_version input parameter, see https://qmcpack.readthedocs.io/en/develop/performance_portable.html . In a subsequent release, the non-batched CPU drivers will also be removed leaving only the performance portable batched drivers. This will result in a single implementation of most functionality, improving overall usability and maintainability.

  • Important bugfix to NLPP integration grid rotations and update to all relevant deterministic test values. See issue #4362 for full discussion and visualization. Found and corrected by @markdewing, this bug has existed since the earliest days of QMCPACK. The stochastic rotations used to randomly reorient the integration grids for the non-local pseudopotentials would not cover the full sphere unless they had many points and sufficient symmetry, as was the case for the QMCPACK default. However, calculations with custom integration grids with only a few points (small nrule) could show errors or excess statistical noise in the non-local part of the pseudopotential energy. Standard calculations and tests on carbon diamond, lithium hydride, and hydrocarbon molecules were not affected due to QMCPACK's conservative defaults. Tests updated in #4383
  • NLPP grid randomization can be disabled for debugging and greater reproducibility #4394
  • Two-body Jastrow support for true 2D calculations #4289 (contributed by @Paul-St-Young)
  • Fix for very large calculations requesting too large grids in CUDA spline implementation #4421 (contributed by @pwang234)
  • Bugfix for memory errors in the batched OpenMP offload implementation when the number of splines is not a perfectly aligned size (a multiple of 8 in single precision or 4 in double precision). #4408
  • Updates to test tolerances for many build types and platforms to improve reliability of deterministic tests. Goal: ctest -L deterministic should pass on all platforms. Please report any failures.
  • Improved CMake configuration including detecting use of parallel HDF5 in non-MPI builds #4420 and detection of missing OpenMP support #4422
  • Optimization of spinor wavefunctions with spin-orbit and pseudopotentials re-enabled #4418
  • QMCPACK output now indicates status of QMC_COMPLEX #4412
  • Initial work for eventual GPU offloading of Gaussian basis wavefunctions for molecules and solids #4407
  • Bugfix to support one-body Jastrow functions where only a subset of elements is given #4405
  • Electron coordinates are printed in case a NaN is detected #4401
  • To avoid support problems with complex reductions in OpenMP offload compilers, real builds no longer reference any complex reductions #4379
  • Enabled HIP as language in CMake (requires >= 3.21) #3646. When using HIP targeting AMD GPUs, replace HIP_ARCH with CMAKE_HIP_ARCHITECTURES if HIP_ARCH was used to specify the GPU architectures.
  • Refinements to SYCL usage, e.g., #4384, #4382, #4380
  • Many expanded tests including for NLPP parameter derivatives #4394, more boundary conditions in distance tables #4374, for reptation Monte Carlo observables #4327, and orbital rotations #4304
  • Many updates to HDF5 usage including adoption of HDF5 1.10 API #4352 and related cleanup, e.g. #4300
  • Initial Perlmutter CPU build recipe #4398
  • Initial ALCF Sunspot build recipe including offloading to Intel Ponte Vecchio/Xe HPC GPU #4391
  • Better support for FreeBSD #4416
  • Minimum supported Intel classic compiler version is 2021.1. #4389
  • Ongoing improvement to orbital optimization and rotation, e.g. #4288, #4402
  • Ongoing code cleanup, e.g. #4276, #4275, #4273
  • Updated bmpi3 MPI "wrapper"
  • Various other small bug fixes and quality of life improvements. See the full list of merged PRs on GitHub for details.
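
The "perfectly aligned size" condition in the OpenMP offload bugfix above (#4408) can be made concrete with a short Python sketch. Only the multiples of 8 (single precision) and 4 (double precision) come from the note. The helper names is_aligned and padded_size are hypothetical, and rounding up is shown purely as arithmetic: these notes do not say how QMCPACK handles unaligned counts internally.

```python
# Multiples quoted in the release note; everything else here is illustrative.
ALIGNED_MULTIPLE = {"single": 8, "double": 4}


def is_aligned(num_splines, precision):
    """True if num_splines is a whole multiple of the aligned block size."""
    return num_splines % ALIGNED_MULTIPLE[precision] == 0


def padded_size(num_splines, precision):
    """Round num_splines up to the next aligned multiple (hypothetical helper)."""
    multiple = ALIGNED_MULTIPLE[precision]
    # Ceiling division using integer arithmetic only.
    return -(-num_splines // multiple) * multiple
```

For example, 10 splines is unaligned in both precisions, and the next aligned sizes are 16 (single) and 12 (double).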

Known problems

  • When offload builds are compiled with CUDA toolkit versions above 11.2 (tested 11.3-11.8) using LLVM 15, multideterminant tests and functionality will fail, seemingly due to an issue with the toolkit. This is discussed in llvm/llvm-project#54633 . All other functionality appears to work as expected. As a workaround, the CUDA toolkit 11.2 can be used. The actual NVIDIA drivers can be more recent.
  • CUDA toolkit version 12.0 is not compatible with LLVM OpenMP offload llvm/llvm-project#60296

NEXUS

  • Nexus: Support for use of templates for job submission scripts #4344
  • Nexus: twist_info.dat files now added to results directory for easier analysis of twist average quantities #4302
  • Nexus: Initial support for Polaris at ALCF #4354
  • Nexus: Initial support for Perlmutter at NERSC #4356
  • Nexus: Support for gpusharing keyword for legacy CUDA #4403
  • Nexus: Support for handling multiple pickle protocols #4385
  • Nexus: CPU/GPU flags for batched code #4341
  • Nexus: Jastrow factors can be read from existing files #4339
  • Nexus: Fix VASP POSCAR write #4331
  • Nexus: Better handling of VASP pseudopotentials #4330
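
For context on the pickle protocol item above (#4385), the sketch below uses only Python's standard pickle module to show why protocol versions matter. It is not Nexus code: the helper names and the sample record are hypothetical. The standard-library behaviour it relies on is that pickle.loads detects the protocol from the data itself, while the writer chooses the protocol explicitly.

```python
import pickle


def dump_all_protocols(obj):
    """Serialize obj once for every protocol this interpreter supports."""
    return {
        protocol: pickle.dumps(obj, protocol=protocol)
        for protocol in range(pickle.HIGHEST_PROTOCOL + 1)
    }


def load_any(data):
    """Load pickled bytes; the protocol is detected from the byte stream."""
    return pickle.loads(data)


# Made-up record, for illustration only.
record = {"energy": -1.5, "steps": [1, 2, 3]}
blobs = dump_all_protocols(record)
restored = {protocol: load_any(blob) for protocol, blob in blobs.items()}
```

An interpreter cannot load a protocol newer than the ones it supports, so writing with a lower protocol is a common way to keep files readable across Python versions.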

Known problems

  • The new QE7.1 DFT+U input style is not yet supported #4100

QMCPACK Release v3.15.0 - 2022-09-29

Download QMCPACK v3.15.0

Notes

This is a recommended release for all users. There are many quality of life improvements, bugfixes throughout the application, and updates to the associated testing. Thanks to everyone who reported an issue or suggested an improvement.

We are working to make the performance portable "batched drivers" the default in an upcoming version. These support execution on CPUs and multiple GPU architectures with high performance. Most standard QMC calculations and many observables are already supported. Because some changes to the input files will be required, we recommend trying these drivers now and reporting any issues.

  • Important bug fix to excited states in splines when spin-up/down sets are built from the same spin species and occupation is specified on the first sposet #4158
  • The Quantum ESPRESSO converter, pw2qmcpack, is now supported via a plugin activated with -DQE_ENABLE_PLUGINS=pw2qmcpack on the QE CMake configure line, see https://qmcpack.readthedocs.io/en/develop/installation.html#quantum-espresso-7-0 . The latest QE 7.1 and earlier 7.0 are supported, and new versions should be automatically compatible.
  • Substantial improvements to the performance portable / batched implementation. Using LLVM 15.0, high performance production calculations can be performed on NVIDIA GPUs for several wavefunction types, in addition to running on all CPU systems.
  • As introduced in v3.14.0, the optional project data input parameter driver_version specifies whether legacy or batched drivers are used. In future versions of QMCPACK this tag will be required to avoid ambiguity and allow e.g. the batched VMC driver to be obtained via vmc in addition to vmc_batch. See https://qmcpack.readthedocs.io/en/develop/methods.html#transition-from-classic-drivers
  • Non-local pseudopotential energy contributions are consistently included in the objective function used for optimization, improving convergence and achievable wavefunction quality e.g. #4177
  • Support for multistep wavefunction optimization, specifying different parameter sets to be frozen at each step. #4169
  • Parameter filtration during optimization based on statistical uncertainties #4126
  • Support for 2D HEG calculations (e.g. #4084 )
  • Pseudopotential non-local channel can be specified in input #4032
  • Use of deprecated CUDA texture API removed for greater compatibility #4022
  • Optimization has been removed from the legacy CUDA code. New calculations needing GPU support should use the batched drivers and their GPU capabilities for optimization. #4138
  • Initial version of determinant update in SYCL for Intel architectures (e.g. #4118 )
  • Updated walker counts in several of the performance tests. Due to the changed but more representative workloads, new performance timings should not be compared with older runs (e.g. #4112 )
  • Maximum system sizes run in the performance tests can be specified in CMake via QMC_PERFORMANCE_NIO_MAX_ATOMS, QMC_PERFORMANCE_C_GRAPHITE_MAX_ATOMS, and QMC_PERFORMANCE_C_MOLECULE_MAX_ATOMS (e.g. #4134 )
  • Readability refinements in the output, e.g. #4149
  • UHF/UKS support in PySCF converter #4089
  • Example installation scripts for more machines placed in config directory, including Archer2 and Polaris.
  • Many improvements in testing including additional tests, better reliability, and bug fixes.
  • Minimum version of CMake is now v3.17.0 for CPU builds. For GPU builds, more recent versions may be required. Use of the latest CMake version is generally recommended.
  • Minimum CUDA version is 11.0 #3957
  • Minimum version of GCC is now v9.
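
The minimum versions listed above (CMake 3.17.0 for CPU builds, CUDA 11.0, GCC 9) can be checked with a small Python sketch like the one below. The helper names parse_version and meets_minimum are hypothetical and not part of QMCPACK or Nexus, and the parser assumes purely numeric dotted version strings.

```python
# Minimums taken from the notes above; the CMake value applies to CPU builds.
MINIMUMS = {"cmake": (3, 17, 0), "cuda": (11, 0), "gcc": (9,)}


def parse_version(text):
    """Turn a dotted numeric string such as '3.21.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(tool, version_text):
    """True if version_text is at least the minimum listed for tool."""
    found = parse_version(version_text)
    required = MINIMUMS[tool]
    # Pad the shorter tuple with zeros so that '9' and '9.0.0' compare equal.
    width = max(len(found), len(required))
    found += (0,) * (width - len(found))
    required += (0,) * (width - len(required))
    return found >= required
```

Tuple comparison handles multi-digit components correctly, so "3.9.0" is rejected for CMake even though it sorts after "3.17.0" as a string.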

NEXUS

  • Nexus: Support for the current batched driver style. Example inputs for batched runs using trial wavefunctions from QE are included in examples/qmcpack/rsqmc_quantum_espresso #4246
  • Nexus: add override_vp_parameters element #4245
  • Nexus: fix convert4qmc hdf5 issue #4243
  • Nexus: extend angular channels for pseudopotentials up to l_max=21 #4148
  • Nexus: Pass PYTHONPATH recorded at cmake step to nxs-test to ensure tests run #3935
  • Nexus: Support for VASP keywords to version 6.3 #4056
  • Nexus: Adding docs for limiting the number of simultaneously submitted jobs to a queue #4133

QMCPACK Release v3.14.0 - 2022-04-06

Download QMCPACK v3.14.0

Notes

This release focuses on performance improvements to the OpenMP target offload version for GPUs as well as ongoing minor improvements. The new GPU implementation rivals the legacy CUDA version in performance for a broad range of problems while offering more functionality, such as three-body Jastrow functions. Developers are very interested in feedback from users about the new version and will prioritize developments based on comments received. A new driver_version switch is introduced, currently optional, to disambiguate between the versions and their inputs.

  • New global driver_version switch to select between batched and legacy codes. This will become a required input tag in the next major release series of QMCPACK, but remains optional in 3.x versions #3897
  • Optimization of block sizes in GPU offload kernels #3910
  • GPU Offload of one-body Jastrow ratio calculation in pseudopotential evaluation #3905
  • GPU Offload of some Coulomb potential evaluations #3842
  • Partial GPU offload of multideterminant evaluation e.g. #3892
  • Increased performance via more selective distance table computation #3846
  • Improved performance on AMD GPUs via rocSOLVER integration #3756
  • HIP build options shown in output #3919
  • Documentation improvements, particularly relating to installation.
  • Various bug fixes and ongoing cleanup.

NEXUS

  • Nexus: proper use of max_seconds in legacy drivers #3877