diff --git a/CMakeLists.txt b/CMakeLists.txt index cd506c2230..1736be32a8 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -336,6 +336,7 @@ target_link_libraries(${ABACUS_BIN_NAME} driver xc_ hsolver + genelpa elecstate hamilt psi diff --git a/README.md b/README.md index d26135a5be..17de80ae4c 100644 --- a/README.md +++ b/README.md @@ -56,6 +56,7 @@ ABACUS provides the following features and functionalities: 20. (subsidiary tool)Generator for second generation numerical orbital basis. 21. Interface with DPGEN 22. Interface with phonopy +23. Implicit solvation model [back to top](#readme-top) @@ -161,6 +162,7 @@ The following provides basic sample jobs in ABACUS. More can be found in the dir - [BSSE for molecular formation energy](docs/examples/BSSE.md) - [ABACUS-DPGEN interface](docs/examples/dpgen.md) - [ABACUS-phonopy interface](docs/examples/phonopy.md) +- [Implicit solvation model](docs/examples/implicit-sol.md) [back to top](#readme-top) diff --git a/docs/examples/implicit-sol.md b/docs/examples/implicit-sol.md new file mode 100644 index 0000000000..1f677e2796 --- /dev/null +++ b/docs/examples/implicit-sol.md @@ -0,0 +1,63 @@ +# Implicit solvation model + +[back to main page](../../README.md) + +Solid-liquid interfaces are ubiquitous in nature and are frequently encountered in materials simulation. The solvation effect should be taken into account in accurate first-principles calculations of such systems. +The implicit solvation model is a well-developed method for treating solvation effects and has been widely used in finite and periodic systems. This approach treats the solvent as a continuous medium instead of individual “explicit” solvent molecules, which means that the solute is embedded in an implicit solvent and the average over the solvent degrees of freedom becomes implicit in the properties of the solvent bath. 
+ +## Input +``` +INPUT_PARAMETERS +imp_sol 1 +eb_k 80 +tau 0.000010798 +sigma_k 0.6 +nc_k 0.00037 +``` +- imp_sol + + If set to 1, an implicit solvation correction is applied. 0: vacuum calculation (default). +- eb_k + + The relative permittivity of the bulk solvent, 80 for water. Used only if `imp_sol` == true. +- tau + + The effective surface tension parameter, which describes the cavitation, dispersion, and repulsion interactions between the solute and the solvent that are not captured by the electrostatic terms. + We use the values of `tau`, `sigma_k`, and `nc_k` that were obtained by fitting the model to experimental solvation energies for molecules in water. tau = 0.525 $meV/Å^{2}$ = 1.0798e-05 $Ry/Bohr^{2}$. +- sigma_k + + We assume a diffuse cavity that is implicitly determined by the electronic structure of the solute. + `sigma_k` is the parameter that describes the width of the diffuse cavity. The specific value is sigma_k = 0.6. +- nc_k + + `nc_k` determines at what value of the electron density the dielectric cavity forms. + The specific value is nc_k = 0.0025 $Å^{-3}$ = 0.00037 $Bohr^{-3}$. + +## Output +In this example, we calculate the implicit solvation correction for H2O. +The results of the energy calculation are written to “running_nscf.log” in the OUT folder. +``` + Energy Rydberg eV + E_KohnSham -34.3200995971 -466.948910448 + E_Harris -34.2973698556 -466.639656449 + E_band -7.66026117767 -104.223200184 + E_one_elec -56.9853883251 -775.325983964 + E_Hartree +30.0541108968 +408.907156521 + E_xc -8.32727420734 -113.298378028 + E_Ewald +0.961180728747 +13.0775347188 + E_demet +0 +0 + E_descf +0 +0 + E_efield +0 +0 + E_exx +0 +0 + E_sol_el -0.0250553663339 -0.340895747619 + E_sol_cav +0.00232667606131 +0.031656051834 + E_Fermi -0.499934383866 -6.8019562467 + +``` +- E_sol_el: Electrostatic contribution to the solvation energy. +- E_sol_cav: Cavitation and dispersion contributions to the solvation energy. 
+Both `E_sol_el` and `E_sol_cav` corrections are included in `E_KohnSham`. + + + +[back to top](#implicit-solvation-model) \ No newline at end of file diff --git a/docs/examples/phonopy.md b/docs/examples/phonopy.md index ac8efdba7d..bf00b37608 100644 --- a/docs/examples/phonopy.md +++ b/docs/examples/phonopy.md @@ -3,7 +3,7 @@ [back to main page](../../README.md) -[Phonopy](https://github.com/phonopy/phonopy) is a powerful package to calculate phonon and related properties. It has provided interface with ABACUS. In the following, we take the FCC aluminum as an example: +[Phonopy](https://github.com/phonopy/phonopy) is a powerful package for calculating phonons and related properties, and it provides an interface with ABACUS. (Note: please use the `develop` branch rather than the `master` branch until the ABACUS interface has been merged into phonopy's `master` branch.) In the following, we take FCC aluminum as an example: 1. Prepare a 'setting.conf' with following tags: @@ -39,4 +39,4 @@ PRIMITIVE_AXES = 0 1/2 1/2 1/2 0 1/2 1/2 1/2 0 BAND= 1 1 1 1/2 1/2 1 3/8 3/8 3/4 0 0 0 1/2 1/2 1/2 BAND_POINTS = 21 BAND_CONNECTION = .TRUE. -``` \ No newline at end of file +``` diff --git a/docs/features.md b/docs/features.md index c108e78ed7..b74fdb5df7 100644 --- a/docs/features.md +++ b/docs/features.md @@ -50,9 +50,14 @@ ATOMIC_SPECIES Si 28.00 Si_ONCV_PBE-1.0.upf ``` -The user can download the pseudopotential files from our [website](http://abacus.ustc.edu.cn/pseudo.html). +You can download the pseudopotential files from our [website](http://abacus.ustc.edu.cn/pseudo/list.htm). -For more information of different types of pseudopotentials, please visit the Quantum espresso [website](http://www.quantum-espresso.org/pseudopotentials/). +Pseudopotential files from the following websites are also supported by ABACUS: +1. [Quantum ESPRESSO](http://www.quantum-espresso.org/pseudopotentials/). +2. [SG15-ONCV](http://quantum-simulation.org/potentials/sg15_oncv/upf/). +3. 
[DOJO](http://www.pseudo-dojo.org/). + +If the LCAO basis is used, the numerical orbital files should match the pseudopotential files. The [official orbitals package](http://abacus.ustc.edu.cn/pseudo/list.htm) only matches SG15-ONCV pseudopotentials. [back to top](#features) diff --git a/docs/input-main.md b/docs/input-main.md index 7545f7f6ba..3f93eff3ad 100644 --- a/docs/input-main.md +++ b/docs/input-main.md @@ -82,6 +82,10 @@ [cal_cond](#cal_cond) | [cond_nche](#cond_nche) | [cond_dw](#cond_dw) | [cond_wcut](#cond_wcut) | [cond_wenlarge](#cond_wenlarge) | [cond_fwhm ](#cond_fwhm ) +- [Implicit solvation model](#implicit-solvation-model) + + [imp_sol](#imp_sol) | [eb_k](#eb_k) | [tau](#tau) | [sigma_k](#sigma_k) | [nc_k](#nc_k) + [back to main page](../README.md) ## Structure of the file @@ -1662,3 +1666,39 @@ Thermal conductivities: $\kappa = \lim_{\omega\to 0}\kappa(\omega)$ - **Type**: Integer - **Description**: We use gaussian functions to approxiamte $\delta(E)\approx \frac{1}{\sqrt{2\pi}\Delta E}e^{-\frac{E^2}{2{\Delta E}^2}}$. FWHM for conductivities, $FWHM=2*\sqrt{2\ln2}\cdot \Delta E$. The unit is eV. - **Default**: 0.3 + +### Implicit solvation model + +These variables control the implicit solvation model. This approach treats the solvent as a continuous medium instead of individual “explicit” solvent molecules, which means that the solute is embedded in an implicit solvent and the average over the solvent degrees of freedom becomes implicit in the properties of the solvent bath. + +#### imp_sol + +- **Type**: Boolean +- **Description**: If set to 1, an implicit solvation correction is applied. +- **Default**: 0 + +#### eb_k + +- **Type**: Real +- **Description**: The relative permittivity of the bulk solvent, 80 for water. Used only if `imp_sol` == true. 
+- **Default**: 80 + +#### tau + +- **Type**: Real +- **Description**: The effective surface tension parameter, which describes the cavitation, the dispersion, and the repulsion interaction between the solute and the solvent that are not captured by the electrostatic terms. The unit is $Ry/Bohr^{2}$. +- **Default**: 1.0798e-05 + +#### sigma_k + +- **Type**: Real +- **Description**: We assume a diffuse cavity that is implicitly determined by the electronic structure of the solute. +`sigma_k` is the parameter that describes the width of the diffuse cavity. +- **Default**: 0.6 + +#### nc_k + +- **Type**: Real +- **Description**: It determines at what value of the electron density the dielectric cavity forms. +The unit is $Bohr^{-3}$. +- **Default**: 0.00037 \ No newline at end of file diff --git a/modules/FindELPA.cmake b/modules/FindELPA.cmake index 03c4cda549..82cfb56d86 100644 --- a/modules/FindELPA.cmake +++ b/modules/FindELPA.cmake @@ -38,17 +38,4 @@ if(ELPA_FOUND) endif() set(CMAKE_REQUIRED_INCLUDES ${CMAKE_REQUIRED_INCLUDES} ${ELPA_INCLUDE_DIR}) -include(CheckCXXSourceCompiles) -check_cxx_source_compiles(" -#include -#if ELPA_API_VERSION < 20210430 -#error ELPA version is too old. -#endif -int main(){} -" -ELPA_VERSION_SATISFIES -) -if(NOT ELPA_VERSION_SATISFIES) - message(FATAL_ERROR "ELPA version is too old. 
We support version 2017 or higher.") -endif() mark_as_advanced(ELPA_INCLUDE_DIR ELPA_LIBRARY) diff --git a/source/Makefile b/source/Makefile index 5043526890..16c493ff75 100644 --- a/source/Makefile +++ b/source/Makefile @@ -20,6 +20,7 @@ VPATH=./src_global\ :./module_xc\ :./module_esolver\ :./module_hsolver\ +:./module_hsolver/genelpa\ :./module_elecstate\ :./module_psi\ :./module_hamilt\ diff --git a/source/Makefile.Objects b/source/Makefile.Objects index 36799e6333..1c0eecb0ba 100644 --- a/source/Makefile.Objects +++ b/source/Makefile.Objects @@ -277,6 +277,11 @@ hsolver_lcao.o\ hsolver_pw.o\ hsolver_pw_sdft.o +OBJ_GENELPA=elpa_new_complex.o\ +elpa_new_real.o\ +elpa_new.o\ +utils.o + OBJ_ELECSTATES=elecstate.o\ dm2d_to_grid.o\ elecstate_lcao.o\ @@ -304,6 +309,7 @@ $(OBJ_HSOLVER)\ $(OBJ_ELECSTATES)\ $(OBJ_PSI)\ ${OBJ_OPERATOR}\ +${OBJ_GENELPA}\ charge.o \ charge_mixing.o \ charge_pulay.o \ diff --git a/source/input_conv.cpp b/source/input_conv.cpp index 1ff0f0a621..76a0ddcd53 100644 --- a/source/input_conv.cpp +++ b/source/input_conv.cpp @@ -373,6 +373,7 @@ void Input_Conv::Convert(void) if (GlobalC::exx_global.info.hybrid_type != Exx_Global::Hybrid_Type::No) { + //EXX case: convert all EXX-related variables GlobalC::exx_global.info.hybrid_alpha = INPUT.exx_hybrid_alpha; XC_Functional::get_hybrid_alpha(INPUT.exx_hybrid_alpha); GlobalC::exx_global.info.hse_omega = INPUT.exx_hse_omega; @@ -406,6 +407,9 @@ void Input_Conv::Convert(void) Exx_Abfs::Jle::Lmax = INPUT.exx_opt_orb_lmax; Exx_Abfs::Jle::Ecut_exx = INPUT.exx_opt_orb_ecut; Exx_Abfs::Jle::tolerence = INPUT.exx_opt_orb_tolerence; + + //EXX does not support symmetry analysis, so force the symmetry setting to -1 + ModuleSymmetry::Symmetry::symm_flag = -1; } #endif #endif diff --git a/source/module_deepks/test/CMakeLists.txt b/source/module_deepks/test/CMakeLists.txt index 0702aa30bc..963b5dae2c 100644 --- a/source/module_deepks/test/CMakeLists.txt +++ b/source/module_deepks/test/CMakeLists.txt @@ -8,7 +8,7 @@ 
target_link_libraries( test_deepks base cell symmetry md surchem xc_ neighbor orb io relax gint lcao parallel mrrr pdiag pw ri driver esolver hsolver psi elecstate hamilt planewave - pthread + pthread genelpa deepks ${ABACUS_LINK_LIBRARIES} ) diff --git a/source/module_elecstate/elecstate_lcao.cpp b/source/module_elecstate/elecstate_lcao.cpp index d97409ef58..12e1ceab86 100644 --- a/source/module_elecstate/elecstate_lcao.cpp +++ b/source/module_elecstate/elecstate_lcao.cpp @@ -137,7 +137,9 @@ void ElecStateLCAO::print_psi(const psi::Psi& psi_in) // output but not do "2d-to-grid" conversion double** wfc_grid = nullptr; +#ifdef __MPI this->lowf->wfc_2d_to_grid(ElecStateLCAO::out_wfc_lcao, psi_in.get_pointer(), wfc_grid, this->ekb, this->wg); +#endif return; } void ElecStateLCAO::print_psi(const psi::Psi>& psi_in) @@ -159,7 +161,7 @@ void ElecStateLCAO::print_psi(const psi::Psi>& psi_in) { for (int iw = 0; iw < GlobalV::NLOCAL; iw++) { - this->lowf->wfc_k_grid[ik][ib][iw] = psi(ib,iw); + this->lowf->wfc_k_grid[ik][ib][iw] = psi_in(ib, iw); } } #endif diff --git a/source/module_esolver/esolver_ks_pw_tool.cpp b/source/module_esolver/esolver_ks_pw_tool.cpp index 5d8790072c..08600bc573 100644 --- a/source/module_esolver/esolver_ks_pw_tool.cpp +++ b/source/module_esolver/esolver_ks_pw_tool.cpp @@ -1,6 +1,6 @@ #include "esolver_ks_pw.h" -#include "module_base/global_variable.h" #include "module_base/global_function.h" +#include "module_base/global_variable.h" #include "src_pw/global.h" #include "src_pw/occupy.h" @@ -20,24 +20,23 @@ namespace ModuleESolver // k = 1.380649e-23 // e/k = 11604.518026 , 1 eV = 11604.5 K //------------------------------------------------------------------ -#define TWOSQRT2LN2 2.354820045030949 //FWHM = 2sqrt(2ln2) * \sigma -#define FACTOR 1.839939223835727e7 -void ESolver_KS_PW::KG(const int nche_KG, const double fwhmin, const double wcut, - const double dw_in, const int times) +#define TWOSQRT2LN2 2.354820045030949 // FWHM = 2sqrt(2ln2) * \sigma 
+#define FACTOR 1.839939223835727e7 +void ESolver_KS_PW::KG(const int nche_KG, const double fwhmin, const double wcut, const double dw_in, const int times) { //----------------------------------------------------------- // KS conductivity //----------------------------------------------------------- - cout<<"Calculating conductivity..."<= 1); assert(nt >= 1); const int nk = GlobalC::kv.nks; @@ -46,150 +45,180 @@ void ESolver_KS_PW::KG(const int nche_KG, const double fwhmin, const double wcut const double tpiba = GlobalC::ucell.tpiba; const int nbands = GlobalV::NBANDS; const double ef = GlobalC::en.ef; - - double * ct11 = new double[nt]; - double * ct12 = new double[nt]; - double * ct22 = new double[nt]; - ModuleBase::GlobalFunc::ZEROS(ct11,nt); - ModuleBase::GlobalFunc::ZEROS(ct12,nt); - ModuleBase::GlobalFunc::ZEROS(ct22,nt); + double *ct11 = new double[nt]; + double *ct12 = new double[nt]; + double *ct22 = new double[nt]; + ModuleBase::GlobalFunc::ZEROS(ct11, nt); + ModuleBase::GlobalFunc::ZEROS(ct12, nt); + ModuleBase::GlobalFunc::ZEROS(ct22, nt); - for (int ik = 0;ik < nk;++ik) - { - for(int id = 0 ; id < ndim ; ++id) - { - this->phami->updateHk(ik); - const int npw = GlobalC::kv.ngk[ik]; - - complex * pij = new complex [nbands * nbands]; - complex * prevc= new complex [npw * nbands]; - complex * levc = &(this->psi[0](ik,0,0)); - double *ga = new double[npw]; - for (int ig = 0;ig < npw;ig++) - { - ModuleBase::Vector3 v3 = GlobalC::wfcpw->getgpluskcar(ik,ig); - ga[ig] = v3[id] * tpiba; - } - //px|right> - for (int ib = 0; ib < nbands ; ++ib) - { - for (int ig = 0; ig < npw; ++ig) - { - prevc[ib*npw+ig] = ga[ig] * levc[ib*npwx+ig]; - } - - } - zgemm_(&transc,&transn,&nbands,&nbands,&npw,&ModuleBase::ONE,levc,&npwx,prevc,&npw,&ModuleBase::ZERO,pij,&nbands); - MPI_Allreduce(MPI_IN_PLACE, pij ,2 * nbands * nbands, MPI_DOUBLE, MPI_SUM, POOL_WORLD); - int ntper = nt/GlobalV::NPROC_IN_POOL; - int itstart = ntper * GlobalV::RANK_IN_POOL; - if(nt%GlobalV::NPROC_IN_POOL 
> GlobalV::RANK_IN_POOL) - { - ntper++; - itstart += GlobalV::RANK_IN_POOL; - } - else + for (int ik = 0; ik < nk; ++ik) + { + for (int id = 0; id < ndim; ++id) { - itstart += nt%GlobalV::NPROC_IN_POOL; - } - - - for(int it = itstart ; it < itstart+ntper ; ++it) - // for(int it = 0 ; it < nt; ++it) - { - double tmct11 = 0; - double tmct12 = 0; - double tmct22 = 0; - double *enb=&(this->pelec->ekb(ik,0)); - for(int ib = 0 ; ib < nbands ; ++ib) + this->phami->updateHk(ik); + const int npw = GlobalC::kv.ngk[ik]; + + complex *pij = new complex[nbands * nbands]; + complex *prevc = new complex[npw * nbands]; + complex *levc = &(this->psi[0](ik, 0, 0)); + double *ga = new double[npw]; + for (int ig = 0; ig < npw; ig++) + { + ModuleBase::Vector3 v3 = GlobalC::wfcpw->getgpluskcar(ik, ig); + ga[ig] = v3[id] * tpiba; + } + // px|right> + for (int ib = 0; ib < nbands; ++ib) { - double ei = enb[ib]; - double fi = GlobalC::wf.wg(ik,ib); - for(int jb = ib + 1 ; jb < nbands ; ++jb) + for (int ig = 0; ig < npw; ++ig) { - double ej = enb[jb]; - double fj = GlobalC::wf.wg(ik,jb); - double tmct = sin((ej-ei)*(it)*dt)*(fi-fj)*norm(pij[ib*nbands+jb]); - tmct11 += tmct; - tmct12 += - tmct * ((ei+ej)/2 - ef); - tmct22 += tmct * pow((ei+ej)/2 - ef,2); + prevc[ib * npw + ig] = ga[ig] * levc[ib * npwx + ig]; } } - ct11[it] += tmct11/2.0; - ct12[it] += tmct12/2.0; - ct22[it] += tmct22/2.0; + zgemm_(&transc, + &transn, + &nbands, + &nbands, + &npw, + &ModuleBase::ONE, + levc, + &npwx, + prevc, + &npw, + &ModuleBase::ZERO, + pij, + &nbands); +#ifdef __MPI + MPI_Allreduce(MPI_IN_PLACE, pij, 2 * nbands * nbands, MPI_DOUBLE, MPI_SUM, POOL_WORLD); +#endif + int ntper = nt / GlobalV::NPROC_IN_POOL; + int itstart = ntper * GlobalV::RANK_IN_POOL; + if (nt % GlobalV::NPROC_IN_POOL > GlobalV::RANK_IN_POOL) + { + ntper++; + itstart += GlobalV::RANK_IN_POOL; + } + else + { + itstart += nt % GlobalV::NPROC_IN_POOL; + } + + for (int it = itstart; it < itstart + ntper; ++it) + // for(int it = 0 ; it < nt; 
++it) + { + double tmct11 = 0; + double tmct12 = 0; + double tmct22 = 0; + double *enb = &(this->pelec->ekb(ik, 0)); + for (int ib = 0; ib < nbands; ++ib) + { + double ei = enb[ib]; + double fi = GlobalC::wf.wg(ik, ib); + for (int jb = ib + 1; jb < nbands; ++jb) + { + double ej = enb[jb]; + double fj = GlobalC::wf.wg(ik, jb); + double tmct = sin((ej - ei) * (it)*dt) * (fi - fj) * norm(pij[ib * nbands + jb]); + tmct11 += tmct; + tmct12 += -tmct * ((ei + ej) / 2 - ef); + tmct22 += tmct * pow((ei + ej) / 2 - ef, 2); + } + } + ct11[it] += tmct11 / 2.0; + ct12[it] += tmct12 / 2.0; + ct22[it] += tmct22 / 2.0; + } + delete[] pij; + delete[] prevc; + delete[] ga; } - delete [] pij; - delete [] prevc; - delete [] ga; - } } - MPI_Allreduce(MPI_IN_PLACE,ct11,nt,MPI_DOUBLE,MPI_SUM,MPI_COMM_WORLD); - MPI_Allreduce(MPI_IN_PLACE,ct12,nt,MPI_DOUBLE,MPI_SUM,MPI_COMM_WORLD); - MPI_Allreduce(MPI_IN_PLACE,ct22,nt,MPI_DOUBLE,MPI_SUM,MPI_COMM_WORLD); - +#ifdef __MPI + MPI_Allreduce(MPI_IN_PLACE, ct11, nt, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD); + MPI_Allreduce(MPI_IN_PLACE, ct12, nt, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD); + MPI_Allreduce(MPI_IN_PLACE, ct22, nt, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD); +#endif + //------------------------------------------------------------------ // Output //------------------------------------------------------------------ - if(GlobalV::MY_RANK == 0) + if (GlobalV::MY_RANK == 0) { - calcondw(nt,dt,fwhmin,wcut,dw_in,ct11,ct12,ct22); + calcondw(nt, dt, fwhmin, wcut, dw_in, ct11, ct12, ct22); } delete[] ct11; delete[] ct12; delete[] ct22; } -void ESolver_KS_PW::calcondw(const int nt,const double dt,const double fwhmin,const double wcut,const double dw_in,double*ct11,double*ct12,double *ct22) +void ESolver_KS_PW::calcondw(const int nt, + const double dt, + const double fwhmin, + const double wcut, + const double dw_in, + double *ct11, + double *ct12, + double *ct22) { double factor = FACTOR; const int ndim = 3; - int nw = ceil(wcut/dw_in); - double dw = dw_in / 
ModuleBase::Ry_to_eV; //converge unit in eV to Ry + int nw = ceil(wcut / dw_in); + double dw = dw_in / ModuleBase::Ry_to_eV; // converge unit in eV to Ry double sigma = fwhmin / TWOSQRT2LN2 / ModuleBase::Ry_to_eV; ofstream ofscond("je-je.txt"); - ofscond< void HamiltLCAO::updateHk(const int ik) } const int inc = 1; BlasConnector::copy(this->LM->Sloc.size(), this->LM->Sloc.data(), inc, this->smatrix_k, inc); - hsolver::DiagoElpa::is_already_decomposed = false; + hsolver::DiagoElpa::DecomposedState = 0; } ModuleBase::timer::tick("HamiltLCAO", "updateHk"); return; @@ -303,4 +303,4 @@ template <> void HamiltLCAO>::constructHamilt() #endif } -} // namespace hamilt \ No newline at end of file +} // namespace hamilt diff --git a/source/module_hamilt/operator.h b/source/module_hamilt/operator.h index 044fc8489b..8610247375 100644 --- a/source/module_hamilt/operator.h +++ b/source/module_hamilt/operator.h @@ -93,6 +93,11 @@ class Operator //create a new hpsi and delete old hpsi later T* hpsi_pointer = std::get<2>(info); const T* psi_pointer = std::get<0>(info)->get_pointer(); + if(this->hpsi != nullptr) + { + delete this->hpsi; + this->hpsi = nullptr; + } if(!hpsi_pointer) { ModuleBase::WARNING_QUIT("Operator::hPsi", "hpsi_pointer can not be nullptr"); diff --git a/source/module_hsolver/CMakeLists.txt b/source/module_hsolver/CMakeLists.txt index 702bec735d..898fc6cc08 100644 --- a/source/module_hsolver/CMakeLists.txt +++ b/source/module_hsolver/CMakeLists.txt @@ -12,6 +12,8 @@ add_library( diago_lapack.cpp ) +add_subdirectory(genelpa) + IF (BUILD_TESTING) add_subdirectory(test) endif() diff --git a/source/module_hsolver/diago_elpa.cpp b/source/module_hsolver/diago_elpa.cpp index 4bb7403121..69ac995e3d 100644 --- a/source/module_hsolver/diago_elpa.cpp +++ b/source/module_hsolver/diago_elpa.cpp @@ -7,52 +7,16 @@ extern "C" { #include "module_base/blacs_connector.h" -#include "my_elpa.h" #include "module_base/scalapack_connector.h" } +#include "genelpa/elpa_solver.h" typedef 
hamilt::MatrixBlock matd; typedef hamilt::MatrixBlock> matcd; namespace hsolver { -bool DiagoElpa::is_already_decomposed = false; -#ifdef __MPI -inline int set_elpahandle(elpa_t &handle, - const int *desc, - const int local_nrows, - const int local_ncols, - const int nbands) -{ - int error; - int nprows, npcols, myprow, mypcol; - Cblacs_gridinfo(desc[1], &nprows, &npcols, &myprow, &mypcol); - elpa_init(20210430); - handle = elpa_allocate(&error); - elpa_set_integer(handle, "na", desc[2], &error); - elpa_set_integer(handle, "nev", nbands, &error); - - elpa_set_integer(handle, "local_nrows", local_nrows, &error); - - elpa_set_integer(handle, "local_ncols", local_ncols, &error); - - elpa_set_integer(handle, "nblk", desc[4], &error); - - elpa_set_integer(handle, "mpi_comm_parent", MPI_Comm_c2f(MPI_COMM_WORLD), &error); - - elpa_set_integer(handle, "process_row", myprow, &error); - - elpa_set_integer(handle, "process_col", mypcol, &error); - - elpa_set_integer(handle, "blacs_context", desc[1], &error); - - elpa_set_integer(handle, "cannon_for_generalized", 0, &error); - /* Setup */ - elpa_setup(handle); /* Set tunables */ - return 0; -} -#endif - +int DiagoElpa::DecomposedState = 0; void DiagoElpa::diag(hamilt::Hamilt *phm_in, psi::Psi> &psi, double *eigenvalue_in) { ModuleBase::TITLE("DiagoElpa", "diag"); @@ -62,31 +26,15 @@ void DiagoElpa::diag(hamilt::Hamilt *phm_in, psi::Psi> &psi std::vector eigen(GlobalV::NLOCAL, 0.0); - static elpa_t handle; - static bool has_set_elpa_handle = false; - if (!has_set_elpa_handle) - { - set_elpahandle(handle, h_mat.desc, h_mat.row, h_mat.col, GlobalV::NBANDS); - has_set_elpa_handle = true; - } - - // compare to old code from pplab, there is no need to copy Sloc2 to another memory, - // just change Sloc2, which is a temporary matrix - // size_t nloc = h_mat.col * h_mat.row, - // BlasConnector::copy(nloc, s_mat, inc, Stmp, inc); - + bool isReal=false; + const MPI_Comm COMM_DIAG=MPI_COMM_WORLD; // use all processes + ELPA_Solver 
es((const bool)isReal, COMM_DIAG, (const int)GlobalV::NBANDS, (const int)h_mat.row, (const int)h_mat.col, (const int*)h_mat.desc); + this->DecomposedState=0; // for k pointer, the decomposed s_mat can not be reused ModuleBase::timer::tick("DiagoElpa", "elpa_solve"); - int elpa_derror; - elpa_generalized_eigenvectors_dc(handle, - reinterpret_cast(h_mat.p), - reinterpret_cast(s_mat.p), - eigen.data(), - reinterpret_cast(psi.get_pointer()), - 0, - &elpa_derror); + es.generalized_eigenvector(h_mat.p, s_mat.p, this->DecomposedState, eigen.data(), psi.get_pointer()); ModuleBase::timer::tick("DiagoElpa", "elpa_solve"); + es.exit(); - // the eigenvalues. const int inc = 1; BlasConnector::copy(GlobalV::NBANDS, eigen.data(), inc, eigenvalue_in, inc); #else @@ -103,43 +51,14 @@ void DiagoElpa::diag(hamilt::Hamilt *phm_in, psi::Psi &psi, double *eige std::vector eigen(GlobalV::NLOCAL, 0.0); - static elpa_t handle; - static bool has_set_elpa_handle = false; - if (!has_set_elpa_handle) - { - set_elpahandle(handle, h_mat.desc, h_mat.row, h_mat.col, GlobalV::NBANDS); - has_set_elpa_handle = true; - } - - // compare to old code from pplab, there is no need to copy Sloc2 to another memory, - // just change Sloc2, which is a temporary matrix - // change this judgement to HamiltLCAO - /*int is_already_decomposed; - if(ifElpaHandle(GlobalC::CHR.get_new_e_iteration(), (GlobalV::CALCULATION=="nscf"))) - { - ModuleBase::timer::tick("DiagoElpa","decompose_S"); - BlasConnector::copy(pv->nloc, s_mat, inc, Stmp, inc); - is_already_decomposed=0; - ModuleBase::timer::tick("DiagoElpa","decompose_S"); - } - else - { - is_already_decomposed=1; - }*/ - + bool isReal=true; + MPI_Comm COMM_DIAG=MPI_COMM_WORLD; // use all processes + //ELPA_Solver es(isReal, COMM_DIAG, GlobalV::NBANDS, h_mat.row, h_mat.col, h_mat.desc); + ELPA_Solver es((const bool)isReal, COMM_DIAG, (const int)GlobalV::NBANDS, (const int)h_mat.row, (const int)h_mat.col, (const int*)h_mat.desc); ModuleBase::timer::tick("DiagoElpa", 
"elpa_solve"); - int elpa_error; - elpa_generalized_eigenvectors_d(handle, - h_mat.p, - s_mat.p, - eigen.data(), - psi.get_pointer(), - DiagoElpa::is_already_decomposed, - &elpa_error); + es.generalized_eigenvector(h_mat.p, s_mat.p, this->DecomposedState, eigen.data(), psi.get_pointer()); ModuleBase::timer::tick("DiagoElpa", "elpa_solve"); - - //S matrix has been decomposed - DiagoElpa::is_already_decomposed = true; + es.exit(); const int inc = 1; ModuleBase::GlobalFunc::OUT(GlobalV::ofs_running, "K-S equation was solved by genelpa2"); @@ -162,4 +81,4 @@ bool DiagoElpa::ifElpaHandle(const bool &newIteration, const bool &ifNSCF) } #endif -} // namespace hsolver \ No newline at end of file +} // namespace hsolver diff --git a/source/module_hsolver/diago_elpa.h b/source/module_hsolver/diago_elpa.h index 379d33ee8e..0820bc6200 100644 --- a/source/module_hsolver/diago_elpa.h +++ b/source/module_hsolver/diago_elpa.h @@ -14,9 +14,9 @@ class DiagoElpa : public DiagH void diag(hamilt::Hamilt* phm_in, psi::Psi& psi, double* eigenvalue_in) override; void diag(hamilt::Hamilt* phm_in, psi::Psi>& psi, double* eigenvalue_in) override; - - static bool is_already_decomposed; + static int DecomposedState; + private: #ifdef __MPI bool ifElpaHandle(const bool& newIteration, const bool& ifNSCF); @@ -25,4 +25,4 @@ class DiagoElpa : public DiagH } // namespace hsolver -#endif \ No newline at end of file +#endif diff --git a/source/module_hsolver/genelpa/CMakeLists.txt b/source/module_hsolver/genelpa/CMakeLists.txt new file mode 100644 index 0000000000..e962f742f7 --- /dev/null +++ b/source/module_hsolver/genelpa/CMakeLists.txt @@ -0,0 +1 @@ +add_library(genelpa OBJECT elpa_new.cpp elpa_new_real.cpp elpa_new_complex.cpp utils.cpp) diff --git a/source/module_hsolver/genelpa/Cblacs.h b/source/module_hsolver/genelpa/Cblacs.h new file mode 100644 index 0000000000..35a7ccfdfb --- /dev/null +++ b/source/module_hsolver/genelpa/Cblacs.h @@ -0,0 +1,24 @@ +#pragma once +// blacs + // Initialization 
+#include "mpi.h" +int Csys2blacs_handle(MPI_Comm SysCtxt); +void Cblacs_pinfo(int *myid, int *nprocs); +void Cblacs_get(int icontxt, int what, int *val); +void Cblacs_gridinit(int* icontxt, char *layout, int nprow, int npcol); +void Cblacs_gridmap(int* icontxt, int *usermap, int ldumap, int nprow, int npcol); + // Destruction +void Cblacs_gridexit(int icontxt); + // Informational and Miscellaneous +void Cblacs_gridinfo(int icontxt, int* nprow, int *npcol, int *myprow, int *mypcol); +int Cblacs_pnum(int icontxt, int prow, int pcol); +void Cblacs_pcoord(int icontxt, int pnum, int *prow, int *pcol); +void Cblacs_barrier(int icontxt, char *scope); + // Point to Point +void Cdgesd2d(int icontxt, int m, int n, double *a, int lda, int rdest, int cdest); +void Cdgerv2d(int icontxt, int m, int n, double *a, int lda, int rsrc, int csrc); +void Czgesd2d(int icontxt, int m, int n, double _Complex *a, int lda, int rdest, int cdest); +void Czgerv2d(int icontxt, int m, int n, double _Complex *a, int lda, int rsrc, int csrc); + // Combine +//void Cdgamx2d(int icontxt, int scope, int top, int m, int n, +// double *a, int lda, int *ra, int *ca, int rcflag, int rdest, int cdest); diff --git a/source/module_hsolver/genelpa/README b/source/module_hsolver/genelpa/README new file mode 100644 index 0000000000..54005767af --- /dev/null +++ b/source/module_hsolver/genelpa/README @@ -0,0 +1,4 @@ +GenELPA, v1.1.1, customized for ABACUS + +Project: + diff --git a/source/module_hsolver/genelpa/blas.h b/source/module_hsolver/genelpa/blas.h new file mode 100644 index 0000000000..90266a702d --- /dev/null +++ b/source/module_hsolver/genelpa/blas.h @@ -0,0 +1,27 @@ +#pragma once +//blas +void dcopy_(const int *n, const double *x, const int *incx, double *y, const int *incy); +void zcopy_(const int *n, const double _Complex *x, const int *incx, double _Complex *y, const int *incy); +void dgemm_(const char *transa, const char *transb, const int *m, const int *n, const int *k, + const double *alpha, 
double *a, const int *lda, + double *b, const int *ldb, + const double *beta, double *c, const int *ldc); +void dsymm_(char *side, char *uplo, int *m, int *n, + const double *alpha, double *a, int *lda, + double *b, int *ldb, + const double *beta, double *c, int *ldc); +void dtrsm_(char *side, char *uplo, char *transa, char *diag, int *m, int *n, + const double *alpha, double *a, int *lda, + double *b, int *ldb); +//void zcopy_(int *n, double _Complex *x, int *incx, double _Complex *y, int *incy); +void zgemm_(const char *transa, const char *transb, const int *m, const int *n, const int *k, + const double _Complex *alpha, double _Complex *a, const int *lda, + double _Complex *b, const int *ldb, + const double _Complex *beta, double _Complex *c, const int *ldc); +void zsymm_(char *side, char *uplo, int *m, int *n, + const double _Complex *alpha, double _Complex *a, int *lda, + double _Complex *b, int *ldb, + const double _Complex *beta, double _Complex *c, int *ldc); +void ztrsm_(char *side, char *uplo, char *transa, char *diag, int *m, int *n, + double _Complex *alpha, double _Complex *a, int *lda, + double _Complex *b, int *ldb); \ No newline at end of file diff --git a/source/module_hsolver/genelpa/elpa_generic.hpp b/source/module_hsolver/genelpa/elpa_generic.hpp new file mode 100644 index 0000000000..a6b3f82ff5 --- /dev/null +++ b/source/module_hsolver/genelpa/elpa_generic.hpp @@ -0,0 +1,408 @@ +// `elpa_generic.h` replacement for version 2021.05.002 and earlier versions +// If the file `elpa_generic.h` has keywords `elpa_eigenvectors_all_host_arrays_dc`, +// it is the new version of 2021.11.002; otherwise it is the old version. 
+#pragma once +#include "elpa_new.h" +static inline void elpa_set(elpa_t e, const char *name, int value, int *error) +{ + elpa_set_integer(e, name, value, error); +} + +static inline void elpa_set(elpa_t e, const char *name, double value, int *error) +{ + elpa_set_double(e, name, value, error); +} + +static inline void elpa_get(elpa_t e, const char *name, int *value, int *error) +{ + elpa_get_integer(e, name, value, error); +} + +static inline void elpa_get(elpa_t e, const char *name, double *value, int *error) +{ + elpa_get_double(e, name, value, error); +} + +#if ELPA_API_VERSION <= 20210430 // ELPA 2021.05.002 and earlier versions + +static inline void elpa_eigenvectors(elpa_t handle, double *a, double *ev, double *q, int *error) +{ + elpa_eigenvectors_d(handle, a, ev, q, error); +} + +static inline void elpa_eigenvectors(elpa_t handle, float *a, float *ev, float *q, int *error) +{ + elpa_eigenvectors_f(handle, a, ev, q, error); +} + +static inline void elpa_eigenvectors(elpa_t handle, double complex *a, double *ev, double complex *q, int *error) +{ + elpa_eigenvectors_dc(handle, a, ev, q, error); +} + +static inline void elpa_eigenvectors(elpa_t handle, float complex *a, float *ev, float complex *q, int *error) +{ + elpa_eigenvectors_fc(handle, a, ev, q, error); +} + +static inline void elpa_skew_eigenvectors(elpa_t handle, double *a, double *ev, double *q, int *error) +{ + elpa_eigenvectors_d(handle, a, ev, q, error); +} + +static inline void elpa_skew_eigenvectors(elpa_t handle, float *a, float *ev, float *q, int *error) +{ + elpa_eigenvectors_f(handle, a, ev, q, error); +} + +static inline void elpa_generalized_eigenvectors(elpa_t handle, + double *a, + double *b, + double *ev, + double *q, + int is_already_decomposed, + int *error) +{ + elpa_generalized_eigenvectors_d(handle, a, b, ev, q, is_already_decomposed, error); +} + +static inline void elpa_generalized_eigenvectors(elpa_t handle, + float *a, + float *b, + float *ev, + float *q, + int 
is_already_decomposed, + int *error) +{ + elpa_generalized_eigenvectors_f(handle, a, b, ev, q, is_already_decomposed, error); +} + +static inline void elpa_generalized_eigenvectors(elpa_t handle, + double complex *a, + double complex *b, + double *ev, + double complex *q, + int is_already_decomposed, + int *error) +{ + elpa_generalized_eigenvectors_dc(handle, a, b, ev, q, is_already_decomposed, error); +} + +static inline void elpa_generalized_eigenvectors(elpa_t handle, + float complex *a, + float complex *b, + float *ev, + float complex *q, + int is_already_decomposed, + int *error) +{ + elpa_generalized_eigenvectors_fc(handle, a, b, ev, q, is_already_decomposed, error); +} + +static inline void elpa_eigenvalues(elpa_t handle, double *a, double *ev, int *error) +{ + elpa_eigenvalues_d(handle, a, ev, error); +} + +static inline void elpa_eigenvalues(elpa_t handle, float *a, float *ev, int *error) +{ + elpa_eigenvalues_f(handle, a, ev, error); +} + +static inline void elpa_eigenvalues(elpa_t handle, double complex *a, double *ev, int *error) +{ + elpa_eigenvalues_dc(handle, a, ev, error); +} + +static inline void elpa_eigenvalues(elpa_t handle, float complex *a, float *ev, int *error) +{ + elpa_eigenvalues_fc(handle, a, ev, error); +} + +static inline void elpa_skew_eigenvalues(elpa_t handle, double *a, double *ev, int *error) +{ + elpa_eigenvalues_d(handle, a, ev, error); +} + +static inline void elpa_skew_eigenvalues(elpa_t handle, float *a, float *ev, int *error) +{ + elpa_eigenvalues_f(handle, a, ev, error); +} + +static inline void elpa_cholesky(elpa_t handle, double *a, int *error) +{ + elpa_cholesky_d(handle, a, error); +} + +static inline void elpa_cholesky(elpa_t handle, float *a, int *error) +{ + elpa_cholesky_f(handle, a, error); +} +#else // ELPA version >= 2021.11.002 +static inline void elpa_eigenvectors(elpa_t handle, double *a, double *ev, double *q, int *error) +{ + elpa_eigenvectors_all_host_arrays_d(handle, a, ev, q, error); +} + +static inline 
void elpa_eigenvectors(elpa_t handle, float *a, float *ev, float *q, int *error) +{ + elpa_eigenvectors_all_host_arrays_f(handle, a, ev, q, error); +} + +static inline void elpa_eigenvectors(elpa_t handle, double complex *a, double *ev, double complex *q, int *error) +{ + elpa_eigenvectors_all_host_arrays_dc(handle, a, ev, q, error); +} + +static inline void elpa_eigenvectors(elpa_t handle, float complex *a, float *ev, float complex *q, int *error) +{ + elpa_eigenvectors_all_host_arrays_fc(handle, a, ev, q, error); +} + +static inline void elpa_eigenvectors_double(elpa_t handle, double *a, double *ev, double *q, int *error) +{ + elpa_eigenvectors_device_pointer_d(handle, a, ev, q, error); +} + +static inline void elpa_eigenvectors_float(elpa_t handle, float *a, float *ev, float *q, int *error) +{ + elpa_eigenvectors_device_pointer_f(handle, a, ev, q, error); +} + +static inline void elpa_eigenvectors_double_complex(elpa_t handle, + double complex *a, + double *ev, + double complex *q, + int *error) +{ + elpa_eigenvectors_device_pointer_dc(handle, a, ev, q, error); +} + +static inline void elpa_eigenvectors_float_complex(elpa_t handle, + float complex *a, + float *ev, + float complex *q, + int *error) +{ + elpa_eigenvectors_device_pointer_fc(handle, a, ev, q, error); +} + +static inline void elpa_skew_eigenvectors(elpa_t handle, double *a, double *ev, double *q, int *error) +{ + elpa_eigenvectors_all_host_arrays_d(handle, a, ev, q, error); +} + +static inline void elpa_skew_eigenvectors(elpa_t handle, float *a, float *ev, float *q, int *error) +{ + elpa_eigenvectors_all_host_arrays_f(handle, a, ev, q, error); +} + +static inline void elpa_skew_eigenvectors_double(elpa_t handle, double *a, double *ev, double *q, int *error) +{ + elpa_eigenvectors_device_pointer_d(handle, a, ev, q, error); +} + +static inline void elpa_skew_eigenvectors_float(elpa_t handle, float *a, float *ev, float *q, int *error) +{ + elpa_eigenvectors_device_pointer_f(handle, a, ev, q, error); +} 
+ +static inline void elpa_generalized_eigenvectors(elpa_t handle, + double *a, + double *b, + double *ev, + double *q, + int is_already_decomposed, + int *error) +{ + elpa_generalized_eigenvectors_d(handle, a, b, ev, q, is_already_decomposed, error); +} + +static inline void elpa_generalized_eigenvectors(elpa_t handle, + float *a, + float *b, + float *ev, + float *q, + int is_already_decomposed, + int *error) +{ + elpa_generalized_eigenvectors_f(handle, a, b, ev, q, is_already_decomposed, error); +} + +static inline void elpa_generalized_eigenvectors(elpa_t handle, + double complex *a, + double complex *b, + double *ev, + double complex *q, + int is_already_decomposed, + int *error) +{ + elpa_generalized_eigenvectors_dc(handle, a, b, ev, q, is_already_decomposed, error); +} + +static inline void elpa_generalized_eigenvectors(elpa_t handle, + float complex *a, + float complex *b, + float *ev, + float complex *q, + int is_already_decomposed, + int *error) +{ + elpa_generalized_eigenvectors_fc(handle, a, b, ev, q, is_already_decomposed, error); +} + +static inline void elpa_eigenvalues(elpa_t handle, double *a, double *ev, int *error) +{ + elpa_eigenvalues_all_host_arrays_d(handle, a, ev, error); +} + +static inline void elpa_eigenvalues(elpa_t handle, float *a, float *ev, int *error) +{ + elpa_eigenvalues_all_host_arrays_f(handle, a, ev, error); +} + +static inline void elpa_eigenvalues(elpa_t handle, double complex *a, double *ev, int *error) +{ + elpa_eigenvalues_all_host_arrays_dc(handle, a, ev, error); +} + +static inline void elpa_eigenvalues(elpa_t handle, float complex *a, float *ev, int *error) +{ + elpa_eigenvalues_all_host_arrays_fc(handle, a, ev, error); +} + +static inline void elpa_eigenvalues_double(elpa_t handle, double *a, double *ev, int *error) +{ + elpa_eigenvalues_device_pointer_d(handle, a, ev, error); +} + +static inline void elpa_eigenvalues_float(elpa_t handle, float *a, float *ev, int *error) +{ + elpa_eigenvalues_device_pointer_f(handle, a, 
ev, error); +} + +static inline void elpa_eigenvalues_double_complex(elpa_t handle, double complex *a, double *ev, int *error) +{ + elpa_eigenvalues_device_pointer_dc(handle, a, ev, error); +} + +static inline void elpa_eigenvalues_float_complex(elpa_t handle, float complex *a, float *ev, int *error) +{ + elpa_eigenvalues_device_pointer_fc(handle, a, ev, error); +} + +static inline void elpa_skew_eigenvalues(elpa_t handle, double *a, double *ev, int *error) +{ + elpa_eigenvalues_all_host_arrays_d(handle, a, ev, error); +} + +static inline void elpa_skew_eigenvalues(elpa_t handle, float *a, float *ev, int *error) +{ + elpa_eigenvalues_all_host_arrays_f(handle, a, ev, error); +} + +static inline void elpa_skew_eigenvalues_double(elpa_t handle, double *a, double *ev, int *error) +{ + elpa_eigenvalues_device_pointer_d(handle, a, ev, error); +} + +static inline void elpa_skew_eigenvalues_float(elpa_t handle, float *a, float *ev, int *error) +{ + elpa_eigenvalues_device_pointer_f(handle, a, ev, error); +} + +#endif // ELPA_API_VERSION <= 20210430 + +static inline void elpa_cholesky(elpa_t handle, double complex *a, int *error) +{ + elpa_cholesky_dc(handle, a, error); +} + +static inline void elpa_cholesky(elpa_t handle, float complex *a, int *error) +{ + elpa_cholesky_fc(handle, a, error); +} + +static inline void elpa_hermitian_multiply(elpa_t handle, + char uplo_a, + char uplo_c, + int ncb, + double *a, + double *b, + int nrows_b, + int ncols_b, + double *c, + int nrows_c, + int ncols_c, + int *error) +{ + elpa_hermitian_multiply_d(handle, uplo_a, uplo_c, ncb, a, b, nrows_b, ncols_b, c, nrows_c, ncols_c, error); +} + +static inline void elpa_hermitian_multiply(elpa_t handle, + char uplo_a, + char uplo_c, + int ncb, + float *a, + float *b, + int nrows_b, + int ncols_b, + float *c, + int nrows_c, + int ncols_c, + int *error) +{ + elpa_hermitian_multiply_df(handle, uplo_a, uplo_c, ncb, a, b, nrows_b, ncols_b, c, nrows_c, ncols_c, error); +} + +static inline void 
elpa_hermitian_multiply(elpa_t handle, + char uplo_a, + char uplo_c, + int ncb, + double complex *a, + double complex *b, + int nrows_b, + int ncols_b, + double complex *c, + int nrows_c, + int ncols_c, + int *error) +{ + elpa_hermitian_multiply_dc(handle, uplo_a, uplo_c, ncb, a, b, nrows_b, ncols_b, c, nrows_c, ncols_c, error); +} + +static inline void elpa_hermitian_multiply(elpa_t handle, + char uplo_a, + char uplo_c, + int ncb, + float complex *a, + float complex *b, + int nrows_b, + int ncols_b, + float complex *c, + int nrows_c, + int ncols_c, + int *error) +{ + elpa_hermitian_multiply_fc(handle, uplo_a, uplo_c, ncb, a, b, nrows_b, ncols_b, c, nrows_c, ncols_c, error); +} + +static inline void elpa_invert_triangular(elpa_t handle, double *a, int *error) +{ + elpa_invert_trm_d(handle, a, error); +} + +static inline void elpa_invert_triangular(elpa_t handle, float *a, int *error) +{ + elpa_invert_trm_f(handle, a, error); +} + +static inline void elpa_invert_triangular(elpa_t handle, double complex *a, int *error) +{ + elpa_invert_trm_dc(handle, a, error); +} + +static inline void elpa_invert_triangular(elpa_t handle, float complex *a, int *error) +{ + elpa_invert_trm_fc(handle, a, error); +} diff --git a/source/module_hsolver/genelpa/elpa_new.cpp b/source/module_hsolver/genelpa/elpa_new.cpp new file mode 100644 index 0000000000..bc42d1de94 --- /dev/null +++ b/source/module_hsolver/genelpa/elpa_new.cpp @@ -0,0 +1,461 @@ +#include "elpa_new.h" + +#include "elpa_solver.h" +#include "my_math.hpp" +#include "utils.h" + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +using namespace std; + +map NEW_ELPA_HANDLE_POOL; + +ELPA_Solver::ELPA_Solver(const bool isReal, + const MPI_Comm comm, + const int nev, + const int narows, + const int nacols, + const int* desc) +{ + this->isReal = isReal; + this->comm = comm; + this->nev = nev; + this->narows = narows; + this->nacols = nacols; + for (int i = 0; i < 9; ++i) + 
this->desc[i] = desc[i]; + cblacs_ctxt = desc[1]; + nFull = desc[2]; + nblk = desc[4]; + lda = desc[8]; + // cout<<"parameters are passed\n"; + MPI_Comm_rank(comm, &myid); + Cblacs_gridinfo(cblacs_ctxt, &nprows, &npcols, &myprow, &mypcol); + // cout<<"blacs grid is inited\n"; + allocate_work(); + // cout<<"work array is inited\n"; + if (isReal) + kernel_id = read_real_kernel(); + else + kernel_id = read_complex_kernel(); + // cout<<"kernel id is inited as "<setQR(0); + this->setKernel(isReal, kernel_id); + // cout<<"elpa kernel is setup\n"; + this->setLoglevel(0); + // cout<<"log level is setup\n"; +} + +ELPA_Solver::ELPA_Solver(const bool isReal, + const MPI_Comm comm, + const int nev, + const int narows, + const int nacols, + const int* desc, + const int* otherParameter) +{ + this->isReal = isReal; + this->comm = comm; + this->nev = nev; + this->narows = narows; + this->nacols = nacols; + for (int i = 0; i < 9; ++i) + this->desc[i] = desc[i]; + + kernel_id = otherParameter[0]; + useQR = otherParameter[1]; + loglevel = otherParameter[2]; + + cblacs_ctxt = desc[1]; + nFull = desc[2]; + nblk = desc[4]; + lda = desc[8]; + MPI_Comm_rank(comm, &myid); + Cblacs_gridinfo(cblacs_ctxt, &nprows, &npcols, &myprow, &mypcol); + allocate_work(); + + int error; + static map NEW_ELPA_HANDLE_POOL; + static int total_handle; + + elpa_init(20210430); + + handle_id = ++total_handle; + elpa_t handle; + handle = elpa_allocate(&error); + NEW_ELPA_HANDLE_POOL[handle_id] = handle; + + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "na", nFull, &error); + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "nev", nev, &error); + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "local_nrows", narows, &error); + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "local_ncols", nacols, &error); + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "nblk", nblk, &error); + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "mpi_comm_parent", MPI_Comm_c2f(comm), &error); + 
elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "process_row", myprow, &error); + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "process_col", mypcol, &error); + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "blacs_context", cblacs_ctxt, &error); + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "solver", ELPA_SOLVER_2STAGE, &error); + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "debug", wantDebug, &error); + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "qr", useQR, &error); + this->setQR(useQR); + this->setKernel(isReal, kernel_id); + this->setLoglevel(loglevel); +} + +void ELPA_Solver::setLoglevel(int loglevel) +{ + int error; + this->loglevel = loglevel; + static bool isLogfileInited = false; + + if (loglevel >= 2) + { + wantDebug = 1; + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "verbose", 1, &error); + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "debug", wantDebug, &error); + if (!isLogfileInited) + { + stringstream logfilename; + logfilename.str(""); + logfilename << "GenELPA_" << myid << ".log"; + logfile.open(logfilename.str()); + logfile << "logfile inited\n"; + isLogfileInited = true; + } + } + else + { + wantDebug = 0; + } +} + +void ELPA_Solver::setKernel(bool isReal, int kernel) +{ + this->kernel_id = kernel; + int error; + if (isReal) + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "real_kernel", kernel, &error); + else + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "complex_kernel", kernel, &error); +} + +void ELPA_Solver::setQR(int useQR) +{ + this->useQR = useQR; + int error; + elpa_set_integer(NEW_ELPA_HANDLE_POOL[handle_id], "qr", useQR, &error); +} + +void ELPA_Solver::exit() +{ + // delete[] dwork; + // delete[] zwork; + if (loglevel > 2) + logfile.close(); + int error; + elpa_deallocate(NEW_ELPA_HANDLE_POOL[handle_id], &error); +} + +int ELPA_Solver::read_cpuflag() +{ + int cpuflag = 0; + + ifstream f_cpuinfo("/proc/cpuinfo"); + string cpuinfo_line; + regex cpuflag_ex("flags.*"); + regex 
cpuflag_avx512(".*avx512.*"); + regex cpuflag_avx2(".*avx2.*"); + regex cpuflag_avx(".*avx.*"); + regex cpuflag_sse(".*sse.*"); + while (getline(f_cpuinfo, cpuinfo_line)) + { + if (regex_match(cpuinfo_line, cpuflag_ex)) + { + // cout< +#include + + struct elpa_struct; + typedef struct elpa_struct *elpa_t; + + struct elpa_autotune_struct; + typedef struct elpa_autotune_struct *elpa_autotune_t; + +#include +#include +// ELPA only provides a C interface header, causing an inconsistency between the complex types +// of C99 (e.g. double complex) and C++11 (std::complex). +// Thus, we have to define a wrapper of complex over the C API +// for compatibility. +#define complex _Complex +#include + // #include +#undef complex + const char *elpa_strerr(int elpa_error); +} + +#define complex _Complex +#include "elpa_generic.hpp" // This is a wrapper for `elpa/elpa_generic.h`. +#undef complex \ No newline at end of file diff --git a/source/module_hsolver/genelpa/elpa_new_complex.cpp b/source/module_hsolver/genelpa/elpa_new_complex.cpp new file mode 100644 index 0000000000..b9cb9de375 --- /dev/null +++ b/source/module_hsolver/genelpa/elpa_new_complex.cpp @@ -0,0 +1,454 @@ +#include +#include +#include +#include +#include +#include +#include + +#include "elpa_new.h" +#include "elpa_solver.h" + +#include "my_math.hpp" +#include "utils.h" + +using namespace std; + +extern map<int, elpa_t> NEW_ELPA_HANDLE_POOL; + +int ELPA_Solver::eigenvector(complex<double>* A, double* EigenValue, complex<double>* EigenVector) +{ + int info; + int allinfo; + double t; + + if((loglevel>0 && myid==0) || loglevel>1) + { + t=-1; + timer(myid, "elpa_eigenvectors_dc", "1", t); + } + elpa_eigenvectors(NEW_ELPA_HANDLE_POOL[handle_id], + reinterpret_cast<double _Complex*>(A), + EigenValue, reinterpret_cast<double _Complex*>(EigenVector), + &info); + if((loglevel>0 && myid==0) || loglevel>1) + { + timer(myid, "elpa_eigenvectors_dc", "1", t); + } + MPI_Allreduce(&info, &allinfo, 1, MPI_INT, MPI_MAX, comm); + return allinfo; +} + +int ELPA_Solver::generalized_eigenvector(complex<double>* A, 
complex<double>* B, int& DecomposedState, + double* EigenValue, complex<double>* EigenVector) +{ + int info, allinfo; + double t; + + if((loglevel>0 && myid==0) || loglevel>1) + { + t=-1; + timer(myid, "decomposeRightMatrix", "1", t); + } + if(DecomposedState==0) // B is not decomposed + allinfo=decomposeRightMatrix(B, EigenValue, EigenVector, DecomposedState); + else + allinfo=0; + + if((loglevel>0 && myid==0) || loglevel>1) + { + timer(myid, "decomposeRightMatrix", "1", t); + } + if(allinfo != 0) + return allinfo; + + // transform A to A~ + if((loglevel>0 && myid==0) || loglevel>1) + { + t=-1; + timer(myid, "transform A to A~", "2", t); + } + if(DecomposedState == 1 || DecomposedState == 2) + { + // calculate A*U^-1, put to work + if(loglevel>1) + { + t=-1; + timer(myid, "A*U^-1", "2.1a", t); + } + Cpzgemm('C', 'N', nFull, 1.0, A, B, 0.0, zwork.data(), desc); + if(loglevel>1) + { + timer(myid, "A*U^-1", "2.1a", t); + } + + // calculate U^-C*(A*U^-1), put to A + if(loglevel>1) + { + t=-1; + timer(myid, "U^-T*(A*U^-1)", "2.2a", t); + } + Cpzgemm('C', 'N', nFull, 1.0, B, zwork.data(), 0.0, A, desc); + if(loglevel>1) + { + timer(myid, "U^-T*(A*U^-1)", "2.2a", t); + } + } + else + { + // calculate B*A^C and put to work + if(loglevel>1) + { + t=-1; + timer(myid, "B*A^T", "2.1b", t); + } + Cpzgemm('N', 'C', nFull, 1.0, B, A, 0.0, zwork.data(), desc); + if(loglevel>1) + { + timer(myid, "B*A^T", "2.1b", t); + } + // calculate B*work^C and put to A -- original A*x=v*B*x was transformed to a*x'=v*x' + if(loglevel>1) + { + t=-1; + timer(myid, "B*(B*A^T)^T", "2.2b", t); + } + Cpzgemm('N', 'C', nFull, 1.0, B, zwork.data(), 0.0, A, desc); + if(loglevel>1) + { + timer(myid, "B*(B*A^T)^T", "2.2b", t); + } + } + if((loglevel>0 && myid==0) || loglevel>1) + { + timer(myid, "transform A to A~", "2", t); + } + + // calculate the eigenvalue and eigenvector of A~ + if((loglevel>0 && myid==0) || loglevel>1) + { + t=-1; + timer(myid, "elpa_eigenvectors", "3", t); + } + if(loglevel>2) 
saveMatrix("A_tilde.dat", nFull, A, desc, cblacs_ctxt); + //elpa_eigenvectors_all_host_arrays_dc(NEW_ELPA_HANDLE_POOL[handle_id], reinterpret_cast<double _Complex*>(A), + // EigenValue, reinterpret_cast<double _Complex*>(EigenVector), &info); + info=eigenvector(A, EigenValue, EigenVector); + if((loglevel>0 && myid==0) || loglevel>1) + { + timer(myid, "elpa_eigenvectors", "3", t); + } + MPI_Allreduce(&info, &allinfo, 1, MPI_INT, MPI_MAX, comm); + if(loglevel>2) saveMatrix("EigenVector_tilde.dat", nFull, EigenVector, desc, cblacs_ctxt); + + if((loglevel>0 && myid==0) || loglevel>1) + { + t=-1; + timer(myid, "composeEigenVector", "4", t); + } + // transform eigenvector c~ to original eigenvector c + allinfo=composeEigenVector(DecomposedState, B, EigenVector); + if((loglevel>0 && myid==0) || loglevel>1) + { + timer(myid, "composeEigenVector", "4", t); + } + return allinfo; +} + +int ELPA_Solver::decomposeRightMatrix(complex<double>* B, double* EigenValue, complex<double>* EigenVector, int& DecomposedState) +{ + double _Complex* b = reinterpret_cast<double _Complex*>(B); + double _Complex* q = reinterpret_cast<double _Complex*>(EigenVector); + + int info=0; + int allinfo=0; + double t; + + // first try cholesky decomposing + if(nFull<CHOLESKY_CRITICAL_SIZE) + { + DecomposedState=1; + if(loglevel>1) + { + t=-1; + timer(myid, "pzpotrf_", "1", t); + } + info=Cpzpotrf('U', nFull, B, desc); + if(loglevel>1) + { + timer(myid, "pzpotrf_", "1", t); + } + MPI_Allreduce(&info, &allinfo, 1, MPI_INT, MPI_MAX, comm); + if(allinfo != 0) // if pzpotrf fails, try elpa_cholesky_dc + { + DecomposedState=2; + if(loglevel>1) + { + t=-1; + timer(myid, "elpa_cholesky_dc", "2", t); + } + elpa_cholesky_dc(NEW_ELPA_HANDLE_POOL[handle_id], b, &info); + if(loglevel>1) + { + timer(myid, "elpa_cholesky_dc", "2", t); + } + MPI_Allreduce(&info, &allinfo, 1, MPI_INT, MPI_MAX, comm); + } + } else + { + DecomposedState=2; + if(loglevel>1) + { + t=-1; + timer(myid, "elpa_cholesky_dc", "1", t); + } + elpa_cholesky_dc(NEW_ELPA_HANDLE_POOL[handle_id], b, &info); + if(loglevel>1) + { + timer(myid, "elpa_cholesky_dc", "1", t); + } + MPI_Allreduce(&info, &allinfo, 
1, MPI_INT, MPI_MAX, comm); + if(allinfo != 0) + { + DecomposedState=1; + if(loglevel>1) + { + t=-1; + timer(myid, "pzpotrf_", "2", t); + } + info=Cpzpotrf('U', nFull, B, desc); + if(loglevel>1) + { + timer(myid, "pzpotrf_", "2", t); + } + MPI_Allreduce(&info, &allinfo, 1, MPI_INT, MPI_MAX, comm); + } + } + + if(allinfo==0) // calculate U^{-1} + { + if(loglevel>1) + { + t=-1; + timer(myid, "clear low triangle", "1", t); + } + for(int j=0; j<nacols; ++j) + { + int jGlobal=globalIndex(j, nblk, npcols, mypcol); + for(int i=0; i<narows; ++i) + { + int iGlobal=globalIndex(i, nblk, nprows, myprow); + if(iGlobal>jGlobal) B[i+j*narows]=0; + } + } + if(loglevel>1) + { + timer(myid, "clear low triangle", "1", t); + } + if(loglevel>2) saveMatrix("U.dat", nFull, B, desc, cblacs_ctxt); + if(loglevel>1) + { + t=-1; + timer(myid, "invert U", "1", t); + } + elpa_invert_trm_dc(NEW_ELPA_HANDLE_POOL[handle_id], b, &info); + if(loglevel>1) + { + timer(myid, "invert U", "1", t); + } + if(loglevel>2) saveMatrix("U_inv.dat", nFull, B, desc, cblacs_ctxt); + } else { + // if cholesky decomposing failed, try diagonalize + // calculate B^{-1/2}_{i,j}=\sum_k q_{i,k}*ev_k^{-1/2}*q_{j,k} and put to b, which will be b^-1/2 + DecomposedState=3; + if(loglevel>1) + { + t=-1; + timer(myid, "calculate eigenvalue and eigenvector of B", "1", t); + } + //elpa_eigenvectors_all_host_arrays_dc(NEW_ELPA_HANDLE_POOL[handle_id], b, + // EigenValue, q, &info); + info=eigenvector(B, EigenValue, EigenVector); + if(loglevel>1) + { + timer(myid, "calculate eigenvalue and eigenvector of B", "1", t); + } + MPI_Allreduce(&info, &allinfo, 1, MPI_INT, MPI_MAX, comm); + // calculate q*ev^{-1/2} and put to work + for(int i=0; i<nacols; ++i) + { + int eidx=globalIndex(i, nblk, npcols, mypcol); + double ev_sqrt=EigenValue[eidx]>DBL_MIN?1.0/sqrt(EigenValue[eidx]):0; + for(int j=0; j<narows; ++j) + zwork[i*lda+j]=EigenVector[i*lda+j]*ev_sqrt; + } + + if(loglevel>1) + { + t=-1; + timer(myid, "qevq=qev*q^T", "2", t); + } + Cpzgemm('N', 'C', nFull, 1.0, zwork.data(), EigenVector, 0.0, B, desc); + if(loglevel>1) + { + timer(myid, "qevq=qev*q^T", "2", t); + } + } + return allinfo; +} + +int ELPA_Solver::composeEigenVector(int DecomposedState, complex<double>* B, complex<double>* EigenVector) +{ + double t; + if(DecomposedState==1 || DecomposedState==2) + { + // transform the eigenvectors to original 
general equation, let U^-1*q, and put to q + if(loglevel>1) + { + t=-1; + timer(myid, "Cpztrmm", "1", t); + } + Cpztrmm('L', 'U', 'N', 'N', nFull, nev, 1.0, B, EigenVector, desc); + if(loglevel>1) + { + timer(myid, "Cpztrmm", "1", t); + } + } else { + // transform the eigenvectors to original general equation, let b^C*q, and put to q + if(loglevel>1) + { + t=-1; + timer(myid, "Cpzgemm", "1", t); + } + Cpzgemm('C', 'N', nFull, nev, nFull, 1.0, B, zwork.data(), 0.0, EigenVector, desc); + if(loglevel>1) + { + timer(myid, "Cpzgemm", "1", t); + } + } + return 0; +} + +// calculate the error +// $ \ket{ \delta \psi_i } = (H - \epsilon_i)\ket{\psi_i} $ +// $ \delta_i = \braket{ \delta \psi_i | \delta \psi_i } $ +// +// V: eigenvector matrix +// D: Diagonal matrix of eigenvalues +// maxError: maximum absolute value of error +// meanError: mean absolute value of error +void ELPA_Solver::verify(complex<double>* A, double* EigenValue, complex<double>* EigenVector, + double &maxError, double &meanError) +{ + complex<double>* V=EigenVector; + const int naloc=narows*nacols; + complex<double>* D=new complex<double>[naloc]; + complex<double>* R=zwork.data(); + + for(int i=0; i2) saveMatrix("VD.dat", nFull, R, desc, cblacs_ctxt); + // R=A*V-V*D=A*V-R + Cpzhemm('L', 'U', nFull, 1.0, A, V, -1.0, R, desc); + if(loglevel>2) saveMatrix("AV-VD.dat", nFull, R, desc, cblacs_ctxt); + // calculate the maximum and mean value of sum_i{R(:,i)*R(:,i)} + double sumError=0; + maxError=0; + for(int i=1; i<=nev; ++i) + { + complex<double> E; + Cpzdotc(nFull, E, R, 1, i, 1, + R, 1, i, 1, desc); + double abs_E=std::abs(E); + sumError+=abs_E; + maxError=std::max(maxError, abs_E); + } + meanError=sumError/nFull; + delete[] D; +} + +// calculate the error +// $ \ket{ \delta \psi_i } = (H - \epsilon_i S)\ket{\psi_i} $ +// $ \delta_i = \braket{ \delta \psi_i | \delta \psi_i } $ +// +// V: eigenvector matrix +// D: Diagonal matrix of eigenvalues +// maxError: maximum absolute value of error +// meanError: mean absolute value of error +void ELPA_Solver::verify(complex<double>* A, 
complex<double>* B, + double* EigenValue, complex<double>* EigenVector, + double &maxError, double &meanError) +{ + complex<double>* V=EigenVector; + const int naloc=narows*nacols; + complex<double>* D=new complex<double>[naloc]; + complex<double>* R=new complex<double>[naloc]; + + for(int i=0; i2) saveMatrix("BV.dat", nFull, zwork.data(), desc, cblacs_ctxt); + // R=B*V*D=zwork*D + Cpzhemm('R', 'U', nFull, 1.0, D, zwork.data(), 0.0, R, desc); + if(loglevel>2) saveMatrix("BVD.dat", nFull, R, desc, cblacs_ctxt); + // R=A*V-B*V*D=A*V-R + Cpzhemm('L', 'U', nFull, 1.0, A, V, -1.0, R, desc); + if(loglevel>2) saveMatrix("AV-BVD.dat", nFull, R, desc, cblacs_ctxt); + // calculate the maximum and mean value of sum_i{R(:,i)*R(:,i)} + double sumError=0; + maxError=0; + for(int i=1; i<=nev; ++i) + { + complex<double> E; + Cpzdotc(nFull, E, R, 1, i, 1, + R, 1, i, 1, desc); + double abs_E=std::abs(E); + sumError+=abs_E; + maxError=std::max(maxError, abs_E); + } + meanError=sumError/nFull; + + delete[] D; + delete[] R; +} diff --git a/source/module_hsolver/genelpa/elpa_new_real.cpp b/source/module_hsolver/genelpa/elpa_new_real.cpp new file mode 100644 index 0000000000..d6b606b007 --- /dev/null +++ b/source/module_hsolver/genelpa/elpa_new_real.cpp @@ -0,0 +1,458 @@ +#include "elpa_new.h" +#include "elpa_solver.h" +#include "my_math.hpp" +#include "utils.h" + +#include +#include +#include +#include +#include +#include +#include + +using namespace std; +extern map<int, elpa_t> NEW_ELPA_HANDLE_POOL; + +int ELPA_Solver::eigenvector(double* A, double* EigenValue, double* EigenVector) +{ + int info; + double t; + + if (loglevel > 0 && myid == 0) + { + t = -1; + timer(myid, "elpa_eigenvectors_all_host_arrays_d", "1", t); + } + elpa_eigenvectors(NEW_ELPA_HANDLE_POOL[handle_id], A, EigenValue, EigenVector, &info); + if (loglevel > 0 && myid == 0) + { + timer(myid, "elpa_eigenvectors_all_host_arrays_d", "1", t); + } + return info; +} + +int ELPA_Solver::generalized_eigenvector(double* A, + double* B, + int& DecomposedState, + double* EigenValue, + double* 
EigenVector) +{ + int info, allinfo; + double t; + + if (loglevel > 0 && myid == 0) + { + t = -1; + timer(myid, "decomposeRightMatrix", "1", t); + } + if (DecomposedState == 0) + allinfo = decomposeRightMatrix(B, EigenValue, EigenVector, DecomposedState); + else + allinfo = 0; + if (loglevel > 0 && myid == 0) + { + timer(myid, "decomposeRightMatrix", "1", t); + } + if (allinfo != 0) + return allinfo; + + // transform A to A~ + if ((loglevel > 0 && myid == 0) || loglevel > 1) + { + t = -1; + timer(myid, "transform A to A~", "2", t); + } + if (DecomposedState == 1 || DecomposedState == 2) + { + // calculate A*U^-1, put to work + if (loglevel > 1) + { + t = -1; + timer(myid, "A*U^-1", "2", t); + } + Cpdgemm('T', 'N', nFull, 1.0, A, B, 0.0, dwork.data(), desc); + if (loglevel > 1) + { + timer(myid, "A*U^-1", "2", t); + } + + // calculate U^-T*(A*U^-1), put to A + if (loglevel > 1) + { + t = -1; + timer(myid, "U^-T^(A*U^-1)", "3", t); + } + Cpdgemm('T', 'N', nFull, 1.0, B, dwork.data(), 0.0, A, desc); + if (loglevel > 1) + { + timer(myid, "U^-T^(A*U^-1)", "3", t); + } + } + else + { + // calculate B*A^T and put to work + if (loglevel > 1) + { + t = -1; + timer(myid, "B*A^T", "2", t); + } + Cpdgemm('N', 'T', nFull, 1.0, B, A, 0.0, dwork.data(), desc); + if (loglevel > 1) + { + timer(myid, "B*A^T", "2", t); + } + // calculate B*work^T = B*(B*A^T)^T and put to A -- original A*x=v*B*x was transformed to a*x'=v*x' + if (loglevel > 1) + { + t = -1; + timer(myid, "B*work^T = B*(B*A^T)^T", "3", t); + } + Cpdgemm('N', 'T', nFull, 1.0, B, dwork.data(), 0.0, A, desc); + if (loglevel > 1) + { + timer(myid, "B*work^T = B*(B*A^T)^T", "3", t); + } + } + if ((loglevel > 0 && myid == 0) || loglevel > 1) + { + timer(myid, "transform A to A~", "2", t); + } + + if ((loglevel > 0 && myid == 0) || loglevel > 1) + { + t = -1; + timer(myid, "elpa_eigenvectors", "2", t); + } + if (loglevel > 2) + saveMatrix("A_tilde.dat", nFull, A, desc, cblacs_ctxt); + info = eigenvector(A, EigenValue, 
EigenVector); + + if ((loglevel > 0 && myid == 0) || loglevel > 1) + { + timer(myid, "elpa_eigenvectors", "2", t); + } + + MPI_Allreduce(&info, &allinfo, 1, MPI_INT, MPI_MAX, comm); + if (loglevel > 2) + saveMatrix("EigenVector_tilde.dat", nFull, EigenVector, desc, cblacs_ctxt); + + if (loglevel > 0 && myid == 0) + { + t = -1; + timer(myid, "composeEigenVector", "3", t); + } + allinfo = composeEigenVector(DecomposedState, B, EigenVector); + if (loglevel > 0 && myid == 0) + { + timer(myid, "composeEigenVector", "3", t); + } + return allinfo; +} + +// calculate cholesky factorization of matrix B +// B = U^T * U +// and calculate the inverse: U^{-1} +// input: +// B: the right side matrix of generalized eigen equation +// output: +// DecomposedState: the method used to decompose right matrix +// 1 or 2: use cholesky decomposing, B=U^T*U +// 3: if cholesky decomposing failed, use diagonalizing +// B: decomposed right matrix +// when DecomposedState is 1 or 2, B is U^{-1} +// when DecomposedState is 3, B is B^{-1/2} +int ELPA_Solver::decomposeRightMatrix(double* B, double* EigenValue, double* EigenVector, int& DecomposedState) +{ + int info = 0; + int allinfo = 0; + double t; + + // first try cholesky decomposing + if (nFull < CHOLESKY_CRITICAL_SIZE) + { + DecomposedState = 1; + if (loglevel > 1) + { + t = -1; + timer(myid, "pdpotrf_", "1", t); + } + info = Cpdpotrf('U', nFull, B, desc); + if (loglevel > 1) + { + timer(myid, "pdpotrf_", "1", t); + } + MPI_Allreduce(&info, &allinfo, 1, MPI_INT, MPI_MAX, comm); + if (allinfo != 0) // pdpotrf fail, try elpa_cholesky_real + { + DecomposedState = 2; + if (loglevel > 1) + { + t = -1; + timer(myid, "elpa_cholesky_d", "2", t); + } + elpa_cholesky_d(NEW_ELPA_HANDLE_POOL[handle_id], B, &info); + if (loglevel > 1) + { + timer(myid, "elpa_cholesky_d", "2", t); + } + MPI_Allreduce(&info, &allinfo, 1, MPI_INT, MPI_MAX, comm); + } + } + else + { + DecomposedState = 2; + if (loglevel > 1) + { + t = -1; + timer(myid, "elpa_cholesky_d", 
"1", t); + } + elpa_cholesky_d(NEW_ELPA_HANDLE_POOL[handle_id], B, &info); + if (loglevel > 1) + { + timer(myid, "elpa_cholesky_d", "1", t); + } + MPI_Allreduce(&info, &allinfo, 1, MPI_INT, MPI_MAX, comm); + if (allinfo != 0) + { + DecomposedState = 1; + if (loglevel > 1) + { + t = -1; + timer(myid, "pdpotrf_", "2", t); + } + info = Cpdpotrf('U', nFull, B, desc); + if (loglevel > 1) + { + timer(myid, "pdpotrf_", "2", t); + } + MPI_Allreduce(&info, &allinfo, 1, MPI_INT, MPI_MAX, comm); + } + } + + if (allinfo == 0) // calculate U^{-1} + { + // clear low triangle + if (loglevel > 1) + { + t = -1; + timer(myid, "clear low triangle", "1", t); + } + for (int j = 0; j < nacols; ++j) + { + int jGlobal = globalIndex(j, nblk, npcols, mypcol); + for (int i = 0; i < narows; ++i) + { + int iGlobal = globalIndex(i, nblk, nprows, myprow); + if (iGlobal > jGlobal) + B[i + j * narows] = 0; + } + } + if (loglevel > 1) + { + timer(myid, "clear low triangle", "1", t); + } + if (loglevel > 2) + saveMatrix("U.dat", nFull, B, desc, cblacs_ctxt); + // calculate the inverse U^{-1} + if (loglevel > 1) + { + t = -1; + timer(myid, "invert U", "1", t); + } + elpa_invert_trm_d(NEW_ELPA_HANDLE_POOL[handle_id], B, &info); + if (loglevel > 1) + { + timer(myid, "invert U", "1", t); + } + if (loglevel > 2) + saveMatrix("U_inv.dat", nFull, B, desc, cblacs_ctxt); + } + else + { + // if cholesky decomposing failed, try diagonalize + // calculate B^{-1/2}_{i,j}=\sum_k q_{i,k}*ev_k^{-1/2}*q_{j,k} and put to b, which will be b^-1/2 + DecomposedState = 3; + if (loglevel > 1) + { + t = -1; + timer(myid, "calculate eigenvalue and eigenvector of B", "1", t); + } + // elpa_eigenvectors_all_host_arrays_d(NEW_ELPA_HANDLE_POOL[handle_id], B, EigenValue, EigenVector, &info); + info = eigenvector(B, EigenValue, EigenVector); + if (loglevel > 1) + { + timer(myid, "calculate eigenvalue and eigenvector of B", "1", t); + } + MPI_Allreduce(&info, &allinfo, 1, MPI_INT, MPI_MAX, comm); + + // calculate q*ev^{-1/2} and 
put to work + for (int i = 0; i < nacols; ++i) + { + int eidx = globalIndex(i, nblk, npcols, mypcol); + // double ev_sqrt=1.0/sqrt(ev[eidx]); + double ev_sqrt = EigenValue[eidx] > DBL_MIN ? 1.0 / sqrt(EigenValue[eidx]) : 0; + for (int j = 0; j < narows; ++j) + dwork[i * lda + j] = EigenVector[i * lda + j] * ev_sqrt; + } + + // calculate work*q^T=q*ev^{-1/2}*q^T, put to b, which is B^{-1/2} + if (loglevel > 1) + { + t = -1; + timer(myid, "qevq=qev*q^T", "2", t); + } + Cpdgemm('N', 'T', nFull, 1.0, dwork.data(), EigenVector, 0.0, B, desc); + if (loglevel > 1) + { + timer(myid, "qevq=qev*q^T", "2", t); + } + } + return allinfo; +} + +int ELPA_Solver::composeEigenVector(int DecomposedState, double* B, double* EigenVector) +{ + double t; + if (DecomposedState == 1 || DecomposedState == 2) + { + // transform the eigenvectors to original general equation, let U^-1*q, and put to q + if (loglevel > 1) + { + t = -1; + timer(myid, "Cpdtrmm", "1", t); + } + Cpdtrmm('L', 'U', 'N', 'N', nFull, nev, 1.0, B, EigenVector, desc); + if (loglevel > 1) + { + timer(myid, "Cpdtrmm", "1", t); + } + } + else + { + // transform the eigenvectors to original general equation, let b^T*q, and put to q + if (loglevel > 1) + { + t = -1; + timer(myid, "Cpdgemm", "1", t); + } + Cpdgemm('T', 'N', nFull, 1.0, B, dwork.data(), 0.0, EigenVector, desc); + if (loglevel > 1) + { + timer(myid, "Cpdgemm", "1", t); + } + } + return 0; +} + +// calculate error of sum_i{R(:,i)*R(:,i)}, where R = A*V - V*D +// V: eigenvector matrix +// D: Diagonal matrix of eigenvalues +// maxError: maximum error +// meanError: mean error +void ELPA_Solver::verify(double* A, double* EigenValue, double* EigenVector, double& maxError, double& meanError) +{ + double* V = EigenVector; + const int naloc = narows * nacols; + double* D = new double[naloc]; + double* R = dwork.data(); + + for (int i = 0; i < naloc; ++i) + D[i] = 0; + + for (int i = 0; i < nFull; ++i) + { + int localRow, localCol; + int localProcRow, localProcCol; + + 
localRow = localIndex(i, nblk, nprows, localProcRow); + if (myprow == localProcRow) + { + localCol = localIndex(i, nblk, npcols, localProcCol); + if (mypcol == localProcCol) + { + int idx = localRow + localCol * narows; + D[idx] = EigenValue[i]; + } + } + } + + // R=V*D + Cpdsymm('R', 'U', nFull, nev, 1.0, D, V, 0.0, R, desc); + // R=A*V-V*D=A*V-R + Cpdsymm('L', 'U', nFull, nev, 1.0, A, V, -1.0, R, desc); + // calculate the maximum and mean value of sum_i{R(:,i)*R(:,i)} + double sumError = 0; + maxError = 0; + for (int i = 1; i <= nev; ++i) + { + double E; + Cpddot(nFull, E, R, 1, i, 1, R, 1, i, 1, desc); + // printf("myid: %d, i: %d, E: %lf\n", myid, i, E); + sumError += E; + maxError = maxError > E ? maxError : E; + } + meanError = sumError / nFull; + // global mean and max Error + delete[] D; +} + +// calculate the residual of A*V - B*V*D +// V: eigenvector matrix +// D: diagonal matrix of eigenvalues +// maxError: maximum absolute value of error +// meanError: mean absolute value of error +void ELPA_Solver::verify(double* A, + double* B, + double* EigenValue, + double* EigenVector, + double& maxError, + double& meanError) +{ + double* V = EigenVector; + const int naloc = narows * nacols; + double* D = new double[naloc]; + double* R = new double[naloc]; + + for (int i = 0; i < naloc; ++i) + D[i] = 0; + + for (int i = 0; i < nFull; ++i) + { + int localRow, localCol; + int localProcRow, localProcCol; + + localRow = localIndex(i, nblk, nprows, localProcRow); + if (myprow == localProcRow) + { + localCol = localIndex(i, nblk, npcols, localProcCol); + if (mypcol == localProcCol) + { + int idx = localRow + localCol * narows; + D[idx] = EigenValue[i]; + } + } + } + + // dwork=B*V + Cpdsymm('L', 'U', nFull, 1.0, B, V, 0.0, dwork.data(), desc); + // R=B*V*D=dwork*D + Cpdsymm('R', 'U', nFull, 1.0, D, dwork.data(), 0.0, R, desc); + // R=A*V-B*V*D=A*V-R + Cpdsymm('L', 'U', nFull, 1.0, A, V, -1.0, R, desc); + // calculate the maximum and mean value of sum_i{R(:,i)*R(:,i)} + double
sumError = 0; + maxError = 0; + for (int i = 1; i <= nev; ++i) + { + double E; + Cpddot(nFull, E, R, 1, i, 1, R, 1, i, 1, desc); + // printf("myid: %d, i: %d, E: %lf\n", myid, i, E); + sumError += E; + maxError = maxError > E ? maxError : E; + } + meanError = sumError / nFull; + + delete[] D; + delete[] R; +} diff --git a/source/module_hsolver/genelpa/elpa_solver.h b/source/module_hsolver/genelpa/elpa_solver.h new file mode 100644 index 0000000000..13b8bc5ecc --- /dev/null +++ b/source/module_hsolver/genelpa/elpa_solver.h @@ -0,0 +1,98 @@ +#pragma once +#include "mpi.h" + +#include <complex> +#include <fstream> +#include <vector> + +class ELPA_Solver +{ + public: + ELPA_Solver(const bool isReal, + const MPI_Comm comm, + const int nev, + const int narows, + const int nacols, + const int* desc); + ELPA_Solver(const bool isReal, + const MPI_Comm comm, + const int nev, + const int narows, + const int nacols, + const int* desc, + const int* otherParameter); + + int eigenvector(double* A, double* EigenValue, double* EigenVector); + int generalized_eigenvector(double* A, double* B, int& DecomposedState, double* EigenValue, double* EigenVector); + int eigenvector(std::complex<double>* A, double* EigenValue, std::complex<double>* EigenVector); + int generalized_eigenvector(std::complex<double>* A, + std::complex<double>* B, + int& DecomposedState, + double* EigenValue, + std::complex<double>* EigenVector); + void setLoglevel(int loglevel); + void setKernel(bool isReal, int Kernel); + void setQR(int useQR); + void outputParameters(); + void verify(double* A, double* EigenValue, double* EigenVector, double& maxRemain, double& meanRemain); + void verify(double* A, double* B, double* EigenValue, double* EigenVector, double& maxRemain, double& meanRemain); + void verify(std::complex<double>* A, + double* EigenValue, + std::complex<double>* EigenVector, + double& maxError, + double& meanError); + void verify(std::complex<double>* A, + std::complex<double>* B, + double* EigenValue, + std::complex<double>* EigenVector, + double& maxError, + double& meanError); + void exit(); + + private:
+ const int CHOLESKY_CRITICAL_SIZE = 1000; + bool isReal; + MPI_Comm comm; + int nFull; + int nev; + int narows; + int nacols; + int desc[9]; + int method; + int kernel_id; + int cblacs_ctxt; + int nblk; + int lda; + std::vector<double> dwork; + std::vector<std::complex<double>> zwork; + int myid; + int nprows; + int npcols; + int myprow; + int mypcol; + int useQR; + int wantDebug; + int loglevel; + std::ofstream logfile; + // for legacy interface + int comm_f; + int mpi_comm_rows; + int mpi_comm_cols; + // for new elpa handle + int handle_id; + + // toolbox + int read_cpuflag(); + int read_real_kernel(); + int read_complex_kernel(); + int allocate_work(); + int decomposeRightMatrix(double* B, double* EigenValue, double* EigenVector, int& DecomposedState); + int decomposeRightMatrix(std::complex<double>* B, + double* EigenValue, + std::complex<double>* EigenVector, + int& DecomposedState); + int composeEigenVector(int DecomposedState, double* B, double* EigenVector); + int composeEigenVector(int DecomposedState, std::complex<double>* B, std::complex<double>* EigenVector); + // debug tool + void timer(int myid, const char function[], const char step[], double& t0); +}; diff --git a/source/module_hsolver/genelpa/my_math.hpp b/source/module_hsolver/genelpa/my_math.hpp new file mode 100644 index 0000000000..999389e852 --- /dev/null +++ b/source/module_hsolver/genelpa/my_math.hpp @@ -0,0 +1,384 @@ +#pragma once +// simple wrappers for blas, pblas and scalapack +// NOTE: some parameters of these functions are not supported +extern "C" +{ +#include "Cblacs.h" +#include "blas.h" +#include "pblas.h" +#include "scalapack.h" + +} +#include <complex> + +static inline void Cdcopy(const int n, double* a, double* b) +{ + int inc = 1; + dcopy_(&n, a, &inc, b, &inc); +} + +static inline void Czcopy(const int n, std::complex<double>* a, std::complex<double>* b) +{ + double _Complex* aa = reinterpret_cast<double _Complex*>(a); + double _Complex* bb = reinterpret_cast<double _Complex*>(b); + int inc = 1; + zcopy_(&n, aa, &inc, bb, &inc); +} + +static inline void Cpddot(int n, + double& dot, + double* a, +
int ia, + int ja, + int inca, + double* b, + int ib, + int jb, + int incb, + int* desc) +{ + pddot_(&n, &dot, a, &ia, &ja, desc, &inca, b, &ib, &jb, desc, &incb); +} + +static inline void Cpzdotc(int n, + std::complex<double>& dotc, + std::complex<double>* a, + int ia, + int ja, + int inca, + std::complex<double>* b, + int ib, + int jb, + int incb, + int* desc) +{ + double _Complex* aa = reinterpret_cast<double _Complex*>(a); + double _Complex* bb = reinterpret_cast<double _Complex*>(b); + double _Complex* dotc_c = reinterpret_cast<double _Complex*>(&dotc); + pzdotc_(&n, dotc_c, aa, &ia, &ja, desc, &inca, bb, &ib, &jb, desc, &incb); +} + +static inline int Cpdpotrf(const char uplo, const int na, double* U, int* desc) +{ + int isrc = 1; + int info; + pdpotrf_(&uplo, &na, U, &isrc, &isrc, desc, &info); + return info; +} + +static inline int Cpzpotrf(const char uplo, const int na, std::complex<double>* U, int* desc) +{ + int isrc = 1; + int info; + double _Complex* uu = reinterpret_cast<double _Complex*>(U); + pzpotrf_(&uplo, &na, uu, &isrc, &isrc, desc, &info); + return info; +} + +static inline void Cpdtrmm(char side, + char uplo, + char trans, + char diag, + int m, + int n, + double alpha, + double* a, + double* b, + int* desc) +{ + int isrc = 1; + pdtrmm_(&side, &uplo, &trans, &diag, &m, &n, &alpha, a, &isrc, &isrc, desc, b, &isrc, &isrc, desc); +} + +static inline void Cpztrmm(char side, + char uplo, + char trans, + char diag, + int m, + int n, + std::complex<double> alpha, + std::complex<double>* a, + std::complex<double>* b, + int* desc) +{ + int isrc = 1; + double _Complex* alpha_c = reinterpret_cast<double _Complex*>(&alpha); + double _Complex* aa = reinterpret_cast<double _Complex*>(a); + double _Complex* bb = reinterpret_cast<double _Complex*>(b); + pztrmm_(&side, &uplo, &trans, &diag, &m, &n, alpha_c, aa, &isrc, &isrc, desc, bb, &isrc, &isrc, desc); +} + +static inline void Cpdgemm(char transa, + char transb, + int m, + int n, + int k, + double alpha, + double* a, + double* b, + double beta, + double* c, + int* desc) +{ + int isrc = 1; + pdgemm_(&transa, + &transb, + &m, + &n, + &k, + &alpha, + a, + &isrc, + &isrc, + desc, + b, +
&isrc, + &isrc, + desc, + &beta, + c, + &isrc, + &isrc, + desc); +} + +static inline void Cpdgemm(char transa, + char transb, + int m, + double alpha, + double* a, + double* b, + double beta, + double* c, + int* desc) +{ + int isrc = 1; + pdgemm_(&transa, + &transb, + &m, + &m, + &m, + &alpha, + a, + &isrc, + &isrc, + desc, + b, + &isrc, + &isrc, + desc, + &beta, + c, + &isrc, + &isrc, + desc); +} + +static inline void Cpzgemm(char transa, + char transb, + int m, + int n, + int k, + std::complex<double> alpha, + std::complex<double>* a, + std::complex<double>* b, + std::complex<double> beta, + std::complex<double>* c, + int* desc) +{ + double _Complex* alpha_c = reinterpret_cast<double _Complex*>(&alpha); + double _Complex* beta_c = reinterpret_cast<double _Complex*>(&beta); + double _Complex* aa = reinterpret_cast<double _Complex*>(a); + double _Complex* bb = reinterpret_cast<double _Complex*>(b); + double _Complex* cc = reinterpret_cast<double _Complex*>(c); + int isrc = 1; + pzgemm_(&transa, + &transb, + &m, + &n, + &k, + alpha_c, + aa, + &isrc, + &isrc, + desc, + bb, + &isrc, + &isrc, + desc, + beta_c, + cc, + &isrc, + &isrc, + desc); +} + +static inline void Cpzgemm(char transa, + char transb, + int m, + std::complex<double> alpha, + std::complex<double>* a, + std::complex<double>* b, + std::complex<double> beta, + std::complex<double>* c, + int* desc) +{ + double _Complex* alpha_c = reinterpret_cast<double _Complex*>(&alpha); + double _Complex* beta_c = reinterpret_cast<double _Complex*>(&beta); + double _Complex* aa = reinterpret_cast<double _Complex*>(a); + double _Complex* bb = reinterpret_cast<double _Complex*>(b); + double _Complex* cc = reinterpret_cast<double _Complex*>(c); + int isrc = 1; + pzgemm_(&transa, + &transb, + &m, + &m, + &m, + alpha_c, + aa, + &isrc, + &isrc, + desc, + bb, + &isrc, + &isrc, + desc, + beta_c, + cc, + &isrc, + &isrc, + desc); +} + +static inline void Cpdsymm(char side, + char uplo, + int m, + int n, + double alpha, + double* a, + double* b, + double beta, + double* c, + int* desc) +{ + int isrc = 1; + pdsymm_(&side, &uplo, &m, &n, &alpha, a, &isrc, &isrc, desc, b, &isrc, &isrc, desc, &beta, c, &isrc, &isrc, desc); +} + +static inline void Cpdsymm(char side, + char uplo, + int na,
+ double alpha, + double* a, + double* b, + double beta, + double* c, + int* desc) +{ + int isrc = 1; + pdsymm_(&side, &uplo, &na, &na, &alpha, a, &isrc, &isrc, desc, b, &isrc, &isrc, desc, &beta, c, &isrc, &isrc, desc); +} + +static inline void Cpzsymm(char side, + char uplo, + int na, + std::complex<double> alpha, + std::complex<double>* a, + std::complex<double>* b, + std::complex<double> beta, + std::complex<double>* c, + int* desc) +{ + double _Complex* alpha_c = reinterpret_cast<double _Complex*>(&alpha); + double _Complex* beta_c = reinterpret_cast<double _Complex*>(&beta); + double _Complex* aa = reinterpret_cast<double _Complex*>(a); + double _Complex* bb = reinterpret_cast<double _Complex*>(b); + double _Complex* cc = reinterpret_cast<double _Complex*>(c); + int isrc = 1; + pzsymm_(&side, + &uplo, + &na, + &na, + alpha_c, + aa, + &isrc, + &isrc, + desc, + bb, + &isrc, + &isrc, + desc, + beta_c, + cc, + &isrc, + &isrc, + desc); +} + +static inline void Cpzhemm(char side, + char uplo, + int na, + std::complex<double> alpha, + std::complex<double>* a, + std::complex<double>* b, + std::complex<double> beta, + std::complex<double>* c, + int* desc) +{ + double _Complex* alpha_c = reinterpret_cast<double _Complex*>(&alpha); + double _Complex* beta_c = reinterpret_cast<double _Complex*>(&beta); + double _Complex* aa = reinterpret_cast<double _Complex*>(a); + double _Complex* bb = reinterpret_cast<double _Complex*>(b); + double _Complex* cc = reinterpret_cast<double _Complex*>(c); + int isrc = 1; + pzhemm_(&side, + &uplo, + &na, + &na, + alpha_c, + aa, + &isrc, + &isrc, + desc, + bb, + &isrc, + &isrc, + desc, + beta_c, + cc, + &isrc, + &isrc, + desc); +} + +static inline void Cpdgemr2d(int M, + int N, + double* a, + int ia, + int ja, + int* desca, + double* b, + int ib, + int jb, + int* descb, + int blacs_ctxt) +{ + pdgemr2d_(&M, &N, a, &ia, &ja, desca, b, &ib, &jb, descb, &blacs_ctxt); +} + +static inline void Cpzgemr2d(int M, + int N, + std::complex<double>* a, + int ia, + int ja, + int* desca, + std::complex<double>* b, + int ib, + int jb, + int* descb, + int blacs_ctxt) +{ + double _Complex* aa = reinterpret_cast<double _Complex*>(a); + double _Complex* bb = reinterpret_cast<double _Complex*>(b); + pzgemr2d_(&M, &N, aa, &ia, &ja, desca, bb, &ib, &jb, descb,
&blacs_ctxt); +} diff --git a/source/module_hsolver/genelpa/pblas.h b/source/module_hsolver/genelpa/pblas.h new file mode 100644 index 0000000000..51ac6f3671 --- /dev/null +++ b/source/module_hsolver/genelpa/pblas.h @@ -0,0 +1,40 @@ +#pragma once +void pddot_(int* n, double* dot, double* x, int* ix, int* jx, int* descx, int* incx, + double* y, int* iy, int* jy, int* descy, int* incy); + +void pzdotc_(int* n, double _Complex* dot, double _Complex* x, int* ix, int* jx, int* descx, int* incx, + double _Complex* y, int* iy, int* jy, int* descy, int* incy); +void pdsymv_(char* uplo, int* n, + double* alpha, double* a, int* ia, int* ja, int* desca, + double* x, int* ix, int* jx, int* descx, int* incx, + double* beta, double* y, int* iy, int* jy, int* descy, int* incy); +void pdtran_(int* m , int* n , + double* alpha , double* a , int* ia , int* ja , int* desca , + double* beta , double* c , int* ic , int* jc , int* descc ); + +void pdgemm_(char* transa , char* transb , int* m , int* n , int* k , + double* alpha , double* a , int* ia , int* ja , int* desca , + double* b , int* ib , int* jb , int* descb , + double* beta , double* c , int* ic , int* jc , int* descc ); +void pzgemm_(char* transa , char* transb , int* m , int* n , int* k , + double _Complex* alpha , double _Complex* a , int* ia , int* ja , int* desca , + double _Complex* b , int* ib , int* jb , int* descb , + double _Complex* beta , double _Complex* c , int* ic , int* jc , int* descc ); +void pdsymm_(char* side , char* uplo , int* m , int* n , + double* alpha , double* a , int* ia , int* ja , int* desca , + double* b , int* ib , int* jb , int* descb , + double* beta , double* c , int* ic , int* jc , int* descc ); +void pzsymm_(char* side , char* uplo , int* m , int* n , + double _Complex* alpha , double _Complex* a , int* ia , int* ja , int* desca , + double _Complex* b , int* ib , int* jb , int* descb , + double _Complex* beta , double _Complex* c , int* ic , int* jc , int* descc ); +void pzhemm_(char* side 
, char* uplo , int* m , int* n , + double _Complex* alpha , double _Complex* a , int* ia , int* ja , int* desca , + double _Complex* b , int* ib , int* jb , int* descb , + double _Complex* beta , double _Complex* c , int* ic , int* jc , int* descc ); +void pdtrmm_(char* side , char* uplo , char* transa , char* diag , int* m , int* n , + double* alpha , double* a , int* ia , int* ja , int* desca , + double* b , int* ib , int* jb , int* descb ); +void pztrmm_(char* side , char* uplo , char* transa , char* diag , int* m , int* n , + double _Complex* alpha , double _Complex* a , int* ia , int* ja , int* desca , + double _Complex* b , int* ib , int* jb , int* descb ); diff --git a/source/module_hsolver/genelpa/scalapack.h b/source/module_hsolver/genelpa/scalapack.h new file mode 100644 index 0000000000..39e18358e3 --- /dev/null +++ b/source/module_hsolver/genelpa/scalapack.h @@ -0,0 +1,12 @@ +#pragma once +// scalapack +int numroc_(const int *N, const int *NB, const int *IPROC, const int *ISRCPROC, const int *NPROCS); +void descinit_(int *DESC, const int *M, const int *N, const int *MB, const int *NB, const int *IRSRC, const int *ICSRC, const int *ICTXT, const int *LLD, int *INFO); +void pdpotrf_(const char *UPLO, const int *N, double *A, const int *IA, const int *JA, const int *DESCA, int *INFO); +void pzpotrf_(const char *UPLO, const int *N, double _Complex *A, const int *IA, const int *JA, const int *DESCA, int *INFO); +void pdsyev_(const char *JOBZ, const char *UPLO, int *N, double *A, int *IA, int *JA, int *DESCA, + double *W, double *Z, int *IZ, int *JZ, int *DESCZ, double *WORK, int *LWORK, int *INFO); +void pdgemr2d_(int *M, int *N, double *A, int *IA, int *JA, int *DESCA, + double *B, int *IB, int *JB, int *DESCB, int *ICTXT); +void pzgemr2d_(int *M, int *N, double _Complex *A, int *IA, int *JA, int *DESCA, + double _Complex *B, int *IB, int *JB, int *DESCB, int *ICTXT); diff --git a/source/module_hsolver/genelpa/utils.cpp 
b/source/module_hsolver/genelpa/utils.cpp new file mode 100644 index 0000000000..77db0a4d41 --- /dev/null +++ b/source/module_hsolver/genelpa/utils.cpp @@ -0,0 +1,351 @@ +#include "utils.h" + +#include "my_math.hpp" + +#include +#include +#include +#include +#include +#include + +void initBlacsGrid(int loglevel, + MPI_Comm comm, + int nFull, + int nblk, + int& blacs_ctxt, + int& narows, + int& nacols, + int desc[]) +{ + std::stringstream outlog; + char BLACS_LAYOUT = 'C'; + int ISRCPROC = 0; // fortran array starts from 1 + int nprows, npcols; + int myprow, mypcol; + int nprocs, myid; + int info; + MPI_Comm_size(comm, &nprocs); + MPI_Comm_rank(comm, &myid); + // set blacs parameters + for (npcols = int(sqrt(double(nprocs))); npcols >= 2; --npcols) + { + if (nprocs % npcols == 0) + break; + } + nprows = nprocs / npcols; + if ((loglevel > 0 && myid == 0) || loglevel > 1) + { + outlog.str(""); + outlog << "myid " << myid << ": nprows: " << nprows << " ; npcols: " << npcols << std::endl; + std::cout << outlog.str(); + } + + // int comm_f = MPI_Comm_c2f(comm); + blacs_ctxt = Csys2blacs_handle(comm); + Cblacs_gridinit(&blacs_ctxt, &BLACS_LAYOUT, nprows, npcols); + if ((loglevel > 0 && myid == 0) || loglevel > 1) + { + outlog.str(""); + outlog << "myid " << myid << ": Cblacs_gridinit done, blacs_ctxt: " << blacs_ctxt << std::endl; + std::cout << outlog.str(); + } + Cblacs_gridinfo(blacs_ctxt, &nprows, &npcols, &myprow, &mypcol); + if ((loglevel > 0 && myid == 0) || loglevel > 1) + { + int mypnum = Cblacs_pnum(blacs_ctxt, myprow, mypcol); + int prow, pcol; + Cblacs_pcoord(blacs_ctxt, myid, &prow, &pcol); + outlog.str(""); + outlog << "myid " << myid << ": myprow: " << myprow << " ;mypcol: " << mypcol << std::endl; + outlog << "myid " << myid << ": mypnum: " << mypnum << std::endl; + outlog << "myid " << myid << ": prow: " << prow << " ;pcol: " << pcol << std::endl; + std::cout << outlog.str(); + } + + narows = numroc_(&nFull, &nblk, &myprow, &ISRCPROC, &nprows); + nacols = 
numroc_(&nFull, &nblk, &mypcol, &ISRCPROC, &npcols); + descinit_(desc, &nFull, &nFull, &nblk, &nblk, &ISRCPROC, &ISRCPROC, &blacs_ctxt, &narows, &info); + + if ((loglevel > 0 && myid == 0) || loglevel > 1) + { + outlog.str(""); + outlog << "myid " << myid << ": narows: " << narows << " nacols: " << nacols << std::endl; + outlog << "myid " << myid << ": blacs parameters setting" << std::endl; + outlog << "myid " << myid << ": desc is: "; + for (int i = 0; i < 9; ++i) + outlog << desc[i] << " "; + outlog << std::endl; + std::cout << outlog.str(); + } +} + +// load matrix from the file +void loadMatrix(const char FileName[], int nFull, double* a, int* desca, int blacs_ctxt) +{ + int nprows, npcols, myprow, mypcol; + Cblacs_gridinfo(blacs_ctxt, &nprows, &npcols, &myprow, &mypcol); + int myid = Cblacs_pnum(blacs_ctxt, myprow, mypcol); + + const int ROOT_PROC = 0; + std::ifstream matrixFile; + if (myid == ROOT_PROC) + matrixFile.open(FileName); + + double* b; // buffer + const int MAX_BUFFER_SIZE = 1e9; // max buffer size is 1GB + + int N = nFull; + int M + = std::max(1, std::min(nFull, (int)(MAX_BUFFER_SIZE / nFull / sizeof(double)))); // at least 1 row, max size 1GB + if (myid == ROOT_PROC) + b = new double[M * N]; + else + b = new double[1]; + + // set descb, which has all elements in the only block in the root process + // block size is M x N, so all elements are in the first process + int descb[9] = {1, blacs_ctxt, M, N, M, N, 0, 0, M}; + + int ja = 1, ib = 1, jb = 1; + for (int ia = 1; ia <= nFull; ia += M) + { + int thisM = std::min(M, nFull - ia + 1); // nFull-ia+1 is the number of remaining rows to be read from file + // read from the file + if (myid == ROOT_PROC) + { + for (int i = 0; i < thisM; ++i) + { + for (int j = 0; j < N; ++j) + { + matrixFile >> b[i + j * M]; + } + } + } + // distribute the rows read on the root process to all processes + Cpdgemr2d(thisM, N, b, ib, jb, descb, a, ia, ja, desca, blacs_ctxt); + } + + if (myid == ROOT_PROC) + matrixFile.close(); + + delete[]
b; +} + +void saveLocalMatrix(const char filePrefix[], int narows, int nacols, double* a) +{ + using namespace std; + char FileName[80]; + int myid; + ofstream matrixFile; + MPI_Comm_rank(MPI_COMM_WORLD, &myid); + + sprintf(FileName, "%s_%3.3d.dat", filePrefix, myid); + matrixFile.open(FileName); + matrixFile.flags(std::ios_base::scientific); + matrixFile.precision(17); + matrixFile.width(24); + for (int i = 0; i < narows; ++i) + { + for (int j = 0; j < nacols; ++j) + { + matrixFile << a[i + j * narows] << " "; + } + matrixFile << std::endl; + } + matrixFile.close(); +} + +// use pdgemr2d to collect matrix from all processes to root process +// and save to one complete matrix file +void saveMatrix(const char FileName[], int nFull, double* a, int* desca, int blacs_ctxt) +{ + int nprows, npcols, myprow, mypcol; + Cblacs_gridinfo(blacs_ctxt, &nprows, &npcols, &myprow, &mypcol); + int myid = Cblacs_pnum(blacs_ctxt, myprow, mypcol); + + const int ROOT_PROC = 0; + std::ofstream matrixFile; + if (myid == ROOT_PROC) // setup saved matrix format + { + matrixFile.open(FileName); + matrixFile.flags(std::ios_base::scientific); + matrixFile.precision(17); + matrixFile.width(24); + } + + double* b; // buffer + const int MAX_BUFFER_SIZE = 1e9; // max buffer size is 1GB + + int N = nFull; + int M + = std::max(1, std::min(nFull, (int)(MAX_BUFFER_SIZE / nFull / sizeof(double)))); // at least 1 row, max size 1GB + if (myid == ROOT_PROC) + b = new double[M * N]; + else + b = new double[1]; + + // set descb, which has all elements in the only block in the root process + int descb[9] = {1, blacs_ctxt, M, N, M, N, 0, 0, M}; + + int ja = 1, ib = 1, jb = 1; + for (int ia = 1; ia <= nFull; ia += M) + { + int thisM = std::min(M, nFull - ia + 1); // nFull-ia+1 is the number of remaining rows to be saved + // gather data rows by rows from all processes + Cpdgemr2d(thisM, N, a, ia, ja, desca, b, ib, jb, descb, blacs_ctxt); + // write to the file + if (myid == ROOT_PROC) + { + for (int i = 0; i < thisM;
++i) + { + for (int j = 0; j < N; ++j) + { + matrixFile << b[i + j * M] << " "; + } + matrixFile << std::endl; + } + } + } + + if (myid == ROOT_PROC) + matrixFile.close(); + + delete[] b; +} + +// load matrix from the file +void loadMatrix(const char FileName[], int nFull, std::complex<double>* a, int* desca, int blacs_ctxt) +{ + int nprows, npcols, myprow, mypcol; + Cblacs_gridinfo(blacs_ctxt, &nprows, &npcols, &myprow, &mypcol); + int myid = Cblacs_pnum(blacs_ctxt, myprow, mypcol); + + const int ROOT_PROC = 0; + std::ifstream matrixFile; + if (myid == ROOT_PROC) + matrixFile.open(FileName); + + std::complex<double>* b; // buffer + const int MAX_BUFFER_SIZE = 1e9; // max buffer size is 1GB + + int N = nFull; + int M = std::max( + 1, + std::min(nFull, (int)(MAX_BUFFER_SIZE / nFull / (2 * sizeof(double))))); // at least 1 row, max size 1GB + if (myid == ROOT_PROC) + b = new std::complex<double>[M * N]; + else + b = new std::complex<double>[1]; + + // set descb, which has all elements in the only block in the root process + // block size is M x N, so all elements are in the first process + int descb[9] = {1, blacs_ctxt, M, N, M, N, 0, 0, M}; + + int ja = 1, ib = 1, jb = 1; + for (int ia = 1; ia <= nFull; ia += M) + { + int thisM = std::min(M, nFull - ia + 1); // nFull-ia+1 is the number of remaining rows to be read from file + // read from the file + if (myid == ROOT_PROC) + { + for (int i = 0; i < thisM; ++i) + { + for (int j = 0; j < N; ++j) + { + matrixFile >> b[i + j * M]; + } + } + } + // distribute the rows read on the root process to all processes + Cpzgemr2d(thisM, N, b, ib, jb, descb, a, ia, ja, desca, blacs_ctxt); + } + + if (myid == ROOT_PROC) + matrixFile.close(); + + delete[] b; +} + +void saveLocalMatrix(const char filePrefix[], int narows, int nacols, std::complex<double>* a) +{ + using namespace std; + char FileName[80]; + int myid; + ofstream matrixFile; + + MPI_Comm_rank(MPI_COMM_WORLD, &myid); + + sprintf(FileName, "%s_%3.3d.dat", filePrefix, myid); + matrixFile.open(FileName); +
matrixFile.flags(std::ios_base::scientific); + matrixFile.precision(17); + matrixFile.width(24); + for (int i = 0; i < narows; ++i) + { + for (int j = 0; j < nacols; ++j) + { + matrixFile << a[i + j * narows] << " "; + } + matrixFile << std::endl; + } + matrixFile.close(); +} + +// use pzgemr2d to collect matrix from all processes to root process +// and save to one complete matrix file +void saveMatrix(const char FileName[], int nFull, std::complex<double>* a, int* desca, int blacs_ctxt) +{ + int nprows, npcols, myprow, mypcol; + Cblacs_gridinfo(blacs_ctxt, &nprows, &npcols, &myprow, &mypcol); + int myid = Cblacs_pnum(blacs_ctxt, myprow, mypcol); + + const int ROOT_PROC = 0; + std::ofstream matrixFile; + if (myid == ROOT_PROC) // setup saved matrix format + { + matrixFile.open(FileName); + matrixFile.flags(std::ios_base::scientific); + matrixFile.precision(17); + matrixFile.width(24); + } + + std::complex<double>* b; // buffer + const int MAX_BUFFER_SIZE = 1e9; // max buffer size is 1GB + + int N = nFull; + int M + = std::max(1, std::min(nFull, (int)(MAX_BUFFER_SIZE / nFull / (2 * sizeof(double))))); // at least 1 row, max size 1GB + if (myid == ROOT_PROC) + b = new std::complex<double>[M * N]; + else + b = new std::complex<double>[1]; + + // set descb, which has all elements in the only block in the root process + int descb[9] = {1, blacs_ctxt, M, N, M, N, 0, 0, M}; + + int ja = 1, ib = 1, jb = 1; + for (int ia = 1; ia <= nFull; ia += M) + { + int transM = std::min(M, nFull - ia + 1); // nFull-ia+1 is the number of remaining rows to be saved + // gather data rows by rows from all processes + Cpzgemr2d(transM, N, a, ia, ja, desca, b, ib, jb, descb, blacs_ctxt); + // write to the file + if (myid == ROOT_PROC) + { + for (int i = 0; i < transM; ++i) + { + for (int j = 0; j < N; ++j) + { + matrixFile << b[i + j * M] << " "; + } + matrixFile << std::endl; + } + } + } + + if (myid == ROOT_PROC) + matrixFile.close(); + + delete[] b; +} diff --git a/source/module_hsolver/genelpa/utils.h
b/source/module_hsolver/genelpa/utils.h new file mode 100644 index 0000000000..412bcdaca3 --- /dev/null +++ b/source/module_hsolver/genelpa/utils.h @@ -0,0 +1,43 @@ +#pragma once +#include <complex> +#include <mpi.h> + +static inline int globalIndex(int localIndex, int nblk, int nprocs, int myproc) +{ + int iblock, gIndex; + iblock = localIndex / nblk; + gIndex = (iblock * nprocs + myproc) * nblk + localIndex % nblk; + return gIndex; +} + +static inline int localIndex(int globalIndex, int nblk, int nprocs, int& localProc) +{ + localProc = int((globalIndex % (nblk * nprocs)) / nblk); + return int(globalIndex / (nblk * nprocs)) * nblk + globalIndex % nblk; +} + +void initBlacsGrid(int loglevel, + MPI_Comm comm, + int nFull, + int nblk, + int& blacs_ctxt, + int& narows, + int& nacols, + int desc[]); + +// load matrix from the file +void loadMatrix(const char FileName[], int nFull, double* a, int* desca, int blacs_ctxt); + +void saveLocalMatrix(const char filePrefix[], int narows, int nacols, double* a); + +// use pdgemr2d to collect matrix from all processes to root process +// and save to one complete matrix file +void saveMatrix(const char FileName[], int nFull, double* a, int* desca, int blacs_ctxt); + +void loadMatrix(const char FileName[], int nFull, std::complex<double>* a, int* desca, int blacs_ctxt); + +void saveLocalMatrix(const char filePrefix[], int narows, int nacols, std::complex<double>* a); + +// use pzgemr2d to collect matrix from all processes to root process +// and save to one complete matrix file +void saveMatrix(const char FileName[], int nFull, std::complex<double>* a, int* desca, int blacs_ctxt); diff --git a/source/module_hsolver/hsolver_pw.cpp b/source/module_hsolver/hsolver_pw.cpp index 911d15ac7a..84bc9552e5 100644 --- a/source/module_hsolver/hsolver_pw.cpp +++ b/source/module_hsolver/hsolver_pw.cpp @@ -158,7 +158,7 @@ void HSolverPW::hamiltSolvePsiK(hamilt::Hamilt* hm, psi::Psi &h_diag, const int ik, const int npw) { - h_diag.resize(h_diag.size(), 1.0); +
h_diag.assign(h_diag.size(), 1.0); int precondition_type = 2; const double tpiba2 = this->wfc_basis->tpiba2; diff --git a/source/module_hsolver/test/CMakeLists.txt b/source/module_hsolver/test/CMakeLists.txt index 710efdbd46..39000d9330 100644 --- a/source/module_hsolver/test/CMakeLists.txt +++ b/source/module_hsolver/test/CMakeLists.txt @@ -16,7 +16,7 @@ AddTest( ) AddTest( TARGET HSolver_LCAO - LIBS ${math_libs} ELPA::ELPA base + LIBS ${math_libs} ELPA::ELPA base genelpa SOURCES diago_lcao_test.cpp ../diago_elpa.cpp ../diago_blas.cpp ../../src_parallel/parallel_global.cpp ../../src_parallel/parallel_common.cpp ../../src_parallel/parallel_reduce.cpp ) diff --git a/source/module_surchem/H_correction_pw.cpp b/source/module_surchem/H_correction_pw.cpp index 6ad2b036bc..1aaf2e6bbe 100644 --- a/source/module_surchem/H_correction_pw.cpp +++ b/source/module_surchem/H_correction_pw.cpp @@ -8,7 +8,7 @@ #include ModuleBase::matrix surchem::v_correction(const UnitCell &cell, - ModulePW::PW_Basis* rho_basis, + ModulePW::PW_Basis *rho_basis, const int &nspin, const double *const *const rho) { @@ -51,7 +51,14 @@ ModuleBase::matrix surchem::v_correction(const UnitCell &cell, return v; } -void surchem::add_comp_chg(const UnitCell &cell, ModulePW::PW_Basis* rho_basis, double q, double l, double center, complex *NG, int dim) +void surchem::add_comp_chg(const UnitCell &cell, + ModulePW::PW_Basis *rho_basis, + double q, + double l, + double center, + complex *NG, + int dim, + bool flag) { // x dim double tmp_q = 0.0; @@ -62,18 +69,24 @@ void surchem::add_comp_chg(const UnitCell &cell, ModulePW::PW_Basis* rho_basis, ModuleBase::GlobalFunc::ZEROS(NG, rho_basis->npw); for (int ig = 0; ig < rho_basis->npw; ig++) { - if(ig==rho_basis->ig_gge0) + if (ig == rho_basis->ig_gge0) + { + if(flag) + { + NG[ig] = complex(tmp_q * l / L, 0.0); + } continue; + } double GX = rho_basis->gcar[ig][0]; double GY = rho_basis->gcar[ig][1]; double GZ = rho_basis->gcar[ig][2]; GX = GX * 2 * ModuleBase::PI; if 
(GY == 0 && GZ == 0 && GX != 0) { - NG[ig] = exp(ModuleBase::NEG_IMAG_UNIT * GX * center) * complex(2.0 * tmp_q * sin(GX * l / 2.0) / (L * GX), 0.0); + NG[ig] = exp(ModuleBase::NEG_IMAG_UNIT * GX * center) + * complex(2.0 * tmp_q * sin(GX * l / 2.0) / (L * GX), 0.0); } } - // NG[0] = complex(tmp_q * l / L, 0.0); } // y dim else if (dim == 1) @@ -83,71 +96,123 @@ void surchem::add_comp_chg(const UnitCell &cell, ModulePW::PW_Basis* rho_basis, ModuleBase::GlobalFunc::ZEROS(NG, rho_basis->npw); for (int ig = 0; ig < rho_basis->npw; ig++) { - if(ig==rho_basis->ig_gge0) + if (ig == rho_basis->ig_gge0) + { + if(flag) + { + NG[ig] = complex(tmp_q * l / L, 0.0); + } continue; + } double GX = rho_basis->gcar[ig][0]; double GY = rho_basis->gcar[ig][1]; double GZ = rho_basis->gcar[ig][2]; GY = GY * 2 * ModuleBase::PI; if (GX == 0 && GZ == 0 && GY != 0) { - NG[ig] = exp(ModuleBase::NEG_IMAG_UNIT * GY * center) * complex(2.0 * tmp_q * sin(GY * l / 2.0) / (L * GY), 0.0); + NG[ig] = exp(ModuleBase::NEG_IMAG_UNIT * GY * center) + * complex(2.0 * tmp_q * sin(GY * l / 2.0) / (L * GY), 0.0); } } - // NG[0] = complex(tmp_q * l / L, 0.0); } // z dim else if (dim == 2) { double L = cell.a3[2]; - // cout << "area" << cross(cell.a1, cell.a2).norm() << endl; tmp_q = q / (cross(cell.a1, cell.a2).norm() * l); ModuleBase::GlobalFunc::ZEROS(NG, rho_basis->npw); for (int ig = 0; ig < rho_basis->npw; ig++) { - if(ig==rho_basis->ig_gge0) + if (ig == rho_basis->ig_gge0) + { + if(flag) + { + NG[ig] = complex(tmp_q * l / L, 0.0); + } continue; + } double GX = rho_basis->gcar[ig][0]; double GY = rho_basis->gcar[ig][1]; double GZ = rho_basis->gcar[ig][2]; GZ = GZ * 2 * ModuleBase::PI; if (GX == 0 && GY == 0 && GZ != 0) { - NG[ig] = exp(ModuleBase::NEG_IMAG_UNIT * GZ * center) * complex(2.0 * tmp_q * sin(GZ * l / 2.0) / (L * GZ), 0.0); + NG[ig] = exp(ModuleBase::NEG_IMAG_UNIT * GZ * center) + * complex(2.0 * tmp_q * sin(GZ * l / 2.0) / (L * GZ), 0.0); } } - // NG[0] = complex(tmp_q * l / L, 0.0); } } 
-ModuleBase::matrix surchem::v_compensating(const UnitCell &cell, ModulePW::PW_Basis *rho_basis) +ModuleBase::matrix surchem::v_compensating(const UnitCell &cell, + ModulePW::PW_Basis *rho_basis, + const int &nspin, + const double *const *const rho) { ModuleBase::TITLE("surchem", "v_compensating"); ModuleBase::timer::tick("surchem", "v_compensating"); + // calculating v_comp also need TOTN_real + double *Porter = new double[rho_basis->nrxx]; + for (int i = 0; i < rho_basis->nrxx; i++) + Porter[i] = 0.0; + const int nspin0 = (nspin == 2) ? 2 : 1; + for (int is = 0; is < nspin0; is++) + for (int ir = 0; ir < rho_basis->nrxx; ir++) + Porter[ir] += rho[is][ir]; + + complex *Porter_g = new complex[rho_basis->npw]; + ModuleBase::GlobalFunc::ZEROS(Porter_g, rho_basis->npw); + + rho_basis->real2recip(Porter, Porter_g); + + complex *N = new complex[rho_basis->npw]; + complex *TOTN = new complex[rho_basis->npw]; + + cal_totn(cell, rho_basis, Porter_g, N, TOTN); + + // save TOTN in real space + rho_basis->recip2real(TOTN, this->TOTN_real); + complex *comp_reci = new complex[rho_basis->npw]; complex *phi_comp_G = new complex[rho_basis->npw]; - double *phi_comp_R = new double[rho_basis->nrxx]; ModuleBase::GlobalFunc::ZEROS(comp_reci, rho_basis->npw); ModuleBase::GlobalFunc::ZEROS(phi_comp_G, rho_basis->npw); ModuleBase::GlobalFunc::ZEROS(phi_comp_R, rho_basis->nrxx); - // get comp chg in reci space - add_comp_chg(cell, rho_basis, comp_q, comp_l, comp_center, comp_reci, comp_dim); - double ecomp = 0.0; + // get compensating charge in reci space + add_comp_chg(cell, rho_basis, comp_q, comp_l, comp_center, comp_reci, comp_dim, true); + // save compensating charge in real space + rho_basis->recip2real(comp_reci, this->comp_real); + + // test sum of comp_real -> 0 + // for (int i = 0; i < rho_basis->nz;i++) + // { + // cout << comp_real[i] << endl; + // } + // double sum = 0; + // for (int i = 0; i < rho_basis->nxyz; i++) + // { + // sum += TOTN_real[i]; + // } + // sum = sum * 
cell.omega / rho_basis->nxyz; + // cout << "sum:" << sum << endl; + // int pp; + // cin >> pp; + for (int ig = 0; ig < rho_basis->npw; ig++) { - if (rho_basis->gg[ig] >= 1.0e-12) // LiuXh 20180410 + if (ig == rho_basis->ig_gge0) + { + // cout << ig << endl; + continue; + } + else { const double fac = ModuleBase::e2 * ModuleBase::FOUR_PI / (cell.tpiba2 * rho_basis->gg[ig]); - ecomp += (conj(comp_reci[ig]) * comp_reci[ig]).real() * fac; phi_comp_G[ig] = fac * comp_reci[ig]; } } - Parallel_Reduce::reduce_double_pool(ecomp); - ecomp *= 0.5 * cell.omega; - // std::cout << " ecomp=" << ecomp << std::endl; - comp_chg_energy = ecomp; rho_basis->recip2real(phi_comp_G, phi_comp_R); @@ -166,95 +231,64 @@ ModuleBase::matrix surchem::v_compensating(const UnitCell &cell, ModulePW::PW_Ba delete[] comp_reci; delete[] phi_comp_G; - delete[] phi_comp_R; + delete[] Porter; + delete[] Porter_g; + delete[] N; + delete[] TOTN; ModuleBase::timer::tick("surchem", "v_compensating"); return v_comp; } -void test_print(double* tmp, ModulePW::PW_Basis *rho_basis) -{ - for (int i = 0; i < rho_basis->nz; i++) - { - cout << tmp[i] << endl; - } -} - -void surchem::test_V_to_N(ModuleBase::matrix &v, - const UnitCell &cell, - ModulePW::PW_Basis *rho_basis, - const double *const *const rho) +void surchem::cal_comp_force(ModuleBase::matrix &force_comp, ModulePW::PW_Basis *rho_basis) { - double *phi_comp_R = new double[rho_basis->nrxx]; - complex *phi_comp_G = new complex[rho_basis->npw]; - complex *comp_reci = new complex[rho_basis->npw]; - double *N_real = new double[rho_basis->nrxx]; - - ModuleBase::GlobalFunc::ZEROS(phi_comp_R, rho_basis->nrxx); - ModuleBase::GlobalFunc::ZEROS(phi_comp_G, rho_basis->npw); - ModuleBase::GlobalFunc::ZEROS(comp_reci, rho_basis->npw); - ModuleBase::GlobalFunc::ZEROS(N_real, rho_basis->nrxx); - - for (int ir = 0; ir < rho_basis->nz; ir++) - { - cout << v(0, ir) << endl; - } - - for (int ir = 0; ir < rho_basis->nrxx; ir++) - { - phi_comp_R[ir] = v(0, ir); - } - + int iat 
= 0; + std::complex *N = new std::complex[rho_basis->npw]; + std::complex *phi_comp_G = new complex[rho_basis->npw]; + std::complex *vloc_at = new std::complex[rho_basis->npw]; rho_basis->real2recip(phi_comp_R, phi_comp_G); - for (int ig = 0; ig < rho_basis->npw; ig++) + + for (int it = 0; it < GlobalC::ucell.ntype; it++) { - if (rho_basis->gg[ig] >= 1.0e-12) // LiuXh 20180410 + for (int ia = 0; ia < GlobalC::ucell.atoms[it].na; ia++) { - const double fac = ModuleBase::e2 * ModuleBase::FOUR_PI / (cell.tpiba2 * rho_basis->gg[ig]); - comp_reci[ig] = phi_comp_G[ig] / fac; - } - } - rho_basis->recip2real(comp_reci, N_real); - - complex *vloc_g = new complex[rho_basis->npw]; - complex *ng = new complex[rho_basis->npw]; - ModuleBase::GlobalFunc::ZEROS(vloc_g, rho_basis->npw); - ModuleBase::GlobalFunc::ZEROS(ng, rho_basis->npw); - double* Porter = new double[rho_basis->nrxx]; - for (int ir = 0; ir < rho_basis->nrxx; ir++) - Porter[ir] = rho[0][ir]; + // cout << GlobalC::ucell.atoms[it].zv << endl; + for (int ig = 0; ig < rho_basis->npw; ig++) + { + complex phase = exp( ModuleBase::NEG_IMAG_UNIT *ModuleBase::TWO_PI * ( rho_basis->gcar[ig] * GlobalC::ucell.atoms[it].tau[ia])); + //vloc for each atom + vloc_at[ig] = GlobalC::ppcell.vloc(it, rho_basis->ig2igg[ig]) * phase; + if(rho_basis->ig_gge0 == ig) + { + N[ig] = GlobalC::ucell.atoms[it].zv / GlobalC::ucell.omega; + } + else + { + const double fac + = ModuleBase::e2 * ModuleBase::FOUR_PI / (GlobalC::ucell.tpiba2 * rho_basis->gg[ig]); + + N[ig] = -vloc_at[ig] / fac; + } + + //force for each atom + force_comp(iat, 0) += rho_basis->gcar[ig][0] * imag(conj(phi_comp_G[ig]) * N[ig]); + force_comp(iat, 1) += rho_basis->gcar[ig][1] * imag(conj(phi_comp_G[ig]) * N[ig]); + force_comp(iat, 2) += rho_basis->gcar[ig][2] * imag(conj(phi_comp_G[ig]) * N[ig]); + } + + force_comp(iat, 0) *= (GlobalC::ucell.tpiba * GlobalC::ucell.omega); + force_comp(iat, 1) *= (GlobalC::ucell.tpiba * GlobalC::ucell.omega); + force_comp(iat, 2) *= 
(GlobalC::ucell.tpiba * GlobalC::ucell.omega); - rho_basis->real2recip(GlobalC::pot.vltot,vloc_g);// now n is vloc in Recispace - for (int ig = 0; ig < rho_basis->npw; ig++) { - if (rho_basis->gg[ig] >= 1.0e-12) // LiuXh 20180410 - { - const double fac = ModuleBase::e2 * ModuleBase::FOUR_PI / - (cell.tpiba2 * rho_basis->gg[ig]); + // cout << "Force1(Ry / Bohr)" << iat << ":" + // << " " << force_comp(iat, 0) << " " << force_comp(iat, 1) << " " << force_comp(iat, 2) << endl; - ng[ig] = -vloc_g[ig] / fac; + ++iat; } } - double *nr = new double[rho_basis->nrxx]; - rho_basis->recip2real(ng, nr); - - double *diff = new double[rho_basis->nrxx]; - double *diff2 = new double[rho_basis->nrxx]; - for (int i = 0; i < rho_basis->nrxx; i++) - { - diff[i] = N_real[i] - nr[i]; - diff2[i] = N_real[i] - Porter[i]; - } - - for (int i = 0; i < rho_basis->nrxx;i++) - { - diff[i] -= Porter[i]; - } - - delete[] phi_comp_R; + Parallel_Reduce::reduce_double_pool(force_comp.c, force_comp.nr * force_comp.nc); + delete[] vloc_at; + delete[] N; delete[] phi_comp_G; - delete[] comp_reci; - delete[] diff; - delete[] vloc_g; - delete[] Porter; -} +} \ No newline at end of file diff --git a/source/module_surchem/cal_totn.cpp b/source/module_surchem/cal_totn.cpp index e2a183fab3..3cea7e6a0d 100644 --- a/source/module_surchem/cal_totn.cpp +++ b/source/module_surchem/cal_totn.cpp @@ -3,7 +3,7 @@ void surchem::cal_totn(const UnitCell &cell, ModulePW::PW_Basis* rho_basis, const complex *Porter_g, complex *N, complex *TOTN) { - // vloc to N8 + // vloc to N complex *vloc_g = new complex[rho_basis->npw]; ModuleBase::GlobalFunc::ZEROS(vloc_g, rho_basis->npw); @@ -25,7 +25,6 @@ void surchem::cal_totn(const UnitCell &cell, ModulePW::PW_Basis* rho_basis, TOTN[ig] = N[ig] - Porter_g[ig]; } - // delete[] comp_real; delete[] vloc_g; return; } \ No newline at end of file diff --git a/source/module_surchem/cal_vel.cpp b/source/module_surchem/cal_vel.cpp index fe6b246cc1..88f246053d 100644 --- 
a/source/module_surchem/cal_vel.cpp +++ b/source/module_surchem/cal_vel.cpp @@ -57,7 +57,6 @@ ModuleBase::matrix surchem::cal_vel(const UnitCell &cell, ModuleBase::TITLE("surchem", "cal_vel"); ModuleBase::timer::tick("surchem", "cal_vel"); - // double *TOTN_real = new double[pwb.nrxx]; rho_basis->recip2real(TOTN, TOTN_real); // -4pi * TOTN(G) @@ -93,7 +92,6 @@ ModuleBase::matrix surchem::cal_vel(const UnitCell &cell, double *phi_tilda_R = new double[rho_basis->nrxx]; double *phi_tilda_R0 = new double[rho_basis->nrxx]; - // double *delta_phi_R = new double[pwb.nrxx]; rho_basis->recip2real(Sol_phi, phi_tilda_R); rho_basis->recip2real(Sol_phi0, phi_tilda_R0); @@ -144,11 +142,8 @@ ModuleBase::matrix surchem::cal_vel(const UnitCell &cell, delete[] epsilon; delete[] epsilon0; delete[] tmp_Vel; - // delete[] Vel2; - // delete[] TOTN_real; delete[] phi_tilda_R; delete[] phi_tilda_R0; - // delete[] delta_phi_R; ModuleBase::timer::tick("surchem", "cal_vel"); return Vel; diff --git a/source/module_surchem/corrected_energy.cpp b/source/module_surchem/corrected_energy.cpp index 7f0aaabe6f..55894e8fa3 100644 --- a/source/module_surchem/corrected_energy.cpp +++ b/source/module_surchem/corrected_energy.cpp @@ -1,6 +1,6 @@ #include "surchem.h" -double surchem::cal_Ael(const UnitCell &cell, ModulePW::PW_Basis* rho_basis) +double surchem::cal_Ael(const UnitCell &cell, ModulePW::PW_Basis *rho_basis) { double Ael = 0.0; for (int ir = 0; ir < rho_basis->nrxx; ir++) @@ -8,17 +8,100 @@ double surchem::cal_Ael(const UnitCell &cell, ModulePW::PW_Basis* rho_basis) Ael -= TOTN_real[ir] * delta_phi[ir]; } Parallel_Reduce::reduce_double_pool(Ael); - Ael = Ael * cell.omega / rho_basis->nxyz; // unit Ry - //cout << "Ael: " << Ael << endl; + Ael = Ael * cell.omega / rho_basis->nxyz; + // cout << "Ael: " << Ael << endl; return Ael; } -double surchem::cal_Acav(const UnitCell &cell, ModulePW::PW_Basis* rho_basis) +double surchem::cal_Acav(const UnitCell &cell, ModulePW::PW_Basis *rho_basis) { double 
Acav = 0.0; Acav = GlobalV::tau * qs; - Acav = Acav * cell.omega / rho_basis->nxyz; // unit Ry + Acav = Acav * cell.omega / rho_basis->nxyz; // unit Ry Parallel_Reduce::reduce_double_pool(Acav); - //cout << "Acav: " << Acav << endl; + // cout << "Acav: " << Acav << endl; return Acav; +} + +void surchem::cal_Acomp(const UnitCell &cell, + ModulePW::PW_Basis *rho_basis, + const double *const *const rho, + vector &res) +{ + double Acomp1 = 0.0; // self + double Acomp2 = 0.0; // electrons + double Acomp3 = 0.0; // nuclear + + complex *phi_comp_G = new complex[rho_basis->npw]; + complex *comp_reci = new complex[rho_basis->npw]; + double *phi_comp_R = new double[rho_basis->nrxx]; + + ModuleBase::GlobalFunc::ZEROS(phi_comp_G, rho_basis->npw); + ModuleBase::GlobalFunc::ZEROS(comp_reci, rho_basis->npw); + ModuleBase::GlobalFunc::ZEROS(phi_comp_R, rho_basis->nrxx); + + // part1: comp & comp + rho_basis->real2recip(comp_real, comp_reci); + for (int ig = 0; ig < rho_basis->npw; ig++) + { + if (rho_basis->gg[ig] >= 1.0e-12) // LiuXh 20180410 + { + const double fac = ModuleBase::e2 * ModuleBase::FOUR_PI / (cell.tpiba2 * rho_basis->gg[ig]); + Acomp1 += (conj(comp_reci[ig]) * comp_reci[ig]).real() * fac; + phi_comp_G[ig] = fac * comp_reci[ig]; + } + } + // 0.5 for double counting + Parallel_Reduce::reduce_double_pool(Acomp1); + Acomp1 *= 0.5 * cell.omega; + + // electrons + double *n_elec_R = new double[rho_basis->nrxx]; + for (int i = 0; i < rho_basis->nrxx; i++) + n_elec_R[i] = 0.0; + const int nspin0 = (GlobalV::NSPIN == 2) ? 
2 : 1; + for (int is = 0; is < nspin0; is++) + for (int ir = 0; ir < rho_basis->nrxx; ir++) + n_elec_R[ir] += rho[is][ir]; + + // nuclear = TOTN_R + n_elec_R + double *n_nucl_R = new double[rho_basis->nrxx]; + for (int ir = 0; ir < rho_basis->nrxx; ir++) + { + n_nucl_R[ir] = TOTN_real[ir] + n_elec_R[ir]; + } + + // part2: electrons + rho_basis->recip2real(phi_comp_G, phi_comp_R); + for (int ir = 0; ir < rho_basis->nrxx; ir++) + { + Acomp2 += n_elec_R[ir] * phi_comp_R[ir]; + } + Parallel_Reduce::reduce_double_pool(Acomp2); + Acomp2 = Acomp2 * cell.omega / rho_basis->nxyz; + + // part3: nuclear + for (int ir = 0; ir < rho_basis->nrxx; ir++) + { + Acomp3 += n_nucl_R[ir] * phi_comp_R[ir]; + } + Parallel_Reduce::reduce_double_pool(Acomp3); + Acomp3 = Acomp3 * cell.omega / rho_basis->nxyz; + + delete[] phi_comp_G; + delete[] phi_comp_R; + delete[] comp_reci; + + delete[] n_elec_R; + delete[] n_nucl_R; + + // cout << "Acomp1(self, Ry): " << Acomp1 << endl; + // cout << "Acomp1(electrons, Ry): " << Acomp2 << endl; + // cout << "Acomp1(nuclear, Ry): " << Acomp3 << endl; + + res[0] = Acomp1; + res[1] = Acomp2; + res[2] = -Acomp3; + + // return Acomp1 + Acomp2 - Acomp3; } \ No newline at end of file diff --git a/source/module_surchem/surchem.cpp b/source/module_surchem/surchem.cpp index 83f199e5ac..77de8158a3 100644 --- a/source/module_surchem/surchem.cpp +++ b/source/module_surchem/surchem.cpp @@ -2,7 +2,7 @@ namespace GlobalC { - surchem solvent_model; +surchem solvent_model; } surchem::surchem() @@ -10,10 +10,11 @@ surchem::surchem() TOTN_real = nullptr; delta_phi = nullptr; epspot = nullptr; + comp_real = nullptr; + phi_comp_R = nullptr; Vcav = ModuleBase::matrix(); Vel = ModuleBase::matrix(); qs = 0; - comp_chg_energy = 0; } void surchem::allocate(const int &nrxx, const int &nspin) @@ -24,20 +25,32 @@ void surchem::allocate(const int &nrxx, const int &nspin) delete[] TOTN_real; delete[] delta_phi; delete[] epspot; - if(nrxx > 0) + delete[] comp_real; + delete[] 
phi_comp_R; + if (nrxx > 0) { TOTN_real = new double[nrxx]; delta_phi = new double[nrxx]; epspot = new double[nrxx]; + comp_real = new double[nrxx]; + phi_comp_R = new double[nrxx]; } else - TOTN_real = delta_phi = epspot = nullptr; + { + TOTN_real = nullptr; + delta_phi = nullptr; + epspot = nullptr; + comp_real = nullptr; + phi_comp_R = nullptr; + } Vcav.create(nspin, nrxx); Vel.create(nspin, nrxx); ModuleBase::GlobalFunc::ZEROS(delta_phi, nrxx); ModuleBase::GlobalFunc::ZEROS(TOTN_real, nrxx); ModuleBase::GlobalFunc::ZEROS(epspot, nrxx); + ModuleBase::GlobalFunc::ZEROS(comp_real, nrxx); + ModuleBase::GlobalFunc::ZEROS(phi_comp_R, nrxx); return; } @@ -45,4 +58,29 @@ surchem::~surchem() { delete[] TOTN_real; delete[] delta_phi; + delete[] epspot; + delete[] comp_real; + delete[] phi_comp_R; +} + +void surchem::get_totn_reci(const UnitCell &cell, ModulePW::PW_Basis *rho_basis, complex *totn_reci) +{ + double *tmp_totn_real = new double[rho_basis->nrxx]; + double *tmp_comp_real = new double[rho_basis->nrxx]; + complex *comp_reci = new complex[rho_basis->npw]; + ModuleBase::GlobalFunc::ZEROS(tmp_totn_real, rho_basis->nrxx); + ModuleBase::GlobalFunc::ZEROS(tmp_comp_real, rho_basis->nrxx); + ModuleBase::GlobalFunc::ZEROS(comp_reci, rho_basis->npw); + add_comp_chg(cell, rho_basis, comp_q, comp_l, comp_center, comp_reci, comp_dim, false); + rho_basis->recip2real(comp_reci, tmp_comp_real); + + for (int ir = 0; ir < rho_basis->nrxx;ir++) + { + tmp_totn_real[ir] = TOTN_real[ir] + tmp_comp_real[ir]; + } + + rho_basis->real2recip(tmp_totn_real, totn_reci); + delete[] tmp_totn_real; + delete[] tmp_comp_real; + delete[] comp_reci; } \ No newline at end of file diff --git a/source/module_surchem/surchem.h b/source/module_surchem/surchem.h index 79a8ae85d4..93496fac7f 100644 --- a/source/module_surchem/surchem.h +++ b/source/module_surchem/surchem.h @@ -5,12 +5,12 @@ #include "../module_base/global_variable.h" #include "../module_base/matrix.h" #include "../module_cell/unitcell.h" 
+#include "../module_pw/pw_basis.h" #include "../src_parallel/parallel_reduce.h" #include "../src_pw/global.h" #include "../src_pw/structure_factor.h" #include "../src_pw/use_fft.h" #include "atom_in.h" -#include "../module_pw/pw_basis.h" class surchem { @@ -25,8 +25,9 @@ class surchem ModuleBase::matrix Vel; double qs; - // energy of compensating charge - double comp_chg_energy; + // compensating charge (in real space, used to cal_Acomp) + double *comp_real; + double *phi_comp_R; // compensating charge params double comp_q; @@ -38,12 +39,12 @@ class surchem void allocate(const int &nrxx, const int &nspin); - void cal_epsilon(ModulePW::PW_Basis* rho_basis, const double *PS_TOTN_real, double *epsilon, double *epsilon0); + void cal_epsilon(ModulePW::PW_Basis *rho_basis, const double *PS_TOTN_real, double *epsilon, double *epsilon0); void cal_pseudo(const UnitCell &cell, - ModulePW::PW_Basis* rho_basis, - const complex *Porter_g, - complex *PS_TOTN); + ModulePW::PW_Basis *rho_basis, + const complex *Porter_g, + complex *PS_TOTN); void add_comp_chg(const UnitCell &cell, ModulePW::PW_Basis *rho_basis, @@ -51,64 +52,85 @@ class surchem double l, double center, complex *NG, - int dim); + int dim, + bool flag); // Set value of comp_reci[ig_gge0] when flag is true. 
+ + void cal_comp_force(ModuleBase::matrix &force_comp, ModulePW::PW_Basis *rho_basis); - void gauss_charge(const UnitCell &cell, ModulePW::PW_Basis* rho_basis, complex *N); + void gauss_charge(const UnitCell &cell, ModulePW::PW_Basis *rho_basis, complex *N); void cal_totn(const UnitCell &cell, - ModulePW::PW_Basis* rho_basis, - const complex *Porter_g, - complex *N, - complex *TOTN); - void createcavity(const UnitCell &ucell, ModulePW::PW_Basis* rho_basis, const complex *PS_TOTN, double *vwork); + ModulePW::PW_Basis *rho_basis, + const complex *Porter_g, + complex *N, + complex *TOTN); + void createcavity(const UnitCell &ucell, + ModulePW::PW_Basis *rho_basis, + const complex *PS_TOTN, + double *vwork); - ModuleBase::matrix cal_vcav(const UnitCell &ucell, ModulePW::PW_Basis* rho_basis, complex *PS_TOTN, int nspin); + ModuleBase::matrix cal_vcav(const UnitCell &ucell, + ModulePW::PW_Basis *rho_basis, + complex *PS_TOTN, + int nspin); ModuleBase::matrix cal_vel(const UnitCell &cell, - ModulePW::PW_Basis* rho_basis, - complex *TOTN, - complex *PS_TOTN, - int nspin); - + ModulePW::PW_Basis *rho_basis, + complex *TOTN, + complex *PS_TOTN, + int nspin); + + double cal_Ael(const UnitCell &cell, ModulePW::PW_Basis *rho_basis); - double cal_Ael(const UnitCell &cell, ModulePW::PW_Basis* rho_basis); + double cal_Acav(const UnitCell &cell, ModulePW::PW_Basis *rho_basis); - double cal_Acav(const UnitCell &cell, ModulePW::PW_Basis* rho_basis); + void cal_Acomp(const UnitCell &cell, + ModulePW::PW_Basis *rho_basis, + const double *const *const rho, + vector &res); void minimize_cg(const UnitCell &ucell, - ModulePW::PW_Basis* rho_basis, - double *d_eps, - const complex *tot_N, - complex *phi, - int &ncgsol); + ModulePW::PW_Basis *rho_basis, + double *d_eps, + const complex *tot_N, + complex *phi, + int &ncgsol); void Leps2(const UnitCell &ucell, - ModulePW::PW_Basis* rho_basis, - complex *phi, - double *epsilon, // epsilon from shapefunc, dim=nrxx - complex *gradphi_x, // 
dim=ngmc - complex *gradphi_y, - complex *gradphi_z, - complex *phi_work, - complex *lp); + ModulePW::PW_Basis *rho_basis, + complex *phi, + double *epsilon, // epsilon from shapefunc, dim=nrxx + complex *gradphi_x, // dim=ngmc + complex *gradphi_y, + complex *gradphi_z, + complex *phi_work, + complex *lp); ModuleBase::matrix v_correction(const UnitCell &cell, - ModulePW::PW_Basis* rho_basis, - const int &nspin, - const double *const *const rho); - - ModuleBase::matrix v_compensating(const UnitCell &cell, ModulePW::PW_Basis *pwb); - - void test_V_to_N(ModuleBase::matrix &v, const UnitCell &cell, ModulePW::PW_Basis *rho_basis, const double *const *const rho); - - void cal_force_sol(const UnitCell &cell, ModulePW::PW_Basis* rho_basis , ModuleBase::matrix& forcesol); - + ModulePW::PW_Basis *rho_basis, + const int &nspin, + const double *const *const rho); + + ModuleBase::matrix v_compensating(const UnitCell &cell, + ModulePW::PW_Basis *rho_basis, + const int &nspin, + const double *const *const rho); + + void test_V_to_N(ModuleBase::matrix &v, + const UnitCell &cell, + ModulePW::PW_Basis *rho_basis, + const double *const *const rho); + + void cal_force_sol(const UnitCell &cell, ModulePW::PW_Basis *rho_basis, ModuleBase::matrix &forcesol); + + void get_totn_reci(const UnitCell &cell, ModulePW::PW_Basis *rho_basis, complex *totn_reci); + private: }; namespace GlobalC { - extern surchem solvent_model; +extern surchem solvent_model; } #endif diff --git a/source/src_io/to_wannier90.cpp b/source/src_io/to_wannier90.cpp index 103a8a42b3..02c4bc9cc4 100644 --- a/source/src_io/to_wannier90.cpp +++ b/source/src_io/to_wannier90.cpp @@ -1,1876 +1,1923 @@ #include "to_wannier90.h" + #include "../src_pw/global.h" #ifdef __LCAO #include "../src_lcao/global_fp.h" // mohan add 2021-01-30, this module should be modified #endif -#include "../module_base/math_integral.h" +#include "../module_base/math_integral.h" +#include "../module_base/math_polyint.h" #include 
"../module_base/math_sphbes.h" -#include "../module_base/math_polyint.h" -#include "../module_base/math_ylmreal.h" +#include "../module_base/math_ylmreal.h" toWannier90::toWannier90(int num_kpts, ModuleBase::Matrix3 recip_lattice) { - this->num_kpts = num_kpts; - this->recip_lattice = recip_lattice; - if(GlobalV::NSPIN==1 || GlobalV::NSPIN==4) this->cal_num_kpts = this->num_kpts; - else if(GlobalV::NSPIN==2) this->cal_num_kpts = this->num_kpts/2; - + this->num_kpts = num_kpts; + this->recip_lattice = recip_lattice; + if (GlobalV::NSPIN == 1 || GlobalV::NSPIN == 4) + this->cal_num_kpts = this->num_kpts; + else if (GlobalV::NSPIN == 2) + this->cal_num_kpts = this->num_kpts / 2; } -toWannier90::toWannier90(int num_kpts, ModuleBase::Matrix3 recip_lattice, std::complex*** wfc_k_grid_in) +toWannier90::toWannier90(int num_kpts, ModuleBase::Matrix3 recip_lattice, std::complex ***wfc_k_grid_in) { this->wfc_k_grid = wfc_k_grid_in; this->num_kpts = num_kpts; - this->recip_lattice = recip_lattice; - if(GlobalV::NSPIN==1 || GlobalV::NSPIN==4) this->cal_num_kpts = this->num_kpts; - else if(GlobalV::NSPIN==2) this->cal_num_kpts = this->num_kpts/2; - + this->recip_lattice = recip_lattice; + if (GlobalV::NSPIN == 1 || GlobalV::NSPIN == 4) + this->cal_num_kpts = this->num_kpts; + else if (GlobalV::NSPIN == 2) + this->cal_num_kpts = this->num_kpts / 2; } toWannier90::~toWannier90() { - if(num_exclude_bands > 0) delete[] exclude_bands; - if(GlobalV::BASIS_TYPE == "lcao") delete unk_inLcao; + if (num_exclude_bands > 0) + delete[] exclude_bands; + if (GlobalV::BASIS_TYPE == "lcao") + delete unk_inLcao; } - -void toWannier90::init_wannier(const psi::Psi>* psi) -{ - this->read_nnkp(); - - if(GlobalV::NSPIN == 2) - { - wannier_spin = INPUT.wannier_spin; - if(wannier_spin == "up") start_k_index = 0; - else if(wannier_spin == "down") start_k_index = num_kpts/2; - else - { - ModuleBase::WARNING_QUIT("toWannier90::init_wannier","Error wannier_spin set,is not \"up\" or \"down\" "); - } - } - - 
if(GlobalV::BASIS_TYPE == "pw") - { - writeUNK(*psi); - outEIG(); - cal_Mmn(*psi); - cal_Amn(*psi); - } +void toWannier90::init_wannier(const psi::Psi> *psi) +{ + this->read_nnkp(); + + if (GlobalV::NSPIN == 2) + { + wannier_spin = INPUT.wannier_spin; + if (wannier_spin == "up") + start_k_index = 0; + else if (wannier_spin == "down") + start_k_index = num_kpts / 2; + else + { + ModuleBase::WARNING_QUIT("toWannier90::init_wannier", "Error wannier_spin set,is not \"up\" or \"down\" "); + } + } + + if (GlobalV::BASIS_TYPE == "pw") + { + writeUNK(*psi); + outEIG(); + cal_Mmn(*psi); + cal_Amn(*psi); + } #ifdef __LCAO - else if(GlobalV::BASIS_TYPE == "lcao") - { - getUnkFromLcao(); - cal_Amn(this->unk_inLcao[0]); - cal_Mmn(this->unk_inLcao[0]); - writeUNK(this->unk_inLcao[0]); - outEIG(); - } + else if (GlobalV::BASIS_TYPE == "lcao") + { + getUnkFromLcao(); + cal_Amn(this->unk_inLcao[0]); + cal_Mmn(this->unk_inLcao[0]); + writeUNK(this->unk_inLcao[0]); + outEIG(); + } #endif - /* - if(GlobalV::MY_RANK==0) - { - if(GlobalV::BASIS_TYPE == "pw") - { - cal_Amn(GlobalC::wf.evc); - cal_Mmn(GlobalC::wf.evc); - writeUNK(GlobalC::wf.evc); - outEIG(); - } - else if(GlobalV::BASIS_TYPE == "lcao") - { - getUnkFromLcao(); - cal_Amn(this->unk_inLcao); - cal_Mmn(this->unk_inLcao); - writeUNK(this->unk_inLcao); - outEIG(); - } - } - */ - + /* + if(GlobalV::MY_RANK==0) + { + if(GlobalV::BASIS_TYPE == "pw") + { + cal_Amn(GlobalC::wf.evc); + cal_Mmn(GlobalC::wf.evc); + writeUNK(GlobalC::wf.evc); + outEIG(); + } + else if(GlobalV::BASIS_TYPE == "lcao") + { + getUnkFromLcao(); + cal_Amn(this->unk_inLcao); + cal_Mmn(this->unk_inLcao); + writeUNK(this->unk_inLcao); + outEIG(); + } + } + */ } void toWannier90::read_nnkp() { - // read *.nnkp file - // ��� ����ʸ������ʸ��k�����꣬��̽���ͶӰ��ÿ��k��Ľ���k�㣬��Ҫ�ų����ܴ�ָ�� - - wannier_file_name = INPUT.NNKP; - wannier_file_name = wannier_file_name.substr(0,wannier_file_name.length() - 5); - - GlobalV::ofs_running << "reading the " << wannier_file_name << 
".nnkp file." << std::endl; - - std::ifstream nnkp_read(INPUT.NNKP.c_str(), ios::in); - - if(!nnkp_read) ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error during readin parameters."); - - if( ModuleBase::GlobalFunc::SCAN_BEGIN(nnkp_read,"real_lattice") ) - { - ModuleBase::Matrix3 real_lattice_nnkp; - nnkp_read >> real_lattice_nnkp.e11 >> real_lattice_nnkp.e12 >> real_lattice_nnkp.e13 - >> real_lattice_nnkp.e21 >> real_lattice_nnkp.e22 >> real_lattice_nnkp.e23 - >> real_lattice_nnkp.e31 >> real_lattice_nnkp.e32 >> real_lattice_nnkp.e33; - - real_lattice_nnkp = real_lattice_nnkp / GlobalC::ucell.lat0_angstrom; - - if(abs(real_lattice_nnkp.e11 - GlobalC::ucell.latvec.e11) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error real_lattice in *.nnkp file"); - if(abs(real_lattice_nnkp.e12 - GlobalC::ucell.latvec.e12) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error real_lattice in *.nnkp file"); - if(abs(real_lattice_nnkp.e13 - GlobalC::ucell.latvec.e13) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error real_lattice in *.nnkp file"); - if(abs(real_lattice_nnkp.e21 - GlobalC::ucell.latvec.e21) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error real_lattice in *.nnkp file"); - if(abs(real_lattice_nnkp.e22 - GlobalC::ucell.latvec.e22) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error real_lattice in *.nnkp file"); - if(abs(real_lattice_nnkp.e23 - GlobalC::ucell.latvec.e23) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error real_lattice in *.nnkp file"); - if(abs(real_lattice_nnkp.e31 - GlobalC::ucell.latvec.e31) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error real_lattice in *.nnkp file"); - if(abs(real_lattice_nnkp.e32 - GlobalC::ucell.latvec.e32) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error real_lattice in *.nnkp file"); - if(abs(real_lattice_nnkp.e33 - GlobalC::ucell.latvec.e33) > 1.0e-4) - 
ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error real_lattice in *.nnkp file"); - - } - - if( ModuleBase::GlobalFunc::SCAN_BEGIN(nnkp_read,"recip_lattice") ) - { - ModuleBase::Matrix3 recip_lattice_nnkp; - nnkp_read >> recip_lattice_nnkp.e11 >> recip_lattice_nnkp.e12 >> recip_lattice_nnkp.e13 - >> recip_lattice_nnkp.e21 >> recip_lattice_nnkp.e22 >> recip_lattice_nnkp.e23 - >> recip_lattice_nnkp.e31 >> recip_lattice_nnkp.e32 >> recip_lattice_nnkp.e33; - - const double tpiba_angstrom = ModuleBase::TWO_PI / GlobalC::ucell.lat0_angstrom; - recip_lattice_nnkp = recip_lattice_nnkp / tpiba_angstrom; - - if(abs(recip_lattice_nnkp.e11 - GlobalC::ucell.G.e11) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error recip_lattice in *.nnkp file"); - if(abs(recip_lattice_nnkp.e12 - GlobalC::ucell.G.e12) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error recip_lattice in *.nnkp file"); - if(abs(recip_lattice_nnkp.e13 - GlobalC::ucell.G.e13) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error recip_lattice in *.nnkp file"); - if(abs(recip_lattice_nnkp.e21 - GlobalC::ucell.G.e21) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error recip_lattice in *.nnkp file"); - if(abs(recip_lattice_nnkp.e22 - GlobalC::ucell.G.e22) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error recip_lattice in *.nnkp file"); - if(abs(recip_lattice_nnkp.e23 - GlobalC::ucell.G.e23) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error recip_lattice in *.nnkp file"); - if(abs(recip_lattice_nnkp.e31 - GlobalC::ucell.G.e31) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error recip_lattice in *.nnkp file"); - if(abs(recip_lattice_nnkp.e32 - GlobalC::ucell.G.e32) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error recip_lattice in *.nnkp file"); - if(abs(recip_lattice_nnkp.e33 - GlobalC::ucell.G.e33) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error 
recip_lattice in *.nnkp file"); - } - - if( ModuleBase::GlobalFunc::SCAN_BEGIN(nnkp_read,"kpoints") ) - { - int numkpt_nnkp; - ModuleBase::GlobalFunc::READ_VALUE(nnkp_read, numkpt_nnkp); - if( (GlobalV::NSPIN == 1 || GlobalV::NSPIN == 4) && numkpt_nnkp != GlobalC::kv.nkstot ) ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error kpoints in *.nnkp file"); - else if(GlobalV::NSPIN == 2 && numkpt_nnkp != (GlobalC::kv.nkstot/2)) ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error kpoints in *.nnkp file"); - - ModuleBase::Vector3 *kpoints_direct_nnkp = new ModuleBase::Vector3[numkpt_nnkp]; - for(int ik = 0; ik < numkpt_nnkp; ik++) - { - nnkp_read >> kpoints_direct_nnkp[ik].x >> kpoints_direct_nnkp[ik].y >> kpoints_direct_nnkp[ik].z; - if(abs(kpoints_direct_nnkp[ik].x - GlobalC::kv.kvec_d[ik].x) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error kpoints in *.nnkp file"); - if(abs(kpoints_direct_nnkp[ik].y - GlobalC::kv.kvec_d[ik].y) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error kpoints in *.nnkp file"); - if(abs(kpoints_direct_nnkp[ik].z - GlobalC::kv.kvec_d[ik].z) > 1.0e-4) - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","Error kpoints in *.nnkp file"); - } - - delete[] kpoints_direct_nnkp; - - //�ж�gamma only - ModuleBase::Vector3 my_gamma_point(0.0,0.0,0.0); - //if( (GlobalC::kv.nkstot == 1) && (GlobalC::kv.kvec_d[0] == my_gamma_point) ) gamma_only_wannier = true; - } - - if(GlobalV::NSPIN!=4) - { - if( ModuleBase::GlobalFunc::SCAN_BEGIN(nnkp_read,"projections") ) - { - ModuleBase::GlobalFunc::READ_VALUE(nnkp_read, num_wannier); - // test - //GlobalV::ofs_running << "num_wannier = " << num_wannier << std::endl; - // test - if(num_wannier < 0) - { - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","wannier number is lower than 0"); - } - - R_centre = new ModuleBase::Vector3[num_wannier]; - L = new int[num_wannier]; - m = new int[num_wannier]; - rvalue = new int[num_wannier]; - ModuleBase::Vector3* z_axis = new 
ModuleBase::Vector3[num_wannier]; - ModuleBase::Vector3* x_axis = new ModuleBase::Vector3[num_wannier]; - alfa = new double[num_wannier]; - - - for(int count = 0; count < num_wannier; count++) - { - nnkp_read >> R_centre[count].x >> R_centre[count].y >> R_centre[count].z; - nnkp_read >> L[count] >> m[count]; - ModuleBase::GlobalFunc::READ_VALUE(nnkp_read,rvalue[count]); - nnkp_read >> z_axis[count].x >> z_axis[count].y >> z_axis[count].z; - nnkp_read >> x_axis[count].x >> x_axis[count].y >> x_axis[count].z; - ModuleBase::GlobalFunc::READ_VALUE(nnkp_read,alfa[count]); - } - - } - } - else - { - ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","noncolin spin is not done yet"); - } - - if( ModuleBase::GlobalFunc::SCAN_BEGIN(nnkp_read,"nnkpts") ) - { - ModuleBase::GlobalFunc::READ_VALUE(nnkp_read, nntot); - nnlist.resize(GlobalC::kv.nkstot); - nncell.resize(GlobalC::kv.nkstot); - for(int ik = 0; ik < GlobalC::kv.nkstot; ik++) - { - nnlist[ik].resize(nntot); - nncell[ik].resize(nntot); - } - - int numkpt_nnkp; - if(GlobalV::NSPIN == 1 || GlobalV::NSPIN == 4) numkpt_nnkp = GlobalC::kv.nkstot; - else if(GlobalV::NSPIN == 2) numkpt_nnkp = GlobalC::kv.nkstot/2; - else throw std::runtime_error("numkpt_nnkp uninitialized in "+ModuleBase::GlobalFunc::TO_STRING(__FILE__)+" line "+ModuleBase::GlobalFunc::TO_STRING(__LINE__)); - - for(int ik = 0; ik < numkpt_nnkp; ik++) - { - for(int ib = 0; ib < nntot; ib++) - { - int ik_nnkp; - nnkp_read >> ik_nnkp; - if(ik_nnkp != (ik+1)) ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","error nnkpts in *.nnkp file"); - nnkp_read >> nnlist[ik][ib]; - nnkp_read >> nncell[ik][ib].x >> nncell[ik][ib].y >> nncell[ik][ib].z; - nnlist[ik][ib]--; // this is c++ , begin from 0 - } - - } - } - - if( ModuleBase::GlobalFunc::SCAN_BEGIN(nnkp_read,"exclude_bands") ) - { - ModuleBase::GlobalFunc::READ_VALUE(nnkp_read, num_exclude_bands); - if(num_exclude_bands > 0) exclude_bands = new int[num_exclude_bands]; - else if(num_exclude_bands < 0) 
ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","the exclude bands is wrong , please check *.nnkp file."); - - if(num_exclude_bands > 0) - { - for(int i = 0; i < num_exclude_bands; i++) - { - ModuleBase::GlobalFunc::READ_VALUE(nnkp_read, exclude_bands[i]); - exclude_bands[i]--; // this is c++ , begin from 0 - } - } - } - - // test by jingan - //GlobalV::ofs_running << "num_exclude_bands = " << num_exclude_bands << std::endl; - //for(int i = 0; i < num_exclude_bands; i++) - //{ - // GlobalV::ofs_running << "exclude_bands : " << exclude_bands[i] << std::endl; - //} - // test by jingan - - nnkp_read.close(); - - // ������̽������� - for(int i = 0; i < num_wannier; i++) - { - R_centre[i] = R_centre[i] * GlobalC::ucell.latvec; - m[i] = m[i] - 1; // ABACUS and wannier90 �ԴŽǶ���m�Ķ��岻һ����ABACUS�Ǵ�0��ʼ�ģ�wannier90�Ǵ�1��ʼ�� - } - - // test by jingan - //GlobalV::ofs_running << "num_wannier is " << num_wannier << std::endl; - //for(int i = 0; i < num_wannier; i++) - //{ - // GlobalV::ofs_running << "num_wannier" << std::endl; - // GlobalV::ofs_running << L[i] << " " << m[i] << " " << rvalue[i] << " " << alfa[i] << std::endl; - //} - // test by jingan - - // ����exclude_bands - tag_cal_band = new bool[GlobalV::NBANDS]; - if(GlobalV::NBANDS <= num_exclude_bands) ModuleBase::WARNING_QUIT("toWannier90::read_nnkp","you set the band numer is not enough, please add bands number."); - if(num_exclude_bands == 0) - { - for(int ib = 0; ib < GlobalV::NBANDS; ib++) tag_cal_band[ib] = true; - } - else - { - for(int ib = 0; ib < GlobalV::NBANDS; ib++) - { - tag_cal_band[ib] = true; - for(int ibb = 0; ibb < num_exclude_bands; ibb++) - { - if(exclude_bands[ibb] == ib) - { - tag_cal_band[ib] = false; - break; - } - } - } - } - - if(num_exclude_bands < 0) num_bands = GlobalV::NBANDS; - else num_bands = GlobalV::NBANDS - num_exclude_bands; - - + // read *.nnkp file + + wannier_file_name = INPUT.NNKP; + wannier_file_name = wannier_file_name.substr(0, wannier_file_name.length() - 5); + + 
GlobalV::ofs_running << "reading the " << wannier_file_name << ".nnkp file." << std::endl; + + std::ifstream nnkp_read(INPUT.NNKP.c_str(), ios::in); + + if (!nnkp_read) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error during readin parameters."); + + if (ModuleBase::GlobalFunc::SCAN_BEGIN(nnkp_read, "real_lattice")) + { + ModuleBase::Matrix3 real_lattice_nnkp; + nnkp_read >> real_lattice_nnkp.e11 >> real_lattice_nnkp.e12 >> real_lattice_nnkp.e13 >> real_lattice_nnkp.e21 + >> real_lattice_nnkp.e22 >> real_lattice_nnkp.e23 >> real_lattice_nnkp.e31 >> real_lattice_nnkp.e32 + >> real_lattice_nnkp.e33; + + real_lattice_nnkp = real_lattice_nnkp / GlobalC::ucell.lat0_angstrom; + + if (abs(real_lattice_nnkp.e11 - GlobalC::ucell.latvec.e11) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error real_lattice in *.nnkp file"); + if (abs(real_lattice_nnkp.e12 - GlobalC::ucell.latvec.e12) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error real_lattice in *.nnkp file"); + if (abs(real_lattice_nnkp.e13 - GlobalC::ucell.latvec.e13) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error real_lattice in *.nnkp file"); + if (abs(real_lattice_nnkp.e21 - GlobalC::ucell.latvec.e21) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error real_lattice in *.nnkp file"); + if (abs(real_lattice_nnkp.e22 - GlobalC::ucell.latvec.e22) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error real_lattice in *.nnkp file"); + if (abs(real_lattice_nnkp.e23 - GlobalC::ucell.latvec.e23) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error real_lattice in *.nnkp file"); + if (abs(real_lattice_nnkp.e31 - GlobalC::ucell.latvec.e31) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error real_lattice in *.nnkp file"); + if (abs(real_lattice_nnkp.e32 - GlobalC::ucell.latvec.e32) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error real_lattice in *.nnkp file"); + if 
(abs(real_lattice_nnkp.e33 - GlobalC::ucell.latvec.e33) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error real_lattice in *.nnkp file"); + } + + if (ModuleBase::GlobalFunc::SCAN_BEGIN(nnkp_read, "recip_lattice")) + { + ModuleBase::Matrix3 recip_lattice_nnkp; + nnkp_read >> recip_lattice_nnkp.e11 >> recip_lattice_nnkp.e12 >> recip_lattice_nnkp.e13 + >> recip_lattice_nnkp.e21 >> recip_lattice_nnkp.e22 >> recip_lattice_nnkp.e23 >> recip_lattice_nnkp.e31 + >> recip_lattice_nnkp.e32 >> recip_lattice_nnkp.e33; + + const double tpiba_angstrom = ModuleBase::TWO_PI / GlobalC::ucell.lat0_angstrom; + recip_lattice_nnkp = recip_lattice_nnkp / tpiba_angstrom; + + if (abs(recip_lattice_nnkp.e11 - GlobalC::ucell.G.e11) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error recip_lattice in *.nnkp file"); + if (abs(recip_lattice_nnkp.e12 - GlobalC::ucell.G.e12) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error recip_lattice in *.nnkp file"); + if (abs(recip_lattice_nnkp.e13 - GlobalC::ucell.G.e13) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error recip_lattice in *.nnkp file"); + if (abs(recip_lattice_nnkp.e21 - GlobalC::ucell.G.e21) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error recip_lattice in *.nnkp file"); + if (abs(recip_lattice_nnkp.e22 - GlobalC::ucell.G.e22) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error recip_lattice in *.nnkp file"); + if (abs(recip_lattice_nnkp.e23 - GlobalC::ucell.G.e23) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error recip_lattice in *.nnkp file"); + if (abs(recip_lattice_nnkp.e31 - GlobalC::ucell.G.e31) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error recip_lattice in *.nnkp file"); + if (abs(recip_lattice_nnkp.e32 - GlobalC::ucell.G.e32) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error recip_lattice in *.nnkp file"); + if (abs(recip_lattice_nnkp.e33 - 
GlobalC::ucell.G.e33) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error recip_lattice in *.nnkp file"); + } + + if (ModuleBase::GlobalFunc::SCAN_BEGIN(nnkp_read, "kpoints")) + { + int numkpt_nnkp; + ModuleBase::GlobalFunc::READ_VALUE(nnkp_read, numkpt_nnkp); + if ((GlobalV::NSPIN == 1 || GlobalV::NSPIN == 4) && numkpt_nnkp != GlobalC::kv.nkstot) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error kpoints in *.nnkp file"); + else if (GlobalV::NSPIN == 2 && numkpt_nnkp != (GlobalC::kv.nkstot / 2)) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error kpoints in *.nnkp file"); + + ModuleBase::Vector3 *kpoints_direct_nnkp = new ModuleBase::Vector3[numkpt_nnkp]; + for (int ik = 0; ik < numkpt_nnkp; ik++) + { + nnkp_read >> kpoints_direct_nnkp[ik].x >> kpoints_direct_nnkp[ik].y >> kpoints_direct_nnkp[ik].z; + if (abs(kpoints_direct_nnkp[ik].x - GlobalC::kv.kvec_d[ik].x) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error kpoints in *.nnkp file"); + if (abs(kpoints_direct_nnkp[ik].y - GlobalC::kv.kvec_d[ik].y) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error kpoints in *.nnkp file"); + if (abs(kpoints_direct_nnkp[ik].z - GlobalC::kv.kvec_d[ik].z) > 1.0e-4) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "Error kpoints in *.nnkp file"); + } + + delete[] kpoints_direct_nnkp; + + ModuleBase::Vector3 my_gamma_point(0.0, 0.0, 0.0); + // if( (GlobalC::kv.nkstot == 1) && (GlobalC::kv.kvec_d[0] == my_gamma_point) ) gamma_only_wannier = true; + } + + if (GlobalV::NSPIN != 4) + { + if (ModuleBase::GlobalFunc::SCAN_BEGIN(nnkp_read, "projections")) + { + ModuleBase::GlobalFunc::READ_VALUE(nnkp_read, num_wannier); + // test + // GlobalV::ofs_running << "num_wannier = " << num_wannier << std::endl; + // test + if (num_wannier < 0) + { + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "wannier number is lower than 0"); + } + + R_centre = new ModuleBase::Vector3[num_wannier]; + L = new int[num_wannier]; + m 
= new int[num_wannier]; + rvalue = new int[num_wannier]; + ModuleBase::Vector3 *z_axis = new ModuleBase::Vector3[num_wannier]; + ModuleBase::Vector3 *x_axis = new ModuleBase::Vector3[num_wannier]; + alfa = new double[num_wannier]; + + for (int count = 0; count < num_wannier; count++) + { + nnkp_read >> R_centre[count].x >> R_centre[count].y >> R_centre[count].z; + nnkp_read >> L[count] >> m[count]; + ModuleBase::GlobalFunc::READ_VALUE(nnkp_read, rvalue[count]); + nnkp_read >> z_axis[count].x >> z_axis[count].y >> z_axis[count].z; + nnkp_read >> x_axis[count].x >> x_axis[count].y >> x_axis[count].z; + ModuleBase::GlobalFunc::READ_VALUE(nnkp_read, alfa[count]); + } + } + } + else + { + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "noncolin spin is not done yet"); + } + + if (ModuleBase::GlobalFunc::SCAN_BEGIN(nnkp_read, "nnkpts")) + { + ModuleBase::GlobalFunc::READ_VALUE(nnkp_read, nntot); + nnlist.resize(GlobalC::kv.nkstot); + nncell.resize(GlobalC::kv.nkstot); + for (int ik = 0; ik < GlobalC::kv.nkstot; ik++) + { + nnlist[ik].resize(nntot); + nncell[ik].resize(nntot); + } + + int numkpt_nnkp; + if (GlobalV::NSPIN == 1 || GlobalV::NSPIN == 4) + numkpt_nnkp = GlobalC::kv.nkstot; + else if (GlobalV::NSPIN == 2) + numkpt_nnkp = GlobalC::kv.nkstot / 2; + else + throw std::runtime_error("numkpt_nnkp uninitialized in " + ModuleBase::GlobalFunc::TO_STRING(__FILE__) + + " line " + ModuleBase::GlobalFunc::TO_STRING(__LINE__)); + + for (int ik = 0; ik < numkpt_nnkp; ik++) + { + for (int ib = 0; ib < nntot; ib++) + { + int ik_nnkp; + nnkp_read >> ik_nnkp; + if (ik_nnkp != (ik + 1)) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", "error nnkpts in *.nnkp file"); + nnkp_read >> nnlist[ik][ib]; + nnkp_read >> nncell[ik][ib].x >> nncell[ik][ib].y >> nncell[ik][ib].z; + nnlist[ik][ib]--; // this is c++ , begin from 0 + } + } + } + + if (ModuleBase::GlobalFunc::SCAN_BEGIN(nnkp_read, "exclude_bands")) + { + ModuleBase::GlobalFunc::READ_VALUE(nnkp_read, num_exclude_bands); + 
if (num_exclude_bands > 0) + exclude_bands = new int[num_exclude_bands]; + else if (num_exclude_bands < 0) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", + "the exclude bands is wrong , please check *.nnkp file."); + + if (num_exclude_bands > 0) + { + for (int i = 0; i < num_exclude_bands; i++) + { + ModuleBase::GlobalFunc::READ_VALUE(nnkp_read, exclude_bands[i]); + exclude_bands[i]--; // this is c++ , begin from 0 + } + } + } + + // test by jingan + // GlobalV::ofs_running << "num_exclude_bands = " << num_exclude_bands << std::endl; + // for(int i = 0; i < num_exclude_bands; i++) + //{ + // GlobalV::ofs_running << "exclude_bands : " << exclude_bands[i] << std::endl; + //} + // test by jingan + + nnkp_read.close(); + + for (int i = 0; i < num_wannier; i++) + { + R_centre[i] = R_centre[i] * GlobalC::ucell.latvec; + m[i] = m[i] - 1; + } + + // test by jingan + // GlobalV::ofs_running << "num_wannier is " << num_wannier << std::endl; + // for(int i = 0; i < num_wannier; i++) + //{ + // GlobalV::ofs_running << "num_wannier" << std::endl; + // GlobalV::ofs_running << L[i] << " " << m[i] << " " << rvalue[i] << " " << alfa[i] << std::endl; + //} + // test by jingan + + tag_cal_band = new bool[GlobalV::NBANDS]; + if (GlobalV::NBANDS <= num_exclude_bands) + ModuleBase::WARNING_QUIT("toWannier90::read_nnkp", + "you set the band numer is not enough, please add bands number."); + if (num_exclude_bands == 0) + { + for (int ib = 0; ib < GlobalV::NBANDS; ib++) + tag_cal_band[ib] = true; + } + else + { + for (int ib = 0; ib < GlobalV::NBANDS; ib++) + { + tag_cal_band[ib] = true; + for (int ibb = 0; ibb < num_exclude_bands; ibb++) + { + if (exclude_bands[ibb] == ib) + { + tag_cal_band[ib] = false; + break; + } + } + } + } + + if (num_exclude_bands < 0) + num_bands = GlobalV::NBANDS; + else + num_bands = GlobalV::NBANDS - num_exclude_bands; } void toWannier90::outEIG() { - if(GlobalV::MY_RANK == 0) - { - std::string fileaddress = GlobalV::global_out_dir + wannier_file_name + 
".eig"; - std::ofstream eig_file( fileaddress.c_str() ); - for(int ik = start_k_index; ik < (cal_num_kpts+start_k_index); ik++) - { - int index_band = 0; - for(int ib = 0; ib < GlobalV::NBANDS; ib++) - { - if(!tag_cal_band[ib]) continue; - index_band++; - eig_file << std::setw(5) << index_band << std::setw(5) << ik+1-start_k_index - << std::setw(18) << showpoint << fixed << std::setprecision(12) - << GlobalC::wf.ekb[ik][ib] * ModuleBase::Ry_to_eV << std::endl; - } - } - - eig_file.close(); - } + if (GlobalV::MY_RANK == 0) + { + std::string fileaddress = GlobalV::global_out_dir + wannier_file_name + ".eig"; + std::ofstream eig_file(fileaddress.c_str()); + for (int ik = start_k_index; ik < (cal_num_kpts + start_k_index); ik++) + { + int index_band = 0; + for (int ib = 0; ib < GlobalV::NBANDS; ib++) + { + if (!tag_cal_band[ib]) + continue; + index_band++; + eig_file << std::setw(5) << index_band << std::setw(5) << ik + 1 - start_k_index << std::setw(18) + << showpoint << fixed << std::setprecision(12) + << GlobalC::wf.ekb[ik][ib] * ModuleBase::Ry_to_eV << std::endl; + } + } + + eig_file.close(); + } } - -void toWannier90::writeUNK(const psi::Psi>& wfc_pw) +void toWannier90::writeUNK(const psi::Psi> &wfc_pw) { - - // std::complex *porter = new std::complex[GlobalC::wfcpw->nrxx]; - - // for(int ik = start_k_index; ik < (cal_num_kpts+start_k_index); ik++) - // { - // std::stringstream name; - // if(GlobalV::NSPIN==1 || GlobalV::NSPIN==4) - // { - // name << GlobalV::global_out_dir << "UNK" << std::setw(5) << setfill('0') << ik+1 << ".1" ; - // } - // else if(GlobalV::NSPIN==2) - // { - // if(wannier_spin=="up") name << GlobalV::global_out_dir << "UNK" << std::setw(5) << setfill('0') << ik+1-start_k_index << ".1" ; - // else if(wannier_spin=="down") name << GlobalV::global_out_dir << "UNK" << std::setw(5) << setfill('0') << ik+1-start_k_index << ".2" ; - // } - - // std::ofstream unkfile(name.str()); - - // unkfile << std::setw(12) << GlobalC::rhopw->nx << std::setw(12) 
<< GlobalC::rhopw->ny << std::setw(12) << GlobalC::rhopw->nz << std::setw(12) << ik+1 << std::setw(12) << num_bands << std::endl; - - // for(int ib = 0; ib < GlobalV::NBANDS; ib++) - // { - // if(!tag_cal_band[ib]) continue; - // //std::complex *porter = GlobalC::UFFT.porter; - // // u_k in real space - // ModuleBase::GlobalFunc::ZEROS(porter, GlobalC::rhopw->nrxx); - // for (int ig = 0; ig < GlobalC::kv.ngk[ik]; ig++) - // { - // porter[GlobalC::sf.ig2fftw[GlobalC::wf.igk(ik, ig)]] = wfc_pw[ik](ib, ig); - // } - // GlobalC::sf.FFT_wfc.FFT3D(porter, 1); - - // for(int k=0; knz; k++) - // { - // for(int j=0; jny; j++) - // { - // for(int i=0; inx; i++) - // { - // if(!gamma_only_wannier) - // { - // unkfile << std::setw(20) << std::setprecision(9) << std::setiosflags(ios::scientific) << porter[i*GlobalC::rhopw->ny*GlobalC::rhopw->nz + j*GlobalC::rhopw->nz + k].real() - // << std::setw(20) << std::setprecision(9) << std::setiosflags(ios::scientific) << porter[i*GlobalC::rhopw->ny*GlobalC::rhopw->nz + j*GlobalC::rhopw->nz + k].imag() - // //jingan test - // //<< " " << std::setw(12) << std::setprecision(9) << std::setiosflags(ios::scientific) << abs(porter[i*GlobalC::rhopw->ny*GlobalC::rhopw->nz + j*GlobalC::rhopw->nz + k]) - // << std::endl; - // } - // else - // { - // double zero = 0.0; - // unkfile << std::setw(20) << std::setprecision(9) << std::setiosflags(ios::scientific) << abs( porter[i*GlobalC::rhopw->ny*GlobalC::rhopw->nz + j*GlobalC::rhopw->nz + k] ) - // << std::setw(20) << std::setprecision(9) << std::setiosflags(ios::scientific) << zero - // //jingan test - // //<< " " << std::setw(12) << std::setprecision(9) << std::setiosflags(ios::scientific) << abs(porter[i*GlobalC::rhopw->ny*GlobalC::rhopw->nz + j*GlobalC::rhopw->nz + k]) - // << std::endl; - // } - // } - // } - // } - - - // } - - - // unkfile.close(); - - // } - - // delete[] porter; - +/* + std::complex *porter = new std::complex[GlobalC::wfcpw->nrxx]; + + for(int ik = start_k_index; ik < 
(cal_num_kpts+start_k_index); ik++) + { + std::stringstream name; + if(GlobalV::NSPIN==1 || GlobalV::NSPIN==4) + { + name << GlobalV::global_out_dir << "UNK" << std::setw(5) << setfill('0') << ik+1 << ".1" ; + } + else if(GlobalV::NSPIN==2) + { + if(wannier_spin=="up") name << GlobalV::global_out_dir << "UNK" << std::setw(5) << setfill('0') << + ik+1-start_k_index << ".1" ; else if(wannier_spin=="down") name << GlobalV::global_out_dir << "UNK" << std::setw(5) + << setfill('0') << ik+1-start_k_index << ".2" ; + } + + std::ofstream unkfile(name.str()); + + unkfile << std::setw(12) << GlobalC::rhopw->nx << std::setw(12) << GlobalC::rhopw->ny << std::setw(12) << + GlobalC::rhopw->nz << std::setw(12) << ik+1 << std::setw(12) << num_bands << std::endl; + + for(int ib = 0; ib < GlobalV::NBANDS; ib++) + { + if(!tag_cal_band[ib]) continue; + //std::complex *porter = GlobalC::UFFT.porter; + // u_k in real space + ModuleBase::GlobalFunc::ZEROS(porter, GlobalC::rhopw->nrxx); + for (int ig = 0; ig < GlobalC::kv.ngk[ik]; ig++) + { + porter[GlobalC::sf.ig2fftw[GlobalC::wf.igk(ik, ig)]] = wfc_pw[ik](ib, ig); + } + GlobalC::sf.FFT_wfc.FFT3D(porter, 1); + + for(int k=0; knz; k++) + { + for(int j=0; jny; j++) + { + for(int i=0; inx; i++) + { + if(!gamma_only_wannier) + { + unkfile << std::setw(20) << std::setprecision(9) << std::setiosflags(ios::scientific) << + porter[i*GlobalC::rhopw->ny*GlobalC::rhopw->nz + j*GlobalC::rhopw->nz + k].real() + << std::setw(20) << std::setprecision(9) << std::setiosflags(ios::scientific) << + porter[i*GlobalC::rhopw->ny*GlobalC::rhopw->nz + j*GlobalC::rhopw->nz + k].imag() + //jingan test + //<< " " << std::setw(12) << std::setprecision(9) << + std::setiosflags(ios::scientific) << abs(porter[i*GlobalC::rhopw->ny*GlobalC::rhopw->nz + j*GlobalC::rhopw->nz + k]) + << std::endl; + } + else + { + double zero = 0.0; + unkfile << std::setw(20) << std::setprecision(9) << std::setiosflags(ios::scientific) << + abs( 
porter[i*GlobalC::rhopw->ny*GlobalC::rhopw->nz + j*GlobalC::rhopw->nz + k] ) + << std::setw(20) << std::setprecision(9) << std::setiosflags(ios::scientific) << + zero + //jingan test + //<< " " << std::setw(12) << std::setprecision(9) << + std::setiosflags(ios::scientific) << abs(porter[i*GlobalC::rhopw->ny*GlobalC::rhopw->nz + j*GlobalC::rhopw->nz + k]) + << std::endl; + } + } + } + } + + + } + + + unkfile.close(); + + } + + delete[] porter; +*/ #ifdef __MPI - // num_z: how many planes on processor 'ip' - int *num_z = new int[GlobalV::NPROC_IN_POOL]; - ModuleBase::GlobalFunc::ZEROS(num_z, GlobalV::NPROC_IN_POOL); - for (int iz=0;iznbz;iz++) - { - int ip = iz % GlobalV::NPROC_IN_POOL; - num_z[ip] += GlobalC::bigpw->bz; - } - - // start_z: start position of z in - // processor ip. - int *start_z = new int[GlobalV::NPROC_IN_POOL]; - ModuleBase::GlobalFunc::ZEROS(start_z, GlobalV::NPROC_IN_POOL); - for (int ip=1;ipnz]; - ModuleBase::GlobalFunc::ZEROS(which_ip, GlobalC::wfcpw->nz); - for(int iz=0; iznz; iz++) - { - for(int ip=0; ip=start_z[GlobalV::NPROC_IN_POOL-1]) - { - which_ip[iz] = GlobalV::NPROC_IN_POOL-1; - break; - } - else if(iz>=start_z[ip] && iz *porter = new std::complex[GlobalC::wfcpw->nrxx]; - int nxy = GlobalC::wfcpw->nx * GlobalC::wfcpw->ny; - std::complex *zpiece = new std::complex[nxy]; - - if(GlobalV::MY_POOL==0) - { - for(int ik = start_k_index; ik < (cal_num_kpts+start_k_index); ik++) - { - std::ofstream unkfile; - - if(GlobalV::MY_RANK == 0) - { - std::stringstream name; - if(GlobalV::NSPIN==1 || GlobalV::NSPIN==4) - { - name << GlobalV::global_out_dir << "UNK" << std::setw(5) << setfill('0') << ik+1 << ".1" ; - } - else if(GlobalV::NSPIN==2) - { - if(wannier_spin=="up") name << GlobalV::global_out_dir << "UNK" << std::setw(5) << setfill('0') << ik+1-start_k_index << ".1" ; - else if(wannier_spin=="down") name << GlobalV::global_out_dir << "UNK" << std::setw(5) << setfill('0') << ik+1-start_k_index << ".2" ; - } - - 
unkfile.open(name.str(),ios::out); - - unkfile << std::setw(12) << GlobalC::wfcpw->nx << std::setw(12) << GlobalC::wfcpw->ny << std::setw(12) << GlobalC::wfcpw->nz << std::setw(12) << ik+1 << std::setw(12) << num_bands << std::endl; - } - - for(int ib = 0; ib < GlobalV::NBANDS; ib++) - { - if(!tag_cal_band[ib]) continue; - - GlobalC::wfcpw->recip2real(&wfc_pw(ik, ib, 0), porter, ik); - - // save the rho one z by one z. - for(int iz=0; iznz; iz++) - { - // tag must be different for different iz. - ModuleBase::GlobalFunc::ZEROS(zpiece, nxy); - int tag = iz; - MPI_Status ierror; - - // case 1: the first part of rho in processor 0. - if(which_ip[iz] == 0 && GlobalV::RANK_IN_POOL ==0) - { - for(int ir=0; irnplane+iz-GlobalC::wfcpw->startz_current]; - } - } - // case 2: > first part rho: send the rho to - // processor 0. - else if(which_ip[iz] == GlobalV::RANK_IN_POOL ) - { - for(int ir=0; irnplane+iz-GlobalC::wfcpw->startz_current]; - } - MPI_Send(zpiece, nxy, MPI_DOUBLE_COMPLEX, 0, tag, POOL_WORLD); - } - - // case 2: > first part rho: processor 0 receive the rho - // from other processors - else if(GlobalV::RANK_IN_POOL==0) - { - MPI_Recv(zpiece, nxy, MPI_DOUBLE_COMPLEX, which_ip[iz], tag, POOL_WORLD, &ierror); - } - - // write data - if(GlobalV::MY_RANK==0) - { - for(int iy=0; iyny; iy++) - { - for(int ix=0; ixnx; ix++) - { - unkfile << std::setw(20) << std::setprecision(9) << std::setiosflags(ios::scientific) << zpiece[ix*GlobalC::wfcpw->ny+iy].real() - << std::setw(20) << std::setprecision(9) << std::setiosflags(ios::scientific) << zpiece[ix*GlobalC::wfcpw->ny+iy].imag() - << std::endl; - } - } - } - }// end iz - MPI_Barrier(POOL_WORLD); - } - - if(GlobalV::MY_RANK == 0) - { - unkfile.close(); - } - - } - } - MPI_Barrier(MPI_COMM_WORLD); - - delete[] num_z; - delete[] start_z; - delete[] which_ip; - delete[] porter; - delete[] zpiece; - -#endif - -} - - - - - + // num_z: how many planes on processor 'ip' + int *num_z = new int[GlobalV::NPROC_IN_POOL]; + 
ModuleBase::GlobalFunc::ZEROS(num_z, GlobalV::NPROC_IN_POOL); + for (int iz = 0; iz < GlobalC::bigpw->nbz; iz++) + { + int ip = iz % GlobalV::NPROC_IN_POOL; + num_z[ip] += GlobalC::bigpw->bz; + } + + // start_z: start position of z in + // processor ip. + int *start_z = new int[GlobalV::NPROC_IN_POOL]; + ModuleBase::GlobalFunc::ZEROS(start_z, GlobalV::NPROC_IN_POOL); + for (int ip = 1; ip < GlobalV::NPROC_IN_POOL; ip++) + { + start_z[ip] = start_z[ip - 1] + num_z[ip - 1]; + } + + // which_ip: found iz belongs to which ip. + int *which_ip = new int[GlobalC::wfcpw->nz]; + ModuleBase::GlobalFunc::ZEROS(which_ip, GlobalC::wfcpw->nz); + for (int iz = 0; iz < GlobalC::wfcpw->nz; iz++) + { + for (int ip = 0; ip < GlobalV::NPROC_IN_POOL; ip++) + { + if (iz >= start_z[GlobalV::NPROC_IN_POOL - 1]) + { + which_ip[iz] = GlobalV::NPROC_IN_POOL - 1; + break; + } + else if (iz >= start_z[ip] && iz < start_z[ip + 1]) + { + which_ip[iz] = ip; + break; + } + } + } + + // only do in the first pool. + std::complex *porter = new std::complex[GlobalC::wfcpw->nrxx]; + int nxy = GlobalC::wfcpw->nx * GlobalC::wfcpw->ny; + std::complex *zpiece = new std::complex[nxy]; + + if (GlobalV::MY_POOL == 0) + { + for (int ik = start_k_index; ik < (cal_num_kpts + start_k_index); ik++) + { + std::ofstream unkfile; + + if (GlobalV::MY_RANK == 0) + { + std::stringstream name; + if (GlobalV::NSPIN == 1 || GlobalV::NSPIN == 4) + { + name << GlobalV::global_out_dir << "UNK" << std::setw(5) << setfill('0') << ik + 1 << ".1"; + } + else if (GlobalV::NSPIN == 2) + { + if (wannier_spin == "up") + name << GlobalV::global_out_dir << "UNK" << std::setw(5) << setfill('0') + << ik + 1 - start_k_index << ".1"; + else if (wannier_spin == "down") + name << GlobalV::global_out_dir << "UNK" << std::setw(5) << setfill('0') + << ik + 1 - start_k_index << ".2"; + } + + unkfile.open(name.str(), ios::out); + + unkfile << std::setw(12) << GlobalC::wfcpw->nx << std::setw(12) << GlobalC::wfcpw->ny << std::setw(12) + << 
GlobalC::wfcpw->nz << std::setw(12) << ik + 1 << std::setw(12) << num_bands << std::endl; + } + + for (int ib = 0; ib < GlobalV::NBANDS; ib++) + { + if (!tag_cal_band[ib]) + continue; + + GlobalC::wfcpw->recip2real(&wfc_pw(ik, ib, 0), porter, ik); + + // save the rho one z by one z. + for (int iz = 0; iz < GlobalC::wfcpw->nz; iz++) + { + // tag must be different for different iz. + ModuleBase::GlobalFunc::ZEROS(zpiece, nxy); + int tag = iz; + MPI_Status ierror; + + // case 1: the first part of rho in processor 0. + if (which_ip[iz] == 0 && GlobalV::RANK_IN_POOL == 0) + { + for (int ir = 0; ir < nxy; ir++) + { + zpiece[ir] = porter[ir * GlobalC::wfcpw->nplane + iz - GlobalC::wfcpw->startz_current]; + } + } + // case 2: > first part rho: send the rho to + // processor 0. + else if (which_ip[iz] == GlobalV::RANK_IN_POOL) + { + for (int ir = 0; ir < nxy; ir++) + { + zpiece[ir] = porter[ir * GlobalC::wfcpw->nplane + iz - GlobalC::wfcpw->startz_current]; + } + MPI_Send(zpiece, nxy, MPI_DOUBLE_COMPLEX, 0, tag, POOL_WORLD); + } + + // case 2: > first part rho: processor 0 receive the rho + // from other processors + else if (GlobalV::RANK_IN_POOL == 0) + { + MPI_Recv(zpiece, nxy, MPI_DOUBLE_COMPLEX, which_ip[iz], tag, POOL_WORLD, &ierror); + } + + // write data + if (GlobalV::MY_RANK == 0) + { + for (int iy = 0; iy < GlobalC::wfcpw->ny; iy++) + { + for (int ix = 0; ix < GlobalC::wfcpw->nx; ix++) + { + unkfile << std::setw(20) << std::setprecision(9) << std::setiosflags(ios::scientific) + << zpiece[ix * GlobalC::wfcpw->ny + iy].real() << std::setw(20) + << std::setprecision(9) << std::setiosflags(ios::scientific) + << zpiece[ix * GlobalC::wfcpw->ny + iy].imag() << std::endl; + } + } + } + } // end iz + MPI_Barrier(POOL_WORLD); + } + + if (GlobalV::MY_RANK == 0) + { + unkfile.close(); + } + } + } + MPI_Barrier(MPI_COMM_WORLD); + + delete[] num_z; + delete[] start_z; + delete[] which_ip; + delete[] porter; + delete[] zpiece; +#endif +} -void toWannier90::cal_Amn(const 
psi::Psi>& wfc_pw) +void toWannier90::cal_Amn(const psi::Psi> &wfc_pw) { - // ��һ��������ʵ��г����lm��ij��k���µ�ƽ�沨�����µı��񣨾��� - // �ڶ���������̽����ľ��򲿷���ij��k����ƽ�沨ͶӰ - // ����������ȡ��̽�����ij��k����ƽ�沨�����µ�ͶӰ - const int pwNumberMax = GlobalC::wf.npwx; - - std::ofstream Amn_file; - - if(GlobalV::MY_RANK == 0) - { - time_t time_now = time(NULL); - std::string fileaddress = GlobalV::global_out_dir + wannier_file_name + ".amn"; - Amn_file.open( fileaddress.c_str() , ios::out); - Amn_file << " Created on " << ctime(&time_now); - Amn_file << std::setw(12) << num_bands << std::setw(12) << cal_num_kpts << std::setw(12) << num_wannier << std::endl; - } - - ModuleBase::ComplexMatrix *trial_orbitals = new ModuleBase::ComplexMatrix[cal_num_kpts]; - for(int ik = 0; ik < cal_num_kpts; ik++) - { - trial_orbitals[ik].create(num_wannier,pwNumberMax); - produce_trial_in_pw(ik,trial_orbitals[ik]); - } - - // test by jingan - //GlobalV::ofs_running << __FILE__ << __LINE__ << "start_k_index = " << start_k_index << " cal_num_kpts = " << cal_num_kpts << std::endl; - // test by jingan - - for(int ik = start_k_index; ik < (cal_num_kpts+start_k_index); ik++) - { - for(int iw = 0; iw < num_wannier; iw++) - { - int index_band = 0; - for(int ib = 0; ib < GlobalV::NBANDS; ib++) - { - if(!tag_cal_band[ib]) continue; - index_band++; - std::complex amn(0.0,0.0); - std::complex amn_tem(0.0,0.0); - for(int ig = 0; ig < pwNumberMax; ig++) - { - int cal_ik = ik - start_k_index; - amn_tem = amn_tem + conj( wfc_pw(ik,ib,ig) ) * trial_orbitals[cal_ik](iw,ig); - } + const int pwNumberMax = GlobalC::wf.npwx; + + std::ofstream Amn_file; + + if (GlobalV::MY_RANK == 0) + { + time_t time_now = time(NULL); + std::string fileaddress = GlobalV::global_out_dir + wannier_file_name + ".amn"; + Amn_file.open(fileaddress.c_str(), ios::out); + Amn_file << " Created on " << ctime(&time_now); + Amn_file << std::setw(12) << num_bands << std::setw(12) << cal_num_kpts << std::setw(12) << num_wannier + << std::endl; 
+ } + + ModuleBase::ComplexMatrix *trial_orbitals = new ModuleBase::ComplexMatrix[cal_num_kpts]; + for (int ik = 0; ik < cal_num_kpts; ik++) + { + trial_orbitals[ik].create(num_wannier, pwNumberMax); + produce_trial_in_pw(ik, trial_orbitals[ik]); + } + + // test by jingan + // GlobalV::ofs_running << __FILE__ << __LINE__ << "start_k_index = " << start_k_index << " cal_num_kpts = " << + // cal_num_kpts << std::endl; + // test by jingan + + for (int ik = start_k_index; ik < (cal_num_kpts + start_k_index); ik++) + { + for (int iw = 0; iw < num_wannier; iw++) + { + int index_band = 0; + for (int ib = 0; ib < GlobalV::NBANDS; ib++) + { + if (!tag_cal_band[ib]) + continue; + index_band++; + std::complex amn(0.0, 0.0); + std::complex amn_tem(0.0, 0.0); + for (int ig = 0; ig < pwNumberMax; ig++) + { + int cal_ik = ik - start_k_index; + amn_tem = amn_tem + conj(wfc_pw(ik, ib, ig)) * trial_orbitals[cal_ik](iw, ig); + } #ifdef __MPI - MPI_Allreduce(&amn_tem , &amn , 1, MPI_DOUBLE_COMPLEX , MPI_SUM , POOL_WORLD); + MPI_Allreduce(&amn_tem, &amn, 1, MPI_DOUBLE_COMPLEX, MPI_SUM, POOL_WORLD); #else - amn=amn_tem; + amn = amn_tem; #endif - if(GlobalV::MY_RANK == 0) - { - Amn_file << std::setw(5) << index_band << std::setw(5) << iw+1 << std::setw(5) << ik+1-start_k_index - << std::setw(18) << showpoint << fixed << std::setprecision(12) << amn.real() - << std::setw(18) << showpoint << fixed << std::setprecision(12) << amn.imag() - //jingan test - //<< " " << std::setw(18) << std::setprecision(13) << abs(amn) - << std::endl; - } - } - } - } - - - - if(GlobalV::MY_RANK == 0) Amn_file.close(); - - delete[] trial_orbitals; - + if (GlobalV::MY_RANK == 0) + { + Amn_file << std::setw(5) << index_band << std::setw(5) << iw + 1 << std::setw(5) + << ik + 1 - start_k_index << std::setw(18) << showpoint << fixed << std::setprecision(12) + << amn.real() << std::setw(18) << showpoint << fixed << std::setprecision(12) + << amn.imag() + // jingan test + //<< " " << std::setw(18) << 
std::setprecision(13) << abs(amn) + << std::endl; + } + } + } + } + + if (GlobalV::MY_RANK == 0) + Amn_file.close(); + + delete[] trial_orbitals; } - - -void toWannier90::cal_Mmn(const psi::Psi>& wfc_pw) -{ - // test by jingan - //GlobalV::ofs_running << __FILE__ << __LINE__ << " cal_num_kpts = " << cal_num_kpts << std::endl; - // test by jingan - - std::ofstream mmn_file; - - if(GlobalV::MY_RANK == 0) - { - std::string fileaddress = GlobalV::global_out_dir + wannier_file_name + ".mmn"; - mmn_file.open( fileaddress.c_str() , ios::out); - - time_t time_now = time(NULL); - mmn_file << " Created on " << ctime(&time_now); - mmn_file << std::setw(12) << num_bands << std::setw(12) << cal_num_kpts << std::setw(12) << nntot << std::endl; - } - - /* - ModuleBase::ComplexMatrix Mmn(GlobalV::NBANDS,GlobalV::NBANDS); - if(gamma_only_wannier) - { - for(int ib = 0; ib < nntot; ib++) - { - ModuleBase::Vector3 phase_G = nncell[0][ib]; - for(int m = 0; m < GlobalV::NBANDS; m++) - { - if(!tag_cal_band[m]) continue; - for(int n = 0; n <= m; n++) - { - if(!tag_cal_band[n]) continue; - std::complex mmn_tem = gamma_only_cal(m,n,wfc_pw,phase_G); - Mmn(m,n) = mmn_tem; - if(m!=n) Mmn(n,m) = Mmn(m,n); - } - } - } - } - */ - - for(int ik = 0; ik < cal_num_kpts; ik++) - { - for(int ib = 0; ib < nntot; ib++) - { - int ikb = nnlist[ik][ib]; // ik+b : ik�Ľ���k�� - - ModuleBase::Vector3 phase_G = nncell[ik][ib]; - - if(GlobalV::MY_RANK == 0) - { - mmn_file << std::setw(5) << ik+1 << std::setw(5) << ikb+1 << std::setw(5) - << int(phase_G.x) << std::setw(5) << int(phase_G.y) << std::setw(5) << int(phase_G.z) - << std::endl; - } - - for(int m = 0; m < GlobalV::NBANDS; m++) - { - if(!tag_cal_band[m]) continue; - for(int n = 0; n < GlobalV::NBANDS; n++) - { - if(!tag_cal_band[n]) continue; - std::complex mmn(0.0,0.0); - - if(!gamma_only_wannier) - { - int cal_ik = ik + start_k_index; - int cal_ikb = ikb + start_k_index; - // test by jingan - //GlobalV::ofs_running << __FILE__ << __LINE__ << "cal_ik = 
" << cal_ik << "cal_ikb = " << cal_ikb << std::endl; - // test by jingan - //std::complex *unk_L_r = new std::complex[GlobalC::wfcpw->nrxx]; - //ToRealSpace(cal_ik,n,wfc_pw,unk_L_r,phase_G); - //mmn = unkdotb(unk_L_r,cal_ikb,m,wfc_pw); - mmn = unkdotkb(cal_ik,cal_ikb,n,m,phase_G,wfc_pw); - //delete[] unk_L_r; - } - else - { - //GlobalV::ofs_running << "gamma only test" << std::endl; - //mmn = Mmn(n,m); - } - - if(GlobalV::MY_RANK == 0) - { - mmn_file << std::setw(18) << std::setprecision(12) << showpoint << fixed << mmn.real() - << std::setw(18) << std::setprecision(12) << showpoint << fixed << mmn.imag() - // jingan test - //<< " " << std::setw(12) << std::setprecision(9) << abs(mmn) - << std::endl; - } - } - } - } - - } - - if(GlobalV::MY_RANK == 0) mmn_file.close(); - +void toWannier90::cal_Mmn(const psi::Psi> &wfc_pw) +{ + // test by jingan + // GlobalV::ofs_running << __FILE__ << __LINE__ << " cal_num_kpts = " << cal_num_kpts << std::endl; + // test by jingan + + std::ofstream mmn_file; + + if (GlobalV::MY_RANK == 0) + { + std::string fileaddress = GlobalV::global_out_dir + wannier_file_name + ".mmn"; + mmn_file.open(fileaddress.c_str(), ios::out); + + time_t time_now = time(NULL); + mmn_file << " Created on " << ctime(&time_now); + mmn_file << std::setw(12) << num_bands << std::setw(12) << cal_num_kpts << std::setw(12) << nntot << std::endl; + } + + /* + ModuleBase::ComplexMatrix Mmn(GlobalV::NBANDS,GlobalV::NBANDS); + if(gamma_only_wannier) + { + for(int ib = 0; ib < nntot; ib++) + { + ModuleBase::Vector3 phase_G = nncell[0][ib]; + for(int m = 0; m < GlobalV::NBANDS; m++) + { + if(!tag_cal_band[m]) continue; + for(int n = 0; n <= m; n++) + { + if(!tag_cal_band[n]) continue; + std::complex mmn_tem = gamma_only_cal(m,n,wfc_pw,phase_G); + Mmn(m,n) = mmn_tem; + if(m!=n) Mmn(n,m) = Mmn(m,n); + } + } + } + } + */ + + for (int ik = 0; ik < cal_num_kpts; ik++) + { + for (int ib = 0; ib < nntot; ib++) + { + int ikb = nnlist[ik][ib]; + + ModuleBase::Vector3 phase_G = 
nncell[ik][ib]; + + if (GlobalV::MY_RANK == 0) + { + mmn_file << std::setw(5) << ik + 1 << std::setw(5) << ikb + 1 << std::setw(5) << int(phase_G.x) + << std::setw(5) << int(phase_G.y) << std::setw(5) << int(phase_G.z) << std::endl; + } + + for (int m = 0; m < GlobalV::NBANDS; m++) + { + if (!tag_cal_band[m]) + continue; + for (int n = 0; n < GlobalV::NBANDS; n++) + { + if (!tag_cal_band[n]) + continue; + std::complex mmn(0.0, 0.0); + + if (!gamma_only_wannier) + { + int cal_ik = ik + start_k_index; + int cal_ikb = ikb + start_k_index; + // test by jingan + // GlobalV::ofs_running << __FILE__ << __LINE__ << "cal_ik = " << cal_ik << "cal_ikb = " << + // cal_ikb << std::endl; + // test by jingan + // std::complex *unk_L_r = new std::complex[GlobalC::wfcpw->nrxx]; + // ToRealSpace(cal_ik,n,wfc_pw,unk_L_r,phase_G); + // mmn = unkdotb(unk_L_r,cal_ikb,m,wfc_pw); + mmn = unkdotkb(cal_ik, cal_ikb, n, m, phase_G, wfc_pw); + // delete[] unk_L_r; + } + else + { + // GlobalV::ofs_running << "gamma only test" << std::endl; + // mmn = Mmn(n,m); + } + + if (GlobalV::MY_RANK == 0) + { + mmn_file << std::setw(18) << std::setprecision(12) << showpoint << fixed << mmn.real() + << std::setw(18) << std::setprecision(12) << showpoint << fixed + << mmn.imag() + // jingan test + //<< " " << std::setw(12) << std::setprecision(9) << abs(mmn) + << std::endl; + } + } + } + } + } + + if (GlobalV::MY_RANK == 0) + mmn_file.close(); } - void toWannier90::produce_trial_in_pw(const int &ik, ModuleBase::ComplexMatrix &trial_orbitals_k) { - // �������Ƿ���ȷ - for(int i =0; i < num_wannier; i++) - { - if(L[i] < -5 || L[i] > 3) std::cout << "toWannier90::produce_trial_in_pw() your L angular momentum is wrong , please check !!! " << std::endl; - - if(L[i] >= 0) - { - if(m[i] < 0 || m[i] > 2*L[i]) std::cout << "toWannier90::produce_trial_in_pw() your m momentum is wrong , please check !!! 
" << std::endl; - } - else - { - if(m[i] < 0 || m[i] > -L[i]) std::cout << "toWannier90::produce_trial_in_pw() your m momentum is wrong , please check !!! " << std::endl; - - } - } - - const int npw = GlobalC::kv.ngk[ik]; - const int npwx = GlobalC::wf.npwx; - const int total_lm = 16; - ModuleBase::matrix ylm(total_lm,npw); //�������͵���г���� - //matrix wannier_ylm(num_wannier,npw); //Ҫ��̽�����ʹ�õ���г���� - double bs2, bs3, bs6, bs12; - bs2 = 1.0/sqrt(2.0); - bs3 = 1.0/sqrt(3.0); - bs6 = 1.0/sqrt(6.0); - bs12 = 1.0/sqrt(12.0); - - ModuleBase::Vector3 *gk = new ModuleBase::Vector3[npw]; - for(int ig = 0; ig < npw; ig++) - { - gk[ig] = GlobalC::wf.get_1qvec_cartesian(ik, ig); // k+Gʸ�� - } - - ModuleBase::YlmReal::Ylm_Real(total_lm, npw, gk, ylm); - - // test by jingan - //GlobalV::ofs_running << "the mathzone::ylm_real is successful!" << std::endl; - //GlobalV::ofs_running << "produce_trial_in_pw: num_wannier is " << num_wannier << std::endl; - // test by jingan - - - // 1.���ɾ�������ij��k��ƽ�沨�����ͶӰ - const int mesh_r = 333; //��������������Ҫ�ĸ���� - const double dx = 0.025; //�̶�������������ɷǹ̶������dr����߾���,���ֵ������ - const double x_min = -6.0; // ��������dr��r����ʼ�� - ModuleBase::matrix r(num_wannier,mesh_r); //��ͬalfa�ľ�������r - ModuleBase::matrix dr(num_wannier,mesh_r); //��ͬalfa�ľ�������ÿ��r��ļ�� - ModuleBase::matrix psi(num_wannier,mesh_r); //������psi in ʵ�ռ� - ModuleBase::matrix psir(num_wannier,mesh_r);// psi * r in ʵ�ռ� - ModuleBase::matrix psik(num_wannier,npw); //��������ij��k���µ��ռ��ͶӰ - - // ����r,dr - for(int i = 0; i < num_wannier; i++) - { - double x = 0; - for(int ir = 0; ir < mesh_r; ir++) - { - x = x_min + ir * dx; - r(i,ir) = exp(x) / alfa[i]; - dr(i,ir) = dx * r(i,ir); - } - - } - - // ����psi - for(int i = 0; i < num_wannier; i++) - { - double alfa32 = pow(alfa[i],3.0/2.0); - double alfa_new = alfa[i]; - int wannier_index = i; - - if(rvalue[i] == 1) - { - for(int ir = 0; ir < mesh_r; ir++) - { - psi(wannier_index,ir) = 2.0 * alfa32 * 
exp( -alfa_new * r(wannier_index,ir) ); - } - } - - if(rvalue[i] == 2) - { - for(int ir = 0; ir < mesh_r; ir++) - { - psi(wannier_index,ir) = 1.0/sqrt(8.0) * alfa32 - * (2.0 - alfa_new * r(wannier_index,ir)) - * exp( -alfa_new * r(wannier_index,ir) * 0.5 ); - } - } - - if(rvalue[i] == 3) - { - for(int ir = 0; ir < mesh_r; ir++) - { - psi(wannier_index,ir) = sqrt(4.0/27.0) * alfa32 - * ( 1.0 - 2.0/3.0 * alfa_new * r(wannier_index,ir) + 2.0/27.0 * pow(alfa_new,2.0) * r(wannier_index,ir) * r(wannier_index,ir) ) - * exp( -alfa_new * r(wannier_index,ir) * 1.0/3.0 ); - } - } - - } - - // ����psir - for(int i = 0; i < num_wannier; i++) - { - for(int ir = 0; ir < mesh_r; ir++) - { - psir(i,ir) = psi(i,ir) * r(i,ir); - } - } - - - // �����̽��� - for(int wannier_index = 0; wannier_index < num_wannier; wannier_index++) - { - if(L[wannier_index] >= 0) - { - get_trial_orbitals_lm_k(wannier_index, L[wannier_index], m[wannier_index], ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - } - else - { - if(L[wannier_index] == -1 && m[wannier_index] == 0) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex *tem_array = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = bs2 * tem_array[ig] + bs2 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array; - - } - else if(L[wannier_index] == -1 && m[wannier_index] == 1) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex *tem_array = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; 
ig++) - { - trial_orbitals_k(wannier_index,ig) = bs2 * tem_array[ig] - bs2 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array; - } - else if(L[wannier_index] == -2 && m[wannier_index] == 0) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = bs3 * tem_array_1[ig] - bs6 * tem_array_2[ig] + bs2 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array_1; - delete[] tem_array_2; - } - else if(L[wannier_index] == -2 && m[wannier_index] == 1) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = bs3 * tem_array_1[ig] - bs6 * tem_array_2[ig] - bs2 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array_1; - delete[] tem_array_2; - } - else if(L[wannier_index] == -2 && m[wannier_index] == 2) - { - 
get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = bs3 * tem_array[ig] + 2 * bs6 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array; - } - else if(L[wannier_index] == -3 && m[wannier_index] == 0) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_3 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_3[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = 0.5*(tem_array_1[ig] + tem_array_2[ig] + tem_array_3[ig] + trial_orbitals_k(wannier_index,ig)); - } - delete[] tem_array_1; - delete[] tem_array_2; - delete[] tem_array_3; - - } - else if(L[wannier_index] == -3 && m[wannier_index] == 1) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - 
get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_3 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_3[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = 0.5*(tem_array_1[ig] + tem_array_2[ig] - tem_array_3[ig] - trial_orbitals_k(wannier_index,ig)); - } - delete[] tem_array_1; - delete[] tem_array_2; - delete[] tem_array_3; - } - else if(L[wannier_index] == -3 && m[wannier_index] == 2) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_3 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_3[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = 0.5*(tem_array_1[ig] - tem_array_2[ig] + tem_array_3[ig] - trial_orbitals_k(wannier_index,ig)); - } - delete[] tem_array_1; - delete[] tem_array_2; - delete[] tem_array_3; 
- } - else if(L[wannier_index] == -3 && m[wannier_index] == 3) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_3 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_3[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = 0.5*(tem_array_1[ig] - tem_array_2[ig] - tem_array_3[ig] + trial_orbitals_k(wannier_index,ig)); - } - delete[] tem_array_1; - delete[] tem_array_2; - delete[] tem_array_3; - } - else if(L[wannier_index] == -4 && m[wannier_index] == 0) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = bs3 * tem_array_1[ig] - bs6 * tem_array_2[ig] + bs2 * trial_orbitals_k(wannier_index,ig); - } - 
delete[] tem_array_1; - delete[] tem_array_2; - } - else if(L[wannier_index] == -4 && m[wannier_index] == 1) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = bs3 * tem_array_1[ig] - bs6 * tem_array_2[ig] - bs2 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array_1; - delete[] tem_array_2; - } - else if(L[wannier_index] == -4 && m[wannier_index] == 2) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = bs3 * tem_array_1[ig] - 2 * bs6 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array_1; - } - else if(L[wannier_index] == -4 && m[wannier_index] == 3) - { - get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) 
= bs2 * tem_array_1[ig] + bs2 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array_1; - } - else if(L[wannier_index] == -4 && m[wannier_index] == 4) - { - get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = -1.0 * bs2 * tem_array_1[ig] + bs2 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array_1; - } - else if(L[wannier_index] == -5 && m[wannier_index] == 0) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_3 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_3[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 2, 3, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = bs6 * tem_array_1[ig] - bs2 * tem_array_2[ig] - bs12 * tem_array_3[ig] + 0.5 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array_1; - delete[] tem_array_2; - delete[] tem_array_3; - } - else if(L[wannier_index] == -5 && m[wannier_index] == 1) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, 
dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_3 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_3[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 2, 3, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = bs6 * tem_array_1[ig] + bs2 * tem_array_2[ig] - bs12 * tem_array_3[ig] + 0.5 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array_1; - delete[] tem_array_2; - delete[] tem_array_3; - } - else if(L[wannier_index] == -5 && m[wannier_index] == 2) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_3 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_3[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 2, 3, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; 
ig++) - { - trial_orbitals_k(wannier_index,ig) = bs6 * tem_array_1[ig] - bs2 * tem_array_2[ig] - bs12 * tem_array_3[ig] - 0.5 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array_1; - delete[] tem_array_2; - delete[] tem_array_3; - } - else if(L[wannier_index] == -5 && m[wannier_index] == 3) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_3 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_3[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 2, 3, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = bs6 * tem_array_1[ig] + bs2 * tem_array_2[ig] - bs12 * tem_array_3[ig] - 0.5 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array_1; - delete[] tem_array_2; - delete[] tem_array_3; - } - else if(L[wannier_index] == -5 && m[wannier_index] == 4) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - 
get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = bs6 * tem_array_1[ig] - bs2 * tem_array_2[ig] + bs3 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array_1; - delete[] tem_array_2; - } - else if(L[wannier_index] == -5 && m[wannier_index] == 5) - { - get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_1 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_1[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - std::complex * tem_array_2 = new std::complex[npwx]; - for(int ig = 0; ig < npwx; ig++) - { - tem_array_2[ig] = trial_orbitals_k(wannier_index,ig); - } - get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr,r,psir,mesh_r,gk,npw,trial_orbitals_k); - for(int ig = 0; ig < npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = bs6 * tem_array_1[ig] + bs2 * tem_array_2[ig] + bs3 * trial_orbitals_k(wannier_index,ig); - } - delete[] tem_array_1; - delete[] tem_array_2; - } - } - } - - - + + for (int i = 0; i < num_wannier; i++) + { + if (L[i] < -5 || L[i] > 3) + std::cout << "toWannier90::produce_trial_in_pw() your L angular momentum is wrong , please check !!! " + << std::endl; + + if (L[i] >= 0) + { + if (m[i] < 0 || m[i] > 2 * L[i]) + std::cout << "toWannier90::produce_trial_in_pw() your m momentum is wrong , please check !!! " + << std::endl; + } + else + { + if (m[i] < 0 || m[i] > -L[i]) + std::cout << "toWannier90::produce_trial_in_pw() your m momentum is wrong , please check !!! 
" + << std::endl; + } + } + + const int npw = GlobalC::kv.ngk[ik]; + const int npwx = GlobalC::wf.npwx; + const int total_lm = 16; + ModuleBase::matrix ylm(total_lm, npw); + + double bs2, bs3, bs6, bs12; + bs2 = 1.0 / sqrt(2.0); + bs3 = 1.0 / sqrt(3.0); + bs6 = 1.0 / sqrt(6.0); + bs12 = 1.0 / sqrt(12.0); + + ModuleBase::Vector3 *gk = new ModuleBase::Vector3[npw]; + for (int ig = 0; ig < npw; ig++) + { + gk[ig] = GlobalC::wf.get_1qvec_cartesian(ik, ig); + } + + ModuleBase::YlmReal::Ylm_Real(total_lm, npw, gk, ylm); + + // test by jingan + // GlobalV::ofs_running << "the mathzone::ylm_real is successful!" << std::endl; + // GlobalV::ofs_running << "produce_trial_in_pw: num_wannier is " << num_wannier << std::endl; + // test by jingan + + const int mesh_r = 333; + const double dx = 0.025; + const double x_min = -6.0; + ModuleBase::matrix r(num_wannier, mesh_r); + ModuleBase::matrix dr(num_wannier, mesh_r); + ModuleBase::matrix psi(num_wannier, mesh_r); + ModuleBase::matrix psir(num_wannier, mesh_r); + ModuleBase::matrix psik(num_wannier, npw); + + for (int i = 0; i < num_wannier; i++) + { + double x = 0; + for (int ir = 0; ir < mesh_r; ir++) + { + x = x_min + ir * dx; + r(i, ir) = exp(x) / alfa[i]; + dr(i, ir) = dx * r(i, ir); + } + } + + for (int i = 0; i < num_wannier; i++) + { + double alfa32 = pow(alfa[i], 3.0 / 2.0); + double alfa_new = alfa[i]; + int wannier_index = i; + + if (rvalue[i] == 1) + { + for (int ir = 0; ir < mesh_r; ir++) + { + psi(wannier_index, ir) = 2.0 * alfa32 * exp(-alfa_new * r(wannier_index, ir)); + } + } + + if (rvalue[i] == 2) + { + for (int ir = 0; ir < mesh_r; ir++) + { + psi(wannier_index, ir) = 1.0 / sqrt(8.0) * alfa32 * (2.0 - alfa_new * r(wannier_index, ir)) + * exp(-alfa_new * r(wannier_index, ir) * 0.5); + } + } + + if (rvalue[i] == 3) + { + for (int ir = 0; ir < mesh_r; ir++) + { + psi(wannier_index, ir) + = sqrt(4.0 / 27.0) * alfa32 + * (1.0 - 2.0 / 3.0 * alfa_new * r(wannier_index, ir) + + 2.0 / 27.0 * pow(alfa_new, 2.0) * 
r(wannier_index, ir) * r(wannier_index, ir)) + * exp(-alfa_new * r(wannier_index, ir) * 1.0 / 3.0); + } + } + } + + for (int i = 0; i < num_wannier; i++) + { + for (int ir = 0; ir < mesh_r; ir++) + { + psir(i, ir) = psi(i, ir) * r(i, ir); + } + } + + for (int wannier_index = 0; wannier_index < num_wannier; wannier_index++) + { + if (L[wannier_index] >= 0) + { + get_trial_orbitals_lm_k(wannier_index, + L[wannier_index], + m[wannier_index], + ylm, + dr, + r, + psir, + mesh_r, + gk, + npw, + trial_orbitals_k); + } + else + { + if (L[wannier_index] == -1 && m[wannier_index] == 0) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = bs2 * tem_array[ig] + bs2 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array; + } + else if (L[wannier_index] == -1 && m[wannier_index] == 1) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = bs2 * tem_array[ig] - bs2 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array; + } + else if (L[wannier_index] == -2 && m[wannier_index] == 0) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = 
trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = bs3 * tem_array_1[ig] - bs6 * tem_array_2[ig] + bs2 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + delete[] tem_array_2; + } + else if (L[wannier_index] == -2 && m[wannier_index] == 1) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = bs3 * tem_array_1[ig] - bs6 * tem_array_2[ig] - bs2 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + delete[] tem_array_2; + } + else if (L[wannier_index] == -2 && m[wannier_index] == 2) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + 
trial_orbitals_k(wannier_index, ig) + = bs3 * tem_array[ig] + 2 * bs6 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array; + } + else if (L[wannier_index] == -3 && m[wannier_index] == 0) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_3 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_3[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = 0.5 + * (tem_array_1[ig] + tem_array_2[ig] + tem_array_3[ig] + trial_orbitals_k(wannier_index, ig)); + } + delete[] tem_array_1; + delete[] tem_array_2; + delete[] tem_array_3; + } + else if (L[wannier_index] == -3 && m[wannier_index] == 1) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr, r, psir, mesh_r, gk, npw, 
trial_orbitals_k); + std::complex *tem_array_3 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_3[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = 0.5 + * (tem_array_1[ig] + tem_array_2[ig] - tem_array_3[ig] - trial_orbitals_k(wannier_index, ig)); + } + delete[] tem_array_1; + delete[] tem_array_2; + delete[] tem_array_3; + } + else if (L[wannier_index] == -3 && m[wannier_index] == 2) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_3 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_3[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = 0.5 + * (tem_array_1[ig] - tem_array_2[ig] + tem_array_3[ig] - trial_orbitals_k(wannier_index, ig)); + } + delete[] tem_array_1; + delete[] tem_array_2; + delete[] tem_array_3; + } + else if (L[wannier_index] == -3 && m[wannier_index] == 3) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + 
tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_3 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_3[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = 0.5 + * (tem_array_1[ig] - tem_array_2[ig] - tem_array_3[ig] + trial_orbitals_k(wannier_index, ig)); + } + delete[] tem_array_1; + delete[] tem_array_2; + delete[] tem_array_3; + } + else if (L[wannier_index] == -4 && m[wannier_index] == 0) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = bs3 * tem_array_1[ig] - bs6 * tem_array_2[ig] + bs2 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + delete[] tem_array_2; + } + else if (L[wannier_index] == -4 && m[wannier_index] == 1) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, 
trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = bs3 * tem_array_1[ig] - bs6 * tem_array_2[ig] - bs2 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + delete[] tem_array_2; + } + else if (L[wannier_index] == -4 && m[wannier_index] == 2) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = bs3 * tem_array_1[ig] - 2 * bs6 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + } + else if (L[wannier_index] == -4 && m[wannier_index] == 3) + { + get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = bs2 * tem_array_1[ig] + bs2 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + } + else if (L[wannier_index] == 
-4 && m[wannier_index] == 4) + { + get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = -1.0 * bs2 * tem_array_1[ig] + bs2 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + } + else if (L[wannier_index] == -5 && m[wannier_index] == 0) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_3 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_3[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 2, 3, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) = bs6 * tem_array_1[ig] - bs2 * tem_array_2[ig] + - bs12 * tem_array_3[ig] + + 0.5 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + delete[] tem_array_2; + delete[] tem_array_3; + } + else if (L[wannier_index] == -5 && m[wannier_index] == 1) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new 
std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 1, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_3 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_3[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 2, 3, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) = bs6 * tem_array_1[ig] + bs2 * tem_array_2[ig] + - bs12 * tem_array_3[ig] + + 0.5 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + delete[] tem_array_2; + delete[] tem_array_3; + } + else if (L[wannier_index] == -5 && m[wannier_index] == 2) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_3 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_3[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 2, 3, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + 
trial_orbitals_k(wannier_index, ig) = bs6 * tem_array_1[ig] - bs2 * tem_array_2[ig] + - bs12 * tem_array_3[ig] + - 0.5 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + delete[] tem_array_2; + delete[] tem_array_3; + } + else if (L[wannier_index] == -5 && m[wannier_index] == 3) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 2, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_3 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_3[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 2, 3, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) = bs6 * tem_array_1[ig] + bs2 * tem_array_2[ig] + - bs12 * tem_array_3[ig] + - 0.5 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + delete[] tem_array_2; + delete[] tem_array_3; + } + else if (L[wannier_index] == -5 && m[wannier_index] == 4) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = 
trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = bs6 * tem_array_1[ig] - bs2 * tem_array_2[ig] + bs3 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + delete[] tem_array_2; + } + else if (L[wannier_index] == -5 && m[wannier_index] == 5) + { + get_trial_orbitals_lm_k(wannier_index, 0, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_1 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_1[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 1, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + std::complex *tem_array_2 = new std::complex[npwx]; + for (int ig = 0; ig < npwx; ig++) + { + tem_array_2[ig] = trial_orbitals_k(wannier_index, ig); + } + get_trial_orbitals_lm_k(wannier_index, 2, 0, ylm, dr, r, psir, mesh_r, gk, npw, trial_orbitals_k); + for (int ig = 0; ig < npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) + = bs6 * tem_array_1[ig] + bs2 * tem_array_2[ig] + bs3 * trial_orbitals_k(wannier_index, ig); + } + delete[] tem_array_1; + delete[] tem_array_2; + } + } + } } -// Note: the orbital L value here must be greater than or equal to 0 -void toWannier90::get_trial_orbitals_lm_k(const int wannier_index, const int orbital_L, const int orbital_m, ModuleBase::matrix &ylm, - ModuleBase::matrix &dr, ModuleBase::matrix &r, ModuleBase::matrix &psir, const int mesh_r, - ModuleBase::Vector3 *gk, const int npw, ModuleBase::ComplexMatrix &trial_orbitals_k) +void toWannier90::get_trial_orbitals_lm_k(const int wannier_index, + const int orbital_L, + const int orbital_m, + ModuleBase::matrix &ylm, + ModuleBase::matrix &dr, + ModuleBase::matrix &r, + ModuleBase::matrix &psir, + const int mesh_r, + ModuleBase::Vector3 *gk, + const int npw, + ModuleBase::ComplexMatrix &trial_orbitals_k) { - // 1. compute the reciprocal-space projection of the radial function at the given k point
- double *psik = new double[npw]; - double *psir_tem = new double[mesh_r]; - double *r_tem = new double[mesh_r]; - double *dr_tem = new double[mesh_r]; - double *psik_tem = new double[GlobalV::NQX]; // projection of the radial function on the fixed k-space grid (temporary array) - ModuleBase::GlobalFunc::ZEROS(psir_tem,mesh_r); - ModuleBase::GlobalFunc::ZEROS(r_tem,mesh_r); - ModuleBase::GlobalFunc::ZEROS(dr_tem,mesh_r); - - for(int ir = 0; ir < mesh_r; ir++) - { - psir_tem[ir] = psir(wannier_index,ir); - r_tem[ir] = r(wannier_index,ir); - dr_tem[ir] = dr(wannier_index,ir); - } - - toWannier90::integral(mesh_r,psir_tem,r_tem,dr_tem,orbital_L,psik_tem); - - // interpolate from the GlobalV::NQX grid points to obtain the values at the npw G points - for(int ig = 0; ig < npw; ig++) - { - psik[ig] = ModuleBase::PolyInt::Polynomial_Interpolation(psik_tem, GlobalV::NQX, GlobalV::DQ, gk[ig].norm() * GlobalC::ucell.tpiba); - } - - - // 2. compute the plane-wave phase factor arising from the choice of origin (i.e. the orbital centre) - std::complex *sk = new std::complex[npw]; - for(int ig = 0; ig < npw; ig++) - { - const double arg = ( gk[ig] * R_centre[wannier_index] ) * ModuleBase::TWO_PI; - sk[ig] = std::complex ( cos(arg), -sin(arg) ); - } - - // 3. build wannier_ylm - double *wannier_ylm = new double[npw]; - for(int ig = 0; ig < npw; ig++) - { - int index = orbital_L * orbital_L + orbital_m; - if(index == 2 || index == 3 || index == 5 || index == 6 || index == 14 || index == 15) - { - wannier_ylm[ig] = -1 * ylm(index,ig); - } - else - { - wannier_ylm[ig] = ylm(index,ig); - } - } - - // 4. compute the projection of the trial orbital onto the plane-wave basis at the given k point - std::complex lphase = pow(ModuleBase::NEG_IMAG_UNIT, orbital_L); - for(int ig = 0; ig < GlobalC::wf.npwx; ig++) - { - if(ig < npw) - { - trial_orbitals_k(wannier_index,ig) = lphase * sk[ig] * wannier_ylm[ig] * psik[ig]; - } - else trial_orbitals_k(wannier_index,ig) = std::complex(0.0,0.0); - } - - - // 5. normalize - std::complex anorm(0.0,0.0); - for(int ig = 0; ig < GlobalC::wf.npwx; ig++) - { - anorm = anorm + conj(trial_orbitals_k(wannier_index,ig)) * trial_orbitals_k(wannier_index,ig); - } - - 
std::complex anorm_tem(0.0,0.0); + + double *psik = new double[npw]; + double *psir_tem = new double[mesh_r]; + double *r_tem = new double[mesh_r]; + double *dr_tem = new double[mesh_r]; + double *psik_tem = new double[GlobalV::NQX]; + ModuleBase::GlobalFunc::ZEROS(psir_tem, mesh_r); + ModuleBase::GlobalFunc::ZEROS(r_tem, mesh_r); + ModuleBase::GlobalFunc::ZEROS(dr_tem, mesh_r); + + for (int ir = 0; ir < mesh_r; ir++) + { + psir_tem[ir] = psir(wannier_index, ir); + r_tem[ir] = r(wannier_index, ir); + dr_tem[ir] = dr(wannier_index, ir); + } + + toWannier90::integral(mesh_r, psir_tem, r_tem, dr_tem, orbital_L, psik_tem); + + for (int ig = 0; ig < npw; ig++) + { + psik[ig] = ModuleBase::PolyInt::Polynomial_Interpolation(psik_tem, + GlobalV::NQX, + GlobalV::DQ, + gk[ig].norm() * GlobalC::ucell.tpiba); + } + + std::complex *sk = new std::complex[npw]; + for (int ig = 0; ig < npw; ig++) + { + const double arg = (gk[ig] * R_centre[wannier_index]) * ModuleBase::TWO_PI; + sk[ig] = std::complex(cos(arg), -sin(arg)); + } + + double *wannier_ylm = new double[npw]; + for (int ig = 0; ig < npw; ig++) + { + int index = orbital_L * orbital_L + orbital_m; + if (index == 2 || index == 3 || index == 5 || index == 6 || index == 14 || index == 15) + { + wannier_ylm[ig] = -1 * ylm(index, ig); + } + else + { + wannier_ylm[ig] = ylm(index, ig); + } + } + + std::complex lphase = pow(ModuleBase::NEG_IMAG_UNIT, orbital_L); + for (int ig = 0; ig < GlobalC::wf.npwx; ig++) + { + if (ig < npw) + { + trial_orbitals_k(wannier_index, ig) = lphase * sk[ig] * wannier_ylm[ig] * psik[ig]; + } + else + trial_orbitals_k(wannier_index, ig) = std::complex(0.0, 0.0); + } + + std::complex anorm(0.0, 0.0); + for (int ig = 0; ig < GlobalC::wf.npwx; ig++) + { + anorm = anorm + conj(trial_orbitals_k(wannier_index, ig)) * trial_orbitals_k(wannier_index, ig); + } + + std::complex anorm_tem(0.0, 0.0); #ifdef __MPI - MPI_Allreduce(&anorm , &anorm_tem , 1, MPI_DOUBLE_COMPLEX , MPI_SUM , POOL_WORLD); + 
MPI_Allreduce(&anorm, &anorm_tem, 1, MPI_DOUBLE_COMPLEX, MPI_SUM, POOL_WORLD); #else - anorm_tem=anorm; + anorm_tem = anorm; #endif - - for(int ig = 0; ig < GlobalC::wf.npwx; ig++) - { - trial_orbitals_k(wannier_index,ig) = trial_orbitals_k(wannier_index,ig) / sqrt(anorm_tem); - } - - delete[] psik; - delete[] psir_tem; - delete[] r_tem; - delete[] dr_tem; - delete[] psik_tem; - delete[] sk; - delete[] wannier_ylm; - - return; - -} + for (int ig = 0; ig < GlobalC::wf.npwx; ig++) + { + trial_orbitals_k(wannier_index, ig) = trial_orbitals_k(wannier_index, ig) / sqrt(anorm_tem); + } + + delete[] psik; + delete[] psir_tem; + delete[] r_tem; + delete[] dr_tem; + delete[] psik_tem; + delete[] sk; + delete[] wannier_ylm; -void toWannier90::integral(const int meshr, const double *psir, const double *r, const double *rab, const int &l, double* table) + return; +} + +void toWannier90::integral(const int meshr, + const double *psir, + const double *r, + const double *rab, + const int &l, + double *table) { - const double pref = ModuleBase::FOUR_PI / sqrt(GlobalC::ucell.omega); - - double *inner_part = new double[meshr]; - for(int ir=0; ir *psir, + const ModuleBase::Vector3 G) +{ + // (1) set value + std::complex *phase = GlobalC::UFFT.porter; + ModuleBase::GlobalFunc::ZEROS(psir, GlobalC::wfcpw->nrxx); + ModuleBase::GlobalFunc::ZEROS(phase, GlobalC::wfcpw->nrxx); + + for (int ig = 0; ig < GlobalC::kv.ngk[ik]; ig++) + { + psir[GlobalC::wfcpw->ng2fftw[GlobalC::wf.igk(ik, ig)]] = evc[ik](ib, ig); + } + + // get the phase value in realspace + for (int ig = 0; ig < GlobalC::wfcpw->ngmw; ig++) + { + if (GlobalC::wfcpw->ndirect[ig] == G) + { + phase[GlobalC::wfcpw->ng2fftw[ig]] = std::complex(1.0, 0.0); + break; + } + } + // (2) fft and get value + GlobalC::wfcpw->nFT_wfc.FFT3D(psir, 1); + GlobalC::wfcpw->nFT_wfc.FFT3D(phase, 1); + + for (int ir = 0; ir < GlobalC::wfcpw->nrxx; ir++) + { + psir[ir] = psir[ir] * phase[ir]; + } + return; +} -// void toWannier90::ToRealSpace(const int 
&ik, const int &ib, const ModuleBase::ComplexMatrix *evc, std::complex *psir, const ModuleBase::Vector3 G) -// { -// // (1) set value -// std::complex *phase = GlobalC::UFFT.porter; -// ModuleBase::GlobalFunc::ZEROS( psir, GlobalC::wfcpw->nrxx ); -// ModuleBase::GlobalFunc::ZEROS( phase, GlobalC::wfcpw->nrxx); - - -// for (int ig = 0; ig < GlobalC::kv.ngk[ik]; ig++) -// { -// psir[ GlobalC::wfcpw->ng2fftw[ GlobalC::wf.igk(ik,ig) ] ] = evc[ik](ib, ig); -// } - -// // get the phase value in realspace -// for (int ig = 0; ig < GlobalC::wfcpw->ngmw; ig++) -// { -// if (GlobalC::wfcpw->ndirect[ig] == G) -// { -// phase[ GlobalC::wfcpw->ng2fftw[ig] ] = std::complex(1.0,0.0); -// break; -// } -// } -// // (2) fft and get value -// GlobalC::wfcpw->nFT_wfc.FFT3D(psir, 1); -// GlobalC::wfcpw->nFT_wfc.FFT3D(phase, 1); - - - -// for (int ir = 0; ir < GlobalC::wfcpw->nrxx; ir++) -// { -// psir[ir] = psir[ir] * phase[ir]; -// } -// return; -// } - -// std::complex toWannier90::unkdotb(const std::complex *psir, const int ikb, const int bandindex, const ModuleBase::ComplexMatrix *wfc_pw) -// { -// std::complex result(0.0,0.0); -// int knumber = GlobalC::kv.ngk[ikb]; -// std::complex *porter = GlobalC::UFFT.porter; -// ModuleBase::GlobalFunc::ZEROS( porter, GlobalC::wfcpw->nrxx); -// for (int ir = 0; ir < GlobalC::wfcpw->nrxx; ir++) -// { -// porter[ir] = psir[ir]; -// } -// GlobalC::wfcpw->nFT_wfc.FFT3D( porter, -1); - - -// for (int ig = 0; ig < knumber; ig++) -// { -// result = result + conj( porter[ GlobalC::wfcpw->ng2fftw[GlobalC::wf.igk(ikb, ig)] ] ) * wfc_pw[ikb](bandindex,ig); - -// } -// return result; -// } - -std::complex toWannier90::unkdotkb(const int &ik, const int &ikb, const int &iband_L, const int &iband_R, const ModuleBase::Vector3 G, const psi::Psi>& wfc_pw) +std::complex toWannier90::unkdotb(const std::complex *psir, + const int ikb, + const int bandindex, + const ModuleBase::ComplexMatrix *wfc_pw) { - // (1) set value - std::complex result(0.0,0.0); - 
std::complex *psir = new std::complex[GlobalC::wfcpw->nmaxgr]; - std::complex *phase = new std::complex[GlobalC::rhopw->nmaxgr]; - - // get the phase value in realspace - for (int ig = 0; ig < GlobalC::rhopw->npw; ig++) - { - if (GlobalC::rhopw->gdirect[ig] == G) //It should be used carefully. We cannot judge if two double are equal. - { - phase[ig] = std::complex(1.0,0.0); - break; - } - } - - // (2) fft and get value - GlobalC::rhopw->recip2real(phase, phase); - GlobalC::wfcpw->recip2real(&wfc_pw(ik,iband_L,0), psir, ik); - - for (int ir = 0; ir < GlobalC::wfcpw->nrxx; ir++) - { - psir[ir] *= phase[ir]; - } - - GlobalC::wfcpw->real2recip(psir, psir, ik); - - std::complex result_tem(0.0,0.0); - - for (int ig = 0; ig < GlobalC::kv.ngk[ikb]; ig++) - { - result_tem = result_tem + conj( psir[ig]) * wfc_pw(ikb, iband_R,ig); - - } + std::complex result(0.0, 0.0); + int knumber = GlobalC::kv.ngk[ikb]; + std::complex *porter = GlobalC::UFFT.porter; + ModuleBase::GlobalFunc::ZEROS(porter, GlobalC::wfcpw->nrxx); + for (int ir = 0; ir < GlobalC::wfcpw->nrxx; ir++) + { + porter[ir] = psir[ir]; + } + GlobalC::wfcpw->nFT_wfc.FFT3D(porter, -1); + + for (int ig = 0; ig < knumber; ig++) + { + result = result + conj(porter[GlobalC::wfcpw->ng2fftw[GlobalC::wf.igk(ikb, ig)]]) * wfc_pw[ikb](bandindex, ig); + } + return result; +} +*/ +std::complex toWannier90::unkdotkb(const int &ik, + const int &ikb, + const int &iband_L, + const int &iband_R, + const ModuleBase::Vector3 G, + const psi::Psi> &wfc_pw) +{ + // (1) set value + std::complex result(0.0, 0.0); + std::complex *psir = new std::complex[GlobalC::wfcpw->nmaxgr]; + std::complex *phase = new std::complex[GlobalC::rhopw->nmaxgr]; + + // get the phase value in realspace + for (int ig = 0; ig < GlobalC::rhopw->npw; ig++) + { + if (GlobalC::rhopw->gdirect[ig] == G) // It should be used carefully. We cannot judge if two double are equal. 
+ { + phase[ig] = std::complex(1.0, 0.0); + break; + } + } + + // (2) fft and get value + GlobalC::rhopw->recip2real(phase, phase); + GlobalC::wfcpw->recip2real(&wfc_pw(ik, iband_L, 0), psir, ik); + + for (int ir = 0; ir < GlobalC::wfcpw->nrxx; ir++) + { + psir[ir] *= phase[ir]; + } + + GlobalC::wfcpw->real2recip(psir, psir, ik); + + std::complex result_tem(0.0, 0.0); + + for (int ig = 0; ig < GlobalC::kv.ngk[ikb]; ig++) + { + result_tem = result_tem + conj(psir[ig]) * wfc_pw(ikb, iband_R, ig); + } #ifdef __MPI - MPI_Allreduce(&result_tem , &result , 1, MPI_DOUBLE_COMPLEX , MPI_SUM , POOL_WORLD); + MPI_Allreduce(&result_tem, &result, 1, MPI_DOUBLE_COMPLEX, MPI_SUM, POOL_WORLD); #else - result=result_tem; + result = result_tem; #endif - delete[] psir; - delete[] phase; - return result; - + delete[] psir; + delete[] phase; + return result; +} + +/* +std::complex toWannier90::gamma_only_cal(const int &ib_L, + const int &ib_R, + const ModuleBase::ComplexMatrix *wfc_pw, + const ModuleBase::Vector3 G) +{ + std::complex *phase = new std::complex[GlobalC::wfcpw->nrxx]; + std::complex *psir = new std::complex[GlobalC::wfcpw->nrxx]; + std::complex *psir_2 = new std::complex[GlobalC::wfcpw->nrxx]; + ModuleBase::GlobalFunc::ZEROS(phase, GlobalC::wfcpw->nrxx); + ModuleBase::GlobalFunc::ZEROS(psir, GlobalC::wfcpw->nrxx); + ModuleBase::GlobalFunc::ZEROS(psir_2, GlobalC::wfcpw->nrxx); + + for (int ig = 0; ig < GlobalC::kv.ngk[0]; ig++) + { + // psir[ GlobalC::wfcpw->ng2fftw[ GlobalC::wf.igk(0,ig) ] ] = wfc_pw[0](ib_L, ig); + psir[GlobalC::wfcpw->ng2fftw[GlobalC::wf.igk(0, ig)]] = std::complex(abs(wfc_pw[0](ib_L, ig)), 0.0); + } + + // get the phase value in realspace + for (int ig = 0; ig < GlobalC::wfcpw->ngmw; ig++) + { + if (GlobalC::wfcpw->ndirect[ig] == G) + { + phase[GlobalC::wfcpw->ng2fftw[ig]] = std::complex(1.0, 0.0); + break; + } + } + // (2) fft and get value + GlobalC::wfcpw->nFT_wfc.FFT3D(psir, 1); + GlobalC::wfcpw->nFT_wfc.FFT3D(phase, 1); + + for (int ir = 0; ir < 
GlobalC::wfcpw->nrxx; ir++) + { + psir_2[ir] = conj(psir[ir]) * phase[ir]; + } + + for (int ir = 0; ir < GlobalC::wfcpw->nrxx; ir++) + { + psir[ir] = psir[ir] * phase[ir]; + } + + GlobalC::wfcpw->nFT_wfc.FFT3D(psir, -1); + GlobalC::wfcpw->nFT_wfc.FFT3D(psir_2, -1); + + std::complex result(0.0, 0.0); + + for (int ig = 0; ig < GlobalC::kv.ngk[0]; ig++) + { + // result = result + conj(psir_2[ GlobalC::wfcpw->ng2fftw[GlobalC::wf.igk(0,ig)] ]) * wfc_pw[0](ib_R,ig) + psir[ +GlobalC::wfcpw->ng2fftw[ GlobalC::wf.igk(0,ig)] ] * conj(wfc_pw[0](ib_R,ig)); +// std::complex tem = std::complex( abs(wfc_pw[0](ib_R,ig)), 0.0 ); +result = result + conj(psir[GlobalC::wfcpw->ng2fftw[GlobalC::wf.igk(0, ig)]]); // * tem; + } + + delete[] phase; + delete[] psir; + delete[] psir_2; + + return result; } +*/ -// std::complex toWannier90::gamma_only_cal(const int &ib_L, const int &ib_R, const ModuleBase::ComplexMatrix *wfc_pw, const ModuleBase::Vector3 G) -// { -// std::complex *phase = new std::complex[GlobalC::wfcpw->nrxx]; -// std::complex *psir = new std::complex[GlobalC::wfcpw->nrxx]; -// std::complex *psir_2 = new std::complex[GlobalC::wfcpw->nrxx]; -// ModuleBase::GlobalFunc::ZEROS( phase, GlobalC::wfcpw->nrxx); -// ModuleBase::GlobalFunc::ZEROS( psir, GlobalC::wfcpw->nrxx); -// ModuleBase::GlobalFunc::ZEROS( psir_2, GlobalC::wfcpw->nrxx); - -// for (int ig = 0; ig < GlobalC::kv.ngk[0]; ig++) -// { -// //psir[ GlobalC::wfcpw->ng2fftw[ GlobalC::wf.igk(0,ig) ] ] = wfc_pw[0](ib_L, ig); -// psir[ GlobalC::wfcpw->ng2fftw[ GlobalC::wf.igk(0,ig) ] ] = std::complex ( abs(wfc_pw[0](ib_L, ig)), 0.0 ); -// } - -// // get the phase value in realspace -// for (int ig = 0; ig < GlobalC::wfcpw->ngmw; ig++) -// { -// if (GlobalC::wfcpw->ndirect[ig] == G) -// { -// phase[ GlobalC::wfcpw->ng2fftw[ig] ] = std::complex(1.0,0.0); -// break; -// } -// } -// // (2) fft and get value -// GlobalC::wfcpw->nFT_wfc.FFT3D(psir, 1); -// GlobalC::wfcpw->nFT_wfc.FFT3D(phase, 1); - -// for (int ir = 0; ir < 
GlobalC::wfcpw->nrxx; ir++) -// { -// psir_2[ir] = conj(psir[ir]) * phase[ir]; -// } - -// for (int ir = 0; ir < GlobalC::wfcpw->nrxx; ir++) -// { -// psir[ir] = psir[ir] * phase[ir]; -// } - -// GlobalC::wfcpw->nFT_wfc.FFT3D( psir, -1); -// GlobalC::wfcpw->nFT_wfc.FFT3D( psir_2, -1); - -// std::complex result(0.0,0.0); - -// for (int ig = 0; ig < GlobalC::kv.ngk[0]; ig++) -// { -// //result = result + conj(psir_2[ GlobalC::wfcpw->ng2fftw[GlobalC::wf.igk(0,ig)] ]) * wfc_pw[0](ib_R,ig) + psir[ GlobalC::wfcpw->ng2fftw[ GlobalC::wf.igk(0,ig)] ] * conj(wfc_pw[0](ib_R,ig)); -// //std::complex tem = std::complex( abs(wfc_pw[0](ib_R,ig)), 0.0 ); -// result = result + conj(psir[ GlobalC::wfcpw->ng2fftw[ GlobalC::wf.igk(0,ig)] ]);// * tem; -// } - -// delete[] phase; -// delete[] psir; -// delete[] psir_2; - -// return result; - -// } - -// use the lcao_in_pw method to convert the LCAO basis to the PW basis #ifdef __LCAO void toWannier90::lcao2pw_basis(const int ik, ModuleBase::ComplexMatrix &orbital_in_G) { - this->table_local.create(GlobalC::ucell.ntype, GlobalC::ucell.nmax_total, GlobalV::NQX); - Wavefunc_in_pw::make_table_q(GlobalC::ORB.orbital_file, this->table_local); - Wavefunc_in_pw::produce_local_basis_in_pw(ik, orbital_in_G, this->table_local); + this->table_local.create(GlobalC::ucell.ntype, GlobalC::ucell.nmax_total, GlobalV::NQX); + Wavefunc_in_pw::make_table_q(GlobalC::ORB.orbital_file, this->table_local); + Wavefunc_in_pw::produce_local_basis_in_pw(ik, orbital_in_G, this->table_local); } - -// generate unk, the periodic part of the PW-basis wave function, from the LCAO basis: unk_inLcao[ik](ib,ig), with ig ranging over GlobalC::kv.ngk[ik] void toWannier90::getUnkFromLcao() { - std::complex*** lcao_wfc_global = new std::complex**[num_kpts]; - for(int ik = 0; ik < num_kpts; ik++) - { - lcao_wfc_global[ik] = new std::complex*[GlobalV::NBANDS]; - for(int ib = 0; ib < GlobalV::NBANDS; ib++) - { - lcao_wfc_global[ik][ib] = new std::complex[GlobalV::NLOCAL]; - ModuleBase::GlobalFunc::ZEROS(lcao_wfc_global[ik][ib], GlobalV::NLOCAL); - } - } - - - 
if(this->unk_inLcao != nullptr) - { - delete this->unk_inLcao; - } - this->unk_inLcao = new psi::Psi>(num_kpts, GlobalV::NBANDS, GlobalC::wf.npwx, nullptr); - ModuleBase::ComplexMatrix *orbital_in_G = new ModuleBase::ComplexMatrix[num_kpts]; - - for(int ik = 0; ik < num_kpts; ik++) - { - // get the global LCAO wave-function coefficients - get_lcao_wfc_global_ik(lcao_wfc_global[ik], this->wfc_k_grid[ik]); - - int npw = GlobalC::kv.ngk[ik]; - orbital_in_G[ik].create(GlobalV::NLOCAL,npw); - this->lcao2pw_basis(ik,orbital_in_G[ik]); - - } - - // convert unk from the LCAO basis to the PW basis - for(int ik = 0; ik < num_kpts; ik++) - { - for(int ib = 0; ib < GlobalV::NBANDS; ib++) - { - for(int ig = 0; ig < GlobalC::kv.ngk[ik]; ig++) - { - for(int iw = 0; iw < GlobalV::NLOCAL; iw++) - { - unk_inLcao[0](ik,ib,ig) += orbital_in_G[ik](iw,ig)*lcao_wfc_global[ik][ib][iw]; - } - } - } - } - - // normalize - for(int ik = 0; ik < num_kpts; ik++) - { - for(int ib = 0; ib < GlobalV::NBANDS; ib++) - { - std::complex anorm(0.0,0.0); - for(int ig = 0; ig < GlobalC::kv.ngk[ik]; ig++) - { - anorm = anorm + conj( unk_inLcao[0](ik,ib,ig) ) * unk_inLcao[0](ik,ib,ig); - } - - std::complex anorm_tem(0.0,0.0); - #ifdef __MPI - MPI_Allreduce(&anorm , &anorm_tem , 1, MPI_DOUBLE_COMPLEX , MPI_SUM , POOL_WORLD); - #endif - - for(int ig = 0; ig < GlobalC::kv.ngk[ik]; ig++) - { - unk_inLcao[0](ik,ib,ig) = unk_inLcao[0](ik,ib,ig) / sqrt(anorm_tem); - } - - } - } - - - for(int ik = 0; ik < GlobalC::kv.nkstot; ik++) - { - for(int ib = 0; ib < GlobalV::NBANDS; ib++) - { - delete[] lcao_wfc_global[ik][ib]; - } - delete[] lcao_wfc_global[ik]; - } - delete[] lcao_wfc_global; - - delete[] orbital_in_G; + std::complex ***lcao_wfc_global = new std::complex **[num_kpts]; + for (int ik = 0; ik < num_kpts; ik++) + { + lcao_wfc_global[ik] = new std::complex *[GlobalV::NBANDS]; + for (int ib = 0; ib < GlobalV::NBANDS; ib++) + { + lcao_wfc_global[ik][ib] = new std::complex[GlobalV::NLOCAL]; + ModuleBase::GlobalFunc::ZEROS(lcao_wfc_global[ik][ib], 
GlobalV::NLOCAL); + } + } + + if (this->unk_inLcao != nullptr) + { + delete this->unk_inLcao; + } + this->unk_inLcao = new psi::Psi>(num_kpts, GlobalV::NBANDS, GlobalC::wf.npwx, nullptr); + ModuleBase::ComplexMatrix *orbital_in_G = new ModuleBase::ComplexMatrix[num_kpts]; + + for (int ik = 0; ik < num_kpts; ik++) + { + + get_lcao_wfc_global_ik(lcao_wfc_global[ik], this->wfc_k_grid[ik]); + + int npw = GlobalC::kv.ngk[ik]; + orbital_in_G[ik].create(GlobalV::NLOCAL, npw); + this->lcao2pw_basis(ik, orbital_in_G[ik]); + } + + for (int ik = 0; ik < num_kpts; ik++) + { + for (int ib = 0; ib < GlobalV::NBANDS; ib++) + { + for (int ig = 0; ig < GlobalC::kv.ngk[ik]; ig++) + { + for (int iw = 0; iw < GlobalV::NLOCAL; iw++) + { + unk_inLcao[0](ik, ib, ig) += orbital_in_G[ik](iw, ig) * lcao_wfc_global[ik][ib][iw]; + } + } + } + } + + for (int ik = 0; ik < num_kpts; ik++) + { + for (int ib = 0; ib < GlobalV::NBANDS; ib++) + { + std::complex anorm(0.0, 0.0); + for (int ig = 0; ig < GlobalC::kv.ngk[ik]; ig++) + { + anorm = anorm + conj(unk_inLcao[0](ik, ib, ig)) * unk_inLcao[0](ik, ib, ig); + } + + std::complex anorm_tem(0.0, 0.0); +#ifdef __MPI + MPI_Allreduce(&anorm, &anorm_tem, 1, MPI_DOUBLE_COMPLEX, MPI_SUM, POOL_WORLD); +#endif + + for (int ig = 0; ig < GlobalC::kv.ngk[ik]; ig++) + { + unk_inLcao[0](ik, ib, ig) = unk_inLcao[0](ik, ib, ig) / sqrt(anorm_tem); + } + } + } + + for (int ik = 0; ik < GlobalC::kv.nkstot; ik++) + { + for (int ib = 0; ib < GlobalV::NBANDS; ib++) + { + delete[] lcao_wfc_global[ik][ib]; + } + delete[] lcao_wfc_global[ik]; + } + delete[] lcao_wfc_global; + + delete[] orbital_in_G; #ifdef __MPI - MPI_Barrier(MPI_COMM_WORLD); + MPI_Barrier(MPI_COMM_WORLD); #endif - return; + return; } -// get the global LCAO wave-function coefficients void toWannier90::get_lcao_wfc_global_ik(std::complex **ctot, std::complex **cc) { - std::complex* ctot_send = new std::complex[GlobalV::NBANDS*GlobalV::NLOCAL]; + std::complex *ctot_send = new std::complex[GlobalV::NBANDS * GlobalV::NLOCAL]; #ifdef 
__MPI - MPI_Status status; + MPI_Status status; #endif - for (int i=0; i= 0) - { - for (int ib=0; ib* crecv = new std::complex[GlobalV::NBANDS*lgd2]; - ModuleBase::GlobalFunc::ZEROS(crecv, GlobalV::NBANDS*lgd2); - tag = i * 3 + 2; - #ifdef __MPI - MPI_Recv(crecv,GlobalV::NBANDS*lgd2,mpicomplex,i,tag,DIAG_WORLD, &status); - #endif - for (int ib=0; ib=0) - { - //ctot[ib][iw] = crecv[mu_local*GlobalV::NBANDS+ib]; - ctot_send[ib*GlobalV::NLOCAL+iw] = crecv[mu_local*GlobalV::NBANDS+ib]; - } - } - } - - delete[] crecv; - delete[] trace_lo2; - } - } - }// end GlobalV::DRANK=0 - else if ( i == GlobalV::DRANK) - { - int tag; - - // send GlobalC::GridT.lgd - tag = GlobalV::DRANK * 3; - #ifdef __MPI - MPI_Send(&GlobalC::GridT.lgd, 1, MPI_INT, 0, tag, DIAG_WORLD); - #endif - - if(GlobalC::GridT.lgd != 0) - { - // send trace_lo - tag = GlobalV::DRANK * 3 + 1; - #ifdef __MPI - MPI_Send(GlobalC::GridT.trace_lo, GlobalV::NLOCAL, MPI_INT, 0, tag, DIAG_WORLD); - #endif - - // send cc - std::complex* csend = new std::complex[GlobalV::NBANDS*GlobalC::GridT.lgd]; - ModuleBase::GlobalFunc::ZEROS(csend, GlobalV::NBANDS*GlobalC::GridT.lgd); - - for (int ib=0; ib= 0) + { + for (int ib = 0; ib < GlobalV::NBANDS; ib++) + { + // ctot[ib][iw] = cc[ib][mu_local]; + ctot_send[ib * GlobalV::NLOCAL + iw] = cc[ib][mu_local]; + } + } + } + } + else + { + int tag; + // receive lgd2 + int lgd2 = 0; + tag = i * 3; +#ifdef __MPI + MPI_Recv(&lgd2, 1, MPI_INT, i, tag, DIAG_WORLD, &status); +#endif + if (lgd2 == 0) + { + } + else + { + // receive trace_lo2 + tag = i * 3 + 1; + int *trace_lo2 = new int[GlobalV::NLOCAL]; +#ifdef __MPI + MPI_Recv(trace_lo2, GlobalV::NLOCAL, MPI_INT, i, tag, DIAG_WORLD, &status); +#endif + // receive crecv + std::complex *crecv = new std::complex[GlobalV::NBANDS * lgd2]; + ModuleBase::GlobalFunc::ZEROS(crecv, GlobalV::NBANDS * lgd2); + tag = i * 3 + 2; +#ifdef __MPI + MPI_Recv(crecv, GlobalV::NBANDS * lgd2, mpicomplex, i, tag, DIAG_WORLD, &status); +#endif + for (int ib = 0; 
ib < GlobalV::NBANDS; ib++) + { + for (int iw = 0; iw < GlobalV::NLOCAL; iw++) + { + const int mu_local = trace_lo2[iw]; + if (mu_local >= 0) + { + // ctot[ib][iw] = crecv[mu_local*GlobalV::NBANDS+ib]; + ctot_send[ib * GlobalV::NLOCAL + iw] = crecv[mu_local * GlobalV::NBANDS + ib]; + } + } + } + + delete[] crecv; + delete[] trace_lo2; + } + } + } // end GlobalV::DRANK=0 + else if (i == GlobalV::DRANK) + { + int tag; + + // send GlobalC::GridT.lgd + tag = GlobalV::DRANK * 3; +#ifdef __MPI + MPI_Send(&GlobalC::GridT.lgd, 1, MPI_INT, 0, tag, DIAG_WORLD); #endif + if (GlobalC::GridT.lgd != 0) + { + // send trace_lo + tag = GlobalV::DRANK * 3 + 1; +#ifdef __MPI + MPI_Send(GlobalC::GridT.trace_lo, GlobalV::NLOCAL, MPI_INT, 0, tag, DIAG_WORLD); +#endif + // send cc + std::complex *csend = new std::complex[GlobalV::NBANDS * GlobalC::GridT.lgd]; + ModuleBase::GlobalFunc::ZEROS(csend, GlobalV::NBANDS * GlobalC::GridT.lgd); + + for (int ib = 0; ib < GlobalV::NBANDS; ib++) + { + for (int mu = 0; mu < GlobalC::GridT.lgd; mu++) + { + csend[mu * GlobalV::NBANDS + ib] = cc[ib][mu]; + } + } + + tag = GlobalV::DRANK * 3 + 2; +#ifdef __MPI + MPI_Send(csend, GlobalV::NBANDS * GlobalC::GridT.lgd, mpicomplex, 0, tag, DIAG_WORLD); +#endif + + delete[] csend; + } + } // end i==GlobalV::DRANK +#ifdef __MPI + MPI_Barrier(DIAG_WORLD); +#endif + } +#ifdef __MPI + MPI_Bcast(ctot_send, GlobalV::NBANDS * GlobalV::NLOCAL, mpicomplex, 0, DIAG_WORLD); +#endif + + for (int ib = 0; ib < GlobalV::NBANDS; ib++) + { + for (int iw = 0; iw < GlobalV::NLOCAL; iw++) + { + ctot[ib][iw] = ctot_send[ib * GlobalV::NLOCAL + iw]; + } + } + + delete[] ctot_send; + + return; +} + +#endif diff --git a/source/src_io/to_wannier90.h b/source/src_io/to_wannier90.h index 2377135ad9..2e14cca1a1 100644 --- a/source/src_io/to_wannier90.h +++ b/source/src_io/to_wannier90.h @@ -3,101 +3,109 @@ #include using namespace std; -#include -#include -#include -#include +#include "../module_base/complexmatrix.h" #include 
"../module_base/global_function.h" #include "../module_base/global_variable.h" +#include "../module_base/lapack_connector.h" #include "../module_base/matrix.h" #include "../module_base/matrix3.h" -#include "../module_base/complexmatrix.h" -#include "../module_base/lapack_connector.h" #include "../src_lcao/wavefunc_in_pw.h" #include "module_psi/psi.h" +#include +#include +#include +#include + #ifdef __LCAO #include "../src_lcao/local_orbital_wfc.h" #endif - class toWannier90 { -public: - //const int k_supercell = 5; // default the k-space supercell - //const int k_cells = (2 * k_supercell + 1)*(2 * k_supercell + 1)*(2 * k_supercell + 1); // the primitive cell number in k-space supercell - //const int k_shells = 12; // default the shell numbers - //const double large_number = 99999999.0; - //const double small_number = 0.000001; - //std::vector> lmn; //ÿ��k��ԭ����� - //std::vector dist_shell; //ÿһ��shell�Ľ���k����� - //std::vector multi; //ÿһ��shell�Ľ���k����Ŀ - //int num_shell_real; //����������B1������shell��Ŀ�����ս����(ע��1��ʼ����) - //int *shell_list_real; //1��12��shell�в�ƽ�в��ȼ۵�shell��ǩ������Ϊnum_shell_real - //double *bweight; //ÿ��shell��bweight������Ϊnum_shell_real - - int num_kpts; // k�����Ŀ - int cal_num_kpts; // ��Ҫ�����k����Ŀ������nspin=2ʱ���ô� - ModuleBase::Matrix3 recip_lattice; - std::vector> nnlist; //ÿ��k��Ľ���k����� - std::vector>> nncell; //ÿ��k��Ľ���k�����ڵ�ԭ����� - int nntot = 0; //ÿ��k��Ľ���k����Ŀ - int num_wannier; //��Ҫ����wannier�����ĸ��� - int *L; //��̽����Ľ�������ָ��,����Ϊnum_wannier - int *m; //��̽����Ĵ�������ָ��,����Ϊnum_wannier - int *rvalue; //��̽����ľ��򲿷ֺ�����ʽ,ֻ��������ʽ,����Ϊnum_wannier - double *alfa; //��̽����ľ��򲿷ֺ����еĵ��ڲ���,����Ϊnum_wannier - ModuleBase::Vector3 *R_centre; //��̽�����������,����Ϊnum_wannier,cartesian���� - std::string wannier_file_name = "seedname"; // .mmn,.amn�ļ��� - int num_exclude_bands = 0; // �ų�������ܴ���Ŀ��-1��ʾû����Ҫ�ų����ܴ� - int *exclude_bands; // �ų��ܴ���index - bool *tag_cal_band; // 
�ж�GlobalV::NBANDS�ܴ���һ����Ҫ���� - int num_bands; // wannier90 �е�num_bands - bool gamma_only_wannier = false; // ֻ��gamma������wannier���� - std::string wannier_spin = "up"; // spin��������up,down�������� - int start_k_index = 0; // ����forѭ��Ѱ��k��ָ�꣬spin=2ʱ��ʼ��index�Dz�һ���� - - - // ������lcao�����µ�wannier90������� - ModuleBase::realArray table_local; - psi::Psi> *unk_inLcao = nullptr; // lcao�����²����������ڲ���unk + public: + // const int k_supercell = 5; + // const int k_cells = (2 * k_supercell + 1)*(2 * k_supercell + 1)*(2 * k_supercell + 1); + // const int k_shells = 12; + // const double large_number = 99999999.0; + // const double small_number = 0.000001; + // std::vector> lmn; + // std::vector dist_shell; + // std::vector multi; + // int num_shell_real; + // int *shell_list_real; + // double *bweight; + int num_kpts; + int cal_num_kpts; + ModuleBase::Matrix3 recip_lattice; + std::vector> nnlist; + std::vector>> nncell; + int nntot = 0; + int num_wannier; + int *L; + int *m; + int *rvalue; + double *alfa; + ModuleBase::Vector3 *R_centre; + std::string wannier_file_name = "seedname"; + int num_exclude_bands = 0; + int *exclude_bands; + bool *tag_cal_band; + int num_bands; + bool gamma_only_wannier = false; + std::string wannier_spin = "up"; + int start_k_index = 0; + ModuleBase::realArray table_local; + psi::Psi> *unk_inLcao = nullptr; toWannier90(int num_kpts, ModuleBase::Matrix3 recip_lattice); - toWannier90(int num_kpts,ModuleBase::Matrix3 recip_lattice, std::complex*** wfc_k_grid_in); + toWannier90(int num_kpts, ModuleBase::Matrix3 recip_lattice, std::complex ***wfc_k_grid_in); ~toWannier90(); - //void kmesh_supercell_sort(); //������ԭ��ľ����С��������lmn - //void get_nnkpt_first(); //������12��shell�Ľ���k��ľ���͸��� - //void kmesh_get_bvectors(int multi, int reference_kpt, double dist_shell, std::vector>& bvector); //��ȡָ��shell�㣬ָ���ο�k��Ľ���k���bvector - //void get_nnkpt_last(); //��ȡ���յ�shell��Ŀ��bweight - //void get_nnlistAndnncell(); + // 
void kmesh_supercell_sort(); + // void get_nnkpt_first(); + // void kmesh_get_bvectors(int multi, int reference_kpt, double dist_shell, + // std::vector>& bvector); void get_nnkpt_last(); - void init_wannier(const psi::Psi>* psi=nullptr); - void read_nnkp(); - void outEIG(); - void cal_Amn(const psi::Psi>& wfc_pw); - void cal_Mmn(const psi::Psi>& wfc_pw); - void produce_trial_in_pw(const int &ik, ModuleBase::ComplexMatrix &trial_orbitals_k); - void get_trial_orbitals_lm_k(const int wannier_index, const int orbital_L, const int orbital_m, ModuleBase::matrix &ylm, - ModuleBase::matrix &dr, ModuleBase::matrix &r, ModuleBase::matrix &psir, const int mesh_r, - ModuleBase::Vector3 *gk, const int npw, ModuleBase::ComplexMatrix &trial_orbitals_k); - void integral(const int meshr, const double *psir, const double *r, const double *rab, const int &l, double* table); - void writeUNK(const psi::Psi>& wfc_pw); - // void ToRealSpace(const int &ik, const int &ib, const ModuleBase::ComplexMatrix *evc, std::complex *psir, const ModuleBase::Vector3 G); - // std::complex unkdotb(const std::complex *psir, const int ikb, const int bandindex, const ModuleBase::ComplexMatrix *wfc_pw); - std::complex unkdotkb(const int &ik, const int &ikb, const int &iband_L, const int &iband_R, const ModuleBase::Vector3 G, const psi::Psi>& wfc_pw); - // std::complex gamma_only_cal(const int &ib_L, const int &ib_R, const ModuleBase::ComplexMatrix *wfc_pw, const ModuleBase::Vector3 G); - - // lcao���� - void lcao2pw_basis(const int ik, ModuleBase::ComplexMatrix &orbital_in_G); - void getUnkFromLcao(); - void get_lcao_wfc_global_ik(std::complex** ctot, std::complex** cc); + void init_wannier(const psi::Psi> *psi = nullptr); + void read_nnkp(); + void outEIG(); + void cal_Amn(const psi::Psi> &wfc_pw); + void cal_Mmn(const psi::Psi> &wfc_pw); + void produce_trial_in_pw(const int &ik, ModuleBase::ComplexMatrix &trial_orbitals_k); + void get_trial_orbitals_lm_k(const int wannier_index, + const int orbital_L, + 
const int orbital_m, + ModuleBase::matrix &ylm, + ModuleBase::matrix &dr, + ModuleBase::matrix &r, + ModuleBase::matrix &psir, + const int mesh_r, + ModuleBase::Vector3 *gk, + const int npw, + ModuleBase::ComplexMatrix &trial_orbitals_k); + void integral(const int meshr, const double *psir, const double *r, const double *rab, const int &l, double *table); + void writeUNK(const psi::Psi> &wfc_pw); + // void ToRealSpace(const int &ik, const int &ib, const ModuleBase::ComplexMatrix *evc, std::complex *psir, + // const ModuleBase::Vector3 G); std::complex unkdotb(const std::complex *psir, const int + // ikb, const int bandindex, const ModuleBase::ComplexMatrix *wfc_pw); + std::complex unkdotkb(const int &ik, + const int &ikb, + const int &iband_L, + const int &iband_R, + const ModuleBase::Vector3 G, + const psi::Psi> &wfc_pw); + // std::complex gamma_only_cal(const int &ib_L, const int &ib_R, const ModuleBase::ComplexMatrix *wfc_pw, + // const ModuleBase::Vector3 G); -private: - std::complex*** wfc_k_grid; + void lcao2pw_basis(const int ik, ModuleBase::ComplexMatrix &orbital_in_G); + void getUnkFromLcao(); + void get_lcao_wfc_global_ik(std::complex **ctot, std::complex **cc); + private: + std::complex ***wfc_k_grid; }; #endif diff --git a/source/src_lcao/LCAO_gen_fixedH.cpp b/source/src_lcao/LCAO_gen_fixedH.cpp index eb0522a0b1..eba45bd222 100644 --- a/source/src_lcao/LCAO_gen_fixedH.cpp +++ b/source/src_lcao/LCAO_gen_fixedH.cpp @@ -24,13 +24,6 @@ LCAO_gen_fixedH::~LCAO_gen_fixedH() void LCAO_gen_fixedH::calculate_NL_no(double* HlocR) { ModuleBase::TITLE("LCAO_gen_fixedH","calculate_NL_no"); - if(GlobalV::NSPIN==4) - { - this->build_Nonlocal_mu(HlocR, false); - return; - //ModuleBase::WARNING_QUIT("LCAO_gen_fixedH::calculate_NL_no","noncollinear case shoule be complex* type"); - } - if(GlobalV::GAMMA_ONLY_LOCAL) { //for gamma only. 
@@ -64,16 +57,6 @@ void LCAO_gen_fixedH::calculate_NL_no(double* HlocR) return; } -/*void LCAO_gen_fixedH::calculate_NL_no(std::complex* HlocR) -{ - ModuleBase::TITLE("LCAO_gen_fixedH","calculate_NL_no"); - if(GlobalV::NSPIN!=4) ModuleBase::WARNING_QUIT("LCAO_gen_fixedH::calculate_NL_no","complex* type shoule be noncollinear case"); - - this->build_Nonlocal_mu(HlocR, false); - - return; -}*/ - void LCAO_gen_fixedH::calculate_T_no(double* HlocR) { ModuleBase::TITLE("LCAO_gen_fixedH","calculate_T_no"); @@ -733,39 +716,51 @@ void LCAO_gen_fixedH::build_Nonlocal_mu_new(double* NLloc, const bool &calc_deri { std::vector nlm_1=(*nlm_cur1_e)[iw1_all]; std::vector nlm_2=(*nlm_cur2_e)[iw2_all]; - double nlm_tmp = 0.0; - - const int nproj = GlobalC::ucell.infoNL.nproj[T0]; - int ib = 0; - for (int nb = 0; nb < nproj; nb++) + if(GlobalV::NSPIN==4) { - const int L0 = GlobalC::ucell.infoNL.Beta[T0].Proj[nb].getL(); - for(int m=0;m<2*L0+1;m++) + std::complex nlm_tmp = ModuleBase::ZERO; + int is0 = (j-j0*GlobalV::NPOL) + (k-k0*GlobalV::NPOL)*2; + for (int no = 0; no < GlobalC::ucell.atoms[T0].non_zero_count_soc[is0]; no++) { - if(nlm_1[ib]!=0.0 && nlm_2[ib]!=0.0) + const int p1 = GlobalC::ucell.atoms[T0].index1_soc[is0][no]; + const int p2 = GlobalC::ucell.atoms[T0].index2_soc[is0][no]; + nlm_tmp += nlm_1[p1] * nlm_2[p2] * GlobalC::ucell.atoms[T0].d_so(is0, p2, p1); + } + this->LM->Hloc_fixedR_soc[nnr+nnr_inner] += nlm_tmp; + } + else + { + double nlm_tmp = 0.0; + const int nproj = GlobalC::ucell.infoNL.nproj[T0]; + int ib = 0; + for (int nb = 0; nb < nproj; nb++) + { + const int L0 = GlobalC::ucell.infoNL.Beta[T0].Proj[nb].getL(); + for(int m=0;m<2*L0+1;m++) { - nlm_tmp += nlm_1[ib]*nlm_2[ib]*GlobalC::ucell.atoms[T0].dion(nb,nb); + if(nlm_1[ib]!=0.0 && nlm_2[ib]!=0.0) + { + nlm_tmp += nlm_1[ib]*nlm_2[ib]*GlobalC::ucell.atoms[T0].dion(nb,nb); + } + ib+=1; } - ib+=1; } - } - assert(ib==nlm_1.size()); + assert(ib==nlm_1.size()); - if(GlobalV::GAMMA_ONLY_LOCAL) - { - // mohan add 
2010-12-20 - if( nlm_tmp!=0.0 ) + if(GlobalV::GAMMA_ONLY_LOCAL) { - // GlobalV::ofs_running << std::setw(10) << iw1_all << std::setw(10) - // << iw2_all << std::setw(20) << nlm[0] << std::endl; - this->LM->set_HSgamma(iw1_all,iw2_all,nlm_tmp,'N', NLloc);//N stands for nonlocal. + // mohan add 2010-12-20 + if( nlm_tmp!=0.0 ) + { + this->LM->set_HSgamma(iw1_all,iw2_all,nlm_tmp,'N', NLloc);//N stands for nonlocal. + } } - } - else - { - if( nlm_tmp!=0.0 ) + else { - NLloc[nnr+nnr_inner] += nlm_tmp; + if( nlm_tmp!=0.0 ) + { + NLloc[nnr+nnr_inner] += nlm_tmp; + } } } }// calc_deri @@ -874,12 +869,10 @@ void LCAO_gen_fixedH::build_Nonlocal_mu_new(double* NLloc, const bool &calc_deri if(!GlobalV::GAMMA_ONLY_LOCAL) { - // std::cout << " nr=" << nnr << std::endl; - // std::cout << " pv->nnr=" << pv->nnr << std::endl; - // GlobalV::ofs_running << " nr=" << nnr << std::endl; - // GlobalV::ofs_running << " pv->nnr=" << pv->nnr << std::endl; if( nnr!=pv->nnr) { + GlobalV::ofs_running << " nr=" << nnr << std::endl; + GlobalV::ofs_running << " pv->nnr=" << pv->nnr << std::endl; ModuleBase::WARNING_QUIT("LCAO_gen_fixedH::build_Nonlocal_mu_new","nnr!=LNNR.nnr"); } } @@ -1025,6 +1018,13 @@ void LCAO_gen_fixedH::build_Nonlocal_mu(double* NLloc, const bool &calc_deri) if(!calc_deri) { int is0 = (j-j0*GlobalV::NPOL) + (k-k0*GlobalV::NPOL)*2; + //Note : there was a bug in the old implementation + //of soc nonlocal PP, which does not seem to affect the + //converged results though. + //However, there is a discrepancy in the integrate test case + //240*soc, when checked against the new method. 
+ //The origin of the bug is the mismatch between the indexes + //of and d_so GlobalC::UOT.snap_psibeta( GlobalC::ORB, GlobalC::ucell.infoNL, diff --git a/source/src_pdiag/pdiag_double.cpp b/source/src_pdiag/pdiag_double.cpp index 5fb375362e..4b1b43a727 100644 --- a/source/src_pdiag/pdiag_double.cpp +++ b/source/src_pdiag/pdiag_double.cpp @@ -13,7 +13,6 @@ extern "C" { #include "../module_base/blacs_connector.h" - #include "my_elpa.h" #include "../module_base/scalapack_connector.h" } #include "pdgseps.h" @@ -27,47 +26,6 @@ extern "C" #include "diag_cusolver.cuh" #endif -#ifdef __MPI -inline int set_elpahandle(elpa_t &handle, const int *desc,const int local_nrows,const int local_ncols, const int nbands) -{ - int error; - int nprows, npcols, myprow, mypcol; - Cblacs_gridinfo(desc[1], &nprows, &npcols, &myprow, &mypcol); - elpa_init(20210430); - handle = elpa_allocate(&error); - elpa_set_integer(handle, "na", desc[2], &error); - elpa_set_integer(handle, "nev", nbands, &error); - - elpa_set_integer(handle, "local_nrows", local_nrows, &error); - - elpa_set_integer(handle, "local_ncols", local_ncols, &error); - - elpa_set_integer(handle, "nblk", desc[4], &error); - - elpa_set_integer(handle, "mpi_comm_parent", MPI_Comm_c2f(MPI_COMM_WORLD), &error); - - elpa_set_integer(handle, "process_row", myprow, &error); - - elpa_set_integer(handle, "process_col", mypcol, &error); - - elpa_set_integer(handle, "blacs_context", desc[1], &error); - - elpa_set_integer(handle, "cannon_for_generalized", 0, &error); - /* Setup */ - elpa_setup(handle); /* Set tunables */ - return 0; -} -#endif - - -inline bool ifElpaHandle(const bool& newIteration, const bool& ifNSCF) -{ - int doHandle = false; - if(newIteration) doHandle = true; - if(ifNSCF) doHandle = true; - return doHandle; -} - #ifdef __CUSOLVER_LCAO template void cusolver_helper_gather(const T* mat_loc, T* mat_glb, const Parallel_Orbitals* pv){ diff --git a/source/src_pw/energy.cpp b/source/src_pw/energy.cpp index 
f22c8b2cfe..3bac1627d6 100644 --- a/source/src_pw/energy.cpp +++ b/source/src_pw/energy.cpp @@ -181,6 +181,15 @@ void energy::print_etot( this->print_format("E_sol_el", esol_el); this->print_format("E_sol_cav", esol_cav); } + if (GlobalV::comp_chg) + { + vector ecomp(3, 0); + GlobalC::solvent_model.cal_Acomp(GlobalC::ucell, GlobalC::rhopw, GlobalC::CHR.rho, ecomp); + this->print_format("E_comp_self", ecomp[0]); + this->print_format("E_comp_electron", ecomp[1]); + this->print_format("E_comp_nuclear", ecomp[2]); + this->print_format("E_comp_tot", ecomp[0] + ecomp[1] + ecomp[2]); + } #ifdef __DEEPKS if (GlobalV::deepks_scf) //caoyu add 2021-08-10 { diff --git a/source/src_pw/forces.cpp b/source/src_pw/forces.cpp index 3a38806b18..3822e798ae 100644 --- a/source/src_pw/forces.cpp +++ b/source/src_pw/forces.cpp @@ -1,82 +1,83 @@ #include "forces.h" + +#include "../module_symmetry/symmetry.h" #include "global.h" #include "vdwd2.h" -#include "vdwd3.h" -#include "../module_symmetry/symmetry.h" +#include "vdwd3.h" // new -#include "../module_xc/xc_functional.h" #include "../module_base/math_integral.h" -#include "../src_parallel/parallel_reduce.h" #include "../module_base/timer.h" #include "../module_surchem/efield.h" #include "../module_surchem/surchem.h" -double Forces::output_acc = 1.0e-8; // (Ryd/angstrom). +double Forces::output_acc = 1.0e-8; // (Ryd/angstrom). 
Forces::Forces() { } -Forces::~Forces() {} +Forces::~Forces() +{ +} #include "../module_base/mathzone.h" void Forces::init(ModuleBase::matrix& force, const psi::Psi>* psi_in) { - ModuleBase::TITLE("Forces", "init"); - this->nat = GlobalC::ucell.nat; - force.create(nat, 3); - - ModuleBase::matrix forcelc(nat, 3); - ModuleBase::matrix forceion(nat, 3); - ModuleBase::matrix forcecc(nat, 3); - ModuleBase::matrix forcenl(nat, 3); - ModuleBase::matrix forcescc(nat, 3); + ModuleBase::TITLE("Forces", "init"); + this->nat = GlobalC::ucell.nat; + force.create(nat, 3); + + ModuleBase::matrix forcelc(nat, 3); + ModuleBase::matrix forceion(nat, 3); + ModuleBase::matrix forcecc(nat, 3); + ModuleBase::matrix forcenl(nat, 3); + ModuleBase::matrix forcescc(nat, 3); this->cal_force_loc(forcelc, GlobalC::rhopw); this->cal_force_ew(forceion, GlobalC::rhopw); this->cal_force_nl(forcenl, psi_in); - this->cal_force_cc(forcecc, GlobalC::rhopw); - this->cal_force_scc(forcescc, GlobalC::rhopw); + this->cal_force_cc(forcecc, GlobalC::rhopw); + this->cal_force_scc(forcescc, GlobalC::rhopw); - ModuleBase::matrix stress_vdw_pw;//.create(3,3); + ModuleBase::matrix stress_vdw_pw; //.create(3,3); ModuleBase::matrix force_vdw; force_vdw.create(nat, 3); - if(GlobalC::vdwd2_para.flag_vdwd2) //Peize Lin add 2014.04.03, update 2021.03.09 - { - Vdwd2 vdwd2(GlobalC::ucell,GlobalC::vdwd2_para); - vdwd2.cal_force(); - for(int iat=0; iat } } - //impose total force = 0 + ModuleBase::matrix forcecomp; + if (GlobalV::comp_chg) + { + forcecomp.create(GlobalC::ucell.nat, 3); + GlobalC::solvent_model.cal_comp_force(forcecomp, GlobalC::rhopw); + } + + // impose total force = 0 int iat = 0; - for (int ipol = 0; ipol < 3; ipol++) - { - double sum = 0.0; - iat = 0; + for (int ipol = 0; ipol < 3; ipol++) + { + double sum = 0.0; + iat = 0; - for (int it = 0;it < GlobalC::ucell.ntype;it++) - { - for (int ia = 0;ia < GlobalC::ucell.atoms[it].na;ia++) - { - force(iat, ipol) = - forcelc(iat, ipol) - + forceion(iat, ipol) - 
+ forcenl(iat, ipol) - + forcecc(iat, ipol) - + forcescc(iat, ipol); - - if(GlobalC::vdwd2_para.flag_vdwd2 || GlobalC::vdwd3_para.flag_vdwd3) //linpz and jiyy added vdw force, modified by zhengdy - { + for (int it = 0; it < GlobalC::ucell.ntype; it++) + { + for (int ia = 0; ia < GlobalC::ucell.atoms[it].na; ia++) + { + force(iat, ipol) = forcelc(iat, ipol) + forceion(iat, ipol) + forcenl(iat, ipol) + forcecc(iat, ipol) + + forcescc(iat, ipol); + + if (GlobalC::vdwd2_para.flag_vdwd2 + || GlobalC::vdwd3_para.flag_vdwd3) // linpz and jiyy added vdw force, modified by zhengdy + { force(iat, ipol) += force_vdw(iat, ipol); - } - - if(GlobalV::EFIELD_FLAG) - { - force(iat,ipol) = force(iat, ipol) + force_e(iat, ipol); - } + } + + if (GlobalV::EFIELD_FLAG) + { + force(iat, ipol) = force(iat, ipol) + force_e(iat, ipol); + } + + if (GlobalV::comp_chg) + { + force(iat, ipol) = force(iat, ipol) + forcecomp(iat, ipol); + } if(GlobalV::imp_sol) { @@ -128,99 +138,125 @@ void Forces::init(ModuleBase::matrix& force, const psi::Psi sum += force(iat, ipol); - iat++; - } - } + iat++; + } + } - double compen = sum / GlobalC::ucell.nat; - for(int iat=0; iat return; } -void Forces::print_to_files(std::ofstream &ofs, const std::string &name, const ModuleBase::matrix &f) +void Forces::print_to_files(std::ofstream& ofs, const std::string& name, const ModuleBase::matrix& f) { int iat = 0; ofs << " " << name; ofs << std::setprecision(8); - //ofs << std::setiosflags(ios::showpos); - - double fac = ModuleBase::Ry_to_eV / 0.529177;// (eV/A) + // ofs << std::setiosflags(ios::showpos); - if(GlobalV::TEST_FORCE) - { - std::cout << std::setiosflags(ios::showpos); - std::cout << " " << name; - std::cout << std::setprecision(8); - } + double fac = ModuleBase::Ry_to_eV / 0.529177; // (eV/A) - for (int it = 0;it < GlobalC::ucell.ntype;it++) + if (GlobalV::TEST_FORCE) { - for (int ia = 0;ia < GlobalC::ucell.atoms[it].na;ia++) - { - ofs << " " << std::setw(5) << it - << std::setw(8) << ia+1 - << 
std::setw(20) << f(iat, 0)*fac - << std::setw(20) << f(iat, 1)*fac - << std::setw(20) << f(iat, 2)*fac << std::endl; - - if(GlobalV::TEST_FORCE) - { - std::cout << " " << std::setw(5) << it - << std::setw(8) << ia+1 - << std::setw(20) << f(iat, 0)*fac - << std::setw(20) << f(iat, 1)*fac - << std::setw(20) << f(iat, 2)*fac << std::endl; - } + std::cout << std::setiosflags(ios::showpos); + std::cout << " " << name; + std::cout << std::setprecision(8); + } + + for (int it = 0; it < GlobalC::ucell.ntype; it++) + { + for (int ia = 0; ia < GlobalC::ucell.atoms[it].na; ia++) + { + ofs << " " << std::setw(5) << it << std::setw(8) << ia + 1 << std::setw(20) << f(iat, 0) * fac + << std::setw(20) << f(iat, 1) * fac << std::setw(20) << f(iat, 2) * fac << std::endl; + + if (GlobalV::TEST_FORCE) + { + std::cout << " " << std::setw(5) << it << std::setw(8) << ia + 1 << std::setw(20) << f(iat, 0) * fac + << std::setw(20) << f(iat, 1) * fac << std::setw(20) << f(iat, 2) * fac << std::endl; + } iat++; } } - GlobalV::ofs_running << std::resetiosflags(ios::showpos); - std::cout << std::resetiosflags(ios::showpos); + GlobalV::ofs_running << std::resetiosflags(ios::showpos); + std::cout << std::resetiosflags(ios::showpos); return; } - - -void Forces::print(const std::string &name, const ModuleBase::matrix &f, bool ry) +void Forces::print(const std::string& name, const ModuleBase::matrix& f, bool ry) { - ModuleBase::GlobalFunc::NEW_PART(name); + ModuleBase::GlobalFunc::NEW_PART(name); - GlobalV::ofs_running << " " << std::setw(8) << "atom" << std::setw(15) << "x" << std::setw(15) << "y" << std::setw(15) << "z" << std::endl; - GlobalV::ofs_running << std::setiosflags(ios::showpos); + GlobalV::ofs_running << " " << std::setw(8) << "atom" << std::setw(15) << "x" << std::setw(15) << "y" + << std::setw(15) << "z" << std::endl; + GlobalV::ofs_running << std::setiosflags(ios::showpos); GlobalV::ofs_running << std::setprecision(8); - const double fac = ModuleBase::Ry_to_eV / 0.529177; - - 
if(GlobalV::TEST_FORCE) - { - std::cout << " --------------- " << name << " ---------------" << std::endl; - std::cout << " " << std::setw(8) << "atom" << std::setw(15) << "x" << std::setw(15) << "y" << std::setw(15) << "z" << std::endl; - std::cout << std::setiosflags(ios::showpos); - std::cout << std::setprecision(6); - } + const double fac = ModuleBase::Ry_to_eV / 0.529177; + + if (GlobalV::TEST_FORCE) + { + std::cout << " --------------- " << name << " ---------------" << std::endl; + std::cout << " " << std::setw(8) << "atom" << std::setw(15) << "x" << std::setw(15) << "y" << std::setw(15) + << "z" << std::endl; + std::cout << std::setiosflags(ios::showpos); + std::cout << std::setprecision(6); + } int iat = 0; - for (int it = 0;it < GlobalC::ucell.ntype;it++) + for (int it = 0; it < GlobalC::ucell.ntype; it++) { - for (int ia = 0;ia < GlobalC::ucell.atoms[it].na;ia++) + for (int ia = 0; ia < GlobalC::ucell.atoms[it].na; ia++) { - std::stringstream ss; - ss << GlobalC::ucell.atoms[it].label << ia+1; + std::stringstream ss; + ss << GlobalC::ucell.atoms[it].label << ia + 1; + + if (ry) // output Rydberg Unit + { + GlobalV::ofs_running << " " << std::setw(8) << ss.str(); + if (abs(f(iat, 0)) > Forces::output_acc) + GlobalV::ofs_running << std::setw(15) << f(iat, 0); + else + GlobalV::ofs_running << std::setw(15) << "0"; + if (abs(f(iat, 1)) > Forces::output_acc) + GlobalV::ofs_running << std::setw(15) << f(iat, 1); + else + GlobalV::ofs_running << std::setw(15) << "0"; + if (abs(f(iat, 2)) > Forces::output_acc) + GlobalV::ofs_running << std::setw(15) << f(iat, 2); + else + GlobalV::ofs_running << std::setw(15) << "0"; + GlobalV::ofs_running << std::endl; + } + else + { + GlobalV::ofs_running << " " << std::setw(8) << ss.str(); + if (abs(f(iat, 0)) > Forces::output_acc) + GlobalV::ofs_running << std::setw(15) << f(iat, 0) * fac; + else + GlobalV::ofs_running << std::setw(15) << "0"; + if (abs(f(iat, 1)) > Forces::output_acc) + GlobalV::ofs_running << std::setw(15) 
<< f(iat, 1) * fac; + else + GlobalV::ofs_running << std::setw(15) << "0"; + if (abs(f(iat, 2)) > Forces::output_acc) + GlobalV::ofs_running << std::setw(15) << f(iat, 2) * fac; + else + GlobalV::ofs_running << std::setw(15) << "0"; + GlobalV::ofs_running << std::endl; + } + + if (GlobalV::TEST_FORCE && ry) + { + std::cout << " " << std::setw(8) << ss.str(); + if (abs(f(iat, 0)) > Forces::output_acc) + std::cout << std::setw(15) << f(iat, 0); + else + std::cout << std::setw(15) << "0"; + if (abs(f(iat, 1)) > Forces::output_acc) + std::cout << std::setw(15) << f(iat, 1); + else + std::cout << std::setw(15) << "0"; + if (abs(f(iat, 2)) > Forces::output_acc) + std::cout << std::setw(15) << f(iat, 2); + else + std::cout << std::setw(15) << "0"; + std::cout << std::endl; + } + else if (GlobalV::TEST_FORCE) + { + std::cout << " " << std::setw(8) << ss.str(); + if (abs(f(iat, 0)) > Forces::output_acc) + std::cout << std::setw(15) << f(iat, 0) * fac; + else + std::cout << std::setw(15) << "0"; + if (abs(f(iat, 1)) > Forces::output_acc) + std::cout << std::setw(15) << f(iat, 1) * fac; + else + std::cout << std::setw(15) << "0"; + if (abs(f(iat, 2)) > Forces::output_acc) + std::cout << std::setw(15) << f(iat, 2) * fac; + else + std::cout << std::setw(15) << "0"; + std::cout << std::endl; + } - if(ry) // output Rydberg Unit - { - GlobalV::ofs_running << " " << std::setw(8) << ss.str(); - if( abs(f(iat,0)) > Forces::output_acc) GlobalV::ofs_running << std::setw(15) << f(iat,0); - else GlobalV::ofs_running << std::setw(15) << "0"; - if( abs(f(iat,1)) > Forces::output_acc) GlobalV::ofs_running << std::setw(15) << f(iat,1); - else GlobalV::ofs_running << std::setw(15) << "0"; - if( abs(f(iat,2)) > Forces::output_acc) GlobalV::ofs_running << std::setw(15) << f(iat,2); - else GlobalV::ofs_running << std::setw(15) << "0"; - GlobalV::ofs_running << std::endl; - } - else - { - GlobalV::ofs_running << " " << std::setw(8) << ss.str(); - if( abs(f(iat,0)) > Forces::output_acc) 
GlobalV::ofs_running << std::setw(15) << f(iat,0)*fac; - else GlobalV::ofs_running << std::setw(15) << "0"; - if( abs(f(iat,1)) > Forces::output_acc) GlobalV::ofs_running << std::setw(15) << f(iat,1)*fac; - else GlobalV::ofs_running << std::setw(15) << "0"; - if( abs(f(iat,2)) > Forces::output_acc) GlobalV::ofs_running << std::setw(15) << f(iat,2)*fac; - else GlobalV::ofs_running << std::setw(15) << "0"; - GlobalV::ofs_running << std::endl; - } - - if(GlobalV::TEST_FORCE && ry) - { - std::cout << " " << std::setw(8) << ss.str(); - if( abs(f(iat,0)) > Forces::output_acc) std::cout << std::setw(15) << f(iat,0); - else std::cout << std::setw(15) << "0"; - if( abs(f(iat,1)) > Forces::output_acc) std::cout << std::setw(15) << f(iat,1); - else std::cout << std::setw(15) << "0"; - if( abs(f(iat,2)) > Forces::output_acc) std::cout << std::setw(15) << f(iat,2); - else std::cout << std::setw(15) << "0"; - std::cout << std::endl; - } - else if (GlobalV::TEST_FORCE) - { - std::cout << " " << std::setw(8) << ss.str(); - if( abs(f(iat,0)) > Forces::output_acc) std::cout << std::setw(15) << f(iat,0)*fac; - else std::cout << std::setw(15) << "0"; - if( abs(f(iat,1)) > Forces::output_acc) std::cout << std::setw(15) << f(iat,1)*fac; - else std::cout << std::setw(15) << "0"; - if( abs(f(iat,2)) > Forces::output_acc) std::cout << std::setw(15) << f(iat,2)*fac; - else std::cout << std::setw(15) << "0"; - std::cout << std::endl; - } - iat++; } } - GlobalV::ofs_running << std::resetiosflags(ios::showpos); - std::cout << std::resetiosflags(ios::showpos); + GlobalV::ofs_running << std::resetiosflags(ios::showpos); + std::cout << std::resetiosflags(ios::showpos); return; } - void Forces::cal_force_loc(ModuleBase::matrix& forcelc, ModulePW::PW_Basis* rho_basis) { - ModuleBase::timer::tick("Forces","cal_force_loc"); + ModuleBase::timer::tick("Forces", "cal_force_loc"); - std::complex *aux = new std::complex[rho_basis->nmaxgr]; + std::complex* aux = new std::complex[rho_basis->nmaxgr]; 
ModuleBase::GlobalFunc::ZEROS(aux, rho_basis->nrxx); // now, in all pools , the charge are the same, // so, the force calculated by each pool is equal. - - for(int is=0; isnrxx; ir++) - { - aux[ir] += std::complex( GlobalC::CHR.rho[is][ir], 0.0 ); - } - } - // to G space. - rho_basis->real2recip(aux,aux); + for (int is = 0; is < GlobalV::NSPIN; is++) + { + for (int ir = 0; ir < rho_basis->nrxx; ir++) + { + aux[ir] += std::complex(GlobalC::CHR.rho[is][ir], 0.0); + } + } + // to G space. + rho_basis->real2recip(aux, aux); int iat = 0; - for (int it = 0;it < GlobalC::ucell.ntype;it++) + for (int it = 0; it < GlobalC::ucell.ntype; it++) { - for (int ia = 0;ia < GlobalC::ucell.atoms[it].na;ia++) + for (int ia = 0; ia < GlobalC::ucell.atoms[it].na; ia++) { - for (int ig = 0; ig < rho_basis->npw ; ig++) + for (int ig = 0; ig < rho_basis->npw; ig++) { const double phase = ModuleBase::TWO_PI * (rho_basis->gcar[ig] * GlobalC::ucell.atoms[it].tau[ia]); - const double factor = GlobalC::ppcell.vloc(it, rho_basis->ig2igg[ig]) * - ( cos(phase) * aux[ig].imag() - + sin(phase) * aux[ig].real()); + const double factor = GlobalC::ppcell.vloc(it, rho_basis->ig2igg[ig]) + * (cos(phase) * aux[ig].imag() + sin(phase) * aux[ig].real()); forcelc(iat, 0) += rho_basis->gcar[ig][0] * factor; forcelc(iat, 1) += rho_basis->gcar[ig][1] * factor; forcelc(iat, 2) += rho_basis->gcar[ig][2] * factor; } - for (int ipol = 0;ipol < 3;ipol++) + for (int ipol = 0; ipol < 3; ipol++) { forcelc(iat, ipol) *= (GlobalC::ucell.tpiba * GlobalC::ucell.omega); } ++iat; } } - //this->print(GlobalV::ofs_running, "local forces", forcelc); + // this->print(GlobalV::ofs_running, "local forces", forcelc); Parallel_Reduce::reduce_double_pool(forcelc.c, forcelc.nr * forcelc.nc); delete[] aux; - ModuleBase::timer::tick("Forces","cal_force_loc"); + ModuleBase::timer::tick("Forces", "cal_force_loc"); return; } #include "H_Ewald_pw.h" void Forces::cal_force_ew(ModuleBase::matrix& forceion, ModulePW::PW_Basis* rho_basis) { - 
ModuleBase::timer::tick("Forces","cal_force_ew"); + ModuleBase::timer::tick("Forces", "cal_force_ew"); double fact = 2.0; - std::complex *aux = new std::complex [rho_basis->npw]; + std::complex* aux = new std::complex[rho_basis->npw]; ModuleBase::GlobalFunc::ZEROS(aux, rho_basis->npw); - for (int it = 0;it < GlobalC::ucell.ntype;it++) + for (int it = 0; it < GlobalC::ucell.ntype; it++) { for (int ig = 0; ig < rho_basis->npw; ig++) { - if(ig == rho_basis->ig_gge0) continue; + if (ig == rho_basis->ig_gge0) + continue; aux[ig] += static_cast(GlobalC::ucell.atoms[it].zv) * conj(GlobalC::sf.strucFac(it, ig)); } } - // calculate total ionic charge + // calculate total ionic charge double charge = 0.0; - for (int it = 0;it < GlobalC::ucell.ntype;it++) + for (int it = 0; it < GlobalC::ucell.ntype; it++) { - charge += GlobalC::ucell.atoms[it].na * GlobalC::ucell.atoms[it].zv;//mohan modify 2007-11-7 + charge += GlobalC::ucell.atoms[it].na * GlobalC::ucell.atoms[it].zv; // mohan modify 2007-11-7 } - - double alpha = 1.1; - double upperbound ; + + double alpha = 1.1; + double upperbound; do { alpha -= 0.10; @@ -450,200 +502,210 @@ void Forces::cal_force_ew(ModuleBase::matrix& forceion, ModulePW::PW_Basis* rho_ if (alpha <= 0.0) { - ModuleBase::WARNING_QUIT("ewald","Can't find optimal alpha."); + ModuleBase::WARNING_QUIT("ewald", "Can't find optimal alpha."); } - upperbound = 2.0 * charge * charge * sqrt(2.0 * alpha / ModuleBase::TWO_PI) * - erfc(sqrt(GlobalC::ucell.tpiba2 * rho_basis->ggecut / 4.0 / alpha)); - } - while (upperbound > 1.0e-6); -// std::cout << " GlobalC::en.alpha = " << alpha << std::endl; -// std::cout << " upperbound = " << upperbound << std::endl; - - + upperbound = 2.0 * charge * charge * sqrt(2.0 * alpha / ModuleBase::TWO_PI) + * erfc(sqrt(GlobalC::ucell.tpiba2 * rho_basis->ggecut / 4.0 / alpha)); + } while (upperbound > 1.0e-6); + // std::cout << " GlobalC::en.alpha = " << alpha << std::endl; + // std::cout << " upperbound = " << upperbound << std::endl; 
for (int ig = 0; ig < rho_basis->npw; ig++) { - if(ig == rho_basis->ig_gge0) continue; - aux[ig] *= exp(-1.0 * rho_basis->gg[ig] * GlobalC::ucell.tpiba2 / alpha / 4.0) / (rho_basis->gg[ig] * GlobalC::ucell.tpiba2); + if (ig == rho_basis->ig_gge0) + continue; + aux[ig] *= exp(-1.0 * rho_basis->gg[ig] * GlobalC::ucell.tpiba2 / alpha / 4.0) + / (rho_basis->gg[ig] * GlobalC::ucell.tpiba2); } int iat = 0; - for (int it = 0;it < GlobalC::ucell.ntype;it++) + for (int it = 0; it < GlobalC::ucell.ntype; it++) { - for (int ia = 0;ia < GlobalC::ucell.atoms[it].na;ia++) + for (int ia = 0; ia < GlobalC::ucell.atoms[it].na; ia++) { for (int ig = 0; ig < rho_basis->npw; ig++) { - if(ig == rho_basis->ig_gge0) continue; + if (ig == rho_basis->ig_gge0) + continue; const ModuleBase::Vector3 gcar = rho_basis->gcar[ig]; const double arg = ModuleBase::TWO_PI * (gcar * GlobalC::ucell.atoms[it].tau[ia]); - double sumnb = -cos(arg) * aux[ig].imag() + sin(arg) * aux[ig].real(); + double sumnb = -cos(arg) * aux[ig].imag() + sin(arg) * aux[ig].real(); forceion(iat, 0) += gcar[0] * sumnb; forceion(iat, 1) += gcar[1] * sumnb; forceion(iat, 2) += gcar[2] * sumnb; } - for (int ipol = 0;ipol < 3;ipol++) + for (int ipol = 0; ipol < 3; ipol++) { - forceion(iat, ipol) *= GlobalC::ucell.atoms[it].zv * ModuleBase::e2 * GlobalC::ucell.tpiba * ModuleBase::TWO_PI / GlobalC::ucell.omega * fact; + forceion(iat, ipol) *= GlobalC::ucell.atoms[it].zv * ModuleBase::e2 * GlobalC::ucell.tpiba + * ModuleBase::TWO_PI / GlobalC::ucell.omega * fact; } - // std::cout << " atom" << iat << std::endl; - // std::cout << std::setw(15) << forceion(iat, 0) << std::setw(15) << forceion(iat,1) << std::setw(15) << forceion(iat,2) << std::endl; + // std::cout << " atom" << iat << std::endl; + // std::cout << std::setw(15) << forceion(iat, 0) << std::setw(15) << forceion(iat,1) << std::setw(15) + //<< forceion(iat,2) << std::endl; iat++; } } - delete [] aux; - + delete[] aux; - // means that the processor contains G=0 term. 
+ // means that the processor contains G=0 term. if (rho_basis->ig_gge0 >= 0) { double rmax = 5.0 / (sqrt(alpha) * GlobalC::ucell.lat0); int nrm = 0; - - //output of rgen: the number of vectors in the sphere + + // output of rgen: the number of vectors in the sphere const int mxr = 50; // the maximum number of R vectors included in r - ModuleBase::Vector3 *r = new ModuleBase::Vector3[mxr]; - double *r2 = new double[mxr]; - ModuleBase::GlobalFunc::ZEROS(r2, mxr); - int *irr = new int[mxr]; - ModuleBase::GlobalFunc::ZEROS(irr, mxr); + ModuleBase::Vector3* r = new ModuleBase::Vector3[mxr]; + double* r2 = new double[mxr]; + ModuleBase::GlobalFunc::ZEROS(r2, mxr); + int* irr = new int[mxr]; + ModuleBase::GlobalFunc::ZEROS(irr, mxr); // the square modulus of R_j-tau_s-tau_s' - int iat1 = 0; + int iat1 = 0; for (int T1 = 0; T1 < GlobalC::ucell.ntype; T1++) { - Atom* atom1 = &GlobalC::ucell.atoms[T1]; + Atom* atom1 = &GlobalC::ucell.atoms[T1]; for (int I1 = 0; I1 < atom1->na; I1++) { - int iat2 = 0; // mohan fix bug 2011-06-07 + int iat2 = 0; // mohan fix bug 2011-06-07 for (int T2 = 0; T2 < GlobalC::ucell.ntype; T2++) { for (int I2 = 0; I2 < GlobalC::ucell.atoms[T2].na; I2++) { if (iat1 != iat2) { - ModuleBase::Vector3 d_tau = GlobalC::ucell.atoms[T1].tau[I1] - GlobalC::ucell.atoms[T2].tau[I2]; + ModuleBase::Vector3 d_tau + = GlobalC::ucell.atoms[T1].tau[I1] - GlobalC::ucell.atoms[T2].tau[I2]; H_Ewald_pw::rgen(d_tau, rmax, irr, GlobalC::ucell.latvec, GlobalC::ucell.G, r, r2, nrm); - for (int n = 0;n < nrm;n++) + for (int n = 0; n < nrm; n++) { - const double rr = sqrt(r2[n]) * GlobalC::ucell.lat0; + const double rr = sqrt(r2[n]) * GlobalC::ucell.lat0; - double factor = GlobalC::ucell.atoms[T1].zv * GlobalC::ucell.atoms[T2].zv * ModuleBase::e2 / (rr * rr) - * (erfc(sqrt(alpha) * rr) / rr - + sqrt(8.0 * alpha / ModuleBase::TWO_PI) * exp(-1.0 * alpha * rr * rr)) * GlobalC::ucell.lat0; + double factor + = GlobalC::ucell.atoms[T1].zv * GlobalC::ucell.atoms[T2].zv * 
ModuleBase::e2 + / (rr * rr) + * (erfc(sqrt(alpha) * rr) / rr + + sqrt(8.0 * alpha / ModuleBase::TWO_PI) * exp(-1.0 * alpha * rr * rr)) + * GlobalC::ucell.lat0; - forceion(iat1, 0) -= factor * r[n].x; + forceion(iat1, 0) -= factor * r[n].x; forceion(iat1, 1) -= factor * r[n].y; forceion(iat1, 2) -= factor * r[n].z; -// std::cout << " r.z=" << r[n].z << " r2=" << r2[n] << std::endl; - // std::cout << " " << iat1 << " " << iat2 << " n=" << n - // << " rn.z=" << r[n].z - // << " r2=" << r2[n] << " rr=" << rr << " fac=" << factor << " force=" << forceion(iat1,2) - // << " new_part=" << factor*r[n].z << std::endl; + // std::cout << " r.z=" << r[n].z << " r2=" << r2[n] << + // std::endl; std::cout << " " << iat1 << " " << iat2 << " n=" << n + // << " rn.z=" << r[n].z + // << " r2=" << r2[n] << " rr=" << rr << " fac=" << factor << " force=" << + // forceion(iat1,2) + // << " new_part=" << factor*r[n].z << std::endl; } } ++iat2; } - }//atom b + } // atom b -// std::cout << " atom" << iat1 << std::endl; -// std::cout << std::setw(15) << forceion(iat1, 0) << std::setw(15) << forceion(iat1,1) << std::setw(15) << forceion(iat1,2) << std::endl; + // std::cout << " atom" << iat1 << std::endl; + // std::cout << std::setw(15) << forceion(iat1, 0) << std::setw(15) << forceion(iat1,1) << + // std::setw(15) << forceion(iat1,2) << std::endl; ++iat1; } - }//atom a - delete []r; - delete []r2; - delete []irr; + } // atom a + delete[] r; + delete[] r2; + delete[] irr; } Parallel_Reduce::reduce_double_pool(forceion.c, forceion.nr * forceion.nc); - //this->print(GlobalV::ofs_running, "ewald forces", forceion); + // this->print(GlobalV::ofs_running, "ewald forces", forceion); - ModuleBase::timer::tick("Forces","cal_force_ew"); + ModuleBase::timer::tick("Forces", "cal_force_ew"); return; } void Forces::cal_force_cc(ModuleBase::matrix& forcecc, ModulePW::PW_Basis* rho_basis) { - // recalculate the exchange-correlation potential. - + // recalculate the exchange-correlation potential. 
+ ModuleBase::matrix v(GlobalV::NSPIN, rho_basis->nrxx); - if(XC_Functional::get_func_type() == 3) - { + if (XC_Functional::get_func_type() == 3) + { #ifdef USE_LIBXC - const auto etxc_vtxc_v = XC_Functional::v_xc_meta( - rho_basis->nrxx, rho_basis->nxyz, GlobalC::ucell.omega, - GlobalC::CHR.rho, GlobalC::CHR.rho_core, GlobalC::CHR.kin_r); - + const auto etxc_vtxc_v = XC_Functional::v_xc_meta(rho_basis->nrxx, + rho_basis->nxyz, + GlobalC::ucell.omega, + GlobalC::CHR.rho, + GlobalC::CHR.rho_core, + GlobalC::CHR.kin_r); + GlobalC::en.etxc = std::get<0>(etxc_vtxc_v); GlobalC::en.vtxc = std::get<1>(etxc_vtxc_v); v = std::get<2>(etxc_vtxc_v); #else - ModuleBase::WARNING_QUIT("cal_force_cc","to use mGGA, compile with LIBXC"); + ModuleBase::WARNING_QUIT("cal_force_cc", "to use mGGA, compile with LIBXC"); #endif - } - else - { - const auto etxc_vtxc_v = XC_Functional::v_xc( - rho_basis->nrxx, rho_basis->nxyz, GlobalC::ucell.omega, - GlobalC::CHR.rho, GlobalC::CHR.rho_core); - + } + else + { + const auto etxc_vtxc_v = XC_Functional::v_xc(rho_basis->nrxx, + rho_basis->nxyz, + GlobalC::ucell.omega, + GlobalC::CHR.rho, + GlobalC::CHR.rho_core); + GlobalC::en.etxc = std::get<0>(etxc_vtxc_v); GlobalC::en.vtxc = std::get<1>(etxc_vtxc_v); - v = std::get<2>(etxc_vtxc_v); - } + v = std::get<2>(etxc_vtxc_v); + } - const ModuleBase::matrix vxc = v; - std::complex * psiv = new std::complex [rho_basis->nmaxgr]; + const ModuleBase::matrix vxc = v; + std::complex* psiv = new std::complex[rho_basis->nmaxgr]; ModuleBase::GlobalFunc::ZEROS(psiv, rho_basis->nrxx); if (GlobalV::NSPIN == 1 || GlobalV::NSPIN == 4) { - for (int ir = 0;ir < rho_basis->nrxx;ir++) + for (int ir = 0; ir < rho_basis->nrxx; ir++) { - psiv[ir] = std::complex(vxc(0, ir), 0.0); + psiv[ir] = std::complex(vxc(0, ir), 0.0); } } else { - for (int ir = 0;ir < rho_basis->nrxx;ir++) + for (int ir = 0; ir < rho_basis->nrxx; ir++) { - psiv[ir] = 0.5 * (vxc(0 ,ir) + vxc(1, ir)); + psiv[ir] = 0.5 * (vxc(0, ir) + vxc(1, ir)); } } - 
// to G space + // to G space rho_basis->real2recip(psiv, psiv); - //psiv contains now Vxc(G) - double * rhocg = new double [rho_basis->ngg]; + // psiv contains now Vxc(G) + double* rhocg = new double[rho_basis->ngg]; ModuleBase::GlobalFunc::ZEROS(rhocg, rho_basis->ngg); int iat = 0; - for (int T1 = 0;T1 < GlobalC::ucell.ntype;T1++) + for (int T1 = 0; T1 < GlobalC::ucell.ntype; T1++) { if (GlobalC::ucell.atoms[T1].nlcc) { - //call drhoc - GlobalC::CHR.non_linear_core_correction( - GlobalC::ppcell.numeric, - GlobalC::ucell.atoms[T1].msh, - GlobalC::ucell.atoms[T1].r, - GlobalC::ucell.atoms[T1].rab, - GlobalC::ucell.atoms[T1].rho_atc, - rhocg, - rho_basis); - - - std::complex ipol0, ipol1, ipol2; - for (int I1 = 0;I1 < GlobalC::ucell.atoms[T1].na;I1++) + // call drhoc + GlobalC::CHR.non_linear_core_correction(GlobalC::ppcell.numeric, + GlobalC::ucell.atoms[T1].msh, + GlobalC::ucell.atoms[T1].r, + GlobalC::ucell.atoms[T1].rab, + GlobalC::ucell.atoms[T1].rho_atc, + rhocg, + rho_basis); + + std::complex ipol0, ipol1, ipol2; + for (int I1 = 0; I1 < GlobalC::ucell.atoms[T1].na; I1++) { for (int ig = 0; ig < rho_basis->npw; ig++) { @@ -656,7 +718,7 @@ void Forces::cal_force_cc(ModuleBase::matrix& forcecc, ModulePW::PW_Basis* rho_b const std::complex expiarg = std::complex(sin(arg), cos(arg)); ipol0 = GlobalC::ucell.tpiba * GlobalC::ucell.omega * rhocgigg * gv.x * psiv_conj * expiarg; - forcecc(iat, 0) += ipol0.real(); + forcecc(iat, 0) += ipol0.real(); ipol1 = GlobalC::ucell.tpiba * GlobalC::ucell.omega * rhocgigg * gv.y * psiv_conj * expiarg; forcecc(iat, 1) += ipol1.real(); @@ -667,38 +729,40 @@ void Forces::cal_force_cc(ModuleBase::matrix& forcecc, ModulePW::PW_Basis* rho_b ++iat; } } - else{ + else + { iat += GlobalC::ucell.atoms[T1].na; } } assert(iat == GlobalC::ucell.nat); - delete [] rhocg; - delete [] psiv; // mohan fix bug 2012-03-22 - Parallel_Reduce::reduce_double_pool(forcecc.c, forcecc.nr * forcecc.nc); //qianrui fix a bug for kpar > 1 - return; + delete[] 
rhocg; + delete[] psiv; // mohan fix bug 2012-03-22 + Parallel_Reduce::reduce_double_pool(forcecc.c, forcecc.nr * forcecc.nc); // qianrui fix a bug for kpar > 1 + return; } #include "../module_base/complexarray.h" #include "../module_base/complexmatrix.h" void Forces::cal_force_nl(ModuleBase::matrix& forcenl, const psi::Psi>* psi_in) { - ModuleBase::TITLE("Forces","cal_force_nl"); - ModuleBase::timer::tick("Forces","cal_force_nl"); + ModuleBase::TITLE("Forces", "cal_force_nl"); + ModuleBase::timer::tick("Forces", "cal_force_nl"); const int nkb = GlobalC::ppcell.nkb; - if(nkb == 0) return; // mohan add 2010-07-25 - - // dbecp: conj( -iG * ) - ModuleBase::ComplexArray dbecp( 3, GlobalV::NBANDS, nkb); - ModuleBase::ComplexMatrix becp( GlobalV::NBANDS, nkb); - - - // vkb1: |Beta(nkb,npw)> - ModuleBase::ComplexMatrix vkb1( nkb, GlobalC::wf.npwx ); - - for (int ik = 0;ik < GlobalC::kv.nks;ik++) + if (nkb == 0) + return; // mohan add 2010-07-25 + + // dbecp: conj( -iG * ) + ModuleBase::ComplexArray dbecp(3, GlobalV::NBANDS, nkb); + ModuleBase::ComplexMatrix becp(GlobalV::NBANDS, nkb); + + // vkb1: |Beta(nkb,npw)> + ModuleBase::ComplexMatrix vkb1(nkb, GlobalC::wf.npwx); + + for (int ik = 0; ik < GlobalC::kv.nks; ik++) { - if (GlobalV::NSPIN==2) GlobalV::CURRENT_SPIN = GlobalC::kv.isk[ik]; + if (GlobalV::NSPIN == 2) + GlobalV::CURRENT_SPIN = GlobalC::kv.isk[ik]; const int nbasis = GlobalC::kv.ngk[ik]; // generate vkb if (GlobalC::ppcell.nkb > 0) @@ -708,62 +772,62 @@ void Forces::cal_force_nl(ModuleBase::matrix& forcenl, const psi::Psi + // vkb: Beta(nkb,npw) + // becp(nkb,nbnd): becp.zero_out(); psi_in[0].fix_k(ik); char transa = 'C'; char transb = 'N'; /// - ///only occupied band should be calculated. + /// only occupied band should be calculated. 
/// int nbands_occ = GlobalV::NBANDS; - while(GlobalC::wf.wg(ik, nbands_occ-1) < ModuleBase::threshold_wg) + while (GlobalC::wf.wg(ik, nbands_occ - 1) < ModuleBase::threshold_wg) { nbands_occ--; } int npm = GlobalV::NPOL * nbands_occ; zgemm_(&transa, - &transb, - &nkb, - &npm, - &nbasis, - &ModuleBase::ONE, - GlobalC::ppcell.vkb.c, - &GlobalC::wf.npwx, - psi_in[0].get_pointer(), - &GlobalC::wf.npwx, - &ModuleBase::ZERO, - becp.c, - &nkb); - Parallel_Reduce::reduce_complex_double_pool( becp.c, becp.size); - - //out.printcm_real("becp",becp,1.0e-4); - // Calculate the derivative of beta, - // |dbeta> = -ig * |beta> + &transb, + &nkb, + &npm, + &nbasis, + &ModuleBase::ONE, + GlobalC::ppcell.vkb.c, + &GlobalC::wf.npwx, + psi_in[0].get_pointer(), + &GlobalC::wf.npwx, + &ModuleBase::ZERO, + becp.c, + &nkb); + Parallel_Reduce::reduce_complex_double_pool(becp.c, becp.size); + + // out.printcm_real("becp",becp,1.0e-4); + // Calculate the derivative of beta, + // |dbeta> = -ig * |beta> dbecp.zero_out(); - for (int ipol = 0; ipol<3; ipol++) + for (int ipol = 0; ipol < 3; ipol++) { - for (int i = 0;i < nkb;i++) - { - std::complex* pvkb1 = &vkb1(i,0); - std::complex* pvkb = &GlobalC::ppcell.vkb(i,0); - if (ipol==0) - { - for (int ig=0; iggetgcar(ik,ig)[0]; + for (int i = 0; i < nkb; i++) + { + std::complex* pvkb1 = &vkb1(i, 0); + std::complex* pvkb = &GlobalC::ppcell.vkb(i, 0); + if (ipol == 0) + { + for (int ig = 0; ig < nbasis; ig++) + pvkb1[ig] = pvkb[ig] * ModuleBase::NEG_IMAG_UNIT * GlobalC::wfcpw->getgcar(ik, ig)[0]; } - if (ipol==1) - { - for (int ig=0; iggetgcar(ik,ig)[1]; + if (ipol == 1) + { + for (int ig = 0; ig < nbasis; ig++) + pvkb1[ig] = pvkb[ig] * ModuleBase::NEG_IMAG_UNIT * GlobalC::wfcpw->getgcar(ik, ig)[1]; } - if (ipol==2) - { - for (int ig=0; iggetgcar(ik,ig)[2]; + if (ipol == 2) + { + for (int ig = 0; ig < nbasis; ig++) + pvkb1[ig] = pvkb[ig] * ModuleBase::NEG_IMAG_UNIT * GlobalC::wfcpw->getgcar(ik, ig)[2]; } - } + } std::complex* pdbecp = &dbecp(ipol, 0, 
0); zgemm_(&transa, &transb, @@ -842,27 +906,27 @@ void Forces::cal_force_nl(ModuleBase::matrix& forcenl, const psi::Psiprint(GlobalV::ofs_running, "nonlocal forces", forcenl); - ModuleBase::timer::tick("Forces","cal_force_nl"); + // this->print(GlobalV::ofs_running, "nonlocal forces", forcenl); + ModuleBase::timer::tick("Forces", "cal_force_nl"); return; } void Forces::cal_force_scc(ModuleBase::matrix& forcescc, ModulePW::PW_Basis* rho_basis) { - std::complex* psic = new std::complex [rho_basis->nmaxgr]; + std::complex* psic = new std::complex[rho_basis->nmaxgr]; if (GlobalV::NSPIN == 1 || GlobalV::NSPIN == 4) { - for (int i = 0;i < rho_basis->nrxx;i++) + for (int i = 0; i < rho_basis->nrxx; i++) { - psic[i] = GlobalC::pot.vnew(0,i); + psic[i] = GlobalC::pot.vnew(0, i); } } else { int isup = 0; int isdw = 1; - for (int i = 0;i < rho_basis->nrxx;i++) + for (int i = 0; i < rho_basis->nrxx; i++) { psic[i] = (GlobalC::pot.vnew(isup, i) + GlobalC::pot.vnew(isdw, i)) * 0.5; } @@ -870,7 +934,7 @@ void Forces::cal_force_scc(ModuleBase::matrix& forcescc, ModulePW::PW_Basis* rho int ndm = 0; - for (int it = 0;it < GlobalC::ucell.ntype;it++) + for (int it = 0; it < GlobalC::ucell.ntype; it++) { if (ndm < GlobalC::ucell.atoms[it].msh) { @@ -878,29 +942,30 @@ void Forces::cal_force_scc(ModuleBase::matrix& forcescc, ModulePW::PW_Basis* rho } } - //work space + // work space double* aux = new double[ndm]; ModuleBase::GlobalFunc::ZEROS(aux, ndm); double* rhocgnt = new double[rho_basis->ngg]; ModuleBase::GlobalFunc::ZEROS(rhocgnt, rho_basis->ngg); - rho_basis->real2recip(psic,psic); + rho_basis->real2recip(psic, psic); int igg0 = 0; const int ig0 = rho_basis->ig_gge0; - if (rho_basis->gg_uniq [0] < 1.0e-8) igg0 = 1; + if (rho_basis->gg_uniq[0] < 1.0e-8) + igg0 = 1; double fact = 2.0; - for (int nt = 0;nt < GlobalC::ucell.ntype;nt++) + for (int nt = 0; nt < GlobalC::ucell.ntype; nt++) { -// Here we compute the G.ne.0 term + // Here we compute the G.ne.0 term const int mesh = 
GlobalC::ucell.atoms[nt].msh; - for (int ig = igg0 ; ig < rho_basis->ngg; ++ig) + for (int ig = igg0; ig < rho_basis->ngg; ++ig) { const double gx = sqrt(rho_basis->gg_uniq[ig]) * GlobalC::ucell.tpiba; - for (int ir = 0;ir < mesh;ir++) + for (int ir = 0; ir < mesh; ir++) { if (GlobalC::ucell.atoms[nt].r[ir] < 1.0e-8) { @@ -912,19 +977,20 @@ void Forces::cal_force_scc(ModuleBase::matrix& forcescc, ModulePW::PW_Basis* rho aux[ir] = GlobalC::ucell.atoms[nt].rho_at[ir] * sin(gxx) / gxx; } } - ModuleBase::Integral::Simpson_Integral(mesh , aux, GlobalC::ucell.atoms[nt].rab , rhocgnt [ig]); + ModuleBase::Integral::Simpson_Integral(mesh, aux, GlobalC::ucell.atoms[nt].rab, rhocgnt[ig]); } int iat = 0; - for (int it = 0;it < GlobalC::ucell.ntype;it++) + for (int it = 0; it < GlobalC::ucell.ntype; it++) { - for (int ia = 0;ia < GlobalC::ucell.atoms[it].na;ia++) + for (int ia = 0; ia < GlobalC::ucell.atoms[it].na; ia++) { if (nt == it) { - for (int ig = 0;ig < rho_basis->npw; ++ig) + for (int ig = 0; ig < rho_basis->npw; ++ig) { - if(ig==ig0) continue; + if (ig == ig0) + continue; const ModuleBase::Vector3 gv = rho_basis->gcar[ig]; const ModuleBase::Vector3 pos = GlobalC::ucell.atoms[it].tau[ia]; const double rhocgntigg = rhocgnt[GlobalC::rhopw->ig2igg[ig]]; @@ -935,21 +1001,19 @@ void Forces::cal_force_scc(ModuleBase::matrix& forcescc, ModulePW::PW_Basis* rho forcescc(iat, 1) += fact * rhocgntigg * GlobalC::ucell.tpiba * gv.y * cpm.real(); forcescc(iat, 2) += fact * rhocgntigg * GlobalC::ucell.tpiba * gv.z * cpm.real(); } - //std::cout << " forcescc = " << forcescc(iat,0) << " " << forcescc(iat,1) << " " << forcescc(iat,2) << std::endl; + // std::cout << " forcescc = " << forcescc(iat,0) << " " << forcescc(iat,1) << " " << + // forcescc(iat,2) << std::endl; } iat++; } } } - - Parallel_Reduce::reduce_double_pool(forcescc.c, forcescc.nr * forcescc.nc); - delete[] psic; //mohan fix bug 2012-03-22 - delete[] aux; //mohan fix bug 2012-03-22 - delete[] rhocgnt; //mohan fix bug 
2012-03-22 + Parallel_Reduce::reduce_double_pool(forcescc.c, forcescc.nr * forcescc.nc); + + delete[] psic; // mohan fix bug 2012-03-22 + delete[] aux; // mohan fix bug 2012-03-22 + delete[] rhocgnt; // mohan fix bug 2012-03-22 return; } - - - diff --git a/source/src_pw/potential.cpp b/source/src_pw/potential.cpp index cde1ad785f..20514fe19f 100644 --- a/source/src_pw/potential.cpp +++ b/source/src_pw/potential.cpp @@ -7,9 +7,9 @@ #include "global.h" #include "math.h" // new +#include "../module_surchem/efield.h" #include "../module_surchem/surchem.h" #include "H_Hartree_pw.h" -#include "../module_surchem/efield.h" #ifdef __LCAO #include "../src_lcao/ELEC_evolve.h" #endif @@ -39,8 +39,10 @@ void Potential::allocate(const int nrxx) assert(nrxx >= 0); delete[] this->vltot; - if(nrxx > 0) this->vltot = new double[nrxx]; - else this->vltot = nullptr; + if (nrxx > 0) + this->vltot = new double[nrxx]; + else + this->vltot = nullptr; ModuleBase::Memory::record("Potential", "vltot", nrxx, "double"); this->vr.create(GlobalV::NSPIN, nrxx); @@ -59,8 +61,10 @@ void Potential::allocate(const int nrxx) } delete[] this->vr_eff1; - if(nrxx > 0) this->vr_eff1 = new double[nrxx]; - else this->vr_eff1 = nullptr; + if (nrxx > 0) + this->vr_eff1 = new double[nrxx]; + else + this->vr_eff1 = nullptr; #ifdef __CUDA cudaMalloc((void **)&this->d_vr_eff1, nrxx * sizeof(double)); #endif @@ -69,7 +73,7 @@ void Potential::allocate(const int nrxx) this->vnew.create(GlobalV::NSPIN, nrxx); ModuleBase::Memory::record("Potential", "vnew", GlobalV::NSPIN * nrxx, "double"); - if (GlobalV::imp_sol) + if (GlobalV::imp_sol || GlobalV::comp_chg) { GlobalC::solvent_model.allocate(nrxx, GlobalV::NSPIN); } @@ -266,7 +270,7 @@ void Potential::init_pot(const int &istep, // number of ionic steps void Potential::set_local_pot(double *vl_pseudo, // store the local pseudopotential const int &ntype, // number of atom types ModuleBase::matrix &vloc, // local pseduopotentials - ModulePW::PW_Basis* rho_basis, + 
ModulePW::PW_Basis *rho_basis, ModuleBase::ComplexMatrix &sf // structure factors ) const { @@ -359,17 +363,11 @@ ModuleBase::matrix Potential::v_of_rho(const double *const *const rho_in, const v += H_Hartree_pw::v_hartree(GlobalC::ucell, GlobalC::rhopw, GlobalV::NSPIN, rho_in); if(GlobalV::comp_chg) { - v += GlobalC::solvent_model.v_compensating(GlobalC::ucell, GlobalC::rhopw); + v += GlobalC::solvent_model.v_compensating(GlobalC::ucell, GlobalC::rhopw, GlobalV::NSPIN, rho_in); } if (GlobalV::imp_sol) { v += GlobalC::solvent_model.v_correction(GlobalC::ucell, GlobalC::rhopw, GlobalV::NSPIN, rho_in); - /* - // test energy outside - cout << "energy Outside: " << endl; - GlobalC::solvent_model.cal_Ael(GlobalC::ucell, GlobalC::rhopw); - GlobalC::solvent_model.cal_Acav(GlobalC::ucell, GlobalC::rhopw); - */ } } @@ -381,6 +379,11 @@ ModuleBase::matrix Potential::v_of_rho(const double *const *const rho_in, const v += Efield::add_efield(GlobalC::ucell, GlobalC::rhopw, GlobalV::NSPIN, rho_in); } + // test get ntot_reci + // complex *tmpn = new complex[GlobalC::rhopw->npw]; + // ModuleBase::GlobalFunc::ZEROS(tmpn, GlobalC::rhopw->npw); + // GlobalC::solvent_model.get_totn_reci(GlobalC::ucell, GlobalC::rhopw, tmpn); + // delete[] tmpn; ModuleBase::timer::tick("Potential", "v_of_rho"); return v; diff --git a/source/src_ri/exx_abfs.cpp b/source/src_ri/exx_abfs.cpp index 90acf22a18..995ad78e1d 100644 --- a/source/src_ri/exx_abfs.cpp +++ b/source/src_ri/exx_abfs.cpp @@ -490,7 +490,6 @@ std::cout<<"I"<>> @@ -639,7 +638,6 @@ std::cout<<"E"<>>> &&ms_abfs_abfs = m_abfs_abfs.cal_overlap_matrix( index_abfs, index_abfs ); ofs_ms("ms_abfs_abfs",ms_abfs_abfs); @@ -661,7 +659,6 @@ ofs_ms("ms_C",ms_C); std::cout<<"H"<>$1 fi +if ! 
test -z "$comp_chg" && [ $comp_chg == 1 ]; then + ecomp_self=`grep E_comp_self $running_path | awk '{print $3}'` + ecomp_electron=`grep E_comp_electron $running_path | awk '{print $3}'` + ecomp_nuclear=`grep E_comp_nuclear $running_path | awk '{print $3}'` + ecomp_tot=`grep E_comp_tot $running_path | awk '{print $3}'` + echo "ecompselfref $ecomp_self" >>$1 + echo "ecompelectronref $ecomp_electron" >>$1 + echo "ecompnuclearref $ecomp_nuclear" >>$1 + echo "ecomptotref $ecomp_tot" >>$1 +fi + #echo $total_band ttot=`grep $word $running_path | awk '{print $3}'` echo "totaltimeref $ttot" >>$1