We have encountered instances where papers have used the Bgolearn package but failed to cite it. We will continue to monitor such cases and will contact the respective journals to request proper citation.
The publication utilizing Bgolearn will be promoted by the Bgolearn Team. HomePage
🤝🤝🤝 Please star ⭐️ it for promoting open source projects 🌍 ! Thanks ! For inquiries or assistance, please don't hesitate to contact us at bcao686@connect.hkust-gz.edu.cn (Dr. CAO Bin).
View package usage statistics / download counts
code tutorial : BiliBili
- Bgolearn has been implemented in the platform of MLMD etc.
- The official User Interface of the Bgolearn Platform is BgoFace.
- Bgolearn Code : here
- video of Bgolearn has been uploaded to platforms : BiliBili. YouTube.
No gradient information is used
pip install Bgolearn
pip show Bgolearn
pip install --upgrade Bgolearn
# import BGOsampling after installation
# 安装后, 通过此命令调用BGOsampling类
import Bgolearn.BGOsampling as BGOS
# import your dataset (Samples have been characterized)
# 导入研究的数据集(已经表征过的样本)
data = pd.read_csv('data.csv')
# features
x = data.iloc[:,:-1]
# response / target
y = data.iloc[:,-1]
# virtual samples which have same feature dimension with x
# 设计的虚拟样本, 与x具有相同的维度
vs = pd.read_csv('virtual_data.csv')
# instantiate class
# 实例化类 Bgolearn
Bgolearn = BGOS.Bgolearn()
# Pass parameters to the function
# 传入参数
Mymodel = Bgolearn.fit(data_matrix = x, Measured_response = y, virtual_samples = vs)
# derive the result by EI
# 通过EI导出结果
Mymodel.EI()
pip install BgoKit
from BgoKit import ToolKit
# vs is the virtual samples
# score_1,score_2 are output of Bgolearn
# score_1, _= Mymodel_1.EI() ; score_2, _= Mymodel_2.EI()
Model = ToolKit.MultiOpt(vs,[score_1,score_2])
Model.BiSearch()
Model.plot_distribution()
See : Link
The User Interface of the Bgolearn Platform : BgoFace
1: Cao B., Su T, Yu S, Li T, Zhang T, Zhang J, Dong Z, Zhang Ty. Active learning accelerates the discovery of high strength and high ductility lead-free solder alloys, Materials & Design, 2024, 112921, ISSN 0264-1275, https://doi.org/10.1016/j.matdes.2024.112921.
2: Ma J.∔, Cao B.∔, Dong S, Tian Y, Wang M, Xiong J, Sun S. MLMD: a programming-free AI platform to predict and design materials. npj Comput Mater 10, 59 (2024). https://doi.org/10.1038/s41524-024-01243-4
Maintained by Bin Cao. Please feel free to open issues in the Github or contact Bin Cao (bcao686@connect.hkust-gz.edu.cn) in case of any problems/comments/suggestions in using the code.
Contribution and suggestions are always welcome. In addition, we are also looking for research collaborations. You can submit issues for suggestions, questions, bugs, and feature requests, or submit pull requests to contribute directly. You can also contact the authors for research collaboration.
© 2024 Bgolearn Development Team. All rights reserved.
This software is provided for academic and research purposes only. Commercial use is strictly prohibited. Any violation of these terms will be subject to appropriate actions.
-
1.Expected Improvement algorith (期望提升函数)
-
2.Expected improvement with “plugin” (有“plugin”的期望提升函数)
-
3.Augmented Expected Improvement (增广期望提升函数)
-
4.Expected Quantile Improvement (期望分位提升函数)
-
5.Reinterpolation Expected Improvement (重插值期望提升函数)
-
6.Upper confidence bound (高斯上确界函数)
-
7.Probability of Improvement (概率提升函数)
-
8.Predictive Entropy Search (预测熵搜索函数)
-
9.Knowledge Gradient (知识梯度函数)
-
1.Least Confidence (欠信度函数)
-
2.Margin Sampling (边界函数)
-
3.Entropy-based approach (熵索函数)
Signature:
Bgolearn.fit(
data_matrix,
Measured_response,
virtual_samples,
Mission='Regression',
Classifier='GaussianProcess',
noise_std=None,
Kriging_model=None,
opt_num=1,
min_search=True,
CV_test=False,
Dynamic_W=False,
seed=42,
)
================================================================
:param data_matrix: data matrix of training dataset, X .
:param Measured_response: response of tarining dataset, y.
:param virtual_samples: designed virtual samples.
:param Mission: str, default 'Regression', the mission of optimization. Mission = 'Regression' or 'Classification'
:param Classifier: if Mission == 'Classification', classifier is used.
if user isn't applied one, Bgolearn will call a pre-set classifier.
default, Classifier = 'GaussianProcess', i.e., Gaussian Process Classifier.
five different classifiers are pre-setd in Bgolearn:
'GaussianProcess' --> Gaussian Process Classifier (default)
'LogisticRegression' --> Logistic Regression
'NaiveBayes' --> Naive Bayes Classifier
'SVM' --> Support Vector Machine Classifier
'RandomForest' --> Random Forest Classifier
:param noise_std: float or ndarray of shape (n_samples,), default=None
Value added to the diagonal of the kernel matrix during fitting.
This can prevent a potential numerical issue during fitting, by
ensuring that the calculated values form a positive definite matrix.
It can also be interpreted as the variance of additional Gaussian.
measurement noise on the training observations.
if noise_std is not None, a noise value will be estimated by maximum likelihood
on training dataset.
:param Kriging_model (default None):
str, Kriging_model = 'SVM', 'RF', 'AdaB', 'MLP'
The machine learning models will be implemented: Support Vector Machine (SVM),
Random Forest(RF), AdaBoost(AdaB), and Multi-Layer Perceptron (MLP).
The estimation uncertainity will be determined by Boostsrap sampling.
or
a user defined callable Kriging model, has an attribute of <fit_pre>
if user isn't applied one, Bgolearn will call a pre-set Kriging model
atribute <fit_pre> :
input -> xtrain, ytrain, xtest ;
output -> predicted mean and std of xtest
e.g. (take GaussianProcessRegressor in sklearn):
class Kriging_model(object):
def fit_pre(self,xtrain,ytrain,xtest):
# instantiated model
kernel = RBF()
mdoel = GaussianProcessRegressor(kernel=kernel).fit(xtrain,ytrain)
# defined the attribute's outputs
mean,std = mdoel.predict(xtest,return_std=True)
return mean,std
e.g. (MultiModels estimations):
class Kriging_model(object):
def fit_pre(self,xtrain,ytrain,xtest):
# instantiated model
pre_1 = SVR(C=10).fit(xtrain,ytrain).predict(xtest) # model_1
pre_2 = SVR(C=50).fit(xtrain,ytrain).predict(xtest) # model_2
pre_3 = SVR(C=80).fit(xtrain,ytrain).predict(xtest) # model_3
model_1 , model_2 , model_3 can be changed to any ML models you desire
# defined the attribute's outputs
stacked_array = np.vstack((pre_1,pre_2,pre_3))
means = np.mean(stacked_array, axis=0)
std = np.sqrt(np.var(stacked_array), axis=0)
return mean, std
:param opt_num: the number of recommended candidates for next iteration, default 1.
:param min_search: default True -> searching the global minimum ;
False -> searching the global maximum.
:param CV_test: 'LOOCV' or an int, default False (pass test)
if CV_test = 'LOOCV', LOOCV will be applied,
elif CV_test = int, e.g., CV_test = 10, 10 folds cross validation will be applied.
:return: 1: array; potential of each candidate. 2: array/float; recommended candidate(s).
File: ~/miniconda3/lib/python3.9/site-packages/Bgolearn/BGOsampling.py
Type: method