Describe the bug
I'm trying to create a custom dataset for Grakel:
import grakel
import networkx as nx
from grakel.utils import graph_from_networkx
from rdkit.Chem import MolFromSmiles


def smiles_to_grakel_graphs(smiles_list: list[str]) -> list[grakel.Graph]:
    """
    Transforms list of SMILES strings into list of graphs in GraKeL library format.
    We use atomic numbers as discrete node labels.
    """
    mols = [MolFromSmiles(smiles) for smiles in smiles_list]
    graphs = []
    # Integer codes for bond types, used as discrete edge labels.
    bond_type_to_int = {
        "SINGLE": 1,
        "DOUBLE": 2,
        "TRIPLE": 3,
        "AROMATIC": 4,
    }
    for mol in mols:
        graph = nx.Graph()
        for atom in mol.GetAtoms():
            graph.add_node(atom.GetIdx(), atom_label=atom.GetAtomicNum())
        for bond in mol.GetBonds():
            # default = OTHER
            bond_type = bond_type_to_int.get(str(bond.GetBondType()), 5)
            graph.add_edge(
                bond.GetBeginAtomIdx(), bond.GetEndAtomIdx(), bond_label=bond_type
            )
        graphs.append(graph)
    graphs = list(
        graph_from_networkx(
            graphs, as_Graph=True, node_labels_tag="atom_label", edge_labels_tag="bond_label"
        )
    )
    return graphs
This should result in graphs with edge labels. However, later in cross-validation, I get:
/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/grakel/graph.py:314: UserWarning: changing format from "adjacency" to "all"
warnings.warn('changing format from "adjacency" to "all"')
Traceback (most recent call last):
File "/home/jakub/PycharmProjects/pesticide_bee_toxicity_prediction/src/graph_kernels.py", line 155, in <module>
train_graph_kernel_SVM(
File "/home/jakub/PycharmProjects/pesticide_bee_toxicity_prediction/src/graph_kernels.py", line 128, in train_graph_kernel_SVM
model.fit(graphs_train, y_train)
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/sklearn/base.py", line 1474, in wrapper
return fit_method(estimator, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/sklearn/model_selection/_search.py", line 970, in fit
self._run_search(evaluate_candidates)
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/sklearn/model_selection/_search.py", line 1527, in _run_search
evaluate_candidates(ParameterGrid(self.param_grid))
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/sklearn/model_selection/_search.py", line 947, in evaluate_candidates
_warn_or_raise_about_fit_failures(out, self.error_score)
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/sklearn/model_selection/_validation.py", line 536, in _warn_or_raise_about_fit_failures
raise ValueError(all_fits_failed_message)
ValueError:
All the 25 fits failed.
It is very likely that your model is misconfigured.
You can try to debug the error by setting error_score='raise'.
Below are more details about the failures:
--------------------------------------------------------------------------------
25 fits failed with the following error:
Traceback (most recent call last):
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/sklearn/model_selection/_validation.py", line 895, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/sklearn/base.py", line 1474, in wrapper
return fit_method(estimator, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/sklearn/pipeline.py", line 471, in fit
Xt = self._fit(X, y, routed_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/sklearn/pipeline.py", line 408, in _fit
X, fitted_transformer = fit_transform_one_cached(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/joblib/memory.py", line 312, in __call__
return self.func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/sklearn/pipeline.py", line 1303, in _fit_transform_one
res = transformer.fit_transform(X, y, **params.get("fit_transform", {}))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/sklearn/utils/_set_output.py", line 295, in wrapped
data_to_wrap = f(self, X, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/grakel/kernels/neighborhood_subgraph_pairwise_distance.py", line 308, in fit_transform
self.fit(X)
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/grakel/kernels/kernel.py", line 124, in fit
self.X = self.parse_input(X)
^^^^^^^^^^^^^^^^^^^
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/grakel/kernels/neighborhood_subgraph_pairwise_distance.py", line 138, in parse_input
x.get_labels(purpose="adjacency", label_type="edge"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jakub/.cache/pypoetry/virtualenvs/pesticide-bee-toxicity-prediction-Sj4YDJPR-py3.11/lib/python3.11/site-packages/grakel/graph.py", line 750, in get_labels
raise ValueError('Graph does not have any labels for edges.')
ValueError: Graph does not have any labels for edges.
EDIT: interestingly, the labels initially seem to be there: print(graphs_train[0].edge_labels) results in {(0, 1): 2, (1, 0): 2, (1, 2): 1, (1, 3): 1, (2, 1): 1, (3, 1): 1, (3, 4): 2, (3, 5): 1, (4, 3): 2, (5, 3): 1}. I also tried this without a pipeline, just computing the kernel directly, but I get the same error.
It turns out that I had single-atom molecules in my dataset, and that was the reason for the error: a single-atom molecule has no bonds, so its graph has no edges and therefore no edge labels. However, could the error message be made more descriptive? Also, a graph with no edges and no edge labels is perfectly valid input in many cases, so I think it should be handled properly.