- WARNING: it currently runs at about 40% of the speed of the original C++ version, but offers better compatibility with Python scripts.
- Original C++ version: SMORe
- MF (Matrix Factorization)
- BPR (Bayesian Personalized Ranking)
- LINE (Large-scale Information Network Embedding)
- DeepWalk
- Walklets
- HPE (Heterogeneous Preference Embedding)
- APP (Asymmetric Proximity Preserving graph embedding)
- WARP-like
- HOP-REC
- CSE (named `nemf` & `nerank` in the CLI)
cd pySmore
python3 example.py
import pysmore.models.mf as MF
import pysmore.models.bpr as BPR
import pysmore.models.line as LINE
# Choose a graph embedding method
trainer = BPR # or MF, LINE
# Create a graph with given user-item interaction data
trainer.create_graph("data/ui.train.txt", embedding_dimension=6)
# Pass the parameters that you'd like for training!
trainer.set_param({
    'init_lr': 0.025,  # initial learning rate
    'l2_reg': 0.01     # L2 regularization coefficient
})
# Start training!
# Note that `update_times` is multiplied by 1 million
# `workers` is the number of processes to use, NOT threads
trainer.train(update_times=1e-4, workers=4)
# Afterwards, output the embeddings.
trainer.save_embeddings(file_prefix="bpr")
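The embeddings are saved as plain text, one vertex per line followed by its vector values (see the sample output below). Here is a minimal loader sketch, assuming a hypothetical output filename `bpr.embed` (check what `save_embeddings(file_prefix="bpr")` actually writes on your system):

import numpy as np

# Minimal sketch: parse the plain-text embedding format
#   <vertex> <v1> <v2> ... <vd>
# NOTE: the filename "bpr.embed" is an assumption, not pySmore's documented output.
def load_embeddings(path):
    embeddings = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 2:  # skip blank or malformed lines
                continue
            embeddings[fields[0]] = np.asarray(fields[1:], dtype=float)
    return embeddings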
Given a network input, where each line is a space-separated `source target weight` triple:
userA itemA 3
userA itemC 5
userB itemA 1
userB itemB 5
userC itemA 4
The model learns a representation for each vertex:
userA -0.068195 0.105852 0.056242 0.084970 -0.209601 -0.018169
itemA -0.033628 0.046754 0.030732 0.035540 -0.105440 -0.008107
itemC -0.114769 0.181540 0.092762 0.146976 -0.351410 -0.031107
userB -0.050903 0.013206 0.077547 -0.013286 -0.179966 -0.003265
itemB -0.088020 0.015789 0.137878 -0.031181 -0.313660 -0.004543
userC -0.060036 0.086218 0.053508 0.066621 -0.187579 -0.014900
We can then score each user-item pair by taking the dot product of its two embeddings:
userA itemA 0.034238
userA itemC 0.118970
userB itemA 0.023242
userB itemB 0.072258
userC itemA 0.029961
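As a sanity check, here is a minimal numpy sketch that reproduces these scores from the embedding values shown above:

import numpy as np

# Embedding values copied verbatim from the sample output above.
emb = {
    "userA": np.array([-0.068195, 0.105852, 0.056242, 0.084970, -0.209601, -0.018169]),
    "itemA": np.array([-0.033628, 0.046754, 0.030732, 0.035540, -0.105440, -0.008107]),
    "itemC": np.array([-0.114769, 0.181540, 0.092762, 0.146976, -0.351410, -0.031107]),
    "userB": np.array([-0.050903, 0.013206, 0.077547, -0.013286, -0.179966, -0.003265]),
    "itemB": np.array([-0.088020, 0.015789, 0.137878, -0.031181, -0.313660, -0.004543]),
    "userC": np.array([-0.060036, 0.086218, 0.053508, 0.066621, -0.187579, -0.014900]),
}

# Score each observed user-item pair by the dot product of its embeddings.
for user, item in [("userA", "itemA"), ("userA", "itemC"),
                   ("userB", "itemA"), ("userB", "itemB"),
                   ("userC", "itemA")]:
    print(user, item, round(float(np.dot(emb[user], emb[item])), 6))
# e.g. userA itemA -> 0.034238, matching the first score above

In a recommender setting, these per-pair scores can be sorted for each user to rank candidate items.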