Can deepFM use a sparse data format? #10
Comments
hi sddi,
hi @Leavingseason, thank you for answering. I have millions of features; if I append a fake zero value for each missing one, the input file may become very large. Could you update the code to support an input format like libsvm (index:value, where entries whose value is zero are omitted from the input file)?
hi sddi,
oh~~ @Leavingseason I see~~~
That's partially right. Currently my code supports at most one feature per field, which follows the original paper's framework. So for itemID features, you can only keep one itemID. I understand your concern: in the real world, multiple features under one field occur often. We have a corresponding version of the code that handles this case by using sparse embedding lookup (https://www.tensorflow.org/api_docs/python/tf/nn/embedding_lookup_sparse), and the input format becomes fieldID:featureID:value. We will consider releasing this version.
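To make the fieldID:featureID:value format concrete, here is a minimal sketch of how such a line could be read and grouped by field. It is not the code from this repository: the function name `parse_field_line`, the assumption that the label is the first token, and the sample line are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def parse_field_line(line):
    """Parse 'label fieldID:featureID:value ...' (assumed layout, not confirmed by the repo).

    Returns the label and a dict mapping each fieldID to the list of
    (featureID, value) pairs found under that field.
    """
    tokens = line.split()
    label = float(tokens[0])
    fields = defaultdict(list)
    for token in tokens[1:]:
        field_id, feature_id, value = token.split(":")
        fields[int(field_id)].append((int(feature_id), float(value)))
    return label, dict(fields)


# Hypothetical sample: field 1 carries three features, field 2 carries one.
label, fields = parse_field_line("0 1:1:1 1:2:1 1:3:1 2:1:1")
print(label, fields)
```

Grouping by field like this gives a variable-length list of IDs and weights per field, which is the kind of input a sparse embedding lookup is designed to combine into one vector per field.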
OK, thank you very much! I am looking forward to your new version~~~ :D
Has the version that supports "multiple features under one field" been released? Thanks
Not yet. All right, since some people are interested in this version, I will release preview code, which is very ugly right now. I will try to find some time in the next two days (sadly, the KDD deadline is near...)
Done.
@Leavingseason hello, I have a question about the format. In fieldID:featureID:value, if fieldID==1 has 3 featureIDs and fieldID==2 has 2 featureIDs, does the featureID encoding for fieldID==2 need to continue from the featureIDs of fieldID==1? For example: 0 1:1:1 1:2:1 1:3:1 2:1:1 # here, can the featureIDs for fieldID==2 be re-encoded from scratch?
I tried using deepFM.py with the sparse data file a8a.train, whose format is "label index:value index:value ...".
I see that in S1_4.txt, a feature whose value is 0 still appears in the feature line, but in a8a.train it does not.
When I run python deepFM.py, I get "Input to reshape is a tensor with 5528 values, but the requested shape requires a multiple of 672".
Does the code not support this format?
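As a possible workaround for the difference described above, a sparse libsvm-style line could be expanded so that every index is present, with explicit zeros. The sketch below assumes 1-based indices and that the dense layout deepFM.py expects is "label index:value ..." with all indices listed; neither is confirmed by the repository, and `densify_libsvm_line` and `num_features` are hypothetical names.

```python
def densify_libsvm_line(line, num_features):
    """Expand a sparse 'label index:value ...' line to list every index.

    Indices are assumed to be 1-based; missing indices get the value 0.
    """
    tokens = line.split()
    label = tokens[0]
    values = {}
    for token in tokens[1:]:
        index, value = token.split(":")
        values[int(index)] = value
    dense = [f"{i}:{values.get(i, '0')}" for i in range(1, num_features + 1)]
    return " ".join([label] + dense)


print(densify_libsvm_line("-1 2:1 5:1", num_features=6))
# -1 1:0 2:1 3:0 4:0 5:1 6:0
```

With millions of features this padding makes the file very large, so it is only practical for small feature spaces.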