-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch to @vladmandic/human #1218
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: ghisch <168481965+ghisch@users.noreply.github.com>
Adapt dimensions to 1024 as outputed by @vladmandic/human Signed-off-by: ghisch <168481965+ghisch@users.noreply.github.com>
Signed-off-by: ghisch <168481965+ghisch@users.noreply.github.com>
Signed-off-by: ghisch <168481965+ghisch@users.noreply.github.com>
Thank you sooo much for this PR, I think this is strongly needed. The face-api Repo hasn't gotten a meaningful commit since January and is also Archived now. Hopefully @marcelklehr can look at this PR when he has time :) |
Hello @ghisch |
Hello @marcelklehr 👋 I totally understand no problem. Also, FYI, human lib repo also provide some kind of clustering in JS, I feel like inspiration can be taken from here. |
Hello there, We hope that the review process is going smooth and is helpful for you. We want to ensure your pull request is reviewed to your satisfaction. If you have a moment, our community management team would very much appreciate your feedback on your experience with this PR review process. Your feedback is valuable to us as we continuously strive to improve our community developer experience. Please take a moment to complete our short survey by clicking on the following link: https://cloud.nextcloud.com/apps/forms/s/i9Ago4EQRZ7TWxjfmeEpPkf6 Thank you for contributing to Nextcloud and we hope to hear from you soon! (If you believe you should not receive this message, you can add yourself to the blocklist.) |
Hello @marcelklehr , had you had the time to take a look on this PR ? I can help to make the CI pass |
@QuentinLemCode My earlier comment still applies, sadly. |
@QuentinLemCode My earlier comment still applies, sadly. This needs testing in normal CPU mode and with GPU, Face clustering needs to be fixed. If you can help, by all means :) |
@marcelklehr Sure! Do you have any documentation or information for launching the local dev env and performing tests? |
Here are our tutorials for developers: https://nextcloud.com/developer/ |
@marcelklehr In the CI, a job has failed because an archive of photos isn't available anymore on your site Could you rehost it or send it here please ? I suggest to include it in a test assets folder |
I've rehosted it, thanks for the heads up! |
Oh, this is really interesting! I can help with the hyper-parameter optimization since I'm somewhat familiar with the clustering algo. The face-api embedding model is quite old so I'm really interested in seeing how much this could improve the clustering results. However, more dimensions alone isn't necessarily better for clustering due to what is called "the curse of dimensionality". As the dimensionality of the vectors increases the distances between any two random vectors become more narrowly distributed making separating noise from a valid signal more difficult. Be that as it may: Any trace as to why the clustering fails? I take it there's no error in the logs? @ghisch: By 'limited to duplicate images', do you mean that only faces in duplicate images get clustered together? The clustering algorithm itself doesn't care about the dimensions as long as a distance calculation between two embeddings can be done so there should be no reason it wouldn't work for 1024 dimensions per se (although I expect it to be exactly 8 times slower than for 128 dimensions). |
As I said I'm not a dev and mainly relied on AI to propose this PR. But from what I understood, it was creating one cluster per face detected. No error (but also I was not running a dev instance, I was using my own production instance).
When the same image file was duplicated (and/or slightly edited like cropped), all faces were correctly clustered. |
This PR updates the face detection classifier from
@vladmandic/face-api
to@vladmandic/human
.Fix #1171
Key Changes
Improved Performance
@vladmandic/human
offers significantly better performance compared toface-api
, with faster processing and more reliable face detection.Configuration Adjustments
maxDetected
to 100 (default is 1) to support multiple faces in a single image.descriptor
(128 dimensions inface-api
), now calledembedding
(1024 dimensions inhuman
).Face Clustering
MIN_CLUSTER_SEPARATION
,MAX_CLUSTER_EDGE_LENGTH
) might need optimization for the new 1024-dimensional embeddings. Suggestions and assistance on tuning these are welcome! 🙏Compatibility Note
face-api
are incompatible withhuman
embeddings. This requires a reset of existing data to align with the new system.Testing
Request for Feedback
Recap (Help Needed 🙏)
Let me know if there's anything I can improve or clarify further!