Question: Obtaining Bones and Joints #88

Open
Kaszanas opened this issue Nov 21, 2024 · 12 comments

@Kaszanas
Contributor

I would like to ask whether it is possible to obtain the joints and bones as presented in the dataset that was introduced on the project page:

[image]

I think the related issue is #86

@geopavlakos
Owner

For the HInt dataset, we only annotate the 2D keypoints of the hand.
If you want to get the 3D bone orientation or the 3D joints of the hand, you could run the HaMeR network on these images. This will not be ground truth, but it will give you a reasonable estimate for these parameters.
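For reference, here is a minimal sketch of what running HaMeR on your own images could look like. It follows the structure of the repository's demo.py (load_hamer, ViTDetDataset, recursive_to); the image path and the whole-frame bounding box are placeholders, since the actual demo finds hand boxes with a body detector plus ViTPose:

```python
import cv2
import numpy as np
import torch

from hamer.models import load_hamer, DEFAULT_CHECKPOINT
from hamer.datasets.vitdet_dataset import ViTDetDataset
from hamer.utils import recursive_to

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load the pretrained HaMeR checkpoint (download the models first, as in the README).
model, model_cfg = load_hamer(DEFAULT_CHECKPOINT)
model = model.to(device).eval()

# Placeholder input: one image and one hand box covering the whole frame.
# The demo instead runs a person detector + ViTPose to localize hand boxes.
img_cv2 = cv2.imread('path/to/your_image.jpg')        # BGR image, as in demo.py
h, w = img_cv2.shape[:2]
boxes = np.array([[0.0, 0.0, float(w), float(h)]])    # (N, 4) xyxy hand boxes
right = np.array([1.0])                               # 1 = right hand, 0 = left hand

dataset = ViTDetDataset(model_cfg, img_cv2, boxes, right)
loader = torch.utils.data.DataLoader(dataset, batch_size=8, shuffle=False)

for batch in loader:
    batch = recursive_to(batch, device)
    with torch.no_grad():
        out = model(batch)
    # out['pred_keypoints_3d']: (B, 21, 3) estimated 3D joints in the crop camera frame
    # out['pred_mano_params']: MANO pose/shape parameters inferred by the network
```

These outputs are network estimates, not ground truth, as noted above.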

@Kaszanas
Contributor Author

Kaszanas commented Nov 21, 2024

> For the HInt dataset, we only annotate the 2D keypoints of the hand. If you want to get the 3D bone orientation or the 3D joints of the hand, you could run the HaMeR network on these images. This will not be ground truth, but it will give you a reasonable estimate for these parameters.

So if I understand correctly, during inference these 3D keypoints are inferred and then used to fit the 3D mesh of the hand?
And to get the joint positions I am supposed to follow this:

> You can get the 3D coordinates of the hand keypoints in the camera frame by adding pred_cam_t_full (which is calculated here) to out['pred_keypoints_3d'] (which is calculated here). Then you can export the sum in a pkl file.

Is there any order to the inferred 3D keypoints that the network generates?

I understand that this is not ground truth but ANN output.

@geopavlakos
Owner

Actually, the model infers the hand model parameters (pose and shape parameters from MANO). Given these parameters, we can use MANO to reconstruct the mesh and the 3D keypoints.

The description in the above issue is accurate for inferring the locations of the 3D keypoints. The keypoints are in the OpenPose order. And yes, HaMeR will only give a neural network estimate - this is not ground truth.
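To make that recipe concrete, here is a minimal sketch of computing the 3D keypoints in the full-image camera frame and dumping them to a pkl file. It assumes you are inside the demo.py loop (so `batch`, `out`, and `model_cfg` are available) and uses the cam_crop_to_full helper from hamer.utils.renderer, as demo.py does; file and variable names are placeholders:

```python
import pickle
from hamer.utils.renderer import cam_crop_to_full

# Inside the demo.py loop, after `out = model(batch)`:
# flip the x-translation of the crop camera for left hands, as demo.py does.
multiplier = (2 * batch['right'] - 1)
pred_cam = out['pred_cam'].clone()
pred_cam[:, 1] = multiplier * pred_cam[:, 1]

img_size = batch['img_size'].float()  # (B, 2): width, height of the original image
scaled_focal_length = model_cfg.EXTRA.FOCAL_LENGTH / model_cfg.MODEL.IMAGE_SIZE * img_size.max()

# Camera translation expressed with respect to the full image instead of the crop.
pred_cam_t_full = cam_crop_to_full(pred_cam,
                                   batch['box_center'].float(),
                                   batch['box_size'].float(),
                                   img_size,
                                   scaled_focal_length)

# 3D joints in the camera frame of the full image; as noted above, the keypoint
# order follows OpenPose (wrist, then thumb, index, middle, ring, pinky; 21 total).
keypoints_3d_cam = out['pred_keypoints_3d'] + pred_cam_t_full.unsqueeze(1)

with open('hand_keypoints_3d.pkl', 'wb') as f:
    pickle.dump(keypoints_3d_cam.detach().cpu().numpy(), f)
```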

@Kaszanas
Contributor Author

> Actually, the model infers the hand model parameters (pose and shape parameters from MANO). Given these parameters, we can use MANO to reconstruct the mesh and the 3D keypoints.
>
> The description in the above issue is accurate for inferring the locations of the 3D keypoints. The keypoints are in the OpenPose order. And yes, HaMeR will only give a neural network estimate - this is not ground truth.

Would you be open to merging such a change if I modify demo.py and add a separate boolean flag argument?

@Kaszanas
Contributor Author

@geopavlakos I am working on this in #89, but I am hitting some issues. I will create a devcontainer and run this with a debugger. Hopefully I can come up with a solution that works for me locally ;)

@longshot-pting

> @geopavlakos I am working on this in #89, but I am hitting some issues. I will create a devcontainer and run this with a debugger. Hopefully I can come up with a solution that works for me locally ;)

Hi, have you gotten the image with the joints visualized, like you first mentioned? I have tried pred_3d and pred_2d, but the locations do not correspond to the original image.

@Kaszanas
Contributor Author

See if you can get it working with #89

@longshot-pting

> See if you can get it working with #89

I can get the 3D keypoints and 2D keypoints, but I cannot get the right image, like this:
Uploading WechatIMG47.jpg…

@Kaszanas
Contributor Author

> See if you can get it working with #89
>
> I can get the 3D keypoints and 2D keypoints, but I cannot get the right image, like this:
> Uploading WechatIMG47.jpg…

I think the image is broken.

Anyway, if the output from #89 does not work, we will have to wait for @geopavlakos to explain the correct way to get these keypoints.

@longshot-pting

> See if you can get it working with #89
>
> I can get the 3D keypoints and 2D keypoints, but I cannot get the right image, like this:
> Uploading WechatIMG47.jpg…
>
> I think the image is broken.
>
> Anyway, if the output from #89 does not work, we will have to wait for @geopavlakos to explain the correct way to get these keypoints.

I have gotten the right image with keypoints. In the class named SkeletonRenderer, we can get pred_keypoints_3d_proj_img; when I blend it with the original image (cv2.addWeighted), it fits. But I still don't know why pred_keypoints_2d doesn't work; maybe the problem is in the perspective_projection function. Please share your idea if you have a solution, thanks!

@JuanCarlosHR2003

JuanCarlosHR2003 commented Dec 20, 2024

@longshot-pting Can you post how you got the pred_keypoints_3d_proj_img, please? I am trying to get the SkeletonRenderer to work, but to no avail.

@longshot-pting

> @longshot-pting Can you post how you got the pred_keypoints_3d_proj_img, please? I am trying to get the SkeletonRenderer to work, but to no avail.

Of course! I checked its function named skeleton_renderer and got five pictures with the skeleton drawn, of which the one named pred_keypoints_3d_proj_img is correct. I then used its focal length and camera center in HaMeR's forward_step like this:

    pred_keypoints_2d = perspective_projection(
        pred_keypoints_3d,
        translation=pred_cam_t,
        rotation=rotation.repeat(batch_size, 1, 1).to(device),
        focal_length=torch.tensor([self.cfg.EXTRA.FOCAL_LENGTH,
                                   self.cfg.EXTRA.FOCAL_LENGTH]).reshape(1, 2).repeat(batch_size, 1),
        camera_center=(torch.tensor([self.cfg.MODEL.IMAGE_SIZE,
                                     self.cfg.MODEL.IMAGE_SIZE],
                                    dtype=torch.float).reshape(1, 2) / 2.).repeat(batch_size, 1),
    )

Hope it is helpful for you.
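For anyone hitting the same mismatch: my guess (an assumption, not confirmed above) is that out['pred_keypoints_2d'] lives in the normalized coordinates of the square crop fed to the network, not in the original image. Below is a sketch of projecting the 3D joints directly into the full image instead, reusing pred_cam_t_full and the scaled focal length from demo.py; the import path for perspective_projection and the overlay code are assumptions on my part:

```python
import cv2
import torch
from hamer.utils.geometry import perspective_projection  # assumed import path

# Inside the demo.py loop, after pred_cam_t_full has been computed (see the
# earlier sketch); img_cv2 is the original BGR image.
device = out['pred_keypoints_3d'].device
batch_size = out['pred_keypoints_3d'].shape[0]
img_size = batch['img_size'].float()  # (B, 2): width, height of the original image

# Same scaled focal length that was used for cam_crop_to_full.
scaled_focal_length = model_cfg.EXTRA.FOCAL_LENGTH / model_cfg.MODEL.IMAGE_SIZE * img_size.max()

# Project the 3D joints with the full-image camera: translation from
# cam_crop_to_full, focal length scaled to the full image, principal point
# at the image center, identity rotation.
keypoints_2d_full = perspective_projection(
    out['pred_keypoints_3d'],
    translation=pred_cam_t_full,
    rotation=torch.eye(3, device=device).unsqueeze(0).expand(batch_size, -1, -1),
    focal_length=scaled_focal_length.reshape(1, 1).repeat(batch_size, 2),
    camera_center=img_size / 2.0,
)

# Draw the joints of the first hand onto a copy of the original image.
overlay = img_cv2.copy()
for x, y in keypoints_2d_full[0].detach().cpu().numpy():
    cv2.circle(overlay, (int(x), int(y)), 3, (0, 255, 0), -1)
cv2.imwrite('keypoints_overlay.jpg', overlay)
```

If this overlay lines up while pred_keypoints_2d does not, the crop-versus-full-image coordinate frame is the likely culprit.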
