Contact Map Transfer with Conditional Diffusion Model for Generalizable Dexterous Grasp Generation

NeurIPS 2025

Yiyao Ma, Kai Chen, Kexin Zheng, and Qi Dou
Department of Computer Science and Engineering, The Chinese University of Hong Kong.
Corresponding Author
Figure 1. Comparison of our proposed framework with existing analytical and generative methods.

Abstract

Dexterous grasp generation is a fundamental challenge in robotics, requiring both grasp stability and adaptability across diverse objects and tasks. Analytical methods ensure stable grasps but are inefficient and lack task adaptability, while generative approaches improve efficiency and task integration but generalize poorly to unseen objects and tasks due to data limitations. In this paper, we propose a transfer-based framework for dexterous grasp generation, leveraging a conditional diffusion model to transfer high-quality grasps from shape templates to novel objects within the same category. Specifically, we reformulate the grasp transfer problem as the generation of an object contact map, incorporating object shape similarity and task specifications into the diffusion process. To handle complex shape variations, we introduce a dual mapping mechanism, capturing the intricate geometric relationships between shape templates and novel objects. Beyond the contact map, we derive two additional object-centric maps, the part map and direction map, to encode finer contact details for more stable grasps. We then develop a cascaded conditional diffusion framework to jointly transfer these three maps, ensuring their intra-consistency. Finally, we introduce a robust grasp recovery mechanism, identifying reliable contact points and optimizing grasp configurations efficiently. Extensive experiments demonstrate the superiority of our proposed method. Our approach effectively balances grasp quality, generation efficiency, and generalization performance across various tasks.

Methodology

Conditional Diffusion Model for Contact Map Transfer

Figure 2. Our conditional diffusion model learns to transfer contact maps via a template-target framework.

To handle the complex shape variations across diverse objects, we introduce a dual mapping mechanism within the conditional diffusion model. This mechanism explicitly models two mapping relationships: between the template shape and the reconstructed template contact map, and between the noise and the contact map of the novel object. In this way, the diffusion model effectively captures the intricate geometric similarities between the shape template and the novel object under different task specifications, enabling the generation of an accurate contact map aligned with the intended grasping task.
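To make the dual mapping concrete, below is a minimal PyTorch sketch of how such a denoiser could be organized. All names (e.g., DualMapDenoiser, tpl_contact), the simple per-point MLP encoder, and the single cross-attention layer are our own illustrative assumptions rather than the authors' implementation; the two output heads realize the dual mapping by reconstructing the template contact map while denoising the contact map of the novel object.

```python
# Illustrative sketch only: module names, feature sizes, and the lightweight point
# encoder are assumptions, not the paper's actual architecture.
import torch
import torch.nn as nn

class DualMapDenoiser(nn.Module):
    def __init__(self, feat_dim=128, task_dim=32):
        super().__init__()
        self.point_enc = nn.Sequential(
            nn.Linear(3, feat_dim), nn.ReLU(), nn.Linear(feat_dim, feat_dim))
        self.time_emb = nn.Sequential(
            nn.Linear(1, feat_dim), nn.SiLU(), nn.Linear(feat_dim, feat_dim))
        self.task_emb = nn.Linear(task_dim, feat_dim)
        self.cross_attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        # Two heads realize the dual mapping: one reconstructs the template contact
        # map, the other predicts the noise on the novel object's contact map.
        self.template_head = nn.Linear(feat_dim, 1)
        self.target_head = nn.Linear(feat_dim, 1)

    def forward(self, x_t, tgt_pts, tpl_pts, tpl_contact, task, t):
        # x_t: (B, N, 1) noisy target contact map; tgt_pts: (B, N, 3); tpl_pts: (B, M, 3);
        # tpl_contact: (B, M, 1) template contact map; task: (B, task_dim); t: (B, 1) step.
        cond = (self.time_emb(t) + self.task_emb(task)).unsqueeze(1)   # (B, 1, F)
        tgt_feat = self.point_enc(tgt_pts) + x_t + cond                # broadcast over features
        tpl_feat = self.point_enc(tpl_pts) + tpl_contact + cond
        # Cross-attention lets each target point gather geometric evidence from the template.
        fused, _ = self.cross_attn(tgt_feat, tpl_feat, tpl_feat)
        return self.template_head(tpl_feat), self.target_head(fused)

if __name__ == "__main__":
    B, N, M = 2, 2048, 2048
    model = DualMapDenoiser()
    rec_tpl, eps_pred = model(torch.randn(B, N, 1), torch.rand(B, N, 3),
                              torch.rand(B, M, 3), torch.rand(B, M, 1),
                              torch.randn(B, 32), torch.rand(B, 1))
    print(rec_tpl.shape, eps_pred.shape)  # torch.Size([2, 2048, 1]) torch.Size([2, 2048, 1])
```

Training such a denoiser would supervise both heads jointly, so that reconstructing the template contact map regularizes the features used to denoise the contact map of the novel object.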

Cascaded Conditional Diffusion for Joint Contact, Part, and Direction Transfer

Figure 3. Overview of the cascaded diffusion framework.

To achieve more stable and dexterous grasping, we further derive two additional object-centric maps, the part map and the direction map, which provide richer information about the contact regions and the grasping orientations. To jointly transfer the three object-centric maps from the shape template to the novel object, we develop a cascaded conditional diffusion framework. This cascaded design enables a progressive generation process, ensuring intra-consistency among the three maps and preserving coherent contact, part, and direction information throughout the transfer. Based on the three transferred maps, we design a robust mechanism that automatically identifies object points with reliable part and direction predictions. Finally, we recover the grasp configuration parameters through a fast and robust optimization scheme, ensuring the stability and feasibility of the generated dexterous grasps for novel objects.
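As a toy illustration of the cascade, the sketch below runs three DDPM-style reverse passes, with each stage conditioned on the maps produced before it. The denoiser design, the linear-beta schedule, the output widths, and the conditioning layout are all our own assumptions, not the paper's architecture; the reliable-point selection and grasp optimization described above are omitted for brevity.

```python
# Toy sketch of cascaded sampling: contact -> part -> direction. Everything here
# (denoiser design, schedule, dimensions) is an assumption for illustration only.
import torch
import torch.nn as nn

class StageDenoiser(nn.Module):
    """Predicts the noise on one object-centric map, given the object points,
    the conditioning maps from earlier stages, and the diffusion step."""
    def __init__(self, cond_dim, out_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + cond_dim + out_dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, out_dim))

    def forward(self, x_t, pts, cond, t):
        t_feat = t.expand(x_t.shape[0], x_t.shape[1], 1)
        return self.net(torch.cat([pts, cond, x_t, t_feat], dim=-1))

@torch.no_grad()
def ddpm_sample(denoiser, pts, cond, out_dim, steps=50):
    """Standard DDPM reverse pass for one map, conditioned on earlier maps."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas, alpha_bar = 1.0 - betas, torch.cumprod(1.0 - betas, dim=0)
    x = torch.randn(pts.shape[0], pts.shape[1], out_dim)
    for i in reversed(range(steps)):
        t = torch.full((1, 1, 1), i / steps)
        eps = denoiser(x, pts, cond, t)
        mean = (x - betas[i] / torch.sqrt(1.0 - alpha_bar[i]) * eps) / torch.sqrt(alphas[i])
        x = mean + torch.sqrt(betas[i]) * torch.randn_like(x) if i > 0 else mean
    return x

def cascaded_transfer(nets, pts, template_cond):
    """Each stage conditions on all previously generated maps, keeping them consistent."""
    contact = ddpm_sample(nets["contact"], pts, template_cond, out_dim=1)
    part = ddpm_sample(nets["part"], pts, torch.cat([template_cond, contact], -1), out_dim=6)
    direction = ddpm_sample(nets["direction"], pts,
                            torch.cat([template_cond, contact, part], -1), out_dim=3)
    return contact, part, direction

if __name__ == "__main__":
    B, N, C = 1, 1024, 8   # C: width of the template/task conditioning features (assumed)
    nets = {"contact": StageDenoiser(C, 1),
            "part": StageDenoiser(C + 1, 6),
            "direction": StageDenoiser(C + 7, 3)}
    contact, part, direction = cascaded_transfer(nets, torch.rand(B, N, 3), torch.rand(B, N, C))
    print(contact.shape, part.shape, direction.shape)
```

In an actual implementation, each stage would use a dual-mapping denoiser as sketched earlier; the downstream step would then keep only object points with reliable part and direction predictions before optimizing the grasp configuration, as described above.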

Results

Quality evaluation in task-agnostic grasping.
As can be observed, both our method and the analytical methods achieve a high grasp success rate across various objects. This indicates that our method can effectively transfer dexterous grasps from a template to diverse novel objects, maintaining high grasp quality even without performing complex force-closure-based optimization for each novel object.

Table 1. Performance comparison with different task-agnostic grasp generation methods. The best results are in bold. Ours-Contact denotes the result with only the transferred contact map, while Ours denotes the result after using the jointly transferred contact, part and direction maps.
Figure 4. Qualitative comparison results with Tink for contact map transfer on novel objects.

Generalization evaluation in task-oriented grasping.
Our method effectively transfers grasps from shape templates to diverse novel objects while maintaining high grasp quality and strong alignment with task specifications. Without requiring retraining on new object categories, our method can be directly applied to novel categories to generate stable dexterous grasps for a wide range of manipulation tasks. These results indicate the superior generalization capability of the proposed method for dexterous grasp generation.
Table 2. Performance comparison on Seen and Unseen Categories across different methods. Cont. and Consis. are two metrics used to evaluate the alignment of the generated grasp with the task specifications. The best results are in bold.
Figure 5. Qualitative comparison with generative methods on unseen objects across diverse tasks.

Real-world experiments.
We conducted real-world dexterous grasping experiments on a self-developed humanoid robotic platform equipped with an Inspire dexterous hand mounted at the end of the robot arm. Our method effectively transfers the contact map from template objects to novel objects under noisy real-world observations, and achieves an average success rate of 70% across the tested categories, demonstrating the practical applicability of our approach and its robustness to real-world observation noise.
Figure 6. Visualization of the real-world experiments results. For each object, the left, middle, and right images represent the template contact map, the transferred contact map on a novel real-world object, and the recovered dexterous grasp, respectively.

Citation

Please check our paper for more details of this research work. If you find our paper and repo useful, please consider citing:
TODO: Add citation