Language‐Guided Robot Grasping Based on Basic Geometric Shape Fitting
Abstract
In open‐world robotic manipulation, language‐guided model‐free grasping has attracted increasing attention. However, existing approaches often overlook the geometric structure of target objects, which limits the effectiveness of subsequent tasks such as manipulation and placement. To address this limitation, a novel method called Language‐Guided Grasping via Primitive Fitting is proposed. The approach integrates language instructions with multimodal perception and uses structured geometric modeling to enhance the semantic interpretability and downstream usability of the grasp. Specifically, the user‐specified object is first localized in 2D images and depth data via multimodal understanding. Then, primitive fitting with basic geometric shapes (e.g., cuboids, ellipsoids, truncated cones) is performed on the object's point cloud to extract approximate size and structural features. Based on this geometric information, a grasp pose generation strategy guided by semantic geometry is defined, and modules for grasp feasibility filtering and task‐oriented optimization are introduced to select the optimal grasp pose. The method is validated in complex real‐world environments and achieves grasp success rates of 95% in structured scenes and 90% in cluttered scenes. Geometric fitting enhances post‐grasp predictability and semantic consistency, enabling better generalization and planning.
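The primitive‐fitting step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the primitives are fitted, so the code below uses a common stand‐in, a PCA‐based oriented cuboid fit with NumPy, and should not be read as the authors' implementation. The function names `fit_cuboid_pca` and `grasp_width_ok` and the 0.08 m gripper opening are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_cuboid_pca(points):
    """Fit an oriented cuboid to an (N, 3) point cloud using PCA.

    Illustrative stand-in only; not the fitting method of the paper.
    Returns (center, axes, extents): axes is a 3x3 matrix whose columns
    are the cuboid axes, ordered from longest to shortest spread.
    """
    points = np.asarray(points, dtype=float)
    mean = points.mean(axis=0)
    centered = points - mean
    # Eigenvectors of the covariance matrix give the principal axes.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    axes = eigvecs[:, order]
    # Express the points in the principal frame and take min/max bounds.
    local = centered @ axes
    mins, maxs = local.min(axis=0), local.max(axis=0)
    extents = maxs - mins
    # Box center is the midpoint of the bounds, mapped back to world frame.
    center = mean + axes @ ((mins + maxs) / 2.0)
    return center, axes, extents

def grasp_width_ok(extents, max_opening=0.08):
    """Hypothetical feasibility check: can a gripper with the given
    maximum opening (meters) close across the cuboid's shortest side?"""
    return bool(np.min(extents) <= max_opening)
```

A feasibility filter of this kind would only be one small part of the grasp selection described above; the fitted extents and axes are the kind of approximate size and structural features that such a filter could consume.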
Topics & Keywords
Authors (6)
Qun Niu
Chuanlin Zhang
Tianyu Zhang
Jieliang Zhao
Tie Fu
Xuemei Chen
Quick Access
- Publication Year
- 2026
- Source Database
- DOAJ
- DOI
- 10.1002/aisy.202501276
- Access
- Open Access ✓