DIV-Nav: Open-Vocabulary Spatial Relationships for Multi-Object Navigation

Anonymous Author 1^*, Anonymous Author 2^*, Anonymous Author 3^*, Anonymous Author 4^*, Anonymous Author 5

Anonymous Institution
IROS 2026 SUBMISSION
^*Indicates Equal Contribution

Overview of DIV-Nav. (1) We Decompose spatially constrained instructions into object-level queries on a semantic map (a-b); (2) compute Intersections to create a joint belief map that actively guides online frontier exploration toward regions where the objects likely co-exist (c-d); (3) Validate high-similarity regions by approaching them and using large vision-language model (LVLM) reasoning to confirm the individual objects and their spatial relationships (e).

Abstract

Advances in open-vocabulary semantic mapping and object navigation have enabled robots to perform an informed search of their environment for an arbitrary object. However, such zero-shot object navigation is typically designed for simple queries with an object label such as ‘television’ or ‘blue rug’. While single- and multi-object search have progressed, real-time navigation with explicit spatial relationship reasoning during online map construction remains largely unexplored, as existing methods either rely on offline 3D reconstruction or handle only individual object targets without relational constraints. Here, we consider more complex free-text queries with spatial relationships, such as ‘find the remote on the table’. We present DIV-Nav, a real-time semantic map-based navigation system for sequential multi-object search that addresses this problem through a series of relaxations: i) Decomposing natural language instructions with complex spatial constraints into simpler object-level queries, ii) computing the Intersection of individual semantic belief maps via continuous-valued scoring to identify regions where all objects co-exist, and iii) Validating discovered objects against the original spatial constraints via a vision-language model. We further investigate how to adapt frontier exploration for online semantic mapping to such spatial search queries to more effectively guide the search process. We validate our system through extensive experiments on the MultiON benchmark and real-world deployment on a Boston Dynamics Spot robot, achieving an 88% success rate on multi-object spatial-relationship navigation tasks.
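To make the Intersection step more concrete, the following is a minimal sketch of how per-object belief maps might be fused and used to rank frontiers. It is an illustration under stated assumptions, not the system's actual implementation: the abstract only says the intersection uses continuous-valued scoring, so the geometric mean used here, the function names `joint_belief` and `best_frontier`, the neighbourhood `radius`, and the toy 3x3 maps are all hypothetical.

```python
import numpy as np


def joint_belief(maps, eps=1e-6):
    """Fuse per-object similarity maps (values in [0, 1]) into one joint map.

    Assumption: a geometric mean is used as the continuous-valued
    "intersection"; a cell scores high only if every object scores high there.
    """
    stacked = np.stack(maps, axis=0)
    clipped = np.clip(stacked, eps, 1.0)  # avoid log(0)
    return np.exp(np.log(clipped).mean(axis=0))


def best_frontier(joint, frontiers, radius=1):
    """Return the frontier cell whose neighbourhood has the highest mean joint belief."""
    h, w = joint.shape
    best, best_score = None, -np.inf
    for r, c in frontiers:
        r0, r1 = max(r - radius, 0), min(r + radius + 1, h)
        c0, c1 = max(c - radius, 0), min(c + radius + 1, w)
        score = float(joint[r0:r1, c0:c1].mean())
        if score > best_score:
            best, best_score = (r, c), score
    return best, best_score


# Toy similarity maps for the decomposed queries "remote" and "table".
remote = np.array([[0.1, 0.2, 0.1],
                   [0.1, 0.9, 0.8],
                   [0.1, 0.1, 0.7]])
table = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.8, 0.9],
                  [0.1, 0.1, 0.2]])

joint = joint_belief([remote, table])
frontier, score = best_frontier(joint, [(0, 0), (1, 2)], radius=0)
print(frontier, round(score, 3))
```

In this toy example the cell at (0, 0) has a strong "table" response but a weak "remote" response, so its joint score stays low and the frontier at (1, 2), where both objects score highly, is preferred. Validation of the actual spatial relationship (‘on’) is left to the later vision-language model step and is not modelled here.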

Problem Statement.

Our contributions.

Evaluation Results.

Real World Experiments

Scene 1 - A robotics research lab.

Scene 2 - An office common space.

System Overview

Full Video Presentation

BibTeX

@article{Anonymous2025,
title={DIV-Nav: Open-Vocabulary Spatial Relationships for Multi-Object Navigation},
author={Anonymous Authors},
journal={ICRA 2026},
year={2025},
note={Anonymous submission}
}
