Other

Other tasks that don't fit into specific categories

4 datasets · 0 results tracked

Below are the standard benchmark datasets for this catch-all category, along with current state-of-the-art results where tracked.

Benchmarks & SOTA

NAVI

NAVI: Category-Agnostic Image Collections with High-Quality 3D Shape and Pose Annotations

NAVI is a category-agnostic multi-view image collection dataset with high-quality 3D scans and precise 2D–3D alignments (per-image camera poses). It was created to enable systematic evaluation of image-based 3D reconstruction, multi-view geometric correspondence, and surface-level/keypoint correspondence tasks from casual (in-the-wild) image collections where traditional SfM often fails. NAVI provides object-centric image collections paired with near-perfect ground-truth camera parameters and 3D shapes, enabling extraction of accurate cross-view correspondences and evaluation following protocols such as Probe3D. Primary resources: NAVI project site (https://navidataset.github.io/), NeurIPS 2023 paper and supplemental materials, and the google/navi GitHub repository.
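Because NAVI supplies near-perfect per-image camera poses, cross-view correspondences can be obtained by projecting a scanned 3D surface point into each image with the standard pinhole camera model. The sketch below shows generic pinhole projection only; it does not reflect NAVI's actual file formats or any released API, and the function name and argument layout are illustrative:

```python
def project(point_w, R, t, fx, fy, cx, cy):
    """Project a 3D world point into pixel coordinates (pinhole model).

    point_w: world point [x, y, z]
    R, t:    world-to-camera rotation (3x3 nested lists) and translation
    fx, fy:  focal lengths in pixels; cx, cy: principal point
    """
    # World -> camera frame: X_c = R @ X_w + t
    xc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    # Perspective divide, then apply intrinsics to get pixel coordinates
    u = fx * xc[0] / xc[2] + cx
    v = fy * xc[1] / xc[2] + cy
    return u, v

# A point 2 m in front of an identity-pose camera lands at the principal point.
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
u, v = project([0.0, 0.0, 0.0], I3, [0.0, 0.0, 2.0], 500.0, 500.0, 320.0, 240.0)
```

Given two images with known poses, projecting the same scanned surface point through each camera yields a ground-truth pixel correspondence, which is how pose-annotated datasets like this one support correspondence evaluation.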

No results tracked yet

SPair-71k

SPair-71k: A Large-scale Benchmark for Semantic Correspondence

SPair-71k is a large-scale benchmark dataset for semantic correspondence (semantic keypoint matching) introduced by Min et al. (2019). It contains 70,958 semantically paired images with large intra-class variations in viewpoint and scale and provides accurate, rich annotations intended for evaluating semantic correspondence methods. Annotations include per-image-pair semantic keypoint correspondences, bounding boxes, segmentation masks and metadata about viewpoint/scale variation, truncation and occlusion. The dataset is commonly used as a testbed for semantic keypoint/correspondence and matching algorithms and is distributed with a project page and an arXiv preprint (arXiv:1908.10543). A Hugging Face dataset mirror is also available.
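Semantic correspondence on SPair-71k is commonly reported as PCK@α (percentage of correct keypoints): a predicted keypoint counts as correct if it lies within α · max(w, h) of the ground-truth keypoint, where (w, h) is the object bounding-box size and α = 0.1 is a common setting. A minimal sketch of the metric, with illustrative function name and data layout:

```python
def pck(pred_kps, gt_kps, bbox, alpha=0.1):
    """PCK@alpha: fraction of predicted keypoints within
    alpha * max(bbox width, bbox height) of the ground truth.

    pred_kps, gt_kps: aligned lists of (x, y) keypoints
    bbox:             (x, y, w, h) of the target object
    """
    thresh = alpha * max(bbox[2], bbox[3])
    correct = sum(
        1 for (px, py), (gx, gy) in zip(pred_kps, gt_kps)
        if ((px - gx) ** 2 + (py - gy) ** 2) ** 0.5 <= thresh
    )
    return correct / len(gt_kps)
```

For a 100×100 box at α = 0.1, the tolerance is 10 pixels, so a prediction 14 pixels off is scored as incorrect.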

No results tracked yet

Revisited Oxford (R-Oxford) — Medium split

Revisited Oxford (R-Oxford / Roxford5k) — Medium split

Revisited Oxford (R-Oxford, also referred to as Roxford5k) is the corrected/re-annotated version of the classic Oxford Buildings image retrieval benchmark introduced in “Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking” (Radenović et al., CVPR 2018 / arXiv:1803.11285). The authors provide revised ground-truth annotations (including bounding boxes and an updated query list: the 55 original queries plus 15 new challenging queries = 70 queries), three evaluation protocols of different difficulty (Easy / Medium / Hard), and an optional R1M set of hard distractor images for large-scale testing. The “Medium” split is the medium-difficulty evaluation protocol from this benchmark (i.e., the dataset subset/protocol used when reporting Medium-difficulty mAP in papers). The dataset is widely used for instance-level image retrieval / landmark retrieval evaluation; the authors publish the images (original Oxford images) and the revisited annotation files (e.g. gnd_roxford5k.mat) and provide code and downloads from the project page.
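Under the revisited protocol, mAP is computed per query after removing that query's junk images from the ranked list; in the Medium setting, both easy- and hard-labeled images count as positives. A minimal average-precision sketch under those assumptions (identifiers are illustrative, not the authors' released evaluation code):

```python
def average_precision(ranked_ids, positives, junk):
    """AP for one query: junk images are skipped (as if absent from the
    ranking), and precision is accumulated at each positive hit."""
    positives, junk = set(positives), set(junk)
    hits, rank, precisions = 0, 0, []
    for img_id in ranked_ids:
        if img_id in junk:
            continue  # junk entries do not consume a rank position
        rank += 1
        if img_id in positives:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(positives) if positives else 0.0
```

Mean AP over all 70 queries (with the per-query positive/junk sets taken from the revisited annotations, e.g. gnd_roxford5k.mat) gives the Medium mAP reported in papers.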

No results tracked yet

Natural Questions

Natural Questions (NQ)

Natural Questions (NQ) is a large question-answering corpus released by Google Research. Questions are real, anonymized, aggregated queries issued to the Google search engine. For each question, annotators are given a selected Wikipedia page (from the top search results) and label a long answer (typically a paragraph) and, if present, a short answer (one or more entities or spans); pages are labeled null when no answer is present. NQ is intended to require reading and comprehending entire Wikipedia articles and is used for open-domain and reading-comprehension QA research. The public release contains on the order of a few hundred thousand examples (commonly cited: ~307k training examples) and is English-only. License: CC BY-SA 3.0.
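In the simplified NQ release format, each example carries a whitespace-tokenized document and annotations with token-offset spans, so a short answer is recovered by slicing the token list. A toy sketch under that assumption (field names follow the simplified-format release, but verify against the official data; the helper name is illustrative):

```python
def extract_short_answers(example):
    """Return short-answer strings from a simplified-format NQ example
    by slicing [start_token, end_token) spans out of the document."""
    tokens = example["document_text"].split(" ")
    answers = []
    for ann in example["annotations"]:
        for sa in ann.get("short_answers", []):
            answers.append(" ".join(tokens[sa["start_token"]:sa["end_token"]]))
    return answers

# Toy example mimicking the simplified-format schema (not real NQ data).
example = {
    "question_text": "who wrote the paper",
    "document_text": "The paper was written by Jane Doe in 2019 .",
    "annotations": [{"short_answers": [{"start_token": 5, "end_token": 7}]}],
}
```

Long answers use the same token-offset convention over the candidate spans, which is why systems are evaluated on span selection rather than free-form text generation.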

No results tracked yet
