Possible GT vertex / image misalignment in val split

by sunj - opened 7 days ago

sunj

7 days ago

edited 7 days ago

Hi! While training on usm3d/hoho22k_2026_trainval, I noticed that for some validation buildings the wf_vertices don't line up with the images when projected using the cameras that come with the sample.

To project, I used the cameras inside the colmap zip (via pycolmap.Reconstruction(...).images[i].cam_from_world()), since those are the cameras available for every view (the top-level K/R/t are zero when pose_only_in_colmap == True).
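For reference, the projection step can be sketched as below. This is a minimal pinhole sketch, not the exact code used for the overlays: it assumes R and t follow the world-to-camera convention (which is what cam_from_world denotes in COLMAP), that K is a 3x3 intrinsics matrix, and it ignores lens distortion. The function name project_points and its arguments are hypothetical; extracting K, R, t from the pycolmap objects is not shown.

```python
import numpy as np

def project_points(points_world, K, R, t):
    """Project Nx3 world points to pixel coordinates with a pinhole camera.

    Assumes x_cam = R @ X_world + t (world-to-camera) and no lens distortion.
    Returns (uv, in_front): uv is Nx2 with NaN for points behind the camera,
    in_front is a boolean mask of points with positive depth.
    """
    points_world = np.asarray(points_world, dtype=float)
    K = np.asarray(K, dtype=float)
    R = np.asarray(R, dtype=float)
    t = np.asarray(t, dtype=float).reshape(3)

    points_cam = points_world @ R.T + t        # world frame -> camera frame
    in_front = points_cam[:, 2] > 1e-9         # keep only positive depth

    uv = np.full((len(points_world), 2), np.nan)
    homog = points_cam[in_front] @ K.T         # apply intrinsics
    uv[in_front] = homog[:, :2] / homog[:, 2:3]  # perspective divide
    return uv, in_front
```

Running the same function on both wf_vertices and colmap.points3D with the same camera makes the two overlays directly comparable.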

Example: building index 77 in the validation split. Red dots are the projected wf_vertices, green dots are the projected colmap.points3D:

The green points cover the building nicely (as expected since they came from these images), but the red vertices sit visibly below / off the actual roof edges — not something that occlusion alone explains.

Doing the same for every validation building and measuring the 3D distance from each wf_vertex to the nearest colmap.points3D.xyz:

For 6 / 170 val buildings the median GT→COLMAP distance is > 1 m:

val index	median dist (m)	max dist (m)
133	1173.32	1182.02
119	18.26	21.22
100	15.53	27.53
127	2.08	3.93
77	1.41	5.44
149	1.11	2.52
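The distance measurement described above can be sketched roughly as follows. This is a brute-force NumPy version written for illustration, assuming both inputs are non-empty Nx3 arrays in the same units; the function names are hypothetical, and a KD-tree would be a faster choice for large point clouds.

```python
import numpy as np

def nearest_point_distances(vertices, points, chunk=32):
    """For each vertex (Nx3), return the distance to its nearest point (Mx3)."""
    vertices = np.asarray(vertices, dtype=float)
    points = np.asarray(points, dtype=float)
    out = np.empty(len(vertices))
    # Process vertices in small chunks to keep the pairwise array small.
    for start in range(0, len(vertices), chunk):
        block = vertices[start:start + chunk]
        diff = block[:, None, :] - points[None, :, :]   # (chunk, M, 3)
        dist = np.sqrt((diff ** 2).sum(axis=-1))        # (chunk, M)
        out[start:start + chunk] = dist.min(axis=1)
    return out

def summarize_distances(vertices, points):
    """Return (median, max) of the per-vertex nearest-point distances."""
    d = nearest_point_distances(vertices, points)
    return float(np.median(d)), float(d.max())
```

Here vertices would be the sample's wf_vertices and points the xyz coordinates of colmap.points3D, giving one (median, max) pair per building.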

Per-view projection overlays for all 6 are here, with 4 normal ones for comparison:

The offset in this one is smaller than in the cases above, but it is still shifted.

My guess is wf_vertices might be in the BPO frame while colmap.points3D is in the SfM frame, and for these 6 buildings the two frames aren't close.

A couple of questions:

Is this a known issue, or am I doing something wrong in how I'm getting the cameras?
If wf_vertices really is in a different frame, is there a per-sample transform we're supposed to apply that I missed?
Is the same mismatch present in the train and test (leaderboard) splits?

Thanks!

dmytromishkin

Urban Scene Modeling Competition CVPR 2026 (Image Track) org 7 days ago

Hi,

Some of the colmap cameras are wrong -- that could be present in all parts of the dataset (train, val, test).
The poses from the non-colmap (K,R,t) + wf_vertices should be consistent, but some of the colmap reconstructions are not -- either because the whole reconstruction is wrong, or because the registration failed.
We have done our best to clean those cases as much as possible, but some of the wrong reconstructions remain.
It is up to you how to process such scenes. Hopefully, there are not too many of them to influence the final results.
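One possible way to handle such scenes, given only as a sketch and not as an organizer recommendation, is to flag them with the median GT-to-COLMAP distance from the original post and treat the flagged ones separately (for example, skip the colmap data for them). The helper name, the input dictionary, and the 1 m default threshold are assumptions for illustration.

```python
def split_by_median_distance(median_dist_by_index, threshold_m=1.0):
    """Split scene indices into (ok, suspect) by median GT-to-COLMAP distance.

    median_dist_by_index: dict mapping scene index -> median distance in meters.
    threshold_m: hypothetical cut-off; scenes above it are treated as suspect.
    """
    ok = sorted(i for i, d in median_dist_by_index.items() if d <= threshold_m)
    suspect = sorted(i for i, d in median_dist_by_index.items() if d > threshold_m)
    return ok, suspect
```

The threshold is a judgment call; a value that is too low would also flag scenes where the colmap points simply do not cover the roof well.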

--
Best, Dmytro.
