Commit 03f610f

Merge branch 'master' into paul/scikit-build

sarlinpe authored Oct 13, 2023
2 parents 938801e + de3ce4c
Showing 12 changed files with 203 additions and 480 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/build-new.yml
@@ -39,7 +39,7 @@ jobs:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
-        os: [ macos-11, macos-12 ]
+        os: [ macos-11, macos-12, macos-13 ]
    steps:
      - name: Checkout
        uses: actions/checkout@v3
8 changes: 6 additions & 2 deletions .github/workflows/clang-format-check.yml
@@ -1,5 +1,10 @@
name: clang-format Check
-on: [push, pull_request]
+on:
+  push:
+    branches:
+      - master
+  pull_request:
+    types: [ assigned, opened, synchronize, reopened ]
jobs:
  formatting-check:
    name: Formatting Check
@@ -10,4 +15,3 @@ jobs:
      uses: jidicula/clang-format-action@v4.11.0
      with:
        clang-format-version: '14'
-        check-path: 'src/colmap'
65 changes: 30 additions & 35 deletions README.md
@@ -9,7 +9,7 @@ Wheels for Python 3.8/3.9/3.10 on Linux and macOS 11/12 can be installed using pip:
pip install pycolmap
```

-The wheels are automatically built and pushed to [pypi](https://pypi.org/project/pycolmap/) at each release.
+The wheels are automatically built and pushed to [pypi](https://pypi.org/project/pycolmap/) at each release. They are currently not built with CUDA support, which requires building from source (see below).

### Building from source

@@ -67,13 +67,11 @@ pycolmap.stereo_fusion(mvs_path / "dense.ply", mvs_path)
```

PyCOLMAP can leverage the GPU for feature extraction, matching, and multi-view stereo if COLMAP was compiled with CUDA support.
-
Similarly, PyCOLMAP can run Delaunay triangulation if COLMAP was compiled with CGAL support.
This requires building the package from source and is not available with the PyPI wheels.

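With such a CUDA-enabled source build, feature extraction can be pushed to the GPU. A minimal sketch (the `device` argument and the `pycolmap.Device` enum value used here are assumptions for illustration):

```python
from pathlib import Path

import pycolmap

database_path = Path("out/database.db")
image_dir = Path("images/")
# Assumed API: Device.cuda requests the GPU; Device.auto would pick
# the GPU when available and fall back to the CPU otherwise.
pycolmap.extract_features(database_path, image_dir, device=pycolmap.Device.cuda)
```
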
All of the above steps are easily configurable with python dicts which are recursively merged into
-their respective defaults, e.g.
-
+their respective defaults, for example:
```python
pycolmap.extract_features(database_path, image_dir, sift_options={"max_num_features": 512})
# equivalent to
@@ -125,14 +123,15 @@ The object API mirrors the COLMAP C++ library. The bindings support many other operations:
- projecting a 3D point into an image with an arbitrary camera model (a fuller sketch follows this list):

```python
-uv = camera.world_to_image(image.project(point3D.xyz))
+uv = camera.img_from_cam(image.cam_from_world * point3D.xyz)
```

- aligning two 3D reconstructions by their camera poses:

```python
-tfm = reconstruction1.align_poses(reconstruction2) # transforms reconstruction1 in-place
-print(tfm.rotation, tfm.translation)
+rec2_from_rec1 = pycolmap.align_reconstructions_via_reprojections(reconstruction1, reconstruction2)
+reconstruction1.transform(rec2_from_rec1)
+print(rec2_from_rec1.scale, rec2_from_rec1.rotation, rec2_from_rec1.translation)
```

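A fuller sketch of the projection above (assuming a reconstruction stored on disk and that the image ID used here exists):

```python
import pycolmap

reconstruction = pycolmap.Reconstruction("path/to/sparse/0")  # assumed path
image = reconstruction.images[1]                  # assumed image id
camera = reconstruction.cameras[image.camera_id]
for point3D_id, point3D in reconstruction.points3D.items():
    # world frame -> camera frame -> pixel coordinates
    uv = camera.img_from_cam(image.cam_from_world * point3D.xyz)
```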

@@ -166,7 +165,7 @@ reconstruction.export_VRML("rec.images.wrl", "rec.points3D.wrl",

We provide robust RANSAC-based estimators for absolute camera pose (single-camera and multi-camera-rig), essential matrix, fundamental matrix, homography, and two-view relative pose for calibrated cameras.

-All RANSAC and estimation parameters are exposed as objects that behave similarly as Python dataclasses. The RANSAC options are described in [`colmap/src/optim/ransac.h`](https://github.com/colmap/colmap/blob/dev/src/optim/ransac.h#L47-L76) and their default values are:
+All RANSAC and estimation parameters are exposed as objects that behave similarly to Python dataclasses. The RANSAC options are described in [`colmap/optim/ransac.h`](https://github.com/colmap/colmap/blob/main/src/colmap/optim/ransac.h#L45-L74) and their default values are:

```python
ransac_options = pycolmap.RANSACOptions(
@@ -187,61 +186,57 @@ For instance, to estimate the absolute pose of a query camera given 2D-3D correspondences:
# - points3D: Nx3 array; world coordinates
# - camera: pycolmap.Camera
# Optional parameters:
-# - max_error_px: float; RANSAC inlier threshold in pixels (default=12.0)
+# - estimation_options: dict or pycolmap.AbsolutePoseEstimationOptions
+# - refinement_options: dict or pycolmap.AbsolutePoseRefinementOptions
-answer = pycolmap.absolute_pose_estimation(points2D, points3D, camera, max_error_px=12.0)
-# Returns: dictionary of estimation outputs
+answer = pycolmap.absolute_pose_estimation(points2D, points3D, camera)
+# Returns: dictionary of estimation outputs or None if failure
```
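
As a runnable sketch with synthetic data (the keyword `Camera` constructor, camera parameters, and point values are illustrative assumptions; see the Camera argument section below):

```python
import numpy as np

import pycolmap

rng = np.random.default_rng(0)
# Synthetic 3D points in front of the camera (illustrative values).
points3D = rng.uniform(-1.0, 1.0, (100, 3)) + np.array([0.0, 0.0, 5.0])
camera = pycolmap.Camera(
    model="SIMPLE_PINHOLE", width=640, height=480, params=[500.0, 320.0, 240.0]
)
# Observations generated with an identity ground-truth pose,
# so camera frame == world frame here.
points2D = np.stack([camera.img_from_cam(p) for p in points3D])

answer = pycolmap.absolute_pose_estimation(points2D, points3D, camera)
if answer is not None:
    print(answer["cam_from_world"], answer["num_inliers"])
```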

-2D and 3D points are passed as Numpy arrays or lists. The options are defined in [`estimators/absolute_pose.cc`](./estimators/absolute_pose.cc#L187-L220) and can be passed as regular (nested) Python dictionaries:
+2D and 3D points are passed as Numpy arrays or lists. The options are defined in [`estimators/absolute_pose.cc`](./estimators/absolute_pose.cc#L104-L152) and can be passed as regular (nested) Python dictionaries:

```python
pycolmap.absolute_pose_estimation(
    points2D, points3D, camera,
-   estimation_options={'ransac': {'max_error': 12.0}},
-   refinement_options={'refine_focal_length': True},
+   estimation_options=dict(ransac=dict(max_error=12.0)),
+   refinement_options=dict(refine_focal_length=True),
)
```
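
Equivalently, with explicit option objects (a sketch; the attribute names mirror the dict keys above):

```python
estimation_options = pycolmap.AbsolutePoseEstimationOptions()
estimation_options.ransac.max_error = 12.0
refinement_options = pycolmap.AbsolutePoseRefinementOptions()
refinement_options.refine_focal_length = True
answer = pycolmap.absolute_pose_estimation(
    points2D, points3D, camera, estimation_options, refinement_options
)
```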

### Absolute Pose Refinement

```python
# Parameters:
-# - tvec: List of 3 floats, translation component of the pose (world to camera)
-# - qvec: List of 4 floats, quaternion component of the pose (world to camera)
+# - cam_from_world: pycolmap.Rigid3d, initial pose
# - points2D: Nx2 array; pixel coordinates
# - points3D: Nx3 array; world coordinates
# - inlier_mask: array of N bool; inlier_mask[i] is true if correspondence i is an inlier
# - camera: pycolmap.Camera
# Optional parameters:
# - refinement_options: dict or pycolmap.AbsolutePoseRefinementOptions
-answer = pycolmap.pose_refinement(tvec, qvec, points2D, points3D, inlier_mask, camera)
-# Returns: dictionary of refinement outputs
+answer = pycolmap.pose_refinement(cam_from_world, points2D, points3D, inlier_mask, camera)
+# Returns: dictionary of refinement outputs or None if failure
```
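
For instance, refining the result of the estimator above (a sketch; it reuses the output keys documented in the previous section):

```python
est = pycolmap.absolute_pose_estimation(points2D, points3D, camera)
if est is not None:
    ref = pycolmap.pose_refinement(
        est["cam_from_world"], points2D, points3D, est["inliers"], camera
    )
```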

### Essential matrix estimation

```python
# Parameters:
-# - points2D1: Nx2 array; pixel coordinates in image 1
-# - points2D2: Nx2 array; pixel coordinates in image 2
+# - points1: Nx2 array; 2D pixel coordinates in image 1
+# - points2: Nx2 array; 2D pixel coordinates in image 2
# - camera1: pycolmap.Camera of image 1
# - camera2: pycolmap.Camera of image 2
# Optional parameters:
-# - max_error_px: float; RANSAC inlier threshold in pixels (default=4.0)
-# - options: dict or pycolmap.RANSACOptions
-answer = pycolmap.essential_matrix_estimation(points2D1, points2D2, camera1, camera2)
-# Returns: dictionary of estimation outputs
+# - options: dict or pycolmap.RANSACOptions (default inlier threshold is 4px)
+answer = pycolmap.essential_matrix_estimation(points1, points2, camera1, camera2)
+# Returns: dictionary of estimation outputs or None if failure
```
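
For example, tightening the RANSAC inlier threshold (a sketch; the `options` keyword name follows the parameter list above):

```python
answer = pycolmap.essential_matrix_estimation(
    points1, points2, camera1, camera2,
    options=dict(max_error=2.0),  # RANSACOptions field, per the defaults shown earlier
)
```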

### Fundamental matrix estimation

```python
answer = pycolmap.fundamental_matrix_estimation(
-    points2D1,
-    points2D2,
-    [max_error_px], # optional RANSAC inlier threshold in pixels
+    points1,
+    points2,
+    [options], # optional dict or pycolmap.RANSACOptions
)
```
@@ -250,8 +245,8 @@ answer = pycolmap.fundamental_matrix_estimation(

```python
answer = pycolmap.homography_matrix_estimation(
-    points2D1,
-    points2D2,
+    points1,
+    points2,
    [max_error_px], # optional RANSAC inlier threshold in pixels
    [options], # optional dict or pycolmap.RANSACOptions
)
@@ -263,18 +258,18 @@ COLMAP can also estimate a relative pose between two calibrated cameras by estimating

```python
# Parameters:
-# - points2D1: Nx2 array; pixel coordinates in image 1
-# - points2D2: Nx2 array; pixel coordinates in image 2
# - camera1: pycolmap.Camera of image 1
+# - points1: Nx2 array; 2D pixel coordinates in image 1
# - camera2: pycolmap.Camera of image 2
+# - points2: Nx2 array; 2D pixel coordinates in image 2
# Optional parameters:
-# - max_error_px: float; RANSAC inlier threshold in pixels (default=4.0)
+# - matches: Nx2 integer array; correspondences across images
+# - options: dict or pycolmap.TwoViewGeometryOptions
-answer = pycolmap.homography_matrix_estimation(points2D1, points2D2)
-# Returns: dictionary of estimation outputs
+answer = pycolmap.estimate_calibrated_two_view_geometry(camera1, points1, camera2, points2)
+# Returns: pycolmap.TwoViewGeometry
```

-The options are defined in [`estimators/two_view_geometry.cc`](estimators/two_view_geometry.cc#L102-L117) and control how each model is selected. The return dictionary contains the relative pose, inlier mask, as well as the type of camera configuration, such as degenerate or planar. This type is an instance of the enum `pycolmap.TwoViewGeometry` whose values are explained in [`colmap/src/estimators/two_view_geometry.h`](https://github.com/colmap/colmap/blob/dev/src/estimators/two_view_geometry.h#L47-L67).
+The `TwoViewGeometryOptions` control how each model is selected. The output structure contains the geometric model, inlier matches, the relative pose (if `options.compute_relative_pose=True`), and the type of camera configuration, which is an instance of the enum `pycolmap.TwoViewGeometryConfiguration`.

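For example, to also recover the relative pose (a sketch; the `options` keyword name follows the parameter list above, and `compute_relative_pose` is the field named in the previous paragraph):

```python
options = pycolmap.TwoViewGeometryOptions()
options.compute_relative_pose = True
answer = pycolmap.estimate_calibrated_two_view_geometry(
    camera1, points1, camera2, points2, options=options
)
```
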
### Camera argument

130 changes: 22 additions & 108 deletions estimators/absolute_pose.cc
@@ -19,21 +19,18 @@ using namespace pybind11::literals;

#include "helpers.h"
#include "log_exceptions.h"
#include "utils.h"

-py::dict absolute_pose_estimation(
+py::object absolute_pose_estimation(
const std::vector<Eigen::Vector2d> points2D,
const std::vector<Eigen::Vector3d> points3D,
Camera& camera,
const AbsolutePoseEstimationOptions estimation_options,
const AbsolutePoseRefinementOptions refinement_options,
const bool return_covariance) {
SetPRNGSeed(0);

// Check that both vectors have the same size.
THROW_CHECK_EQ(points2D.size(), points3D.size());

-// Failure output dictionary.
-py::dict failure_dict("success"_a = false);
+py::object failure = py::none();
py::gil_scoped_release release;

// Absolute pose estimation.
@@ -48,7 +45,7 @@ py::dict absolute_pose_estimation(
&camera,
&num_inliers,
&inlier_mask)) {
-return failure_dict;
+return failure;
}

// Absolute pose refinement.
@@ -60,103 +57,46 @@ py::dict absolute_pose_estimation(
&cam_from_world,
&camera,
return_covariance ? &covariance : nullptr)) {
-return failure_dict;
-}
-
-// Convert vector<char> to vector<int>.
-std::vector<bool> inliers;
-for (auto it : inlier_mask) {
-if (it) {
-inliers.push_back(true);
-} else {
-inliers.push_back(false);
-}
+return failure;
}

// Success output dictionary.
py::gil_scoped_acquire acquire;
py::dict success_dict("success"_a = true,
"cam_from_world"_a = cam_from_world,
py::dict success_dict("cam_from_world"_a = cam_from_world,
"num_inliers"_a = num_inliers,
"inliers"_a = inliers);
"inliers"_a = ToPythonMask(inlier_mask));
if (return_covariance) success_dict["covariance"] = covariance;
return success_dict;
}

-py::dict absolute_pose_estimation(const std::vector<Eigen::Vector2d> points2D,
-const std::vector<Eigen::Vector3d> points3D,
-Camera& camera,
-const double max_error_px,
-const double min_inlier_ratio,
-const int min_num_trials,
-const int max_num_trials,
-const double confidence,
-const bool return_covariance) {
-// Absolute pose estimation parameters.
-AbsolutePoseEstimationOptions abs_pose_options;
-abs_pose_options.estimate_focal_length = false;
-abs_pose_options.ransac_options.max_error = max_error_px;
-abs_pose_options.ransac_options.min_inlier_ratio = min_inlier_ratio;
-abs_pose_options.ransac_options.min_num_trials = min_num_trials;
-abs_pose_options.ransac_options.max_num_trials = max_num_trials;
-abs_pose_options.ransac_options.confidence = confidence;
-
-// Refine absolute pose parameters.
-AbsolutePoseRefinementOptions abs_pose_refinement_options;
-abs_pose_refinement_options.refine_focal_length = false;
-abs_pose_refinement_options.refine_extra_params = false;
-abs_pose_refinement_options.print_summary = false;
-
-return absolute_pose_estimation(points2D,
-points3D,
-camera,
-abs_pose_options,
-abs_pose_refinement_options,
-return_covariance);
-}

-py::dict pose_refinement(
-const Rigid3d init_cam_from_world,
-const std::vector<Eigen::Vector2d> points2D,
-const std::vector<Eigen::Vector3d> points3D,
-const std::vector<bool> inlier_mask,
-const Camera camera,
-const AbsolutePoseRefinementOptions refinement_options) {
+py::object pose_refinement(
+const Rigid3d& init_cam_from_world,
+const std::vector<Eigen::Vector2d>& points2D,
+const std::vector<Eigen::Vector3d>& points3D,
+const PyInlierMask& inlier_mask,
+Camera& camera,
+const AbsolutePoseRefinementOptions& refinement_options) {
SetPRNGSeed(0);

// Check that both vectors have the same size.
THROW_CHECK_EQ(points2D.size(), points3D.size());
THROW_CHECK_EQ(inlier_mask.size(), points2D.size());

-// Failure output dictionary.
-py::dict failure_dict("success"_a = false);
+py::object failure = py::none();
py::gil_scoped_release release;

-// Absolute pose estimation.
Rigid3d refined_cam_from_world = init_cam_from_world;
-std::vector<char> inlier_mask_char;
-for (size_t i = 0; i < inlier_mask.size(); ++i) {
-if (inlier_mask[i]) {
-inlier_mask_char.emplace_back(1);
-} else {
-inlier_mask_char.emplace_back(0);
-}
-}

// Absolute pose refinement.
+std::vector<char> inlier_mask_char(inlier_mask.size());
+Eigen::Map<Eigen::Matrix<char, Eigen::Dynamic, 1>>(
+inlier_mask_char.data(), inlier_mask.size()) = inlier_mask.cast<char>();
if (!RefineAbsolutePose(refinement_options,
inlier_mask_char,
points2D,
points3D,
&refined_cam_from_world,
-const_cast<Camera*>(&camera))) {
-return failure_dict;
+&camera)) {
+return failure;
}

// Success output dictionary.
py::gil_scoped_acquire acquire;
return py::dict("success"_a = true,
"cam_from_world"_a = refined_cam_from_world);
return py::dict("cam_from_world"_a = refined_cam_from_world);
}

void bind_absolute_pose_estimation(py::module& m,
@@ -215,12 +155,7 @@ void bind_absolute_pose_estimation(py::module& m,
PyRefinementOptions().cast<AbsolutePoseRefinementOptions>();

m.def("absolute_pose_estimation",
-static_cast<py::dict (*)(const std::vector<Eigen::Vector2d>,
-const std::vector<Eigen::Vector3d>,
-Camera&,
-const AbsolutePoseEstimationOptions,
-const AbsolutePoseRefinementOptions,
-bool)>(&absolute_pose_estimation),
+&absolute_pose_estimation,
"points2D"_a,
"points3D"_a,
"camera"_a,
Expand All @@ -229,27 +164,6 @@ void bind_absolute_pose_estimation(py::module& m,
"return_covariance"_a = false,
"Absolute pose estimation with non-linear refinement.");

m.def("absolute_pose_estimation",
static_cast<py::dict (*)(const std::vector<Eigen::Vector2d>,
const std::vector<Eigen::Vector3d>,
Camera&,
const double,
const double,
const int,
const int,
const double,
const bool)>(&absolute_pose_estimation),
"points2D"_a,
"points3D"_a,
"camera"_a,
"max_error_px"_a = est_options.ransac_options.max_error,
"min_inlier_ratio"_a = est_options.ransac_options.min_inlier_ratio,
"min_num_trials"_a = est_options.ransac_options.min_num_trials,
"max_num_trials"_a = est_options.ransac_options.max_num_trials,
"confidence"_a = est_options.ransac_options.confidence,
"return_covariance"_a = false,
"Absolute pose estimation with non-linear refinement.");

m.def("pose_refinement",
&pose_refinement,
"cam_from_world"_a,
