CoTracker

class grid.model.perception.tracking.cotracker.CoTracker(*args, **kwargs)

CoTracker: Point Tracking Model

This class implements a wrapper for the CoTracker model, which tracks points across video frames in an online manner, i.e. without requiring access to the entire video at once.
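
A minimal end-to-end usage sketch (an illustration rather than verbatim library code: video_frames below is a hypothetical iterable of (H, W, 3) NumPy frames, and the (frame_index, x, y) query format follows the examples further down this page):

>>> import torch
>>> from grid.model.perception.tracking.cotracker import CoTracker
>>> queries = torch.tensor([[0., 600., 350.], [0., 600., 250.]])
>>> tracker = CoTracker(queries=queries)
>>> for frame in video_frames:  # hypothetical frame source
...     tracks, visibility = tracker.process_frame(frame)  # None except at step-size intervals
>>> final_tracks, final_visibility = tracker.finalize()  # flush any remaining frames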

Credits:

https://github.com/facebookresearch/co-tracker

License:

This code is licensed under the CC-BY-NC 4.0 License.

Parameters:
  • queries (Tensor) -- Query points to track; each row is (frame_index, x, y), as in the examples below.

  • checkpoint (str | None) -- Optional path to a CoTracker model checkpoint.

  • save_results (bool) -- Whether to save the predicted tracks and visibility.

__init__(queries, checkpoint=None, save_results=False)

Initialize the CoTracker model with user-supplied query points.

Parameters:
  • queries (Tensor) -- Query points to track; each row is (frame_index, x, y).

  • checkpoint (str | None) -- Optional path to a CoTracker model checkpoint.

  • save_results (bool) -- Whether to save the predicted tracks and visibility.

Return type:

None

finalize()

Finalize processing of remaining frames and return the final predicted tracks and visibility.

Any frames remaining in the window that have not yet been processed are processed during this call. The results are logged and, if save_results is True, saved.

Returns:

Final predicted tracks and visibility.

Return type:

Tuple[Optional[torch.Tensor], Optional[torch.Tensor]]

Example

>>> import numpy as np
>>> import torch
>>> from grid.model.perception.tracking.cotracker import CoTracker
>>> queries = torch.tensor([
...     [0., 600., 350.],
...     [0., 600., 250.],
...     [10., 600., 500.],
...     [20., 750., 600.],
...     [30., 900., 200.]
... ])
>>> cotracker = CoTracker(queries=queries)
>>> frame = np.random.rand(720, 1280, 3)  # example frame of shape (H, W, 3)
>>> pred_tracks, pred_visibility = cotracker.process_frame(frame)
>>> if pred_tracks is not None and pred_visibility is not None:
...     print("Predicted Tracks:", pred_tracks)
...     print("Predicted Visibility:", pred_visibility)
>>> final_tracks, final_visibility = cotracker.finalize()

process_frame(frame)

Process a single video frame and update tracking information.

This method appends the given frame to the model's window of frames and processes the window at intervals defined by the CoTracker model's step size. When the current frame count is a multiple of the step size, the predicted tracks and visibility are updated.

Parameters:

frame (np.ndarray) -- The video frame to be processed, as an (H, W, 3) array.

Returns:

A tuple containing the predicted tracks and visibility tensors. If the frame count is not a multiple of the step size, both values will be None.

Return type:

Tuple[Optional[torch.Tensor], Optional[torch.Tensor]]

Example

>>> import numpy as np
>>> import torch
>>> from grid.model.perception.tracking.cotracker import CoTracker
>>> queries = torch.tensor([
...     [0., 600., 350.],
...     [0., 600., 250.],
...     [10., 600., 500.],
...     [20., 750., 600.],
...     [30., 900., 200.]
... ])
>>> cotracker = CoTracker(queries=queries)
>>> frame = np.random.rand(720, 1280, 3)  # example frame of shape (H, W, 3)
>>> pred_tracks, pred_visibility = cotracker.process_frame(frame)
>>> if pred_tracks is not None and pred_visibility is not None:
...     print("Predicted Tracks:", pred_tracks)
...     print("Predicted Visibility:", pred_visibility)