shape_metric

cellseg_gsontools.geometry.shape_metric(gdf, metrics, parallel=True, num_processes=-1, col_prefix=None, create_copy=True)

Compute a set of shape metrics for every row of the gdf.

Parameters:

Name Type Description Default
gdf GeoDataFrame

The input GeoDataFrame.

required
metrics Tuple[str, ...]

A tuple or list of the names of the shape metrics to compute.

required
parallel bool

Whether to use parallel apply operations when computing the shape metrics.

True
num_processes int

The number of processes to use when parallel=True. If -1, this will use all available cores.

-1
col_prefix str

Prefix for the new column names.

None
create_copy bool

Whether to compute the metrics on a copy of the input gdf. If False, the new columns are added to the input gdf itself.

True
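The handling of `col_prefix` can be sketched as follows. This is a minimal illustration based on the source listing on this page; the helper name `normalize_prefix` and the prefix value `"nuc"` are hypothetical and not part of cellseg_gsontools.

```python
from typing import Optional


def normalize_prefix(col_prefix: Optional[str]) -> str:
    """Hypothetical helper mirroring how `shape_metric` treats `col_prefix`."""
    # No prefix requested: the metric name is used as the column name.
    if col_prefix is None:
        return ""
    # Otherwise an underscore separates the prefix from the metric name.
    return f"{col_prefix}_"


print(normalize_prefix(None) + "area")   # area
print(normalize_prefix("nuc") + "area")  # nuc_area
```

Passing a prefix is mainly useful when the same GeoDataFrame already holds columns with the same metric names.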
Note

Allowed shape metrics are:

  • area
  • major_axis_len
  • minor_axis_len
  • major_axis_angle
  • minor_axis_angle
  • compactness
  • circularity
  • convexity
  • solidity
  • elongation
  • eccentricity
  • fractal_dimension
  • sphericity
  • shape_index
  • rectangularity
  • squareness
  • equivalent_rectangular_index

Raises:

Type Description
ValueError

If metrics is not a list or tuple, or if it contains a metric that is not allowed.
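The input checks that lead to a ValueError can be sketched as below. This is a simplified stand-in modelled on the source listing on this page, not the library code itself: `validate_metrics` is a hypothetical name, and `ALLOWED` holds only an assumed subset of the allowed metrics.

```python
from typing import Sequence

# Assumed subset of the allowed metric names, for illustration only.
ALLOWED = ("area", "solidity", "eccentricity")


def validate_metrics(metrics: Sequence[str]) -> None:
    """Hypothetical stand-in for the checks at the top of `shape_metric`."""
    # A bare string is rejected: the container must be a list or a tuple.
    if not isinstance(metrics, (list, tuple)):
        raise ValueError(f"`metrics` must be a list or tuple. Got: {type(metrics)}.")
    # Every requested name must be one of the allowed metrics.
    if not all(m in ALLOWED for m in metrics):
        raise ValueError(
            f"Illegal metric in `metrics`. Got: {metrics}. Allowed metrics: {list(ALLOWED)}."
        )


validate_metrics(["eccentricity", "solidity"])  # passes silently

# Both of these fail: the first is a plain string, the second an unknown name.
for bad in ("area", ["roundness"]):
    try:
        validate_metrics(bad)
    except ValueError as err:
        print(err)
```

Note that a single metric still has to be wrapped in a list or tuple, for example `metrics=["area"]`.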

Returns:

Type Description
GeoDataFrame

The input GeoDataFrame (a copy of it if create_copy=True) with the computed shape metric columns added.

Examples:

Compute the eccentricity and solidity for each polygon in gdf.

>>> from cellseg_gsontools.geometry import shape_metric
>>> shape_metric(gdf, metrics=["eccentricity", "solidity"], parallel=True)
Source code in cellseg_gsontools/geometry/shape_metrics.py
def shape_metric(
    gdf: gpd.GeoDataFrame,
    metrics: Tuple[str, ...],
    parallel: bool = True,
    num_processes: int = -1,
    col_prefix: str = None,
    create_copy: bool = True,
) -> gpd.GeoDataFrame:
    """Compute a set of shape metrics for every row of the gdf.

    Parameters:
        gdf (gpd.GeoDataFrame):
            The input GeoDataFrame.
        metrics (Tuple[str, ...]):
            A Tuple/List of shape metrics.
        parallel (bool):
            Whether to use parallel apply operations when computing the shape metrics.
        num_processes (int):
            The number of processes to use when parallel=True. If -1,
            this will use all available cores.
        col_prefix (str):
            Prefix for the new column names.
        create_copy (bool):
            Flag whether to create a copy of the input gdf or not.

    Note:
        Allowed shape metrics are:

        - `area`
        - `major_axis_len`
        - `minor_axis_len`
        - `major_axis_angle`
        - `minor_axis_angle`
        - `compactness`
        - `circularity`
        - `convexity`
        - `solidity`
        - `elongation`
        - `eccentricity`
        - `fractal_dimension`
        - `sphericity`
        - `shape_index`
        - `rectangularity`
        - `squareness`
        - `equivalent_rectangular_index`

    Raises:
        ValueError:
            If `metrics` is not a list or tuple, or if it contains a metric
            that is not allowed.

    Returns:
        gpd.GeoDataFrame:
            The input geodataframe with computed shape metric columns added.

    Examples:
        Compute the eccentricity and solidity for each polygon in gdf.
        >>> from cellseg_gsontools.geometry import shape_metric
        >>> shape_metric(gdf, metrics=["eccentricity", "solidity"], parallel=True)
    """
    if not isinstance(metrics, (list, tuple)):
        raise ValueError(f"`metrics` must be a list or tuple. Got: {type(metrics)}.")

    allowed = list(SHAPE_LOOKUP.keys())
    if not all(m in allowed for m in metrics):
        raise ValueError(
            f"Illegal metric in `metrics`. Got: {metrics}. Allowed metrics: {allowed}."
        )

    if create_copy:
        gdf = gdf.copy()

    if col_prefix is None:
        col_prefix = ""
    else:
        col_prefix += "_"

    met = list(metrics)
    if "area" in metrics:
        gdf[f"{col_prefix}area"] = gdf.area
        met.remove("area")

    for metric in met:
        # Apply the prefix here as well, so every new column is named consistently.
        gdf[f"{col_prefix}{metric}"] = gdf_apply(
            gdf,
            SHAPE_LOOKUP[metric],
            columns=["geometry"],
            parallel=parallel,
            num_processes=num_processes,
        )

    return gdf
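The lookup-and-apply pattern used in the listing above can be sketched in plain Python. This is only an illustration under stated assumptions: polygons are plain lists of vertices instead of GeoDataFrame geometries, `LOOKUP` is a hypothetical stand-in for `SHAPE_LOOKUP`, and `perimeter` is an illustrative metric that is not one of the allowed shape metrics.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Vertices in order; the ring is not explicitly closed.
Polygon = Sequence[Tuple[float, float]]


def polygon_area(poly: Polygon) -> float:
    """Area of a simple polygon via the shoelace formula."""
    n = len(poly)
    twice_area = sum(
        poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def polygon_perimeter(poly: Polygon) -> float:
    """Sum of the edge lengths, including the closing edge."""
    n = len(poly)
    return sum(math.dist(poly[i], poly[(i + 1) % n]) for i in range(n))


# Hypothetical lookup table: metric name -> function of one geometry.
LOOKUP: Dict[str, Callable[[Polygon], float]] = {
    "area": polygon_area,
    "perimeter": polygon_perimeter,
}


def compute_metrics(
    polys: Sequence[Polygon], metrics: Sequence[str]
) -> Dict[str, List[float]]:
    """Return one column of values per requested metric."""
    unknown = [m for m in metrics if m not in LOOKUP]
    if unknown:
        raise ValueError(f"Illegal metric in `metrics`: {unknown}.")
    # One pass over the geometries per metric, like the loop in the listing.
    return {m: [LOOKUP[m](p) for p in polys] for m in metrics}


square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
print(compute_metrics([square], ["area", "perimeter"]))
# {'area': [4.0], 'perimeter': [8.0]}
```

Keeping the metric functions in a dictionary means a new metric only needs a new entry in the table; the validation and the apply loop stay unchanged.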