label_connected_components

cellseg_gsontools.clustering.label_connected_components(gdf, sub_graphs, label_col, min_size=10)

Assign cluster labels to the objects in the GeoDataFrame.

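Labels come from the connected components of the graph: each component with at least min_size nodes gets its own integer label, numbered consecutively from 0. Smaller components are skipped.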

Parameters:

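| Name | Type | Description | Default |
| --- | --- | --- | --- |
| gdf | GeoDataFrame | The GeoDataFrame with the objects to assign cluster labels to. | required |
| sub_graphs | List[W] | The connected components of the graph, one W object per component. | required |
| label_col | str | The name of the column that the cluster labels are written to. | required |
| min_size | int | The minimum number of nodes a component must have to receive a label. Smaller components are skipped. | 10 |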

Returns:

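| Name | Type | Description |
| --- | --- | --- |
| gdf | GeoDataFrame | The input GeoDataFrame with the cluster labels written to label_col. The input is modified in place and the same object is returned. |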
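The labeling rule can be illustrated without any geospatial dependencies. The following is a minimal sketch, not the library's implementation: `label_components` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only mimic the `neighbors` mapping of a W object.

```python
from types import SimpleNamespace


def label_components(n_rows, sub_graphs, min_size=10, fill=None):
    """Return one label per row; rows in too-small components keep `fill`."""
    labels = [fill] * n_rows
    next_label = 0
    for comp in sub_graphs:
        # The keys of `neighbors` are the node ids of this component.
        idxs = list(comp.neighbors.keys())
        if len(idxs) < min_size:
            continue  # component too small: skip it, do not consume a label
        for idx in idxs:
            labels[idx] = next_label
        next_label += 1
    return labels


# Hypothetical stand-ins for connected components (not real W objects).
big = SimpleNamespace(neighbors={0: [1, 2], 1: [0, 2], 2: [0, 1]})
small = SimpleNamespace(neighbors={3: [4], 4: [3]})
other = SimpleNamespace(neighbors={5: [6, 7], 6: [5, 7], 7: [5, 6]})

labels = label_components(8, [big, small, other], min_size=3)
print(labels)  # [0, 0, 0, None, None, 1, 1, 1]
```

Note that the skipped component does not use up a label, so the labels of the remaining components stay consecutive.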

Examples:

Assign cluster labels to the objects in a GeoDataFrame.

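>>> from cellseg_gsontools.clustering import label_connected_components
>>> from cellseg_gsontools.graphs import fit_graph
>>> from cellseg_gsontools.utils import read_gdf, set_uid
>>> # NOTE: get_connected_components must also be imported; its module is not shown here.
>>> cells = read_gdf("cells.geojson")
>>> cells = cells[cells["class_name"] == "inflammatory"]
>>> # Assign unique ids ("uid") to the filtered cells.
>>> cells = set_uid(cells)
>>> # Distance-band graph: cells within the threshold distance are linked.
>>> w = fit_graph(cells, type="distband", id_col="uid", thresh=100)
>>> sub_graphs = get_connected_components(cells, w)
>>> # label_connected_components looks up the "label" column, so it must already exist.
>>> labeled_cells = label_connected_components(
...     cells, sub_graphs, "label", min_size=10
... )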
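Going by the source listing on this page, the function resolves the label column with `gdf.columns.get_loc(label_col)`, which raises a `KeyError` when the column is missing. The sketch below shows one way to prepare the column beforehand. It uses a plain pandas DataFrame as a stand-in for the GeoDataFrame, and the sentinel value `-1` for unlabeled rows is an assumption, not something the library prescribes.

```python
import pandas as pd

# Hypothetical stand-in for the cells GeoDataFrame.
cells = pd.DataFrame({"class_name": ["inflammatory"] * 4})

# Looking up a missing column fails with KeyError.
try:
    cells.columns.get_loc("label")
    has_label = True
except KeyError:
    has_label = False

# Create the column first; -1 marks "no cluster" (assumed convention).
cells["label"] = -1

# The same kind of positional write that the function performs.
col = cells.columns.get_loc("label")
cells.iloc[[0, 1, 2], col] = 0

print(has_label)                # False
print(cells["label"].tolist())  # [0, 0, 0, -1]
```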
Source code in cellseg_gsontools/clustering.py
def label_connected_components(
    gdf: gpd.GeoDataFrame, sub_graphs: List[W], label_col: str, min_size: int = 10
) -> gpd.GeoDataFrame:
    """Assign cluster labels to the objects in the GeoDataFrame.

    The cluster labels are assigned based on the connected components of the graph.

    Parameters:
        gdf (gpd.GeoDataFrame):
            The GeoDataFrame with the objects to assign cluster labels to.
        sub_graphs (List[W]):
            The connected components of the graph.
        label_col (str):
            The column name to assign the cluster labels to.
        min_size (int):
            The minimum size of the cluster to assign a label.

    Returns:
        gdf (gpd.GeoDataFrame):
            The GeoDataFrame with the assigned cluster labels.

    Examples:
        Assign cluster labels to the objects in a GeoDataFrame.

        >>> from cellseg_gsontools.clustering import label_connected_components
        >>> from cellseg_gsontools.graphs import fit_graph
        >>> from cellseg_gsontools.utils import read_gdf, set_uid
        >>> # NOTE: get_connected_components must also be imported.
        >>> cells = read_gdf("cells.geojson")
        >>> cells = cells[cells["class_name"] == "inflammatory"]
        >>> cells = set_uid(cells)
        >>> w = fit_graph(cells, type="distband", id_col="uid", thresh=100)
        >>> sub_graphs = get_connected_components(cells, w)
        >>> labeled_cells = label_connected_components(
        ...     cells, sub_graphs, "label", min_size=10
        ... )
    """
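    # Running label for the next component that is large enough.
    i = 0
    for ww in sub_graphs:
        # Node ids of this component; used below as positional row indices.
        idxs = list(ww.neighbors.keys())
        # Components smaller than min_size are skipped and keep their existing value.
        if len(idxs) < min_size:
            continue

        # label_col must already exist in gdf; get_loc raises KeyError otherwise.
        gdf.iloc[idxs, gdf.columns.get_loc(label_col)] = i
        i += 1

    # gdf was modified in place; the same object is returned.
    return gdf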
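Because the function indexes rows with `gdf.iloc[idxs, ...]`, the keys of `ww.neighbors` are treated as row positions. When the graph ids are not simply 0..n-1 in row order, they would need to be translated first. A minimal sketch of such a translation follows; `ids_to_positions` is a hypothetical helper and not part of cellseg_gsontools.

```python
def ids_to_positions(index_ids, component_ids):
    """Map graph node ids to positional row indices.

    index_ids: the ids in row order (for example, list(gdf.index)).
    component_ids: the ids belonging to one connected component.
    """
    pos = {uid: i for i, uid in enumerate(index_ids)}
    return [pos[uid] for uid in component_ids]


# Ids that do not coincide with row positions.
index_ids = [10, 11, 15, 42]
positions = ids_to_positions(index_ids, [15, 42])
print(positions)  # [2, 3]
```

In the example on this page the ids are assigned with set_uid after filtering, which presumably keeps them aligned with the row positions, so no translation is needed there.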