Interface Context

The InterfaceContext class is used to extract cells within the interface of two or more given tissues. Here we will use it to extract the tumor-stroma interfaces and the cells within them.

After fitting the context, the InterfaceContext.context attribute will contain a nested dict, where each value is a dict containing the following key-value pairs:

roi_area - the ROI areas of type top_labels. (gpd.GeoDataFrame)
roi_cells - the cells inside the ROI areas. (gpd.GeoDataFrame)
roi_grid - the grids fitted on top of the ROI areas of type top_labels. (gpd.GeoDataFrame)
interface_area - the interface areas between the top_labels and bottom_labels areas. (gpd.GeoDataFrame)
interface_cells - the cells inside the interface_area. (gpd.GeoDataFrame)
interface_grid - the grids fitted on top of the interface areas. (gpd.GeoDataFrame)
full_network - the network fitted on the union of interface_cells and roi_cells. (libpysal.weights.W)
roi_network - the network fitted on the roi_cells. (libpysal.weights.W)
interface_network - the network fitted on the interface_cells. (libpysal.weights.W)
border_network - the network fitted on the cells that cross the interface border. (libpysal.weights.W)
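
As a minimal sketch of how this nested layout can be traversed, the snippet below uses a toy dict in which plain lists stand in for the GeoDataFrames. The `toy_context` data and the `rows_per_interface` helper are hypothetical and not part of cellseg_gsontools. Since `len()` works on a list and on a GeoDataFrame alike, the same loop should carry over to a fitted InterfaceContext.context.

```python
# Toy stand-in for InterfaceContext.context: the outer keys index the
# interfaces, the inner keys are the names listed above. Plain lists replace
# the GeoDataFrames here (an assumption made purely for illustration).
toy_context = {
    0: {"roi_cells": ["c1", "c2", "c3"], "interface_cells": ["c4", "c5"]},
    1: {"roi_cells": ["c6"], "interface_cells": ["c7", "c8", "c9"]},
}


def rows_per_interface(context, key):
    """Return {interface index: number of rows stored under `key`}."""
    return {idx: len(inner[key]) for idx, inner in context.items()}


interface_counts = rows_per_interface(toy_context, "interface_cells")
print(interface_counts)
```

A per-interface summary like this is a quick way to check that every extracted interface actually contains cells before running heavier analyses.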

The Data

A quick sneak peek at the data.

In [1]:

from cellseg_gsontools.data import cervix_cells

# a sneak peek at the data that will be used in the notebook
cervix_cells().head()

Out[1]:

	type	geometry	class_name
uid
1	Feature	POLYGON ((-10.988 48446.005, -10.988 48453.996...	inflammatory
2	Feature	POLYGON ((-20.988 48477.996, -19.990 48479.993...	connective
3	Feature	POLYGON ((-14.988 48767.995, -11.993 48770.990...	inflammatory
4	Feature	POLYGON ((-3.988 49537.995, -2.995 49538.988, ...	connective
5	Feature	POLYGON ((-7.988 49562.995, -5.995 49564.988, ...	connective

In [2]:

from cellseg_gsontools.data import cervix_tissue

cervix_tissue().head()

Out[2]:

	type	geometry	class_name
uid
1	Feature	POLYGON ((1852.953 51003.603, 1853.023 51009.1...	areastroma
2	Feature	POLYGON ((4122.334 48001.899, 4122.994 48014.8...	areagland
3	Feature	POLYGON ((3075.002 48189.068, 3075.001 48218.8...	areagland
4	Feature	POLYGON ((51.106 50822.418, 57.151 50834.504, ...	areagland
5	Feature	POLYGON ((3150.958 52999.764, 3147.245 52996.5...	areastroma

Fitting the Context

In [3]:

from cellseg_gsontools.spatial_context import InterfaceContext

icsp = InterfaceContext(
    area_gdf=cervix_tissue(),
    cell_gdf=cervix_cells(),
    top_labels="area_cin",  # tissue classes that are buffered on top of bottom_labels
    bottom_labels="areastroma",  # tissue classes that are being buffered on
    buffer_dist=250.0,  # the distance to buffer top_labels on top of bottom_labels
    min_area_size=100000,  # minimum area size of the top_labels (in pixels**2)
    silence_warnings=True,
    parallel=False,  # Whether to run in parallel
    num_processes=1,  # Number of processes to use
    graph_type="distband",  # Use a distance band graph (distance thresholded KNN-graph)
    dist_thresh=90,  # Distance threshold for the graph
    grid_type="hex",  # Use a hexagonal grid
    resolution=10,  # Resolution of the grid
)
icsp.fit(fit_graph=True, fit_grid=True)

Processing roi area: 3: 100%|██████████| 4/4 [00:00<00:00,  7.04it/s]
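
To make the `graph_type="distband"` and `dist_thresh=90` arguments more concrete, here is a rough pure-Python sketch of the idea behind a distance band graph: two cells are linked when their centroids lie within the threshold distance of each other. The helper `distance_band_neighbors` and the toy centroids are hypothetical, and treating the threshold as inclusive is an assumption. The library may build its graph differently.

```python
import math


def distance_band_neighbors(points, dist_thresh):
    """Map each point id to the ids of all other points within dist_thresh."""
    neighbors = {pid: [] for pid in points}
    ids = list(points)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            # link the pair symmetrically if the centroids are close enough
            if math.dist(points[a], points[b]) <= dist_thresh:
                neighbors[a].append(b)
                neighbors[b].append(a)
    return neighbors


# toy cell centroids (made-up coordinates, in pixels)
toy_centroids = {
    1: (0.0, 0.0),
    2: (60.0, 0.0),
    3: (60.0, 80.0),
    4: (300.0, 300.0),
}
toy_band = distance_band_neighbors(toy_centroids, dist_thresh=90)
print(toy_band)
```

Cell 4 ends up with no neighbors because it is farther than the threshold from every other cell, which is how isolated cells arise in a distance-thresholded graph.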

In [4]:

# Cells of the first interface
icsp.context[0]["interface_cells"].head()

Out[4]:

	type	geometry	class_name	global_id
global_id
12407	Feature	POLYGON ((10569.01182 48660.00490, 10569.01188...	connective	12407
12416	Feature	POLYGON ((10542.01188 48671.00360, 10542.01188...	inflammatory	12416
12421	Feature	POLYGON ((10549.01182 48697.99510, 10553.00585...	inflammatory	12421
12438	Feature	POLYGON ((10764.01188 48750.99640, 10765.00780...	inflammatory	12438
12445	Feature	POLYGON ((10506.01182 48760.00490, 10506.01182...	connective	12445

Plotting the Interfaces

Let's plot the extracted interfaces and cells.

In [5]:

icsp.plot("interface_area", figsize=(12, 6))

Out[5]:

<Axes: >


We can also plot the graphs that are fitted to the cells in different contexts. Here, we plot the graph fitted to the cells whose links cross the interface, i.e. the border_network. With the border_network, we can count the cell-cell connections that cross the border between the two tissues to see whether, for example, there is any immune infiltration from the stroma into the tumor.

In [6]:

icsp.plot(
    "interface_area",
    network_key="border_network",
    figsize=(12, 6),
    edge_kws={"linewidth": 0.5}
)

Out[6]:

<Axes: >
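
The idea of a border network can be illustrated with a small pure-Python sketch: starting from a neighbor mapping (libpysal's `W.neighbors` has this id-to-neighbor-ids shape), keep only the links whose endpoints lie on different sides of the border. The function `cross_border_links`, the toy graph and the side labels are hypothetical and only illustrate the concept, not how the library computes `border_network`.

```python
def cross_border_links(neighbors, side_of):
    """Keep only the links whose two endpoints lie on different sides."""
    links = set()
    for focal, neighs in neighbors.items():
        for neigh in neighs:
            if side_of[focal] != side_of[neigh]:
                # store each undirected link once, smaller id first
                links.add(tuple(sorted((focal, neigh))))
    return sorted(links)


# toy neighbor mapping and a made-up assignment of cells to the two sides
toy_graph = {1: [2, 3], 2: [1], 3: [1, 4], 4: [3]}
toy_sides = {1: "roi", 2: "roi", 3: "interface", 4: "interface"}

border_links = cross_border_links(toy_graph, toy_sides)
print(border_links)
```

Only the link between cells 1 and 3 survives, since it is the only one that connects a cell in the ROI to a cell in the interface.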

In [7]:

icsp.plot("interface_area", grid_key="interface_grid", figsize=(12, 6))

Out[7]:

<Axes: >

Downstream Analysis

Now that we’ve fitted the context, we can use the interfaces in the context class attribute to compute more features. Here, we will showcase some lightweight downstream analyses for the extracted interfaces.

Example 1: Immune Infiltration Density

Here we compute the density of the inflammatory cells at the interface that have links to neoplastic cells in the tumor. To do this, we use the interface_grid of the context and count the inflammatory-neoplastic links within each grid cell.

In [8]:

import geopandas as gpd
import mapclassify
import numpy as np
import pandas as pd

from cellseg_gsontools.grid import grid_classify
from cellseg_gsontools.links import weights2gdf
from cellseg_gsontools.plotting import plot_all


# helper function to replace legend items
def replace_legend_items(legend, mapping):
    for txt in legend.texts:
        for k, v in mapping.items():
            if txt.get_text() == str(k):
                txt.set_text(v)


# Immune-neoplastic link count heuristic used to score each grid cell
def get_infiltration_cnt(gdf: gpd.GeoDataFrame, **kwargs) -> int:
    dd = np.array(["inflammatory-neoplastic", "neoplastic-inflammatory"])

    try:
        class_name = dd[
            np.array([cl in gdf.class_name.unique().tolist() for cl in dd])
        ][0]
        cnt = gdf.class_name.value_counts()[class_name]
    except Exception:
        cnt = 0

    return int(cnt)


# get the border network and convert it to a gdf
w = icsp.context[0]["border_network"]
link_gdf = weights2gdf(icsp.cell_gdf, w)

# Count the inflammatory-neoplastic links within the grid cells with the link count heuristic
iface_grid = grid_classify(
    grid=icsp.context[0]["interface_grid"],
    objs=link_gdf,
    metric_func=get_infiltration_cnt,
    predicate="intersects",
    new_col_names="infiltrate_cnt",
    parallel=False,
)

# bin the grid cells into two classes ("has infiltration" and "no infiltration")
col = "infiltrate_cnt"
bins = mapclassify.Quantiles(iface_grid[col], k=2)
iface_grid["infiltrate_density_level"] = bins.yb

immune_density_plot = plot_all(
    icsp.context[0]["interface_area"],
    pd.concat([icsp.context[0]["roi_cells"], icsp.context[0]["interface_cells"]]),
    grid_gdf=iface_grid.copy(),
    network_gdf=link_gdf.copy(),
    figsize=(10, 10),
    grid_col="infiltrate_density_level",
    grid_cmap="jet",
    grid_n_bins=bins.k,
    show_legends=True,
    edge_kws={"linewidth": 0.5},
)
mapping = dict(enumerate(bins.get_legend_classes()))
replace_legend_items(immune_density_plot.get_legend(), mapping)
immune_density_plot

Out[8]:

<Axes: >
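
Binning with two quantile classes, as done with `mapclassify.Quantiles(..., k=2)` in the cell above, amounts to splitting the grid cells at the median link count. The sketch below reproduces that idea in plain Python. The helper `median_split` and the toy counts are hypothetical, and the upper-inclusive rule for the lower class is an assumption, so mapclassify's handling of ties and interpolation may differ in detail.

```python
import statistics


def median_split(values):
    """Label each value 0 if it is <= the median, else 1."""
    cutoff = statistics.median(values)
    return [0 if v <= cutoff else 1 for v in values]


# toy per-grid-cell link counts (made-up numbers)
toy_link_counts = [0, 0, 0, 1, 2, 5]
levels = median_split(toy_link_counts)
print(levels)
```

Under this sketch, when more than half of the grid cells contain no links the median is zero, and the two classes then coincide with "no infiltration" and "has infiltration".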

Example 2: Cell-Cell Interactions

We can also directly count the cell-cell interactions between the cells crossing the tissue border. We can do this by using the border_network key of the context.

In [9]:

from cellseg_gsontools.links import weights2gdf

w = icsp.context[0]["border_network"]
w_gdf = weights2gdf(icsp.cell_gdf, w)
w_gdf.head()

Out[9]:

	index	focal	neighbor	weight	focal_centroid	neighbor_centroid	focal_class_name	neighbor_class_name	class_name	geometry
0	0	9315	10482	1.0	POINT (7999.222138714856 49789.40336456389)	POINT (8033.742411090165 49850.15890484998)	glandular_epithel	neoplastic	glandular_epithel-neoplastic	LINESTRING (7999.222 49789.403, 8033.742 49850...
1	1	9356	9395	1.0	POINT (7937.002202793094 50113.275934318786)	POINT (7896.607546050848 50178.72364538605)	neoplastic	connective	connective-neoplastic	LINESTRING (7937.002 50113.276, 7896.608 50178...
2	2	9372	9395	1.0	POINT (7958.532206571154 50144.25476695621)	POINT (7896.607546050848 50178.72364538605)	neoplastic	connective	connective-neoplastic	LINESTRING (7958.532 50144.255, 7896.608 50178...
3	3	9372	9408	1.0	POINT (7958.532206571154 50144.25476695621)	POINT (7914.7260717517165 50204.87002012703)	neoplastic	connective	connective-neoplastic	LINESTRING (7958.532 50144.255, 7914.726 50204...
4	6	9395	9404	1.0	POINT (7896.607546050848 50178.72364538605)	POINT (7954.544809238641 50192.02208046518)	connective	neoplastic	connective-neoplastic	LINESTRING (7896.608 50178.724, 7954.545 50192...

In [10]:

# cell-cell link counts
w_gdf.value_counts("class_name")

Out[10]:

class_name
connective-neoplastic             565
inflammatory-neoplastic           147
neoplastic-neoplastic              74
connective-inflammatory            60
inflammatory-inflammatory          31
connective-connective              25
connective-glandular_epithel        4
glandular_epithel-inflammatory      3
glandular_epithel-neoplastic        2
Name: count, dtype: int64
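
Judging from Out[9], the class_name of a link looks order-independent: a neoplastic-to-connective link and a connective-to-neoplastic link both appear as connective-neoplastic. The sketch below mimics that with alphabetical sorting and then tallies the link types. The alphabetical rule is inferred from the output rather than confirmed from the library, and `link_label` and the toy links are hypothetical.

```python
from collections import Counter


def link_label(focal_class, neighbor_class):
    """Build an order-independent label by sorting the two class names."""
    return "-".join(sorted((focal_class, neighbor_class)))


# toy (focal class, neighbor class) pairs, made up for illustration
toy_links = [
    ("neoplastic", "connective"),
    ("connective", "neoplastic"),
    ("inflammatory", "neoplastic"),
    ("neoplastic", "neoplastic"),
]

link_counts = Counter(link_label(f, n) for f, n in toy_links)
print(link_counts.most_common())
```

Using one canonical label per pair avoids splitting the same interaction type across two rows of the count table.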

Example 3: Computing Neighborhood Statistics

Next, we will compute some neighborhood statistics for the cells within the ROI and the interface. We will compute the Simpson index over the different cell types in each cell neighborhood. The higher the Simpson index of a cell neighborhood, the more diverse the neighborhood is in terms of cell types.

In [11]:

from cellseg_gsontools.diversity import local_diversity

w = icsp.context[0]["full_network"]
cells = pd.concat([icsp.context[0]["roi_cells"], icsp.context[0]["interface_cells"]])
cells = cells[
    [
        "geometry",
        "global_id",
        "class_name",
    ]
]
# compute the heterogeneity of the neighborhood areas
cells = local_diversity(
    cells,
    spatial_weights=w,
    val_col="class_name",
    id_col="global_id",
    metrics=["simpson_index"],
    rm_nhood_cols=False,
)

cols = [
    "geometry",
    "class_name",
    "nhood",
    "class_name_nhood_counts",
    "class_name_simpson_index",
]
cells[cols].head()

Out[11]:

	geometry	class_name	nhood	class_name_nhood_counts	class_name_simpson_index
global_id
12525	POLYGON ((10992.01188 48227.00360, 10992.01191...	neoplastic	[12525]	[1]	0.000000
12526	POLYGON ((10967.01188 48340.00360, 10967.01188...	neoplastic	[12526, 12352, 12353, 12354, 12527]	[4, 1]	0.320000
12352	POLYGON ((10929.01168 48343.00780, 10929.01182...	neoplastic	[12352, 12526, 12353, 12354, 12355, 12527, 12358]	[6, 1]	0.244898
12353	POLYGON ((10904.01182 48357.00490, 10904.01191...	inflammatory	[12353, 12526, 12352, 12354, 12355, 12527, 123...	[7, 1, 1]	0.370370
12354	POLYGON ((10925.01168 48361.00780, 10925.01182...	neoplastic	[12354, 12526, 12352, 12353, 12355, 12527, 123...	[7, 1]	0.218750
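
The values in the class_name_simpson_index column above can be checked by hand from the class_name_nhood_counts column. They are consistent with the form 1 - sum(p_i^2), where p_i is the share of each cell type in the neighborhood. That this is the exact formula used by the library is an assumption inferred from the table, and `simpson_index` below is a hypothetical helper rather than the library function.

```python
def simpson_index(counts):
    """Return 1 - sum(p_i ** 2) for a list of per-class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# the neighborhood counts shown in the output table above
for counts in ([1], [4, 1], [6, 1], [7, 1, 1], [7, 1]):
    print(counts, round(simpson_index(counts), 6))
```

A neighborhood with a single cell type scores 0, and the score grows as the counts spread more evenly over more cell types.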

In [12]:

# !pip install legendgram

In [13]:

import matplotlib.pyplot as plt
import palettable as palet
from legendgram import legendgram


# Helper function to plot cells with a feature value highlighted
def plot_cells(cells: gpd.GeoDataFrame, col: str):
    # bin the values with the Fisher-Jenks method
    bins = mapclassify.FisherJenks(cells[col], k=5)
    cells["bin_vals"] = bins.yb

    # create the figure and plot the cells colored by their bin
    f, ax = plt.subplots(figsize=(10, 10))

    ax = cells.plot(
        ax=ax,
        column="bin_vals",
        cmap="viridis",
        categorical=True,
        legend=True,
        legend_kwds={
            "fontsize": 8,
            "loc": "center left",
            "bbox_to_anchor": (1.0, 0.94),
        },
    )

    bin_legends = bins.get_legend_classes()
    mapping = dict(enumerate(bin_legends))
    replace_legend_items(ax.get_legend(), mapping)

    ax = legendgram(
        f,
        ax,
        cells[col],
        bins=100,
        breaks=bins.bins,
        pal=palet.matplotlib.Viridis_5,
        loc="lower left",
    )

    return ax


# Let's plot the cells with the Simpson index highlighted
plot_cells(cells, col="class_name_simpson_index")

Out[13]:

<Axes: >

We can clearly see that the most diverse neighborhoods are located at the interface border. This is expected, since the interface border is where the two tissues meet: on one side there are tumor cells and on the other side stromal and immune cells, so the cells at the border are likely to have more heterogeneous neighbors. We also see that the tumor cell neighborhoods are more homogeneous than the cell neighborhoods in the stroma. This is also expected, since tumors are normally a homogeneous mass of tumor cells with some tumor-infiltrating lymphocytes here and there, whereas the stroma contains several different cell types in a heterogeneous mix.
