Quick start¶
Data¶
Get some data to play with.
Note that the cell type labels are in the `class_name` column. Whenever you use `cellseg_gsontools`, your data needs a column named `class_name` containing the class labels.
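If your own annotations store the labels under a different name, they need to be moved to `class_name` first. Below is a minimal sketch for a GeoJSON-style feature collection held as a plain Python dict. The property name `cell_type` and the helper `ensure_class_name` are hypothetical and only serve as illustration; they are not part of `cellseg_gsontools`.

```python
import json


def ensure_class_name(feature_collection, label_key):
    """Return a copy in which every feature stores its label under `class_name`.

    `label_key` is the (hypothetical) property name that currently holds the label.
    """
    # Deep copy via a JSON round trip so the input stays untouched
    out = json.loads(json.dumps(feature_collection))
    for feature in out["features"]:
        props = feature.setdefault("properties", {})
        if "class_name" not in props:
            props["class_name"] = props.pop(label_key)
    return out


# Toy data: two features labelled under the hypothetical key `cell_type`
raw = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
            "properties": {"cell_type": "inflammatory"},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [1.0, 1.0]},
            "properties": {"cell_type": "connective"},
        },
    ],
}

fixed = ensure_class_name(raw, label_key="cell_type")
labels = [f["properties"]["class_name"] for f in fixed["features"]]
print(labels)
```

If the data is already loaded into a GeoDataFrame, `gdf.rename(columns={"cell_type": "class_name"})` achieves the same thing (again with `cell_type` as a made-up column name).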
In [1]:
from cellseg_gsontools.data import cervix_cells, cervix_tissue
cells = cervix_cells()
tissue = cervix_tissue()
cells.head(4)
Out[1]:
uid | type | geometry | class_name |
---|---|---|---|
1 | Feature | POLYGON ((-10.988 48446.005, -10.988 48453.996... | inflammatory |
2 | Feature | POLYGON ((-20.988 48477.996, -19.990 48479.993... | connective |
3 | Feature | POLYGON ((-14.988 48767.995, -11.993 48770.990... | inflammatory |
4 | Feature | POLYGON ((-3.988 49537.995, -2.995 49538.988, ... | connective |
In [2]:
cells.plot(column="class_name", figsize=(10, 5))
Out[2]:
<Axes: >
In [3]:
tissue.plot(column="class_name", figsize=(10, 5))
Out[3]:
<Axes: >
Subset Cells Within a Specific Tissue¶
There are two ways to subset cells within a specific tissue: `sjoin` or `sindex`. The `sjoin` operation can be very slow when the dataframes are large.
In [4]:
# Get only the stromal tissue from the `tissue` gdf
stroma = tissue.loc[tissue["class_name"] == "areastroma"]
# sjoin: keep the cells that intersect the stroma. Column names shared by both
# frames get `_left`/`_right` suffixes, so the cell labels end up in `class_name_left`
cells_in_stroma = cells.sjoin(stroma, how="inner", predicate="intersects")
cells_in_stroma.plot(column="class_name_left", figsize=(10, 5))
Out[4]:
<Axes: >
In [5]:
# sindex: query the spatial index of `cells` with the stroma geometries. Very fast!
# Returns the positional indices of the matching (stroma, cell) pairs
tissue_inds, cell_inds = cells.sindex.query(stroma.geometry, predicate="intersects")
cells_in_stroma = cells.iloc[cell_inds]
cells_in_stroma.plot(column="class_name", figsize=(10, 5))
Out[5]:
<Axes: >
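To give an intuition for why an index-based query can be fast, here is a plain-Python sketch of the general idea: reject most candidates with a cheap bounding-box test and run the exact geometric test only on the survivors. This is an illustration under simplifying assumptions (cells reduced to centroid points, one tissue polygon, no tree structure), not the actual geopandas implementation, which uses a tree-based spatial index. All function names here are made up for the example.

```python
def bbox(polygon):
    """Axis-aligned bounding box (minx, miny, maxx, maxy) of a vertex list."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(point, polygon):
    """Exact test with ray casting: count edge crossings to the right of the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does the edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def query_points(points, polygon):
    """Positional indices of the points that fall inside the polygon."""
    minx, miny, maxx, maxy = bbox(polygon)
    hits = []
    for i, (x, y) in enumerate(points):
        # Cheap rejection first: skip anything outside the bounding box
        if not (minx <= x <= maxx and miny <= y <= maxy):
            continue
        # Expensive exact test only for the remaining candidates
        if point_in_polygon((x, y), polygon):
            hits.append(i)
    return hits


# Toy tissue region (a triangle) and four toy cell centroids
region = [(0, 0), (10, 0), (0, 10)]
centroids = [(2, 2), (8, 8), (15, 5), (1, 7)]
hits = query_points(centroids, region)
print(hits)  # → [0, 3]
```

The point `(8, 8)` passes the bounding-box test but fails the exact test, while `(15, 5)` is rejected without ever running the exact test.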
Compute Morphological Features¶
In [6]:
from cellseg_gsontools.geometry import shape_metric
cells_in_stroma = shape_metric(
    cells_in_stroma,
    metrics=["area", "sphericity", "eccentricity"],
    parallel=True
)
cells_in_stroma.head(4)
Out[6]:
uid | type | geometry | class_name | area | sphericity | eccentricity |
---|---|---|---|---|---|---|
3181 | Feature | POLYGON ((2050.012 50909.995, 2054.007 50913.9... | inflammatory | 366.583000 | 0.666200 | 0.564428 |
2021 | Feature | POLYGON ((1905.012 51053.996, 1906.826 51056.8... | glandular_epithel | 924.939588 | 0.530253 | 0.817158 |
2030 | Feature | POLYGON ((1928.012 51090.995, 1932.005 51094.9... | glandular_epithel | 281.475643 | 0.626814 | 0.527087 |
2036 | Feature | POLYGON ((1927.012 51147.003, 1928.012 51151.9... | glandular_epithel | 460.664656 | 0.617916 | 0.661679 |
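To make concrete what a per-polygon shape metric involves, here is a small plain-Python sketch that computes the area (shoelace formula) and the perimeter of a polygon from its vertices, and combines them into an isoperimetric roundness ratio. The `circularity` measure is an assumption chosen for illustration; it is not claimed to match the library's definitions of `sphericity` or `eccentricity`.

```python
import math


def polygon_area(coords):
    """Area of a simple polygon via the shoelace formula."""
    n = len(coords)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def polygon_perimeter(coords):
    """Sum of the edge lengths, closing the ring back to the first vertex."""
    n = len(coords)
    return sum(math.dist(coords[i], coords[(i + 1) % n]) for i in range(n))


def circularity(coords):
    """Illustrative roundness ratio 4*pi*A / P**2: 1.0 for a circle, lower otherwise."""
    return 4.0 * math.pi * polygon_area(coords) / polygon_perimeter(coords) ** 2


# Toy polygon: a 4 x 4 square
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(polygon_area(square))       # → 16.0
print(polygon_perimeter(square))  # → 16.0
print(circularity(square))        # pi / 4, roughly 0.785
```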
In [10]:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(3, 1, figsize=(10, 15))
ax = ax.flatten()
ax[0] = cells_in_stroma.plot(
    ax=ax[0],
    column="area",
    cmap="viridis",
    legend=True,
)
ax[0].set_title("Area")
ax[1] = cells_in_stroma.plot(
    ax=ax[1],
    column="sphericity",
    cmap="viridis",
    legend=True,
)
ax[1].set_title("Sphericity")
ax[2] = cells_in_stroma.plot(
    ax=ax[2],
    column="eccentricity",
    cmap="viridis",
    legend=True,
)
ax[2].set_title("Eccentricity")
Out[10]:
Text(0.5, 1.0, 'Eccentricity')
Compute Spatial Weights¶
Spatial weights are connectivity graphs between cells.
In [12]:
from cellseg_gsontools.graphs import fit_graph
from cellseg_gsontools.links import weights2gdf
stromal_connectivity = fit_graph(
    cells_in_stroma,
    type="distband",
    thresh=100,
)
weights = weights2gdf(cells_in_stroma, stromal_connectivity)
weights.head(4)
Out[12]:
| | index | focal | neighbor | weight | focal_centroid | neighbor_centroid | focal_class_name | neighbor_class_name | class_name | geometry |
---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 2021 | 2030 | 1.0 | POINT (1927.025763595986 51043.32882097136) | POINT (1938.091708479527 51086.61324699228) | glandular_epithel | glandular_epithel | glandular_epithel-glandular_epithel | LINESTRING (1927.026 51043.329, 1938.092 51086... |
1 | 2 | 2030 | 2036 | 1.0 | POINT (1938.091708479527 51086.61324699228) | POINT (1941.0429358989277 51145.53971058136) | glandular_epithel | glandular_epithel | glandular_epithel-glandular_epithel | LINESTRING (1938.092 51086.613, 1941.043 51145... |
2 | 4 | 6600 | 6614 | 1.0 | POINT (6981.369919939368 47996.21331351611) | POINT (6955.696549069773 48039.52205471928) | connective | connective | connective-connective | LINESTRING (6981.370 47996.213, 6955.697 48039... |
3 | 5 | 6600 | 6617 | 1.0 | POINT (6981.369919939368 47996.21331351611) | POINT (6907.960024465983 48045.98316704895) | connective | connective | connective-connective | LINESTRING (6981.370 47996.213, 6907.960 48045... |
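As a rough illustration of what a distance-band graph is, the following plain-Python sketch links every pair of centroids whose Euclidean distance is at most a threshold. It is a brute-force sketch under stated assumptions (inclusive threshold, each undirected link listed once); the library's own implementation may differ. The centroids are rounded values taken from the table above, and `distband_links` is a name made up for this example.

```python
import math
from itertools import combinations


def distband_links(centroids, thresh):
    """Link every pair of points that lie within `thresh` of each other.

    `centroids` maps a cell id to an (x, y) tuple. Each undirected link
    is returned once as (smaller_id, larger_id).
    """
    links = []
    for (id_a, pt_a), (id_b, pt_b) in combinations(sorted(centroids.items()), 2):
        if math.dist(pt_a, pt_b) <= thresh:
            links.append((id_a, id_b))
    return links


# Rounded centroids of three cells from the table above
centroids = {
    2021: (1927.03, 51043.33),
    2030: (1938.09, 51086.61),
    2036: (1941.04, 51145.54),
}

links = distband_links(centroids, thresh=100)
print(links)  # → [(2021, 2030), (2030, 2036)]
```

Cells 2021 and 2036 are a little over 100 units apart, so they are not linked directly, which agrees with the rows shown in the table.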
In [16]:
weights.value_counts("class_name", normalize=True)
Out[16]:
class_name
connective-inflammatory                0.381386
connective-connective                  0.331238
inflammatory-inflammatory              0.249722
glandular_epithel-glandular_epithel    0.019352
connective-glandular_epithel           0.011105
glandular_epithel-inflammatory         0.005143
connective-neoplastic                  0.001220
inflammatory-neoplastic                0.000618
neoplastic-neoplastic                  0.000108
connective-dead                        0.000015
connective-squamous_epithel            0.000015
dead-inflammatory                      0.000015
dead-neoplastic                        0.000015
glandular_epithel-neoplastic           0.000015
inflammatory-squamous_epithel          0.000015
neoplastic-squamous_epithel            0.000015
Name: proportion, dtype: float64
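The proportions above can be read as "what fraction of all links connects these two cell types". A minimal plain-Python sketch of that computation follows. It assumes that the link label is built from the two cell labels in alphabetical order, which is what the output suggests (only `connective-inflammatory` appears, never `inflammatory-connective`), but this is an inference and not something confirmed by the library. The toy labels and links are invented.

```python
from collections import Counter


def link_class(label_a, label_b):
    """Order-independent link label, e.g. 'connective-inflammatory'."""
    return "-".join(sorted((label_a, label_b)))


def link_class_proportions(links, labels):
    """Share of each link class among all links, most common first."""
    counts = Counter(link_class(labels[a], labels[b]) for a, b in links)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.most_common()}


# Toy graph: four cells and four links between them
labels = {1: "inflammatory", 2: "connective", 3: "connective", 4: "inflammatory"}
links = [(1, 2), (2, 3), (3, 4), (1, 4)]

props = link_class_proportions(links, labels)
for name, share in props.items():
    print(f"{name:30s} {share:.2f}")
```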
Compute Spatial Indices (Grids)¶
In [28]:
from cellseg_gsontools.grid import hexgrid_overlay
grid = hexgrid_overlay(stroma)
ax = tissue.plot(column="class_name", figsize=(10, 5))
grid.boundary.plot(ax=ax, color="white")
Out[28]:
<Axes: >
Compute Grid Features¶
In [31]:
from cellseg_gsontools.grid import grid_classify
# Heuristic: the proportion of immune (inflammatory) cells among the cells in a grid cell
def get_immune_cell_prop(gdf, **kwargs) -> float:
    try:
        prop = gdf.class_name.value_counts(normalize=True)["inflammatory"]
    except KeyError:
        # No inflammatory cells in this grid cell
        prop = 0.0
    return prop
grid = grid_classify(
    grid=grid,
    objs=cells_in_stroma,
    metric_func=get_immune_cell_prop,
    predicate="intersects",
    new_col_names="immune_percentage"
)
grid.head(4)
Out[31]:
| | geometry | immune_percentage |
---|---|---|
8982f6d61d7ffff | POLYGON ((9170.47315 50052.75362, 9195.41909 5... | 0.375000 |
8982f6d6117ffff | POLYGON ((8371.83439 50391.14515, 8396.78532 5... | 0.740741 |
8982f699343ffff | POLYGON ((5547.10778 49343.98203, 5572.07323 4... | 0.389610 |
8982f6d608fffff | POLYGON ((10842.80505 49967.64578, 10867.74140... | 0.200000 |
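Conceptually, a grid feature is computed by collecting the cells that belong to each grid cell and applying a metric function to that group. The plain-Python sketch below shows this pattern on a square grid with cells reduced to centroid points. Both simplifications are assumptions made to keep the example short: the tutorial itself uses a hexagonal grid and an `intersects` predicate on the full cell polygons. The names `immune_proportion` and `classify_square_grid` are hypothetical.

```python
from collections import defaultdict


def immune_proportion(class_names):
    """Fraction of 'inflammatory' labels in a group; 0.0 for an empty group."""
    if not class_names:
        return 0.0
    return sum(name == "inflammatory" for name in class_names) / len(class_names)


def classify_square_grid(points, cell_size, metric_func):
    """Group labelled points into square grid cells and score each cell.

    `points` is a list of (x, y, class_name) tuples. Returns a dict that maps
    a (column, row) grid key to the value of `metric_func` for that cell.
    """
    members = defaultdict(list)
    for x, y, class_name in points:
        key = (int(x // cell_size), int(y // cell_size))
        members[key].append(class_name)
    return {key: metric_func(names) for key, names in members.items()}


# Toy cell centroids with labels
toy_cells = [
    (10, 10, "inflammatory"),
    (20, 30, "connective"),
    (40, 45, "inflammatory"),
    (120, 10, "connective"),
]

scores = classify_square_grid(toy_cells, cell_size=100, metric_func=immune_proportion)
print(scores)
```

Grid cell `(0, 0)` holds two inflammatory cells out of three, and grid cell `(1, 0)` holds none. As in the tutorial's `immune_percentage` column, the values are proportions between 0 and 1.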
In [40]:
ax = tissue.plot(column="class_name", figsize=(10, 5), cmap="tab20_r")
ax = cells.plot(ax=ax, column="class_name", figsize=(10, 5), cmap="tab20")
grid.geometry = grid.boundary
grid.plot(
    ax=ax,
    column="immune_percentage",
    figsize=(10, 5),
    alpha=0.5,
    cmap="turbo"
)
Out[40]:
<Axes: >