simpson_index

cellseg_gsontools.diversity.simpson_index(counts)

Compute the Simpson diversity index on a count vector.

Note

The Simpson diversity index is a quantitative measure that reflects how many different types (such as species) there are in a dataset (a community). It is a probability measure: the lower the index, the greater the probability that two randomly selected individuals belong to the same species. - A. Wilson, N. Gownaris

Simpson index: $$ D = 1 - \sum_{i=1}^n \left(\frac{n_i}{N}\right)^2 $$

where \(n_i\) is the count of species \(i\), \(n\) is the number of species, and \(N\) is the total number of individuals across all species.
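To make the formula concrete, here is a minimal pure-Python sketch that evaluates it on three small count vectors. The helper name `simpson_sketch` is hypothetical and is not part of `cellseg_gsontools`; it implements only the formula above, with \(N\) taken as the total number of individuals.

```python
from typing import Sequence


def simpson_sketch(counts: Sequence[float]) -> float:
    """Hypothetical helper: 1 minus the sum of squared proportions."""
    total = sum(counts)
    if total == 0:
        # The formula is undefined when there are no individuals at all.
        raise ValueError("counts must contain at least one individual")
    proportions = [c / total for c in counts]
    return 1.0 - sum(p * p for p in proportions)


even = simpson_sketch([10, 10, 10, 10])   # four equally common types: 0.75
skewed = simpson_sketch([37, 1, 1, 1])    # one dominant type: about 0.1425
single = simpson_sketch([40, 0, 0, 0])    # only one type present: 0.0
```

The even community scores highest and the single-type community scores zero, which matches the reading that a low index means two randomly drawn individuals are likely to be of the same type.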

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| counts | Sequence | A count vector/list of shape (C, ). | required |

Returns:

| Type | Description |
| --- | --- |
| float | The computed Simpson diversity index. |

Source code in cellseg_gsontools/diversity.py
def simpson_index(counts: Sequence) -> float:
    """Compute the Simpson diversity index on a count vector.

    Note:
        The Simpson diversity index is a quantitative measure that reflects how
        many different types (such as species) there are in a dataset (a
        community). It is a probability measure: the lower the index, the greater
        the probability that two randomly selected individuals belong to the same
        species.
        - [A. Wilson, N. Gownaris](https://bio.libretexts.org/Courses/Gettysburg_College/01%3A_Ecology_for_All/22%3A_Biodiversity/22.02%3A_Diversity_Indices)


    **Simpson index:**
    $$
    D = 1 - \\sum_{i=1}^n \\left(\\frac{n_i}{N}\\right)^2
    $$

    where $n_i$ is the count of species $i$, $n$ is the number of species, and
    $N$ is the total number of individuals across all species.

    Parameters:
        counts (Sequence):
            A count vector/list of shape (C, ).

    Returns:
        float:
            The computed Simpson diversity index.
    """
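    # SMALL is a name defined outside this function; it is added to the total
    # before the total is used as the divisor.
    N = np.sum(counts) + SMALL
    # One minus the sum of squared proportions, skipping empty categories.
    return 1 - np.sum([(n / N) ** 2 for n in counts if n != 0])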
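The listing has two edge cases worth knowing about, sketched below in pure Python. This is a hypothetical mirror of the arithmetic, not the library function itself: `simpson_like` is an invented name, and the value `1e-8` for `SMALL` is an assumption, since the listing does not show the real constant.

```python
from typing import Sequence

# Assumption: stand-in value; the constant's real value is not shown in the listing.
SMALL = 1e-8


def simpson_like(counts: Sequence[float]) -> float:
    """Hypothetical pure-Python mirror of the listing's arithmetic."""
    total = sum(counts) + SMALL
    # Zero counts are skipped, exactly as in the list comprehension above.
    return 1 - sum((n / total) ** 2 for n in counts if n != 0)


empty = simpson_like([0, 0, 0])     # nothing is summed, so the result is 1
single = simpson_like([25, 0, 0])   # slightly above 0 because SMALL inflates the total
```

Under this logic an all-zero count vector yields 1, the same value that signals maximal diversity, because every term is filtered out before the sum. Callers who may pass empty count vectors might want to check for that case themselves.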