cell_annotator.SampleAnnotator#

class cell_annotator.SampleAnnotator(adata, sample_name, species, tissue, stage='adult', cluster_key='leiden', model=None, max_completion_tokens=None, provider=None, api_key=None, _skip_validation=False)#

Handles cell type annotation for a single sample/batch.

Computes marker genes, queries an LLM for cell type predictions, and manages annotation results for an individual sample. It is typically used as part of a multi-sample workflow orchestrated by CellAnnotator.

Parameters:

adata (AnnData)
sample_name (str)
species (str)
tissue (str)
stage (str)
cluster_key (str)
model (str | None)
max_completion_tokens (int | None)
provider (str | None)
api_key (str | None)
_skip_validation (bool)
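
A minimal construction sketch, using only keyword arguments from the signature above. The helper name `build_sample_annotator` and the species and tissue values are illustrative assumptions, and `adata` is assumed to be an AnnData object whose cluster labels are stored under the `cluster_key` column.

```python
def build_sample_annotator(adata, sample_name):
    """Create a SampleAnnotator for one sample (illustrative helper, not part of the library)."""
    # Imported lazily so this snippet can be loaded without cell_annotator installed.
    from cell_annotator import SampleAnnotator

    return SampleAnnotator(
        adata=adata,
        sample_name=sample_name,
        species="human",       # assumed example value
        tissue="heart",        # assumed example value
        stage="adult",         # documented default
        cluster_key="leiden",  # documented default
    )
```

Leaving `model`, `provider`, and `api_key` at `None` relies on whatever defaults the package resolves; pass them explicitly if a specific backend is required.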

Attributes table#

api_keys

Access to API key manager.

Methods table#

`annotate_clusters`(min_markers, ...[, ...])	Annotate clusters based on marker genes.
`check_api_access`([provider, model])	Check API access and log warnings if needed.
`get_cluster_markers`([method, ...])	Get marker genes per cluster.
`harmonize_annotations`(global_cell_type_list[, ...])	Map local cell type names to global cell type names.
`list_available_models`()	List available models for the current provider.
`query_llm`(instruction, response_format[, ...])	Query the LLM with a given instruction.
`test_query`([return_details])	Test if the LLM setup is working correctly.

Attributes#

SampleAnnotator.api_keys#: Access to API key manager.

Methods#

SampleAnnotator.annotate_clusters(min_markers, expected_marker_genes, restrict_to_expected=False)#

Annotate clusters based on marker genes.

Parameters:

min_markers (int) – Minimum number of required marker genes per cluster.
expected_marker_genes (dict[str, list[str]] | None) – Expected marker genes per cell type.
restrict_to_expected (bool (default: False)) – If True, only use expected cell types for annotation.

Return type:

None

Returns:

Updates the following attributes:

- self.annotation_dict
- self.annotation_df
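
As a rough sketch of how this method might be called: the helper `annotate_sample`, the `min_markers=2` value, and the example gene lists are assumptions for illustration. It also assumes `annotator` is an existing SampleAnnotator on which `get_cluster_markers` has already been run.

```python
# Expected markers keyed by cell type (example values, not a curated reference).
EXPECTED_MARKERS = {
    "T cell": ["CD3D", "CD3E"],
    "B cell": ["CD79A", "MS4A1"],
}


def annotate_sample(annotator, expected_marker_genes=None, min_markers=2):
    """Annotate clusters and return the resulting table (illustrative helper)."""
    annotator.annotate_clusters(
        min_markers=min_markers,
        expected_marker_genes=expected_marker_genes,
        restrict_to_expected=False,
    )
    # annotate_clusters returns None; results are stored on the instance.
    return annotator.annotation_df
```

Because the method returns None, the results have to be read back from the instance attributes listed above.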

SampleAnnotator.check_api_access(provider=None, model=None)#

Check API access and log warnings if needed.

Return type:

bool

Parameters:

provider (str | None)
model (str | None)
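
One way to use the boolean return value is as a guard before any expensive work. The helper `require_api_access` and its error message are hypothetical; `annotator` is assumed to be an existing SampleAnnotator.

```python
def require_api_access(annotator, provider=None, model=None):
    """Raise early if the access check fails (illustrative helper)."""
    ok = annotator.check_api_access(provider=provider, model=model)
    if not ok:
        # The method itself only logs warnings, so fail loudly here.
        raise RuntimeError("API access check failed; see the logged warnings.")
    return ok
```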

SampleAnnotator.get_cluster_markers(method='wilcoxon', min_cells_per_cluster=3, min_specificity=0.75, min_auc=0.7, max_markers=7, use_raw=False, use_rapids=False)#

Get marker genes per cluster.

Parameters:

method (_Method | None (default: 'wilcoxon')) – Method for marker gene computation. See scanpy.tl.rank_genes_groups for details.
min_cells_per_cluster (int (default: 3)) – Include only clusters with at least this many cells.
min_specificity (float (default: 0.75)) – Minimum specificity threshold for marker genes.
min_auc (float (default: 0.7)) – Minimum AUC threshold for marker genes.
max_markers (int (default: 7)) – Maximum number of marker genes per cluster.
use_raw (bool (default: False)) – Whether to use raw data for calculations.
use_rapids (bool (default: False)) – Whether to use RAPIDS for GPU acceleration.

Return type:

None

Returns:

None

Updates the following attributes:

- self.marker_dfs
- self.marker_genes
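
The sketch below shows one way to pass the thresholds explicitly. The helper `compute_markers` and the "strict" values are assumptions chosen for illustration, not recommendations; `annotator` is assumed to be an existing SampleAnnotator.

```python
def compute_markers(annotator, strict=False):
    """Run marker detection with the documented defaults or a stricter preset."""
    if strict:
        # Illustrative stricter thresholds (assumed values).
        thresholds = {"min_specificity": 0.9, "min_auc": 0.8, "max_markers": 5}
    else:
        # Documented defaults, spelled out for clarity.
        thresholds = {"min_specificity": 0.75, "min_auc": 0.7, "max_markers": 7}
    annotator.get_cluster_markers(
        method="wilcoxon",
        min_cells_per_cluster=3,
        **thresholds,
    )
    # The method returns None; the results live on the instance.
    return annotator.marker_genes
```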

SampleAnnotator.harmonize_annotations(global_cell_type_list, unknown_key='Unknown')#

Map local cell type names to global cell type names.

Parameters:

global_cell_type_list (list[str]) – List of global cell types.
unknown_key (str (default: 'Unknown')) – Key for the unknown category.

Return type:

None

Returns:

Updates the following fields:

- self.local_cell_type_mapping
- self.annotation_df["cell_type_harmonized"]
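
A short sketch of calling the method and reading back the two fields it updates. The helper `harmonize_sample` is hypothetical, and `annotator` is assumed to be a SampleAnnotator whose clusters have already been annotated.

```python
def harmonize_sample(annotator, global_cell_types, unknown_key="Unknown"):
    """Harmonize local labels and return (mapping, harmonized column)."""
    annotator.harmonize_annotations(
        global_cell_type_list=global_cell_types,
        unknown_key=unknown_key,
    )
    # Both results are stored on the instance; the method returns None.
    mapping = annotator.local_cell_type_mapping
    harmonized = annotator.annotation_df["cell_type_harmonized"]
    return mapping, harmonized
```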

SampleAnnotator.list_available_models()#

List available models for the current provider.

Return type:

list[str]

Returns:

List of available model names.
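
The returned list can be used to choose a model before running any queries. In this sketch the helper `pick_model` and the idea of a preference list are assumptions; `annotator` is assumed to be an existing SampleAnnotator.

```python
def pick_model(annotator, preferred):
    """Return the first preferred model the provider offers, or None."""
    available = annotator.list_available_models()
    for name in preferred:
        if name in available:
            return name
    return None
```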

SampleAnnotator.query_llm(instruction, response_format, agent_description=None, other_messages=None)#

Query the LLM with a given instruction.

Parameters:

instruction (str) – Instruction to provide to the model.
response_format (type[BaseOutput]) – Response format class.
agent_description (str | None (default: None)) – Optional system prompt override. If None, uses the default cell-annotation prompt from self.prompts.
other_messages (list | None (default: None)) – Additional messages to provide to the model.

Return type:

BaseOutput

Returns:

Parsed response.

SampleAnnotator.test_query(return_details=False)#

Test if the LLM setup is working correctly.

Performs a simple structured-output query against the configured model. For OpenRouter slugs whose upstream model does not implement OpenAI’s .parse() endpoint, the provider’s fallback chain (extra_body json_schema → plain json_object → optional text-repair) carries the request, so the same code path works for every provider.

Parameters:

return_details (bool (default: False)) – If True, returns (success, message) tuple with detailed information. If False, returns only boolean success status.

Return type:

bool | tuple[bool, str]

Returns:

If return_details=False: True if the test query succeeds, False otherwise. If return_details=True: Tuple of (success, message) with detailed status.
