cell_annotator.SampleAnnotator
- class cell_annotator.SampleAnnotator(adata, sample_name, species, tissue, stage='adult', cluster_key='leiden', model=None, max_completion_tokens=None, provider=None, api_key=None, _skip_validation=False)
Handles cell type annotation for a single sample/batch.
Computes marker genes, queries an LLM for cell type predictions, and manages the annotation results for an individual sample. Typically used as part of a multi-sample workflow orchestrated by CellAnnotator.
- Parameters:
adata (AnnData)
sample_name (str)
species (str)
tissue (str)
stage (str)
cluster_key (str)
model (str | None)
max_completion_tokens (int | None)
provider (str | None)
api_key (str | None)
_skip_validation (bool)
Attributes table
api_keys | Access to API key manager.
Methods table
annotate_clusters | Annotate clusters based on marker genes.
check_api_access | Check API access and log warnings if needed.
get_cluster_markers | Get marker genes per cluster.
harmonize_annotations | Map local cell type names to global cell type names.
list_available_models | List available models for the current provider.
query_llm | Query the LLM with a given instruction.
test_query | Test if the LLM setup is working correctly.
Attributes
- SampleAnnotator.api_keys
Access to API key manager.
Methods
- SampleAnnotator.annotate_clusters(min_markers, expected_marker_genes, restrict_to_expected=False)
Annotate clusters based on marker genes.
- Parameters:
- Return type:
- Returns:
Updates the following attributes:
- self.annotation_dict
- self.annotation_df
- SampleAnnotator.check_api_access(provider=None, model=None)
Check API access and log warnings if needed.
- SampleAnnotator.get_cluster_markers(method='wilcoxon', min_cells_per_cluster=3, min_specificity=0.75, min_auc=0.7, max_markers=7, use_raw=False, use_rapids=False)
Get marker genes per cluster.
- Parameters:
method (Optional[Literal['logreg', 't-test', 'wilcoxon', 't-test_overestim_var']] (default: 'wilcoxon')) – Method for marker gene computation. See scanpy.tl.rank_genes_groups for details.
min_cells_per_cluster (int (default: 3)) – Include only clusters with at least this many cells.
min_specificity (float (default: 0.75)) – Minimum specificity threshold for marker genes.
min_auc (float (default: 0.7)) – Minimum AUC threshold for marker genes.
max_markers (int (default: 7)) – Maximum number of marker genes per cluster.
use_raw (bool (default: False)) – Whether to use raw data for calculations.
use_rapids (bool (default: False)) – Whether to use RAPIDS for GPU acceleration.
- Return type:
- Returns:
None
Updates the following attributes:
- self.marker_dfs
- self.marker_genes
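The thresholds above interact in a simple way that a short sketch can make concrete. The code below is not the library's implementation: `MarkerCandidate`, `filter_markers`, and `select_cluster_markers` are hypothetical names, the specificity and AUC values are made up, and both the inclusive comparisons and the AUC-descending ordering are assumptions. It only illustrates how `min_cells_per_cluster`, `min_specificity`, `min_auc`, and `max_markers` could jointly narrow a candidate list.

```python
from dataclasses import dataclass


@dataclass
class MarkerCandidate:
    """Hypothetical record for one gene scored against one cluster."""
    gene: str
    specificity: float
    auc: float


def filter_markers(candidates, min_specificity=0.75, min_auc=0.7, max_markers=7):
    """Keep genes passing both thresholds, best AUC first, capped at max_markers."""
    passing = [
        c for c in candidates
        if c.specificity >= min_specificity and c.auc >= min_auc  # assumed inclusive
    ]
    passing.sort(key=lambda c: c.auc, reverse=True)  # assumed ranking criterion
    return [c.gene for c in passing[:max_markers]]


def select_cluster_markers(clusters, min_cells_per_cluster=3, **thresholds):
    """clusters maps a cluster label to (n_cells, list of MarkerCandidate)."""
    return {
        label: filter_markers(cands, **thresholds)
        for label, (n_cells, cands) in clusters.items()
        if n_cells >= min_cells_per_cluster  # small clusters are skipped entirely
    }


# Made-up example data: cluster "1" has too few cells and is dropped.
example = {
    "0": (120, [
        MarkerCandidate("CD3E", 0.92, 0.88),
        MarkerCandidate("ACTB", 0.30, 0.55),
        MarkerCandidate("IL7R", 0.80, 0.75),
    ]),
    "1": (2, [MarkerCandidate("MS4A1", 0.95, 0.93)]),
}
markers = select_cluster_markers(example, max_markers=2)
print(markers)
```

In this toy input, ACTB fails both thresholds and cluster "1" falls below the minimum cell count, so only cluster "0" keeps markers.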
- SampleAnnotator.harmonize_annotations(global_cell_type_list, unknown_key='Unknown')
Map local cell type names to global cell type names.
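As a rough illustration of the input/output contract only, the sketch below maps per-sample labels onto a shared vocabulary and falls back to `unknown_key` when nothing matches. The function `harmonize` and its normalization rule are hypothetical; the library may well resolve names differently (for example by asking the LLM), so this should not be read as its actual algorithm.

```python
def harmonize(local_names, global_cell_type_list, unknown_key="Unknown"):
    """Map each local label to a global label, or to unknown_key if unmatched."""

    def norm(name):
        # Assumed normalization: lowercase, hyphens as spaces, collapsed whitespace.
        return " ".join(name.lower().replace("-", " ").split())

    lookup = {norm(g): g for g in global_cell_type_list}
    return {name: lookup.get(norm(name), unknown_key) for name in local_names}


mapping = harmonize(
    ["t cells", "B-cells", "weird cluster"],  # made-up local labels
    ["T cells", "B cells"],                   # made-up global vocabulary
)
print(mapping)
```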
- SampleAnnotator.list_available_models()
List available models for the current provider.
- SampleAnnotator.query_llm(instruction, response_format, other_messages=None)
Query the LLM with a given instruction.
- SampleAnnotator.test_query(return_details=False)
Test if the LLM setup is working correctly.
Performs a simple query to verify that the API key is valid and the model can be accessed successfully.
- Parameters:
return_details (bool (default: False)) – If True, returns a (success, message) tuple with detailed information. If False, returns only a boolean success status.
- Return type:
- Returns:
If return_details=False: True if the test query succeeds, False otherwise. If return_details=True: Tuple of (success, message) with detailed status.
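Because the return shape depends on `return_details`, calling code may want to treat both shapes uniformly. The helper below is a minimal sketch with a hypothetical name (`normalize_test_result`); it assumes only what is stated above, namely that the result is either a boolean or a (success, message) tuple, and it additionally assumes the message can be rendered as a string.

```python
def normalize_test_result(result):
    """Return (success, message) for either return shape of a test query."""
    if isinstance(result, tuple):
        success, message = result
        return bool(success), str(message)
    # Plain boolean result: no message is available.
    return bool(result), ""


# Stand-in values; with a real instance one would pass
# annotator.test_query(return_details=True), where `annotator` is hypothetical.
ok, msg = normalize_test_result((False, "invalid API key"))
print(ok, msg)
```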