lamindb.curators.DataFrameCurator
- class lamindb.curators.DataFrameCurator(dataset, schema)
Bases: Curator
Curator for a DataFrame object.
- Parameters:
dataset (DataFrame) – The DataFrame-like object to validate & annotate.
schema (Schema) – A Schema object that defines the validation constraints.
Example:
import lamindb as ln
import bionty as bt

# define valid labels
cell_medium = ln.ULabel(name="CellMedium", is_type=True).save()
ln.ULabel(name="DMSO", type=cell_medium).save()
ln.ULabel(name="IFNG", type=cell_medium).save()
bt.CellType.from_source(name="B cell").save()
bt.CellType.from_source(name="T cell").save()

# define schema
schema = ln.Schema(
    name="small_dataset1_obs_level_metadata",
    otype="DataFrame",
    features=[
        ln.Feature(name="cell_medium", dtype="cat[ULabel[CellMedium]]").save(),
        ln.Feature(name="sample_note", dtype="str").save(),
        ln.Feature(name="cell_type_by_expert", dtype="cat[bionty.CellType]").save(),
        ln.Feature(name="cell_type_by_model", dtype="cat[bionty.CellType]").save(),
    ],
    coerce_dtype=True,
).save()

# curate a DataFrame (the example dataset is loaded from ln.core.datasets)
df = ln.core.datasets.small_dataset1(otype="DataFrame")
curator = ln.curators.DataFrameCurator(df, schema)
artifact = curator.save_artifact(key="example_datasets/dataset1.parquet")
# the saved artifact is linked to the schema it was curated against
assert artifact.schema == schema
Methods
- save_artifact(*, key=None, description=None, revises=None, run=None)
Save an annotated artifact.
- Parameters:
key (str | None, default: None) – A path-like key to reference the artifact in default storage, e.g., "myfolder/myfile.fcs". Artifacts with the same key form a revision family.
description (str | None, default: None) – A description.
revises (Artifact | None, default: None) – Previous version of the artifact. An alternative to passing key to trigger a revision.
run (Run | None, default: None) – The run that creates the artifact.
- Returns:
A saved artifact record.
- validate()
Validate dataset.
- Raises:
lamindb.errors.ValidationError – If validation fails.
- Return type:
None
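Because validate() returns None and signals failure only by raising, a caller that wants to skip saving an invalid dataset has to catch the exception. The following is a minimal sketch of that pattern, not LaminDB code: StubCurator, the local ValidationError class, and validate_then_save are hypothetical stand-ins so that the block runs without a LaminDB instance. Whether save_artifact() validates on its own is not stated here, so the sketch calls validate() explicitly.

```python
from typing import Any, Optional


class ValidationError(Exception):
    """Local stand-in (assumption) for lamindb.errors.ValidationError."""


class StubCurator:
    """Hypothetical stand-in exposing the two documented methods."""

    def __init__(self, is_valid: bool) -> None:
        self.is_valid = is_valid

    def validate(self) -> None:
        # Mirrors the documented contract: return None or raise.
        if not self.is_valid:
            raise ValidationError("dataset does not satisfy the schema")

    def save_artifact(self, *, key=None, description=None, revises=None, run=None):
        # A real curator returns a saved artifact record; a dict stands in here.
        return {"key": key, "description": description}


def validate_then_save(curator: Any, *, key: str) -> Optional[Any]:
    """Save only if validation passes; return None otherwise."""
    try:
        curator.validate()
    except ValidationError as err:
        print(f"validation failed, not saving: {err}")
        return None
    return curator.save_artifact(key=key)


saved = validate_then_save(StubCurator(is_valid=True), key="example_datasets/dataset1.parquet")
skipped = validate_then_save(StubCurator(is_valid=False), key="example_datasets/dataset1.parquet")
```

With a real setup, the stub would be replaced by ln.curators.DataFrameCurator(df, schema) and the local exception by an import of ValidationError from lamindb.errors.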