Developing a PhEval plugin
This guide explains how to develop a PhEval plugin that exposes a runner and produces PhEval standardised results that can be benchmarked consistently.
Video walkthrough
If you prefer a guided walkthrough, start here:
Key takeaways
- A runner must implement all PhEvalRunner methods (prepare, run, post_process).
- Your runner must write standardised result files with the required columns for the benchmark type.
- Result filenames must match phenopacket filenames (file stem matching) so PhEval can align outputs to cases.
Standardised result schemas (required)
PhEval benchmarking operates on standardised result files. Each result file must conform exactly to the required schema for the type of prioritisation being produced.
Schemas are validated during post-processing.
Missing or incorrectly named columns will cause validation to fail.
Gene prioritisation results
Each gene result must contain the following columns:
| Column name | Type | Description |
|---|---|---|
| gene_symbol | pl.String | Gene symbol |
| gene_identifier | pl.String | Gene identifier |
| score | pl.Float64 | Tool-specific score |
| grouping_id | pl.Utf8 | Optional grouping identifier |
Variant prioritisation results
Each variant result must contain the following columns:
| Column name | Type | Description |
|---|---|---|
| chrom | pl.String | Chromosome |
| start | pl.Int64 | Start position |
| end | pl.Int64 | End position |
| ref | pl.String | Reference allele |
| alt | pl.String | Alternate allele |
| score | pl.Float64 | Tool-specific score |
| grouping_id | pl.Utf8 | Optional grouping identifier |
Disease prioritisation results
Each disease result must contain the following columns:
| Column name | Type | Description |
|---|---|---|
| disease_identifier | pl.String | Disease identifier |
| score | pl.Float64 | Tool-specific score |
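Because missing or misnamed columns fail validation, a lightweight check before building the DataFrame can surface problems early. The sketch below is a hypothetical helper: `REQUIRED_COLUMNS` and `missing_columns` are illustrative names and not part of the PhEval API. It uses the column names from the tables above and treats `grouping_id` as optional.

```python
# Hypothetical pre-check; not part of the PhEval API.
# Column names are taken from the schema tables; grouping_id is optional
# and therefore not listed.
REQUIRED_COLUMNS = {
    "gene": ("gene_symbol", "gene_identifier", "score"),
    "variant": ("chrom", "start", "end", "ref", "alt", "score"),
    "disease": ("disease_identifier", "score"),
}


def missing_columns(result_type: str, columns: list[str]) -> list[str]:
    """Return the required columns absent from `columns`, in schema order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS[result_type] if name not in present]


print(missing_columns("gene", ["gene_symbol", "score"]))  # ['gene_identifier']
```

An empty list means every required column name is present. Dtypes are not checked here.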
The grouping_id column (optional but important)
grouping_id is optional and enables joint ranking of entities that should be treated as a single unit without penalty.
Typical examples include:
- Compound heterozygous variants (multiple variants contributing together)
- Grouped variant representations within the same gene
- Polygenic or grouped signals where multiple items should be evaluated jointly
How to use it
- Variants in the same group share the same grouping_id
- Variants not in any group should each have a unique grouping_id
This preserves ranking semantics when benchmarking.
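One way to apply these two rules is sketched below. It assumes each raw variant record carries a hypothetical `group` key (for example a compound-heterozygous pair label emitted by the tool) that is `None` for ungrouped variants. Both the key and the `assign_grouping_ids` helper are illustrative names, not part of PhEval.

```python
from itertools import count


def assign_grouping_ids(variants, group_key="group"):
    """Return copies of `variants` with a `grouping_id` added.

    Variants sharing a non-None `group_key` value get the same ID;
    every other variant gets an ID of its own.
    """
    counter = count(1)
    group_ids = {}  # group label -> grouping_id already handed out
    labelled = []
    for variant in variants:
        group = variant.get(group_key)
        if group is None:
            grouping_id = f"g{next(counter)}"
        else:
            if group not in group_ids:
                group_ids[group] = f"g{next(counter)}"
            grouping_id = group_ids[group]
        labelled.append({**variant, "grouping_id": grouping_id})
    return labelled


# Placeholder records: two variants in one group, one ungrouped.
variants = [
    {"chrom": "1", "start": 100, "group": "comp_het_1"},
    {"chrom": "1", "start": 250, "group": "comp_het_1"},
    {"chrom": "2", "start": 500, "group": None},
]
print([v["grouping_id"] for v in assign_grouping_ids(variants)])  # ['g1', 'g1', 'g2']
```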
Result file naming (required)
PhEval aligns result files to cases using filename stem matching.
Rule:
The result filename stem must exactly match the phenopacket filename stem.
Example:
- Phenopacket: patient_001.json
- Result filename: patient_001-exomiser.json
- Processed result filename passed to PhEval: patient_001.json
If the stems do not match, PhEval cannot reliably associate results with phenopackets, and benchmarking may be incomplete or incorrect.
Recommendation:
Always derive result filenames programmatically from the phenopacket stem.
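A minimal sketch of that recommendation, assuming the tool appends a fixed suffix such as `-exomiser` to the phenopacket stem. The `result_path_for` helper is an illustrative name, not part of PhEval.

```python
from pathlib import Path


def result_path_for(raw_output: Path, tool_suffix: str) -> Path:
    """Return `raw_output` with `tool_suffix` stripped from the end of its stem."""
    # str.removesuffix (Python 3.9+) leaves the stem unchanged when the
    # suffix is absent or empty.
    stem = raw_output.stem.removesuffix(tool_suffix)
    return raw_output.with_name(stem + raw_output.suffix)


raw = Path("patient_001-exomiser.json")
print(result_path_for(raw, "-exomiser").name)  # patient_001.json
```

The returned path keeps the original directory and extension, so only the stem changes.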
Step-by-step plugin development
PhEval plugins are typically derived from the runner template and standardised tooling. The recommended approach uses the PhEval runner template, MkDocs, tox, and uv.
The template is available at https://github.com/monarch-initiative/pheval-runner-template
1. Scaffold a new plugin
Install cruft (used to create projects from the template and keep them up to date):
pip install cruft
Create a project using the template:
cruft create https://github.com/monarch-initiative/pheval-runner-template
2. Environment and dependencies
Install uv (if you do not already use it):
pip install uv
Install dependencies and activate the environment:
uv sync
source .venv/bin/activate
Run the test suite to confirm the setup:
uv run tox
Note
The template uses uv by default, but this is not required. You may use any packaging/dependency manager. PhEval only requires a valid pheval.plugins entry point.
3. Implement your custom runner
In the generated template, implement your runner in runner.py (under src/).
At minimum, implement prepare, run, and post_process:
"""Runner."""

from dataclasses import dataclass
from pathlib import Path

from pheval.runners.runner import PhEvalRunner


@dataclass
class CustomRunner(PhEvalRunner):
    """Runner class implementation."""

    input_dir: Path
    testdata_dir: Path
    tmp_dir: Path
    output_dir: Path
    config_file: Path
    version: str

    def prepare(self):
        """Prepare inputs."""
        print("preparing")

    def run(self):
        """Execute the tool."""
        print("running")

    def post_process(self):
        """Convert raw outputs to PhEval standardised results."""
        print("post processing")
4. Register the runner entry point
The template populates your pyproject.toml entry points.
If you rename the runner class or move files, update this accordingly:
[project.entry-points."pheval.plugins"]
customrunner = "pheval_plugin_example.runner:CustomRunner"
Tip
The module path and class name are case-sensitive.
Tool-specific configuration (config.yaml)
For pheval run to execute, the input directory must contain a config.yaml:
tool:
tool_version:
variant_analysis:
gene_analysis:
disease_analysis:
tool_specific_configuration_options:
- variant_analysis, gene_analysis, disease_analysis must be booleans (true/false)
- tool_specific_configuration_options is optional and may include plugin-specific configuration
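A common mistake is quoting the flags in YAML, which turns them into strings. The sketch below checks a configuration that has already been loaded into a dictionary. `check_analysis_flags` is an illustrative helper, and the `tool_version` value is a placeholder.

```python
ANALYSIS_FLAGS = ("variant_analysis", "gene_analysis", "disease_analysis")


def check_analysis_flags(config: dict) -> list[str]:
    """Return the analysis flags that are missing or not booleans."""
    return [flag for flag in ANALYSIS_FLAGS if not isinstance(config.get(flag), bool)]


# Equivalent of a parsed config.yaml; "true" in quotes is a string, not a boolean.
config = {
    "tool": "exomiser",
    "tool_version": "1.0.0",  # placeholder version
    "variant_analysis": True,
    "gene_analysis": "true",
    "disease_analysis": False,
}
print(check_analysis_flags(config))  # ['gene_analysis']
```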
Parsing tool-specific configuration (recommended)
Using pydantic can simplify parsing:
from pydantic import BaseModel, Field


class CustomisedConfigurations(BaseModel):
    environment: str = Field(...)
Then parse in your runner:
# parse_obj is deprecated in pydantic v2; use model_validate there instead.
config = CustomisedConfigurations.parse_obj(
    self.input_dir_config.tool_specific_configuration_options
)
environment = config.environment
Post-processing: generating standardised results
PhEval can handle ranking and writing result files in the correct locations. Your runner’s post-processing must:
- Read tool-specific raw outputs
- Extract the required fields
- Construct a Polars DataFrame with the required schema
- Call the appropriate PhEval helper method to write standardised results
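The first two steps can be sketched with the standard library alone. The example assumes a hypothetical raw TSV output with `SYMBOL`, `ENSEMBL_ID`, and `COMBINED_SCORE` columns. Real tools will differ, and the scores shown are made up.

```python
import csv
import io


def gene_columns_from_tsv(raw_text: str) -> dict[str, list]:
    """Map a hypothetical raw TSV to the gene schema's required columns."""
    columns = {"gene_symbol": [], "gene_identifier": [], "score": []}
    for row in csv.DictReader(io.StringIO(raw_text), delimiter="\t"):
        columns["gene_symbol"].append(row["SYMBOL"])
        columns["gene_identifier"].append(row["ENSEMBL_ID"])
        columns["score"].append(float(row["COMBINED_SCORE"]))  # scores must be floats
    return columns


# Illustrative raw output; the header names and scores are assumptions.
raw = (
    "SYMBOL\tENSEMBL_ID\tCOMBINED_SCORE\n"
    "FBN1\tENSG00000166147\t0.98\n"
    "TTN\tENSG00000155657\t0.41\n"
)
columns = gene_columns_from_tsv(raw)
print(columns["gene_symbol"], columns["score"])
```

A column-oriented dictionary like this can be passed directly to `pl.DataFrame(...)`, and the resulting DataFrame handed to the appropriate PhEval helper.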
Result generation helpers
Breaking change (v0.5.0)
generate_pheval_result was replaced with:
- generate_gene_result
- generate_variant_result
- generate_disease_result
Generating gene result files
Use generate_gene_result to write PhEval-standardised gene results from a Polars DataFrame.
from pheval.post_processing.post_processing import (
    generate_gene_result,
    SortOrder,
)

generate_gene_result(
    results=pheval_gene_result,       # Polars DataFrame (gene schema)
    sort_order=SortOrder.DESCENDING,  # or SortOrder.ASCENDING
    output_dir=output_directory,      # typically self.output_dir
    result_path=result_path,          # path to raw tool output, stem MUST match phenopacket stem exactly
    phenopacket_dir=phenopacket_dir,  # directory containing phenopackets
)
Generating variant result files
Use generate_variant_result to write PhEval-standardised variant results.
from pheval.post_processing.post_processing import (
    generate_variant_result,
    SortOrder,
)

generate_variant_result(
    results=pheval_variant_result,    # Polars DataFrame (variant schema)
    sort_order=SortOrder.DESCENDING,
    output_dir=output_directory,
    result_path=result_path,          # stem must match phenopacket stem
    phenopacket_dir=phenopacket_dir,
)
Generating disease result files
Use generate_disease_result to write PhEval-standardised disease results.
from pheval.post_processing.post_processing import (
    generate_disease_result,
    SortOrder,
)

generate_disease_result(
    results=pheval_disease_result,    # Polars DataFrame (disease schema)
    sort_order=SortOrder.DESCENDING,
    output_dir=output_directory,
    result_path=result_path,          # stem must match phenopacket stem
    phenopacket_dir=phenopacket_dir,
)
Important
The stem of result_path must exactly match the phenopacket stem. This often requires stripping tool-specific suffixes from raw output filenames.
Adding metadata to results.yml (optional)
PhEval writes a results.yml file to the output directory by default.
You can add customised metadata by overriding construct_meta_data().
Example dataclass:
from dataclasses import dataclass


@dataclass
class CustomisedMetaData:
    customised_field: str
Runner implementation:
def construct_meta_data(self):
    self.meta_data.tool_specific_configuration_options = CustomisedMetaData(
        customised_field="customised_value"
    )
    return self.meta_data
Helper utilities (optional)
PhEval provides helper methods that can simplify runner implementations.
PhenopacketUtil
Useful for extracting observed phenotypes when tools do not accept phenopackets directly:
Class for retrieving data from a Phenopacket or Family object
Source code in src/pheval/utils/phenopacket_utils.py
Example usage:
from pheval.utils.phenopacket_utils import phenopacket_reader, PhenopacketUtil
phenopacket = phenopacket_reader("/path/to/phenopacket.json")
phenopacket_util = PhenopacketUtil(phenopacket)
observed_phenotypes = phenopacket_util.observed_phenotypic_features()
observed_phenotypes_hpo_ids = [p.type.id for p in observed_phenotypes]
Testing your runner
Install dependencies:
uv sync
Run PhEval using your custom runner:
pheval run -i ./input_dir -t ./test_data_dir -r customrunner -o output_dir
Notes:
- the -r/--runner value must match the entry point name (lowercase)
- confirm that standardised result files are produced and validate correctly
- confirm that result file stems match the phenopacket file stems
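The stem check in the last note can be automated. The sketch below compares the stems of the phenopackets with the stems of the paths you intend to pass as `result_path`. `unmatched_stems` is an illustrative helper, not part of PhEval, and the filenames are placeholders.

```python
from pathlib import Path
from typing import Iterable


def unmatched_stems(phenopackets: Iterable[Path], result_paths: Iterable[Path]) -> set[str]:
    """Return phenopacket stems that have no result path with the same stem."""
    result_stems = {Path(p).stem for p in result_paths}
    return {Path(p).stem for p in phenopackets} - result_stems


phenopackets = [Path("patient_001.json"), Path("patient_002.json")]
# The second result still carries a tool-specific suffix, so it will not match.
result_paths = [Path("patient_001.json"), Path("patient_002-exomiser.json")]
print(unmatched_stems(phenopackets, result_paths))  # {'patient_002'}
```

An empty set means every phenopacket has a result with a matching stem.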
Checklist before release
- Runner implements prepare, run, post_process
- Entry point registered under pheval.plugins
- Standardised results conform to required schema(s)
- Result filenames use phenopacket stem matching
- Optional: grouping_id correctly set for grouped ranking scenarios
- Optional: results.yml metadata populated where useful