CDA Mutation
We extract information about variants from the CDA table `mutation`.
We first summarize the table and then outline our ETL strategy.
Column | Example | Explanation |
---|---|---|
project_short_name | TCGA-CESC | |
case_barcode | TCGA-C5-A1MI | |
cda_subject_id | TCGA.TCGA-C5-A1MI | |
primary_site | Cervix uteri | |
Hugo_Symbol | IGSF9B | |
Entrez_Gene_Id | 22997 | |
Center | BI | |
NCBI_Build | GRCh38 | |
Chromosome | chr11 | |
Start_Position | 133921225 | |
End_Position | 133921225 | |
Strand | + | |
Variant_Classification | Missense_Mutation | |
Variant_Type | SNP | |
Reference_Allele | C | |
Tumor_Seq_Allele1 | C | |
Tumor_Seq_Allele2 | T | |
dbSNP_RS | rs771150072 | |
dbSNP_Val_Status | ||
Tumor_Aliquot_Barcode | TCGA-C5-A1MI-01A-11D-A14W-08 | |
Matched_Norm_Aliquot_Barcode | TCGA-C5-A1MI-10A-01D-A14W-08 | |
Match_Norm_Seq_Allele1 | ||
Match_Norm_Seq_Allele2 | ||
Tumor_Validation_Allele1 | ||
Tumor_Validation_Allele2 | ||
Match_Norm_Validation_Allele1 | ||
Match_Norm_Validation_Allele2 | ||
Verification_Status | ||
Validation_Status | ||
Mutation_Status | Somatic | |
Sequencing_Phase | ||
Sequence_Source | ||
Validation_Method | ||
Score | ||
BAM_File | ||
Sequencer | ||
Tumor_Aliquot_UUID | 497c20f0-8a42-4d20-abdc-0415982ebb9f | |
Matched_Norm_Aliquot_UUID | a3d0503b-baac-4f83-9182-be7b4154c61d | |
HGVSc | c.2500G>A | |
HGVSp | p.Val834Met | |
HGVSp_Short | p.V834M | |
Transcript_ID | ENST00000321016 | |
Exon_Number | 18/19 | |
t_depth | 64 | |
t_ref_count | 43 | |
t_alt_count | 20 | |
n_depth | 88 | |
n_ref_count | ||
n_alt_count | ||
all_effects | IGSF9B,missense_variant,p.V834M,ENST00000533871,NM_001277285.4,c.2500G>A,MODERATE,YES,deleterious(0),probably_damaging(1),-1;IGSF9B,missense_variant,p.V834M,ENST00000321016,,c.2500G>A,MODERATE,,deleterious(0.01),probably_damaging(0.988),-1;IGSF9B,downstream_gene_variant,,ENST00000527648,,,MODIFIER,,,,-1 | |
Allele | T | |
Gene | ENSG00000080854 | |
Feature | ENST00000321016 | |
Feature_type | Transcript | |
One_Consequence | missense_variant | |
Consequence | missense_variant | |
cDNA_position | 2500/4050 | |
Protein_position | 834/1349 | |
Amino_acids | V/M | |
Codons | Gtg/Atg | |
Existing_variation | rs771150072;COSV58068494 | |
DISTANCE | ||
TRANSCRIPT_STRAND | -1 | |
SYMBOL | IGSF9B | |
SYMBOL_SOURCE | HGNC | |
HGNC_ID | HGNC:32326 | |
BIOTYPE | protein_coding | |
CANONICAL | ||
CCDS | ||
ENSP | ENSP00000317980 | |
SWISSPROT | Q9UPX0.150 | |
TREMBL | ||
UNIPARC | UPI0001545E3E | |
UNIPROT_ISOFORM | Q9UPX0-1 | |
RefSeq | ||
MANE | ||
APPRIS | ||
FLAGS | ||
SIFT | deleterious(0.01) | |
PolyPhen | probably_damaging(0.988) | |
EXON | 18/19 | |
INTRON | ||
DOMAINS | PANTHER:PTHR12231;PANTHER:PTHR12231:SF240;Low_complexity_(Seg):seg | |
ThousG_AF | ||
ThousG_AFR_AF | ||
ThousG_AMR_AF | ||
ThousG_EAS_AF | ||
ThousG_EUR_AF | ||
ThousG_SAS_AF | ||
ESP_AA_AF | ||
ESP_EA_AF | ||
gnomAD_AF | 1.216e-05 | |
gnomAD_AFR_AF | ||
gnomAD_AMR_AF | ||
gnomAD_ASJ_AF | ||
gnomAD_EAS_AF | ||
gnomAD_FIN_AF | ||
gnomAD_NFE_AF | ||
gnomAD_OTH_AF | ||
gnomAD_SAS_AF | ||
MAX_AF | 9.968e-05 | |
MAX_AF_POPS | 3.278e-05 | |
gnomAD_non_cancer_AF | ||
gnomAD_non_cancer_AFR_AF | ||
gnomAD_non_cancer_AMI_AF | ||
gnomAD_non_cancer_AMR_AF | ||
gnomAD_non_cancer_ASJ_AF | ||
gnomAD_non_cancer_EAS_AF | ||
gnomAD_non_cancer_FIN_AF | ||
gnomAD_non_cancer_MID_AF | ||
gnomAD_non_cancer_NFE_AF | ||
gnomAD_non_cancer_OTH_AF | ||
gnomAD_non_cancer_SAS_AF | ||
gnomAD_non_cancer_MAX_AF_adj | ||
gnomAD_non_cancer_MAX_AF_POPS_adj | ||
CLIN_SIG | ||
SOMATIC | 0;1 | |
PUBMED | ||
TRANSCRIPTION_FACTORS | ||
MOTIF_NAME | ||
MOTIF_POS | ||
HIGH_INF_POS | ||
MOTIF_SCORE_CHANGE | ||
miRNA | ||
IMPACT | MODERATE | |
PICK | ||
VARIANT_CLASS | SNV | |
TSL | 5 | |
HGVS_OFFSET | ||
PHENO | 0;1 | |
GENE_PHENO | ||
CONTEXT | GGCCACGCTGT | |
tumor_submitter_uuid | 8c3559db-155f-42d3-9a73-38d5610f74b5 | |
normal_submitter_uuid | 59778b5f-335a-471e-abb2-6dde0b5d7fe7 | |
case_id | 941f75a1-fea4-4539-ba69-60bb11608f6d | |
GDC_FILTER | ||
COSMIC | COSM376595;COSM376596 | |
hotspot | False | |
RNA_Support | Unknown | |
RNA_depth | ||
RNA_ref_count | ||
RNA_alt_count | ||
callers | muse;mutect2;varscan2 | |
file_gdc_id | 3fd5afe7-9e69-4ea8-ab01-80e41783d795 | |
muse | Yes | |
mutect2 | Yes | |
pindel | No | |
varscan2 | Yes | |
sample_barcode_tumor | TCGA-C5-A1MI-01A | |
sample_barcode_normal | TCGA-C5-A1MI-10A | |
aliquot_barcode_tumor | TCGA-C5-A1MI-01A-11D-A14W-08 | |
aliquot_barcode_normal | TCGA-C5-A1MI-10A-01D-A14W-08 | |
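
To make the shape of a row concrete, the following minimal sketch condenses the example row above into a small summary object. It assumes the row is available as a dictionary of strings keyed by column name. The `VariantSummary` class, the `summarize_row` function, and the choice of columns are hypothetical and for illustration only; they are not part of the CDA schema and do not necessarily match the fields used by the ETL. The sketch also assumes that `Tumor_Seq_Allele2` holds the alternate allele and that multi-valued columns such as `callers` are semicolon-delimited, as the example row suggests.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class VariantSummary:
    """Hypothetical container for a few columns of one `mutation` row."""
    case_barcode: str
    gene_symbol: str
    variant_key: str          # chrom-pos-ref-alt, a VCF-like identifier
    hgvs_c: str
    callers: Tuple[str, ...]  # variant callers that reported the variant
    tumor_vaf: float          # t_alt_count / t_depth


def summarize_row(row: Dict[str, str]) -> VariantSummary:
    """Condense one row (column name -> string value) into a summary."""
    # Assumption: Tumor_Seq_Allele2 is the alternate allele.
    variant_key = "-".join([
        row["Chromosome"],
        row["Start_Position"],
        row["Reference_Allele"],
        row["Tumor_Seq_Allele2"],
    ])
    # Assumption: multi-valued columns are semicolon-delimited.
    callers = tuple(c for c in row["callers"].split(";") if c)
    depth = int(row["t_depth"])
    # Guard against a zero read depth to avoid division by zero.
    tumor_vaf = int(row["t_alt_count"]) / depth if depth > 0 else float("nan")
    return VariantSummary(
        case_barcode=row["case_barcode"],
        gene_symbol=row["Hugo_Symbol"],
        variant_key=variant_key,
        hgvs_c=row["HGVSc"],
        callers=callers,
        tumor_vaf=tumor_vaf,
    )


# Values copied from the example column of the table above.
example_row = {
    "case_barcode": "TCGA-C5-A1MI",
    "Hugo_Symbol": "IGSF9B",
    "Chromosome": "chr11",
    "Start_Position": "133921225",
    "Reference_Allele": "C",
    "Tumor_Seq_Allele2": "T",
    "HGVSc": "c.2500G>A",
    "t_depth": "64",
    "t_alt_count": "20",
    "callers": "muse;mutect2;varscan2",
}

summary = summarize_row(example_row)
print(summary.variant_key, summary.tumor_vaf, summary.callers)
```

Many columns are empty in the example row, so code that reads other columns should be prepared for empty or missing values.
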
We will use only a few fields for extracting data for the phenopacket. These fields are explained below.