Data Load Summary

This document records the decisions made when loading the TCGA Breast Cancer data. It identifies the
data loaded, the decisions made to overcome conflicts or deficiencies in the data, and the schema
created to support the TCGA Breast Cancer data in BIS.
Clinical and Sample Data
Loaded the clinical (and sample) files with the X next to them.
...\Clinical\Biotab
x
x
x
x
x
x
x
nationwidechildrens.org_biospecimen_aliquot_brca.txt
nationwidechildrens.org_biospecimen_analyte_brca.txt
nationwidechildrens.org_biospecimen_cqcf_brca.txt
nationwidechildrens.org_biospecimen_normal_control_brca.txt
nationwidechildrens.org_biospecimen_portion_brca.txt
nationwidechildrens.org_biospecimen_protocol_brca.txt
nationwidechildrens.org_biospecimen_sample_brca.txt
nationwidechildrens.org_biospecimen_shipment_portion_brca.txt
nationwidechildrens.org_biospecimen_slide_brca.txt
nationwidechildrens.org_biospecimen_tumor_sample_brca.txt
nationwidechildrens.org_clinical_cqcf_brca.txt
nationwidechildrens.org_clinical_drug_brca.txt
nationwidechildrens.org_clinical_follow_up_v1.5_brca.txt
nationwidechildrens.org_clinical_follow_up_v2.1_brca.txt
nationwidechildrens.org_clinical_follow_up_v4.0_brca.txt
nationwidechildrens.org_clinical_follow_up_v4.0_nte_brca.txt
nationwidechildrens.org_clinical_nte_brca.txt
nationwidechildrens.org_clinical_patient_brca.txt
nationwidechildrens.org_clinical_radiation_brca.txt
Some samples referred to missing participants (patients), so the following participants were
inserted:
TCGA-A7-A5ZW
TCGA-A7-A5ZX
TCGA-A7-A6VV
TCGA-A7-A6VW
TCGA-A7-A6VX
TCGA-A7-A6VY
TCGA-AC-A2FE
TCGA-AC-A3EH
TCGA-AC-A3QQ
TCGA-AC-A3TM
TCGA-AC-A3W7
TCGA-AC-A5EH
TCGA-AC-A5EI
TCGA-AC-A5XS
TCGA-AC-A5XU
TCGA-AC-A62V
TCGA-AC-A62X
TCGA-AC-A6IW
TCGA-AC-A6IX
TCGA-AC-A6NO
TCGA-AC-A7VB
TCGA-AC-A7VC
TCGA-AC-A8OP
TCGA-AQ-A7U7
TCGA-AR-A2LJ
TCGA-B6-A3ZX
TCGA-B6-A400
TCGA-B6-A401
TCGA-E2-A572
TCGA-E2-A576
TCGA-E9-A6HE
TCGA-EW-A6S9
TCGA-EW-A6SA
TCGA-EW-A6SB
TCGA-EW-A6SC
TCGA-EW-A6SD
TCGA-HN-A2NL
TCGA-HN-A2OB
TCGA-LD-A74U
TCGA-LD-A7W5
TCGA-LD-A7W6
TCGA-LL-A5YN
TCGA-LL-A5YO
TCGA-LL-A5YP
TCGA-LL-A6FP
TCGA-LL-A6FQ
TCGA-LL-A6FR
TCGA-LL-A73Y
TCGA-LL-A73Z
TCGA-LL-A740
TCGA-LL-A7SZ
TCGA-LL-A7T0
TCGA-LL-A8F5
TCGA-OL-A6VO
TCGA-OL-A6VR
TCGA-PE-A5DC
TCGA-BH-A8G0
TCGA-C8-A8HP
TCGA-C8-A8HQ
TCGA-C8-A8HR
TCGA-D8-A73U
TCGA-D8-A73W
TCGA-D8-A73X
TCGA-B6-A408
TCGA-B6-A409
TCGA-B6-A40B
TCGA-B6-A40C
TCGA-BH-A5IZ
TCGA-BH-A5J0
TCGA-BH-A6R8
TCGA-BH-A6R9
TCGA-BH-A8FY
TCGA-BH-A8FZ
TCGA-PE-A5DE
TCGA-PL-A8LZ
TCGA-S3-A6ZF
TCGA-S3-A6ZG
TCGA-S3-A6ZH
TCGA-V7-A7HQ
TCGA-W8-A86G
TCGA-XX-A899
TCGA-XX-A89A
TCGA-B6-A402
TCGA-PE-A5DD
Computed survival_time (days) = max(days_to_last_followup, days_to_death)
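A minimal sketch of this computation, assuming each value may be missing (None) or carry a non-numeric placeholder string. The function names and the handling of missing values are illustrative assumptions, not the loader's actual code.

```python
from typing import Optional


def to_days(value) -> Optional[int]:
    """Return value as a whole number of days, or None if it is missing or non-numeric."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def survival_time(days_to_last_followup, days_to_death) -> Optional[int]:
    """survival_time (days) = max(days_to_last_followup, days_to_death).

    Assumption: a missing value is ignored rather than treated as 0, and the
    result is None when both values are missing.
    """
    candidates = (to_days(days_to_last_followup), to_days(days_to_death))
    known = [d for d in candidates if d is not None]
    return max(known) if known else None
```

Ignoring a missing value, rather than substituting 0, keeps a patient with only one recorded field from being distinguished artificially from one with both; whether the original load did the same is not stated.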
Inserted two Samples referred to by the Experiment data.
Participant
TCGA-E2-A2P5
TCGA-E2-A2P6
Sample
TCGA-E2-A2P5-10B
TCGA-E2-A2P6-10B
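The participant for a sample can be read from the sample barcode itself, because the first three hyphen-separated fields of a TCGA barcode identify the participant (TCGA-E2-A2P5-10B belongs to TCGA-E2-A2P5). The following is a minimal sketch of how missing participants could be detected before inserting them; the function names are hypothetical and this is not necessarily how the load was performed.

```python
def participant_barcode(sample_barcode: str) -> str:
    """Return the participant portion of a TCGA barcode (its first three hyphen-separated fields)."""
    return "-".join(sample_barcode.split("-")[:3])


def missing_participants(sample_barcodes, known_participants):
    """List participants referenced by samples but absent from known_participants, in first-seen order."""
    known = set(known_participants)  # local copy; the caller's collection is not modified
    missing = []
    for barcode in sample_barcodes:
        participant = participant_barcode(barcode)
        if participant not in known:
            known.add(participant)  # avoid reporting the same participant twice
            missing.append(participant)
    return missing
```

For example, calling missing_participants(["TCGA-E2-A2P5-10B", "TCGA-E2-A2P6-10B"], []) returns ["TCGA-E2-A2P5", "TCGA-E2-A2P6"].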
Molecular Data
Somatic Mutations
Loaded file:
\somatic mutation\Somatic_Mutations\WUSM__IlluminaGA_DNASeq\Level_2\
genome.wustl.edu__Illumina_Genome_Analyzer_DNA_Sequencing_level2.maf
Added the Amino Acid change field values to Genetic Variant by joining on chromosome position, type and
mutation.
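This join can be pictured as a keyed lookup. The sketch below assumes that MAF rows and Genetic Variant rows are available as dicts and uses hypothetical field names (chromosome, position, type, mutation, amino_acid_change); the actual BIS column names are not given in this summary.

```python
JOIN_KEYS = ("chromosome", "position", "type", "mutation")  # hypothetical field names


def join_key(row):
    """Build the composite key used to match a MAF row to a Genetic Variant row."""
    return tuple(str(row[k]) for k in JOIN_KEYS)


def add_amino_acid_change(variants, maf_rows):
    """Copy amino_acid_change from matching MAF rows onto variants.

    Variants with no matching MAF row are returned unchanged.
    """
    lookup = {join_key(r): r["amino_acid_change"] for r in maf_rows}
    updated = []
    for variant in variants:
        merged = dict(variant)  # copy so the caller's rows are not mutated
        key = join_key(variant)
        if key in lookup:
            merged["amino_acid_change"] = lookup[key]
        updated.append(merged)
    return updated
```

Key values are compared as strings so that a position stored as a number in one source and as text in the other still matches; that normalization is a choice made for this sketch.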
Gene Expression
Loaded the 590 files in the
…\Expression-Genes\UNC__AgilentG4502A_07_3\Level_3 folder.
Removed expression observations with gene symbol Rgr.
Copy Number Variant
Loaded the *hg19.* files.
Duplicate rows were found in the data, so we used the following precedence rule:
If there is exactly one distinct row with a given CNVObservation_barcode, load it.
If there are multiple rows with the same CNVObservation_barcode, load the row that does not have
FFPE in the filename and type=NOCNV.
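The precedence rule can be sketched as a ranking over each group of duplicates. This sketch takes one reading of the rule, namely that the preferred row has no FFPE in its filename and has type NOCNV. The dict keys filename and type are hypothetical, and the fallback used when no row meets both conditions (take the best-ranked row) is an assumption, since the rule does not cover that case.

```python
from collections import defaultdict


def preference(row):
    """Rank a row; lower sorts first: non-FFPE filename, then type NOCNV."""
    is_ffpe = "FFPE" in row["filename"]
    is_nocnv = row["type"] == "NOCNV"
    return (is_ffpe, not is_nocnv)


def deduplicate(rows):
    """Keep one row per CNVObservation_barcode, chosen by preference()."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["CNVObservation_barcode"]].append(row)
    # min() returns the first row with the best rank, so ties keep input order.
    return [min(group, key=preference) for group in groups.values()]
```

A barcode with a single row passes through unchanged, which matches the first clause of the rule.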
For more information
To learn more, contact your TransMed representative or visit: xbtransmed.com. © Copyright TransMed Systems, Inc. 2011
TransMed Systems, 21170 Canyon Oak Way, Cupertino, CA 95014. The information contained in this documentation is
provided for informational purposes only and is subject to change without notice. While efforts were made to verify the
completeness and accuracy of the information contained in this documentation, it is provided “as is” without warranty of any
kind, express or implied. All Rights Reserved.