
Nov 27

FDA Validation of a PCR test: Pre-processing data Part 2 / n

All of the data I analyzed used the same format, which makes it easy to reuse code.

The data for each PCR experiment was in an Excel file or a comma-delimited (.csv) file with the following format.

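| Well | Fluor | Target | Content | Sample | Biological Set Name | Cq | Cq Mean | Cq Std. Dev |
|---|---|---|---|---|---|---|---|---|
| A01 | FAM | Fusion | Unkn-1 | Fusion | | 28.186823 | 28.2589569 | 0.10657898 |
| A01 | HEX | Internal Control | Unkn-1 | Fusion | | 30.82512469 | 30.8171773 | 0.193826672 |
| A01 | Texas Red | Gene A | Unkn-1 | Fusion | | NaN | 0 | 0 |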

To complete the FDA sections above, data from the different runs need to be combined. So I manually added a column with a unique run name to every run, which lets me combine the runs and still distinguish between wells.

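| Run | Well | Fluor | Target | Content | Sample | Biological Set Name | Cq | Cq Mean | Cq Std. Dev |
|---|---|---|---|---|---|---|---|---|---|
| VP-xxxx-yyy_001 | A01 | FAM | Fusion | Unkn-1 | Fusion | | 28.186823 | 28.2589569 | 0.10657898 |
| VP-xxxx-yyy_001 | A01 | HEX | Internal Control | Unkn-1 | Fusion | | 30.82512469 | 30.8171773 | 0.193826672 |
| VP-xxxx-yyy_001 | A01 | Texas Red | Gene A | Unkn-1 | Fusion | | NaN | 0 | 0 |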
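Adding the Run column by hand works, but the same step could also be scripted. Here is a minimal sketch, assuming each run was exported to its own .csv file in a single folder and that the file name (minus the extension) is an acceptable run name. The helpers `read_run` and `combine_runs` are hypothetical names used only for illustration.

```r
# Hypothetical helper: read one exported run and tag every row with a run name.
# Assumption: the run name is the file name without its .csv extension.
read_run <- function(path) {
  run_df <- read.csv(path, header = TRUE)
  run_name <- sub("\\.csv$", "", basename(path))
  cbind(Run = rep(run_name, nrow(run_df)), run_df)
}

# Hypothetical helper: stack all runs found in one folder into one data frame.
# Assumption: every file has the same columns, so rbind() can stack them.
combine_runs <- function(run_folder) {
  paths <- list.files(run_folder, pattern = "\\.csv$", full.names = TRUE)
  do.call(rbind, lapply(paths, read_run))
}

# Example call (the folder path is a placeholder):
# combined <- combine_runs("path/to/run/exports")
```

Because the run name comes from the file name, it is only unique if the file names are; runs that share a name would no longer be distinguishable after combining.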

Sometimes we have to exclude a few wells due to operator or technical errors. In a separate file called “wells_to_exclude.csv”, I list the Run and Well ID of each well to exclude.

| Run | Well | Comments |
|---|---|---|
| VP-xxxx-yyy_001 | A08 | Operator error |
| VP-xxxx-yyy_002 | A12 | Forgot to add template |

These wells are removed at the start of every script. For all of my analyses, the first 10 lines of my R code are the same: the data from the PCR runs are read into a data frame, and then the bad wells are excluded.

Here’s the code:


# libraries that I use a lot
library(tolerance)
library(plyr)
library(EnvStats)

# folder to print all my output
folder <- "C:\\Users\\pauline\\Documents\\Fusion\\AnalyticalSpecificity\\"

# file containing all of the runs combined for this particular experiment.
# See format in Table 2 above.
filename <- "C:\\Users\\pauline\\Documents\\Fusion\\EDTA_EtOH_combined.csv"

# read in data
whole_df <- read.csv(filename, header=TRUE)

# make a unique id combining the Run ID and Well ID
whole_df[,"UniqueID"] <- paste(whole_df[,"Run"], whole_df[,"Well"])

# remove problematic wells
df_problem <- read.csv("C:\\Users\\pauline\\Documents\\Fusion\\wells_to_exclude.csv", header=TRUE)
problem_samples <- paste(df_problem[,"Run"], df_problem[,"Well"])
whole_df <- whole_df[!(whole_df$UniqueID %in% problem_samples), ]

whole_df is a data frame containing all the data, excluding the problematic wells.
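One thing the filter cannot catch is a typo in the exclusion file: a Run or Well that matches nothing is silently ignored. The sketch below wraps the same paste / `%in%` logic in a hypothetical helper, `exclude_wells`, that also warns about unmatched entries. The toy data frames are made up for illustration and are not real run data.

```r
# Hypothetical helper: drop excluded wells and warn about exclusion entries
# that do not match any Run/Well pair in the data.
exclude_wells <- function(data_df, exclude_df) {
  data_ids <- paste(data_df[, "Run"], data_df[, "Well"])
  exclude_ids <- paste(exclude_df[, "Run"], exclude_df[, "Well"])
  unmatched <- setdiff(exclude_ids, data_ids)
  if (length(unmatched) > 0) {
    warning("Exclusion entries not found in data: ",
            paste(unmatched, collapse = ", "))
  }
  data_df[!(data_ids %in% exclude_ids), , drop = FALSE]
}

# Toy data (made up): three wells, one of which is listed for exclusion.
toy_df <- data.frame(Run = c("run_001", "run_001", "run_002"),
                     Well = c("A01", "A08", "A12"),
                     Cq = c(28.2, 31.0, NaN))
toy_exclude <- data.frame(Run = "run_001", Well = "A08",
                          Comments = "Operator error")

kept <- exclude_wells(toy_df, toy_exclude)
```

After this call, `kept` holds the two wells that were not listed, and no warning is raised because the listed well was found in the data.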

To call a fusion, we use ΔCt. For the same well, we have to calculate ΔCt = FAM Ct – Texas Red Ct.

In all my scripts, I’ll create a new data frame called “channels”, where ΔCt is calculated by merging FAM and Texas Red into the same row (based on run and well) and then doing the subtraction.


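# df is the working data frame (assumed here to be whole_df or a subset of it).
# Spikein_Level and Sample_no_number are extra columns that are not in the raw
# export shown above; leave them out if your data frame doesn't have them.

# get Texas Red values & other important stuff
df_TexasRed <- df[df$Fluor == "Texas Red", c("UniqueID", "Fluor", "Target", "Sample", "Spikein_Level", "Sample_no_number", "Content", "Cq")]

# get FAM values only
df_FAM <- df[df$Fluor == "FAM", c("UniqueID", "Fluor", "Cq")]

# merge into same row, based on UniqueID (run & well)
# after the merge, Cq.x is the Texas Red Cq and Cq.y is the FAM Cq
channels <- merge(df_TexasRed, df_FAM, by="UniqueID")

# calculate deltaCq = FAM Cq - Texas Red Cq
channels$deltaCq <- channels$Cq.y - channels$Cq.x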

If I'm lazy and want to reuse the functions written for the FAM and Texas Red channels, which expect a "Ct" column, I might also create channels["Ct"] and assign it the ΔCt values.
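To make the arithmetic concrete, here is a small worked sketch of the merge-and-subtract step on made-up values (not real run data). It assumes only the UniqueID, Fluor and Cq columns, and it uses explicit merge suffixes instead of the default `.x` / `.y` so the column names say which channel they came from.

```r
# Toy data (made up): two wells, each with a FAM and a Texas Red reading.
# The second well has no Texas Red amplification, so its Cq is NaN.
toy <- data.frame(
  UniqueID = c("run_001 A01", "run_001 A01", "run_001 A02", "run_001 A02"),
  Fluor = c("FAM", "Texas Red", "FAM", "Texas Red"),
  Cq = c(28.0, 30.5, 27.0, NaN),
  stringsAsFactors = FALSE
)

toy_TexasRed <- toy[toy$Fluor == "Texas Red", c("UniqueID", "Cq")]
toy_FAM <- toy[toy$Fluor == "FAM", c("UniqueID", "Cq")]

# One row per well, with both Cq values side by side
toy_channels <- merge(toy_TexasRed, toy_FAM, by = "UniqueID",
                      suffixes = c(".TexasRed", ".FAM"))

# deltaCq = FAM Cq - Texas Red Cq
toy_channels$deltaCq <- toy_channels$Cq.FAM - toy_channels$Cq.TexasRed

# Optional alias so functions that expect a "Ct" column can be reused
toy_channels$Ct <- toy_channels$deltaCq
```

For the first well this gives 28.0 - 30.5 = -2.5. For the second well the result is NaN, because any subtraction involving NaN is NaN in R, so wells without a Texas Red Cq need to be handled explicitly before a fusion is called.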
