Title: | Optimal Decision Trees Algorithm |
---|---|
Description: | Implements a tree-based method specifically designed for personalized medicine applications. By using genomic and mutational data, 'ODT' efficiently identifies optimal drug recommendations tailored to individual patient profiles. The 'ODT' algorithm constructs decision trees that bifurcate at each node, selecting the most relevant markers (discrete or continuous) and corresponding treatments, thus ensuring that recommendations are both personalized and statistically robust. This iterative approach enhances therapeutic decision-making by refining treatment suggestions until a predefined group size is achieved. Moreover, the simplicity and interpretability of the resulting trees make the method accessible to healthcare professionals. Includes functions for training the decision tree, making predictions on new samples or patients, and visualizing the resulting tree. For detailed insights into the methodology, please refer to Gimeno et al. (2023) <doi:10.1093/bib/bbad200>. |
Authors: | Maddi Eceiza [aut], Lucia Ruiz [aut], Angel Rubio [aut], Katyna Sada Del Real [aut, cre] |
Maintainer: | Katyna Sada Del Real <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.0.1 |
Built: | 2025-02-18 06:13:32 UTC |
Source: | https://github.com/katynasada/odt |
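The Description above says that each node selects a marker together with the treatments for its two branches. The following base R sketch illustrates that idea for a single split on binary markers. It is not the package's implementation: the toy data, the helper `split_cost`, and the criterion (give each branch the drug with the lowest summed response and pick the marker with the lowest total) are assumptions made for illustration only.

```r
# Toy data: 6 patients, 2 binary markers, 3 drugs (lower response = more sensitive).
markers <- matrix(c(1, 1, 1, 0, 0, 0,
                    1, 0, 1, 0, 1, 0),
                  ncol = 2,
                  dimnames = list(paste0("P", 1:6), c("geneA", "geneB")))
response <- matrix(c(1, 1, 1, 5, 5, 5,
                     5, 5, 5, 1, 1, 1,
                     3, 3, 3, 3, 3, 3),
                   ncol = 3,
                   dimnames = list(paste0("P", 1:6), c("drugX", "drugY", "drugZ")))

# Hypothetical helper: for one marker, give each branch the drug with the
# lowest summed response and return the total cost of that split.
split_cost <- function(marker, response) {
  present <- marker == 1
  cost_present <- colSums(response[present, , drop = FALSE])
  cost_absent  <- colSums(response[!present, , drop = FALSE])
  list(cost = min(cost_present) + min(cost_absent),
       drug_present = names(which.min(cost_present)),
       drug_absent  = names(which.min(cost_absent)))
}

# Evaluate every marker and keep the one with the lowest total cost.
splits <- lapply(colnames(markers), function(m) split_cost(markers[, m], response))
names(splits) <- colnames(markers)
costs <- vapply(splits, function(s) s$cost, numeric(1))
best_marker <- names(which.min(costs))
best_split <- splits[[best_marker]]
```

In this toy example `geneA` separates the patients who respond to `drugX` from those who respond to `drugY`, so it is preferred over `geneB`. In a full tree the same step would be repeated within each branch until the groups reach the predefined size.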
A matrix containing drug response values (IC50 values) obtained from patients in Waves 1 and 2 of the BeatAML2 cohort. This dataset serves as a toy example for demonstrating the functionality of the trainTree, predictTree, and niceTree functions.
data("drug_response_w12")
The format is: num [1:247, 1:119] 2.710983 2.8755433 3.4390103 2.6527257...
data(drug_response_w12)
A matrix containing drug response values (IC50 values) obtained from patients in Waves 3 and 4 of the BeatAML2 cohort. This dataset serves as a toy example for demonstrating the functionality of the trainTree, predictTree, and niceTree functions.
data("drug_response_w34")
The format is: num [1:142, 1:119] 3.4156359 3.2345985 3.1836058 3.7874252...
data(drug_response_w34)
A data frame containing gene expression values obtained from patients in Waves 1 and 2 of the BeatAML cohort. This dataset serves as a toy example for demonstrating the functionality of the trainTree, predictTree, and niceTree functions.
data("expression_w12")
A dataframe consisting of 247 rows and 1000 columns, where each row represents a different patient and each column corresponds to the expression levels of a specific gene. The entries in the data frame are floating-point values, indicating the gene expression levels measured for each patient.
data(expression_w12)
A data frame containing gene expression values obtained from patients in Waves 3 and 4 of the BeatAML cohort. This dataset serves as a toy example for demonstrating the functionality of the trainTree, predictTree, and niceTree functions.
data("expression_w34")
A dataframe consisting of 142 rows and 1000 columns, where each row represents a different patient and each column corresponds to the expression levels of a specific gene. The entries in the data frame are floating-point values, indicating the gene expression levels measured for each patient.
data(expression_w34)
A binary matrix representing mutation status for patients from Waves 1 and 2 of the BeatAML cohort, indicating whether specific mutations are present (1) or absent (0) in each patient. This dataset serves as a toy example for demonstrating the functionality of the trainTree, predictTree, and niceTree functions.
data("mutations_w12")
A binary matrix consisting of 247 rows and 70 columns, where each row represents a different patient and each column corresponds to a specific mutation.
The format is as follows: num [1:247, 1:70] 0 0 0 0 0 1 ...
data(mutations_w12)
A binary matrix representing mutation status for patients from Waves 3 and 4 of the BeatAML cohort, indicating whether specific mutations are present (1) or absent (0) in each patient. This dataset serves as a toy example for demonstrating the functionality of the trainTree, predictTree, and niceTree functions.
data("mutations_w34")
A binary matrix consisting of 142 rows and 70 columns, where each row represents a different patient and each column corresponds to a specific mutation.
The format is as follows: num [1:142, 1:70] 0 0 0 0 0 1 ...
data(mutations_w34)
A graphical display of the tree. It can also be saved as an image in the selected directory.
niceTree( tree, folder = NULL, colors = c("", "#367592", "#39A7AE", "#96D6B6", "#FDE5B0", "#F3908B", "#E36192", "#8E4884", "#A83333"), fontname = "Roboto", fontstyle = "plain", shape = "diamond", output_format = "png" )
tree | A 'party' object containing the trained tree, with the treatments assigned to each node. |
folder | Directory to save the image (default is the current working directory). |
colors | A vector of colors for the boxes. Can include hex color codes (e.g., "#FFFFFF"). |
fontname | The name of the font to use for the text labels (default is "Roboto"). |
fontstyle | The style of the font (e.g., "plain", "italic", "bold"). |
shape | The shape of the boxes for the different genes (e.g., "diamond", "box"). |
output_format | The image format for saving (e.g., "png", "jpg", "svg", "pdf"). |
A default style is already defined for the plot; any parameter that is not modified in the call to niceTree keeps its default value.
(Invisibly) returns a list. The tree is also printed in the console and displayed as a plot.
# Basic example of how to perform niceTree:
data("mutations_w12")
data("drug_response_w12")
ODTmut <- trainTree(PatientData = mutations_w12,
                    PatientSensitivity = drug_response_w12, minbucket = 10)
niceTree(ODTmut)

# Example for plotting the tree trained for gene expressions:
data("expression_w34")
data("drug_response_w34")
ODTExp <- trainTree(PatientData = expression_w34,
                    PatientSensitivity = drug_response_w34, minbucket = 20)
niceTree(ODTExp)
This function utilizes a trained decision tree model (ODT) to predict treatment outcomes for test data based on patient sensitivity data and features, such as mutations or gene expression profiles.
predictTree(tree, PatientData, PatientSensitivityTrain)
tree | A trained decision tree object created by the 'trainTree' function. |
PatientData | A matrix representing patient features, where rows correspond to patients/samples and columns correspond to genes/features. This matrix can contain binary mutation data (presence or absence of mutations) or continuous gene expression data. |
PatientSensitivityTrain | A matrix containing the drug response values of the **training dataset**. In this matrix, rows correspond to patients, and columns correspond to drugs. This matrix is used solely for extracting treatment names and is not used in the prediction process itself. |
A factor containing the treatment assigned to each patient in 'PatientData', determined by the node of the decision tree into which the patient falls. The treatment names are taken from the columns of 'PatientSensitivityTrain'.
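Once treatments have been predicted, a common next step is to look up how each patient responded to the recommended drug. The base R sketch below shows one way to do this. It uses made-up values and a stand-in factor named `predicted` in place of a real predictTree result, and it assumes that the returned factor holds one treatment per patient, in the same row order as the response matrix.

```r
# Toy response matrix: rows are patients, columns are drugs (hypothetical values).
response <- matrix(c(2.1, 3.4, 1.8,
                     3.0, 1.2, 2.7),
                   nrow = 3,
                   dimnames = list(c("P1", "P2", "P3"), c("drugA", "drugB")))

# Stand-in for the factor returned by predictTree(): one treatment per patient.
predicted <- factor(c("drugA", "drugB", "drugA"), levels = colnames(response))

# Count how often each treatment is recommended.
recommendation_counts <- table(predicted)

# Pick, for each patient, the response to the recommended drug by
# indexing the matrix with (row, column) pairs.
idx <- cbind(seq_len(nrow(response)),
             match(as.character(predicted), colnames(response)))
response_to_recommended <- response[idx]
names(response_to_recommended) <- rownames(response)
```

Comparing `response_to_recommended` with, for example, the per-patient minimum of each row gives a rough sense of how close the recommendations are to the best available drug.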
# Example 1: Prediction using mutation data
data("mutations_w12")
data("mutations_w34")
data("drug_response_w12")
ODTmut <- trainTree(PatientData = mutations_w12,
                    PatientSensitivity = drug_response_w12, minbucket = 10)
ODTmut
ODT_mutpred <- predictTree(tree = ODTmut,
                           PatientSensitivityTrain = drug_response_w12,
                           PatientData = mutations_w34)

# Example 2: Prediction using gene expression data
data("expression_w34")
data("expression_w12")
data("drug_response_w34")
ODTExp <- trainTree(PatientData = expression_w34,
                    PatientSensitivity = drug_response_w34, minbucket = 20)
ODTExp
ODT_EXPpred <- predictTree(tree = ODTExp,
                           PatientSensitivityTrain = drug_response_w34,
                           PatientData = expression_w12)
This function trains a decision tree model based on patient data, which can either be gene expression levels or a binary matrix indicating mutations.
trainTree(PatientData, PatientSensitivity, minbucket = 20)
PatientData | A matrix representing patient features, where rows correspond to patients/samples and columns correspond to genes/features. This matrix can contain binary mutation data (presence or absence of mutations) or continuous gene expression data. |
PatientSensitivity | A matrix representing drug response values, where rows correspond to patients in the same order as in 'PatientData', and columns correspond to drugs. Higher values indicate greater drug resistance and, consequently, lower sensitivity to treatment. This matrix can represent various measures of drug response, such as IC50 values or area under the drug response curve (AUC). Depending on the interpretation of these values, users may need to adjust the sign of this data. |
minbucket | An integer specifying the minimum number of patients required in a node to allow for a split. |
An object of class 'party' representing the trained decision tree, with the assigned treatments for each node.
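The arguments above require that 'PatientSensitivity' has its rows in the same order as 'PatientData' and that higher values mean greater resistance. The base R sketch below shows one way to prepare inputs that do not yet meet these requirements. The toy matrices and all variable names are hypothetical, and the sketch assumes that patient identifiers are stored as row names.

```r
# Hypothetical toy inputs; patient IDs are stored as row names.
patient_data <- matrix(c(0, 1, 1,
                         1, 0, 1),
                       nrow = 3,
                       dimnames = list(c("P1", "P2", "P3"), c("mutA", "mutB")))

# A score where higher means MORE sensitive, with rows in a different order.
sensitivity_score <- matrix(c(0.9, 0.2, 0.5,
                              0.1, 0.8, 0.4),
                            nrow = 3,
                            dimnames = list(c("P3", "P1", "P2"), c("drugA", "drugB")))

# Keep only patients present in both inputs and put them in the same order.
common <- intersect(rownames(patient_data), rownames(sensitivity_score))
patient_data <- patient_data[common, , drop = FALSE]
sensitivity_score <- sensitivity_score[common, , drop = FALSE]

# Flip the sign so that higher values mean greater resistance.
patient_sensitivity <- -sensitivity_score
```

After these steps, `patient_data` and `patient_sensitivity` could be passed as 'PatientData' and 'PatientSensitivity'. Measures such as IC50, where higher values already indicate resistance, would not need the sign flip.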
# Basic example of using the trainTree function with mutational data
data("drug_response_w12")
data("mutations_w12")
ODTmut <- trainTree(PatientData = mutations_w12,
                    PatientSensitivity = drug_response_w12, minbucket = 10)
plot(ODTmut)

# Example using gene expression data instead
data("drug_response_w34")
data("expression_w34")
ODTExp <- trainTree(PatientData = expression_w34,
                    PatientSensitivity = drug_response_w34, minbucket = 20)
plot(ODTExp)