Evaluation of Multiple Models to Distinguish Closely Related Forms of Disease Using DNA Microarray Data: an Application to Multiple Myeloma

Authors / Contributors:
[Johanna Hardin, Michael Waddell, C. David Page, Fenghuang Zhan, Bart Barlogie, John Shaughnessy, John J Crowley]
Place, Publisher, Year:
2004
Contained in:
Statistical Applications in Genetics and Molecular Biology, 3/1(2004-06-08), 1-21
Format:
Article (online)
ID: 378926098
LEADER caa a22 4500
001 378926098
003 CHVBK
005 20180305123617.0
007 cr unu---uuuuu
008 161128e20040608xx s 000 0 eng
024 7 0 |a 10.2202/1544-6115.1018  |2 doi 
035 |a (NATIONALLICENCE)gruyter-10.2202/1544-6115.1018 
245 0 0 |a Evaluation of Multiple Models to Distinguish Closely Related Forms of Disease Using DNA Microarray Data: an Application to Multiple Myeloma  |h [Electronic data]  |c [Johanna Hardin, Michael Waddell, C. David Page, Fenghuang Zhan, Bart Barlogie, John Shaughnessy, John J Crowley] 
520 3 |a Motivation: Standard laboratory classification of the plasma cell dyscrasia monoclonal gammopathy of undetermined significance (MGUS) and the overt plasma cell neoplasm multiple myeloma (MM) is quite accurate, yet, for the most part, biologically uninformative. Most, if not all, cancers are caused by inherited or acquired genetic mutations that manifest themselves in altered gene expression patterns in the clonally related cancer cells. Microarray technology allows for qualitative and quantitative measurements of the expression levels of thousands of genes simultaneously, and it has now been used both to classify cancers that are morphologically indistinguishable and to predict response to therapy. It is anticipated that this information can also be used to develop molecular diagnostic models and to provide insight into mechanisms of disease progression, e.g., transition from healthy to benign hyperplasia or conversion of a benign hyperplasia to overt malignancy. However, standard data analysis techniques are not trivial to employ on these large data sets. Methodology designed to handle large data sets (or modified to do so) is needed to access the vital information contained in the genetic samples, which in turn can be used to develop more robust and accurate methods of clinical diagnostics and prognostics. Results: Here we report on the application of a panel of statistical and data mining methodologies to classify groups of samples based on expression of 12,000 genes derived from a high density oligonucleotide microarray analysis of highly purified plasma cells from newly diagnosed MM, MGUS, and normal healthy donors. The three groups of samples are each tested against each other. The methods are found to be similar in their ability to predict group membership; all do quite well at predicting MM vs. normal and MGUS vs. normal. However, no method appears to be able to distinguish explicitly the genetic mechanisms between MM and MGUS. 
We believe this might be due to the lack of genetic differences between these two conditions, and may not be due to the failure of the models. We report the prediction errors for each of the models and each of the methods. Additionally, we report ROC curves for the results on group prediction. Availability: Logistic regression: standard software, available, for example in SAS. Decision trees and boosted trees: C5.0 from www.rulequest.com. SVM: SVM-light is publicly available from svmlight.joachims.org. Naïve Bayes and ensemble of voters are publicly available from www.biostat.wisc.edu/~mwaddell/eov.html. Nearest Shrunken Centroids is publicly available from http://www-stat.stanford.edu/~tibs/PAM. 
540 |a ©2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston 
690 7 |a Multivariate Analysis  |2 nationallicence 
690 7 |a Microarrays  |2 nationallicence 
690 7 |a Microarray  |2 nationallicence 
690 7 |a Logistic Regression  |2 nationallicence 
690 7 |a Boosted Decision Trees  |2 nationallicence 
690 7 |a Ensemble of Voters  |2 nationallicence 
690 7 |a Support Vector Machines  |2 nationallicence 
690 7 |a Nearest Shrunken Centroid  |2 nationallicence 
690 7 |a Multiple Myeloma  |2 nationallicence 
690 7 |a MGUS  |2 nationallicence 
700 1 |a Hardin  |D Johanna  |u Pomona College  |4 aut 
700 1 |a Waddell  |D Michael  |u University of Wisconsin, Madison  |4 aut 
700 1 |a Page  |D C. David  |u University of Wisconsin  |4 aut 
700 1 |a Zhan  |D Fenghuang  |u University of Arkansas for Medical Sciences, Little Rock  |4 aut 
700 1 |a Barlogie  |D Bart  |u University of Arkansas  |4 aut 
700 1 |a Shaughnessy  |D John  |u University of Arkansas for Medical Sciences  |4 aut 
700 1 |a Crowley  |D John J.  |u Cancer Research And Biostatistics  |4 aut 
773 0 |t Statistical Applications in Genetics and Molecular Biology  |d De Gruyter  |g 3/1(2004-06-08), 1-21  |q 3:1<1  |1 2004  |2 3  |o sagmb 
856 4 0 |u https://doi.org/10.2202/1544-6115.1018  |q text/html  |z Online access via DOI 
908 |D 1  |a research article  |2 jats 
950 |B NATIONALLICENCE  |P 856  |E 40  |u https://doi.org/10.2202/1544-6115.1018  |q text/html  |z Online access via DOI 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Hardin  |D Johanna  |u Pomona College  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Waddell  |D Michael  |u University of Wisconsin, Madison  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Page  |D C. David  |u University of Wisconsin  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Zhan  |D Fenghuang  |u University of Arkansas for Medical Sciences, Little Rock  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Barlogie  |D Bart  |u University of Arkansas  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Shaughnessy  |D John  |u University of Arkansas for Medical Sciences  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Crowley  |D John J.  |u Cancer Research And Biostatistics  |4 aut 
950 |B NATIONALLICENCE  |P 773  |E 0-  |t Statistical Applications in Genetics and Molecular Biology  |d De Gruyter  |g 3/1(2004-06-08), 1-21  |q 3:1<1  |1 2004  |2 3  |o sagmb 
900 7 |b CC0  |u http://creativecommons.org/publicdomain/zero/1.0  |2 nationallicence 
898 |a BK010053  |b XK010053  |c XK010000 
949 |B NATIONALLICENCE  |F NATIONALLICENCE  |b NL-gruyter