A computational approach to identification and comparison of cell subsets in flow cytometry data [electronic resource]
- Noah Zimmerman.
- Physical description
- 1 online resource.
- Call number: 3781 2011 Z (in-library use)
- Zimmerman, Noah.
- Das, Amar K. (Amar Kumar), primary advisor.
- Walther, Guenther, primary advisor.
- Herzenberg, Leonore A., advisor.
- Stanford University Program in Biomedical Informatics.
- Changes in frequency and/or biomarker expression in small subsets of peripheral blood cells provide key diagnostics for disease presence, status and prognosis. At present, flow cytometry instruments that measure the joint expression of up to 20 markers in/on large numbers of individual cells are used to measure surface and internal marker expression. This technology is routinely used to determine the frequencies of various marker-defined cell subsets in patient samples and is often used to inform therapeutic decision-making. Nevertheless, quantitative methods for comparing data between samples are sorely lacking. There are no reliable computational methods for determining the magnitude of differences among samples from different patients, among samples obtained from the same patient on different days, or between aliquots of the same sample measured before and after response to stimulation or other treatment. This thesis describes novel computational methods that provide reliable indices of change in subset representation and/or marker expression by individual subsets of cells. The methods we have developed utilize a non-parametric clustering algorithm, Density-Based Merging (DBM), that we developed to identify subsets (clusters) of cells that express a common set of markers measured independently for each cell by flow cytometry. To quantitate differences between these subsets, we introduce the application of Earth Mover's Distance (EMD), an algorithm borrowed from the image retrieval literature that is used to compare multivariate distributions. The resultant methods are highly sensitive and reliable for identifying small marker expression differences between subsets of cells in flow cytometry data sets. We show that these methods are easily applied and readily interpreted.
Importantly, we demonstrate their practical utility with data from an allergy study in which the expression of two markers on very rare blood cells (basophils) in response to stimulation with an offending allergen indicates whether the patient is allergic to the stimulating antigen. In addition, we have developed novel evaluation criteria for assessing the performance of clustering algorithms on flow cytometry data by combining mixtures of cells identifiable by dimensions "hidden" from the algorithm that provide true cluster membership. Thus, we expect that the methods described here will introduce a new approach to using flow cytometry to measure biomarker changes as indices of drug response, disease susceptibility, disease progress and prognosis.
- Publication date: 2011
- Submitted to the Program in Biomedical Informatics.
- Thesis (Ph.D.)--Stanford University, 2011.
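
The abstract describes using Earth Mover's Distance (EMD) to quantify how far marker expression shifts between two cell subsets. As a rough illustration of that idea only, the sketch below computes EMD for the simplest case: two equal-size, one-dimensional samples with uniform weights, where EMD reduces to the mean absolute difference between the sorted values. The function name `emd_1d` and the sample values are hypothetical and are not taken from the thesis, which works with multivariate distributions and may use a different formulation.

```python
def emd_1d(sample_a, sample_b):
    """Earth Mover's Distance between two equal-size 1-D samples.

    Assumption: every observation carries equal weight, so the optimal
    transport plan pairs the i-th smallest value of one sample with the
    i-th smallest value of the other.
    """
    if not sample_a or len(sample_a) != len(sample_b):
        raise ValueError("samples must be non-empty and of equal size")
    a = sorted(sample_a)
    b = sorted(sample_b)
    # Average amount of "earth" moved per observation.
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical marker intensities for one cell subset (illustrative values only).
unstimulated = [1.0, 1.2, 0.9, 1.1]
stimulated = [1.5, 1.7, 1.4, 1.6]

shift = emd_1d(unstimulated, stimulated)
print(f"EMD between unstimulated and stimulated samples: {shift:.3f}")
```

A larger value indicates that more expression "mass" has to move further to turn one distribution into the other; identical samples give zero regardless of ordering.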