DigitalCommons@Kennesaw State University

Doctor of Data Science and Analytics Dissertations

The Ph.D. in Data Science and Analytics is an advanced degree with a dual focus on application and research: students engage with real-world business problems, which inform and guide their research interests.

We launched the first formal PhD program in Data Science in 2015. Our program sits at the intersection of computer science, statistics, mathematics, and business. Our students engage in relevant research with faculty from across our eleven colleges. As one of the institutions at the forefront of the development of data science as an academic discipline, we are committed to developing the next generation of Data Science leaders, researchers, and educators. Culturally, we are committed to the discipline of Data Science through ethical practices, attention to fairness, a diverse student body, academic excellence, and research that makes positive contributions to our local, regional, and global community.

-Sherry Ni, Director, Ph.D. in Data Science and Analytics

This degree trains individuals to facilitate innovative research and to translate complex structured and unstructured data into information that improves decision making. The curriculum places heavy emphasis on programming, data mining, statistical modeling, and the mathematical foundations that support these concepts. Importantly, the program also emphasizes communication skills – both oral and written – as well as applying results to business and research problems.

Dissertations from 2023

Quantification of Various Types of Biases in Large Language Models, Sudhashree Sayenju

Dissertations from 2022

Appley: Approximate Shapley Values for Model Explainability in Linear Time, Md Shafiul Alam

Ethical Analytics: A Framework for a Practically-Oriented Sub-Discipline of AI Ethics, Jonathan Boardman

Novel Instance-Level Weighted Loss Function for Imbalanced Learning, Trent Geisler

Debiasing Cyber Incidents – Correcting for Reporting Delays and Under-reporting, Seema Sangari

Dissertations from 2021

Integrated Machine Learning Approaches to Improve Classification performance and Feature Extraction Process for EEG Dataset, Mohammad Masum

A Distance-Based Clustering Framework for Categorical Time Series: A Case Study in Episodes of Care Healthcare Delivery System, Lauren Staples

Dissertations from 2020

A CREDIT ANALYSIS OF THE UNBANKED AND UNDERBANKED: AN ARGUMENT FOR ALTERNATIVE DATA, Edwin Baidoo

Quantitatively Motivated Model Development Framework: Downstream Analysis Effects of Normalization Strategies, Jessica M. Rudd

Data-driven Investment Decisions in P2P Lending: Strategies of Integrating Credit Scoring and Profit Scoring, Yan Wang

A Novel Penalized Log-likelihood Function for Class Imbalance Problem, Lili Zhang

ATTACK AND DEFENSE IN SECURITY ANALYTICS, Yiyun Zhou

Dissertations from 2019

One and Two-Step Estimation of Time Variant Parameters and Nonparametric Quantiles, Bogdan Gadidov

Biologically Interpretable, Integrative Deep Learning for Cancer Survival Analysis, Jie Hao

Deep Embedding Kernel, Linh Le

Ordinal HyperPlane Loss, Bob Vanderheyden

DigitalCommons@Kennesaw State University, ISSN: 2576-6805

Machine Learning - CMU

PhD Dissertations

[all are .pdf files].

Learning Models that Match Jacob Tyo, 2024

Improving Human Integration across the Machine Learning Pipeline Charvi Rastogi, 2024

Reliable and Practical Machine Learning for Dynamic Healthcare Settings Helen Zhou, 2023

Automatic customization of large-scale spiking network models to neuronal population activity (unavailable) Shenghao Wu, 2023

Estimation of BVk functions from scattered data (unavailable) Addison J. Hu, 2023

Rethinking object categorization in computer vision (unavailable) Jayanth Koushik, 2023

Advances in Statistical Gene Networks Jinjin Tian, 2023

Post-hoc calibration without distributional assumptions Chirag Gupta, 2023

The Role of Noise, Proxies, and Dynamics in Algorithmic Fairness Nil-Jana Akpinar, 2023

Collaborative learning by leveraging siloed data Sebastian Caldas, 2023

Modeling Epidemiological Time Series Aaron Rumack, 2023

Human-Centered Machine Learning: A Statistical and Algorithmic Perspective Leqi Liu, 2023

Uncertainty Quantification under Distribution Shifts Aleksandr Podkopaev, 2023

Probabilistic Reinforcement Learning: Using Data to Define Desired Outcomes, and Inferring How to Get There Benjamin Eysenbach, 2023

Comparing Forecasters and Abstaining Classifiers Yo Joong Choe, 2023

Using Task Driven Methods to Uncover Representations of Human Vision and Semantics Aria Yuan Wang, 2023

Data-driven Decisions - An Anomaly Detection Perspective Shubhranshu Shekhar, 2023

Applied Mathematics of the Future Kin G. Olivares, 2023

METHODS AND APPLICATIONS OF EXPLAINABLE MACHINE LEARNING Joon Sik Kim, 2023

NEURAL REASONING FOR QUESTION ANSWERING Haitian Sun, 2023

Principled Machine Learning for Societally Consequential Decision Making Amanda Coston, 2023

Long term brain dynamics extend cognitive neuroscience to timescales relevant for health and physiology Maxwell B. Wang, 2023

Long term brain dynamics extend cognitive neuroscience to timescales relevant for health and physiology Darby M. Losey, 2023

Calibrated Conditional Density Models and Predictive Inference via Local Diagnostics David Zhao, 2023

Towards an Application-based Pipeline for Explainability Gregory Plumb, 2022

Objective Criteria for Explainable Machine Learning Chih-Kuan Yeh, 2022

Making Scientific Peer Review Scientific Ivan Stelmakh, 2022

Facets of regularization in high-dimensional learning: Cross-validation, risk monotonization, and model complexity Pratik Patil, 2022

Active Robot Perception using Programmable Light Curtains Siddharth Ancha, 2022

Strategies for Black-Box and Multi-Objective Optimization Biswajit Paria, 2022

Unifying State and Policy-Level Explanations for Reinforcement Learning Nicholay Topin, 2022

Sensor Fusion Frameworks for Nowcasting Maria Jahja, 2022

Equilibrium Approaches to Modern Deep Learning Shaojie Bai, 2022

Towards General Natural Language Understanding with Probabilistic Worldbuilding Abulhair Saparov, 2022

Applications of Point Process Modeling to Spiking Neurons (Unavailable) Yu Chen, 2021

Neural variability: structure, sources, control, and data augmentation Akash Umakantha, 2021

Structure and time course of neural population activity during learning Jay Hennig, 2021

Cross-view Learning with Limited Supervision Yao-Hung Hubert Tsai, 2021

Meta Reinforcement Learning through Memory Emilio Parisotto, 2021

Learning Embodied Agents with Scalably-Supervised Reinforcement Learning Lisa Lee, 2021

Learning to Predict and Make Decisions under Distribution Shift Yifan Wu, 2021

Statistical Game Theory Arun Sai Suggala, 2021

Towards Knowledge-capable AI: Agents that See, Speak, Act and Know Kenneth Marino, 2021

Learning and Reasoning with Fast Semidefinite Programming and Mixing Methods Po-Wei Wang, 2021

Bridging Language in Machines with Language in the Brain Mariya Toneva, 2021

Curriculum Learning Otilia Stretcu, 2021

Principles of Learning in Multitask Settings: A Probabilistic Perspective Maruan Al-Shedivat, 2021

Towards Robust and Resilient Machine Learning Adarsh Prasad, 2021

Towards Training AI Agents with All Types of Experiences: A Unified ML Formalism Zhiting Hu, 2021

Building Intelligent Autonomous Navigation Agents Devendra Chaplot, 2021

Learning to See by Moving: Self-supervising 3D Scene Representations for Perception, Control, and Visual Reasoning Hsiao-Yu Fish Tung, 2021

Statistical Astrophysics: From Extrasolar Planets to the Large-scale Structure of the Universe Collin Politsch, 2020

Causal Inference with Complex Data Structures and Non-Standard Effects Kwhangho Kim, 2020

Networks, Point Processes, and Networks of Point Processes Neil Spencer, 2020

Dissecting neural variability using population recordings, network models, and neurofeedback (Unavailable) Ryan Williamson, 2020

Predicting Health and Safety: Essays in Machine Learning for Decision Support in the Public Sector Dylan Fitzpatrick, 2020

Towards a Unified Framework for Learning and Reasoning Han Zhao, 2020

Learning DAGs with Continuous Optimization Xun Zheng, 2020

Machine Learning and Multiagent Preferences Ritesh Noothigattu, 2020

Learning and Decision Making from Diverse Forms of Information Yichong Xu, 2020

Towards Data-Efficient Machine Learning Qizhe Xie, 2020

Change modeling for understanding our world and the counterfactual one(s) William Herlands, 2020

Machine Learning in High-Stakes Settings: Risks and Opportunities Maria De-Arteaga, 2020

Data Decomposition for Constrained Visual Learning Calvin Murdock, 2020

Structured Sparse Regression Methods for Learning from High-Dimensional Genomic Data Micol Marchetti-Bowick, 2020

Towards Efficient Automated Machine Learning Liam Li, 2020

LEARNING COLLECTIONS OF FUNCTIONS Emmanouil Antonios Platanios, 2020

Provable, structured, and efficient methods for robustness of deep networks to adversarial examples Eric Wong, 2020

Reconstructing and Mining Signals: Algorithms and Applications Hyun Ah Song, 2020

Probabilistic Single Cell Lineage Tracing Chieh Lin, 2020

Graphical network modeling of phase coupling in brain activity (unavailable) Josue Orellana, 2019

Strategic Exploration in Reinforcement Learning - New Algorithms and Learning Guarantees Christoph Dann, 2019

Learning Generative Models using Transformations Chun-Liang Li, 2019

Estimating Probability Distributions and their Properties Shashank Singh, 2019

Post-Inference Methods for Scalable Probabilistic Modeling and Sequential Decision Making Willie Neiswanger, 2019

Accelerating Text-as-Data Research in Computational Social Science Dallas Card, 2019

Multi-view Relationships for Analytics and Inference Eric Lei, 2019

Information flow in networks based on nonstationary multivariate neural recordings Natalie Klein, 2019

Competitive Analysis for Machine Learning & Data Science Michael Spece, 2019

The When, Where and Why of Human Memory Retrieval Qiong Zhang, 2019

Towards Effective and Efficient Learning at Scale Adams Wei Yu, 2019

Towards Literate Artificial Intelligence Mrinmaya Sachan, 2019

Learning Gene Networks Underlying Clinical Phenotypes Under SNP Perturbations From Genome-Wide Data Calvin McCarter, 2019

Unified Models for Dynamical Systems Carlton Downey, 2019

Anytime Prediction and Learning for the Balance between Computation and Accuracy Hanzhang Hu, 2019

Statistical and Computational Properties of Some "User-Friendly" Methods for High-Dimensional Estimation Alnur Ali, 2019

Nonparametric Methods with Total Variation Type Regularization Veeranjaneyulu Sadhanala, 2019

New Advances in Sparse Learning, Deep Networks, and Adversarial Learning: Theory and Applications Hongyang Zhang, 2019

Gradient Descent for Non-convex Problems in Modern Machine Learning Simon Shaolei Du, 2019

Selective Data Acquisition in Learning and Decision Making Problems Yining Wang, 2019

Anomaly Detection in Graphs and Time Series: Algorithms and Applications Bryan Hooi, 2019

Neural dynamics and interactions in the human ventral visual pathway Yuanning Li, 2018

Tuning Hyperparameters without Grad Students: Scaling up Bandit Optimisation Kirthevasan Kandasamy, 2018

Teaching Machines to Classify from Natural Language Interactions Shashank Srivastava, 2018

Statistical Inference for Geometric Data Jisu Kim, 2018

Representation Learning @ Scale Manzil Zaheer, 2018

Diversity-promoting and Large-scale Machine Learning for Healthcare Pengtao Xie, 2018

Distribution and Histogram (DIsH) Learning Junier Oliva, 2018

Stress Detection for Keystroke Dynamics Shing-Hon Lau, 2018

Sublinear-Time Learning and Inference for High-Dimensional Models Enxu Yan, 2018

Neural population activity in the visual cortex: Statistical methods and application Benjamin Cowley, 2018

Efficient Methods for Prediction and Control in Partially Observable Environments Ahmed Hefny, 2018

Learning with Staleness Wei Dai, 2018

Statistical Approach for Functionally Validating Transcription Factor Bindings Using Population SNP and Gene Expression Data Jing Xiang, 2017

New Paradigms and Optimality Guarantees in Statistical Learning and Estimation Yu-Xiang Wang, 2017

Dynamic Question Ordering: Obtaining Useful Information While Reducing User Burden Kirstin Early, 2017

New Optimization Methods for Modern Machine Learning Sashank J. Reddi, 2017

Active Search with Complex Actions and Rewards Yifei Ma, 2017

Why Machine Learning Works George D. Montañez, 2017

Source-Space Analyses in MEG/EEG and Applications to Explore Spatio-temporal Neural Dynamics in Human Vision Ying Yang, 2017

Computational Tools for Identification and Analysis of Neuronal Population Activity Pengcheng Zhou, 2016

Expressive Collaborative Music Performance via Machine Learning Gus (Guangyu) Xia, 2016

Supervision Beyond Manual Annotations for Learning Visual Representations Carl Doersch, 2016

Exploring Weakly Labeled Data Across the Noise-Bias Spectrum Robert W. H. Fisher, 2016

Optimizing Optimization: Scalable Convex Programming with Proximal Operators Matt Wytock, 2016

Combining Neural Population Recordings: Theory and Application William Bishop, 2015

Discovering Compact and Informative Structures through Data Partitioning Madalina Fiterau-Brostean, 2015

Machine Learning in Space and Time Seth R. Flaxman, 2015

The Time and Location of Natural Reading Processes in the Brain Leila Wehbe, 2015

Shape-Constrained Estimation in High Dimensions Min Xu, 2015

Spectral Probabilistic Modeling and Applications to Natural Language Processing Ankur Parikh, 2015

Computational and Statistical Advances in Testing and Learning Aaditya Kumar Ramdas, 2015

Corpora and Cognition: The Semantic Composition of Adjectives and Nouns in the Human Brain Alona Fyshe, 2015

Learning Statistical Features of Scene Images Wooyoung Lee, 2014

Towards Scalable Analysis of Images and Videos Bin Zhao, 2014

Statistical Text Analysis for Social Science Brendan T. O'Connor, 2014

Modeling Large Social Networks in Context Qirong Ho, 2014

Semi-Cooperative Learning in Smart Grid Agents Prashant P. Reddy, 2013

On Learning from Collective Data Liang Xiong, 2013

Exploiting Non-sequence Data in Dynamic Model Learning Tzu-Kuo Huang, 2013

Mathematical Theories of Interaction with Oracles Liu Yang, 2013

Short-Sighted Probabilistic Planning Felipe W. Trevizan, 2013

Statistical Models and Algorithms for Studying Hand and Finger Kinematics and their Neural Mechanisms Lucia Castellanos, 2013

Approximation Algorithms and New Models for Clustering and Learning Pranjal Awasthi, 2013

Uncovering Structure in High-Dimensions: Networks and Multi-task Learning Problems Mladen Kolar, 2013

Learning with Sparsity: Structures, Optimization and Applications Xi Chen, 2013

GraphLab: A Distributed Abstraction for Large Scale Machine Learning Yucheng Low, 2013

Graph Structured Normal Means Inference James Sharpnack, 2013 (Joint Statistics & ML PhD)

Probabilistic Models for Collecting, Analyzing, and Modeling Expression Data Hai-Son Phuoc Le, 2013

Learning Large-Scale Conditional Random Fields Joseph K. Bradley, 2013

New Statistical Applications for Differential Privacy Rob Hall, 2013 (Joint Statistics & ML PhD)

Parallel and Distributed Systems for Probabilistic Reasoning Joseph Gonzalez, 2012

Spectral Approaches to Learning Predictive Representations Byron Boots, 2012

Attribute Learning using Joint Human and Machine Computation Edith L. M. Law, 2012

Statistical Methods for Studying Genetic Variation in Populations Suyash Shringarpure, 2012

Data Mining Meets HCI: Making Sense of Large Graphs Duen Horng (Polo) Chau, 2012

Learning with Limited Supervision by Input and Output Coding Yi Zhang, 2012

Target Sequence Clustering Benjamin Shih, 2011

Nonparametric Learning in High Dimensions Han Liu, 2010 (Joint Statistics & ML PhD)

Structural Analysis of Large Networks: Observations and Applications Mary McGlohon, 2010

Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy Brian D. Ziebart, 2010

Tractable Algorithms for Proximity Search on Large Graphs Purnamrita Sarkar, 2010

Rare Category Analysis Jingrui He, 2010

Coupled Semi-Supervised Learning Andrew Carlson, 2010

Fast Algorithms for Querying and Mining Large Graphs Hanghang Tong, 2009

Efficient Matrix Models for Relational Learning Ajit Paul Singh, 2009

Exploiting Domain and Task Regularities for Robust Named Entity Recognition Andrew O. Arnold, 2009

Theoretical Foundations of Active Learning Steve Hanneke, 2009

Generalized Learning Factors Analysis: Improving Cognitive Models with Machine Learning Hao Cen, 2009

Detecting Patterns of Anomalies Kaustav Das, 2009

Dynamics of Large Networks Jurij Leskovec, 2008

Computational Methods for Analyzing and Modeling Gene Regulation Dynamics Jason Ernst, 2008

Stacked Graphical Learning Zhenzhen Kou, 2007

Actively Learning Specific Function Properties with Applications to Statistical Inference Brent Bryan, 2007

Approximate Inference, Structure Learning and Feature Estimation in Markov Random Fields Pradeep Ravikumar, 2007

Scalable Graphical Models for Social Networks Anna Goldenberg, 2007

Measure Concentration of Strongly Mixing Processes with Applications Leonid Kontorovich, 2007

Tools for Graph Mining Deepayan Chakrabarti, 2005

Automatic Discovery of Latent Variable Models Ricardo Silva, 2005

10 Compelling Machine Learning Ph.D. Dissertations for 2020

Machine Learning, Modeling, Research. Posted by Daniel Gutierrez, ODSC, August 19, 2020

As a data scientist, an integral part of my work is keeping current with research coming out of academia. I frequently scour arXiv.org for late-breaking papers that show trends and reveal fertile areas of research. Another valuable source of research developments is the Ph.D. dissertation, the culmination of a doctoral candidate’s work toward the degree. Ph.D. candidates are highly motivated to choose research topics that establish new and creative paths toward discovery in their field of study, and their dissertations are highly focused on a specific problem. If you can find a dissertation that aligns with your areas of interest, reading it is an excellent way to do a deep dive into the technology. After reviewing hundreds of recent theses from universities all over the country, I present 10 machine learning dissertations that I found compelling in terms of my own areas of interest.

I hope you’ll find several that match your own fields of inquiry. Each thesis may take a while to consume but will result in hours of satisfying summer reading. Enjoy!

1. Bayesian Modeling and Variable Selection for Complex Data

As we routinely encounter high-throughput data sets in complex biological and environmental research, developing novel models and methods for variable selection has received widespread attention. This dissertation addresses a few key challenges in Bayesian modeling and variable selection for high-dimensional data with complex spatial structures. 

2. Topics in Statistical Learning with a Focus on Large Scale Data

Big data vary in shape and call for different approaches. One type is tall data, i.e., a very large number of samples but not too many features. This dissertation describes a general communication-efficient algorithm for distributed statistical learning on this type of big data. The algorithm distributes the samples uniformly across multiple machines and uses a common reference data set to improve the performance of local estimates. The algorithm enables potentially much faster analysis, at a small cost to statistical performance.

Another type is wide data, i.e., too many features but a limited number of samples. It is also called high-dimensional data, to which many classical statistical methods are not applicable. 

This dissertation discusses a method of dimensionality reduction for high-dimensional classification. The method partitions features into independent communities and splits the original classification problem into separate smaller ones. It enables parallel computing and produces more interpretable results.
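To make the tall-data setting concrete, here is a minimal sketch of the simplest divide-and-conquer baseline: each machine fits its own shard and the local estimates are averaged. This is not the dissertation's algorithm, which additionally uses a common reference data set. The simulated data, the shard count, and the helper names `ols_slope` and `split_evenly` are all assumptions made for illustration.

```python
import random

def ols_slope(xs, ys):
    """Least-squares slope of y on x (with intercept) for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def split_evenly(items, n_parts):
    """Deal items round-robin into n_parts shards of near-equal size."""
    return [items[i::n_parts] for i in range(n_parts)]

# Simulated tall data: many samples, one feature (hypothetical example).
rng = random.Random(0)
TRUE_SLOPE = 2.0
data = []
for _ in range(10_000):
    x = rng.gauss(0.0, 1.0)
    data.append((x, TRUE_SLOPE * x + rng.gauss(0.0, 1.0)))

# Each "machine" sees only its own shard and reports one number.
shards = split_evenly(data, 10)
local_slopes = [ols_slope([x for x, _ in s], [y for _, y in s]) for s in shards]
averaged_slope = sum(local_slopes) / len(local_slopes)

# Reference: the fit a single machine would obtain on all the data.
full_slope = ols_slope([x for x, _ in data], [y for _, y in data])

print(f"averaged over shards: {averaged_slope:.4f}")
print(f"single-machine fit:   {full_slope:.4f}")
```

Only one number per shard crosses the network, which is where the communication savings come from. The averaged estimate is generally slightly less efficient than the full fit, which corresponds to the "small cost to statistical performance" mentioned above.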

3. Sets as Measures: Optimization and Machine Learning

The purpose of this machine learning dissertation is to address the following simple question:

How do we design efficient algorithms to solve optimization or machine learning problems where the decision variable (or target label) is a set of unknown cardinality?

Optimization and machine learning have proved remarkably successful in applications requiring the choice of single vectors. Some tasks, in particular many inverse problems, call for the design, or estimation, of sets of objects. When the size of these sets is a priori unknown, directly applying optimization or machine learning techniques designed for single vectors appears difficult. The work in this dissertation shows that a very old idea for transforming sets into elements of a vector space (namely, a space of measures), a common trick in theoretical analysis, generates effective practical algorithms.
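As a rough illustration of the idea of treating sets as measures, the sketch below represents a finite set as a sum of unit point masses, so that sets of different sizes live in one space where they can be scaled, added, and integrated against test functions. The function names (`as_measure`, `combine`, `integrate`) are hypothetical and are not taken from the dissertation.

```python
from collections import Counter

def as_measure(points):
    """Represent a finite set (or multiset) as a sum of unit point masses."""
    return Counter(points)

def combine(mu, nu, a=1.0, b=1.0):
    """Linear combination a*mu + b*nu of two discrete measures."""
    out = {}
    for atom, weight in mu.items():
        out[atom] = out.get(atom, 0.0) + a * weight
    for atom, weight in nu.items():
        out[atom] = out.get(atom, 0.0) + b * weight
    return out

def integrate(f, mu):
    """Integral of f against a discrete measure: sum of weight * f(atom)."""
    return sum(weight * f(atom) for atom, weight in mu.items())

# Two sets of different cardinality become elements of the same space.
small = as_measure([0.2, 0.7])
large = as_measure([0.1, 0.4, 0.9])

# Integrating the constant 1 recovers the cardinality of each set.
print(integrate(lambda x: 1.0, small))
print(integrate(lambda x: 1.0, large))

# A convex combination is still a valid measure, though no longer a set.
mix = combine(small, large, 0.5, 0.5)
print(integrate(lambda x: 1.0, mix))
```

The point of the representation is that the unknown cardinality stops being a discrete design choice: it becomes the total mass of the measure, a quantity that varies continuously under linear operations.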

4. A Geometric Perspective on Some Topics in Statistical Learning

Modern science and engineering often generate data sets with a large sample size and a comparably large dimension which puts classic asymptotic theory into question in many ways. Therefore, the main focus of this dissertation is to develop a fundamental understanding of statistical procedures for estimation and hypothesis testing from a non-asymptotic point of view, where both the sample size and problem dimension grow hand in hand. A range of different problems are explored in this thesis, including work on the geometry of hypothesis testing, adaptivity to local structure in estimation, effective methods for shape-constrained problems, and early stopping with boosting algorithms. The treatment of these different problems shares the common theme of emphasizing the underlying geometric structure.

5. Essays on Random Forest Ensembles

A random forest is a popular machine learning ensemble method that has proven successful in solving a wide range of classification problems. While other successful classifiers, such as boosting algorithms or neural networks, admit natural interpretations as maximum likelihood, a suitable statistical interpretation is much more elusive for a random forest. The first part of this dissertation demonstrates that a random forest is a fruitful framework in which to study AdaBoost and deep neural networks. The work explores the concept and utility of interpolation, the ability of a classifier to perfectly fit its training data. The second part of this dissertation places a random forest on more sound statistical footing by framing it as kernel regression with the proximity kernel. The work then analyzes the parameters that control the bandwidth of this kernel and discusses useful generalizations.
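The kernel-regression view can be sketched in a few lines. The example below is a deliberate simplification, not the dissertation's construction: each "tree" is a purely random partition of the unit interval rather than a data-driven tree, the proximity of two points is the fraction of trees in which they share a leaf, and the prediction is a proximity-weighted average of training responses. All names and parameter values are assumptions for illustration.

```python
import bisect
import math
import random

def make_random_tree(rng, n_splits):
    """A 'tree' here is just a sorted list of random split points in (0, 1)."""
    return sorted(rng.random() for _ in range(n_splits))

def leaf_index(tree, x):
    """Index of the cell (leaf) that x falls into."""
    return bisect.bisect_right(tree, x)

def proximity(forest, a, b):
    """Fraction of trees in which a and b land in the same leaf."""
    same = sum(leaf_index(t, a) == leaf_index(t, b) for t in forest)
    return same / len(forest)

def kernel_predict(forest, train_x, train_y, x):
    """Weighted average of training responses, using proximity as the kernel."""
    weights = [proximity(forest, x, xi) for xi in train_x]
    total = sum(weights)
    if total == 0:
        raise ValueError("query shares no leaf with any training point")
    return sum(w * y for w, y in zip(weights, train_y)) / total

rng = random.Random(0)
forest = [make_random_tree(rng, n_splits=16) for _ in range(50)]

# Noise-free training data on a grid (hypothetical example).
train_x = [i / 200 for i in range(200)]
train_y = [math.sin(2 * math.pi * x) for x in train_x]

pred = kernel_predict(forest, train_x, train_y, 0.25)
print(f"prediction at 0.25: {pred:.3f}  (sin peaks at 1.0 there)")
```

In this toy version the number of splits per tree plays the role of the bandwidth: more splits give smaller leaves, so the proximity falls off faster with distance and the estimate becomes more local.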

6. Marginally Interpretable Generalized Linear Mixed Models

A popular approach for relating correlated measurements of a non-Gaussian response variable to a set of predictors is to introduce latent random variables and fit a generalized linear mixed model. The conventional strategy for specifying such a model leads to parameter estimates that must be interpreted conditional on the latent variables. In many cases, interest lies not in these conditional parameters, but rather in marginal parameters that summarize the average effect of the predictors across the entire population. Due to the structure of the generalized linear mixed model, the average effect across all individuals in a population is generally not the same as the effect for an average individual. Further complicating matters, obtaining marginal summaries from a generalized linear mixed model often requires evaluation of an analytically intractable integral or use of an approximation. Another popular approach in this setting is to fit a marginal model using generalized estimating equations. This strategy is effective for estimating marginal parameters, but leaves one without a formal model for the data with which to assess quality of fit or make predictions for future observations. Thus, there exists a need for a better approach.

This dissertation defines a class of marginally interpretable generalized linear mixed models that leads to parameter estimates with a marginal interpretation while maintaining the desirable statistical properties of a conditionally specified model. The distinguishing feature of these models is an additive adjustment that accounts for the curvature of the link function and thereby preserves a specific form for the marginal mean after integrating out the latent random variables. 
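The gap between conditional and marginal interpretations can be seen numerically in the simplest case, a logistic model with a normally distributed random intercept. The sketch below compares the success probability for an "average individual" (random intercept equal to zero) with the population average obtained by integrating over the random intercept. The linear predictor value, the standard deviation, and the grid-based integration are assumptions chosen for illustration; the additive adjustment proposed in the dissertation is not implemented here.

```python
import math

def expit(t):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-t))

def normal_pdf(b, sigma):
    """Density of N(0, sigma^2) at b."""
    return math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def marginal_prob(eta, sigma, n_grid=4001, width=8.0):
    """Approximate E[expit(eta + b)] for b ~ N(0, sigma^2) on a uniform grid."""
    lo = -width * sigma
    step = 2.0 * width * sigma / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        b = lo + i * step
        total += expit(eta + b) * normal_pdf(b, sigma) * step
    return total

eta = 2.0      # fixed-effect linear predictor (hypothetical value)
sigma = 2.0    # standard deviation of the random intercept (hypothetical value)

conditional = expit(eta)               # effect for an individual with b = 0
marginal = marginal_prob(eta, sigma)   # average over the whole population

print(f"conditional probability (b = 0): {conditional:.3f}")
print(f"marginal probability:            {marginal:.3f}")
```

Because the logistic link is nonlinear, averaging over the random intercept pulls the marginal probability toward 0.5 relative to the conditional one. This is why coefficients from a conditionally specified model cannot be read directly as population-average effects.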

7. On the Detection of Hate Speech, Hate Speakers and Polarized Groups in Online Social Media

The objective of this dissertation is to explore the use of machine learning algorithms in understanding and detecting hate speech, hate speakers and polarized groups in online social media. Beginning with a unique typology for detecting abusive language, the work outlines the distinctions and similarities of different abusive language subtasks (offensive language, hate speech, cyberbullying and trolling) and how we might benefit from the progress made in each area. Specifically, the work suggests that each subtask can be categorized based on whether or not the abusive language being studied 1) is directed at a specific individual, or targets a generalized “Other” and 2) the extent to which the language is explicit versus implicit. The work then uses knowledge gained from this typology to tackle the “problem of offensive language” in hate speech detection. 

8. Lasso Guarantees for Dependent Data

Serially correlated high dimensional data are prevalent in the big data era. In order to predict and learn the complex relationship among the multiple time series, high dimensional modeling has gained importance in various fields such as control theory, statistics, economics, finance, genetics and neuroscience. This dissertation studies a number of high dimensional statistical problems involving different classes of mixing processes. 

9. Random forest robustness, variable importance, and tree aggregation

Random forest methodology is a nonparametric, machine learning approach capable of strong performance in regression and classification problems involving complex data sets. In addition to making predictions, random forests can be used to assess the relative importance of feature variables. This dissertation explores three topics related to random forests: tree aggregation, variable importance, and robustness. 

10. Climate Data Computing: Optimal Interpolation, Averaging, Visualization and Delivery

This dissertation solves two important problems in the modern analysis of big climate data. The first is the efficient visualization and fast delivery of big climate data, and the second is a computationally intensive principal component analysis (PCA) using spherical harmonics on the Earth’s surface. The second problem creates a way to supply the data for the technology developed in the first. Both problems are computationally difficult; one example is the representation of higher order spherical harmonics such as Y400, which is critical for upscaling weather data to almost infinitely fine spatial resolution.

I hope you enjoyed learning about these compelling machine learning dissertations.

Daniel Gutierrez, ODSC

Daniel D. Gutierrez is a practicing data scientist who has been working with data since long before the field came into vogue. As a technology journalist, he enjoys keeping a pulse on this fast-paced industry. Daniel is also an educator, having taught data science, machine learning, and R classes at the university level. He has authored four computer industry books on database and data science technology, including his most recent title, “Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R.” Daniel holds a BS in Mathematics and Computer Science from UCLA.

Chapman University Digital Commons

Computational and Data Sciences (PhD) Dissertations

Development of integrated machine learning and data science approaches for the prediction of cancer mutation and autonomous drug discovery of anti-cancer therapeutic agents.

Steven Agajanian, Chapman University

Date of Award: Fall 1-2020

Document Type: Dissertation

Degree Name: Doctor of Philosophy (PhD), Computational and Data Sciences

First Advisor: Gennady Verkhivker

Second Advisor: Hesham El-Askary

Third Advisor: Erik Linstead

Fourth Advisor: Cyril Rakovski

Few technological ideas have captivated the minds of biochemical researchers to the degree that machine learning (ML) and artificial intelligence (AI) have. Over the last few years, advances in the ML field have driven the design of new computational systems that improve with experience and are able to model increasingly complex chemical and biological phenomena. In this dissertation, we capitalize on these achievements and use machine learning to study drug receptor sites and design drugs to target these sites. First, we analyze the significance of various single nucleotide variations and assess their rate of contribution to cancer. Following that, we use a portfolio of machine learning and data science approaches to design new drugs to target protein kinase inhibitors. We show that these techniques exhibit strong promise in aiding cancer research and drug discovery.

Creative Commons License


Recommended Citation

S. Agajanian, "Development of integrated machine learning and data science approaches for the prediction of cancer mutation and autonomous drug discovery of anti-cancer therapeutic agents," Ph.D. dissertation, Chapman University, Orange, CA, 2021. https://doi.org/10.36837/chapman.000220




ISSN 2572-1496


MIT Libraries home DSpace@MIT


Doctoral Theses

Theses by Department

  • Computational and Systems Biology
  • Department of Aeronautics and Astronautics
  • Department of Architecture
  • Department of Biological Engineering
  • Department of Biology
  • Department of Brain and Cognitive Sciences
  • Department of Chemical Engineering
  • Department of Chemistry
  • Department of Civil and Environmental Engineering
  • Department of Earth, Atmospheric, and Planetary Sciences
  • Department of Economics
  • Department of Electrical Engineering and Computer Science
  • Department of Humanities
  • Department of Linguistics and Philosophy
  • Department of Materials Science and Engineering
  • Department of Mathematics
  • Department of Mechanical Engineering
  • Department of Nuclear Science and Engineering
  • Department of Ocean Engineering
  • Department of Physics
  • Department of Political Science
  • Department of Urban Studies and Planning
  • Engineering Systems Division
  • Harvard-MIT Program in Health Sciences and Technology
  • Institute for Data, Systems, and Society
  • Media Arts & Sciences
  • Operations Research Center
  • Science, Technology & Society
  • Sloan School of Management
  • Technology and Policy Program

Recent Submissions

L-dopa metabolism and the regulation of brain polysome aggregation

Families of ideals in the ring of power series in two variables

A methodology for assessing alternative water acquisition and water use strategies for western energy facilities in the American West


Computer Science Department

Computer Science Theses and Dissertations

This collection contains theses and dissertations from the Department of Computer Science, collected from the Scholarship@Western Electronic Thesis and Dissertation Repository

Theses/Dissertations from 2024

A Target-Based and A Targetless Extrinsic Calibration Methods for Thermal Camera and 3D LiDAR , Farhad Dalirani

Investigating Tree- and Graph-based Neural Networks for Natural Language Processing Applications , Sudipta Singha Roy

Theses/Dissertations from 2023

Classification of DDoS Attack with Machine Learning Architectures and Exploratory Analysis , Amreen Anbar

Multi-view Contrastive Learning for Unsupervised Domain Adaptation in Brain-Computer Interfaces , Sepehr Asgarian

Improved Protein Sequence Alignments Using Deep Learning , Seyed Sepehr Ashrafzadeh

INVESTIGATING IMPROVEMENTS TO MESH INDEXING , Anurag Bhattacharjee

Algorithms and Software for Oligonucleotide Design , Qin Dong

Framework for Assessing Information System Security Posture Risks , Syed Waqas Hamdani

De novo sequencing of multiple tandem mass spectra of peptide containing SILAC labeling , Fang Han

Local Model Agnostic XAI Methodologies Applied to Breast Cancer Malignancy Predictions , Heather Hartley

A Quantitative Analysis Between Software Quality Posture and Bug-fixing Commit , Rongji He

A Novel Method for Assessment of Batch Effect on single cell RNA sequencing data , Behnam Jabbarizadeh

Dynamically Finding Optimal Kernel Launch Parameters for CUDA Programs , Taabish Jeshani

Citation Polarity Identification From Scientific Articles Using Deep Learning Methods , Souvik Kundu

Denoising-Based Domain Adaptation Network for EEG Source Imaging , Runze Li

Decoy-Target Database Strategy and False Discovery Rate Analysis for Glycan Identification , Xiaoou Li

DpNovo: A DEEP LEARNING MODEL COMBINED WITH DYNAMIC PROGRAMMING FOR DE NOVO PEPTIDE SEQUENCING , Yizhou Li

Developing A Smart Home Surveillance System Using Autonomous Drones , Chongju Mai

Look-Ahead Selective Plasticity for Continual Learning , Rouzbeh Meshkinnejad

The Two Visual Processing Streams Through The Lens Of Deep Neural Networks , Aidasadat Mirebrahimi Tafreshi

Source-free Domain Adaptation for Sleep Stage Classification , Yasmin Niknam

Data Heterogeneity and Its Implications for Fairness , Ghazaleh Noroozi

Enhancing Urban Life: A Policy-Based Autonomic Smart City Management System for Efficient, Sustainable, and Self-Adaptive Urban Environments , Elham Okhovat

Evaluating the Likelihood of Bug Inducing Commits Using Metrics Trend Analysis , Parul Parul

On Computing Optimal Repairs for Conditional Independence , Alireza Pirhadi

Open-Set Source-Free Domain Adaptation in Fundus Images Analysis , Masoud Pourreza

Migration in Edge Computing , Arshin Rezazadeh

A Modified Hopfield Network for the K-Median Problem , Cody Rossiter

Predicting Network Failures with AI Techniques , Chandrika Saha

Toward Building an Intelligent and Secure Network: An Internet Traffic Forecasting Perspective , Sajal Saha

An Exploration of Visual Analytic Techniques for XAI: Applications in Clinical Decision Support , Mozhgan Salimiparsa

Attention-based Multi-Source-Free Domain Adaptation for EEG Emotion Recognition , Amir Hesam Salimnia

Global Cyber Attack Forecast using AI Techniques , Nusrat Kabir Samia

IMPLEMENTATION OF A PRE-ASSESSMENT MODULE TO IMPROVE THE INITIAL PLAYER EXPERIENCE USING PREVIOUS GAMING INFORMATION , Rafael David Segistan Canizales

A Computational Framework For Identifying Relevant Cell Types And Specific Regulatory Mechanisms In Schizophrenia Using Data Integration Methods , Kayvan Shabani

Weakly-Supervised Anomaly Detection in Surveillance Videos Based on Two-Stream I3D Convolution Network , Sareh Soltani Nejad

Smartphone Loss Prevention System Using BLE and GPS Technology , Noshin Tasnim

A Hybrid Continual Machine Learning Model for Efficient Hierarchical Classification of Domain-Specific Text in The Presence of Class Overlap (Case Study: IT Support Tickets) , Yasmen M. Wahba

Reducing Negative Transfer of Random Data in Source-Free Unsupervised Domain Adaptation , Anthony Wong

Deep Neural Methods for True/Pseudo- Invasion Classification in Colorectal Polyp Whole-Slide Images , Zhiyuan Yang

Developing a Relay-based Autonomous Drone Delivery System , Muhammad Zakar

Learning Mortality Risk for COVID-19 Using Machine Learning and Statistical Methods , Shaoshi Zhang

Machine Learning Techniques for Improved Functional Brain Parcellation , Da Zhi

Theses/Dissertations from 2022

The Design and Implementation of a High-Performance Polynomial System Solver , Alexander Brandt

Defining Service Level Agreements in Serverless Computing , Mohamed Elsakhawy

Algorithms for Regular Chains of Dimension One , Juan P. Gonzalez Trochez

Towards a Novel and Intelligent e-commerce Framework for Smart-Shopping Applications , Susmitha Hanumanthu

Multi-Device Data Analysis for Fault Localization in Electrical Distribution Grids , Jacob D L Hunte

Towards Parking Lot Occupancy Assessment Using Aerial Imagery and Computer Vision , John Jewell

Potential of Vision Transformers for Advanced Driver-Assistance Systems: An Evaluative Approach , Andrew Katoch

Psychological Understanding of Textual journals using Natural Language Processing approaches , Amirmohammad Kazemeinizadeh

Driver Behavior Analysis Based on Real On-Road Driving Data in the Design of Advanced Driving Assistance Systems , Nima Khairdoost

Solving Challenges in Deep Unsupervised Methods for Anomaly Detection , Vahid Reza Khazaie

Developing an Efficient Real-Time Terrestrial Infrastructure Inspection System Using Autonomous Drones and Deep Learning , Marlin Manka

Predictive Modelling For Topic Handling Of Natural Language Dialogue With Virtual Agents , Lareina Milambiling

Improving Deep Entity Resolution by Constraints , Soudeh Nilforoushan

Respiratory Pattern Analysis for COVID-19 Digital Screening Using AI Techniques , Annita Tahsin Priyoti

Extracting Microservice Dependencies Using Log Analysis , Andres O. Rodriguez Ishida

False Discovery Rate Analysis for Glycopeptide Identification , Shun Saito

Towards a Generalization of Fulton's Intersection Multiplicity Algorithm , Ryan Sandford

An Investigation Into Time Gazed At Traffic Objects By Drivers , Kolby R. Sarson

Exploring Artificial Intelligence (AI) Techniques for Forecasting Network Traffic: Network QoS and Security Perspectives , Ibrahim Mohammed Sayem

A Unified Representation and Deep Learning Architecture for Persuasive Essays in English , Muhammad Tawsif Sazid

Towards the development of a cost-effective Image-Sensing-Smart-Parking Systems (ISenSmaP) , Aakriti Sharma

Advances in the Automatic Detection of Optimization Opportunities in Computer Programs , Delaram Talaashrafi

Reputation-Based Trust Assessment of Transacting Service Components , Konstantinos Tsiounis

Fully Autonomous UAV Exploration in Confined and Connectionless Environments , Kirk P. Vander Ploeg

Three Contributions to the Theory and Practice of Optimizing Compilers , Linxiao Wang

Developing Intelligent Routing Algorithm over SDN: Reusable Reinforcement Learning Approach , Wumian Wang

Predicting and Modifying Memorability of Images , Mohammad Younesi

Theses/Dissertations from 2021

Generating Effective Sentence Representations: Deep Learning and Reinforcement Learning Approaches , Mahtab Ahmed

A Physical Layer Framework for a Smart City Using Accumulative Bayesian Machine Learning , Razan E. AlFar

Load Balancing and Resource Allocation in Smart Cities using Reinforcement Learning , Aseel AlOrbani

Contrastive Learning of Auditory Representations , Haider Al-Tahan

Cache-Friendly, Modular and Parallel Schemes For Computing Subresultant Chains , Mohammadali Asadi

Protein Interaction Sites Prediction using Deep Learning , Sourajit Basak

Predicting Stock Market Sector Sentiment Through News Article Based Textual Analysis , William A. Beldman

Improving Reader Motivation with Machine Learning , Tanner A. Bohn

A Black-box Approach for Containerized Microservice Monitoring in Fog Computing , Shi Chang

Visualization and Interpretation of Protein Interactions , Dipanjan Chatterjee

A Framework for Characterising Performance in Multi-Class Classification Problems with Applications in Cancer Single Cell RNA Sequencing , Erik R. Christensen

Exploratory Search with Archetype-based Language Models , Brent D. Davis

Evolutionary Design of Search and Triage Interfaces for Large Document Sets , Jonathan A. Demelo

Building Effective Network Security Frameworks using Deep Transfer Learning Techniques , Harsh Dhillon

A Deep Topical N-gram Model and Topic Discovery on COVID-19 News and Research Manuscripts , Yuan Du

Automatic extraction of requirements-related information from regulatory documents cited in the project contract , Sara Fotouhi

Developing a Resource and Energy Efficient Real-time Delivery Scheduling Framework for a Network of Autonomous Drones , Gopi Gugan

A Visual Analytics System for Rapid Sensemaking of Scientific Documents , Amirreza Haghverdiloo Barzegar

Calibration Between Eye Tracker and Stereoscopic Vision System Employing a Linear Closed-Form Perspective-n-Point (PNP) Algorithm , Mohammad Karami

Fuzzy and Probabilistic Rule-Based Approaches to Identify Fault Prone Files , Piyush Kumar Korlepara

Parallel Arbitrary-precision Integer Arithmetic , Davood Mohajerani

A Technique for Evaluating the Health Status of a Software Module Using Process Metrics , . Ria

Visual Analytics for Performing Complex Tasks with Electronic Health Records , Neda Rostamzadeh

Predictive Model of Driver's Eye Fixation for Maneuver Prediction in the Design of Advanced Driving Assistance Systems , Mohsen Shirpour

A Generative-Discriminative Approach to Human Brain Mapping , Deepanshu Wadhwa

WesternAccelerator:Rapid Development of Microservices , Haoran Wei

A Lightweight and Explainable Citation Recommendation System , Juncheng Yin

Mitosis Detection from Pathology Images , Jinhang Zhang

Theses/Dissertations from 2020

Visual Analytics of Electronic Health Records with a focus on Acute Kidney Injury , Sheikh S. Abdullah

Towards the Development of Network Service Cost Modeling-An ISP Perspective , Yasmeen Ali


©1878 - 2016 Western University

Harvard University Theses, Dissertations, and Prize Papers

The Harvard University Archives’ collection of theses, dissertations, and prize papers documents the wide range of academic research undertaken by Harvard students over the course of the University’s history.

Beyond their value as pieces of original research, these collections document the history of American higher education, chronicling both the growth of Harvard as a major research institution as well as the development of numerous academic fields. They are also an important source of biographical information, offering insight into the academic careers of the authors.

Printed list of works awarded the Bowdoin prize in 1889-1890.

Spanning from the ‘theses and quaestiones’ of the 17th and 18th centuries to the current yearly output of student research, they include both the first Harvard Ph.D. dissertation (by William Byerly, Ph.D. 1873) and the dissertation of the first woman to earn a doctorate from Harvard (Lorna Myrtle Hodgkinson, Ed.D. 1922).

Other highlights include:

  • The collection of Mathematical theses, 1782-1839
  • The 1895 Ph.D. dissertation of W.E.B. Du Bois, The suppression of the African slave trade in the United States, 1638-1871
  • Ph.D. dissertations of astronomer Cecilia Payne-Gaposchkin (Ph.D. 1925) and physicist John Hasbrouck Van Vleck (Ph.D. 1922)
  • Undergraduate honors theses of novelist John Updike (A.B. 1954), filmmaker Terrence Malick (A.B. 1966), and U.S. poet laureate Tracy Smith (A.B. 1994)
  • Undergraduate prize papers and dissertations of philosophers Ralph Waldo Emerson (A.B. 1821), George Santayana (Ph.D. 1889), and W.V. Quine (Ph.D. 1932)
  • Undergraduate honors theses of U.S. President John F. Kennedy (A.B. 1940) and Chief Justice John Roberts (A.B. 1976)

What does a prize-winning thesis look like?

If you're a Harvard undergraduate writing your own thesis, it can be helpful to review recent prize-winning theses. The Harvard University Archives has made available for digital lending all of the Thomas Hoopes Prize winners from the 2019-2021 academic years.

Accessing These Materials

How to access materials at the Harvard University Archives

How to find and request dissertations, in person or virtually

How to find and request undergraduate honors theses

How to find and request Thomas Temple Hoopes Prize papers

How to find and request Bowdoin Prize papers

  • Phone number: 617-495-2461

Related Collections

  • Harvard faculty personal and professional archives
  • Harvard student life collections: arts, sports, politics and social life
  • Access materials at the Harvard University Archives


Latest science news, discoveries and analysis

Could a rare mutation that causes dwarfism also slow ageing?

Bird flu in US cows: is the milk supply safe?

Future of Humanity Institute shuts: what's next for ‘deep future’ research?

Judge dismisses superconductivity physicist’s lawsuit against university

NIH pay raise for postdocs and PhD students could have US ripple effect

Hello puffins, goodbye belugas: changing Arctic fjord hints at our climate future

China's Moon atlas is the most detailed ever made

‘Shut up and calculate’: how Einstein lost the battle to explain quantum reality

Ecologists: don’t lose touch with the joy of fieldwork (Chris Mantegna)

Should the Maldives be creating new land?

Lethal AI weapons are here: how can we control them?

Algorithm ranks peer reviewers by reputation — but critics warn of bias

How gliding marsupials got their ‘wings’

Bird flu virus has been spreading in US cows for months, RNA reveals

Audio long read: why loneliness is bad for your health

NATO is boosting AI and climate research as scientific diplomacy remains on ice

Rat neurons repair mouse brains — and restore sense of smell

Retractions are part of science, but misconduct isn’t — lessons from a superconductivity lab

Any plan to make smoking obsolete is the right step

Citizenship privilege harms science

European ruling linking climate change to human rights could be a game changer — here’s how (Charlotte E. Blattner)

Will AI accelerate or delay the race to net-zero emissions?

Current issue

The Maldives is racing to create new land. Why are so many people concerned?

Surprise hybrid origins of a butterfly species

Stripped-envelope supernova light curves argue for central engine activity

Optical clocks at sea

Research analysis

Ancient DNA traces family lines and political shifts in the Avar empire

A chemical method for selective labelling of the key amino acid tryptophan

Robust optical clocks promise stable timing in a portable package

Targeting RNA opens therapeutic avenues for Timothy syndrome

Bioengineered ‘mini-colons’ shed light on cancer progression

Galaxy found napping in the primordial universe

Tumours form without genetic mutations

Marsupial genomes reveal how a skin membrane for gliding evolved

Scientists urged to collect royalties from the ‘magic money tree’

Breaking ice, and helicopter drops: winning photos of working scientists

Shrouded in secrecy: how science is harmed by the bullying and harassment rumour mill

Want to make a difference? Try working at an environmental non-profit organization

How ground glass might save crops from drought on a Caribbean island

Books & culture

How volcanoes shaped our planet — and why we need to be ready for the next big eruption

Dogwhistles, drilling and the roots of Western civilization: Books in brief

Cosmic rentals

Las Borinqueñas remembers the forgotten Puerto Rican women who tested the first pill

Dad always mows on summer Saturday mornings

Nature Podcast

Latest videos

Nature Briefing

An essential round-up of science news, opinion and analysis, delivered to your inbox every weekday.



COMMENTS

  1. Computational and Data Sciences (PhD) Dissertations

    Computational and Data Sciences (PhD) Dissertations. Below is a selection of dissertations from the Doctor of Philosophy in Computational and Data Sciences program in Schmid College that have been included in Chapman University Digital Commons. Additional dissertations from years prior to 2019 are available through the Leatherby Libraries ...

  2. PDF Reliable and Flexible Inference for High Dimensional Data

    High-dimensional data are now widely collected in many areas to make scientific discoveries or build complicated predictive models. The high dimensionality of such data requires analyses to have greater flexibility in modeling while ensuring the reproducibility of discoveries. This thesis contains three self-contained chapters that

  3. Doctor of Data Science and Analytics Dissertations

    The PhD Website. The Ph.D. in Data Science and Analytics is an advanced degree with a dual focus of application and research - where students will engage in real world business problems, which will inform and guide their research interests. We launched the first formal PhD program in Data Science in 2015.

  4. PDF Adversarially Robust Machine Learning With Guarantees a Dissertation

    in scope and quality as a dissertation for the degree of Doctor of Philosophy. Tatsunori Hashimoto I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. Tengyu Ma Approved for the Stanford University Committee on Graduate Studies.

  5. PhD Dissertations

    PhD Dissertations [All are .pdf files] Probabilistic Reinforcement Learning: Using Data to Define Desired Outcomes, and Inferring How to Get There Benjamin Eysenbach, 2023. Data-driven Decisions - An Anomaly Detection Perspective Shubhranshu Shekhar, 2023. METHODS AND APPLICATIONS OF EXPLAINABLE MACHINE LEARNING Joon Sik Kim, 2023. Applied Mathematics of the Future Kin G. Olivares, 2023

  6. PDF The Evolution of Big Data and Its Business Applications

    THE EVOLUTION OF BIG DATA AND ITS BUSINESS APPLICATIONS Marwah Ahmed Halwani Dissertation Prepared for the Degree of DOCTOR OF PHILOSOPHY UNIVERSITY OF NORTH TEXAS May 2018 . Halwani, Marwah Ahmed. ... professionals will be prepared in data science programs, to aid in the entire process of preparing

  7. PDF Optimization-based Modeling in Investment and Data Science a

    scope and quality as a dissertation for the degree of Doctor of Philosophy. (Stephen P. Boyd) I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. (Emmanuel J. Candes) Approved for the Stanford University Committee on Graduate ...

  8. 17 Compelling Machine Learning Ph.D. Dissertations

    This dissertation revisits and makes progress on some old but challenging problems concerning least squares estimation, the work-horse of supervised machine learning. Two major problems are addressed: (i) least squares estimation with heavy-tailed errors, and (ii) least squares estimation in non-Donsker classes.

  9. PDF University of Washington

    University of Washington

  10. PDF Visual Analytics and Interactive Machine Learning for Human Brain Data

    Human brain data including structural-MRI, function-MRI and diffusion MRI [1] hold great promise for a systematic characterization of human brain connectivity and its relationship with cognition and behavior. This study mainly focuses on applying visualization techniques on human brain data for data exploration, quality control, and hypothesis ...

  11. MIT Theses

    MIT's DSpace contains more than 58,000 theses completed at MIT dating as far back as the mid 1800's. Theses in this collection have been scanned by the MIT Libraries or submitted in electronic format by thesis authors. Since 2004 all new Masters and Ph.D. theses are scanned and added to this collection after degrees are awarded.

  12. PDF Investigating the Impact of Big Data Analytics on Supply Chain

    Thesis Title: Investigating the Impact of Big Data Analytics on Supply Chain Operations: Case Studies from the UK Private Sector A thesis submitted for the degree of Doctor of Philosophy By Ruaa Hasan Brunel Business School Brunel University London 2021

  13. Brown CS: PhD Theses

    PhD Theses. 2023 Kristo, Ani Engineering a high-performing, learning-enhanced sorting algorithm (3.9 MB) • Tim Kraska, advisor ... Bounds and Applications of Concentration of Measure in Fair Machine Learning and Data Science (7.0 MB) • Eli Upfal, advisor Dursun, Kayhan Query Processing for Data Analytics on Modern Multicore Systems (2.7 MB)

  14. 10 Compelling Machine Learning Ph.D. Dissertations for 2020

    This dissertation explores three topics related to random forests: tree aggregation, variable importance, and robustness. 10. Climate Data Computing: Optimal Interpolation, Averaging, Visualization and Delivery. This dissertation solves two important problems in the modern analysis of big climate data.

  15. Development of Integrated Machine Learning and Data Science Approaches

    Few technological ideas have captivated the minds of biochemical researchers to the degree that machine learning (ML) and artificial intelligence (AI) have. Over the last few years, advances in the ML field have driven the design of new computational systems that improve with experience and are able to model increasingly complex chemical and biological phenomena. In this dissertation, we ...

  16. PDF Proposal for PhD Option in "Advanced Data Science"

    educate and recognize PhD students whose thesis work focuses specifically on building and using advanced data science tools. The goal of this option is not to educate all students in the foundations of data science but rather to provide advanced education to the students who will push the state-of-the-art in data science methods in their domain.

  17. OATD

    You may also want to consult these sites to search for other theses: Google Scholar; NDLTD, the Networked Digital Library of Theses and Dissertations. NDLTD provides information and a search engine for electronic theses and dissertations (ETDs), whether they are open access or not. ProQuest Theses and Dissertations (PQDT), a database of dissertations and theses, whether they were published ...

  18. Doctoral Theses

    Doctoral Theses. Theses by Department. Computational and Systems Biology; ... Institute for Data, Systems, and Society; Media Arts & Sciences; Operations Research Center ... and Planetary Sciences. (654) Electrical Engineering and Computer Science (572) Aeronautics and Astronautics. (552)... View More Date Issued 2000 - 2024 (13822) 1910 - 1999 ...

  19. Computer Science Theses and Dissertations

    Theses/Dissertations from 2022. The Design and Implementation of a High-Performance Polynomial System Solver, Alexander Brandt. Defining Service Level Agreements in Serverless Computing, Mohamed Elsakhawy. Algorithms for Regular Chains of Dimension One, Juan P. Gonzalez Trochez.

  20. Harvard University Theses, Dissertations, and Prize Papers

    The Harvard University Archives' collection of theses, dissertations, and prize papers documents the wide range of academic research undertaken by Harvard students over the course of the University's history. Beyond their value as pieces of original research, these collections document the history of American higher education, chronicling both the growth of Harvard as a major research ...

  21. PDF Writing up your PhD (Qualitative Research)

    This is for PhD students working on a qualitative thesis who have completed their data collection and analysis and are at the stage of writing up. The materials should also be useful if you are writing up a 'mixed-methods' thesis, including chapters of analysis and discussion of qualitative data.

  22. PDF Writing a Doctoral Thesis or Dissertation in the Social Sciences

    Writing a Doctoral Thesis or Dissertation in the Social Sciences Anne Jordan, Ph.D. Ontario Institute for Studies in Education University of Toronto ©2020 A guide for doctoral students at various stages of their doctoral theses and dissertations: Designing their thesis proposals, developing their research

  23. PDF LIST OF Ph.D. THESES

    School of Arts & Aesthetics; School of Biotechnology; School of Computational and Integrative Sciences; School of Computer and Systems Sciences; School of Environmental Sciences; School of Information Technology; School of International Studies.

  24. Latest science news, discoveries and analysis

    Find breaking science news and analysis from the world's leading research journal.