A list of completed theses and new thesis topics from the Computer Vision Group.

Are you about to start a BSc or MSc thesis? Please read our instructions for preparing and delivering your work.

Below we list possible thesis topics for Bachelor and Master students in the areas of Computer Vision, Machine Learning, Deep Learning and Pattern Recognition. The project descriptions leave plenty of room for your own ideas. If you would like to discuss a topic in detail, please contact the supervisor listed below and Prof. Paolo Favaro to schedule a meeting. Note that for MSc students in Computer Science it is required that the official advisor is a professor in CS.

AI deconvolution of light microscopy images

Level: master.

Background Light microscopy has become an indispensable tool in life sciences research. Deconvolution is an important image processing step for improving the quality of microscopy images: it removes out-of-focus light and yields higher resolution and a better signal-to-noise ratio. Classical deconvolution methods, such as regularised or blind deconvolution, are implemented in numerous commercial software packages and are widely used in research. Recently, AI deconvolution algorithms have been introduced and are being actively developed, as they have shown high application potential.

Aim Adaptation of available AI algorithms for deconvolution of microscopy images. Validation of these methods against state-of-the-art commercially available deconvolution software.

Material and Methods The student will implement and further develop available AI deconvolution methods and acquire test microscopy images of different modalities. The performance of the developed AI algorithms will be validated against available commercial deconvolution software.


  • AI algorithm development and implementation: 50%.
  • Data acquisition: 10%.
  • Comparison of performance: 40%.


  • Interest in imaging.
  • Solid knowledge of AI.
  • Good programming skills.

Supervisors Paolo Favaro, Guillaume Witz, Yury Belyaev.

Institutes Computer Vision Group, Digital Science Lab, Microscopy Imaging Center.

Contact Yury Belyaev, Microscopy Imaging Center, [email protected], +41 78 899 0110.

Instance segmentation of cryo-ET images

Level: bachelor/master.

In the 1600s, a pioneering Dutch scientist named Antonie van Leeuwenhoek embarked on a remarkable journey that would forever transform our understanding of the natural world. Armed with a simple yet ingenious invention, the light microscope, he delved into uncharted territory, peering through its lens to reveal the hidden wonders of microscopic structures. Fast forward to today, where cryo-electron tomography (cryo-ET) has emerged as a groundbreaking technique, allowing researchers to study proteins within their natural cellular environments. Proteins, functioning as vital nano-machines, play crucial roles in life and understanding their localization and interactions is key to both basic research and disease comprehension. However, cryo-ET images pose challenges due to inherent noise and a scarcity of annotated data for training deep learning models.


Credit: S. Albert et al./PNAS (CC BY 4.0)

To address these challenges, this project aims to develop a self-supervised pipeline that uses diffusion models for instance segmentation in cryo-ET images. Diffusion models learn the underlying structure of the data by iteratively denoising corrupted samples, and the pipeline exploits this to refine and accurately segment cryo-ET images. Self-supervised learning, which relies on unlabeled data, reduces the dependence on extensive manual annotations. A successful implementation of this pipeline could substantially advance structural biology by facilitating the analysis of protein distribution and organization within cellular contexts. It could also ease the constraints imposed by scarce annotated data, enabling more efficient extraction of information from cryo-ET images and advancing biomedical applications through a better understanding of protein behavior.

Methods The segmentation pipeline for cryo-electron tomography (cryo-ET) images consists of two stages: training a diffusion model for image generation and training an instance segmentation U-Net using synthetic and real segmentation masks.

    1. Diffusion Model Training:
        a. Data Collection: Collect and curate cryo-ET image datasets from the EMPIAR database (https://www.ebi.ac.uk/empiar/).
        b. Architecture Design: Select an appropriate architecture for the diffusion model.
        c. Model Evaluation: Cryo-ET experts will help assess image quality and fidelity through visual inspection and quantitative measures.
    2. Building the Segmentation Dataset:
        a. Synthetic and real mask generation: Use the trained diffusion model to generate synthetic cryo-ET images. The diffusion process will be seeded from either a real or a synthetic segmentation mask. This will yield pairs of cryo-ET images and segmentation masks.
    3. Instance Segmentation U-Net Training:
        a. Architecture Design: Choose an appropriate instance segmentation U-Net architecture.
        b. Model Evaluation: Evaluate the trained U-Net using precision, recall, and F1 score metrics.
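The evaluation step in stage 3 can be illustrated with a small sketch. It assumes that each instance is given as a set of pixel coordinates and that a predicted instance counts as correct when its intersection over union (IoU) with an unmatched reference instance exceeds 0.5. Both the representation and the threshold are assumptions made for illustration, not part of the project description, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two instances given as sets of pixel coordinates."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def instance_scores(predicted, reference, iou_threshold=0.5):
    """Return (precision, recall, f1) for instance-level matching.

    Assumption: a prediction is a true positive if it overlaps a not yet
    matched reference instance with IoU above `iou_threshold`.
    """
    matched = set()  # indices of reference instances already claimed
    true_pos = 0
    for pred in predicted:
        for idx, ref in enumerate(reference):
            if idx not in matched and iou(pred, ref) > iou_threshold:
                matched.add(idx)
                true_pos += 1
                break
    false_pos = len(predicted) - true_pos
    false_neg = len(reference) - true_pos
    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: one reference instance is found (IoU = 2/3), one is missed,
# and one prediction has no counterpart.
reference = [{(0, 0), (0, 1)}, {(5, 5)}]
predicted = [{(0, 0), (0, 1), (0, 2)}, {(9, 9)}]
precision, recall, f1 = instance_scores(predicted, reference)
```

With an IoU threshold above 0.5 each reference instance can be matched by at most one prediction, so the simple greedy loop is sufficient here.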

By combining the diffusion model for cryo-ET image generation and the instance segmentation U-Net, this pipeline provides an efficient and accurate approach to segment structures in cryo-ET images, facilitating further analysis and interpretation.

References
    1. Kwon, Diana. "The secret lives of cells-as never seen before." Nature 598.7882 (2021): 558-560.
    2. Moebel, Emmanuel, et al. "Deep learning improves macromolecule identification in 3D cellular cryo-electron tomograms." Nature Methods 18.11 (2021): 1386-1394.
    3. Rice, Gavin, et al. "TomoTwin: generalized 3D localization of macromolecules in cryo-electron tomograms with structural data mining." Nature Methods (2023): 1-10.

Contacts Prof. Thomas Lemmin Institute of Biochemistry and Molecular Medicine Bühlstrasse 28, 3012 Bern ( [email protected] )

Prof. Paolo Favaro Institute of Computer Science Neubrückstrasse 10 3012 Bern ( [email protected] )

Adding and removing multiple sclerosis lesions in imaging with diffusion networks

Background Multiple sclerosis lesions are the result of demyelination: they appear as dark spots on T1-weighted MRI imaging and as bright spots on FLAIR MRI imaging. Image analysis for MS patients requires both the accurate detection of new and enhancing lesions, and the assessment of atrophy via local thickness and/or volume changes in the cortex. Detection of new and growing lesions is possible using deep learning, but is made difficult by the relative lack of training data. Meanwhile, cortical morphometry can be affected by the presence of lesions, meaning that removing lesions prior to morphometry may be more robust. Existing ‘lesion filling’ methods are rather crude, yielding unrealistic-appearing brains where the borders of the removed lesions are clearly visible.

Aim: Denoising diffusion networks are the current gold standard in MRI image generation [1]: we aim to leverage this technology to remove and add lesions to existing MRI images. This will allow us to create realistic synthetic MRI images for training and validating MS lesion segmentation algorithms, and for investigating the sensitivity of morphometry software to the presence of MS lesions at a variety of lesion load levels.

Materials and Methods: A large, annotated, heterogeneous dataset of MRI data from MS patients, as well as images of healthy controls without white matter lesions, will be available for developing the method. The student will work in a research group with a long track record in applying deep learning methods to neuroimaging data, as well as experience training denoising diffusion networks.

Nature of the Thesis:

Literature review: 10%

Replication of Blob Loss paper: 10%

Implementation of the sliding window metrics: 10%

Training on MS lesion segmentation task: 30%

Extension to other datasets: 20%

Results analysis: 20%

Fig. Results of an existing lesion filling algorithm, showing inadequate performance


Interest/Experience with image processing

Python programming knowledge (PyTorch is a bonus)

Interest in neuroimaging


PD. Dr. Richard McKinley

Institutes: Diagnostic and Interventional Neuroradiology

Center for Artificial Intelligence in Medicine (CAIM), University of Bern

References: [1] Brain Imaging Generation with Latent Diffusion Models, Pinaya et al, accepted in the Deep Generative Models workshop @ MICCAI 2022, https://arxiv.org/abs/2209.07162

Contact: PD Dr Richard McKinley, Support Centre for Advanced Neuroimaging ( [email protected] )

Improving metrics and loss functions for targets with imbalanced size: sliding window Dice coefficient and loss.

Background The Dice coefficient is the most commonly used metric for segmentation quality in medical imaging, and a differentiable version of the coefficient is often used as a loss function, in particular for small target classes such as multiple sclerosis lesions. The Dice coefficient has the benefit that it is applicable in instances where the target class is in the minority (for example, when segmenting small lesions). However, if lesion sizes are mixed, the loss and the metric are biased towards performance on large lesions, causing smaller lesions to be missed and harming overall lesion detection. A recently proposed loss function (blob loss [1]) aims to combat this by treating each connected component of a lesion mask separately, and claims improvements over Dice loss on lesion detection scores in a variety of tasks.

Aim: The aim of this thesis is twofold. First, to benchmark blob loss against a simple, potentially superior loss for instance detection: sliding window Dice loss, in which the Dice loss is calculated over a sliding window across the area/volume of the medical image. Second, we will investigate whether a sliding window Dice coefficient is better correlated with lesion-wise detection metrics than the Dice coefficient, and whether it may serve as an alternative metric capturing both global and instance-wise detection.
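To make the idea concrete, here is a minimal sketch of a sliding window Dice coefficient on 2D binary masks. It is one possible reading of the description above, not the group's implementation: the window size, the stride, the smoothing term `eps`, and the choice to skip windows that contain no foreground in either mask are all assumptions.

```python
def dice(pred, truth, eps=1e-6):
    """Dice coefficient of two flat binary sequences (eps avoids division by zero)."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return (2.0 * intersection + eps) / (total + eps)


def flatten(image):
    return [v for row in image for v in row]


def sliding_window_dice(pred, truth, window=4, stride=4):
    """Mean Dice over square windows; windows without any foreground are skipped."""
    height, width = len(truth), len(truth[0])
    scores = []
    for r in range(0, height - window + 1, stride):
        for c in range(0, width - window + 1, stride):
            p = [pred[i][j] for i in range(r, r + window) for j in range(c, c + window)]
            t = [truth[i][j] for i in range(r, r + window) for j in range(c, c + window)]
            if sum(p) + sum(t) == 0:
                continue  # assumption: empty windows carry no information
            scores.append(dice(p, t))
    return sum(scores) / len(scores) if scores else 1.0


# Toy case: one large lesion (4x4) and one single-pixel lesion.
truth = [[0] * 8 for _ in range(8)]
for r in range(4):
    for c in range(4):
        truth[r][c] = 1
truth[6][6] = 1

# The prediction finds the large lesion perfectly but misses the small one.
pred = [row[:] for row in truth]
pred[6][6] = 0

global_score = dice(flatten(pred), flatten(truth))
windowed_score = sliding_window_dice(pred, truth)
```

In this toy case the global Dice stays close to 1 although half of the lesions are missed, while the windowed score drops to about 0.5, which is the size bias the project sets out to study.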

Materials and Methods: A large, annotated, heterogeneous dataset of MRI data from MS patients will be available for benchmarking the method, as well as our existing codebases for MS lesion segmentation.  Extension of the method to other diseases and datasets (such as covered in the blob loss paper) will make the method more plausible for publication.  The student will work alongside clinicians and engineers carrying out research in multiple sclerosis lesion segmentation, in particular in the context of our running project supported by the CAIM grant.


Fig. An annotated MS lesion case, showing the variety of lesion sizes

References: [1] blob loss: instance imbalance aware loss functions for semantic segmentation, Kofler et al, https://arxiv.org/abs/2205.08209

Idempotent and partial skull-stripping in multispectral MRI imaging

Background Skull stripping (or brain extraction) refers to the masking of non-brain tissue from structural MRI imaging.  Since 3D MRI sequences allow reconstruction of facial features, many data providers supply data only after skull-stripping, making this a vital tool in data sharing.  Furthermore, skull-stripping is an important pre-processing step in many neuroimaging pipelines, even in the deep-learning era: while many methods could now operate on data with skull present, they have been trained only on skull-stripped data and therefore produce spurious results on data with the skull present.

High-quality skull-stripping algorithms based on deep learning are now widely available: the most prominent example is HD-BET [1].  A major downside of HD-BET is its behaviour on datasets to which skull-stripping has already been applied: in this case the algorithm falsely identifies brain tissue as skull and masks it.  A skull-stripping algorithm F not exhibiting this behaviour would  be idempotent: F(F(x)) = F(x) for any image x.  Furthermore, legacy datasets from before the availability of high-quality skull-stripping algorithms may still contain images which have been inadequately skull-stripped: currently the only solution to improve the skull-stripping on this data is to go back to the original datasource or to manually correct the skull-stripping, which is time-consuming and prone to error. 
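The idempotence property F(F(x)) = F(x) can be checked empirically. The sketch below uses two toy one-dimensional "stripping" functions that are purely hypothetical stand-ins (neither is HD-BET nor any real skull-stripping method): one that masks values above a threshold, and one that always removes the outermost non-zero values and therefore keeps eating into the signal when applied again.

```python
def is_idempotent(f, x, tol=1e-6):
    """Check F(F(x)) == F(x) element-wise for a single input x."""
    once = f(x)
    twice = f(once)
    return all(abs(a - b) <= tol for a, b in zip(once, twice))


def strip_by_threshold(x, threshold=0.5):
    """Toy stand-in: treat bright values as 'skull' and set them to zero."""
    return [0.0 if v > threshold else v for v in x]


def strip_outer_layer(x):
    """Toy stand-in: always zero the outermost non-zero values.

    Applied to already stripped data it removes 'brain' values, which mimics
    the failure mode described above.
    """
    nonzero = [i for i, v in enumerate(x) if v != 0]
    out = list(x)
    if nonzero:
        out[nonzero[0]] = 0.0
        out[nonzero[-1]] = 0.0
    return out


# A 1D 'scan': bright skull at both ends, darker brain tissue in the middle.
scan = [0.9, 0.3, 0.4, 0.2, 0.8]
threshold_ok = is_idempotent(strip_by_threshold, scan)
outer_layer_ok = is_idempotent(strip_outer_layer, scan)
```

A check of this kind could serve as a regression test or, in differentiable form, as a consistency term during training; whether that is the right approach for this project is left open here.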

Aim: In this project, the student will develop an idempotent skull-stripping network which can also handle partially skull-stripped inputs. In the best case, the network will operate well on a large subset of the data we work with (e.g. structural MRI, diffusion-weighted MRI, perfusion-weighted MRI, susceptibility-weighted MRI, at a variety of field strengths) to maximize the future applicability of the network across the teams in our group.

Materials and Methods: Multiple datasets, both publicly available and internal (encompassing thousands of 3D volumes) will be available. Silver standard reference data for standard sequences at 1.5T and 3T can be generated using existing tools such as HD-BET: for other sequences and field strengths semi-supervised learning or methods improving robustness to domain shift may be employed.  Robustness to partial skull-stripping may be induced by a combination of learning theory and model-based approaches.


Dataset curation: 10%

Idempotent skull-stripping model building: 30%

Modelling of partial skull-stripping: 10%

Extension of model to handle partial skull: 30%

Results analysis: 10%

Fig. An example of failed skull-stripping requiring manual correction

References: [1] Isensee, F, Schell, M, Pflueger, I, et al. Automated brain extraction of multisequence MRI using artificial neural networks. Hum Brain Mapp. 2019; 40: 4952–4964. https://doi.org/10.1002/hbm.24750

Automated leaf detection and leaf area estimation (for Arabidopsis thaliana)

Correlating plant phenotypes such as leaf area or number of leaves to the genotype (i.e. changes in DNA) is a common goal for plant breeders and molecular biologists. Such data can not only help to understand fundamental processes in nature, but also can help to improve ecotypes, e.g., to perform better under climate change, or reduce fertiliser input. However, collecting data for many plants is very time consuming and automated data acquisition is necessary.

The project aims at building a machine learning model to automatically detect plants in top-view images (see examples below), segment their leaves (see Fig C) and to estimate the leaf area. This information will then be used to determine the leaf area of different Arabidopsis ecotypes. The project will be carried out in collaboration with researchers of the Institute of Plant Sciences at the University of Bern. It will also involve the design and creation of a dataset of plant top-views with the corresponding annotation (provided by experts at the Institute of Plant Sciences).
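Once leaves are segmented, the area estimate itself is a simple computation. The sketch below assumes a label map in which 0 is background and each leaf has its own positive integer label, and it assumes the image scale (millimetres per pixel) is known from calibration. Both are assumptions for illustration, and the function name is hypothetical.

```python
from collections import Counter


def leaf_areas_mm2(label_map, mm_per_pixel):
    """Return {leaf_label: area in mm^2} for a 2D label map (0 = background)."""
    counts = Counter(v for row in label_map for v in row if v != 0)
    pixel_area = mm_per_pixel ** 2  # area covered by one pixel
    return {label: n * pixel_area for label, n in counts.items()}


# Toy label map with two leaves of two pixels each.
label_map = [
    [0, 1, 1],
    [2, 2, 0],
]
areas = leaf_areas_mm2(label_map, mm_per_pixel=0.5)
total_area = sum(areas.values())
```

The number of leaves is then simply the number of keys in the returned dictionary.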


Contact: Prof. Dr. Paolo Favaro ( [email protected] )

Master Projects at the ARTORG Center

The Gerontechnology and Rehabilitation group at the ARTORG Center for Biomedical Engineering is offering multiple MSc thesis projects to students who are interested in working with real patient data, artificial intelligence and machine learning algorithms. The goal of these projects is to transfer the findings to the clinic in order to solve today’s healthcare problems and thus to improve the quality of life of patients.

  • Assessment of Digital Biomarkers at Home by Radar [PDF]
  • Comparison of Radar, Seismograph and Ballistocardiography to Monitor Sleep at Home [PDF]
  • Sentimental Analysis in Speech [PDF]

Contact: Dr. Stephan Gerber ( [email protected] )

Internship in Computational Imaging at Prophesee

A 6-month internship at Prophesee, Grenoble is offered to a talented Master's student.

The topic of the internship is burst imaging, following the work of Sam Hasinoff, and exploring ways to improve it using event-based vision.

A compensation to cover the expenses of living in Grenoble is offered. Only students who have the legal right to work in France can apply.

Anyone interested can send an email with the CV to Daniele Perrone ( [email protected] ).

Using machine learning applied to wearables to predict mental health

This Master’s project lies at the intersection of psychiatry and computer science and aims to use machine learning techniques to improve health. Using sensors to detect sleep and waking behavior has as yet unexplored potential to reveal insights into health. In this study, we make use of a watch-like device, called an actigraph, which tracks motion to quantify sleep behavior and waking activity. The participants in the study are healthy and depressed adolescents who wear actigraphs for a year, during which time we query their mental health status monthly using online questionnaires. For this Master’s thesis we aim to use machine learning methods to predict mental health based on the data from the actigraph. The ability to predict mental health crises based on sleep and wake behavior would provide an opportunity for intervention, significantly impacting the lives of patients and their families. This Master’s thesis is a collaboration between Professor Paolo Favaro at the Institute of Computer Science ( [email protected] ) and Dr Leila Tarokh at the Universitäre Psychiatrische Dienste (UPD) ( [email protected] ). We are looking for a highly motivated individual interested in bridging disciplines.

Bachelor or Master Projects at the ARTORG Center

The Gerontechnology and Rehabilitation group at the ARTORG Center for Biomedical Engineering is offering multiple BSc and MSc thesis projects to students who are interested in working with real patient data, artificial intelligence and machine learning algorithms. The goal of these projects is to transfer the findings to the clinic in order to solve today’s healthcare problems and thus to improve the quality of life of patients.

  • Machine Learning Based Gait-Parameter Extraction by Using Simple Rangefinder Technology [PDF]
  • Detection of Motion in Video Recordings [PDF]
  • Home-Monitoring of Elderly by Radar [PDF]
  • Gait feature detection in Parkinson's Disease [PDF]
  • Development of an arthroscopic training device using virtual reality [PDF]

Contact: Dr. Stephan Gerber ( [email protected] ), Michael Single ( [email protected] )

Dynamic Transformer

Level: bachelor.

Visual Transformers have obtained state-of-the-art classification accuracies [ViT, DeiT, T2TViT, BoTNet]. A mixture of experts can increase the capacity of a neural network by learning instance-dependent execution pathways in the network [MoE]. In this research project we aim to push transformers to their limit by combining their dynamic attention with MoEs. Compared to the Switch Transformer [Switch], we will use a much more efficient formulation of mixing [CondConv, DynamicConv], and we will apply it in the attention part of the transformer rather than in the fully connected layer.

  • Input dependent attention kernel generation for better transformer layers.
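The mixing idea behind [CondConv, DynamicConv] can be sketched on a plain linear projection, such as the one producing queries or keys in an attention layer. The weight matrices of several experts are blended with input-dependent gates into a single matrix, which is then applied once. The sketch uses softmax gates in the style of DynamicConv (CondConv uses sigmoid gates); the router matrix, the expert matrices and all function names are hypothetical, and how exactly this enters the attention block is the open question of the project.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def matvec(matrix, x):
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def mixed_projection(x, experts, router):
    """Blend expert weight matrices with input-dependent gates, then project x once."""
    gates = softmax(matvec(router, x))  # one gate per expert
    rows, cols = len(experts[0]), len(experts[0][0])
    mixed = [
        [sum(g * expert[i][j] for g, expert in zip(gates, experts)) for j in range(cols)]
        for i in range(rows)
    ]
    return matvec(mixed, x), gates


# Two toy experts: the identity and twice the identity.
experts = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
]
router = [[0.0, 0.0], [0.0, 0.0]]  # zero router gives equal gates
x = [2.0, 4.0]
y, gates = mixed_projection(x, experts, router)
```

Because the projection is linear, mixing the weights first gives the same result as running every expert and mixing the outputs, but it needs only one matrix-vector product per input. That is the efficiency argument for this formulation.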

Publication Opportunity: Dynamic Neural Networks Meets Computer Vision (a CVPR 2021 Workshop)


  • The same idea could be extended to other ViT/Transformer based models [DETR, SETR, LSTR, TrackFormer, BERT]

Related Papers:

  • Visual Transformers: Token-based Image Representation and Processing for Computer Vision [ViT]
  • DeiT: Data-efficient Image Transformers [DeiT]
  • Bottleneck Transformers for Visual Recognition [BoTNet]
  • Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet [T2TViT]
  • Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer [MoE]
  • Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity [Switch]
  • CondConv: Conditionally Parameterized Convolutions for Efficient Inference [CondConv]
  • Dynamic Convolution: Attention over Convolution Kernels [DynamicConv]
  • End-to-End Object Detection with Transformers [DETR]
  • Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers [SETR]
  • End-to-end Lane Shape Prediction with Transformers [LSTR]
  • TrackFormer: Multi-Object Tracking with Transformers [TrackFormer]
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [BERT]

Contact: Sepehr Sameni

Visual Transformers have obtained state-of-the-art classification accuracies for 2D images [ViT, DeiT, T2TViT, BoTNet]. In this project, we aim to extend the same ideas to 3D data (videos), which requires a more efficient attention mechanism [Performer, Axial, Linformer]. To accelerate the training process, we could use the multigrid technique [Multigrid].

  • Better video understanding by attention blocks.

Publication Opportunity: LOVEU (a CVPR workshop) , Holistic Video Understanding (a CVPR workshop) , ActivityNet (a CVPR workshop)

  • Rethinking Attention with Performers [Performer]
  • Axial Attention in Multidimensional Transformers [Axial]
  • Linformer: Self-Attention with Linear Complexity [Linformer]
  • A Multigrid Method for Efficiently Training Video Models [Multigrid]

GIRAFFE is a newly introduced GAN that can generate scenes via composition with minimal supervision [GIRAFFE]. Generative methods can implicitly learn interpretable representation as can be seen in GAN image interpretations [GANSpace, GanLatentDiscovery]. Decoding GIRAFFE could give us per-object interpretable representations that could be used for scene manipulation, data augmentation, scene understanding, semantic segmentation, pose estimation [iNeRF], and more. 

In order to invert a GIRAFFE model, we will first train the generative model on the CLEVR and CompCars datasets, then add a decoder to the pipeline and train the resulting autoencoder. We can make the task easier by assuming that the number of objects in the scene and/or their positions are known.


Scene Manipulation and Decomposition by Inverting the GIRAFFE 

Publication Opportunity:  DynaVis 2021 (a CVPR workshop on Dynamic Scene Reconstruction)  

Related Papers: 

  • GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields [GIRAFFE] 
  • Neural Scene Graphs for Dynamic Scenes 
  • pixelNeRF: Neural Radiance Fields from One or Few Images [pixelNeRF] 
  • NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis [NeRF] 
  • Neural Volume Rendering: NeRF And Beyond 
  • GANSpace: Discovering Interpretable GAN Controls [GANSpace] 
  • Unsupervised Discovery of Interpretable Directions in the GAN Latent Space [GanLatentDiscovery] 
  • Inverting Neural Radiance Fields for Pose Estimation [iNeRF] 

Quantized ViT

Visual Transformers have obtained state of the art classification accuracies [ViT, CLIP, DeiT], but the best ViT models are extremely compute heavy and running them even only for inference (not doing backpropagation) is expensive. Running transformers cheaply by quantization is not a new problem and it has been tackled before for BERT [BERT] in NLP [Q-BERT, Q8BERT, TernaryBERT, BinaryBERT]. In this project we will be trying to quantize pretrained ViT models. 

Quantizing ViT models for faster inference and smaller models without losing accuracy 
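As a point of reference for what quantization means at the level of a single weight tensor, here is a minimal sketch of symmetric, per-tensor 8-bit quantization. It is a generic textbook scheme chosen for illustration, not the method of any of the cited papers, and the function names are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with one shared scale (symmetric scheme)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


weights = [-2.54, 0.0, 1.0, 2.54]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
```

The rounding error per weight is bounded by half the scale, so tensors with a few large outliers lose the most precision. Handling such cases without losing accuracy is part of what makes quantizing pretrained models non-trivial.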

Publication Opportunity:  Binary Networks for Computer Vision 2021 (a CVPR workshop)  


  • Having a fast pipeline for image inference with ViT will allow us to dig deep into the attention of ViT and analyze it. We might be able to prune some attention heads or replace them with static patterns (like local convolution or dilated patterns). We might even be able to replace the transformer with a Performer and increase the throughput even more [Performer].
  • The same idea could be extended to other ViT based models [DETR, SETR, LSTR, TrackFormer, CPTR, BoTNet, T2TViT] 
  • Learning Transferable Visual Models From Natural Language Supervision [CLIP] 
  • Visual Transformers: Token-based Image Representation and Processing for Computer Vision [ViT] 
  • DeiT: Data-efficient Image Transformers [DeiT] 
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [BERT] 
  • Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT [Q-BERT] 
  • Q8BERT: Quantized 8Bit BERT [Q8BERT] 
  • TernaryBERT: Distillation-aware Ultra-low Bit BERT [TernaryBERT] 
  • BinaryBERT: Pushing the Limit of BERT Quantization [BinaryBERT] 
  • Rethinking Attention with Performers [Performer] 
  • End-to-End Object Detection with Transformers [DETR] 
  • Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers [SETR] 
  • End-to-end Lane Shape Prediction with Transformers [LSTR] 
  • TrackFormer: Multi-Object Tracking with Transformers [TrackFormer] 
  • CPTR: Full Transformer Network for Image Captioning [CPTR] 
  • Bottleneck Transformers for Visual Recognition [BoTNet] 
  • Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet [T2TViT] 

Multimodal Contrastive Learning

Recently, contrastive learning has gained a lot of attention for self-supervised image representation learning [SimCLR, MoCo]. Contrastive learning can be extended to multimodal data, such as videos (images and audio) [CMC, CoCLR]. Most contrastive methods require large batch sizes (or large memory pools), which makes them expensive to train. In this project we will use contrastive methods that do not depend on the batch size [SwAV, BYOL, SimSiam] to train multimodal representation extractors.
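To illustrate why such methods do not depend on the batch size, the sketch below computes a SimSiam-style symmetric negative cosine similarity for a single pair of embeddings. No negative samples enter the loss, so it is defined per pair. Treating the two inputs as an image embedding and an audio embedding of the same clip is an assumption made here to connect it to the multimodal setting. The stop-gradient that SimSiam applies to the targets is only noted in a comment, since this plain-Python sketch has no automatic differentiation.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def symmetric_negative_cosine(p_image, z_audio, p_audio, z_image):
    """SimSiam-style loss for one pair of views.

    p_* are predictor outputs and z_* are encoder outputs. In a deep learning
    framework the z_* targets would be detached (stop-gradient).
    """
    return -0.5 * cosine(p_image, z_audio) - 0.5 * cosine(p_audio, z_image)


# Perfectly aligned embeddings reach the minimum of -1.
aligned = symmetric_negative_cosine([1.0, 0.0], [2.0, 0.0], [0.0, 3.0], [0.0, 1.0])
# Orthogonal embeddings give a loss of 0.
orthogonal = symmetric_negative_cosine([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0])
```

By contrast, InfoNCE-type losses as used in SimCLR compare each positive pair against the other samples in the batch, which is where the dependence on large batches comes from.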

Our main goal is to compare the proposed method with the CMC baseline, so we will be working with STL10, ImageNet, UCF101, HMDB51, and NYU Depth-V2 datasets. 

Inspired by the recent works on smaller datasets [ConVIRT, CPD], to accelerate the training speed, we could start with two pretrained single-modal models and finetune them with the proposed method.  

  • Extending SwAV to multimodal datasets 
  • Gaining a better understanding of BYOL 

Publication Opportunity:  MULA 2021 (a CVPR workshop on Multimodal Learning and Applications)  

  • Most knowledge distillation methods for contrastive learners also use large batch sizes (or memory pools) [CRD, SEED]; the proposed method could be extended to knowledge distillation. 
  • One could easily extend this idea to multiview learning. For example, two different networks could work on the same input and be trained with contrastive learning; this may lead to better models [DeiT] through the exchange of inductive biases between the models. 
  • Self-supervised Co-training for Video Representation Learning [CoCLR] 
  • Learning Spatiotemporal Features via Video and Text Pair Discrimination [CPD] 
  • Audio-Visual Instance Discrimination with Cross-Modal Agreement [AVID-CMA] 
  • Self-Supervised Learning by Cross-Modal Audio-Video Clustering [XDC] 
  • Contrastive Multiview Coding [CMC] 
  • Contrastive Learning of Medical Visual Representations from Paired Images and Text [ConVIRT] 
  • A Simple Framework for Contrastive Learning of Visual Representations [SimCLR] 
  • Momentum Contrast for Unsupervised Visual Representation Learning [MoCo] 
  • Bootstrap your own latent: A new approach to self-supervised Learning [BYOL] 
  • Exploring Simple Siamese Representation Learning [SimSiam] 
  • Unsupervised Learning of Visual Features by Contrasting Cluster Assignments [SwAV] 
  • Contrastive Representation Distillation [CRD] 
  • SEED: Self-supervised Distillation For Visual Representation [SEED] 

Robustness of Neural Networks

Neural Networks have been found to achieve surprising performance in several tasks such as classification, detection and segmentation. However, they are also very sensitive to small (controlled) changes to the input. It has been shown that some changes to an image that are not visible to the naked eye may lead the network to output an incorrect label. This thesis will focus on studying recent progress in this area and aim to build a procedure for a trained network to self-assess its reliability in classification or one of the popular computer vision tasks.
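The sensitivity described above can be reproduced even on a linear classifier. The sketch below applies the fast gradient sign method (FGSM), a standard attack that is named here as an illustrative choice and is not taken from the project description. Every input coordinate is shifted by at most `epsilon` in the direction that increases the loss. The weights and the input are made-up toy values.

```python
def sign(v):
    return (v > 0) - (v < 0)


def predict(weights, bias, x):
    """Binary linear classifier: label 1 if the score is positive, else 0."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def fgsm_linear(weights, x, label, epsilon):
    """FGSM step for a linear model with logistic loss.

    The loss gradient with respect to x is proportional to -w for label 1
    and to +w for label 0, so only sign(w) and the label are needed.
    """
    direction = -1 if label == 1 else 1
    return [v + epsilon * direction * sign(w) for v, w in zip(x, weights)]


weights = [1.0, -1.0, 1.0, -1.0]
bias = 0.0
x = [0.1, -0.1, 0.1, -0.1]

clean_label = predict(weights, bias, x)
x_adv = fgsm_linear(weights, x, clean_label, epsilon=0.15)
adv_label = predict(weights, bias, x_adv)
largest_change = max(abs(a - b) for a, b in zip(x, x_adv))
```

Each coordinate changes by no more than 0.15, yet the predicted label flips because the small changes add up across all input dimensions. With images, which have many thousands of dimensions, far smaller per-pixel changes are enough.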

Contact: Paolo Favaro

Masters projects at sitem center

The Personalised Medicine Research Group at the sitem Center for Translational Medicine and Biomedical Entrepreneurship is offering multiple MSc thesis projects to biomedical engineering MSc students that may also be of interest to computer science students.

  • Automated quantification of cartilage quality for hip treatment decision support [PDF]
  • Automated quantification of massive rotator cuff tears from MRI [PDF]
  • Deep learning-based segmentation and fat fraction analysis of the shoulder muscles using quantitative MRI [PDF]
  • Unsupervised Domain Adaption for Cross-Modality Hip Joint Segmentation [PDF]

Contact: Dr. Kate Gerber

Internships/Master thesis @ Chronocam

3-6 month internships on event-based computer vision. Chronocam is a rapidly growing startup developing event-based technology, with more than 15 PhDs working on problems like tracking, detection, classification, SLAM, etc. Event-based computer vision has the potential to solve many long-standing problems in traditional computer vision, and this is an exciting time as this potential is becoming more and more tangible in many real-world applications. For next year we are looking for motivated Master and PhD students with good software engineering skills (C++ and/or Python), and preferably a good background in computer vision and deep learning. PhD internships will be more research focused and may lead to a publication. For each intern we offer a compensation to cover the expenses of living in Paris. Some of the topics we want to explore:

  • Photo-realistic image synthesis and super-resolution from event-based data (PhD)
  • Self-supervised representation learning (PhD)
  • End-to-end Feature Learning for Event-based Data
  • Bio-inspired Filtering using Spiking Networks
  • On-the-fly Compression of Event-based Streams for Low-Power IoT Cameras
  • Tracking of Multiple Objects with a Dual-Frequency Tracker
  • Event-based Autofocus
  • Stabilizing an Event-based Stream using an IMU
  • Crowd Monitoring for Low-power IoT Cameras
  • Road Extraction from an Event-based Camera Mounted in a Car for Autonomous Driving
  • Sign detection from an Event-based Camera Mounted in a Car for Autonomous Driving
  • High-frequency Eye Tracking

Email with attached CV to Daniele Perrone at  [email protected] .

Contact: Daniele Perrone

Object Detection in 3D Point Clouds

Today we have many 3D scanning techniques that allow us to capture the shape and appearance of objects. It is easier than ever to scan real 3D objects and transform them into a digital model for further processing, such as modeling, rendering or animation. However, the output of a 3D scanner is often a raw point cloud with little to no annotation. The unstructured nature of the point cloud representation makes it difficult to process, e.g. for surface reconstruction. One application is the detection and segmentation of an object of interest. In this project, the student is challenged to design a system that takes a point cloud (a 3D scan) as input and outputs the names of the objects contained in the scan. This output can then be used to eliminate outliers or points that belong to the background. The approach involves collecting a large dataset of 3D scans and training a neural network on it.

Contact: Adrian Wälchli

Shape Reconstruction from a Single RGB Image or Depth Map

A photograph accurately captures the world in a moment of time and from a specific perspective. Since it is a projection of the 3D space to a 2D image plane, the depth information is lost. Is it possible to restore it, given only a single photograph? In general, the answer is no. This problem is ill-posed, meaning that many different plausible depth maps exist, and there is no way of telling which one is the correct one.  However, if we cover one of our eyes, we are still able to recognize objects and estimate how far away they are. This motivates the exploration of an approach where prior knowledge can be leveraged to reduce the ill-posedness of the problem. Such a prior could be learned by a deep neural network, trained with many images and depth maps.

CNN Based Deblurring on Mobile

Deblurring finds many applications in our everyday life. It is particularly useful when taking pictures on handheld devices (e.g. smartphones), where camera shake can degrade important details. It is therefore desirable to have a good deblurring algorithm implemented directly on the device. In this project, the student will implement and optimize a state-of-the-art deblurring method based on a deep neural network for deployment on mobile phones (Android). The goal is to reduce the number of network weights in order to reduce the memory footprint while preserving the quality of the deblurred images. The result will be a camera app that automatically deblurs the pictures, giving the user a choice of keeping the original or the deblurred image.

Depth from Blur

If an object in front of the camera or the camera itself moves while the shutter is open, the region of motion becomes blurred because the incoming light is accumulated in different positions across the sensor. If there is camera motion, there is also parallax. Thus, a motion-blurred image contains depth information. In this project, the student will tackle the problem of recovering a depth map from a motion-blurred image. This includes the collection of a large dataset of blurred and sharp images or videos using a pair or triplet of GoPro action cameras. Two cameras will be used in stereo to estimate the depth map, and the third captures the blurred frames. This data is then used to train a convolutional neural network that will predict the depth map from the blurry image.
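As a rough illustration of the stereo step, the sketch below converts a disparity map from a rectified stereo pair into depth using the standard pinhole relation depth = focal length × baseline / disparity. The function names, the parameter names and the example values are assumptions made for this illustration; the project description does not prescribe a particular stereo method or calibration.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres of a point seen by a rectified stereo pair.

    disparity_px: horizontal pixel shift of the point between the two views.
    focal_length_px: focal length in pixels (assumed equal for both cameras).
    baseline_m: distance between the two camera centres in metres.
    """
    if disparity_px <= 0:
        # No measurable shift: treat the point as infinitely far away.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


def depth_map_from_disparity(disparity_map, focal_length_px, baseline_m):
    """Apply depth_from_disparity to every pixel of a 2D disparity map."""
    return [
        [depth_from_disparity(d, focal_length_px, baseline_m) for d in row]
        for row in disparity_map
    ]


if __name__ == "__main__":
    # Hypothetical calibration: 800 px focal length, 0.5 m baseline.
    disparities = [[40.0, 80.0], [0.0, 20.0]]
    for row in depth_map_from_disparity(disparities, 800.0, 0.5):
        print(row)
```

Larger disparities correspond to closer points, which is why a depth map estimated this way could serve as the training target for the network that sees only the blurred frame.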

Unsupervised Clustering Based on Pretext Tasks

The idea of this project is to have two types of neural networks work together: one network A assigns images to k clusters, and k (simple) networks of type B each perform a self-supervised task on one of those clusters. The goal of all the networks is to make the k networks of type B perform well on the task. The assumption is that clustering into semantically similar groups will help the networks of type B to perform well. This could be done on the MNIST dataset, with B being linear classifiers and the task being rotation prediction.

Adversarial Data-Augmentation

The student designs a data augmentation network that transforms training images in such a way that image realism is preserved (e.g. with a constrained spatial transformer network) and the transformed images are more difficult to classify (trained via adversarial loss against an image classifier). The model will be evaluated for different data settings (especially in the low data regime), for example on the MNIST and CIFAR datasets.

Unsupervised Learning of Lip-reading from Videos

People with sensory impairment (hearing, speech, vision) depend heavily on assistive technologies to communicate and navigate in everyday life. The mass production of media content today makes it impossible to manually translate everything into a common language for assistive technologies, e.g. captions or sign language. In this project, the student employs a neural network to learn a representation of lip movement in videos in an unsupervised fashion, possibly with an encoder-decoder structure where the decoder reconstructs the audio signal. This requires collecting a large dataset of videos (e.g. from YouTube) of speakers or conversations where lip movement is visible. The outcome will be a neural network that learns an audio-visual representation of lip movement in videos, which can then be leveraged to generate captions for hearing-impaired persons.

Learning to Generate Topographic Maps from Satellite Images

Satellite images have many applications, e.g. in meteorology, geography, education, cartography and warfare. They are an accurate and detailed depiction of the surface of the earth from above. Although it is relatively simple to collect many satellite images in an automated way, challenges arise when processing them for use in navigation and cartography. The idea of this project is to automatically convert an arbitrary satellite image, e.g. of a city, to a map of simple 2D shapes (streets, houses, forests) and label them with colors (semantic segmentation). The student will collect a dataset of satellite images and topographic maps and train a deep neural network that learns to map from one domain to the other. The data could be obtained from a Google Maps database or similar.

New Variables of Brain Morphometry: the Potential and Limitations of CNN Regression

Timo Blattner · Sept. 2022.

The calculation of variables of brain morphology is computationally very expensive and time-consuming. A previous work showed the feasibility of extracting the variables directly from T1-weighted brain MRI images using a convolutional neural network. We used significantly more data and extended their model to a new set of neuromorphological variables, which could become interesting biomarkers in the future for the diagnosis of brain diseases. For nearly all subjects, the model shows a mean relative absolute error of less than 5%. This high relative accuracy can be attributed to the low morphological variance between subjects and the ability of the model to predict the cortical atrophy age trend. The model, however, fails to capture all the variance in the data and shows large regional differences. We attribute these limitations in part to the moderate to poor reliability of the ground truth generated by FreeSurfer. We further investigated the effects of training data size and model complexity on this regression task and found that the size of the dataset had a significant impact on performance, while deeper models did not perform better. Lack of interpretability and dependence on a silver ground truth are the main drawbacks of this direct regression approach.
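To make the reported error measure concrete, here is a minimal sketch of one common way to compute a mean relative absolute error for a single subject, averaging over that subject's variables. The exact definition used in the thesis may differ (for example in how it averages over variables or brain regions), and the function name and example values are assumptions for this illustration.

```python
def mean_relative_absolute_error(predicted, target):
    """Average of |prediction - target| / |target| over all variables.

    Assumes both sequences have the same length and no target is zero.
    """
    if len(predicted) != len(target) or not target:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(p - t) / abs(t) for p, t in zip(predicted, target)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical values for two morphometric variables of one subject.
    reference_values = [100.0, 200.0]
    cnn_predictions = [105.0, 190.0]
    error = mean_relative_absolute_error(cnn_predictions, reference_values)
    print(f"mean relative absolute error: {error:.1%}")
```

Because the error is relative to the reference value, variables measured on very different scales contribute equally to the average.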

Home Monitoring by Radar

Lars Ziegler · Sept. 2022.

Detection and tracking of humans via UWB radars is a promising and continuously evolving field with great potential for medical technology. This contactless method of acquiring data on a patient's movement patterns is ideal for in-home applications. As irregularities in a patient's movement patterns are an indicator of various health problems, including neurodegenerative diseases, the insight this data could provide may enable earlier detection of such problems. In this thesis, a signal processing pipeline is presented with which a person's movement is modeled. During an experiment, 142 measurements were recorded by two separate radar systems and one lidar system, each of which consisted of multiple sensors. The models that the signal processing pipeline calculated from these measurements were used to predict the times when a person stood up or sat down. The predictions showed an accuracy of 72.2%.

Revisiting non-learning based 3D reconstruction from multiple images

Aaron Sägesser · Oct. 2021.

Arthroscopy consists of challenging tasks and requires skills that, even today, young surgeons still train directly during surgery. Existing simulators are expensive and rarely available. Given the growing potential of virtual reality (VR) head-mounted devices for simulation and their applicability in the medical context, these devices have become a promising alternative that would be orders of magnitude cheaper and could be made widely available. The overall aim of our project is to build a VR-based training device for arthroscopy, as this would be of great benefit and might even be applicable to other minimally invasive surgery (MIS). This thesis marks a first step of the project, with a focus on exploring and comparing well-known algorithms in a multi-view stereo (MVS) based 3D reconstruction with respect to imagery acquired by an arthroscopic camera. Simultaneously with this reconstruction, we aim to gain essential measures to compare the VR environment to the real world, as validation of the realism of future VR tasks. We evaluate 3 different feature extraction algorithms with 3 different matching techniques and 2 different algorithms for the estimation of the fundamental (F) matrix. The evaluation of these 18 different setups is made with a reconstruction pipeline embedded in a Jupyter notebook, implemented in Python and based on common computer vision libraries, and compared with imagery generated with a mobile phone as well as with the reconstruction results of the state-of-the-art (SOTA) structure-from-motion (SfM) software COLMAP and Multi-View Environment (MVE). Our comparative analysis reveals the challenges of heavy distortion, the fish-eye shape and weak image quality of arthroscopic imagery, as all results are substantially worse using this data. However, there are huge differences between the setups. Scale Invariant Feature Transform (SIFT) and Oriented FAST and Rotated BRIEF (ORB) in combination with k-Nearest Neighbour (kNN) matching and Least Median of Squares (LMedS) present the most promising results. Overall, the 3D reconstruction pipeline is a useful tool to foster the process of gaining measurements from the arthroscopic exploration device and to complement the comparative research in this context.

Examination of Unsupervised Representation Learning by Predicting Image Rotations

Eric Lagger · Sept. 2020.

In recent years, deep convolutional neural networks have made a lot of progress. Training such a network requires a lot of data, and supervised learning algorithms additionally require the data to be labeled. Labeling data takes a lot of human work, time and money. To avoid these inconveniences, we would like to find systems that don't need labeled data, i.e. unsupervised learning algorithms. This is why unsupervised algorithms are important, even though their results are not yet on the same qualitative level as those of supervised algorithms. In this thesis we discuss one such approach and compare the results to other papers. A deep convolutional neural network is trained to learn the rotations that have been applied to a picture. We take a large number of images and apply some simple rotations, and the task of the network is to discover in which direction each image has been rotated. The data doesn't need to be labeled with any category or anything else. As long as all the pictures are upside down we hope to find some high dimensional patterns for the network to learn.
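The rotation pretext task described above can be sketched as follows: every unlabeled image yields several training samples whose label is simply the rotation that was applied. The sketch assumes the four multiples of 90 degrees as the "simple rotations" and represents an image as a list of pixel rows; both choices, and the function names, are assumptions for this illustration rather than details taken from the thesis.

```python
def rotate90(image):
    """Rotate a 2D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotation_samples(image):
    """Return (rotated_image, label) pairs for 0, 90, 180 and 270 degrees.

    The label is the number of 90-degree clockwise turns applied, so it is
    generated automatically and no human annotation is needed.
    """
    samples = []
    current = [list(row) for row in image]
    for label in range(4):
        samples.append((current, label))
        current = rotate90(current)
    return samples


if __name__ == "__main__":
    tiny_image = [[1, 2],
                  [3, 4]]
    for rotated, label in rotation_samples(tiny_image):
        print(label, rotated)
```

A classifier trained on such pairs has to recognise object orientation, which is the signal the approach relies on to learn useful features.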

StitchNet: Image Stitching using Autoencoders and Deep Convolutional Neural Networks

Maurice Rupp · Sept. 2019.

This thesis explores the prospect of artificial neural networks for image processing tasks. More specifically, it aims to stitch multiple overlapping images to form a bigger, panoramic picture. Until now, this task has been approached solely with "classical", hardcoded algorithms, while deep learning is at most used for specific subtasks. This thesis introduces a novel end-to-end neural network approach to image stitching called StitchNet, which uses a pre-trained autoencoder and deep convolutional networks. In addition to presenting several new datasets for the task of supervised image stitching, each with 120'000 training and 5'000 validation samples, this thesis also conducts various experiments with different kinds of existing networks designed for image superresolution and image segmentation, adapted to the task of image stitching. StitchNet outperforms most of the adapted networks in both quantitative and qualitative results.

Facial Expression Recognition in the Wild

Luca Rolshoven · Sept. 2019.

The idea of inferring the emotional state of a subject by looking at their face is nothing new. Neither is the idea of automating this process using computers. Researchers used to computationally extract handcrafted features from face images that had proven themselves to be effective and then used machine learning techniques to classify the facial expressions using these features. Recently, there has been a trend towards using deep learning, and especially Convolutional Neural Networks (CNNs), for the classification of these facial expressions. Researchers were able to achieve good results on images that were taken in laboratories under the same or at least similar conditions. However, these models do not perform very well on more arbitrary face images with different head poses and illumination. This thesis aims to show the challenges of Facial Expression Recognition (FER) in this wild setting. It presents the currently used datasets and the present state-of-the-art results on one of the biggest facial expression datasets currently available. The contributions of this thesis are twofold. Firstly, I analyze three famous neural network architectures and their effectiveness on the classification of facial expressions. Secondly, I present two modifications of one of these networks that lead to the proposed STN-COV model. While this model does not outperform all of the current state-of-the-art models, it does beat several of them.

A Study of 3D Reconstruction of Varying Objects with Deformable Parts Models

Raoul Grossenbacher · July 2019.

This work covers a new approach to 3D reconstruction. In traditional 3D reconstruction, one uses multiple images of the same object to calculate a 3D model, taking information gained from the differences between the images, such as camera position, illumination of the images, rotation of the object and so on, to compute a point cloud representing the object. The characteristic trait shared by all these approaches is that one can change almost everything about the image, but it is not possible to change the object itself, because one needs to find correspondences between the images. To be able to use different instances of the same object, we used a 3D DPM model that can find different parts of an object in an image, thereby detecting the correspondences between the different pictures, which we can then use to calculate the 3D model. To put this theory into practice, we gave a 3D DPM model, which was trained to detect cars, pictures of different car brands, where no pair of images showed the same vehicle, and used the detected correspondences and the Factorization Method to compute the 3D point cloud. This technique leads to a completely new approach in 3D reconstruction, because changing the object itself was never done before.

Motion deblurring in the wild replication and improvements

Alvaro Juan Lahiguera · Jan. 2019.

Coma Outcome Prediction with Convolutional Neural Networks

Stefan Jonas · Oct. 2018.

Automatic Correction of Self-introduced Errors in Source Code

Sven Kellenberger · Aug. 2018.

Neural Face Transfer: Training a Deep Neural Network to Face-Swap

Till Nikolaus Schnabel · July 2018.

This thesis explores the field of artificial neural networks with realistic looking visual outputs. It aims at morphing face pictures of a specific identity to look like another individual by only modifying key features, such as eye color, while leaving identity-independent features unchanged. Prior works have covered the topic of symmetric translation between two specific domains but failed to optimize it on faces where only parts of the image may be changed. This work applies a face masking operation to the output at training time, which forces the image generator to preserve colors while altering the face, fitting it naturally inside the unmorphed surroundings. Various experiments are conducted including an ablation study on the final setting, decreasing the baseline identity switching performance from 81.7% to 75.8% whilst improving the average χ² color distance from 0.551 to 0.434. The provided code-based software gives users easy access to apply this neural face swap to images and videos of arbitrary crop and brings Computer Vision one step closer to replacing Computer Graphics in this specific area.

A Study of the Importance of Parts in the Deformable Parts Model

Sammer Puran · June 2017.

Self-Similarity as a Meta Feature

Lucas Husi · April 2017.

A Study of 3D Deformable Parts Models for Detection and Pose-Estimation

Simon Jenni · March 2015.

Amodal Leaf Segmentation

Nicolas Maier · Nov. 2023.

Plant phenotyping is the process of measuring and analyzing various traits of plants. It provides essential information on how genetic and environmental factors affect plant growth and development. Manual phenotyping is highly time-consuming; therefore, many computer vision and machine learning based methods have been proposed in the past years to perform this task automatically based on images of the plants. However, the publicly available datasets (in particular, of Arabidopsis thaliana) are limited in size and diversity, making them unsuitable to generalize to new unseen environments. In this work, we propose a complete pipeline able to automatically extract traits of interest from an image of Arabidopsis thaliana. Our method uses a minimal amount of existing annotated data from a source domain to generate a large synthetic dataset adapted to a different target domain (e.g., different backgrounds, lighting conditions, and plant layouts). In addition, unlike the source dataset, the synthetic one provides ground-truth annotations for the occluded parts of the leaves, which are relevant when measuring some characteristics of the plant, e.g., its total area. This synthetic dataset is then used to train a model to perform amodal instance segmentation of the leaves to obtain the total area, leaf count, and color of each plant. To validate our approach, we create a small dataset composed of manually annotated real images of Arabidopsis thaliana, which is used to assess the performance of the models.

Assessment of movement and pose in a hospital bed by ambient and wearable sensor technology in healthy subjects

Tony Licata · Sept. 2022.

The use of automated systems describing human motion has become possible in various domains. Most of the proposed systems are designed to work with people moving around in a standing position. Because such a system could be interesting in a medical environment, we propose in this work a pipeline that can effectively predict the motion of people lying in beds. The proposed pipeline is tested with a data set composed of 41 participants executing 7 predefined tasks in a bed. The motion of the participants is measured with video cameras, accelerometers and a pressure mat. Various experiments are carried out with the information retrieved from the data set. Two approaches combining the data from the different measurement technologies are explored. The performance of the different experiments is measured, and the proposed pipeline is composed of the components providing the best results. Later on, we show that the proposed pipeline only needs to use the video cameras, which makes the proposed environment easier to implement in real-life situations.

Machine Learning Based Prediction of Mental Health Using Wearable-measured Time Series

Seyedeh Sharareh Mirzargar · Sept. 2022.

Depression is the second major cause for years spent in disability and has a growing prevalence in adolescents. The recent Covid-19 pandemic has intensified the situation and limited in-person patient monitoring due to distancing measures. Recent advances in wearable devices have made it possible to record the rest/activity cycle remotely with high precision and in real-world contexts. We aim to use machine learning methods to predict an individual's mental health based on wearable-measured sleep and physical activity. Predicting an impending mental health crisis of an adolescent allows for prompt intervention, detection of depression onset or its recurrence, and remote monitoring. To achieve this goal, we train three primary forecasting models (linear regression, random forest, and LightGBM, a light gradient boosted machine) and two deep learning models (a block recurrent neural network, block RNN, and a temporal convolutional network, TCN) on Actigraph measurements to forecast mental health in terms of depression, anxiety, sleepiness, stress, sleep quality, and behavioral problems. Our models achieve a high forecasting performance, with the random forest performing best and reaching an accuracy of 98% for forecasting the trait anxiety. We perform extensive experiments to evaluate the models' performance in accuracy, generalization, and feature utilization, using a naive forecaster as the baseline. Our analysis shows minimal mental health changes over two months, making the prediction task easily achievable. Due to these minimal changes in mental health, the models tend to primarily use the historical values of mental health evaluation instead of Actigraph features. At the time of this master thesis, the data acquisition step is still in progress. In future work, we plan to train the models on the complete dataset using a longer forecasting horizon to increase the level of mental health changes and perform transfer learning to compensate for the small dataset size.
This interdisciplinary project demonstrates the opportunities and challenges in machine learning based prediction of mental health, paving the way toward using the same techniques to forecast other mental disorders such as internalizing disorder, Parkinson's disease, Alzheimer's disease, etc. and improving the quality of life for individuals who have some mental disorder.
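The abstract mentions a naive forecaster as the baseline. A common choice, assumed here because the abstract does not define it, is the persistence forecast that repeats the last observed value. The sketch below also shows why such a baseline is hard to beat when the target barely changes over time; the function names and the score values are hypothetical.

```python
def naive_forecast(history, horizon):
    """Persistence baseline: repeat the last observed value `horizon` times."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


def mean_absolute_error(predicted, actual):
    """Average absolute difference between forecasts and observed values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    # Hypothetical questionnaire scores that barely change over time.
    past_scores = [3.0, 4.0, 5.0]
    future_scores = [5.0, 6.0]
    forecast = naive_forecast(past_scores, horizon=len(future_scores))
    print(forecast, mean_absolute_error(forecast, future_scores))
```

Any learned model should be compared against this error; if the series is nearly constant, a model that mostly copies past values will look accurate without using the sensor features.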

CNN Spike Detector: Detection of Spikes in Intracranial EEG using Convolutional Neural Networks

Stefan Jonas · Oct. 2021.

The detection of interictal epileptiform discharges in the visual analysis of electroencephalography (EEG) is an important but very difficult, tedious, and time-consuming task. There have been decades of research on computer-assisted detection algorithms, most recently focused on using Convolutional Neural Networks (CNNs). In this thesis, we present the CNN Spike Detector, a convolutional neural network to detect spikes in intracranial EEG. Our dataset of 70 intracranial EEG recordings from 26 subjects with epilepsy introduces new challenges in this research field. We report cross-validation results with a mean AUC of 0.926 (± 0.04), an area under the precision-recall curve (AUPRC) of 0.652 (± 0.10) and 12.3 (± 7.47) false positive epochs per minute for a sensitivity of 80%. A visual examination of false positive segments is performed to understand the model behavior leading to a relatively high false detection rate. We notice issues with the evaluation measures and highlight a major limitation of the common approach of detecting spikes using short segments, namely that the network is not capable of considering the greater context of the segment with regard to its origin. For this reason, we present the Context Model, an extension in which the CNN Spike Detector is supplied with additional information about the channel. Results show promising but limited performance improvements. This thesis provides important findings about the spike detection task for intracranial EEG and lays out promising future research directions to develop a network capable of assisting experts in real-world clinical applications.
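The two operating-point measures quoted above, sensitivity and false positive epochs per minute, can be sketched as follows. The sketch assumes non-overlapping epochs of a fixed duration with one binary prediction and one binary label per epoch; the thesis may segment and count differently, and the function names and example values are hypothetical.

```python
def sensitivity(predictions, labels):
    """Fraction of true spike epochs that were detected."""
    true_pos = sum(1 for p, y in zip(predictions, labels) if p and y)
    false_neg = sum(1 for p, y in zip(predictions, labels) if not p and y)
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


def false_positives_per_minute(predictions, labels, epoch_seconds):
    """Number of falsely flagged epochs per minute of recording.

    Assumes epochs do not overlap, so the recording length is
    len(labels) * epoch_seconds.
    """
    if not labels or epoch_seconds <= 0:
        raise ValueError("need at least one epoch of positive duration")
    false_pos = sum(1 for p, y in zip(predictions, labels) if p and not y)
    minutes = len(labels) * epoch_seconds / 60.0
    return false_pos / minutes


if __name__ == "__main__":
    # Six hypothetical 10-second epochs: 1 = spike, 0 = no spike.
    predicted = [1, 1, 0, 0, 1, 0]
    annotated = [1, 0, 1, 0, 0, 0]
    print(sensitivity(predicted, annotated))
    print(false_positives_per_minute(predicted, annotated, epoch_seconds=10))
```

Reporting the false positive rate at a fixed sensitivity, as the abstract does, makes detectors comparable even when spikes are rare and overall accuracy would be dominated by the negative class.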

PolitBERT - Deepfake Detection of American Politicians using Natural Language Processing

Maurice Rupp · April 2021.

This thesis explores the application of modern Natural Language Processing techniques to the detection of artificially generated videos of popular American politicians. Instead of focusing on detecting anomalies and artifacts in images and sounds, this thesis focuses on detecting irregularities and inconsistencies in the words themselves, opening up a new possibility to detect fake content. A novel, domain-adapted, pre-trained version of the language model BERT, combined with several mechanisms to overcome severe dataset imbalances, yielded the best quantitative as well as qualitative results. In addition to creating the biggest publicly available dataset of English-speaking politicians, consisting of 1.5 M sentences from over 1000 persons, this thesis conducts various experiments with different kinds of text classification and sequence processing algorithms applied to the political domain. Furthermore, multiple ablations to manage severe data imbalance are presented and evaluated.

A Study on the Inversion of Generative Adversarial Networks

Ramona Beck · March 2021.

The desire to use generative adversarial networks (GANs) for real-world tasks such as object segmentation or image manipulation is increasing as synthesis quality improves, which has given rise to an emerging research area called GAN inversion that focuses on exploring methods for embedding real images into the latent space of a GAN. In this work, we investigate different GAN inversion approaches using an existing generative model architecture that takes a completely unsupervised approach to object segmentation and is based on StyleGAN2. In particular, we propose and analyze algorithms for embedding real images into the different latent spaces Z, W, and W+ of StyleGAN following an optimization-based inversion approach, while also investigating a novel approach that allows fine-tuning of the generator during the inversion process. Furthermore, we investigate a hybrid and a learning-based inversion approach, where in the former we train an encoder with embeddings optimized by our best optimization-based inversion approach, and in the latter we define an autoencoder, consisting of an encoder and the generator of our generative model as a decoder, and train it to map an image into the latent space. We demonstrate the effectiveness of our methods as well as their limitations through a quantitative comparison with existing inversion methods and by conducting extensive qualitative and quantitative experiments with synthetic data as well as real images from a complex image dataset. We show that we achieve qualitatively satisfying embeddings in the W and W+ spaces with our optimization-based algorithms, that fine-tuning the generator during the inversion process leads to qualitatively better embeddings in all latent spaces studied, and that the learning-based approach also benefits from a variable generator as well as a pre-training with our hybrid approach. 
Furthermore, we evaluate our approaches on the object segmentation task and show that both our optimization-based and our hybrid and learning-based methods are able to generate meaningful embeddings that achieve reasonable object segmentations. Overall, our proposed methods illustrate the potential that lies in the GAN inversion and its application to real-world tasks, especially in the relaxed version of the GAN inversion where the weights of the generator are allowed to vary.

Multi-scale Momentum Contrast for Self-supervised Image Classification

Zhao Xueqi · Dec. 2020.

With the maturity of supervised learning technology, the research focus is gradually shifting to the field of self-supervised learning. "Momentum Contrast" (MoCo) proposes a new self-supervised learning method and raises the accuracy of self-supervised learning to a new level. Inspired by another article, "Representation Learning by Learning to Count", we explore whether dividing a picture into four parts and passing each through a neural network can further improve the accuracy of MoCo. Unlike the original MoCo, this variant (Multi-scale MoCo) does not pass the augmented image directly through the encoder. Instead, Multi-scale MoCo crops and resizes the augmented image, and the resulting four parts are each passed through the encoder and then summed (the upsampled version does not resize the input but resizes the contrastive samples). This cropping is applied not only to the query q but also to the comparison key k; otherwise the weights on the k side might be damaged during the momentum update. This is discussed further in the experiments chapter, which compares the version that downsamples only the input samples with the version that downsamples both. Human object recognition follows a similar principle: when people see something they are familiar with, even if the object is not fully displayed, they can still guess what it is with high probability. Multi-scale MoCo therefore applies this concept to the pretext part of MoCo, hoping to obtain better feature extraction. This thesis presents three versions of Multi-scale MoCo: a version with downsampled input samples, a version with downsampled input and contrast samples, and a version with upsampled input samples. The differences between these versions are described in more detail later. The neural network architectures compared include ResNet50, and the dataset tested is STL-10.
The weights obtained in the pretext task are transferred to self-supervised learning, and during this stage the weights of all layers except the final linear layer are frozen (these weights come from the pretext task).
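The core multi-scale step, splitting an image into four parts, encoding each part and summing the results, can be sketched as follows. The quadrant split, the toy encoder and all function names are assumptions made for this illustration: the thesis uses a ResNet50 encoder and also resizes the crops, which is omitted here.

```python
def split_into_quadrants(image):
    """Split a 2D image (list of rows) into four non-overlapping quadrants.

    Returned order: top-left, top-right, bottom-left, bottom-right.
    """
    mid_row = len(image) // 2
    mid_col = len(image[0]) // 2
    top, bottom = image[:mid_row], image[mid_row:]
    return [
        [row[:mid_col] for row in top],
        [row[mid_col:] for row in top],
        [row[:mid_col] for row in bottom],
        [row[mid_col:] for row in bottom],
    ]


def multi_scale_embedding(image, encoder):
    """Encode each quadrant separately and sum the feature vectors."""
    features = [encoder(part) for part in split_into_quadrants(image)]
    return [sum(values) for values in zip(*features)]


def toy_encoder(patch):
    """Hypothetical stand-in for a real encoder: [pixel sum, max pixel]."""
    pixels = [value for row in patch for value in row]
    return [sum(pixels), max(pixels)]


if __name__ == "__main__":
    image = [[4 * r + c for c in range(4)] for r in range(4)]
    print(split_into_quadrants(image))
    print(multi_scale_embedding(image, toy_encoder))
```

In a contrastive setup the same split would be applied on both sides of the comparison, matching the abstract's remark that the cropping is used for q and for k.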

Self-Supervised Learning Using Siamese Networks and Binary Classifier

Dušan Mihajlov · March 2020.

In this thesis, we present several approaches for training a convolutional neural network using only unlabeled data. Our autonomously supervised learning algorithms are based on connections between an image patch, i.e. a zoomed image, and its original. Using a siamese neural network architecture, we aim to recognize whether the image patch, which is input to the first part of the network, comes from the same image that is presented to the second part. By applying transformations to both images, and different zoom sizes at different positions, we force the network to extract high-level features using its convolutional layers. At the top of our siamese architecture, we have a simple binary classifier that measures the difference between the feature maps that we extract and makes a decision. Thus, the only way for the classifier to solve the task correctly is for our convolutional layers to extract useful representations. We can then use those representations to solve many different tasks that are related to the data used for unsupervised training. As the main benchmark for all of our models, we used the STL-10 dataset, where we train a linear classifier on top of our convolutional layers with a small number of manually labeled images, which is a widely used benchmark for unsupervised learning tasks. We also combine our idea with recent work on the same topic, the network called RotNet, which makes use of image rotations and therefore forces the network to learn rotation-dependent features from the dataset. As a result of this combination, we create a new procedure that outperforms the original RotNet.

Learning Object Representations by Mixing Scenes

Lukas Zbinden · May 2019.

In the digital age of ever increasing data amassment and accessibility, the demand for scalable machine learning models effective at refining the new oil is unprecedented. Unsupervised representation learning methods present a promising approach to exploit this invaluable yet unlabeled digital resource at scale. However, a majority of these approaches focuses on synthetic or simplified datasets of images. What if a method could learn directly from natural Internet-scale image data? In this thesis, we propose a novel approach for unsupervised learning of object representations by mixing natural image scenes. Without any human help, our method mixes visually similar images to synthesize new realistic scenes using adversarial training. In this process the model learns to represent and understand the objects prevalent in natural image data and makes them available for downstream applications. For example, it enables the transfer of objects from one scene to another. Through qualitative experiments on complex image data we show the effectiveness of our method along with its limitations. Moreover, we benchmark our approach quantitatively against state-of-the-art works on the STL-10 dataset. Our proposed method demonstrates the potential that lies in learning representations directly from natural image data and reinforces it as a promising avenue for future research.

Representation Learning using Semantic Distances

Markus Roth · May 2019.

Zero-Shot Learning using Generative Adversarial Networks

Hamed Hemati · Dec. 2018.

Dimensionality Reduction via CNNs - Learning the Distance between Images

Ioannis Glampedakis · Sept. 2018.

Learning to Play Othello using Deep Reinforcement Learning and Self Play

Thomas Simon Steinmann · Sept. 2018.

ABA-J Interactive Multi-Modality Tissue Section-to-Volume Alignment: A Brain Atlasing Toolkit for ImageJ

Felix Meyenhofer · March 2018.

Learning Visual Odometry with Recurrent Neural Networks

Adrian Wälchli · Feb. 2018.

In computer vision, Visual Odometry is the problem of recovering the camera motion from a video. It is related to Structure from Motion, the problem of reconstructing the 3D geometry from a collection of images. Decades of research in these areas have brought successful algorithms that are used in applications like autonomous navigation, motion capture, augmented reality and others. Despite the success of these prior works in real-world environments, their robustness is highly dependent on manual calibration and the magnitude of noise present in the images in the form of, e.g., non-Lambertian surfaces, dynamic motion and other forms of ambiguity. This thesis explores an alternative approach to the Visual Odometry problem via Deep Learning, that is, a specific form of machine learning with artificial neural networks. It describes and focuses on the implementation of a recent work that proposes the use of Recurrent Neural Networks to learn dependencies over time due to the sequential nature of the input. Together with a convolutional neural network that extracts motion features from the input stream, the recurrent part accumulates knowledge from the past to make camera pose estimations at each point in time. An analysis of the performance of this system is carried out on real and synthetic data. The evaluation covers several ways of training the network as well as the impact and limitations of the recurrent connection for Visual Odometry.

Crime location and timing prediction

Bernard Swart · Jan. 2018.

From Cartoons to Real Images: An Approach to Unsupervised Visual Representation Learning

Simon Jenni · Feb. 2017.

Automatic and Large-Scale Assessment of Fluid in Retinal OCT Volume

Nina Mujkanovic · Dec. 2016.

Segmentation in 3D using Eye-Tracking Technology

Michele Wyss · July 2016.

Accurate Scale Thresholding via Logarithmic Total Variation Prior

Remo Diethelm · Aug. 2014.

Novel Techniques for Robust and Generalizable Machine Learning

Abdelhak Lemkhenter · Sept. 2023.

Neural networks have transcended their status of powerful proof-of-concept machine learning into the realm of a highly disruptive technology that has revolutionized many quantitative fields such as drug discovery, autonomous vehicles, and machine translation. Today, it is nearly impossible to go a single day without interacting with a neural network-powered application. From search engines to on-device photo-processing, neural networks have become the go-to solution thanks to recent advances in computational hardware and an unprecedented scale of training data. Larger and less curated datasets, typically obtained through web crawling, have greatly propelled the capabilities of neural networks forward. However, this increase in scale amplifies certain challenges associated with training such models. Beyond toy or carefully curated datasets, data in the wild is plagued with biases, imbalances, and various noisy components. Given the larger size of modern neural networks, such models run the risk of learning spurious correlations that fail to generalize beyond their training data. This thesis addresses the problem of training more robust and generalizable machine learning models across a wide range of learning paradigms for medical time series and computer vision tasks. The former is a typical example of a low signal-to-noise ratio data modality with a high degree of variability between subjects and datasets. There, we tailor the training scheme to focus on robust patterns that generalize to new subjects and ignore the noisier and subject-specific patterns. To achieve this, we first introduce a physiologically inspired unsupervised training task and then extend it by explicitly optimizing for cross-dataset generalization using meta-learning. 
In the context of image classification, we address the challenge of training semi-supervised models under class imbalance by designing a novel label refinement strategy with higher local sensitivity to minority class samples while preserving the global data distribution. Lastly, we introduce a new training loss for Generative Adversarial Networks (GANs). Such generative models could be applied to improve the training of subsequent models in the low data regime by augmenting the dataset using generated samples. Unfortunately, GAN training relies on a delicate balance between its components, making it prone to mode collapse. Our contribution consists of defining a more principled GAN loss whose gradients incentivize the generator model to seek out missing modes in its distribution. All in all, this thesis tackles the challenge of training more robust machine learning models that can generalize beyond their training data. This necessitates the development of methods specifically tailored to handle the diverse biases and spurious correlations inherent in the data. It is important to note that achieving greater generalizability in models goes beyond simply increasing the volume of data; it requires meticulous consideration of training objectives and model architecture. By tackling these challenges, this research contributes to advancing the field of machine learning and underscores the significance of thoughtful design in obtaining more resilient and versatile models.

Automated Sleep Scoring, Deep Learning and Physician Supervision

Luigi Fiorillo · Oct. 2022.

Sleep plays a crucial role in human well-being. Polysomnography is used in sleep medicine as a diagnostic tool to objectively analyze the quality of sleep. Sleep scoring is the procedure of extracting sleep cycle information from whole-night electrophysiological signals. The scoring is done worldwide by sleep physicians according to the official American Academy of Sleep Medicine (AASM) scoring manual. In the last decades, a wide variety of deep learning based algorithms have been proposed to automate the sleep scoring task. In this thesis we study the reasons why these algorithms fail to be introduced in the daily clinical routine, with the perspective of bridging the existing gap between automatic sleep scoring models and sleep physicians. In this light, the primary step is the design of a simplified sleep scoring architecture that also provides an estimate of the model uncertainty. Besides achieving results on par with the most up-to-date scoring systems, we demonstrate the efficiency of ensemble learning based algorithms, together with label smoothing techniques, in both enhancing the performance and calibrating the simplified scoring model. We introduce an uncertainty estimation procedure to identify the most challenging sleep stage predictions and to quantify the disagreement between the predictions given by the model and the annotations given by the physicians. In this thesis we also propose a novel method to integrate the inter-scorer variability into the training procedure of a sleep scoring model. We clearly show that a deep learning model is able to encode this variability, so as to better adapt to the consensus of a group of scoring physicians. We finally address the generalization ability of a deep learning based sleep scoring system, further studying its resilience to the sleep complexity and to the AASM scoring rules. We can state that there is no need to train the algorithm strictly following the AASM guidelines. 
Most importantly, using data from multiple data centers results in a better performing model compared with training on a single data cohort. The variability among different scorers and data centers needs to be taken into account, more than the variability among sleep disorders.
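The combination of label smoothing, ensembling, and an uncertainty estimate mentioned in the abstract can be sketched as follows. This is a minimal illustration and not the thesis implementation: the smoothing factor of 0.1, the use of predictive entropy as the uncertainty measure, and the hand-written probabilities are all assumptions. The five classes correspond to the AASM stages W, N1, N2, N3 and REM.

```python
import numpy as np

def smooth_labels(labels, num_classes, alpha=0.1):
    """One-hot targets softened by mixing with a uniform distribution."""
    one_hot = np.eye(num_classes)[labels]
    return (1.0 - alpha) * one_hot + alpha / num_classes

def ensemble_predict(member_probs):
    """Average the class probabilities of several models.

    member_probs has shape (n_models, n_epochs, n_classes).
    """
    return np.mean(member_probs, axis=0)

def predictive_entropy(probs, eps=1e-12):
    """Entropy of each prediction; a higher value means less certainty."""
    return -np.sum(probs * np.log(probs + eps), axis=-1)

# Two 30-second epochs scored by three hypothetical ensemble members.
targets = smooth_labels(np.array([0, 2]), num_classes=5, alpha=0.1)
member_probs = np.array([
    [[0.90, 0.05, 0.03, 0.01, 0.01], [0.10, 0.40, 0.40, 0.05, 0.05]],
    [[0.85, 0.10, 0.03, 0.01, 0.01], [0.05, 0.50, 0.35, 0.05, 0.05]],
    [[0.95, 0.02, 0.01, 0.01, 0.01], [0.10, 0.30, 0.50, 0.05, 0.05]],
])
mean_probs = ensemble_predict(member_probs)
uncertainty = predictive_entropy(mean_probs)
```

In this toy input the second epoch, where the members disagree between N1 and N2, receives the higher entropy and would be the one flagged for physician review.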

Learning Representations for Controllable Image Restoration

Givi Meishvili · March 2022.

Deep Convolutional Neural Networks have sparked a renaissance in all the sub-fields of computer vision. Tremendous progress has been made in the area of image restoration. The research community has pushed the boundaries of image deblurring, super-resolution, and denoising. However, given a distorted image, most existing methods typically produce a single restored output. The tasks mentioned above are inherently ill-posed, leading to an infinite number of plausible solutions. This thesis focuses on designing image restoration techniques capable of producing multiple restored results and granting users more control over the restoration process. Towards this goal, we demonstrate how one could leverage the power of unsupervised representation learning. Image restoration is vital when applied to distorted images of human faces due to their social significance. Generative Adversarial Networks enable an unprecedented level of generated facial details combined with smooth latent space. We leverage the power of GANs towards the goal of learning controllable neural face representations. We demonstrate how to learn an inverse mapping from image space to these latent representations, tuning these representations towards a specific task, and finally manipulating latent codes in these spaces. For example, we show how GANs and their inverse mappings enable the restoration and editing of faces in the context of extreme face super-resolution and the generation of novel view sharp videos from a single motion-blurred image of a face. This thesis also addresses more general blind super-resolution, denoising, and scratch removal problems, where blur kernels and noise levels are unknown. We resort to contrastive representation learning and first learn the latent space of degradations. We demonstrate that the learned representation allows inference of ground-truth degradation parameters and can guide the restoration process. 
Moreover, it enables control over the amount of deblurring and denoising in the restoration via manipulation of latent degradation features.

Learning Generalizable Visual Patterns Without Human Supervision

Simon Jenni · Oct. 2021.

Owing to the existence of large labeled datasets, Deep Convolutional Neural Networks have ushered in a renaissance in computer vision. However, almost all of the visual data we generate daily - several human lives’ worth of it - remains unlabeled and thus out of reach of today’s dominant supervised learning paradigm. This thesis focuses on techniques that steer deep models towards learning generalizable visual patterns without human supervision. Our primary tool in this endeavor is the design of Self-Supervised Learning tasks, i.e., pretext-tasks for which labels do not involve human labor. Besides enabling the learning from large amounts of unlabeled data, we demonstrate how self-supervision can capture relevant patterns that supervised learning largely misses. For example, we design learning tasks that learn deep representations capturing shape from images, motion from video, and 3D pose features from multi-view data. Notably, these tasks’ design follows a common principle: the recognition of data transformations. The strong performance of the learned representations on downstream vision tasks such as classification, segmentation, action recognition, or pose estimation validates this pretext-task design. This thesis also explores the use of Generative Adversarial Networks (GANs) for unsupervised representation learning. Besides leveraging generative adversarial learning to define image transformations for self-supervised learning tasks, we also address training instabilities of GANs through the use of noise. While unsupervised techniques can significantly reduce the burden of supervision, in the end, we still rely on some annotated examples to fine-tune learned representations towards a target task. To improve the learning from scarce or noisy labels, we describe a supervised learning algorithm with improved generalization in these challenging settings.
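The principle of recognizing data transformations can be made concrete with a very simple case. The sketch below uses image rotation by multiples of 90 degrees as the transformation. This is an assumption made for illustration (it is the transformation used by RotNet, mentioned earlier on this page) and is not necessarily one of the tasks designed in this thesis. The function name and the random input images are hypothetical.

```python
import numpy as np

def make_rotation_batch(images):
    """Rotate every image by 0, 90, 180 and 270 degrees.

    Returns the rotated images and, as the free label, the number of
    quarter turns that was applied to each one.
    """
    rotated, labels = [], []
    for image in images:
        for k in range(4):
            rotated.append(np.rot90(image, k=k, axes=(0, 1)))
            labels.append(k)
    return np.stack(rotated), np.array(labels)

rng = np.random.default_rng(0)
images = rng.random((2, 32, 32, 3))  # square images so all rotations stack
x, y = make_rotation_batch(images)
```

A classifier trained to predict `y` from `x` needs no human annotation, since the labels are produced by the transformation itself.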

Learning Interpretable Representations of Images

Attila Szabó · June 2019.

Computers represent images with pixels and each pixel contains three numbers for red, green and blue colour values. These numbers are meaningless for humans and they are mostly useless when used directly with classical machine learning techniques like linear classifiers. Interpretable representations are the attributes that humans understand: the colour of the hair, viewpoint of a car or the 3D shape of the object in the scene. Many computer vision tasks can be viewed as learning interpretable representations, for example a supervised classification algorithm directly learns to represent images with their class labels. In this work we aim to learn interpretable representations (or features) indirectly with lower levels of supervision. This approach has the advantage of cost savings on dataset annotations and the flexibility of using the features for multiple follow-up tasks. We made contributions in three main areas: weakly supervised learning, unsupervised learning and 3D reconstruction. In the weakly supervised case we use image pairs as supervision. Each pair shares a common attribute and differs in a varying attribute. We propose a training method that learns to separate the attributes into separate feature vectors. These features then are used for attribute transfer and classification. We also show theoretical results on the ambiguities of the learning task and the ways to avoid degenerate solutions. We show a method for unsupervised representation learning, that separates semantically meaningful concepts. We explain and show ablation studies how the components of our proposed method work: a mixing autoencoder, a generative adversarial net and a classifier. We propose a method for learning single image 3D reconstruction. It is done using only the images, no human annotation, stereo, synthetic renderings or ground truth depth map is needed. We train a generative model that learns the 3D shape distribution and an encoder to reconstruct the 3D shape. 
For that we exploit the notion of image realism. It means that the 3D reconstruction of the object has to look realistic when it is rendered from different random angles. We prove the efficacy of our method from first principles.

Learning Controllable Representations for Image Synthesis

Qiyang Hu · June 2019.

In this thesis, our focus is learning a controllable representation and applying the learned controllable feature representation to image synthesis, video generation, and even 3D reconstruction. We propose different methods to disentangle the feature representation in neural networks and analyze the challenges in disentanglement, such as the reference ambiguity and the shortcut problem, when using weak labels. We use the disentangled feature representation to transfer attributes between images, such as exchanging the hairstyle between two face images. Furthermore, we study the problem of how another type of feature, the sketch, works in a neural network. The sketch can provide the shape and contour of an object, such as the silhouette of a side-view face. We leverage the silhouette constraint to improve 3D face reconstruction from 2D images. The sketch can also provide the moving directions of an object, thus we investigate how one can manipulate the object to follow the trajectory provided by a user sketch. We propose a method to automatically generate video clips from a single image input using the sketch as motion and trajectory guidance to animate the object in that image. We demonstrate the efficiency of our approaches on several synthetic and real datasets.

Beyond Supervised Representation Learning

Mehdi Noroozi · Jan. 2019.

The complexity of any information processing task is highly dependent on the space where data is represented. Unfortunately, pixel space is not appropriate for computer vision tasks such as object classification. Traditional computer vision approaches involve a multi-stage pipeline where images are first transformed to a feature space through a handcrafted function and the task is then solved in that feature space. The challenge with this approach is the complexity of designing handcrafted functions that extract robust features. Deep learning based approaches address this issue by end-to-end training of a neural network for some tasks, which lets the network discover the appropriate representation for the training tasks automatically. It turns out that the image classification task on large-scale annotated datasets yields a representation transferable to other computer vision tasks. However, supervised representation learning is limited by the availability of annotations. In this thesis we study self-supervised representation learning where the goal is to alleviate these limitations by substituting the classification task with pseudo tasks where the labels come for free. We discuss self-supervised learning by solving jigsaw puzzles, which uses context as a supervisory signal. The rationale behind this task is that the network needs to extract features about object parts and their spatial configurations to solve the jigsaw puzzles. We also discuss a method for representation learning that uses an artificial supervisory signal based on counting visual primitives. This supervisory signal is obtained from an equivariance relation. We use two image transformations in the context of counting: scaling and tiling. The first transformation exploits the fact that the number of visual primitives should be invariant to scale. The second transformation allows us to equate the total number of visual primitives in each tile to that in the whole image. 
The most effective transfer strategy is fine-tuning, which restricts one to use the same model or parts thereof for both pretext and target tasks. We discuss a novel framework for self-supervised learning that overcomes limitations in designing and comparing different tasks, models, and data domains. In particular, our framework decouples the structure of the self-supervised model from the final task-specific fine-tuned model. Finally, we study the problem of multi-task representation learning. A naive approach to enhance the representation learned by a task is to train the task jointly with other tasks that capture orthogonal attributes. Having a diverse set of auxiliary tasks imposes challenges on multi-task training from scratch. We propose a framework that allows us to combine arbitrarily different feature spaces into a single deep neural network. We reduce the auxiliary tasks to classification tasks and, consequently, the multi-task learning to a multi-label classification task. Nevertheless, combining multiple representation spaces without being aware of the target task might be suboptimal. As our second contribution, we show empirically that this is indeed the case and propose to combine multiple tasks after the fine-tuning on the target task.
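The jigsaw pretext task from the abstract can be illustrated by the way a training example is built. The sketch below is a simplified assumption-laden version: the 3x3 grid, the set of ten random permutations, and all function names are hypothetical choices, and the network that predicts the permutation index is left out.

```python
import numpy as np

def split_into_tiles(image, grid=3):
    """Cut an image into grid x grid equally sized tiles, row by row."""
    h, w = image.shape[:2]
    th, tw = h // grid, w // grid
    return [image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
            for r in range(grid) for c in range(grid)]

def make_jigsaw_example(image, permutations, rng, grid=3):
    """Shuffle the tiles with a randomly chosen permutation.

    Returns the shuffled tiles and the index of the permutation, which
    serves as the free classification label.
    """
    tiles = split_into_tiles(image, grid)
    label = int(rng.integers(len(permutations)))
    shuffled = [tiles[i] for i in permutations[label]]
    return np.stack(shuffled), label

rng = np.random.default_rng(0)
# Hypothetical fixed set of ten tile orderings used as the class vocabulary.
permutations = [tuple(int(i) for i in rng.permutation(9)) for _ in range(10)]
image = rng.random((96, 96, 3))  # random stand-in for an unlabeled image
tiles, label = make_jigsaw_example(image, permutations, rng)
```

To predict `label`, a model has to reason about how the tiles relate to each other spatially, which is the context signal the abstract refers to.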

Motion Deblurring from a Single Image

Meiguang Jin · Dec. 2018.

With the information explosion, a tremendous amount of photos is captured and shared via social media every day. Technically, a photo requires a finite exposure to accumulate light from the scene. Thus, objects moving during the exposure generate motion blur in a photo. Motion blur is an image degradation that makes visual content less interpretable and is therefore often seen as a nuisance. Although motion blur can be reduced by setting a short exposure time, an insufficient amount of light has to be compensated through increasing the sensor’s sensitivity, which will inevitably bring a large amount of sensor noise. This motivates the necessity of removing motion blur computationally. Motion deblurring is an important problem in computer vision and it is challenging due to its ill-posed nature, which means the solution is not well defined. Mathematically, a blurry image caused by uniform motion is formed by the convolution operation between a blur kernel and a latent sharp image. Potentially there are infinitely many pairs of blur kernel and latent sharp image that can result in the same blurry image. Hence, some prior knowledge or regularization is required to address this problem. Even if the blur kernel is known, restoring the latent sharp image is still difficult as the high frequency information has been removed. Although we can model the uniform motion deblurring problem mathematically, it can only address the camera in-plane translational motion. Practically, motion is more complicated and can be non-uniform. Non-uniform motion blur can come from many sources, such as camera out-of-plane rotation, scene depth change, object motion and so on. Thus, it is more challenging to remove non-uniform motion blur. In this thesis, our focus is motion blur removal. We aim to address four challenging motion deblurring problems. We start from the noise blind image deblurring scenario where the blur kernel is known but the noise level is unknown. 
We introduce an efficient and robust solution based on a Bayesian framework using a smooth generalization of the 0−1 loss to address this problem. Then we study the blind uniform motion deblurring scenario where both the blur kernel and the latent sharp image are unknown. We explore the relative scale ambiguity between the latent sharp image and blur kernel to address this issue. Moreover, we study the face deblurring problem and introduce a novel deep learning network architecture to solve it. We also address the general motion deblurring problem and particularly we aim at recovering a sequence of 7 frames each depicting some instantaneous motion of the objects in the scene.
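The uniform blur model stated in the abstract, a blurry image formed by convolving a latent sharp image with a blur kernel, can be simulated in a few lines. The sketch below is only an illustration of that forward model: the horizontal box kernel, the single-pixel test image, the Gaussian noise level, and the function names are assumptions, and no deblurring method from the thesis is implemented.

```python
import numpy as np

def horizontal_motion_kernel(length):
    """Blur kernel for uniform horizontal motion: a normalised 1 x length box."""
    return np.full((1, length), 1.0 / length)

def convolve2d_same(image, kernel):
    """Plain 2D convolution with zero padding; output has the input's shape."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, kh - 1 - ph), (pw, kw - 1 - pw)))
    flipped = kernel[::-1, ::-1]  # flipping turns correlation into convolution
    out = np.zeros_like(image, dtype=float)
    for i in range(kh):
        for j in range(kw):
            window = padded[i:i + image.shape[0], j:j + image.shape[1]]
            out += flipped[i, j] * window
    return out

# A single bright point makes the effect of the kernel easy to see.
sharp = np.zeros((9, 9))
sharp[4, 4] = 1.0
kernel = horizontal_motion_kernel(5)
blurry = convolve2d_same(sharp, kernel)

# Sensor noise with an assumed standard deviation of 0.01.
rng = np.random.default_rng(0)
noisy = blurry + 0.01 * rng.standard_normal(blurry.shape)
```

The point is smeared into a five-pixel streak along the motion direction. Blind deblurring has to recover both `sharp` and `kernel` from `noisy` alone, which is why additional priors are needed.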

Towards a Novel Paradigm in Blind Deconvolution: From Natural to Cartooned Image Statistics

Daniele Perrone · July 2015.

In this thesis we study the blind deconvolution problem. Blind deconvolution consists in the estimation of a sharp image and a blur kernel from an observed blurry image. Because the blur model admits several solutions it is necessary to devise an image prior that favors the true blur kernel and sharp image. Recently it has been shown that a class of blind deconvolution formulations and image priors has the no-blur solution as global minimum. Despite this shortcoming, algorithms based on these formulations and priors can successfully solve blind deconvolution. In this thesis we show that a suitable initialization can exploit the non-convexity of the problem and yield the desired solution. Based on these conclusions, we propose a novel “vanilla” algorithm stripped of any enhancement typically used in the literature. Our algorithm, despite its simplicity, is able to compete with the top performers on several datasets. We have also investigated a remarkable behavior of a 1998 algorithm, whose formulation has the no-blur solution as global minimum: even when initialized at the no-blur solution, it converges to the correct solution. We show that this behavior is caused by an apparently insignificant implementation strategy that makes the algorithm no longer minimize the original cost functional. We also demonstrate that this strategy improves the results of our “vanilla” algorithm. Finally, we present a study of image priors for blind deconvolution. We provide experimental evidence supporting the recent belief that a good image prior is one that leads to a good blur estimate rather than being a good natural image statistical model. By focusing the attention on the blur estimation alone, we show that good blur estimates can be obtained even when using images quite different from the true sharp image. This allows using image priors, such as those leading to “cartooned” images, that avoid the no-blur solution. 
By using an image prior that produces “cartooned” images we achieve state-of-the-art results on different publicly available datasets. We therefore suggest a shift of paradigm in blind deconvolution: from modeling natural image statistics to modeling cartooned image statistics.

New Perspectives on Uncalibrated Photometric Stereo

Thoma Papadhimitri · June 2014.

This thesis investigates the problem of 3D reconstruction of a scene from 2D images. In particular, we focus on photometric stereo, which is a technique that computes the 3D geometry from at least three images taken from the same viewpoint and under different illumination conditions. When the illumination is unknown (uncalibrated photometric stereo) the problem is ambiguous: different combinations of geometry and illumination can generate the same images. First, we solve the ambiguity by exploiting the Lambertian reflectance maxima. These are points defined on curved surfaces where the normals are parallel to the light direction. Then, we propose a solution that can be computed in closed form and thus very efficiently. Our algorithm is also very robust and always yields the same estimate regardless of the initial ambiguity. We validate our method on real world experiments and achieve state-of-the-art results. In this thesis we also solve for the first time the uncalibrated photometric stereo problem under the perspective projection model. We show that unlike in the orthographic case, one can uniquely reconstruct the normals of the object and the lights given only the input images and the camera calibration (focal length and image center). We also propose a very efficient algorithm, which we validate on synthetic and real world experiments, and show that the proposed technique is a generalization of the orthographic case. Finally, we investigate the uncalibrated photometric stereo problem in the case where the lights are distributed near the scene. In this case we propose an alternating minimization technique which converges quickly and overcomes the limitations of prior work that assumes distant illumination. We show experimentally that adopting a near-light model for real world scenes yields very accurate reconstructions.
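For orientation, the calibrated version of the problem and the ambiguity of the uncalibrated one can be shown on synthetic data. The sketch below assumes a Lambertian surface without shadows, distant lights, and known light directions for the first part. The light directions, the matrix `A`, and all names are hypothetical, and the methods proposed in the thesis for resolving the ambiguity are not implemented here.

```python
import numpy as np

def photometric_stereo(intensities, lights):
    """Recover albedo and unit normals under the Lambertian model.

    intensities: (n_images, n_pixels), lights: (n_images, 3).
    Solves lights @ b = intensities in the least-squares sense,
    where b = albedo * normal for every pixel.
    """
    b, *_ = np.linalg.lstsq(lights, intensities, rcond=None)
    albedo = np.linalg.norm(b, axis=0)
    normals = b / albedo
    return albedo, normals

# Synthetic scene: 50 pixels with random normals facing the camera.
rng = np.random.default_rng(0)
true_normals = rng.normal(size=(3, 50))
true_normals[2] = np.abs(true_normals[2]) + 1.0
true_normals /= np.linalg.norm(true_normals, axis=0)
true_albedo = rng.uniform(0.2, 1.0, size=50)
b_true = true_albedo * true_normals

# Three linearly independent light directions (assumed known here).
lights = np.array([[0.0, 0.0, 1.0], [0.5, 0.0, 1.0], [0.0, 0.5, 1.0]])
images = lights @ b_true  # shadow-free Lambertian rendering

albedo, normals = photometric_stereo(images, lights)

# Uncalibrated case: an invertible matrix A gives a different pair of
# lights and scaled normals that renders exactly the same images.
A = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.2, -0.1, 1.5]])
lights_alt = lights @ np.linalg.inv(A)
b_alt = A @ b_true
same_images = np.allclose(lights_alt @ b_alt, images)
```

With known lights the normals are recovered exactly in this noise-free setting, while the final check shows why unknown lights leave the reconstruction ambiguous unless extra constraints are used.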

Analytics Insight

Top 10 Research and Thesis Topics for ML Projects in 2022

This article features the top 10 research and thesis topics for ML projects for students to try in 2022

  • Text mining and text classification
  • Image-based applications
  • Machine vision
  • Optimization
  • Voice classification
  • Sentiment analysis
  • Recommendation framework project
  • Mall customers’ project
  • Object detection with deep learning

The Future of AI Research: 20 Thesis Ideas for Undergraduate Students in Machine Learning and Deep Learning for 2023!

A comprehensive guide for crafting an original and innovative thesis in the field of AI.

By Aarafat Islam on 2023-01-11

“The beauty of machine learning is that it can be applied to any problem you want to solve, as long as you can provide the computer with enough examples.” — Andrew Ng

This article provides a list of 20 potential thesis ideas for an undergraduate program in machine learning and deep learning in 2023. Each thesis idea includes an introduction, which presents a brief overview of the topic and the research objectives. The ideas provided are related to different areas of machine learning and deep learning, such as computer vision, natural language processing, robotics, finance, drug discovery, and more. The article also includes explanations, examples, and conclusions for each thesis idea, which can help guide the research and provide a clear understanding of the potential contributions and outcomes of the proposed research. The article also emphasizes the importance of originality and the need for proper citation in order to avoid plagiarism.

1. Investigating the use of Generative Adversarial Networks (GANs) in medical imaging:  A deep learning approach to improve the accuracy of medical diagnoses.

Introduction:  Medical imaging is an important tool in the diagnosis and treatment of various medical conditions. However, accurately interpreting medical images can be challenging, especially for less experienced doctors. This thesis aims to explore the use of GANs in medical imaging, in order to improve the accuracy of medical diagnoses.

2. Exploring the use of deep learning in natural language generation (NLG): An analysis of the current state-of-the-art and future potential.

Introduction:  Natural language generation is an important field in natural language processing (NLP) that deals with creating human-like text automatically. Deep learning has shown promising results in NLP tasks such as machine translation, sentiment analysis, and question-answering. This thesis aims to explore the use of deep learning in NLG and analyze the current state-of-the-art models, as well as potential future developments.

3. Development and evaluation of deep reinforcement learning (RL) for robotic navigation and control.

Introduction:  Robotic navigation and control are challenging tasks, which require a high degree of intelligence and adaptability. Deep RL has shown promising results in various robotics tasks, such as robotic arm control, autonomous navigation, and manipulation. This thesis aims to develop and evaluate a deep RL-based approach for robotic navigation and control and evaluate its performance in various environments and tasks.

4. Investigating the use of deep learning for drug discovery and development.

Introduction:  Drug discovery and development is a time-consuming and expensive process, which often involves high failure rates. Deep learning has been used to improve various tasks in bioinformatics and biotechnology, such as protein structure prediction and gene expression analysis. This thesis aims to investigate the use of deep learning for drug discovery and development and examine its potential to improve the efficiency and accuracy of the drug development process.

5. Comparison of deep learning and traditional machine learning methods for anomaly detection in time series data.

Introduction:  Anomaly detection in time series data is a challenging task, which is important in various fields such as finance, healthcare, and manufacturing. Deep learning methods have been used to improve anomaly detection in time series data, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for anomaly detection in time series data and examine their respective strengths and weaknesses.

6. Use of deep transfer learning in speech recognition and synthesis.

Introduction:  Speech recognition and synthesis are areas of natural language processing that focus on converting spoken language to text and vice versa. Transfer learning has been widely used in deep learning-based speech recognition and synthesis systems to improve their performance by reusing the features learned from other tasks. This thesis aims to investigate the use of transfer learning in speech recognition and synthesis and how it improves the performance of the system in comparison to traditional methods.

7. The use of deep learning for financial prediction.

Introduction:  Financial prediction is a challenging task that requires a high degree of intelligence and adaptability, especially in the field of stock market prediction. Deep learning has shown promising results in various financial prediction tasks, such as stock price prediction and credit risk analysis. This thesis aims to investigate the use of deep learning for financial prediction and examine its potential to improve the accuracy of financial forecasting.

8. Investigating the use of deep learning for computer vision in agriculture.

Introduction:  Computer vision has the potential to revolutionize the field of agriculture by improving crop monitoring, precision farming, and yield prediction. Deep learning has been used to improve various computer vision tasks, such as object detection, semantic segmentation, and image classification. This thesis aims to investigate the use of deep learning for computer vision in agriculture and examine its potential to improve the efficiency and accuracy of crop monitoring and precision farming.

9. Development and evaluation of deep learning models for generative design in engineering and architecture.

Introduction:  Generative design is a powerful tool in engineering and architecture that can help optimize designs and reduce human error. Deep learning has been used to improve various generative design tasks, such as design optimization and form generation. This thesis aims to develop and evaluate deep learning models for generative design in engineering and architecture and examine their potential to improve the efficiency and accuracy of the design process.

10. Investigating the use of deep learning for natural language understanding.

Introduction:  Natural language understanding is a complex task of natural language processing that involves extracting meaning from text. Deep learning has been used to improve various NLP tasks, such as machine translation, sentiment analysis, and question-answering. This thesis aims to investigate the use of deep learning for natural language understanding and examine its potential to improve the efficiency and accuracy of natural language understanding systems.

11. Comparing deep learning and traditional machine learning methods for image compression.

Introduction:  Image compression is an important task in image processing and computer vision. It enables faster data transmission and storage of image files. Deep learning methods have been used to improve image compression, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for image compression and examine their respective strengths and weaknesses.

12. Using deep learning for sentiment analysis in social media.

Introduction:  Sentiment analysis in social media is an important task that can help businesses and organizations understand their customers’ opinions and feedback. Deep learning has been used to improve sentiment analysis in social media, by training models on large datasets of social media text. This thesis aims to use deep learning for sentiment analysis in social media, and evaluate its performance against traditional machine learning methods.

13. Investigating the use of deep learning for image generation.

Introduction:  Image generation is a task in computer vision that involves creating new images from scratch or modifying existing images. Deep learning has been used to improve various image generation tasks, such as super-resolution, style transfer, and face generation. This thesis aims to investigate the use of deep learning for image generation and examine its potential to improve the quality and diversity of generated images.

14. Development and evaluation of deep learning models for anomaly detection in cybersecurity.

Introduction:  Anomaly detection in cybersecurity is an important task that can help detect and prevent cyber-attacks. Deep learning has been used to improve various anomaly detection tasks, such as intrusion detection and malware detection. This thesis aims to develop and evaluate deep learning models for anomaly detection in cybersecurity and examine their potential to improve the efficiency and accuracy of cybersecurity systems.

15. Investigating the use of deep learning for natural language summarization.

Introduction:  Natural language summarization is an important task in natural language processing that involves creating a condensed version of a text that preserves its main meaning. Deep learning has been used to improve various natural language summarization tasks, such as document summarization and headline generation. This thesis aims to investigate the use of deep learning for natural language summarization and examine its potential to improve the efficiency and accuracy of natural language summarization systems.

16. Development and evaluation of deep learning models for facial expression recognition.

Introduction:  Facial expression recognition is an important task in computer vision and has many practical applications, such as human-computer interaction, emotion recognition, and psychological studies. Deep learning has been used to improve facial expression recognition, by training models on large datasets of images. This thesis aims to develop and evaluate deep learning models for facial expression recognition and examine their performance against traditional machine learning methods.

17. Investigating the use of deep learning for generative models in music and audio.

Introduction:  Music and audio synthesis is an important task in audio processing, which has many practical applications, such as music generation and speech synthesis. Deep learning has been used to improve generative models for music and audio, by training models on large datasets of audio data. This thesis aims to investigate the use of deep learning for generative models in music and audio and examine its potential to improve the quality and diversity of generated audio.

18. Comparing deep learning models with traditional algorithms for anomaly detection in network traffic.

Introduction:  Anomaly detection in network traffic is an important task that can help detect and prevent cyber-attacks. Deep learning models have been used for this task, and traditional methods such as clustering and rule-based systems are widely used as well. This thesis aims to compare deep learning models with traditional algorithms for anomaly detection in network traffic and analyze the trade-offs between the models in terms of accuracy and scalability.

19. Investigating the use of deep learning for improving recommender systems.

Introduction:  Recommender systems are widely used in many applications such as online shopping, music streaming, and movie streaming. Deep learning has been used to improve the performance of recommender systems, by training models on large datasets of user-item interactions. This thesis aims to investigate the use of deep learning for improving recommender systems and compare its performance with traditional content-based and collaborative filtering approaches.

20. Development and evaluation of deep learning models for multi-modal data analysis.

Introduction:  Multi-modal data analysis is the task of analyzing and understanding data from multiple sources such as text, images, and audio. Deep learning has been used to improve multi-modal data analysis, by training models on large datasets of multi-modal data. This thesis aims to develop and evaluate deep learning models for multi-modal data analysis and analyze their potential to improve performance in comparison to single-modal models.

I hope that this article has provided you with a useful guide for your thesis research in machine learning and deep learning. Remember to conduct a thorough literature review and to include proper citations in your work, as well as to be original in your research to avoid plagiarism. I wish you all the best of luck with your thesis and your research endeavors!


Computer Vision Group TUM School of Computation, Information and Technology Technical University of Munich

10 Compelling Machine Learning Ph.D. Dissertations for 2020

Posted by Daniel Gutierrez, ODSC, August 19, 2020

As a data scientist, an integral part of my work in the field revolves around keeping current with research coming out of academia. I frequently scour arXiv.org for late-breaking papers that show trends and reveal fertile areas of research. Ph.D. dissertations, each the culmination of a doctoral candidate’s work toward the degree, are another valuable source of research developments. Ph.D. candidates are highly motivated to choose research topics that establish new and creative paths toward discovery in their field of study. Their dissertations are highly focused on a specific problem. If you can find a dissertation that aligns with your areas of interest, consuming the research is an excellent way to do a deep dive into the technology. After reviewing hundreds of recent theses from universities all over the country, I present 10 machine learning dissertations that I found compelling in terms of my own areas of interest.

I hope you’ll find several that match your own fields of inquiry. Each thesis may take a while to consume but will result in hours of satisfying summer reading. Enjoy!

1. Bayesian Modeling and Variable Selection for Complex Data

As we routinely encounter high-throughput data sets in complex biological and environmental research, developing novel models and methods for variable selection has received widespread attention. This dissertation addresses a few key challenges in Bayesian modeling and variable selection for high-dimensional data with complex spatial structures. 

2. Topics in Statistical Learning with a Focus on Large Scale Data

Big data vary in shape and call for different approaches. One type is tall data, i.e., a very large number of samples but not too many features. This dissertation describes a general communication-efficient algorithm for distributed statistical learning on this type of big data. The algorithm distributes the samples uniformly to multiple machines and uses a common reference data set to improve the performance of local estimates. The algorithm enables potentially much faster analysis, at a small cost to statistical performance.

Another type of big data is wide data, i.e., too many features but a limited number of samples. Such data are also called high-dimensional data, to which many classical statistical methods are not applicable. 

This dissertation discusses a method of dimensionality reduction for high-dimensional classification. The method partitions features into independent communities and splits the original classification problem into separate smaller ones. It enables parallel computing and produces more interpretable results.
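The tall-data idea of splitting samples across machines and combining local estimates can be illustrated with a minimal Python sketch. This is not the dissertation's algorithm: it uses plain one-shot averaging and leaves out the common reference data step entirely. The function names `local_slope` and `averaged_slope`, the through-the-origin regression model, and the synthetic data are all assumptions made for illustration.

```python
import random


def local_slope(xs, ys):
    """Least-squares slope through the origin, computed on one shard."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def averaged_slope(xs, ys, machines):
    """Split samples evenly across machines and average the local slopes."""
    # Deal samples round-robin so each (hypothetical) machine gets an equal share.
    shards = [(xs[i::machines], ys[i::machines]) for i in range(machines)]
    estimates = [local_slope(sx, sy) for sx, sy in shards]
    # One round of communication: each machine sends back a single number.
    return sum(estimates) / len(estimates)


# Synthetic tall data: many samples, one feature, true slope 2.0.
rng = random.Random(0)
xs = [rng.uniform(1.0, 10.0) for _ in range(1000)]
ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

global_estimate = local_slope(xs, ys)
distributed_estimate = averaged_slope(xs, ys, machines=4)
print(global_estimate, distributed_estimate)
```

On data like this, the averaged estimate should land close to the estimate computed on all samples at once, which is the trade-off the summary describes: faster analysis at a small cost to statistical performance.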

3. Sets as Measures: Optimization and Machine Learning

The purpose of this machine learning dissertation is to address the following simple question:

How do we design efficient algorithms to solve optimization or machine learning problems where the decision variable (or target label) is a set of unknown cardinality?

Optimization and machine learning have proved remarkably successful in applications requiring the choice of single vectors. Some tasks, in particular many inverse problems, call for the design, or estimation, of sets of objects. When the size of these sets is a priori unknown, directly applying optimization or machine learning techniques designed for single vectors appears difficult. The work in this dissertation shows that a very old idea for transforming sets into elements of a vector space (namely, a space of measures), a common trick in theoretical analysis, generates effective practical algorithms.

4. A Geometric Perspective on Some Topics in Statistical Learning

Modern science and engineering often generate data sets with a large sample size and a comparably large dimension which puts classic asymptotic theory into question in many ways. Therefore, the main focus of this dissertation is to develop a fundamental understanding of statistical procedures for estimation and hypothesis testing from a non-asymptotic point of view, where both the sample size and problem dimension grow hand in hand. A range of different problems are explored in this thesis, including work on the geometry of hypothesis testing, adaptivity to local structure in estimation, effective methods for shape-constrained problems, and early stopping with boosting algorithms. The treatment of these different problems shares the common theme of emphasizing the underlying geometric structure.

5. Essays on Random Forest Ensembles

A random forest is a popular machine learning ensemble method that has proven successful in solving a wide range of classification problems. While other successful classifiers, such as boosting algorithms or neural networks, admit natural interpretations as maximum likelihood, a suitable statistical interpretation is much more elusive for a random forest. The first part of this dissertation demonstrates that a random forest is a fruitful framework in which to study AdaBoost and deep neural networks. The work explores the concept and utility of interpolation, the ability of a classifier to perfectly fit its training data. The second part of this dissertation places a random forest on more sound statistical footing by framing it as kernel regression with the proximity kernel. The work then analyzes the parameters that control the bandwidth of this kernel and discusses useful generalizations.

6. Marginally Interpretable Generalized Linear Mixed Models

A popular approach for relating correlated measurements of a non-Gaussian response variable to a set of predictors is to introduce latent random variables and fit a generalized linear mixed model. The conventional strategy for specifying such a model leads to parameter estimates that must be interpreted conditional on the latent variables. In many cases, interest lies not in these conditional parameters, but rather in marginal parameters that summarize the average effect of the predictors across the entire population. Due to the structure of the generalized linear mixed model, the average effect across all individuals in a population is generally not the same as the effect for an average individual. Further complicating matters, obtaining marginal summaries from a generalized linear mixed model often requires evaluation of an analytically intractable integral or use of an approximation. Another popular approach in this setting is to fit a marginal model using generalized estimating equations. This strategy is effective for estimating marginal parameters, but leaves one without a formal model for the data with which to assess quality of fit or make predictions for future observations. Thus, there exists a need for a better approach.

This dissertation defines a class of marginally interpretable generalized linear mixed models that leads to parameter estimates with a marginal interpretation while maintaining the desirable statistical properties of a conditionally specified model. The distinguishing feature of these models is an additive adjustment that accounts for the curvature of the link function and thereby preserves a specific form for the marginal mean after integrating out the latent random variables. 
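The claim that the average effect across a population differs from the effect for an average individual can be checked numerically. The sketch below assumes a logistic model with a single normally distributed random intercept; the parameter values, the function names, and the simple grid integration are illustrative choices, not taken from the dissertation.

```python
import math


def expit(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_mean(eta, sigma, n=20001, width=10.0):
    """Approximate E[expit(eta + u)] for u ~ N(0, sigma^2) on a uniform grid."""
    lo = -width * sigma
    h = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for i in range(n):
        u = lo + i * h
        density = math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += expit(eta + u) * density * h
    return total


def effects(beta0, beta1, sigma):
    """Effect of switching a binary predictor on, on the probability scale."""
    # Conditional effect: the "average individual" with random intercept u = 0.
    conditional = expit(beta0 + beta1) - expit(beta0)
    # Marginal effect: averaged over the whole distribution of u.
    marginal = marginal_mean(beta0 + beta1, sigma) - marginal_mean(beta0, sigma)
    return conditional, marginal


conditional, marginal = effects(beta0=0.0, beta1=1.0, sigma=2.0)
print(conditional, marginal)
```

With these assumed values the marginal effect comes out smaller than the conditional one, because averaging over the random intercept flattens the curved link function. That curvature is what the additive adjustment described above is designed to account for.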

7. On the Detection of Hate Speech, Hate Speakers and Polarized Groups in Online Social Media

The objective of this dissertation is to explore the use of machine learning algorithms in understanding and detecting hate speech, hate speakers and polarized groups in online social media. Beginning with a unique typology for detecting abusive language, the work outlines the distinctions and similarities of different abusive language subtasks (offensive language, hate speech, cyberbullying and trolling) and how we might benefit from the progress made in each area. Specifically, the work suggests that each subtask can be categorized based on whether or not the abusive language being studied 1) is directed at a specific individual, or targets a generalized “Other” and 2) the extent to which the language is explicit versus implicit. The work then uses knowledge gained from this typology to tackle the “problem of offensive language” in hate speech detection. 

8. Lasso Guarantees for Dependent Data

Serially correlated high dimensional data are prevalent in the big data era. In order to predict and learn the complex relationship among the multiple time series, high dimensional modeling has gained importance in various fields such as control theory, statistics, economics, finance, genetics and neuroscience. This dissertation studies a number of high dimensional statistical problems involving different classes of mixing processes. 

9. Random forest robustness, variable importance, and tree aggregation

Random forest methodology is a nonparametric, machine learning approach capable of strong performance in regression and classification problems involving complex data sets. In addition to making predictions, random forests can be used to assess the relative importance of feature variables. This dissertation explores three topics related to random forests: tree aggregation, variable importance, and robustness. 

10. Climate Data Computing: Optimal Interpolation, Averaging, Visualization and Delivery

This dissertation solves two important problems in the modern analysis of big climate data. The first is the efficient visualization and fast delivery of big climate data, and the second is a computationally extensive principal component analysis (PCA) using spherical harmonics on the Earth’s surface. The second problem creates a way to supply the data for the technology developed in the first. Both problems are computationally difficult; one example is the representation of higher-order spherical harmonics such as Y400, which is critical for upscaling weather data to almost infinitely fine spatial resolution.

I hope you enjoyed learning about these compelling machine learning dissertations.

Editor’s note: Interested in more data science research? Check out the Research Frontiers track at ODSC Europe this September 17-19 or the ODSC West Research Frontiers track this October 27-30.

Daniel Gutierrez, ODSC

Daniel D. Gutierrez is a practicing data scientist who’s been working with data long before the field came in vogue. As a technology journalist, he enjoys keeping a pulse on this fast-paced industry. Daniel is also an educator having taught data science, machine learning and R classes at the university level. He has authored four computer industry books on database and data science technology, including his most recent title, “Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R.” Daniel holds a BS in Mathematics and Computer Science from UCLA.

Research Topics

Biomedical Imaging

The current plethora of imaging technologies such as magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), optical coherence tomography (OCT), and ultrasound provide great insight into the different anatomical and functional processes of the human body.

Computer Vision

Computer vision is the science and technology of teaching a computer to interpret images and video as well as a typical human. Technically, computer vision encompasses the fields of image/video processing, pattern recognition, biological vision, artificial intelligence, augmented reality, mathematical modeling, statistics, probability, optimization, 2D sensors, and photography.

Image Segmentation/Classification

Extracting information from a digital image often depends on first identifying desired objects or breaking down the image into homogenous regions (a process called 'segmentation') and then assigning these objects to particular classes (a process called 'classification'). This is a fundamental part of computer vision, combining image processing and pattern recognition techniques.
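The two-step pipeline of segmentation followed by classification can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: segmentation is a fixed brightness threshold plus 4-connected region labeling, and classification is a simple area rule. The threshold, the `min_large_area` parameter, and the class names are hypothetical, and real systems use far richer features.

```python
def segment(image, threshold):
    """Group pixels brighter than `threshold` into 4-connected regions."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or seen[r][c]:
                continue
            # Flood-fill from this seed pixel to collect one region.
            stack, region = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and image[ny][nx] > threshold:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            regions.append(region)
    return regions


def classify(region, min_large_area=4):
    """Assign a class to a region using its pixel count as the only feature."""
    return "large" if len(region) >= min_large_area else "small"


image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 0, 8],
    [0, 0, 0, 0, 0, 0],
]
labels = [classify(region) for region in segment(image, threshold=5)]
print(labels)  # ['large', 'small']
```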

Multiresolution Techniques

The VIP lab has a particularly extensive history with multiresolution methods, and a significant number of research students have explored this theme. Multiresolution methods are very broad, essentially meaning that an image or video is modeled, represented, or has features extracted on more than one scale, somehow allowing both local and non-local phenomena.
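One common way to represent an image on more than one scale is an image pyramid. The sketch below is a minimal illustration, not a description of the lab's methods: each level halves the resolution by averaging non-overlapping 2x2 blocks, and the function names are assumptions.

```python
def downsample(image):
    """Halve the resolution by averaging non-overlapping 2x2 blocks.

    A trailing odd row or column is dropped.
    """
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def pyramid(image, levels):
    """Return up to `levels` images, from full resolution to coarsest."""
    result = [image]
    for _ in range(levels - 1):
        current = result[-1]
        if len(current) < 2 or len(current[0]) < 2:
            break  # cannot shrink any further
        result.append(downsample(current))
    return result


image = [
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [3, 3, 7, 7],
    [3, 3, 7, 7],
]
levels = pyramid(image, levels=3)
print(levels[1])  # [[1.0, 5.0], [3.0, 7.0]]
print(levels[2])  # [[4.0]]
```

Fine levels keep local detail, while coarse levels summarize non-local structure, so an analysis can consult whichever scale suits the phenomenon of interest.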

Remote Sensing

Remote sensing, or the science of capturing data of the earth from airplanes or satellites, enables regular monitoring of land, ocean, and atmosphere expanses, representing data that cannot be captured using any other means. A vast amount of information is generated by remote sensing platforms and there is an obvious need to analyze the data accurately and efficiently.

Scientific Imaging

Scientific Imaging refers to working on two- or three-dimensional imagery taken for a scientific purpose, in most cases acquired either through a microscope or by remote sensing at a distance.

Stochastic Models

In many image processing, computer vision, and pattern recognition applications, there is often a large degree of uncertainty associated with factors such as the appearance of the underlying scene within the acquired data, the location and trajectory of the object of interest, the physical appearance (e.g., size, shape, color, etc.) of the objects being detected, etc.

Video Analysis

Video analysis is a field within computer vision that involves the automatic interpretation of digital video using computer algorithms. Although humans are readily able to interpret digital video, developing algorithms for the computer to perform the same task has been highly elusive and is now an active research field.

Evolutionary Deep Intelligence

Deep learning has shown considerable promise in recent years, producing tremendous results and significantly improving the accuracy of a variety of challenging problems when compared to other machine learning methods.

Discovery Radiomics

Radiomics, which involves the high-throughput extraction and analysis of a large amount of quantitative features from medical imaging data to characterize tumor phenotype in a quantitative manner, is ushering in a new era of imaging-driven quantitative personalized cancer decision support and management. 

Sports Analytics

Sports Analytics is a growing field in computer vision that analyzes visual cues from images to provide statistical data on players, teams, and games. Want to know how a player's technique improves the quality of the team? Can a team, based on its defensive positioning, increase its chances of reaching the finals? These are a few of the many questions that sports analytics can answer.

Computer vision Research Topics Ideas

List of Computer Vision Research Topic Ideas for MS and Ph.D.

1. Deep learning-enabled medical computer vision
2. Deep learning and computer vision will transform entomology
3. Exploring human–nature interactions in national parks with social media photographs and computer vision
4. Assessing the potential for deep learning and computer vision to identify bumble bee species from images
5. Imitation and recognition of facial emotions in autism: a computer vision approach
6. A review on near-duplicate detection of images using computer vision techniques
7. Noncontact cable force estimation with unmanned aerial vehicle and computer vision
8. Computer vision based two-stage waste recognition-retrieval algorithm for waste classification
9. A survey on generative adversarial networks for imbalance problems in computer vision tasks
10. Deception in the eyes of deceiver: A computer vision and machine learning based automated deception detection
11. A computer vision approach based on deep learning for the detection of dairy cows in free stall barn
12. Classification of fermented cocoa beans (cut test) using computer vision
13. Real-time water level monitoring using live cameras and computer vision techniques
14. Aeroelastic Vibration Measurement Based on Laser and Computer Vision Technique
15. Individualized SAR calculations using computer vision-based MR segmentation and a fast electromagnetic solver
16. Crop Nutrition and Computer Vision Technology
17. Advancing Eosinophilic Esophagitis Diagnosis and Phenotype Assessment with Deep Learning Computer Vision
18. Computer Vision-Based Bridge Damage Detection Using Deep Convolutional Networks with Expectation Maximum Attention Module
19. Analysis of ultrasonic vocalizations from mice using computer vision and machine learning
20. Developing a mold-free approach for complex glulam production with the assist of computer vision technologies
21. Decoding depressive disorder using computer vision
22. Assessment of Computer Vision Syndrome and Personal Risk Factors among Employees of Commercial Bank of Ethiopia in Addis Ababa, Ethiopia
23. One Label, One Billion Faces: Usage and Consistency of Racial Categories in Computer Vision
24. A survey of image labelling for computer vision applications
25. Development of Kid Height Measurement Application based on Image using Computer Vision
26. Computer vision AC-STEM automated image analysis for 2D nanopore applications
27. Displacement Identification by Computer Vision for Condition Monitoring of Rail Vehicle Bearings
28. An Open-Source Computer Vision Tool for Automated Vocal Fold Tracking From Videoendoscopy
29. Computer Vision and Human Behaviour, Emotion and Cognition Detection: A Use Case on Student Engagement
30. Computer vision-based tree trunk and branch identification and shaking points detection in Dense-Foliage canopy for automated harvesting of apples
31. Computer Vision–Based Estimation of Flood Depth in Flooded-Vehicle Images
32. An automated light trap to monitor moths (Lepidoptera) using computer vision-based tracking and deep learning
33. The Use of Saliency in Underwater Computer Vision: A Review
34. Computer vision for liquid samples in hospitals and medical labs using hierarchical image segmentation and relations prediction
35. Computer vision syndrome prevalence according to individual and video display terminal exposure characteristics in Spanish university students
36. Computer vision and unsupervised machine learning for pore-scale structural analysis of fractured porous media
37. Research on computer vision enhancement in intelligent robot based on machine learning and deep learning
38. Deformable Scintillation Dosimeter I: Challenges and Implementation using Computer Vision Techniques
39. Use of Computer Vision to Identify the Frequency and Magnitude of Insulin Syringe Preparation Errors
40. Action recognition of dance video learning based on embedded system and computer vision image
41. Frontiers of computer vision technologies on real estate property photographs and floorplans
42. Analysis of UAV-Acquired Wetland Orthomosaics Using GIS, Computer Vision, Computational Topology and Deep Learning
43. Computer vision applied to dual-energy computed tomography images for precise calcinosis cutis quantification in patients with systemic sclerosis
44. Human Motion Gesture Recognition Based on Computer Vision
45. Application of computer vision in fish intelligent feeding system—A review
46. Application of Computer Vision in 3D Film
47. WiCV 2020: The Seventh Women In Computer Vision Workshop
48. Computer vision based obstacle detection and target tracking for autonomous vehicles
49. Evaluating Congou black tea quality using a lab-made computer vision system coupled with morphological features and chemometrics
50. Research on Key Technologies in the Field of Computer Vision Based on Deep Learning
51. Online detection of naturally DON contaminated wheat grains from China using Vis-NIR spectroscopy and computer vision
52. A Computer Vision-Based Occupancy and Equipment Usage Detection Approach for Reducing Building Energy Demand
53. Application of Computer Vision Technology in Agricultural Products and Food Inspection
54. Automatic Evaluation of Wheat Resistance to Fusarium Head Blight Using Dual Mask-RCNN Deep Learning Frameworks in Computer Vision
55. A computer vision algorithm for locating and recognizing traffic signal control light status and countdown time
56. Microplastic abundance quantification via a computer-vision-based chemometrics-assisted approach
57. Computer Vision for Dietary Assessment
58. Determinants of computer vision system’s technology acceptance to improve incoming cargo receiving at Eastern European and Central Asian transportation …
59. Construction of a Somatosensory Interactive System Based on Computer Vision and Augmented Reality Techniques Usi…
60. Estimating California’s Solar and Wind Energy Production using Computer Vision Deep Learning Techniques on Weather Images
61. Leaf disease segmentation and classification of Jatropha Curcas L. and Pongamia Pinnata L. biofuel plants using computer vision based approaches
62. Automated correlation of petrographic images of sandstones to a textural properties database extracted with computer vision techniques
63. Computer Vision-based Intelligent Bookshelf System
64. Computer Vision Techniques for Crowd Density and Motion Direction Analysis
65. Computer Vision System for Landing Platform State Assessment Onboard of Unmanned Aerial Vehicle in Case of Input Visual Information Distortion
66. Research on Bridge Deck Health Assessment System Based on BIM and Computer Vision Technology
67. Computer Vision for Dynamic Student Data Management in Higher Education Platform
68. Tulipp and ClickCV: How the Future Demands of Computer Vision Can Be Met Using FPGAs
69. Stripenn detects architectural stripes from chromatin conformation data using computer vision
70. Having Fun with Computer Vision
71. Application of Computer Vision in Pipeline Inspection Robot
72. Design of Digital Museum Narrative Space Based on Perceptual Experience Data Mining and Computer Vision
73. Study on Pipelined Parallel Processing Architectures for Imaging and Computer Vision
74. Research on fire inspection robot based on computer vision
75. ActiveNet: A computer-vision based approach to determine lethargy
76. Individual Wave Detection and Tracking within a Rotating Detonation Engine through Computer Vision Object Detection applied to High-Speed Images
77. Human Thorax Parametric Reconstruction Using Computer Vision
78. Advancing Eosinophilic Esophagitis Diagnosis and Phenotype Assessment with Deep Learning Computer Vision
79. Lane Detection Using Computer Vision for Self-Driving Cars
80. Automatic Gear Sorting Using Wireless PLC Based on Computer Vision
81. 
Computer Vision-based Marker-less Real Time Motion Analysis for Rehabilitation–An Interdisciplinary Research Project 82. Computer vision in surgery 83. GUIs for Computer Vision 84. Surgical navigation technology based on computer vision and vr towards iot 85. A web-based survey on various symptoms of computer vision syndrome and the genetic understanding based on a multi-trait genome-wide association study 86. Automated Classification and Detection of Malaria Cell Using Computer Vision 87. Computer-Assisted Self-Training for Kyudo Posture Rectification Using Computer Vision Methods 88. Comparison of Computer Vision Techniques for Drowsiness Detection While Driving 89. A Real-Time Computer Vision System for Workers’ PPE and Posture Detection in Actual Construction Site Environment 90. Embedded Computer Vision System Applied to a Four-Legged Line Follower Robot 91. Computer Vision in Industry, Practice in the Czech Republic 92. Computer vision for microscopic skin cancer diagnosis using handcrafted and non-handcrafted features 93. Deep nets: What have they ever done for vision? 94. Analysis of Traditional Computer Vision Techniques Used for Hemp Leaf Water Stress Detection and Classification 95. Embedded Computer Vision System Applied to a Four-Legged Line Follower Robot 96. Deep Learning and Computer Vision Strategies for Automated Gene Editing with a Single-Cell Electroporation Platform 97. A computer vision-based approach for behavior recognition of gestating sows fed different fiber levels during high ambient temperature 98. Swin transformer: Hierarchical vision transformer using shifted windows 99. GEJALA COMPUTER VISION SYNDROME YANG DIALAMI OLEH KARYAWAN BUMN SEKTOR KEUANGAN KOTA TASIKMALAYA 100. Sistem Identifikasi Tingkat Kematangan Buah Nanas Secara Non-Destruktif Berbasis Computer Vision 101. Field-programmable gate arrays in a low power vision system 102. SiT: Self-supervised vIsion Transformer 103. 
SISTEM PENDETEKSI MASKER WAJAH DAN SUHU TUBUH MENGGUNAKAN TEKNIK COMPUTER VISION DAN SENSOR INFRARED NON-CONTACT 104. Do we really need explicit position encodings for vision transformers? 105. Cvt: Introducing convolutions to vision transformers 106. Pyramid vision transformer: A versatile backbone for dense prediction without convolutions 107. Training vision transformers for image retrieval 108. Transformers in Vision: A Survey 109. Uncertainty-assisted deep vision structural health monitoring 110. Twins: Revisiting spatial attention design in vision transformers 111. Crossvit: Cross-attention multi-scale vision transformer for image classification 112. Smart Computer Laboratory: IoT Based Smartphone Application 113. A Dense Tensor Accelerator with Data Exchange Mesh for DNN and Vision Workloads 114. Physics-based vision meets deep learning 115. Neural vision-based semantic 3D world modeling 116. Scaling up visual and vision-language representation learning with noisy text supervision 117. Deep Learning–Based Scene Simplification for Bionic Vision 118. Techniques To Improve Machine Vision In Robots 119. Future Vision Exhibition: Artificial Landscapes 120. Enabling energy efficient machine learning on a Ultra-Low-Power vision sensor for IoT 121. Tokens-to-token vit: Training vision transformers from scratch on imagenet 122. Vision-based Sensors for Production Control 123. Machine Learning and Deep Learning Applications-A Vision 124. The quiet revolution in machine vision-a state-of-the-art survey paper, including historical review, perspectives, and future directions 125. Synthesizing Pose Sequences from 3D Assets for Vision-Based Activity Analysis 126. Applying Mobile Intelligent API Vision Kit and Normalized Features for Face Recognition Using Live Cameras 127. A New Approach for Fire Pixel Detection in Building Environment Using Vision Sensor 128. A Vision-Based Parameter Estimation for an Aircraft in Approach Phase 129. 
ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision 130. Analysis of Target Detection and Tracking for Intelligent Vision System 131. Test Automation with Grad-CAM Heatmaps-A Future Pipe Segment in MLOps for Vision AI? 132. Investigating the Vision Transformer Model for Image Retrieval Tasks 133. Vision-based adjusting of a digital model to real-world conditions for wire insertion tasks 134. Combining brief and ad for edge-preserved dense stereo matching 135. Elf: accelerate high-resolution mobile deep vision with content-aware parallel offloading 136. Shallow Convolution Neural Network for an Industrial Robot Real Time Vision System 137. Research Status of Gesture Recognition Based on Vision: A Review 138. Multi-scale vision longformer: A new vision transformer for high-resolution image encoding 139. Road Peculiarities Detection using Deep Learning for Vehicle Vision System 140. Reinforcement learning applied to machine vision: state of the art 141. Vision based inspection system for leather surface defect detection using fast convergence particle swarm optimization ensemble classifier approach 142. Evaluation of visual complications among professional computer users 143. Brain Tumor Segmentation: A Comparative Analysis 144. Detection of Atlantic salmon bone residues using machine vision technology 145. Understanding Perceptual Bias in Machine Vision Systems 146. The MVTec anomaly detection dataset: a comprehensive real-world dataset for unsupervised anomaly detection 147. Scene text detection and recognition: The deep learning era 148. Vision-based continuous sign language recognition using multimodal sensor fusion 149. Egocentric Vision for Dog Behavioral Analysis 150. Vision-based Docking of a Mobile Robot 151. VLGrammar: Grounded Grammar Induction of Vision and Language 152. Mask-aware photorealistic facial attribute manipulation 153. 
Real-time plant phenomics under robotic farming setup: A vision-based platform for complex plant phenotyping tasks 154. VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers 155. Deep Vision Based Surveillance System to Prevent Train-Elephant Collisions 156. Boosting High-Level Vision with Joint Compression Artifacts Reduction and Super-Resolution 157. Autonomous, onboard vision-based trash and litter detection in low altitude aerial images collected by an unmanned aerial vehicle 158. Research on the algorithm of painting image style feature extraction based on intelligent vision 159. Fast semantic segmentation method for machine vision inspection based on a fewer-parameters atrous convolution neural network 160. Investigating Bi-Level Optimization for Learning and Vision from a Unified Perspective: A Survey and Beyond 161. Vision-Based Full-Field Sensing for Condition Assessment of Structural Systems 162. New method of traffic flow forecasting based on quantum particle swarm optimization strategy for intelligent transportation system 163. Same-different conceptualization: a machine vision perspective 164. Toward high-quality magnetic data survey using UAV: development of a magnetic-isolated vision-based positioning system 165. Urban landscape ecological design and stereo vision based on 3D mesh simplification algorithm and artificial intelligence 166. Transformer in transformer 167. A Robotic grinding station based on an industrial manipulator and vision system 168. Applications of Internet of Things (IoT) in Agriculture-The Potential and Challenges in Smart Farm in Uganda 169. Vision-Based Patient Monitoring and Management in Mental Health Settings 170. Mitigating Demographic Bias in Facial Datasets with Style-Based Multi-attribute Transfer 171. Dynamic tree branch tracking for aerial canopy sampling using stereo vision 172. Smart Office Model Based on Internet of Things 173. 
How to construct low-altitude aerial image datasets for deep learning [J] 174. Pretraining boosts out-of-domain robustness for pose estimation 175. A benchmark and evaluation of non-rigid structure from motion 176. The devil is in the boundary: Exploiting boundary representation for basis-based instance segmentation 177. Vision based collision detection for a safe collaborative industrial manipulator 178. Eden: Multimodal synthetic dataset of enclosed garden scenes 179. Pixel-wise crowd understanding via synthetic data 180. Vision-Based Framework for Automatic Progress Monitoring of Precast Walls by Using Surveillance Videos during the Construction Phase 181. LSPnet: A 2D Localization-oriented Spacecraft Pose Estimation Neural Network 182. Transfer of Learning from Vision to Touch: A Hybrid Deep Convolutional Neural Network for Visuo-Tactile 3D Object Recognition 183. UAV Use Case: Real-Time Obstacle Avoidance System for Unmanned Aerial Vehicles Based on Stereo Vision 184. Improving grain size analysis using computer vision techniques and implications for grain growth kinetics 185. Comparison of full-reference image quality models for optimization of image processing systems 186. High Precision Medicine Bottles Vision Online Inspection System and Classification Based on Multi-Features and Ensemble Learning via Independence Test 187. Efficient attention: Attention with linear complexities 188. Fusion Learning Using Semantics and Graph Convolutional Network for Visual Food Recognition 189. The ikea asm dataset: Understanding people assembling furniture through actions, objects and pose 190. DualSR: Zero-Shot Dual Learning for Real-World Super-Resolution 191. Modelling and Analysis of Facial Expressions Using Optical Flow Derived Divergence and Curl Templates 192. Barlow twins: Self-supervised learning via redundancy reduction 193. Facial expression recognition in the wild via deep attentive center loss 194. 
Facial Beauty Prediction and Analysis Based on Deep Convolutional Neural Network: A Review 195. Rodnet: Radar object detection using cross-modal supervision 196. Using open-source computer vision software for identification and tracking of convective storms 197. Improving Robustness and Uncertainty Modelling in Neural Ordinary Differential Equations 198. Vision and Inertial Sensor Fusion for Terrain Relative Navigation 199. Domain-Aware Unsupervised Hyperspectral Reconstruction for Aerial Image Dehazing 200. A method for classifying citrus surface defects based on machine vision 201. Transgan: Two transformers can make one strong gan 202. Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection 203. Information Systems Integration to Enhance Operational Customer Relationship Management in the Pharmaceutical Industry 204. Attentional feature fusion 205. The Isowarp: The Template-Based Visual Geometry of Isometric Surfaces 206. Vision-Based Diagnosis and Location of Insulator Self-Explosion Defects 207. Activity Recognition with Moving Cameras and Few Training Examples: Applications for Detection of Autism-Related Headbanging 208. Development and Validation of an Unsupervised Feature Learning System for Leukocyte Characterization and Classification: A Multi-Hospital Study 209. DualSANet: Dual Spatial Attention Network for Iris Recognition 210. Long-Range Attention Network for Multi-View Stereo 211. Towards visually explaining video understanding networks with perturbation 212. MinkLoc3D: Point Cloud Based Large-Scale Place Recognition 213. p-RT: A Runtime Framework to Enable Energy-Efficient Real-Time Robotic Vision Applications on Heterogeneous Architectures 214. JOLO-GCN: Mining Joint-Centered Light-Weight Information for Skeleton-Based Action Recognition 215. Binarized neural architecture search for efficient object recognition 216. Learning transferable visual models from natural language supervision 217. 
Improved techniques for training single-image gans 218. Machine Vision 219. SSDMNV2: A real time DNN-based face mask detection system using single shot multibox detector and MobileNetV2 220. Pervasive label errors in test sets destabilize machine learning benchmarks 221. MPRNet: Multi-Path Residual Network for Lightweight Image Super Resolution 222. Fuzzy-aided solution for out-of-view challenge in visual tracking under IoT-assisted complex environment 223. Deep Learning in X-ray Testing 224. Tresnet: High performance gpu-dedicated architecture 225. ATM: Attentional Text Matting 226. Classmix: Segmentation-based data augmentation for semi-supervised learning 227. Videossl: Semi-supervised learning for video classification 228. Stratified rule-aware network for abstract visual reasoning 229. A hierarchical privacy-preserving IoT architecture for vision-based hand rehabilitation assessment 230. Adversarial reinforcement learning for unsupervised domain adaptation 231. Visual question answering model based on graph neural network and contextual attention 232. Towards Balanced Learning for Instance Recognition 233. The automatic detection of pedestrians under the high-density conditions by deep learning techniques 234. RGB-D salient object detection: A survey 235. A deep active learning system for species identification and counting in camera trap images 236. Domain Impression: A Source Data Free Domain Adaptation Method 237. Robust feature learning for adversarial defense via hierarchical feature alignment 238. HADEM-MACS: a hybrid approach for detection and extraction of objects in movement by multimedia autonomous computer systems 239. Vision-Based Method Integrating Deep Learning Detection for Tracking Multiple Construction Machines 240. Bottleneck transformers for visual recognition 241. Single Image Human Proxemics Estimation for Visual Social Distancing 242. SoFA: Source-Data-Free Feature Alignment for Unsupervised Domain Adaptation 243. 
Contrastive learning of general-purpose audio representations 244. Knowledge distillation: A survey 245. Let’s Get Dirty: GAN Based Data Augmentation for Camera Lens Soiling Detection in Autonomous Driving 246. Video Semantic Analysis: The Sparsity Based Locality-Sensitive Discriminative Dictionary Learning Factor 247. Vision-Based Tactile Sensor Mechanism for the Estimation of Contact Position and Force Distribution Using Deep Learning 248. A Discriminative Model for Multiple People Detection 249. Generative adversarial networks and their application to 3D face generation: A survey 250. Semantic hierarchy emerges in deep generative representations for scene synthesis 251. Fake face detection via adaptive manipulation traces extraction network 252. Accuracy of smartphone video for contactless measurement of hand tremor frequency 253. Video Captioning of Future Frames 254. Disentangled Contour Learning for Quadrilateral Text Detection 255. Visual Structure Constraint for Transductive Zero-Shot Learning in the Wild 256. CNN features with bi-directional LSTM for real-time anomaly detection in surveillance networks 257. CityFlow-NL: Tracking and Retrieval of Vehicles at City Scaleby Natural Language Descriptions 258. 3D Head Pose Estimation through Facial Features and Deep Convolutional Neural Networks 259. Rescuenet: Joint building segmentation and damage assessment from satellite imagery 260. Alleviating over-segmentation errors by detecting action boundaries 261. Multiresolution Adaptive Threshold Based Segmentation of Real-Time Vision-Based Database for Human Motion Estimation 262. Simplifying dependent reductions in the polyhedral model 263. Multi-camera traffic scene mosaic based on camera calibration 264. Novel View Synthesis via Depth-guided Skip Connections 265. Self-Supervised Pretraining Improves Self-Supervised Pretraining 266. Weakly Supervised Multi-Object Tracking and Segmentation 267. Proposal learning for semi-supervised object detection 268. 
You only look yourself: Unsupervised and untrained single image dehazing neural network 269. Algorithm for epipolar geometry and correcting monocular stereo vision based on a plane mirror 270. Mutual Information Maximization on Disentangled Representations for Differential Morph Detection 271. Route planning methods in indoor navigation tools for vision impaired persons: a systematic review 272. Subject Guided Eye Image Synthesis with Application to Gaze Redirection 273. Roles of artificial intelligence in construction engineering and management: A critical review and future trends 274. PI-Net: Pose Interacting Network for Multi-Person Monocular 3D Pose Estimation 275. Swag: Superpixels weighted by average gradients for explanations of cnns 276. Impact diagnosis in stiffened structural panels using a deep learning approach 277. X-ray Testing 278. Computer Assisted Classification Framework for Detection of Acute Myeloid Leukemia in Peripheral Blood Smear Images 279. IGSSTRCF: Importance Guided Sparse Spatio-Temporal Regularized Correlation Filters for Tracking 280. A Unified Learning Approach for Hand Gesture Recognition and Fingertip Detection 281. EventAnchor: Reducing Human Interactions in Event Annotation of Racket Sports Videos 282. An Evolution of CNN Object Classifiers on Low-Resolution Images 283. A vector-based representation to enhance head pose estimation 284. Automatic Defect Detection of Print Fabric Using Convolutional Neural Network 285. Multi-Scale Voxel Class Balanced ASPP for LIDAR Pointcloud Semantic Segmentation 286. Non-Destructive Quality Inspection of Potato Tubers Using Automated Vision System 287. Layering Defects Detection in Laser Powder Bed Fusion using Embedded Vision System 288. Label-Free Robustness Estimation of Object Detection CNNs for Autonomous Driving Applications 289. Convolutional Neural Networks and Transfer Learning for Quality Inspection of Different Sugarcane Varieties 290. 
Vision-Based Suture Tensile Force Estimation in Robotic Surgery 291. FEANet: Foreground-edge-aware network with DenseASPOC for human parsing 292. Application of a convolutional neural network for detection of ignition sources and smoke 293. Image matching across wide baselines: From paper to practice 294. Towards Annotation-free Instance Segmentation and Tracking with Adversarial Simulations 295. Self-Supervised Learning for Domain Adaptation on Point Clouds 296. Ontology-driven event type classification in images 297. Apple Ripeness Identification Using Deep Learning 298. Deep sparse transfer learning for remote smart tongue diagnosis [J] 299. CONVERSATION OF REAL IMAGES INTO CARTOONIZE IMAGE FORMAT USING GENERATIVE ADVERSARIAL NETWORK 300. RICORD: A Precedent for Open AI in COVID-19 Image Analytics 301. CapGen: A Neural Image Caption Generator with Speech Synthesis 302. Computer-Aided Diagnosis of Alzheimer’s Disease through Weak Supervision Deep Learning Framework with Attention Mechanism 303. A novel and intelligent vision-based tutor for Yogasana: e-YogaGuru 304. Plant Trait Estimation and Classification Studies in Plant Phenotyping Using Machine Vision-A Review 305. Iranis: A Large-scale Dataset of Farsi License Plate Characters 306. A numerical framework for elastic surface matching, comparison, and interpolation 307. Sign language recognition from digital videos using deep learning methods 308. Enhanced Information Fusion Network for Crowd Counting 309. This Face Does Not Exist… But It Might Be Yours! Identity Leakage in Generative Models 310. Defense-friendly Images in Adversarial Attacks: Dataset and Metrics for Perturbation Difficulty 311. Guided attentive feature fusion for multispectral pedestrian detection 312. Vision-Based Guidance for Tracking Dynamic Objects 313. Single-shot fringe projection profilometry based on Deep Learning and Computer Graphics 314. Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding 315. 
Adversarial deepfakes: Evaluating vulnerability of deepfake detectors to adversarial examples 316. Vision-Aided 6G Wireless Communications: Blockage Prediction and Proactive Handoff 317. FACEGAN: Facial Attribute Controllable rEenactment GAN 318. Benchmarking the robustness of semantic segmentation models with respect to common corruptions 319. Vision-based egg quality prediction in Pacific bluefin tuna (Thunnus orientalis) by deep neural network 320. Diagnosing colorectal abnormalities using scattering coefficient maps acquired from optical coherence tomography 321. Recovering Trajectories of Unmarked Joints in 3D Human Actions Using Latent Space Optimization 322. Only time can tell: Discovering temporal data for temporal modeling 323. Deep Learning applications for COVID-19 324. PointCutMix: Regularization Strategy for Point Cloud Classification 325. A Survey on the Usage of Pattern Recognition and Image Analysis Methods for the Lifestyle Improvement on Low Vision and Visually Impaired People 326. Mechanical System Control by RGB-D Device 327. DACS: Domain Adaptation via Cross-domain Mixed Sampling 328. Vision-Based Vibration Monitoring of Structures and Infrastructures: An Overview of Recent Applications 329. CycleSegNet: Object Co-segmentation with Cycle Refinement and Region Correspondence 330. Explainable Fingerprint ROI Segmentation Using Monte Carlo Dropout 331. Self-supervised pretraining of visual features in the wild 332. Optimized Z-Buffer Using Divide and Conquer 333. Efficient and robust unsupervised inverse intensity compensation for stereo image registration under radiometric changes 334. Zero-shot text-to-image generation 335. PoseRBPF: A Rao–Blackwellized Particle Filter for 6-D Object Pose Tracking 336. MobiSamadhaan—Intelligent Vision-Based Smart City Solution 337. A Training Method for Low Rank Convolutional Neural Networks Based on Alternating Tensor Compose-Decompose Method 338. 
Individual Sick Fir Tree (Abies mariesii) Identification in Insect Infested Forests by Means of UAV Images and Deep Learning 339. Fighting against COVID-19: A novel deep learning model based on YOLO-v2 with ResNet-50 for medical face mask detection 340. 3D Image Conversion of a Scene from Multiple 2D Images with Background Depth Profile 341. Falsification of a Vision-based Automatic Landing System 342. Generating Physically Sound Training Data for Image Recognition of Additively Manufactured Parts 343. Shelf Auditing Based on Image Classification Using Semi-Supervised Deep Learning to Increase On-Shelf Availability in Grocery Stores 344. Continuous 3D Multi-Channel Sign Language Production via Progressive Transformers and Mixture Density Networks 345. Research on Detection Method of Sheet Surface Defects Based on Machine Vision 346. Low-Resolution LiDAR Upsampling Using Weighted Median Filter 347. RGB-D Human Action Recognition of Deep Feature Enhancement and Fusion Using Two-Stream ConvNet 348. Breaking Shortcuts by Masking for Robust Visual Reasoning 349. Correlation filter tracking based on superpixel and multifeature fusion 350. Segmentation of body parts of cows in RGB-depth images based on template matching 351. Electric Scooter and Its Rider Detection Framework Based on Deep Learning for Supporting Scooter-Related Injury Emergency Services 352. A Model of Diameter Measurement Based on the Machine Vision 353. Transreid: Transformer-based object re-identification 354. LightLayers: Parameter Efficient Dense and Convolutional Layers for Image Classification 355. Residual Dual Scale Scene Text Spotting by Fusing Bottom-Up and Top-Down Processing 356. Pedestrian Detection on Multispectral Images in Different Lighting Conditions 357. CovidSens: a vision on reliable social sensing for COVID-19 358. An Improved Approach for Face Detection 359. Classification and Measuring Accuracy of Lenses Using Inception Model V3 360. 
Nighttime image dehazing based on Retinex and dark channel prior using Taylor series expansion 361. Multiple Object Tracking Using Convolutional Neural Network on Aerial Imagery Sequences 362. Piano Skills Assessment 363. Automatic recognition of surface cracks in bridges based on 2D-APES and mobile machine vision 364. Real-time Navigation for Drogue-Type Autonomous Aerial Refueling Using Vision-Based Deep Learning Detection 365. Robust 3D Reconstruction Through Noise Reduction of Ultra-Fast Images 366. Survey of Occluded and Unoccluded Face Recognition 367. Cell tracking in time-lapse microscopy image sequences 368. Depth Estimation Using Blob Detection for Stereo Vision Images 369. Fast human activity recognition 370. Single Shot Multitask Pedestrian Detection and Behavior Prediction 371. The Vision of Digital Surgery 372. Modeling of Potato Slice Drying Process in a Microwave Dryer using Artificial Neural Network and Machine Vision 373. VinVL: Making Visual Representations Matter in Vision-Language Models 374. Quality safety monitoring of LED chips using deep learning-based vision inspection methods 375. Machine Vision Based Phenotype Recognition of Plant and Animal 376. Towards manufacturing robotics accuracy degradation assessment: A vision-based data-driven implementation 377. Attention guided low-light image enhancement with a large scale low-light simulation dataset 378. Human action identification by a quality-guided fusion of multi-model feature 379. Optimal quantization using scaled codebook 380. A Robust Illumination-Invariant Camera System for Agricultural Applications 381. Real-Time Gait-Based Age Estimation and Gender Classification From a Single Image 382. Counting and Tracking of Vehicles and Pedestrians in Real Time Using You Only Look Once V3 383. Style Normalization and Restitution for DomainGeneralization and Adaptation 384. 
A Machine Vision-Based Method Optimized for Restoring Broiler Chicken Images Occluded by Feeding and Drinking Equipment 385. Face Recognition for Surveillance Systems using SRGAN 386. FACIAL RECOGNITION AND ATTENDANCE SYSTEM USING DLIB AND FACE RECOGNITION LIBRARIES 387. Deep Preset: Blending and Retouching Photos with Color Style Transfer 388. Transunet: Transformers make strong encoders for medical image segmentation 389. List-wise learning-to-rank with convolutional neural networks for person re-identification 390. Intra-Camera Supervised Person Re-Identification 391. Generating Masks from Boxes by Mining Spatio-Temporal Consistencies in Videos 392. A Deep Convolutional Encoder-Decoder Architecture Approach for Sheep Weight Estimation 393. Automatic Borescope Damage Assessments for Gas Turbine Blades via Deep Learning 394. Foreground-aware Semantic Representations for Image Harmonization 395. A Learning-Based Approach to Parametric Rotoscoping of Multi-Shape Systems 396. Energy-efficient cluster-based unmanned aerial vehicle networks with deep learning-based scene classification model 397. Mobile-Aware Deep Learning Algorithms for Malaria Parasites and White Blood Cells Localization in Thick Blood Smears 398. SChISM: Semantic Clustering via Image Sequence Merging for Images of Human-Decomposition 399. Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models 400. Locate Globally, Segment Locally: A Progressive Architecture With Knowledge Review Network for Salient Object Detection 401. Using Feature Selection Based on Multi-view for Rice Seed Images Classification 402. Road images augmentation with synthetic traffic signs using neural networks 403. Java Tools For Image Understanding The Java Imaging and Vision Environment (JIVE) 404. Improved ECO Algorithm Based on Residual Neural Network 405. Viewpoint and Scale Consistency Reinforcement for UAV Vehicle Re-Identification 406. 
Automated Surveillance Model for Video-Based Anomalous Activity Detection Using Deep Learning Architecture 407. Real-Time, YOLO-Based Intelligent Surveillance and Monitoring System Using Jetson TX2 408. Countering Inconsistent Labelling by Google’s Vision API for Rotated Images 409. Towards Accurate Camouflaged Object Detection with Mixture Convolution and Interactive Fusion 410. Effectiveness of arbitrary transfer sets for data-free knowledge distillation 411. Accelerated High-Level Synthesis Feature Detection for FPGAs Using HiFlipVX 412. Supervised deep learning of elastic SRV distances on the shape space of curves 413. Evaluating GAN-Based Image Augmentation for Threat Detection in Large-Scale Xray Security Images 414. Where to Start Your Deep Learning 415. Real-Time Detection and Spatial Localization of Insulators for UAV Inspection Based on Binocular Stereo Vision 416. Excitation dropout: Encouraging plasticity in deep neural networks 417. The Edge Computing Cloud Architecture Based on 5G Network for Industrial Vision Detection 418. Estimating Galactic Distances From Images Using Self-supervised Representation Learning 419. Real-Time Hair Segmentation Using Mobile-Unet 420. Driving among Flatmobiles: Bird-Eye-View occupancy grids from a monocular camera for holistic trajectory planning 421. Attention-based context aggregation network for monocular depth estimation 422. A Large-Scale, Time-Synchronized Visible and Thermal Face Dataset 423. Identification of Suitable Contrast Enhancement Technique for Improving the Quality of Astrocytoma Histopathological Images. 424. Run-Time Monitoring of Machine Learning for Robotic Perception: A Survey of Emerging Trends 425. Prevalence and risk factor assessment of digital eye strain among children using online e-learning during the COVID-19 pandemic: Digital eye strain among … 426. Pedestrian Detection: Unification of Global and Local Features 427. 
A comprehensive analysis of weakly-supervised semantic segmentation in different image domains 428. Deep learning assisted vision inspection of resistance spot welds 429. Identifying centres of interest in paintings using alignment and edge detection: Case studies on works by Luc Tuymans 430. Going deeper with image transformers 431. Spike-thrift: Towards energy-efficient deep spiking neural networks by limiting spiking activity via attention-guided compression 432. Adversarial feature distribution alignment for semi-supervised learning 433. G2d: Generate to detect anomaly 434. Anomaly Detection in Crowded Scenes Using Motion Influence Map and Convolutional Autoencoder 435. A study on attention-based LSTM for abnormal behavior recognition with variable pooling 436. Deep convolutional neural network based autonomous drone navigation 437. A survey on contrastive self-supervised learning 438. Independent Learning of Motion Parameters for Deep Visual Odometry 439. Image segmentation using deep learning: A survey 440. Using Interaction Protocols to Control Vision Systems 441. TGCN: Time Domain Graph Convolutional Network for Multiple Objects Tracking 442. A Survey on Crowd Counting Methods and Datasets 443. RGSR: A two-step lossy JPG image super-resolution based on noise reduction 444. Image processing effects on the deep face recognition system [J] 445. A weakly supervised consistency-based learning method for covid-19 segmentation in ct images 446. Improving Few-Shot Learning using Composite Rotation based Auxiliary Task 447. Investigating large-scale graphs for community detection 448. Estimating Galactic Distances From Images Using Self-supervised Representation Learning 449. Pig Breed Detection Using Faster R-CNN 450. A Vision-Based System For Non-Intrusive Posture Correction Notifications 451. Inception recurrent convolutional neural network for object recognition 452. A Meta-Q-Learning Approach to Discriminative Correlation Filter based Visual Tracking 453. 
SWD: Low-Compute Real-Time Object Detection Architecture 454. An empirical study of multi-scale object detection in high resolution UAV images 455. Vision Guided Robots. Calibration and Motion Correction 456. Deep-emotion: Facial expression recognition using attentional convolutional network 457. Single-Object Tracking Algorithm Based on Two-Step Spatiotemporal Deep Feature Fusion in a Complex Surveillance Scenario 458. A Focus-Measurement Based 3D Surface Reconstruction System for Dimensional Metrology 459. Image Pre-Processing Method of Machine Learning for Edge Detection with Image Signal Processor Enhancement 460. Distress Recognition in Unpaved Roads Using Unmanned Aerial Systems and Deep Learning Segmentation 461. Noise density range sensitive mean-median filter for impulse noise removal 462. A Novel Fusion of Deep Learning and Android Application for Real-Time Mango Fruits Disease Detection 463. Deep Gaussian Denoiser Epistemic Uncertainty and Decoupled Dual-Attention Fusion 464. Patient Emotion Recognition in Human Computer Interaction System Based on Machine Learning Method and Interactive Design Theory 465. Real-Time Hair Segmentation Using Mobile-Unet. Electronics 2021, 10, 99 466. Diagonal-kernel convolutional neural networks for image classification 467. Dense-Resolution Network for Point Cloud Classification and Segmentation 468. U2-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation 469. Anonymous Person Tracking Across Multiple Camera Using Color Histogram and Body Pose Estimation 470. On-line three-dimensional coordinate measurement of dynamic binocular stereo vision based on rotating camera in large FOV 471. Image Compression and Reconstruction Using Encoder–Decoder Convolutional Neural Network 472. Extracting Effective Image Attributes with Refined Universal Detection 473. Self-supervised training for blind multi-frame video denoising 474. 
On the Tightness of Semidefinite Relaxations for Rotation Estimation 475. Poly Scale Space Technique for Feature Extraction in Lip Reading: A New Strategy 476. Influence of phosphate concentration on amine, amide, and hydroxyl CEST contrast 477. Controlling biases and diversity in diverse image-to-image translation 478. A New Feature Fusion Network for Student Behavior Recognition in Education 479. Localizing License Plates in Real Time with RetinaNet Object Detector 480. Application of a Vision-Based Single Target on Robot Positioning System 481. A predictive machine learning application in agriculture: Cassava disease detection and classification with imbalanced dataset using convolutional neural … 482. Threat Detection in Social Media Images Using the Inception-v3 Model 483. Augmenting Crop Detection for Precision Agriculture with Deep Visual Transfer Learning—A Case Study of Bale Detection 484. Evolving Smooth Manifolds of Arbitrary Codimen-sion in Rn 485. A Unified Learning Approach for Hand Gesture Recognition and Fingertip Detection 486. CNN-Based RGB-D Salient Object Detection: Learn, Select, and Fuse 487. A multi-platform comparison of local feature description methods 488. Smart Vehicle Tracker for Parking System 489. Knowledge distillation for incremental learning in semantic segmentation 490. MSLp: Deep Superresolution for Meteorological Satellite Image 491. Improving Object Detection Quality by Incorporating Global Contexts via Self-Attention 492. Self-Supervised Pretraining of 3D Features on any Point-Cloud 493. Beyond covariance: Sice and kernel based visual feature representation 494. Attention-based VGG-16 model for COVID-19 chest X-ray image classification 495. High-performance large-scale image recognition without normalization 496. Learning data augmentation with online bilevel optimization for image classification 497. DEEP LEARNING IN LANDCOVER CLASSIFICATION 498. 
Cobot User Frame Calibration: Evaluation and Comparison between Positioning Repeatability Performances Achieved by Traditional and Vision-Based Methods 499. A Survey of Image Enhancement and Object Detection Methods 500. Multi-object Tracking with a Hierarchical Single-branch Network 501. A Hybrid Approach Based on Lp1 Norm-Based Filters and Normalized Cut Segmentation for Salient Object Detection 502. Skin Lesion Classification Using Deep Learning 503. S-VVAD: Visual Voice Activity Detection by Motion Segmentation 504. Spatio-temporal attention on manifold space for 3D human action recognition 505. Mixup Without Hesitation 506. Deep learning for real-time semantic segmentation: Application in ultrasound imaging 507. Self-supervised monocular depth estimation with direct methods 508. Small object detection using context and attention 509. Resolution invariant person reid based on feature transformation and self-weighted attention 510. Deep learning enables accurate diagnosis of novel coronavirus (COVID-19) with CT images 511. Steel bridge corrosion inspection with combined vision and thermographic images 512. Instance Segmentation for Direct Measurements of Satellites in Metal Powders and Automated Microstructural Characterization from Image Data 513. Minimal solution for estimating fundamental matrix under planar motion 514. Component-level Script Classification Benchmark with CNN on AUTNT Dataset 515. Jaa-net: Joint facial action unit detection and face alignment via adaptive attention 516. Machine Learning Techniques for Predicting Crop Production in India 517. Introduction to Natural Language Processing 518. Structured Scene Memory for Vision-Language Navigation 519. Comparison-Based Study to Predict Breast Cancer: A Survey 520. Aadhaar-Based Authentication and Authorization Scheme for Remote Healthcare Monitoring 521. Nonlinear Approximation and (Deep) ReLU Networks 522. 
A Workflow Allocation Strategy Under Precedence Constraints for IaaS Cloud Environment 523. Person Identification Using Histogram of Gradient and Support Vector Machine on GEI 524. Adaptive streaming of 360-degree videos with reinforcement learning 525. Video Tagging and Recommender System Using Deep Learning 526. Scale variance minimization for unsupervised domain adaptation in image segmentation 527. Convolutional Elman Jordan Neural Network for Reconstruction and Classification Using Attention Window 528. Channel Capacity in Psychovisual Deep-Nets: Gaussianization Versus Kozachenko-Leonenko 529. Assessing the Viability of Visual Vibrometry for Use in Structural Engineering 530. A Deep Learning-Based Hotel Image Classifier for Online Travel Agencies 531. UVCE-IIITT@ DravidianLangTech-EACL2021: Tamil Troll Meme Classification: You need to Pay more Attention 532. Through-Wall Human Pose Reconstruction via UWB MIMO Radar and 3D CNN 533. Novel Assessments of Technical and Nontechnical Cardiac Surgery Quality: Protocol for a Mixed Methods Study 534. Biologically inspired visual computing: the state of the art 535. A Robust Surf-Based Online Human Tracking Algorithm Using Adaptive Object Model 536. Attention based pruning for shift networks 537. Rectifying pseudo label learning via uncertainty estimation for domain adaptive semantic segmentation 538. Hexacopter design for carrying payload for warehouse applications 539. Implementation of Recommender System Using Neural Networks and Deep Learning 540. A multi-scale and multi-level feature aggregation network for crowd counting 541. Foodborne Disease Outbreak Prediction Using Deep Learning 542. LRA-Net: local region attention network for 3D point cloud completion 543. Partial Domain Adaptation Using Selective Representation Learning For Class-Weight Computation 544. Estimating Body Pose from A Single Image 545. Hybrid Feature Selection Method for Predicting the Kidney Disease Membranous Nephropathy 546. 
Prognosis of Breast Cancer by Implementing Machine Learning Algorithms Using Modified Bootstrap Aggregating 547. Robust local binary descriptor in rotation change using polar location 548. A Multimodal Biometric System Based on Finger Knuckle Print, Fingerprint, and Palmprint Traits 549. Efficientps: Efficient panoptic segmentation 550. ShadingNet: image intrinsics by fine-grained shading decomposition 551. Perception of Plant Diseases in Color Images Through Adaboost 552. Data Mining in Cloud Computing: Survey 553. Artificial Neural Network Analysis for Predicting Spatial Patterns of Urbanization in India 554. Navier–Stokes-Based Image Inpainting for Restoration of Missing Data Due to Clouds 555. An Attribute-Based Break-Glass Access Control Framework for Medical Emergencies 556. Detection of Life Threatening ECG Arrhythmias Using Morphological Patterns and Wavelet Transform Method 557. Improvement of Identity Recognition with Occlusion Detection-Based Feature Selection 558. Generative Video Compression as Hierarchical Variational Inference 559. SLiKER: Sparse loss induced kernel ensemble regression 560. Is Space-Time Attention All You Need for Video Understanding? 561. Cityguide: A seamless indoor-outdoor wayfinding system for people with vision impairments 562. An Efficient Multimodal Biometric System Integrated with Liveness Detection Technique 563. Microscopic brain tumor detection and classification using 3D CNN and feature selection architecture 564. A Survey of Brain-Inspired Intelligent Robots: Integration of Vision, Decision, Motion Control, and Musculoskeletal Systems 565. Scale selection 566. Automated detection and classification of spilled loads on freeways based on improved YOLO network 567. Information Granules and Granular Computing 568. DAN : Breast Cancer Classification from High-Resolution Histology Images Using Deep Attention Network 569. Image Segmentation of MR Images with Multi-directional Region Growing Algorithm 570. 
Satellite Radar Interferometry for DEM Generation Using Sentinel-1A Imagery 571. On the Effect of Training Convolution Neural Network for Millimeter-Wave Radar-Based Hand Gesture Recognition 572. Synthetic Vision for Virtual Character Guidance 573. Evaluation of AOMDV Routing Protocol for Optimum Transmitted Power in a Designed Ad-hoc Wireless Sensor Network 574. Evaluation metrics for conditional image generation 575. Image Inpainting Using Double Discriminator Generative Adversarial Networks 576. Crowd counting method based on the self-attention residual network 577. Cyber Espionage—An Ethical Analysis 578. Analysis of Gait and Face Biometric Traits from CCTV Streams for Forensics 579. A facial expression recognition model using hybrid feature selection and support vector machines 580. Predicting the Big-Five Personality Traits from Handwriting 581. Image engineering 582. Learning efficient text-to-image synthesis via interstage cross-sample similarity distillation 583. Faster and Secured Web Services Communication Using Modified IDEA and Custom-Level Security 584. Improved Image Deblurring Using GANs 585. Hand-Based Person Identification using Global and Part-Aware Deep Feature Representation Learning 586. Identification for Recycling Polyethylene Terephthalate (PET) Plastic Bottles by Polarization Vision 587. Holistic filter pruning for efficient deep neural networks 588. Vision-based vehicle speed estimation: A survey 589. Learning Temporal Dynamics from Cycles in Narrated Video 590. Analysis of MRI Image Compression Using Compressive Sensing 591. Analysis of PQ Disturbances in Renewable Grid Integration System Using Non-parametric Spectral Estimation Approach 592. A Porcine Abdomen Cutting Robot System Using Binocular Vision Techniques Based on Kernel Principal Component Analysis 593. Arc Length method for extracting crack pattern characteristics 594. AF-EMS Detector: Improve the Multi-Scale Detection Performance of the Anchor-Free Detector 595. 
A Transfer Learning Approach for Drowsiness Detection from EEG Signals 596. Deep audio-visual learning: A survey 597. MAFNet: Multi-style attention fusion network for salient object detection 598. HAVANA: Hierarchical and Variation-Normalized Autoencoder for Person Re-identification 599. An Empirical Comparison of Generative Adversarial Network (GAN) Measures 600. An enhanced 3DCNN-ConvLSTM for spatiotemporal multimedia data analysis


Dissertations / Theses on the topic 'Machine vision systems'

Consult the top 50 dissertations / theses for your research on the topic 'Machine vision systems.'

Bowman, C. C. "High speed image processing for machine vision." Thesis, Cardiff University, 1986. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.383161.

Caleb-Solly, Praminda. "An interactive evolutionary approach for configuring machine vision systems." Thesis, University of the West of England, Bristol, 2006. http://eprints.uwe.ac.uk/19195/.

Arshad, Norhashim Mohd. "Real-time data compression for machine vision measurement systems." Thesis, Liverpool John Moores University, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.285284.

Fang, Yajun. "Fusion-layer-based machine vision for intelligent transportation systems." Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/60143.

Rose, Valerie. "Automatic multilevel feature abstraction in adaptable machine vision systems." Thesis, Open University, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.534389.

Dunn, Mark. "Applications of vision sensing in agriculture." University of Southern Queensland, Faculty of Engineering and Surveying, 2007. http://eprints.usq.edu.au/archive/00004102/.

Kelly, P. D. "Environmental analysis through integration of geographical information systems and machine vision systems." Thesis, Queen's University Belfast, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.419439.

Eaton, Gilbert A. "Machine vision approach to identifying and grading Strawberries." Thesis, Griffith University, 2020. http://hdl.handle.net/10072/393978.

Zhang, Jingbing. "On flexibly integrating machine vision inspection systems in PCB manufacture." Thesis, Loughborough University, 1992. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.314613.

Andrews, Michael J. "An Information Theoretic Hierarchical Classifier for Machine Vision." Digital WPI, 1999. https://digitalcommons.wpi.edu/etd-theses/807.

Öberg, Filip. "Football analysis using machine learning and computer vision." Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-85276.

Khalili, K. "Enhancing vision data using prior knowledge for assembly applications." Thesis, University of Salford, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.360432.

Mashner, Michael. "Multirate machine vision based Kalman filtering and state feedback control." Thesis, Georgia Institute of Technology, 2002. http://hdl.handle.net/1853/16082.

MacLaren, Ian J. H. (Ian James Henry) Carleton University Dissertation Information and Systems Science. "Machine identification of facial images." Ottawa, 1989.

Ruffner, Matt Phillip. "DESIGN OF A MACHINE VISION CAMERA FOR SPATIAL AUGMENTED REALITY." UKnowledge, 2018. https://uknowledge.uky.edu/ece_etds/129.

Bari, Farooq. "A machine vision system for classifying rectangular cabinet frames." Thesis, This resource online, 1994. http://scholar.lib.vt.edu/theses/available/etd-12042009-020159/.

Oechsle, Olly. "Towards the automatic construction of machine vision systems using genetic programming." Thesis, University of Essex, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.510504.

Mammarella, Marco. "Evaluation of machine vision techniques for use within flight control systems." Morgantown, W. Va. : [West Virginia University Libraries], 2008. https://eidr.wvu.edu/etd/documentdata.eTD?documentid=5970.

Stoddart, Evan. "Computer Vision Techniques for Automotive Perception Systems." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1555357244145006.

Darrell, Trevor Jackson. "Perceptive agents with attentive interfaces : learning and vision for man-machine systems." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/29107.

Harshana, Habaragamuwa. "Deep Convolutional Neural Network's Applicability and Interpretability for Agricultural Machine Vision Systems." Kyoto University, 2018. http://hdl.handle.net/2433/235995.

Beale, Dan. "Autonomous visual learning for robotic systems." Thesis, University of Bath, 2012. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.558886.

Van, Horne Chris. "Machine Vision and Autonomous Integration Into an Unmanned Aircraft System." International Foundation for Telemetering, 2013. http://hdl.handle.net/10150/579707.

Zhang, Zhengwen. "Self-learning systems and neural networks for image texture analysis." Thesis, Brunel University, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.296217.

Megahed, Fadel Mounir. "Towards the Utilization of Machine Vision Systems as an Integral Component of Industrial Quality Monitoring Systems." Thesis, Virginia Tech, 2009. http://hdl.handle.net/10919/36243.

Kim, Sunyoung. "The mathematics of object recognition in machine and human vision." CSUSB ScholarWorks, 2003. https://scholarworks.lib.csusb.edu/etd-project/2425.

Cheng, Kelvin. "Direct interaction with large displays through monocular computer vision." Connect to full text, 2008. http://ses.library.usyd.edu.au/handle/2123/5331.

Bowskill, Jeremy Michael. "An object-oriented framework for integrating vision systems into surface mount technology manufacturing." Thesis, University of Brighton, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.260975.

Dunn, Zelda. "Improved feed utilisation in cage aquaculture by use of machine vision." Thesis, Stellenbosch : Stellenbosch University, 2008. http://hdl.handle.net/10019.1/2824.

Niemi, Mikael. "Machine Learning for Rapid Image Classification." Thesis, Linköpings universitet, Institutionen för medicinsk teknik, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-97375.

Irving, Paul Anthony. "The development of optical techniques for component inspection in the aerospace industry." Thesis, University of Huddersfield, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.277709.

Rasolzadeh, Babak. "Visual Attention in Active Vision Systems : Attending, Classifying and Manipulating Objects." Doctoral thesis, KTH, Datorseende och robotik, CVAP, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-53484.

English, Jonathan. "Machine vision for the determination of identity, orientation and position of two dimensional industrial components." Thesis, De Montfort University, 1996. http://hdl.handle.net/2086/4811.

Parvez, Bilal. "Embedded Vision Machine Learning on Embedded Devices for Image classification in Industrial Internet of things." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-219622.

Bihi, Thabo George. "Assembly-setup verification and quality control using machine vision within a reconfigurable assembly system." Thesis, [Bloemfontein?] : Central University of Technology, Free State, 2014. http://hdl.handle.net/11462/188.

Ejdeholm, Dawid, and Jacob Harsten. "Detection of common envoirmental interferences in front of a camera lens." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-37350.

Paul, Rohan. "Long term appearance-based mapping with vision and laser." Thesis, University of Oxford, 2012. http://ora.ox.ac.uk/objects/uuid:8d59bf8c-bec8-4782-b100-aa80d1136802.

Adeboye, Taiyelolu. "Robot Goalkeeper : A robotic goalkeeper based on machine vision and motor control." Thesis, Högskolan i Gävle, Avdelningen för elektronik, matematik och naturvetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:hig:diva-27561.

Nadella, Sunita. "Effect of machine vision based traffic data collection accuracy on traffic noise." Ohio : Ohio University, 2002. http://www.ohiolink.edu/etd/view.cgi?ohiou1174681979.

Larsson, Joel, and Rasmus Hedberg. "Development of machine learning models for object identification of parasite eggs using microscopy." Thesis, Uppsala universitet, Signaler och system, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-414386.

Almin, Fredrik. "Detection of Non-Ferrous Materials with Computer Vision." Thesis, Linköpings universitet, Datorseende, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-175519.

Nilsson, Jim, and Peter Valtersson. "Machine Vision Inspection of the Lapping Process in the Production of Mass Impregnated High Voltage Cables." Thesis, Blekinge Tekniska Högskola, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-16707.

Jerkenhag, Joakim. "Comparing machine learning methods for classification and generation of footprints of buildings from aerial imagery." Thesis, Blekinge Tekniska Högskola, Institutionen för programvaruteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-18543.

Erlandsson, Niklas. "Utilizing machine learning in wildlife camera traps for automatic classification of animal species : An application of machine learning on edge devices." Thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-104952.

Morar, Sameer Harish. "The use of machine vision to describe and evaluate froth phase behaviour and performance in mineral flotation systems." Doctoral thesis, University of Cape Town, 2010. http://hdl.handle.net/11427/11712.

Olson, Clinton Leif. "Leveraging Contextual Relationships Between Objects for Localization." PDXScholar, 2015. https://pdxscholar.library.pdx.edu/open_access_etds/2204.

Gheorghe, I. V. "Semantic segmentation of terrain and road terrain for advanced driver assistance systems." Thesis, Coventry University, 2015. http://curve.coventry.ac.uk/open/items/42ddefa0-42d3-4e6e-81d4-7b84452652a5/1.

Bartoli, Giacomo. "Edge AI: Deep Learning techniques for Computer Vision applied to embedded systems." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/16820/.

Zhu, Yuehan. "Automated Supply-Chain Quality Inspection Using Image Analysis and Machine Learning." Thesis, Högskolan Kristianstad, Fakulteten för naturvetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:hkr:diva-20069.

DIMAS, FIRMANDA AL RIZA. "Potato surface defect detection using machine vision systems based on spectral reflection and fluorescence characteristics in the UV-NIR region." Kyoto University, 2019. http://hdl.handle.net/2433/244556.

Kindson The Genius

Providing the best learning experience for professionals

10 Machine Learning Project (Thesis) Topics for 2020


Are you looking for interesting project ideas for your thesis, project or dissertation? A machine learning topic is a very good one to write on. I have outlined 10 different topics. These topics are good choices because you can easily obtain the dataset (I will provide the link to each dataset) and you can also get some support from me. Let me know if you need any support in preparing your thesis.

You can leave a comment below in the comment area.

1.  Machine Learning Model for Classification and Detection of Breast Cancer (Classification)

The data is provided by the Oncology department. It lists instances together with their related attributes, nine in all.

You can obtain the dataset from here
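
As a starting point for this kind of classification task, here is a minimal sketch of a k-nearest-neighbours classifier in plain Python. The records, the `benign`/`malignant` labels and the function name `knn_predict` are made up for illustration and are not taken from the actual dataset; only the count of nine attributes per record follows the description above.

```python
import math
from collections import Counter


def knn_predict(train_rows, train_labels, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    # Pair every training row's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical records with nine integer attributes each (illustrative only).
toy_rows = [
    (1, 1, 1, 1, 2, 1, 2, 1, 1),
    (2, 1, 1, 1, 2, 1, 3, 1, 1),
    (1, 2, 1, 1, 2, 1, 1, 1, 1),
    (8, 7, 8, 6, 7, 9, 7, 8, 3),
    (9, 8, 7, 7, 6, 10, 8, 9, 2),
    (7, 9, 8, 8, 7, 8, 6, 7, 4),
]
toy_labels = ["benign"] * 3 + ["malignant"] * 3

print(knn_predict(toy_rows, toy_labels, (2, 2, 1, 1, 2, 1, 2, 1, 1)))
```

For a real thesis you would hold out part of the data for testing and compare several classifiers rather than rely on one.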

2. Intelligent Internet Ads Generation (Classification)

This is one of the most interesting topics for me. The reason is that the revenue generated or spent by an ad campaign depends not just on the volume of ads, but also on their relevance. It is therefore possible to increase revenue and reduce spending by developing a machine learning model that selects relevant ads with a high level of accuracy. The dataset provides a collection of ads as well as the structure and geometry of the ads.

Get the ads dataset from here

3. Feature Extraction for National Census Data (Clustering)

This looks like big data stuff. But no! It’s simply a dataset you can use for analysis. It is actual data obtained from the US census in 1990. There are 68 attributes for each record, and clustering is performed to identify trends in the data.

You can obtain the census dataset from here
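
To give a feel for what clustering does, here is a minimal k-means sketch in plain Python. It is only an illustration under stated assumptions: the two-dimensional points are made up (the real records have 68 attributes), and the function name `kmeans` and its parameters are my own choices, not part of any dataset or library.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster points into k groups; return (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points

    def nearest(point):
        # Index of the centroid closest to this point.
        return min(range(k), key=lambda i: math.dist(point, centroids[i]))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            clusters[nearest(point)].append(point)
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                # Move the centroid to the mean of its assigned points.
                mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids

    return centroids, [nearest(point) for point in points]


# Hypothetical two-attribute records forming two obvious groups.
toy_points = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 9.0), (8.5, 9.5), (9.0, 8.8)]
centroids, assignments = kmeans(toy_points, k=2)
print(assignments)
```

On the real data you would first scale or encode the attributes and try several values of `k`, since k-means is sensitive to both.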

4. Movie Outcome Prediction (Classification)

This is quite a tasking project, but it’s quite interesting. Models already exist that predict the ratings of movies on a scale of 0 to 10 or 1 to 5. But this takes it a step further: you actually need to determine the outcome of the movie. The dataset is a large multivariate dataset of movie director, cast, individual roles of the actors, remarks, studio and relevant documents.

You can get the movies dataset from here

5. Forest Fire Area Coverage Prediction (Regression)

This project has been classified as difficult, but I don’t think so. The objective is to predict the area affected by forest fires. The dataset includes relevant meteorological information and other parameters taken from a region of Portugal.

You can get the fire dataset from here
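
The simplest baseline for a regression task like this is a straight-line fit with ordinary least squares. The sketch below assumes a single made-up predictor (temperature) and made-up burned areas; the numbers and the names `fit_line` and `predict` are hypothetical and only show the mechanics.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and co-variation of x with y, around their means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted target value for a new input x."""
    return slope * x + intercept


# Hypothetical values: temperature in degrees Celsius vs. burned area in hectares.
temperatures = [10, 15, 20, 25, 30]
burned_areas = [1.0, 2.0, 3.0, 4.0, 5.0]

slope, intercept = fit_line(temperatures, burned_areas)
print(predict(22, slope, intercept))
```

A real model would use several meteorological inputs at once, and it is worth checking how skewed the target values are before choosing a method.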

6. Atmospheric Ozone Level Analysis and Detection (Clustering)

Two ground ozone datasets are provided for this. Data includes temperatures at various times of the day as well as wind speed. The data included in the dataset was collected in a span of 6 years from 1998 to 2004.

You can get the Ozone dataset from here

7. Crime Prediction in New York City (Regression)

If you have watched the TV series ‘Person of Interest’, created by Jonathan Nolan, then you will appreciate the fact that there is a possibility of predicting violent criminal activities before they actually occur. The dataset would contain historical data on crime rates and the types of crime occurring per region.

You can get the crime dataset from here

8. Sentiment Analysis on Amazon ECommerce User Reviews (Classification)

The dataset for this project is derived from user review comments from Amazon users. The model should be able to perform analysis on the training dataset and come up with a model that classifies the reviews based on sentiments. Granularity can be improved by generating predictions based on location and other factors.

You can get the reviews dataset from here
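
One common baseline for sentiment classification is a bag-of-words Naive Bayes model. Here is a minimal sketch in plain Python, assuming made-up reviews and two labels (`positive`, `negative`); the function names `train_naive_bayes` and `classify` are my own, and real reviews would need proper tokenisation and far more data.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labelled_reviews):
    """Count words per class from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocabulary = set()
    for text, label in labelled_reviews:
        words = text.lower().split()
        word_counts[label].update(words)
        class_counts[label] += 1
        vocabulary.update(words)
    return word_counts, class_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, class_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training reviews (illustrative only).
toy_reviews = [
    ("great product works perfectly", "positive"),
    ("love it excellent quality", "positive"),
    ("terrible product broke quickly", "negative"),
    ("poor quality waste of money", "negative"),
]
model = train_naive_bayes(toy_reviews)
print(classify("excellent product love it", model))
```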

9. Home Electrical Power Consumption Analysis (Regression)

Everyone uses electricity at home. Or rather, almost everyone! Would it not be great to have a system that helps to predict electricity consumption? The training dataset provided for this project includes features such as the size of the home, duration and more.

You can get the dataset from here

10. Predictive Modelling of Individual Human Knowledge (Classification and Clustering)

Here the available dataset provides a collection of data about an individual on a subject matter. You are required to create a model that tries to quantify the amount of knowledge the individual has on the given subject. You can be creative by also trying to infer the performance of the user on certain exams.

I hope these 10 Machine Learning Project topics will be helpful to you.

Thanks for reading, and do leave a comment below if you need some support.


Kindson Munonye is currently completing his doctoral program in Software Engineering at the Budapest University of Technology and Economics.

2 thoughts on “10 Machine Learning Project (Thesis) Topics for 2020”

Is there any suggestion related to educational data mining?

I’m working on this. You can subscribe to my channel so when I make the update, you can get notified https://www.youtube.com/channel/UCvHgEAcw6VpcOA3864pSr5A


Finished Papers

machine vision thesis topics

Original Drafts

Perfect Essay

machine vision thesis topics

Affiliate program

Refer our service to your friend and receive 10% from every order

Finished Papers

Student Feedback on Our Paper Writers

Customer Reviews


  1. 7 Stages Of A Machine Vision Project

  2. (PDF) Research and Application of Machine Vision in Industry

  3. Top 5 Thesis Topics for Machine Learning [Customized Research Support]

  4. Master thesis: Real time machine vision on FPGA.

  5. (PDF) Literature Review of Machine Vision in Application Field

  6. Introduction to Machine Vision




  1. Theses

    A list of completed theses and new thesis topics from the Computer Vision Group. Are you about to start a BSc or MSc thesis? Please read our instructions for preparing and delivering your work.

  2. Deep learning, machine vision in agriculture in 2021

    The use of artificial intelligence has expanded rapidly in recent years. Researchers from various fields of science apply neural networks, machine learning and machine vision in practice.

  3. Dissertations / Theses on the topic 'Machine vision'

    Dissertations / Theses on the topic 'Machine vision'. Author: Grafiati. Published: 4 June 2021. Last updated: 6 September 2023.

  4. Top 10 Research and Thesis Topics for ML Projects in 2022

    Computer Vision, Natural Language Processing, Cognitive Computing, AIOps, Conversational AI, Emotional AI, Face/Image Recognition, Self-Driving Cars, Big Data Analytics, Data Science, Business Analytics, Business Intelligence, Augmented Analytics, Data Management, People Analytics, Text Analytics, Speech Analytics, Cloud Computing, Edge Computing, Quantum Computing

  5. The Future of AI Research: 20 Thesis Ideas for Undergraduate ...

    Each thesis idea includes an introduction, which presents a brief overview of the topic and the research objectives. The ideas provided are related to different areas of machine learning and deep learning, such as computer vision, natural language processing, robotics, finance, drug discovery, and more.

  6. Computer Vision Group

    Research Areas. Our research group is working on a range of topics in Computer Vision and Image Processing, many of which use Artificial Intelligence. Computer Vision is about interpreting images. More specifically, the goal is to infer properties of the observed world from an image or a collection of images. Our work combines a range of mathematical domains including ...

  7. 10 Compelling Machine Learning Ph.D. Dissertations for 2020

    This dissertation explores three topics related to random forests: tree aggregation, variable importance, and robustness. 10. Climate Data Computing: Optimal Interpolation, Averaging, Visualization and Delivery. This dissertation solves two important problems in the modern analysis of big climate data.

  8. Research Topics

    The current plethora of imaging technologies such as magnetic resonance imaging (MR), computed tomography (CT), positron emission tomography (PET), optical coherence tomography (OCT), and ultrasound provide great insight into the different anatomical and functional processes of the human body. Computer vision is the science and ...

  9. Efficient Implementations of Machine Vision Algorithms using a

    That is, this thesis is about a different way of implementing machine vision systems. The work could be applied to prototype and in some cases implement machine vision systems in industrial ...

  10. Dissertations / Theses: 'Computer vision;Machine learning'

    The research topics can also be categorized by the equipment or techniques used, for example, image processing, computer vision, machine learning, and localization. This dissertation primarily reports on computer vision and machine learning algorithms and their implementations for autonomous vehicles.

  11. Dissertations / Theses on the topic 'Machine vision ...

    The thesis focuses on machine vision for optical online quality inspection of the cutting knives in a wood chipper, which is also the title of the thesis. The work is focused on measuring, in real time, the quality of cutting knives that move at a speed of 45 m/s in a wood chipper.

  12. Computer Vision really cool ideas for a thesis? : r/computervision

    herrtim: One aspect of all computer vision that will always need improving is optimization and speed improvements of existing classification, tracking, recognition and machine learning algorithms. To me that's the most exciting area and where the most impact can be had.

  13. Dissertations / Theses: 'Machine vision; Computer'

    Dissertations / Theses on the topic 'Machine vision; Computer'. Author: Grafiati. Published: 4 June 2021. Last updated: 2 February 2022.

  14. Dissertations / Theses: 'Machine vision; Neural networks'

    Dissertations / Theses on the topic 'Machine vision; Neural networks'. Author: Grafiati. Published: 4 June 2021. Last updated: 16 February 2022.

  15. Computer vision Research Topics Ideas

    List of Computer vision Research Topics Ideas for MS and PH.D. 1. Deep learning-enabled medical computer vision. 2. Deep learning and computer vision will transform entomology. 3. Exploring human-nature interactions in national parks with social media photographs and computer vision. 4.

  16. Computer vision research topic ideas (UNDERGRAD)

    Pattern recognition, machine learning and deep learning are topics that have recently seen more work. Furthermore, these three topics are closely related to each other. Vision-based inventory system. You ...

  17. Struggling to find a research topic in computer vision for masters' thesis

    Computer Vision is the scientific subfield of AI concerned with developing algorithms to extract meaningful information from raw images, videos, and sensor data. This community is home to the academics and engineers both advancing and applying this interdisciplinary field, with backgrounds in computer science, machine learning, robotics ...

  18. Dissertations / Theses: 'Machine vision systems'

    List of dissertations / theses on the topic 'Machine vision systems'. Scholarly publications with full text pdf download. Related research topic ideas.

  19. Machine-vision

    Research on Printing Image Quality Detection Technology Based on HALCON, JinCan / Central South University, 0/42. Research and Implementation of the System of Glass Bottles On-line Detection Based on Machine Vision, WeiSongLin / Huazhong University of Science and Technology, 0/6.

  20. 10 Machine Learning Project (Thesis) Topics for 2020

    1. Machine Learning Model for Classification and Detection of Breast Cancer (Classification) The data is provided by the Oncology department and details instances and related attributes, which are nine in all. You can obtain the dataset from here. 2. Intelligent Internet Ads Generation (Classification) This is one of the most interesting topics ...
