Projects

Analyzing the Effectiveness of K-FAC on Training MLP-Mixers for Image Classification

University of Toronto, CSC2541 Neural Network Training Dynamics (Winter 2022)

K-FAC is an approximate second-order optimization method that has been shown to speed up training convergence while achieving competitive performance on deep autoencoders, convolutional networks, and RNNs. Here, we apply K-FAC to the MLP-Mixer architecture and demonstrate that, without pretraining, K-FAC outperforms SGD and Adam on CIFAR-100 image classification and on CIFAR-10 with large input sizes. Additionally, we perform learning rate grafting to confirm that the implicit learning rate schedule is the primary factor dictating classification performance.
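As a rough illustration of the grafting idea (and not the report's actual training code), the sketch below grafts the per-tensor update magnitude of one PyTorch optimizer onto the update direction of another; the `grafted_step` helper is a name made up here, and both optimizers are assumed to be built over the same parameters with gradients already computed via `loss.backward()`.

```python
import torch

def grafted_step(params, mag_opt, dir_opt):
    """One grafted update: direction from dir_opt, per-tensor step size from mag_opt."""
    params = list(params)
    snapshot = [p.detach().clone() for p in params]

    # Let the "magnitude" optimizer propose its update, record it, roll back.
    mag_opt.step()
    mag_deltas = [p.detach() - s for p, s in zip(params, snapshot)]
    with torch.no_grad():
        for p, s in zip(params, snapshot):
            p.copy_(s)

    # Same for the "direction" optimizer.
    dir_opt.step()
    dir_deltas = [p.detach() - s for p, s in zip(params, snapshot)]
    with torch.no_grad():
        for p, s in zip(params, snapshot):
            p.copy_(s)

    # Apply the grafted step: rescale each direction delta so its norm matches
    # the magnitude optimizer's proposed step for that tensor.
    with torch.no_grad():
        for p, s, dm, dd in zip(params, snapshot, mag_deltas, dir_deltas):
            scale = dm.norm() / (dd.norm() + 1e-12)
            p.copy_(s + scale * dd)
```

For example, one could pass `torch.optim.Adam(model.parameters())` as `mag_opt` and `torch.optim.SGD(model.parameters(), lr=1.0)` as `dir_opt` to isolate Adam's implicit learning rate schedule from its update direction.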

[Project Report] [Code]

Learning Heuristics for the Minimum Latency Problem with RL and GNN

University of Toronto, MIE1666 Machine Learning for Mathematical Optimization (Fall 2021)

Recent work has successfully applied reinforcement learning (RL) to derive domain-dependent heuristics for challenging combinatorial optimization problems from a set of training instances. We build on this idea and show how RL can be applied to the Minimum Latency Problem by using a graph attention network to encode a stochastic policy that constructively extends partial paths, yielding solutions comparable to those of state-of-the-art, hand-engineered methods.
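As a sketch of the constructive setting (not the report's actual model), the snippet below computes the minimum-latency objective and rolls out a tour by sampling the next unvisited node from a policy; the uniform `policy=None` default stands in for the graph attention decoder.

```python
import torch

def total_latency(tour, dist):
    """Minimum Latency objective: sum of arrival times at each visited node.
    `tour` is a sequence of node indices starting at the depot; `dist` is an
    (n, n) distance matrix."""
    t, latency = 0.0, 0.0
    for a, b in zip(tour[:-1], tour[1:]):
        t += dist[a, b].item()
        latency += t
    return latency

def rollout(dist, policy=None):
    """Constructively build a tour by sampling unvisited nodes from a policy.
    `policy` is a placeholder for the learned graph-attention decoder."""
    n = dist.shape[0]
    tour, visited = [0], {0}  # start at the depot (node 0)
    while len(tour) < n:
        logits = torch.zeros(n) if policy is None else policy(tour, dist)
        mask = torch.tensor([i in visited for i in range(n)])
        logits = logits.masked_fill(mask, float("-inf"))  # forbid revisits
        nxt = torch.distributions.Categorical(logits=logits).sample().item()
        tour.append(nxt)
        visited.add(nxt)
    return tour

dist = torch.rand(6, 6); dist = (dist + dist.T) / 2; dist.fill_diagonal_(0)
tour = rollout(dist)
print(tour, total_latency(tour, dist))
```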

[Project Report] [Code]

Analyzing the Effect of Adversarial Inputs on Saliency Maps

University of Toronto, CSC413 Neural Networks and Deep Learning (Winter 2021)

FGSM is a method for generating adversarial examples by applying small perturbations to input images that fool deep networks into making confident but incorrect predictions. In this work, three saliency map methods (Grad-CAM, Guided Backprop, and SmoothGrad) were used to compare the salient features of original CIFAR-10 images against those of their adversarial counterparts generated with FGSM. For both untargeted and targeted attacks, quantitative similarity measurements showed that Grad-CAM was the most successful at detecting adversarial inputs.
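For reference, FGSM itself reduces to a single step in the direction of the sign of the input gradient; a minimal PyTorch sketch (the epsilon of 8/255 and the assumption that inputs lie in [0, 1] are illustrative choices, not necessarily those used in the report):

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255, targeted=False):
    """Fast Gradient Sign Method: perturb the input along the sign of the
    loss gradient. For a targeted attack, `y` is the target label and the
    step is taken against the gradient instead."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    step = -eps * grad.sign() if targeted else eps * grad.sign()
    return (x + step).clamp(0, 1).detach()
```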

[Project Report] [Code]

Real-Time Face Mask Detector using a Faster R-CNN (FPN Backbone) Architecture

University of Toronto, APS360 Applied Fundamentals of Machine Learning (Fall 2020)

A fast (~14 fps) face mask detector that accurately localizes face bounding boxes and labels whether a mask is worn (correctly) or not worn, using a Faster R-CNN architecture with an FPN backbone. Achieves 85% mAP@0.5 on a Kaggle face mask detection dataset.
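One common way to instantiate such a model (not necessarily the exact setup used in the report) is via torchvision's Faster R-CNN with a ResNet-50 FPN backbone; the four-class count below (background plus three mask labels) is an assumption about the dataset's annotation scheme.

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Start from a COCO-pretrained Faster R-CNN with a ResNet-50 FPN backbone
# (weights="DEFAULT" is the torchvision >= 0.13 API).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Swap the box-classification head for one matching the face-mask classes
# (assumed here: background, mask worn correctly, worn incorrectly, no mask).
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=4)

model.eval()
# At inference time the model takes a list of (C, H, W) float tensors and
# returns one dict per image with 'boxes', 'labels', and 'scores'.
```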

[Project Report] [Code]

Full visual odometry using single-scale and transformation-invariant feature detectors

University of Toronto, ROB501 Computer Vision for Robotics (Fall 2020)

In this project, a complete feature-based Visual Odometry (VO) pipeline is implemented, and the performance of the Harris corner detector is compared against the more recently developed SIFT and BRISK feature detectors (which are scale- and transformation-invariant). The VO pipeline is evaluated on stereo imagery from the Canadian Planetary Emulation Terrain Energy-Aware Rover Navigation Dataset.
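As a small illustration of the detectors being compared (using OpenCV and a hypothetical image filename, not the report's exact code), the feature-extraction stage of such a pipeline might look like:

```python
import cv2

# Load one (hypothetical) grayscale frame from the stereo sequence.
img = cv2.imread("left_000000.png", cv2.IMREAD_GRAYSCALE)

# Harris corners (single-scale) via OpenCV's goodFeaturesToTrack.
harris = cv2.goodFeaturesToTrack(img, maxCorners=1000, qualityLevel=0.01,
                                 minDistance=7, useHarrisDetector=True)

# SIFT and BRISK keypoints plus descriptors for matching across frames.
sift = cv2.SIFT_create()
kp_sift, desc_sift = sift.detectAndCompute(img, None)

brisk = cv2.BRISK_create()
kp_brisk, desc_brisk = brisk.detectAndCompute(img, None)

# Descriptors are then matched between stereo pairs and consecutive frames,
# triangulated, and passed to a pose solver to estimate camera motion.
```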

[Project Report] [Code]