Hi, I'm Sushant Gautam.

As a dedicated Computer Science graduate student with a focus on Machine Learning and Data Science, I am seeking an internship/full-time machine learning engineer position where I can leverage my expertise in Machine Learning, Deep Learning, Computer Vision, and AWS Cloud technologies. My goal is to apply my theoretical knowledge and innovative ideas to challenging projects, thereby gaining invaluable hands-on experience.

The only way to do great work is to love what you do.

About

Machine Learning Engineer with Expertise in AI Development and MLOps

As a Computer Science graduate student, I specialize in Machine Learning, Data Science, and AI development. Since 2017, I have been driven by a passion for leveraging AI to transform complex data into actionable insights.

My expertise lies in building machine learning models, deploying robust MLOps pipelines, and optimizing workflows for secure and efficient AI applications. I am also actively expanding my knowledge in scalable cloud-based deployment architectures to further enhance my skill set.

Inspired by the transformative potential of AI, I am dedicated to solving real-world challenges and collaborating with innovative teams to deliver impactful, data-driven solutions.

Experience

Research Assistant
  • Improved image restoration accuracy by 20% by implementing novel generative models for the WRIVA project, in collaboration with Stanford, Princeton University, and BlueHalo. Used GAN and diffusion models to restore global artifacts such as blurring, overexposure, lens flare, and JPEG compression, resulting in a 15% increase in image quality.
  • Created a CI/CD process that lints the code, builds it into a container, runs tests and vulnerability scans, and pushes the machine learning software to the registry for use in the orchestration software Airflow.
  • Tools used during my research include PyTorch Lightning, Optuna, PyTorch, scikit-learn, and Docker.
August 2023 - Present | Starkville, USA
Machine Learning Engineer
  • Spearheaded the research and development of advanced AI models for detecting and classifying brain-related conditions, including Large Vessel Occlusion (LVO) and hypodensity, using deep learning models such as YOLOv7 and Detectron2 to enhance diagnostic accuracy and efficiency.
  • Engineered and deployed robust APIs for the AI models, facilitating seamless integration into production environments. This deployment significantly improved the operational workflow and led to more reliable and rapid disease diagnosis.
Feb 2023 - August 2023 | Victoria, Australia (Remote)
Research Scientist (3-month contract)
  • Pioneering Research in Synthetic Datasets: Spearheaded research initiatives on synthetic datasets derived from a limited number of authentic disease images. This innovative approach focused on enhancing data availability for robust model training, particularly in scenarios with scarce real-world data.
  • Development of GANs for Disease Image Generation: Played a crucial role in designing and developing Generative Adversarial Networks (GANs). These models were adept at producing high-quality synthetic disease images. The use of these GAN-generated images was instrumental in significantly elevating the accuracy of disease identification models. This breakthrough demonstrated the potential of GANs in overcoming data limitations in medical imaging.
  • Leadership and Training: As team leader, led a group of researchers and developers in applying this research to product development. Also served as trainer, teaching essential skills related to GANs and synthetic datasets, and mentored the team to translate research findings into practical, market-ready healthcare solutions.
Feb - April 2022 | San Francisco, CA (Remote)
Lecturer
Taught Bachelor in Computer Application students.
  • Spring 2022
    • CACS354 - Advance Java Programming (3 cr)
    • CACS355 - Network Programming with Java (3 cr)
  • Summer 2022
    • CACS204 - Object Oriented Java Programming (3 cr)
    • CACS402 - Cloud Computing (3 cr)
    • CACS456 - Machine Learning (3 cr)
  • Spring 2023
    • CACS354 - Advance Java Programming (3 cr)
    • CACS355 - Network Programming with Java (3 cr)
Dec 2021 - June 2023 | Lalitpur, Nepal
Machine Learning/Computer Vision Engineer
  • Worked under Prof. Dr. Suresh Manandhar.
  • Collaborated as a key member of an international team specializing in the research and design of machine learning and computer vision solutions for healthcare applications.
  • Led the development of APIs and user interfaces for disease recognition models, enhancing the usability and accessibility of machine learning tools in medical diagnostics.
  • Engineered a custom Machine Learning Pipeline tailored for medical image diagnosis, enabling dynamic, user-driven model training online for various diseases.
  • Innovatively researched and developed a Deep Learning Medical Image Diagnosis Model, achieving significant performance improvements in detecting diabetic retinopathy, skin lesions, and lung diseases.
Oct 2020 - Oct 2021 | Bellevue, WA, USA (Remote)
Deep Learning Engineer | Computer Vision Developer
  • Built a video summarization and scene-finding project using state-of-the-art deep learning models.
May 2020 - Sep 2020 | (Remote)
Machine Learning Research Intern
  • Worked under Prof. Dr. Suresh Manandhar.
  • Developed language models to improve offline handwriting recognition.
  • Synthesized handwritten images with GANs.
    • Used different types of generative models, such as CycleGAN, Text2Image GAN, and a variational autoencoder.
Jan 2020 - June 2020 | Kathmandu, Nepal

PROJECTS

Deep Learning Projects

Handwritten Detection

Handwritten Line Text Recognition using Deep Learning with Tensorflow

Accomplishments
  • Tools: Flask, HTML, CSS, Bootstrap, Tensorflow, OpenCV
  • This was my major project. After extensive research, I identified a state-of-the-art architecture for real-time handwriting recognition, achieving a character error rate (CER) of 4.32%.
  • Uses a CNN, an LSTM, and the CTC loss function, trained on the IAM dataset together with a self-created dataset, to recognize handwritten text in real time.
  • Performed data augmentation to make the model more robust and improve accuracy.
  • Also modified the architecture to recognize handwritten images in Devanagari script with a CER of 8.32%.
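
For readers unfamiliar with the CER metric quoted in this project, here is a minimal sketch of how it is commonly computed: the Levenshtein edit distance between the recognized text and the ground truth, divided by the length of the ground truth. The function names are illustrative assumptions and are not taken from the project code.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # previous[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One wrong character out of eleven in the reference.
print(character_error_rate("hello world", "hallo world"))
```

Under this definition, a CER of 4.32% corresponds to roughly four character edits per hundred ground-truth characters.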
Nepali Handwritten Detection

Nepali Handwritten Line Text Recognition using Deep Learning with Tensorflow

Accomplishments
  • Tools: Flask, HTML, CSS, Bootstrap, Tensorflow, OpenCV
  • This was my major project. After extensive research, I identified a state-of-the-art architecture for real-time handwriting recognition, achieving a character error rate (CER) of 4.32%.
  • Uses a CNN, an LSTM, and the CTC loss function, trained on the IAM dataset together with a self-created dataset, to recognize handwritten text in real time.
  • Performed data augmentation to make the model more robust and improve accuracy.
  • Also modified the architecture to recognize Nepali handwritten images in Devanagari script with a CER of 10.32%.
Neural Style Transfer

Neural Style Transfer Paper Implementation with PyTorch.

Accomplishments
  • Neural Style Transfer paper implementation with PyTorch.
  • Worked through all the techniques described in the [paper](https://arxiv.org/abs/1508.06576).
Nepali Poem Generation

Nepali poem generation using AI.

Accomplishments
  • Tools: NumPy, TensorFlow, Keras, NLP
  • This project uses a char-RNN model to generate Nepali poems.
  • Trained on more than 200k characters of Nepali poems by Adikabi "Laxmi Prasad Devkota", a pioneer of Nepali literature.
  • Accuracy would increase if more training data were available.
Reverse Image Search Engine with Python

Reverse image search engine with transfer learning and k-nearest neighbors.

Accomplishments
  • Tools: NumPy, scikit-learn, TensorFlow v2, Flask
  • Image search engine using a deep learning model (ResNet50).
  • Uses the ResNet50 model to extract features from the Caltech101 dataset, then fits a k-nearest neighbors model with the brute-force algorithm to find the nearest n neighbors based on Euclidean distance.
  • The n images with the smallest distances to the query are returned.
  • Also includes some analysis of how to increase speed and accuracy.
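
The retrieval step described above can be sketched as follows. This is only an illustration under stated assumptions: the short toy vectors stand in for ResNet50 feature vectors, the file names are hypothetical, and the brute-force search is written in plain Python rather than with scikit-learn so that the example is self-contained.

```python
import math
from typing import Dict, List, Sequence, Tuple


def nearest_images(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    n: int = 3,
) -> List[Tuple[str, float]]:
    """Brute-force search: compare the query against every indexed vector."""
    # Euclidean distance from the query to each stored feature vector.
    scored = [(math.dist(query, features), name) for name, features in index.items()]
    scored.sort()  # smallest distance first
    return [(name, distance) for distance, name in scored[:n]]


# Toy 3-dimensional "features"; a real index would hold CNN embeddings.
toy_index = {
    "cat_1.jpg": [0.9, 0.1, 0.0],
    "cat_2.jpg": [0.8, 0.2, 0.1],
    "car_1.jpg": [0.0, 0.9, 0.8],
}

for name, distance in nearest_images([1.0, 0.0, 0.0], toy_index, n=2):
    print(name, round(distance, 3))
```

Brute-force search is exact but scales linearly with the number of indexed images, which is why speed becomes a concern on larger datasets.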
Celebrity Face Image Generation with GANs

Fake Celebrity face image generation with DCGAN trained on Kaggle Celebrities 100k Dataset.

Accomplishments
  • Tools: Python, PyTorch
  • Architecture: DCGAN
  • DCGAN Paper Implementation with some improvements
  • Trained for 60 epochs; the generated output is satisfactory.
Music Generation with AI

A simple and extensible project that generates music in character-level ABC notation, similar to Irish folk songs, with an LSTM in Keras.

Accomplishments
  • Tools: Python, Keras, CUDA
  • Trained a char-RNN model on character-level ABC notation.
  • Accuracy: 80% and loss: 0.6
Self-Driving Car Projects

This project simulates the autopilot's key function of predicting the steering angle, using the front-view image from the car as input.

Accomplishments
  • Tools: Python, PyTorch, OpenCV
  • Trained an end-to-end architecture for steering angle prediction.
Interact with Paper

This project presents a simple approach to interacting with scientific papers through a blend of summarization and question answering, leveraging the strengths of large language models (LLMs) and retrieval-augmented generation (RAG).

Accomplishments
  • Tools: Python, PyTorch, OpenAI
  • The workflow integrates document parsing, summary generation, embedding computation, and similarity-based retrieval to create a responsive and informative user interface.
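
The similarity-based retrieval step in this workflow can be sketched roughly as below. This is an assumption-laden illustration, not the project's code: the two-dimensional vectors are toy stand-ins for real embeddings, and the chunk texts and function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vector: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    top_k: int = 2,
) -> List[str]:
    """Return the text of the top_k chunks most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, chunk[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


# Toy (text, embedding) pairs standing in for parsed paper sections.
paper_chunks = [
    ("Abstract: we propose a new model.", [1.0, 0.0]),
    ("Methods: the model is trained end to end.", [0.7, 0.7]),
    ("References", [0.0, 1.0]),
]

print(retrieve([1.0, 0.1], paper_chunks, top_k=2))
```

In a RAG setup, the retrieved chunks would then be passed to the language model as context for answering the user's question.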

Other Projects

Encryption And Decryption

A simple and powerful encryption and decryption application with a custom ASCII-shifting algorithm.

Accomplishments
  • Tools: Java, Swing
  • The main logic reads the ASCII value of each character in the string and shifts it (adds to it) by the key plus an extra value; decryption subtracts the same key plus extra value to recover the original information.
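
The shifting logic described above can be sketched as follows. The original application is written in Java with Swing; this Python version is only an illustration, and the `EXTRA_VALUE` constant and function names are hypothetical rather than taken from the project.

```python
EXTRA_VALUE = 7  # hypothetical fixed offset added on top of the user's key


def encrypt(text: str, key: int, extra: int = EXTRA_VALUE) -> str:
    """Shift every character's code point up by key + extra."""
    shift = key + extra
    return "".join(chr(ord(ch) + shift) for ch in text)


def decrypt(cipher: str, key: int, extra: int = EXTRA_VALUE) -> str:
    """Reverse the shift by subtracting the same key + extra."""
    shift = key + extra
    return "".join(chr(ord(ch) - shift) for ch in cipher)


secret = encrypt("Hello", key=3)
print(secret)
print(decrypt(secret, key=3))
```

This sketch does not wrap values around a fixed alphabet, so shifted characters may fall outside the printable ASCII range, and decrypting with a much larger key than was used for encryption can raise a `ValueError` from `chr`.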
Wave Simulation in C++

Wave simulation application in C++ that performs various operations between signals and displays the results with a graphics library.

Accomplishments
  • Tools: C++, graphics.h, Code::Blocks

Blog Posts & Paper Reading

Blog Posts

In my free time, I usually create blog posts on fascinating deep learning and computer vision studies that go into extensive detail about ideas, architecture, and mathematical equations.


Annotated papers

I have annotated numerous interesting computer vision papers, particularly on GANs and Vision Transformers, providing an overview of each along with the details of the mathematical calculations.


Paper Scratch Implementation

From-scratch implementations of deep learning papers with PyTorch and TensorFlow.


Skills


Languages and Databases

Python
HTML5
CSS3
MySQL
PostgreSQL
Shell Scripting

Libraries

NumPy
Pandas
OpenCV
scikit-learn
matplotlib

Frameworks

Django
Flask
Bootstrap
Keras
TensorFlow
PyTorch

Other

Git
AWS ML
CI/CD

Certificates

Extensive MLOps course conducted by the School of AI. Skills learned: Docker · Kubeflow · Kafka · Data Version Control · ML Model Deployment · Model Explanation (XAI Methods) · Distributed CUDA Training · PyTorch Lightning · Automated Machine Learning (AutoML) · Continuous Integration and Continuous Delivery (CI/CD) · AWS Lambda · AWS SageMaker · Computer Vision
Course Project Repository

Education

Mississippi State University

2023-2025 | Starkville, USA

  • Degree: Master's in Computer Science (Thesis Track)
  • Concentration: Artificial Intelligence
  • Relevant Coursework: Algorithms, Machine Learning, AI Robotics
  • GPA: 4.0

Tribhuvan University

2015-2019 | Kathmandu, Nepal

Degree: Bachelor in Computer Engineering
Grade: A

    Accomplishments

    • Received a merit-based scholarship for academic excellence in all four years of my undergraduate studies.
    • Completed my undergraduate degree in Computer Engineering with distinction.
    • Ranked first in my batch at my university.
    • Won first place in an instant software competition at my university.

    Relevant Coursework:

    • Data Structures and Algorithms
    • Database Management Systems
    • Operating Systems
    • Artificial Intelligence
    • Digital Image Processing and Pattern Recognition
    • Big Data Technologies
    • Advanced Mathematics
    • Probability and Statistics



Contact