I have 3+ years of work experience in the computer vision field and a Bachelor's degree in Electrical and Electronics.
I'm a self-taught, fast learner with good problem-solving skills. I'm keenly interested in building and training
SOTA deep learning models from scratch on custom datasets. I'm eager to master computer vision
and deep learning in order to make machine vision as competent as human vision.
I have a great passion for high-tech solutions that address real-world problems, drawing on my background in
deep learning and machine learning. In 2018, I began focusing on computer vision as an intermediate field
because I believe that sooner or later AI will inevitably impact all aspects of our lives and industries.
The goals of this project are to build a modular, scalable, and extensible codebase and to create a skeleton-based, multi-person action recognition system that runs in real time. The complete system combines a human pose estimation model, a DeepSort tracker, and a multilayer perceptron (MLP) classifier.
Among these models, I mainly focus on the pose estimation and tracking models to achieve decent accuracy with fast inference speed in crowded and occluded scenarios.
- For pose estimation, I used a model with a bottom-up approach because it is faster than top-down approaches, especially when there are multiple people in the image. As usual in deep learning, however, there is a trade-off between accuracy and speed.
- For tracking, I modified the original DeepSort source code and functionality to be compatible with skeleton outputs. I also trained custom re-identification (ReID) models for DeepSort and converted them to TensorRT and ONNX.
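One way a box-based tracker such as DeepSort can consume skeleton outputs is to derive a bounding box from each person's keypoints. The sketch below illustrates that idea only; it is not the project's actual code. The `(x, y, confidence)` keypoint layout, the `keypoints_to_bbox` name, and the `min_conf` and `pad` parameters are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Assumed keypoint layout: (x, y, confidence) in pixel coordinates.
Keypoint = Tuple[float, float, float]
# Box as (left, top, width, height), a common input format for box trackers.
Box = Tuple[float, float, float, float]


def keypoints_to_bbox(
    keypoints: List[Keypoint],
    min_conf: float = 0.3,  # hypothetical threshold for trusting a joint
    pad: float = 0.1,       # hypothetical margin, as a fraction of box size
) -> Optional[Box]:
    """Return a padded box around the confident joints, or None if there are none."""
    visible = [(x, y) for x, y, conf in keypoints if conf >= min_conf]
    if not visible:
        return None  # nothing reliable to track for this skeleton

    xs = [x for x, _ in visible]
    ys = [y for _, y in visible]
    left, top = min(xs), min(ys)
    width, height = max(xs) - left, max(ys) - top

    # Grow the box on every side so the crop covers the body, not just the joints.
    return (
        left - pad * width,
        top - pad * height,
        width * (1 + 2 * pad),
        height * (1 + 2 * pad),
    )


if __name__ == "__main__":
    skeleton = [(10.0, 20.0, 0.9), (30.0, 60.0, 0.8), (500.0, 500.0, 0.1)]
    print(keypoints_to_bbox(skeleton, pad=0.0))
```

The low-confidence joint at `(500, 500)` is ignored, so occluded or misdetected joints do not inflate the box passed to the tracker and to the ReID crop.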
The purpose of this project is to learn about the differences between background removal algorithms and semantic segmentation models, and to easily replace the background of our videos and images with any background image or video clip. A pretrained MODNet is used to complete this project.
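The background replacement step can be illustrated with plain alpha compositing. This is a minimal sketch, assuming the matting model yields a per-pixel alpha matte in [0, 1] with the same size as the frame. The function names and the nested-list image layout are hypothetical; a real pipeline would more likely use vectorized array operations.

```python
from typing import List, Sequence, Tuple

Pixel = Sequence[float]     # one pixel's channel values, e.g. (R, G, B)
Image = List[List[Pixel]]   # rows of pixels (hypothetical layout)
Matte = List[List[float]]   # rows of alpha values in [0, 1]


def composite_pixel(fg: Pixel, bg: Pixel, alpha: float) -> Tuple[int, ...]:
    """Blend one pixel: alpha * foreground + (1 - alpha) * background."""
    return tuple(round(alpha * f + (1.0 - alpha) * b) for f, b in zip(fg, bg))


def replace_background(fg_img: Image, bg_img: Image, matte: Matte) -> List[List[Tuple[int, ...]]]:
    """Composite the foreground over a new background, pixel by pixel."""
    return [
        [composite_pixel(f, b, a) for f, b, a in zip(fg_row, bg_row, matte_row)]
        for fg_row, bg_row, matte_row in zip(fg_img, bg_img, matte)
    ]


if __name__ == "__main__":
    frame = [[(200, 100, 0), (200, 100, 0)]]
    new_bg = [[(0, 100, 200), (0, 100, 200)]]
    alpha = [[1.0, 0.0]]  # first pixel is the person, second is background
    print(replace_background(frame, new_bg, alpha))
```

Fractional alpha values, which a matting model produces around hair and other soft edges, blend the two images smoothly. A hard semantic segmentation mask, by contrast, only gives 0 or 1 per pixel.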