Intro

Nikhil Das Karavatt is a dynamic Data Scientist with a relentless passion for harnessing the power of data to drive innovation and make informed decisions. With a strong academic foundation in Mechanical Engineering and a Master of Science in Data Science, Nikhil brings a unique blend of technical prowess and analytical acumen to every project.

Nikhil's career journey is a testament to adaptability and dedication. He has seamlessly transitioned from a fascination with gadgets to a deep passion for data science, demonstrating a commitment to continuous learning and growth.

With hands-on expertise in SQL, Python, and Tableau, Nikhil excels at extracting valuable insights from complex datasets. His technical proficiency extends to machine learning, neural networks, and deep learning, enabling him to develop cutting-edge solutions for real-world challenges.

Nikhil's professional journey includes impactful roles at Accenture, where he honed his problem-solving skills and automation expertise. His ability to streamline processes and improve operational efficiency has earned him accolades from clients and colleagues alike.

Beyond his technical prowess, Nikhil's dedication to community engagement is evident through his volunteer work with the Cancer Patients Aid Society. His commitment to making a positive impact extends to every facet of his life.

Nikhil's career is deeply influenced by his father's self-made success story, which instilled in him an unwavering determination to overcome challenges and pursue his aspirations with courage.

In a rapidly evolving data landscape, Nikhil Das Karavatt is at the forefront of innovation, poised to transform industries and make a meaningful difference. His journey exemplifies the fusion of technical excellence, adaptability, and a commitment to driving data-driven solutions.

Work Experience

Research Data Scientist | University of Texas at Arlington Research Institute (UTARI), USA
Sep 2023 - Oct 2024
  • Computer Vision Project:
    • Spearheaded a computer vision and image processing project focused on tracking human body motion during golf swings.
    • Used current computer vision tools to improve swing analysis and performance insights for golfers.
  • Expertise in OpenCV:
    • Applied OpenCV, a leading computer vision library, and integrated it with 3D cameras.
    • Worked with the ZED 3D camera by Stereolabs and the Intel RealSense 3D camera to capture high-precision motion data.
  • Data Collection and Management:
    • Expertly managed data collection efforts, including the creation of JSON files and leveraging pgAdmin for PostgreSQL database management.
    • Systematically gathered, stored, and analyzed essential swing data, enabling data-driven insights and enhancing the accuracy of swing analysis.
  • Advanced Analytical Techniques:
    • Applied causal inference through A/B testing, identifying a 25% improvement in system performance and data quality.
    • Raised model accuracy by 40% by implementing Dynamic Time Warping (DTW) and Dynamic Movement Primitives (DMPs).
    • Collaborated with stakeholders to interpret results and present findings through data visualization and detailed reports.
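
The Dynamic Time Warping step named above can be illustrated with a minimal sketch. It assumes two one-dimensional sequences (for example, a single joint angle sampled over time) and absolute difference as the local cost. The function name `dtw_distance` and the sample values are hypothetical and are not taken from the project.

```python
from math import inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Assumption: 1-D sequences, absolute difference as the local cost.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance only in a
                cost[i][j - 1],      # advance only in b
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# Hypothetical data: the second sequence is the first one performed more slowly.
reference = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0]
slower = [0.0, 0.0, 1.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0]
print(dtw_distance(reference, slower))
```

Because DTW allows samples to be repeated during alignment, two swings with the same shape but different tempo score as close, which a point-by-point comparison would not capture.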
Data Scientist Intern (Capstone Project) | Bank of America, Livermore, CA, USA
Jan 2024 - May 2024
  • Deep Learning Model Development:
    • Designed a cutting-edge deep learning model tailored to achieve a recall rate of over 85% in identifying DeepFakes during virtual conferences.
    • Utilized Generative Adversarial Networks (GANs) to create 500 personalized DeepFake videos, significantly enhancing model robustness.
  • Implementation of Advanced Models:
    • Engineered three high-performance models (CNN + LSTM, ResNet, and MTCNN + MesoNet) using TensorFlow, Keras, and PyTorch frameworks.
    • Integrated these models into a comprehensive system for real-time DeepFake detection and analysis.
  • Fine-Tuning and Optimization of GenAI Models:
    • Fine-tuned pre-trained GenAI predictive models to improve performance and adapt them to specific tasks, enhancing the overall model accuracy and efficiency.
    • Applied quantization together with LoRA (Low-Rank Adaptation) adapters to reduce the model's size and the compute required for training and inference.
    • Reduced the model's memory footprint through quantization, allowing faster deployment and inference with minimal loss in accuracy.
Senior Data Engineer & Analyst | Accenture, MH, India
Dec 2020 - Aug 2022
  • Data Integration and Optimization:
    • Integrated large-scale data from Oracle DB into AWS Redshift using PL/SQL, reducing processing time by 25% by using AWS Glue and S3 for optimized ETL workflows.
    • Enhanced Oracle Retail Demand Forecasting (RDF) with parameter tuning, improving forecast accuracy by 5%, and delivering actionable insights for better inventory and supply chain management.
  • Machine Learning Model Deployment:
    • Successfully deployed predictive machine learning models via OpenShift and Jenkins CI/CD, driving advanced analytics capabilities and reducing stockouts by 20%, directly impacting business operations.
    • Implemented automated workflows for continuous model training and deployment, ensuring a robust, scalable solution for real-time decision-making.
  • Cross-Functional Collaboration and Data Modeling:
    • Collaborated with data scientists, product teams, and business leaders to design and implement data models that significantly enhanced decision-making and aligned with key business objectives.
    • Worked closely with stakeholders to ensure data models were built for scalability and accuracy, supporting both strategic planning and operational efficiency.
  • Jenkins Pipeline Orchestration:
    • Streamlined Continuous Integration and Continuous Delivery (CI/CD) pipelines using Jenkins, enabling faster deployment cycles and improving software development and operational efficiencies.
  • SQL Database Expertise:
    • Executed 10-20 SQL database tasks weekly, optimizing real-time data reporting queries for business intelligence, improving data-driven decision-making across the organization.
    • Developed and maintained efficient SQL queries and reports, ensuring data quality and accuracy for analytics and reporting.
  • Diverse Deployment Management:
    • Managed 4-6 deployment requests weekly, including integrations with Informatica workflows, Autosys jobs, Oracle Forms and Reports, and OpenShift, ensuring smooth and timely project execution.
    • Led cross-functional teams to implement complex deployment processes, ensuring successful delivery of high-quality solutions on schedule.
  • Agile Methodology Expertise:
    • Collaborated in Agile teams, working closely with leads and managers to implement 10-30 weekly procedural changes, fostering a responsive, fast-paced development environment.
    • Demonstrated adaptability in meeting evolving project requirements while maintaining high standards for quality and efficiency in all deliverables.
Data Engineer | Accenture, KA, India
Jul 2018 - Dec 2020
  • POS Data Flow Automation:
    • Coordinated the seamless flow of POS data to CSPARC, automating ETL processes using Autosys and Informatica, resulting in a 30% reduction in reporting time and increasing overall efficiency.
    • Implemented error detection mechanisms, ensuring the integrity and accuracy of data delivered across the pipeline.
  • Data Integrity and Error Detection:
    • Ensured data integrity by using event handlers and RabbitMQ to detect and resolve errors in real time, keeping data delivery to stakeholders timely and accurate.
    • Minimized operational disruptions by implementing robust error-handling processes, contributing to overall system stability.
  • End-to-End Data Pipeline Optimization:
    • Optimized end-to-end data pipelines and managed Oracle DB, achieving 98% uptime and ensuring consistent data availability for business operations and reporting.
    • Recognized as Employee of the Month for delivering a significant performance improvement in data pipeline operations.
  • Automation with Blue Prism & RPA:
    • Implemented process automation using Blue Prism and Robotic Process Automation (RPA) technologies, saving 10 hours of manual reporting work per week and improving reporting efficiency.
    • Enhanced client satisfaction by delivering timely, accurate reports, leading to better client relationships and trust.
  • Large-Scale Database Migration:
    • Led a successful migration of over 1,000,000 records in a large-scale database transition, demonstrating technical expertise and ensuring minimal disruption to operations.
    • Ensured data integrity throughout the migration process, preserving accuracy and preventing data loss.
  • Client-IT Issue Resolution:
    • Efficiently resolved 3-5 daily client-IT issues, employing root cause analysis (RCA) and swift troubleshooting, ensuring operational stability and reducing downtime.
    • Exhibited strong problem-solving abilities, providing effective and timely solutions to client challenges.
  • Support Documentation and Training:
    • Developed comprehensive support documentation, enabling training for 5 teams across the organization, improving team competency and operational efficiency.
    • Contributed to a 20% reduction in resolution time by empowering teams with the resources and knowledge to efficiently address issues.
Associate Data Analyst | Accenture, KA, India
Dec 2016 - Jul 2018
  • Sales Data Analysis and Optimization:
    • Performed correlation and regression analysis on sales data, identifying key drivers that resulted in a 7% increase in sales performance.
    • Provided actionable insights to stakeholders, influencing business strategies and contributing to overall growth.
  • Power BI Dashboard Development:
    • Designed and developed an interactive Power BI dashboard, visualizing sales KPIs that empowered data-driven decisions during vendor negotiations, leading to a 12% increase in profitability.
    • Streamlined reporting processes, allowing stakeholders to easily track performance and identify areas for improvement.
  • Exploratory Data Analysis (EDA) and Python Expertise:
    • Performed comprehensive exploratory data analysis (EDA) using Python on Apache Spark, uncovering valuable insights for leadership to guide strategic decision-making.
    • Leveraged advanced data processing techniques to clean and structure data for better analysis and reporting.
  • Data Collaboration and Stakeholder Engagement:
    • Collaborated closely with cross-functional teams to understand data needs and deliver accurate, actionable insights that enhanced business decision-making processes.
    • Enabled a 20% increase in overall productivity by improving the data-driven decision-making process across departments.
  • Quality Assurance and Bug Reduction:
    • Led unit testing efforts with over 100 test cases to validate data precision and software functionality, ensuring high-quality outputs.
    • Reduced production bugs by 50% through effective testing procedures and proactive debugging strategies, contributing to smoother software performance.
  • System Monitoring and Optimization:
    • Monitored system logs to ensure the uninterrupted operation of automated processes, including SQL queries, ETL operations, and Autosys-driven processes.
    • Implemented optimization techniques to enhance system performance and minimize downtime, ensuring seamless operations.
  • Shell Scripting and Automation:
    • Developed and enhanced 2-5 UNIX shell scripts weekly to automate critical tasks, improving operational efficiency.
    • Automated data import/export operations and email reporting, streamlining workflows and reducing manual intervention.
Project Trainee | Automotive Research Association of India, India
Jan 2016 - May 2016
  • Designed and developed a bamboo-metal matrix frame for a two-wheeler vehicle.
  • Utilized bamboo, known for its rapid growth and high strength-to-weight ratio, as the primary building material.
  • Aimed to create an eco-friendly and cost-effective alternative to traditional metal frames.
  • Carried out the design process using CATIA and conducted structural analysis using ANSYS.
  • Ensured that the designed frame reduced the overall weight of the vehicle by at least 10%.
  • Replaced some metal parts with bamboo tubes to achieve significant weight and cost reductions.
  • Conducted various tests to validate the design, including real-time data collection and parametric analysis for factors such as safety, strength, vehicle behavior, and suspension characteristics.
Project Trainee | General Motors Company, India
May 2014 - Jul 2014
  • Engaged in Indirect Scheduling Operations within the Global Purchase and Supply Chain Department, specifically within the Direct Material Storage division of the Global Supply Chain Operations.
  • Acquired valuable expertise in the efficient management of materials, overseeing the storage processes, and ensuring the seamless flow of parts and assemblies throughout the workspace. Emphasis was placed on maintaining a constant availability of materials as per operational requirements.
  • Developed proficiency in adhering to stringent safety protocols, including best practices, equipment handling, and control measures for preventing workplace accidents during material handling and storage.
  • Gained proficiency in SharePoint, strengthening skills in digital collaboration and document management.

Projects

AI-Powered Golf Swing Training System

As a Research Assistant at UTARI, I contribute to the development of an AI-driven golf swing training system. This innovative project combines computer vision and AI to offer golfers immediate feedback on their swings, promoting independent skill development. My responsibilities at the moment include integrating advanced 3D cameras, creating structured JSON files, and managing a PostgreSQL database to collect and store swing analysis data. This project demonstrates the transformative potential of AI in sports technology, making skill improvement more accessible and efficient.

Coffee Shop AI Chatbot 🤖☕️

GitHub

The Coffee Shop AI Chatbot is an advanced AI-powered system designed to revolutionize customer interactions in a coffee shop environment. Leveraging Large Language Models (LLMs) and Natural Language Processing (NLP), it integrates seamlessly into a React Native app to facilitate real-time order placement, provide detailed menu information using a Retrieval-Augmented Generation (RAG) system, and offer personalized product recommendations via a market basket analysis engine. Its modular, agent-based architecture ensures scalability and efficiency, with specialized agents for order management, menu queries, and safe interactions.

Soccer Analysis System

GitHub

The Soccer Analysis System is a cutting-edge project that combines machine learning, computer vision, and deep learning techniques to provide in-depth analysis of football games. By employing state-of-the-art technologies such as YOLOv8, this system detects players, referees, and footballs, and includes custom-trained models to enhance detection accuracy. The system also integrates various techniques to measure and analyze player movements, ball interactions, and more.

Student Performance Indicator End to End Machine Learning Project

GitHub

This is a web application that predicts a student's math score from inputs such as gender, ethnicity, parental education level, lunch type, test preparation course, and reading and writing scores. The application is built with Flask, a lightweight web framework for Python, and serves predictions in real time, with model deployment and continuous integration on AWS. Several machine learning algorithms were trained, including CatBoost, AdaBoost, Gradient Boosting, Random Forest, Linear Regression, Decision Tree, and XGBoost. Performance was evaluated using the R² score, and the best model was selected through hyperparameter tuning with GridSearchCV.

TF-IDF Search Engine

GitHub

Implemented a toy "search engine" in Python that reads a corpus, produces TF-IDF vectors for documents, and returns the document with the highest cosine similarity score for a given query. The project involved natural language processing, tokenization, stopword removal, stemming, and computation of TF-IDF vectors. It showcases proficiency in Python, NLTK, and information retrieval techniques. The search engine follows the ltc.lnc weighting scheme for query-document similarity, demonstrating a solid understanding of information retrieval principles.
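
The ranking step described above can be sketched roughly as follows. This is an illustration, not the project's code: it assumes the common SMART reading of ltc.lnc in which the first triple (log tf, idf, cosine normalization) applies to documents and the second (log tf, no idf, cosine normalization) applies to the query, and it assumes tokens have already been tokenized, filtered, and stemmed. The function names and the tiny corpus are hypothetical.

```python
import math
from collections import Counter


def log_tf(tokens):
    """Logarithmic term frequency: 1 + log10(count) for each term present."""
    return {t: 1.0 + math.log10(c) for t, c in Counter(tokens).items()}


def normalize(vec):
    """Cosine (unit-length) normalization; an all-zero vector becomes empty."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(docs):
    """docs: {name: token list}. Returns ltc vectors (log tf * idf, normalized)."""
    n = len(docs)
    df = Counter(t for tokens in docs.values() for t in set(tokens))
    idf = {t: math.log10(n / d) for t, d in df.items()}
    return {
        name: normalize({t: w * idf[t] for t, w in log_tf(tokens).items()})
        for name, tokens in docs.items()
    }


def search(index, query_tokens):
    """Score an lnc query against every document; return (best_name, score)."""
    q = normalize(log_tf(query_tokens))
    scores = {
        name: sum(w * vec.get(t, 0.0) for t, w in q.items())
        for name, vec in index.items()
    }
    return max(scores.items(), key=lambda kv: kv[1])


# Hypothetical, already-preprocessed corpus.
docs = {
    "espresso": ["espresso", "coffee", "strong"],
    "latte": ["latte", "coffee", "milk"],
    "tea": ["tea", "leaf", "milk"],
}
index = build_index(docs)
best_doc, score = search(index, ["milk", "latte"])
print(best_doc, round(score, 3))
```

Since both vectors are unit length, the dot product in `search` is the cosine similarity itself.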

Cladocopium Machine Learning Classification

GitHub

This project focuses on the algorithmic analysis for Cladocopium classification based on the host coral species (Orbicella annularis, OANN) using various machine learning algorithms. The goal is to provide a comprehensive understanding of the classification performance and identify significant features contributing to the classification outcome.

NBA Player Position Classification

GitHub

This project involved the classification of NBA players into their respective positions using machine learning techniques. The model, built with a Support Vector Classifier, achieved improved accuracy through careful data preprocessing, feature selection, and hyperparameter tuning. The project showcases my skills in data analysis, classification, and model evaluation using Python and scikit-learn.

Graph Analysis Proficiency

GitHub

This project demonstrated our proficiency in graph analysis. We utilized techniques such as in-degree centrality and clique identification to highlight the top 5 cited papers and maximal groups of mutually connected authors. The project showcased our skills in data analysis, pattern recognition, graph mining, and Python-based network analysis.
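
The two techniques named above can be sketched in plain Python. This is a minimal illustration under stated assumptions: citations are given as (citing, cited) pairs, co-authorship is an undirected adjacency mapping, and clique enumeration uses the basic Bron-Kerbosch algorithm without pivoting. The function names and sample data are hypothetical, and the project itself may have relied on a graph library instead.

```python
from collections import Counter


def top_cited(citations, k=5):
    """citations: iterable of (citing_paper, cited_paper) edges.

    In-degree of a paper = number of incoming citation edges.
    """
    in_degree = Counter(cited for _, cited in citations)
    return in_degree.most_common(k)


def maximal_cliques(adjacency):
    """Bron-Kerbosch (no pivoting) on {node: set(neighbours)}.

    Yields every maximal clique as a frozenset.
    """
    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield frozenset(clique)
            return
        for v in list(candidates):
            yield from expand(
                clique | {v},
                candidates & adjacency[v],
                excluded & adjacency[v],
            )
            candidates = candidates - {v}
            excluded = excluded | {v}

    yield from expand(set(), set(adjacency), set())


# Hypothetical citation edges and co-author graph.
citations = [
    ("p1", "p3"), ("p2", "p3"), ("p4", "p3"),
    ("p1", "p2"), ("p4", "p2"), ("p3", "p5"),
]
coauthors = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(top_cited(citations, k=2))
print(sorted(sorted(c) for c in maximal_cliques(coauthors)))
```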

New York Motor Vehicle Collision Analysis

GitHub

In this project, the goal was to analyze the trend in the number of accidents in New York City from September 2017 to August 2019. We started by cleaning and modifying the data using Python’s Pandas package, extracting appropriate vehicle names, and grouping the data by months and years. The project also involved creating compelling visualizations with Python’s Matplotlib package. These visualizations proved invaluable in understanding various analyses, such as the car maker with the most accidents in a year, trends in accidents over months, and the types of vehicles involved.

Personalized Workout Prediction and Recommendation Engine

GitHub

In this project, we developed a machine learning system in Python that predicts exercises with 94% accuracy. We implemented a neural network in Keras, processing data from over 1,000 users. The project also focused on enhancing data quality, reducing inaccuracies by 50%, and mitigating biases for fair recommendations.

IMDB DBLP Dataset Analysis

GitHub

This project aimed to analyze the performance of different actors and actresses over their entire careers. To achieve this, we designed SQL queries to extract the number of movies done by each actor or actress in a year within their respective career spans. We also calculated the average ratings of their movie performances in those years. The project involved visual analysis and comparisons of year-wise and overall performance of each actor and actress based on the number of movies and their average movie ratings.
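
A query of the kind described above might look like the following sketch. The table name `roles`, its columns, and the sample rows are all hypothetical, and an in-memory SQLite database stands in for whatever database the project actually used.

```python
import sqlite3

# Hypothetical schema: one row per (actor, movie) appearance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE roles (actor TEXT, movie TEXT, year INTEGER, rating REAL)"
)
conn.executemany(
    "INSERT INTO roles VALUES (?, ?, ?, ?)",
    [
        ("Actor A", "M1", 2001, 7.0),
        ("Actor A", "M2", 2001, 8.0),
        ("Actor A", "M3", 2002, 6.5),
        ("Actor B", "M4", 2001, 9.0),
    ],
)

# Movies per actor per year, with the average rating for that year.
rows = conn.execute(
    """
    SELECT actor, year, COUNT(*) AS movies, ROUND(AVG(rating), 2) AS avg_rating
    FROM roles
    GROUP BY actor, year
    ORDER BY actor, year
    """
).fetchall()

for row in rows:
    print(row)
```

Grouping by both actor and year gives the year-wise view; dropping `year` from the `SELECT` and `GROUP BY` clauses would give the career-wide totals used for the overall comparison.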

Custom Decision Tree Classifier

GitHub

In this project, we implemented a custom Decision Tree Classifier from scratch in Python, providing a versatile machine learning model for classification tasks. Our goal was to create a powerful and interpretable tool for decision-making and pattern recognition. This project explores the inner workings of decision trees, from tree growth and splitting criteria to prediction and evaluation.
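
The splitting step at the core of tree growth can be sketched as below. Gini impurity and threshold splits on numeric features are assumptions made for illustration; the project may have used entropy or another criterion. The names `gini` and `best_split` and the sample data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Try every (feature, threshold) pair and return the one with the lowest
    weighted Gini impurity as (feature_index, threshold, score)."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue  # a split must put samples on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best


# Hypothetical data: feature 0 separates the classes, feature 1 does not.
rows = [[2.0, 5.0], [3.0, 1.0], [10.0, 4.0], [11.0, 2.0]]
labels = ["a", "a", "b", "b"]
print(best_split(rows, labels))
```

A full classifier would apply `best_split` recursively to the left and right partitions until a node is pure or a depth limit is reached.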

Deep Convolutional Generative Adversarial Network (DCGAN) Implementation

GitHub

For this project, we designed and implemented a Deep Convolutional Generative Adversarial Network (DCGAN) for image synthesis. We worked in TensorFlow and Keras to improve the discriminator and generator. We applied gradient-based training for GANs through the model.fit() API on the MNIST dataset using scikit-learn.

Multi-Layer Neural Network with TensorFlow

GitHub

This project involved building and training a multi-layer neural network using TensorFlow. The network architecture included options for specifying the number of layers, activation functions, learning rate, batch size, and training epochs. It provided support for Mean Squared Error (MSE), Support Vector Machine (SVM), and Cross-Entropy loss functions.
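
For reference, the three loss options can be written out per sample in plain Python. This is a sketch, not the project's TensorFlow code: it assumes "SVM loss" means the multi-class hinge loss with a margin of 1, and that cross-entropy is computed from raw scores through a softmax. The function names are hypothetical.

```python
import math


def mse(pred, target):
    """Mean squared error over the output units of one sample."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def hinge(scores, true_class, margin=1.0):
    """Multi-class SVM (hinge) loss for one sample of raw class scores."""
    return sum(
        max(0.0, s - scores[true_class] + margin)
        for i, s in enumerate(scores)
        if i != true_class
    )


def cross_entropy(scores, true_class):
    """Softmax cross-entropy for one sample, using the log-sum-exp trick
    so that large scores do not overflow."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[true_class]


print(mse([1.0, 2.0], [1.0, 4.0]))
print(hinge([3.0, 1.0, 2.5], true_class=0))
print(cross_entropy([0.0, 0.0], true_class=0))
```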

Convolutional Neural Network (CNN) with TensorFlow and Keras

GitHub

Convolutional Neural Networks (CNNs) are a class of deep neural networks commonly used for image classification and recognition tasks. This project provided a flexible and customizable CNN implementation using TensorFlow and Keras. It featured input layers, dense layers, convolutional layers, max-pooling layers, and flattening, along with customization options for training and evaluation.
