Hi there, I'm

ASHISH AGARWAL

About

Hi there! 👋

I’m a data engineer with 4+ years of experience across academic and professional settings, based in Connecticut, USA (flexible on location within the U.S.). I’m passionate about merging my artistic flair with technical know-how to craft solutions that not only work but inspire. For me, problem-solving is more than a skill; it’s a creative outlet where I get to watch brainstormed ideas evolve into meaningful outcomes.

Problem Solver || Creative Thinker || Detail-Oriented

I thrive on breaking down complex problems and piecing together innovative solutions. Each challenge is a new puzzle, and I find joy in tailoring solutions that resonate with the unique needs of each project.

I graduated in December 2024 with a Master’s in Data Science from the University of New Haven, and I’m eager to embark on the next chapter of my professional journey. With over 4 years of experience in business intelligence, MLOps, LLMOps, and data engineering, I’m ready to bring my insights and skills to a team that values creativity and precision.

If you’re looking for someone who brings both technical expertise and a fresh perspective to the table, let’s connect and see what we can create together!

Skills

Proficiency.

Python 95%
Linux/Bash 90%
AWS [EC2, S3, Lambda, Redshift, Kafka, SNS, Athena, Kinesis, CloudWatch] 90%
Power BI/MicroStrategy/Tableau/Qlik 90%
DL Libraries [PyTorch/TensorFlow] 90%
SQL [MySQL, SQLite, Postgres, Redshift] 90%
NoSQL [MongoDB, Cassandra, DynamoDB] 85%
Python Libraries [Pandas, Polars, Dask, NumPy, scikit-learn, Django, Matplotlib, Seaborn, Plotly, Altair] 95%
C++ 80%
Git 85%
MS Tools [Excel, Word, PowerPoint, SharePoint] 90%
R 80%
Containerization [Docker, Kubernetes] 85%
ERP/SCM/CRM [NetSuite, MasterControl] 80%

Resume

Educational background and job experience.

Summary

Ashish Agarwal

With over 4 years of industry experience in Data Science, Data Engineering, and LLM Engineering, I excel at building optimized, scalable solutions. I have deep expertise in cloud technologies and stay current with the latest AI and data trends. My strengths lie in automation and in developing robust pipelines for efficient problem-solving and quality assurance.

  • +1 (203) 626 2315
  • agarwal.ashish.singhal[at]gmail.com

Education

Master of Science

Aug 2023 - Dec 2024

Data Science

University of New Haven, Connecticut, USA

  • Excelled in courses such as Data Science, Artificial Intelligence, Mathematics, Distributed and Scalable Data Engineering, Deep Learning and Natural Language Processing.
Bachelor in Engineering

Nov 2016 - Sep 2021

Computer Engineering

Institute of Engineering (IOE), Tribhuvan University, Pulchowk Campus, Nepal

  • Excelled in courses such as Data Mining, Distributed Systems, and Big Data Technologies, gaining hands-on experience in designing and implementing scalable systems.
  • Led group projects focusing on the integration of cloud platforms and containerization technologies to build efficient distributed systems for large-scale data processing.
  • Participated in numerous workshops and seminars on emerging trends in machine learning, artificial intelligence, and data security, expanding expertise in the field.
Professional Experience

Data Engineering Intern

June 2024 - Aug 2024

Northeast Scientific, CT, USA

  • Led the development of data pipelines using the MasterControl and NetSuite APIs, automating data extraction, transformation, validation, and storage in a MySQL database and Excel files, reducing external dependencies.
  • Developed and implemented KPIs, metrics, dashboards, and reports for Inventory, Sales, and Production insights using Qlik Sense Cloud, enabling real-time, data-driven decision-making from ETL-processed data.
  • Designed a Retrieval-Augmented Generation (RAG) system for document processing using the Llama 3 8B model, Open WebUI, and FastAPI, improving knowledge retrieval efficiency and streamlining employee training.
  • Streamlined data validation and automation using cron jobs and GitHub Actions, covering timely file downloads, data validation, historical data logging, and reconciliation. Delivered reports to stakeholders with automated notifications, achieving 100% accuracy and saving 90% of the time previously spent on manual processes.
  • Quickly learned ZPL to resolve a time-sensitive label printing issue outside my role, reprogramming the scanner and printer and delivering the solution within the required timeline.

Graduate Research Assistant and Advisor for Capstone Projects

Mar 2024 - Dec 2024

Sail-lab, University of New Haven, CT, USA

  • Engineered multi-tenant GPU environments as a SaaS, similar to Google Colab, using Docker, Helm, and Kubernetes, improving computational resource allocation by 40% and reducing manual resource management time and effort by 99%.
  • Conducted advanced research in Brain-Computer Interface (BCI) technology, using Electroencephalography (EEG) for signal processing and neural decoding.
  • Mentored capstone teams on best practices in data engineering, model deployment, and machine learning pipelines, ensuring integration of scalable solutions.
  • Collaborated with interdisciplinary teams to build and deploy predictive analytics models for both academic research and industrial applications.

Download my resume to know more.

Projects

Multi-Tenant GPU Cluster (On-Premise Google Colab [SaaS])

  • Built a Kubernetes-based GPU cluster with JupyterHub integration, enabling multi-user access and efficient GPU resource sharing.
  • Configured multi-tenant resource profiles using Kubernetes and Helm, supporting customized resource allocations, which improved utilization by 30%.
  • Developed a secure access framework through Kubernetes Dashboard and JupyterHub authentication, ensuring isolated and reliable user access.
  • Authored detailed documentation covering setup, deployment, troubleshooting, and maintenance steps, streamlining cluster management for research and high-compute workloads.

Analyzing Trending YouTube Videos

  • Developed a scalable, deployable cloud-based pipeline that fetches data from the YouTube API and provides insights on trends and viewer engagement patterns.
  • Used NLTK and scikit-learn for classification and sentiment analysis of video metadata, and used Docker and AWS ECR to package these large libraries for AWS Lambda.
  • Automated the process with cron-based triggers and monitored resources with CloudWatch.
  • Built the ETL pipeline with a focus on data security, storage, monitoring, optimization, analysis, and visualization.

Face Recognition and Face Sort

  • Developed an automated face recognition system to efficiently organize images into individual folders based on identified individuals.
  • Implemented facial feature encoding and matching algorithms to accurately classify and sort images.
  • Designed a user-friendly command-line interface that allows users to specify the folder containing images for sorting.
  • Ensured compatibility with .jpg and .png formats, with potential for future expansion to additional image types.
  • Conducted performance testing to optimize sorting speed and accuracy, improving the overall reliability of the image management process.

Contact

Don’t hesitate to get in touch via the contact information below for conversations about technology, business, my projects, or even just to say hello.

Location:

USA

Skype:

live:.cid.8857c823f333c0cc