Masood Salman Choudhury

Senior Data Engineer & AI Solutions Architect

Manchester, United Kingdom

About

Senior Data Engineer & AI Solutions Architect with 5+ years' experience delivering end-to-end data platforms and intelligent applications across fintech, SaaS, and industrial analytics. Expert in designing scalable data pipelines, building AI-powered systems with LLMs and machine learning models, and deploying robust cloud-native solutions on AWS, Azure, and GCP.

Experience

  • -

    The Hague, Netherlands - Remote

    Summary:

    • Worked as a lead data engineer delivering end-to-end solutions for multiple international clients across fintech, enterprise SaaS, and industrial analytics domains. Responsible for architecting and implementing data platforms, AI solutions, and cloud-based applications.

    Responsibilities:

    • Designed, built, and deployed Azure data pipelines in Databricks (PySpark, SparkSQL) to ingest large-scale structured and semi-structured datasets from Blob Storage and Azure Cosmos DB, execute complex transformations, and persist curated features in Delta Lake for downstream machine learning workflows
    • Led development of an enterprise-grade multi-stage RAG system for clients, combining Python-based data pipelines (Google Drive, Slack) with LLMs, LangChain, a reranker, and the Pinecone vector database to deliver highly accurate, context-aware retrieval workflows
    • Designed, built, and deployed an end-to-end Azure data pipeline using Azure Functions and Cosmos DB, applying Kimball dimensional modeling for Power BI datasets with DAX-based measures, enabling KPI reporting and forecasting
    • Led development of a full-stack AI SaaS app (Android and iOS) leveraging the OpenAI Assistants API with a FastAPI backend, React Native (Expo) frontend, and PostgreSQL database, including OAuth 2.0 authentication and integrated Stripe payment processing
    • Designed and implemented end-to-end CI/CD pipelines for fully automated deployment of containerized applications to AWS ECS using Docker, Docker Compose, and infrastructure-as-code best practices (Terraform)
    • Secured and deployed SaaS applications with SSL, reverse proxies, Zero Trust controls, and firewall rules; optimized load balancing for high availability; and set up logging and monitoring with Prometheus and Grafana to ensure system reliability and observability
    • Provided technical mentorship and conducted code reviews for junior and mid-level engineers, promoting best practices, improving code quality, and accelerating team growth
    • Python
    • Databricks
    • PySpark
    • Delta Lake
    • LangChain
    • Pinecone
    • React Native
    • PostgreSQL
    • MySQL
    • Docker
    • AWS
    • Azure
    • Terraform
    • Git
    • Prometheus
    • Grafana
    • SSL
    • Nginx
  • -

    London, United Kingdom - On-site

    Summary:

    • Architected complete data warehouse solutions and led development of scalable ETL pipelines for financial data processing.

    Responsibilities:

    • Architected a Kimball-style star schema data warehouse using Elasticsearch and BigQuery, enabling real-time KPI dashboards in Kibana and empowering data-driven decision-making for stakeholders
    • Led the design and implementation of multiple scalable ETL pipelines with Python, GCP Dataflow, and Scrapy, orchestrated with Apache Airflow, processing over 10 million rows of financial data daily for real-time analytics
    • Co-led the agile development of a secure, scalable Savings Platform using FastAPI, delivering the product in 3 months; the platform currently manages $2M+ in monthly customer deposits
    • Managed and optimized databases including MongoDB, PostgreSQL, Elasticsearch, and BigQuery by tuning queries, indexes, and partitions, resulting in significant performance improvements
    • Conducted deep data analysis with Pandas to detect anomalies and identify potential fraud patterns, enhancing platform security
    • Automated data validation workflows using Python scripts, ensuring pipeline integrity and achieving 100% uptime for critical microservices
    • Deployed containerized applications with Docker and Kubernetes, ensuring high availability, scalability, and streamlined CI/CD operations
    • Developed and implemented a Random Forest classification model to identify high-value clients during signup, enabling personalized onboarding experiences
    • Configured and monitored Google Analytics and Tag Manager dashboards to track user behavior and support data-driven marketing strategies
    • Python
    • Elasticsearch
    • BigQuery
    • Kibana
    • GCP
    • FastAPI
    • MongoDB
    • PostgreSQL
    • Apache Airflow
    • Pandas
    • Docker
    • Kubernetes
  • -

    Guwahati, India - On-site

    Summary:

    • Delivered actionable insights via Tableau dashboards and automated data collection processes.

    Responsibilities:

    • Delivered actionable insights via Tableau dashboards (waterfall and cohort analyses), improving stakeholder decision-making
    • Automated competitor data scraping (Scrapy) and ETL into MySQL, reducing manual effort by 50%
    • Analyzed sales and geographic data to guide strategic fibre network expansion
    • Python
    • Tableau
    • MySQL
    • Scrapy
    • Pandas

Education

    University of Liverpool

    MSc Business Analytics and Big Data
    Liverpool, United Kingdom
    Distinction
    Key Modules: Data Mining & Machine Learning, Big Data Analytics, Enterprise Systems with SAP, Digital Business Technology and Management, Digital Strategy

    Asian Institute of Management and Technology

    Bachelor of Business Administration
    Guwahati, India
    First-Class
    Key Modules: Statistics, Mathematics, Production & Operation Management

Skills

  • Python
  • Databricks
  • PySpark
  • Delta Lake
  • LangChain
  • Pinecone
  • React Native
  • PostgreSQL
  • MySQL
  • Docker
  • AWS
  • Azure
  • GCP
  • Terraform
  • Git
  • Elasticsearch
  • BigQuery
  • FastAPI
  • MongoDB
  • Apache Airflow
  • Pandas
  • Kubernetes
  • Tableau
  • Scrapy
  • Prometheus
  • Grafana
  • Nginx