Data scientist, Tech Lead

Michael Y

Information

Available hours / week

40 h/w

Seniority level

Senior

Years of experience

9 yrs.

Location

Portugal

Nationality

Ukraine

Timezone

(GMT+02:00) Kyiv

Languages

English

Advanced (C1)

About

Nine years developing data science solutions across agritech, defence, fintech, and sports analytics. Built Quantum's entire DS department from scratch: hired over 30 people, created learning pathways, and mentored MSc and PhD students along the way. Most of my hands-on work sits at the intersection of computer vision and geospatial data. I've spent years working with drone and satellite imagery for precision agriculture (SeeTree, AgroScout, Grower Adviser) and UAV navigation in GPS-denied environments. One of my proudest achievements is a patented method for geo-temporal orthomosaic alignment that automated 85% of SeeTree's data processing pipeline. Recently, I've been immersed in LLMs and RAG systems, leading the MayaBot project, where we developed an AI chatbot platform integrating OpenAI, Anthropic, and Mistral models with custom optimisations. I am comfortable working across the full stack of a DS project, from research and prototyping to deployment on AWS/GCP. I hold patents in the Artificial Intelligence domain.

Main technologies

Data Science: 9 yrs.
TensorRT: 9 yrs.
XGBoost: 9 yrs.
Shapely: 9 yrs.
GeoPandas: 9 yrs.
FastAPI: 9 yrs.
Rasterio: 9 yrs.
PostgreSQL: 9 yrs.
YOLO: 9 yrs.
Pandas: 9 yrs.
Python NumPy: 9 yrs.
Scikit Learn: 9 yrs.
Geographical Information System (GIS): 9 yrs.
Google Cloud (GCP): 9 yrs.
OpenCV: 9 yrs.
PyTorch: 9 yrs.
Deep Learning: 9 yrs.
Machine Learning: 9 yrs.
Python: 9 yrs.
MLOps: 8 yrs.
Team Lead: 5 yrs.
LangChain: 3 yrs.
Tech Lead: 3 yrs.

Additional skills

TensorFlow: 9 yrs.
Git: 9 yrs.
Docker: 7 yrs.
Django: 6 yrs.
NLP: 4 yrs.
MAVLink: 4 yrs.
Raspberry Pi: 4 yrs.
ScikitLearn: 4 yrs.
Databricks: 3 yrs.
QGroundControl: 3 yrs.
Unreal Engine: 3 yrs.
Amazon (AWS): 3 yrs.
AirSim: 3 yrs.
MySQL: 2 yrs.
RAG: 2 yrs.
LLM: 2 yrs.
ArduPilot: 2 yrs.
Azure: 1 yr.

Experience

GROWER ADVISER

Data Science Technical lead

About the Project

Grower Adviser is a precision agriculture platform that helps greenhouse operators optimize growing strategies, reduce costs, and streamline logistics. The system processes drone-captured imagery combined with climate sensor data to deliver automated analysis of plant phenotype features, detect anomalies in crop health, and forecast yields. The platform serves commercial greenhouse operations where small improvements in growing conditions translate directly into significant revenue impact.

AgriTech
Analytics

Responsibilities

Led the data science team for 20 months through all phases from research to production. Built and maintained computer vision pipelines for plant phenotype analysis using PyTorch and OpenCV, running on Google Cloud's Vertex AI platform. Designed the anomaly detection system that flags unhealthy plants from drone imagery, helping growers catch problems before visible symptoms appear. Managed model versioning and deployment using DVC, and oversaw the integration of climate data streams with visual analytics to produce actionable recommendations for growers.

Skills & technologies

MAYABOT

Data Science Technical lead

About the Project

MayaBot is a platform for building AI-powered chatbots tailored to small and medium businesses. It integrates with current-generation AI providers such as OpenAI, Anthropic, and Mistral while adding custom optimizations on top of cloud pipelines. The architecture is built around Python microservices with PostgreSQL (using pgvector for embeddings) as the main storage layer, deployed across AWS and Azure. The platform lets businesses create, train, and deploy conversational agents without needing in-house AI expertise.

AI
Machine Learning

Responsibilities

Led the technical side of the project, from architecture decisions through to production deployment. Designed and built the RAG pipeline that grounds chatbot responses in client-specific knowledge bases, significantly reducing hallucinations. Implemented prompt engineering frameworks and evaluation pipelines to ensure consistent response quality across different LLM providers. Managed the data layer using PostgreSQL with vector embedding extensions for semantic search. Handled cloud infrastructure across both AWS and Azure, and coordinated with the wider team on integration testing and release cycles.

Skills & technologies

AGROSCOUT

Data Science Technical lead

About the Project

A comprehensive agricultural analytics project focused on detecting and analyzing various crops including potatoes, corn, onions, and sugar cane. The system uses both drone-captured and satellite imagery to perform crop detection, classification, yield counting, disease identification, and pest detection. Satellite data adds a macro layer for identifying problem areas like drought conditions, failed growth regions, and infrastructure features. The combination of aerial and satellite data gives agricultural operators a complete view of their fields at multiple scales.

AgriTech
Analytics

Responsibilities

Led the data science team through research, development, and deployment of multiple detection and classification models. Built object detection pipelines using PyTorch with YOLO architectures, trained on custom-labeled datasets managed through Label Studio. Developed the satellite imagery analysis module for large-area classification tasks. Handled model optimization for deployment efficiency and managed the end-to-end pipeline from raw image ingestion to actionable analytics output. Coordinated across the team to maintain consistent labeling standards and model evaluation metrics.

Skills & technologies

VISUAL / SATELLITE NAVIGATION

Data Science Technical lead

About the Project

Developing navigation systems for autonomous UAV operations in environments where GPS is unreliable or deliberately denied. The project has two main tracks: a visual navigation system that captures sequential images to build a visual path for autonomous return and position holding, and a satellite-based system that correlates real-time optical data from UAV cameras with pre-existing satellite imagery to determine position. Both systems needed to run on embedded hardware (Raspberry Pi, Jetson Nano) with real-time constraints during actual flight missions.

Defense
UAVs/drones
Navigation

Responsibilities

Led a team of engineers through 18 months of R&D, from initial research through simulation testing to real-world flight validation. Built the full simulation environment using Unreal Engine with Cesium terrain and AirSim for controlled testing before hardware deployment. Designed the satellite image matching pipeline that processes camera feeds against pre-loaded map tiles in real time. Managed integration with ArduPilot flight controllers via MAVLink protocol. Oversaw hardware deployment on Raspberry Pi and Jetson Nano platforms, including CSI camera configuration and RTSP streaming setup.

Skills & technologies

Python
Unreal Engine
AirSim
ArduPilot
Rasterio
GeoPandas
Shapely
QGroundControl
MAVLink
Raspberry Pi
Docker
Git

FOOTBALL ANALYTICS SOLUTION

Data Science Technical lead

About the Project

A computer vision system for automated football match analysis. The platform processes match video to detect and track players and the ball in real time, then feeds that tracking data into an analytics module that generates per-player statistics. The entire system runs on AWS cloud infrastructure, handling the compute-intensive inference workloads required for real-time video processing during live matches.

Sports
Analytics

Responsibilities

Managed the full project lifecycle over 12 months, from model research through production deployment on AWS. Built the player detection and tracking pipeline using PyTorch with TensorRT optimization for real-time inference performance. Developed the ball detection module, which required handling heavy occlusion and fast movement. Designed the analytics layer that translates raw tracking coordinates into meaningful player statistics like distance covered, sprint counts, and positional heatmaps. Handled AWS infrastructure setup, including GPU instance management and auto-scaling for match-day loads.

Skills & technologies

Python
PyTorch
OpenCV
TensorRT
Amazon (AWS)

AUTONOMOUS GREENHOUSE CHALLENGE

Data Science Team/Tech lead

About the Project

Leading a multidisciplinary team (VeggieMight) of growers, researchers, and data scientists in the 3rd Autonomous Greenhouse Competition hosted by Wageningen University. The challenge is to grow lettuce in a real greenhouse using fully autonomous climate control. The team's solution has two parts: a reinforcement learning algorithm that uses weather and sensor data to predict optimal climate settings, and a computer vision module that measures plant size and determines spacing events. The team reached the finals, placing 4th in the second round.

AgriTech

Responsibilities

Led the full team across research, algorithm development, and competition rounds. Managed the reinforcement learning component for climate control optimization, which processes real-time weather and sensor data to make autonomous decisions about temperature, humidity, lighting, and irrigation. Oversaw the computer vision module for plant growth monitoring. Coordinated between growers who provided domain knowledge and data scientists who built the models. The algorithm is actively growing lettuce in a real greenhouse facility.

Skills & technologies

DRONE ORTHOMOSAIC ALIGNMENT

Data Science Team lead

About the Project

SeeTree is an agritech company that helps farmers manage millions of trees by providing per-tree analytics from drone and satellite imagery. A core challenge was aligning orthomosaics captured at different times and altitudes into a unified, georeferenced view. Without reliable alignment, temporal comparisons and change detection were impossible at scale. The project required developing a novel approach to geo-temporal alignment that could handle varying resolutions, seasonal changes in canopy appearance, and large datasets covering thousands of hectares. The solution needed to work fully autonomously in a production pipeline running on Google Cloud.

Analytics
AgriTech

Responsibilities

Led research into image registration methods and developed a novel alignment algorithm that became the basis of a patent. The approach combined visual feature extraction with geometric constraints to achieve sub-pixel accuracy across different capture conditions. Built the full production pipeline on Google Cloud, from data ingestion to aligned output delivery. This single technology automated 85% of SeeTree's manual data processing workflow, directly reducing operational costs and turnaround times. Managed the team through research, development, and production rollout phases.

Skills & technologies

TensorFlow
Python
OpenCV
Rasterio
Shapely
GeoPandas

DRONE IMAGE GEOREFERENCING

Data Science Tech lead

About the Project

Developing an algorithm that takes individual drone photos and aligns them precisely to existing orthomosaics captured at different altitudes. The system uses a combination of image sensor metadata and visual feature matching to achieve accurate alignment between images taken at 60m height and reference orthomosaics generated from 90-150m altitude flights. This lets SeeTree's customers seamlessly browse imagery captured at different heights and times without the computational cost of regenerating full orthomosaics.

AgriTech
Analytics

Responsibilities

Led the research and development of the image alignment algorithm. Built the feature extraction and matching pipeline that identifies corresponding points between low-altitude drone photos and high-altitude reference mosaics despite significant scale and appearance differences. Handled the geometric transformation calculations to warp source images into the target coordinate system. Managed testing across diverse agricultural landscapes and seasonal conditions. Maintained the production system running on Google Cloud.

Skills & technologies

Python
PyTorch
OpenCV
Rasterio

TREE POLYGONIZATION

Data Science Technical lead

About the Project

Building an automated pipeline that takes aerial and satellite imagery of agricultural land and turns it into individual tree polygons. Each tree gets its own geospatial boundary, enabling per-tree analytics at farm scale. The pipeline uses deep learning segmentation to find tree pixels, normalized digital surface models (NDSM) to filter out non-tree objects, and a custom splitting algorithm to separate overlapping canopies into individual polygons. Runs as a production service on Google Cloud with automated MLOps for weekly retraining.

AgriTech
Analytics

Responsibilities

Led the technical development of the full polygonization pipeline. Designed the segmentation approach using TensorFlow to classify tree vs non-tree pixels from aerial imagery. Built the NDSM-based filtering step that removes false positives from buildings and other tall objects. Developed the polygon splitting algorithm that handles overlapping canopies in dense orchards. Deployed the solution as a standalone production service on Google Cloud. Created a Kubeflow-based MLOps system for automated model quality assessment and weekly retraining cycles.

Skills & technologies

TensorFlow
Python
OpenCV
GeoPandas
Shapely
MLOps