Data scientist

Viktoriia A

Information

Available hours / week

40 h/w

Seniority level

Senior

Years of experience

6 yrs.

Location

Spain

Nationality

Ukraine

Timezone

(GMT+02:00) Kyiv

Languages

English

Upper-Intermediate (B2)

About

Viktoriia is a Data Scientist specialising in machine learning and AI applications, bringing six years of experience in developing advanced data-driven solutions. Proficient in Python, she has used libraries such as XGBoost, Pandas, and scikit-learn to construct robust models across various sectors, including agriculture, defence, and blockchain technology. Her project portfolio includes deep learning, computer vision, NLP, and applied machine learning, but recently she has devoted most of her time to large language model (LLM)-based solutions: building RAG pipelines, evaluation frameworks, and integrating models like GPT, Claude, and Llama into commercial products. Her work ranges from drone-based target tracking systems (14 months on the Striker Coach project) to beauty technology (skin analysis, body measurements, virtual try-on using diffusion models) and blockchain security. She is comfortable communicating technical work to both engineering teams and business stakeholders. Viktoriia has also mentored junior engineers, contributed to pre-sales technical assessments, and publishe

Main technologies

Python: 6 yrs.
Data Science: 6 yrs.
OpenCV: 6 yrs.
Deep Learning: 6 yrs.
Python NumPy: 6 yrs.
XGBoost: 6 yrs.
scikit-learn: 6 yrs.
Machine Learning: 6 yrs.
Pandas: 6 yrs.
Docker: 4 yrs.
YOLO: 4 yrs.
NLTK: 2 yrs.
Transformers: 2 yrs.

Experience

APPLIED LLM RESEARCH. EVALUATION FRAMEWORK

Data Science Engineer

About the Project

A research project focused on establishing best practices for production-grade LLM applications. The work covered advanced prompt engineering experiments, development of a Retrieval-Augmented Generation (RAG) pipeline, and implementation of an LLM-as-a-Judge framework for automated response quality evaluation. The goal was to build a repeatable methodology for improving answer quality, minimizing hallucinations, and creating reliable evaluation criteria that can be applied across future commercial LLM projects.

AI
Machine Learning

Responsibilities

Conducted systematic prompt engineering experiments across multiple LLM providers (GPT, Claude, Llama) to identify effective patterns for different task types. Built the RAG pipeline including document chunking, embedding generation, and retrieval optimization. Implemented the LLM-as-a-Judge evaluation framework that uses one model to assess the quality of another's outputs against defined criteria. Handled all testing and troubleshooting across different model configurations. Documented findings into practical guidelines for the team.
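
The chunking and retrieval steps named above can be illustrated with a minimal, dependency-free sketch. It is not the project's actual pipeline: the function names, the window sizes, and the bag-of-words `embed` function (a toy stand-in for a real embedding model) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def chunk_words(text, size=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

In a production setting the `embed` function would typically be replaced by a learned embedding model and the linear scan by a vector index; the overlap between windows is what keeps a sentence that straddles a chunk boundary retrievable.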

GROWER ADVISER

Data Science Engineer

About the Project

A precision agriculture platform providing greenhouse operators with data-driven insights to optimize growing strategies and reduce costs. The system combines drone-captured imagery with climate sensor data to automate plant health analysis, detect anomalies, and forecast yields. Computer vision models analyze phenotype features from aerial photos while climate data adds context about growing conditions, enabling growers to make informed decisions about irrigation, nutrition, and harvesting timing.

AgriTech

Responsibilities

Built and maintained AI model pipelines for plant phenotype analysis, including anomaly detection and yield estimation. Implemented new features into the existing processing system and handled testing and troubleshooting across different crop types and growing conditions. Conducted research into improved detection methods and contributed to cross-team code reviews. Managed documentation for the data science components and worked with the broader engineering team on integration points between the ML pipeline and the platform's data infrastructure.

OMD

Data Science Engineer

About the Project

An AI-powered drone system that detects airplanes as targets using onboard object detection models. Once the target is identified, the drone switches to autonomous tracking mode, maintaining visual lock on the target aircraft. The system combines real-time YOLO-based detection with MAVLink-based drone control for responsive autonomous behavior during flight.

Defense
UAVs/drones

Responsibilities

Built and optimized the YOLO-based object detection model for identifying aircraft from drone camera feeds. Integrated the detection pipeline with Pymavlink for autonomous drone control commands. Developed new features in the autonomous tracking system to improve responsiveness and reliability. Handled testing and troubleshooting across different flight conditions and target scenarios. Maintained project documentation and participated in cross-team code reviews. Managed the Docker-based deployment environment.
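
As a rough illustration of how a detection can drive tracking, the sketch below maps a bounding box to clamped yaw and pitch rate commands with a simple proportional rule. The control law, the gain, the rate limit, and the function names are assumptions and not the project's actual design; sending the resulting commands over MAVLink is deliberately left out.

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def bbox_to_rates(bbox, frame_w, frame_h, gain=1.0, max_rate=0.5):
    """Map a detection box (x1, y1, x2, y2) in pixels to (yaw_rate, pitch_rate).

    Errors are normalised to [-1, 1] relative to the frame centre.
    The gain and max_rate values are hypothetical tuning parameters.
    """
    x1, y1, x2, y2 = bbox
    cx = (x1 + x2) / 2.0
    cy = (y1 + y2) / 2.0
    err_x = (cx - frame_w / 2.0) / (frame_w / 2.0)
    err_y = (cy - frame_h / 2.0) / (frame_h / 2.0)
    yaw_rate = clamp(gain * err_x, -max_rate, max_rate)
    # Image y grows downward, so a target above centre needs a positive pitch.
    pitch_rate = clamp(-gain * err_y, -max_rate, max_rate)
    return yaw_rate, pitch_rate
```

A target centred in the frame yields zero rates, and the clamp keeps a detection near the frame edge from producing an abrupt command.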

Skills & technologies

Python
PyTorch
OpenCV
Docker
YOLO
Git

TRACKSENSE

Data Science Engineer

About the Project

An OpenDroneMap-based system for processing drone imagery for large-scale railway infrastructure monitoring. The project evaluates how drone photogrammetry can be used to detect track conditions, identify maintenance needs, and monitor rail infrastructure across long distances. The work included benchmarking processing approaches and estimating the cloud infrastructure needed for production-scale deployment.

Transportation

Responsibilities

Set up the processing environment both locally and on AWS cloud infrastructure. Handled dataset processing, benchmarking different approaches, and quality assessment of the photogrammetric outputs. Estimated AWS infrastructure requirements and costs for scaling to production workloads. Prepared the service architecture documentation and deployment plans.

STRIKER COACH

Data Science Engineer

About the Project

An autonomous drone system that uses AI object detection to identify targets and then tracks and guides itself toward them without human intervention. The system combines real-time computer vision for target detection with autonomous flight control for approach and tracking. This was the longest-running project (14 months), involving continuous iteration on detection accuracy, tracking stability, and autonomous guidance algorithms under real flight conditions.

Defense
UAVs/drones

Responsibilities

Built and iterated on the AI detection and tracking pipeline over 14 months. Developed YOLO-based object detection models optimized for onboard processing. Implemented the autonomous guidance logic using Pymavlink for drone control. Handled the full development cycle: model training, pipeline integration, field testing, and performance analysis. Conducted code reviews across the team and maintained project documentation. Managed the Docker-based deployment environment for consistent builds across development and testing hardware.

Skills & technologies

Python
PyTorch
OpenCV
Docker
YOLO
Git

SECURITY IN BLOCKCHAIN

Data Science Engineer

About the Project

A system for detecting malicious activity in blockchain networks. The project applied semi-supervised learning techniques to identify suspicious transaction patterns, addressing the challenge that labeled examples of blockchain fraud are scarce compared to normal transactions. The system processes transaction data, extracts behavioral features, and classifies activity as legitimate or potentially malicious, with MLflow tracking for experiment management and model versioning.

Blockchain
Fintech
Cyber-security
Analytics

Responsibilities

Researched and implemented semi-supervised learning approaches suitable for the highly imbalanced blockchain security dataset. Built the ML pipeline from feature extraction through model training and evaluation, using scikit-learn for the core models. Set up MLflow for experiment tracking and model registry. Managed the MongoDB data layer for storing processed transaction features. Handled DVC for data versioning and deployed the system on Google Cloud Platform. Ran iterative experiments to optimize the balance between detection rate and false positive rate.
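
The trade-off between detection rate and false positive rate mentioned above can be sketched as a threshold sweep over model scores. This is an illustrative example only: the function names and the selection rule (highest detection rate within a false positive budget) are assumptions, not the project's documented procedure.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (detection_rate, false_positive_rate) when score >= threshold is flagged.

    labels use 1 for malicious and 0 for legitimate activity.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    tpr = tp / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    return tpr, fpr


def best_threshold(scores, labels, max_fpr):
    """Pick the threshold with the highest detection rate whose FPR stays within max_fpr.

    Returns (threshold, detection_rate, false_positive_rate), or None if no
    candidate threshold satisfies the budget.
    """
    best = None
    for t in sorted(set(scores)):
        tpr, fpr = rates_at_threshold(scores, labels, t)
        if fpr <= max_fpr and (best is None or tpr > best[1]):
            best = (t, tpr, fpr)
    return best
```

On a highly imbalanced dataset the false positive budget matters more than raw accuracy, because even a small false positive rate over many legitimate transactions can outnumber the true detections.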

SENTIMENT ANALYSIS OF REVIEWS

Data Science Engineer

About the Project

An NLP system for extracting detailed sentiment information from customer reviews. Rather than just classifying reviews as positive or negative, the system performs aspect-based sentiment analysis: it identifies the specific topics mentioned in each review and the sentiment expressed toward each. Reviews go through data cleaning and formatting stages, then are processed with specialized models to extract keywords, sentiment scores, and opinion summaries.

E-commerce
Analytics

Responsibilities

Researched aspect-based sentiment analysis approaches and selected PyABSA as the core framework. Built the data cleaning and preprocessing pipeline. Developed the extraction pipeline using spaCy for linguistic processing and NLTK for text analysis. Implemented the full system from research through to working prototype, handling integration and testing across different review formats and domains.

Skills & technologies

Python
NLTK
Git
NLP

SKIN FEATURES DETECTION

Data Science Engineer

About the Project

A computer vision API for automated skin analysis. Users upload a face photo, and the system detects various skin features (blemishes, wrinkles, pores, discoloration), visualizes them on the image, and calculates quality scores for each feature category. The solution combines face detection using MediaPipe, advanced segmentation with Meta's Segment Anything model, and custom classification models for individual skin features. Deployed as a FastAPI service on AWS.

Beauty
Healthcare

Responsibilities

Led the research phase to evaluate different approaches for skin feature detection, settling on a pipeline that combines MediaPipe face landmarks with Segment Anything for precise region extraction. Developed custom classification models for scoring individual skin features. Built the full API service using FastAPI, handling image upload, processing, visualization generation, and score calculation. Managed Docker-based deployment on AWS. Handled integration testing to ensure consistent results across different lighting conditions, skin tones, and image qualities.

WATERMARK DETECTION

Data Science Engineer

About the Project

An application for automatically detecting watermarks on images. The system identifies whether an image contains a watermark, where the watermark is located, and what type it is. This is useful for content platforms that need to filter watermarked images from their collections or for automated image processing pipelines where watermark presence affects downstream handling.

Management

Responsibilities

Researched watermark detection approaches and built the detection model using PyTorch. Set up experiment tracking with Weights & Biases (WandB) to manage training runs and compare model configurations. Built the Streamlit-based demo interface for testing and stakeholder review. Handled the Docker-based deployment setup. Managed the full 8-month development cycle from initial research through model optimization and delivery.

Skills & technologies

Python
PyTorch
Docker
Git
Streamlit

HAND-WRITTEN DIAGRAM RECOGNITION

Data Science Engineer

About the Project

An application that recognizes hand-drawn content, including diagrams, flowcharts, and sketches. The system takes photos or scans of handwritten diagrams and converts them into structured digital representations. This bridges the gap between quick whiteboard sketches and formal digital documentation, useful for meeting notes, architecture planning, and collaborative design sessions.

Productivity
Analytics

Responsibilities

Researched recognition approaches for hand-drawn diagram elements (shapes, connectors, text). Built the recognition pipeline using PyTorch for the core models and Flask for the API layer. Handled development, testing, and integration across different diagram styles and input qualities. Managed Docker-based deployment for the complete service.