Back to Projects
2026 – Present·SKYFORGE
ML Engineer, FlightReady AI

SkyForge.

Aviation AI, fine-tuned from the ground up.

Interactive Preview

See It in Action

By the Numbers

5,000+

Training Pairs

Generated via GPT-4o teacher model across 6 categories

200

Benchmark Questions

SkyBench evaluation suite with GPT-4o judge

6

Aircraft Manuals

POH manuals parsed and structured

10×

Cost Reduction

vs. Gemini API with comparable accuracy

Overview

SkyForge is being built as the ML backbone of FlightReady AI. I'm building the pipeline from data ingestion (6 aircraft POH manuals, FAA handbooks, 461 emergency procedures) to training (QLoRA rank-64 adapters) to evaluation (SkyBench, a 200-question benchmark scored by GPT-4o). Currently using Gemini as the primary LLM while SkyForge is in development.

Architecture

Pattern

End-to-End ML Pipeline

DI Strategy

Config-driven with YAML hyperparameters

AI Integrations

Mistral 7BGPT-4o (teacher/judge)Gemini (baseline)

Backend

Modal serverless + vLLM inference

Testing

SkyBench 200-question benchmark

Key Features

What I Built

End-to-End Training Pipeline

Data ingestion from PDFs → GPT-4o teacher model dataset generation → QLoRA fine-tuning → SkyBench evaluation → Modal deployment. Fully automated.

QLoRA Fine-Tuning

4-bit NF4 quantization, rank-64 LoRA adapters targeting all 7 attention+MLP components. Flash Attention 2, sequence packing, gradient checkpointing on A100.

SkyBench Evaluation Suite

200 questions across 6 domains (emergency, W&B, regulations, systems, weather, decision-making). GPT-4o judge scores correctness, completeness, hallucination, and specificity.

Planned Production Serving

Targeting OpenAI-compatible API via vLLM on Modal with auto-scaling. Will serve as a drop-in replacement for Gemini in FlightReady AI's production app.

Performance

Optimizations

4-bit quantization reduces memory by ~75%

Sequence packing enables effective batch size of 16 on single A100

Target <500ms TTFT, <2s full response

10x cost reduction vs. Gemini API

Tech Stack

PythonPyTorchHugging FaceQLoRAvLLMModalWeights & BiasesFastAPI

Interested in learning more?

Check out the live project or get in touch to discuss the technical details.