YlyaBot

Lead Systems Architect & AI Engineer | Production-Ready Recruiter Agent & RAG Portal

An architectural digital twin built on the Next.js 16 App Router. It features semantic ingestion across multiple repositories (portfolio.json), a stateless RAG pipeline that embeds each incoming query, and React 19 async streaming. (In Development)

Project Overview

An event-driven, decoupled Retrieval-Augmented Generation (RAG) agent and streaming chat interface that acts as an interactive professional clone of Ylya Martchenko. The system uses a Git submodule pattern to expose the bot as a standalone module inside a core Next.js host while keeping its own independent repository.

An optimized PostgreSQL remote procedure call (RPC) matches recruiter queries against stored embeddings by cosine similarity, with sub-10ms database latency, so that answers are grounded in the most relevant context. A 5-tier model fallback hierarchy with timezone-aware cookie exclusions keeps the bot responding when individual Gemini API models hit their rate limits.

The project also includes a secure real-time analytics dashboard that monitors conversation telemetry, model efficiency, and geographic request metrics.

Specifications

My Role: Lead Systems Architect & AI Engineer
Development Phase: Production-Ready Recruiter Agent & RAG Portal
Languages: TypeScript, JavaScript, SQL
Tools & Frameworks: Next.js 16 (App Router), React 19, Supabase, pgvector, Google GenAI SDK, Gemini API, Vercel AI SDK, Upstash Redis Rate Limiting

Execution Latency

Sub-second (under 900ms) time-to-first-token streaming on free-tier APIs, theoretically reducible to under 150ms with enterprise context caching; sub-10ms PostgreSQL vector retrieval; and no write latency on the response path, because telemetry writes are deferred until after the response has been sent via the Next.js `after()` API.

UI Performance

60 FPS responsive animations using client-side stream-reading hooks, a 60ms batching threshold that coalesces incoming tokens so the reader keeps up with the stream and avoids TCP backpressure, and strict layout containment that supports dynamic HTML5 semantic structures. Components are compiled exclusively with the React Compiler under React 19.
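The 60ms batching threshold can be illustrated with a minimal sketch. The `TimedChunk` shape and the `batchChunks` function are hypothetical names introduced here, not taken from the project; the sketch assumes a batch is flushed once a chunk arrives 60ms or more after the first chunk of the current batch, so the UI receives one update per window instead of one per token.

```typescript
// Hypothetical shape: a streamed text chunk and its arrival time in milliseconds.
interface TimedChunk {
  arrivedAt: number;
  text: string;
}

// Coalesce chunks into batches. The current batch is flushed as soon as a
// chunk arrives `windowMs` or more after the first chunk in that batch.
function batchChunks(chunks: TimedChunk[], windowMs = 60): string[] {
  const batches: string[] = [];
  let buffer = "";
  let windowStart: number | null = null;

  for (const chunk of chunks) {
    if (windowStart !== null && chunk.arrivedAt - windowStart >= windowMs) {
      batches.push(buffer);
      buffer = "";
      windowStart = null;
    }
    if (windowStart === null) windowStart = chunk.arrivedAt;
    buffer += chunk.text;
  }

  // Flush whatever is left when the stream ends.
  if (buffer.length > 0) batches.push(buffer);
  return batches;
}
```

In a real client the same idea would typically run against a timer rather than recorded timestamps; the pure function above only makes the grouping rule easy to inspect.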

Operational Cost

$0/month. Operates entirely within Google AI Studio, Supabase, and Upstash free-tier thresholds, keeping recurring infrastructure cost at zero.

Engineering Highlights & Achievements

Designed a decoupled RAG architecture that uses Git submodules to host the YlyaBot agent independently, separating the presentation layer from the ingestion codebase while keeping global workspace configuration clean.

Engineered a robust 5-Tier prioritized Model Fallback Hierarchy (gemini-3.1-flash-lite -> gemini-3.5-flash -> gemini-2.5-flash -> gemma-4-31b-it -> gemini-flash-latest) to guarantee consistent high-uptime LLM performance.

Implemented a timezone-aware cookie exclusion system that marks quota-exhausted models until exactly midnight Pacific Time (America/Los_Angeles), preventing redundant API calls that are bound to fail.

Designed a high-performance vector matching function inside Supabase using PL/pgSQL, computing cosine distance over the stored embeddings inside the database itself, so vectors never leave the database for scoring.
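For readers unfamiliar with the metric, the ranking that the database function performs can be sketched in TypeScript. This is an illustration only: the project computes it in PL/pgSQL with pgvector, whose `<=>` operator returns cosine distance (1 minus cosine similarity). The `Segment` shape, `topMatches`, and the `minSimilarity` threshold are hypothetical names chosen for this sketch.

```typescript
// Hypothetical record shape: a stored text segment and its embedding.
interface Segment {
  id: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have equal length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Convention for this sketch: a zero vector matches nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every segment, drop weak matches, and return the k best.
function topMatches(
  query: number[],
  segments: Segment[],
  k: number,
  minSimilarity = 0,
): { id: string; similarity: number }[] {
  return segments
    .map((s) => ({ id: s.id, similarity: cosineSimilarity(query, s.embedding) }))
    .filter((m) => m.similarity >= minSimilarity)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k);
}
```

Running the same computation inside the database avoids shipping every 768-dimension vector to the application server, which is the design choice the highlight above describes.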

Engineering Challenges (STAR Method)

Exhaustive situation-action-result breakdowns showcasing problem-solving and architectural execution.

CHALLENGE 1

Situation & Impediment

Free-tier Gemini API endpoints frequently return HTTP 429 (Too Many Requests) errors when rate limits or quotas are exhausted during concurrent recruiter evaluation sessions, which caused the bot to crash and return blank streams.

Engineering Action

Engineered a prioritized 5-tier model fallback hierarchy coupled with a timezone-aware cookie exclusion engine. The Next.js Server Action awaits the first stream chunk from the highest-priority model that is not currently excluded. If that model fails because its quota is exhausted, the server catches the error, writes a cookie marking the model as exhausted until exactly midnight Pacific Time (America/Los_Angeles), and falls back to the next model in the priority list.
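The fallback loop and the midnight expiry can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's code: a `Map` stands in for the cookie store, `requestFirstChunk` is a hypothetical callback that resolves with a model's first stream chunk, and any error carrying `status === 429` is treated as quota exhaustion. All function names are introduced here for illustration.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Seconds elapsed since local midnight in America/Los_Angeles for an instant.
function pacificSecondsSinceMidnight(date: Date): number {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: "America/Los_Angeles",
    hour12: false,
    hour: "2-digit",
    minute: "2-digit",
    second: "2-digit",
  }).formatToParts(date);
  const get = (type: string): number =>
    Number(parts.find((p) => p.type === type)?.value ?? "0");
  // Some engines print midnight as "24" when hour12 is false, hence the modulo.
  return (get("hour") % 24) * 3600 + get("minute") * 60 + get("second");
}

// Next midnight in Pacific Time, with one correction pass for DST transition days.
function nextPacificMidnight(now: Date): Date {
  const elapsedMs = pacificSecondsSinceMidnight(now) * 1000 + (now.getTime() % 1000);
  let candidate = now.getTime() + (DAY_MS - elapsedMs);
  const driftMs = pacificSecondsSinceMidnight(new Date(candidate)) * 1000;
  if (driftMs !== 0) {
    candidate += driftMs < DAY_MS / 2 ? -driftMs : DAY_MS - driftMs;
  }
  return new Date(candidate);
}

// Assumption for this sketch: quota exhaustion surfaces as an error with status 429.
function isQuotaError(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { status?: number }).status === 429;
}

// Try models in priority order, skipping those still marked exhausted.
// `exclusions` stands in for the cookie store: model name -> expiry (epoch ms).
async function firstChunkWithFallback(
  models: string[],
  exclusions: Map<string, number>,
  requestFirstChunk: (model: string) => Promise<string>,
  now: Date = new Date(),
): Promise<{ model: string; firstChunk: string }> {
  for (const model of models) {
    if ((exclusions.get(model) ?? 0) > now.getTime()) continue;
    try {
      return { model, firstChunk: await requestFirstChunk(model) };
    } catch (err) {
      if (!isQuotaError(err)) throw err; // unrelated failures are not masked
      exclusions.set(model, nextPacificMidnight(now).getTime());
    }
  }
  throw new Error("All models are currently marked as exhausted");
}
```

Waiting for the first chunk before committing to a model means a quota error is caught before anything has been streamed to the client, so the fallback stays invisible to the user.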

Architectural Deep Dive

Low-level component relationships, system boundaries, and runtime flows.

The engine matches input queries against vectorized segments of profile.json and individual repository records. Embeddings are generated with gemini-embedding-2 at a reduced output size of 768 dimensions, then stored and indexed in Supabase. Recruiter chat queries are processed by a type-safe React Server Action that validates inputs with Zod, checks Upstash Redis rate limits, filters models against active quota exclusion cookies, and streams Gemini tokens to the client. Detailed telemetry records are written atomically to Supabase, while aggregate session distributions (daily count history, models, countries) are resolved via Upstash Redis pipelines. Both are surfaced in the secure metrics workspace.

Lessons Learned & Core Takeaways

Injecting structural key data (like languages or contact details) directly into system contexts guarantees deterministic truth, while keeping semantic RAG strictly for project implementation logs yields the highest conversational coherence.
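One way to picture this split is a prompt builder that always writes the fixed facts first and appends retrieved segments afterwards. The sketch below is hypothetical: the `ProfileFacts` fields, the section labels, and `buildSystemPrompt` are illustrative names, not the project's actual prompt format.

```typescript
// Hypothetical shapes for the two kinds of context.
interface ProfileFacts {
  languages: string[];
  contactEmail: string;
}

interface RetrievedSegment {
  source: string;
  text: string;
}

// Fixed facts are injected verbatim on every request; RAG results follow
// in a separate section so the two are never confused.
function buildSystemPrompt(facts: ProfileFacts, retrieved: RetrievedSegment[]): string {
  const factLines = [
    `Languages: ${facts.languages.join(", ")}`,
    `Contact: ${facts.contactEmail}`,
  ];
  const contextLines = retrieved.map(
    (seg, i) => `[${i + 1}] (${seg.source}) ${seg.text}`,
  );
  return [
    "Verified facts (always authoritative):",
    ...factLines,
    "",
    "Retrieved project context (use only if relevant):",
    ...(contextLines.length > 0 ? contextLines : ["(none retrieved)"]),
  ].join("\n");
}
```

Because the facts section does not depend on retrieval, a poor similarity match can degrade the project context but cannot change the stated languages or contact details.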

Live Interactive Preview

Rendered live in real-time. Direct URL: /ylya-bot


