Autonomous Execution

© 2026 HY13dev™. All Rights Reserved.
Get in touch
HY13dev Logo
Home
Home
Projects Icon
ProjectsProjects
Education Icon
EducationEducation
Experience Icon
ExperienceExperience
Contact Icon
ContactContact
YlyaBot Icon
YlyaBotYlyaBot
Ylya Martchenko
2026
Back to Projects
Full Stack Developer•Senior Software Engineer•Next.js & React Expert•GenAI & RAG Systems Engineer•Full Stack Developer•Senior Software Engineer•Next.js & React Expert•GenAI & RAG Systems Engineer•

YlyaBot

Lead Systems Architect & AI EngineerProduction-Ready Recruiter Agent & RAG Portal
Live Sync

An architectural digital twin running on Next.js 16 App Router. Features semantic multi-repository ingestion (portfolio.json), stateless RAG pipeline embedding queries, and React 19 async streaming. (In Development)

Project Overview

An event-driven, decoupled Retrieval-Augmented Generation (RAG) agent and streaming chat interface acting as an interactive professional clone of Ylya Martchenko. The system implements a dynamic Git Submodule pattern to expose the bot as a standalone module inside a core Next.js host while retaining an independent repository structure. Leveraging an optimized PostgreSQL Remote Procedure Call (RPC), the bot matches recruiting queries with sub-10ms database latency, utilizing vector cosine similarity calculations to deliver high-precision context grounding. It features a robust 5-tier model fallback hierarchy with timezone-aware cookie exclusions, resolving Gemini API rate limits with zero downtime. Additionally, it integrates a production-grade secure real-time analytics dashboard monitoring conversation telemetry, model efficiency, and geographic request metrics.

View on GitHubOpen Live Demo

Specifications

  • My RoleLead Systems Architect & AI Engineer
  • Development PhaseProduction-Ready Recruiter Agent & RAG Portal
  • Languages
    TypeScriptJavaScriptSQL
  • Tools & Frameworks
    Next.js 16 (App Router)React 19SupabasepgvectorGoogle GenAI SDKGemini APIVercel AI SDKUpstash Redis Rate Limiting

Execution Latency

Under 150ms time-to-first-token token streaming; sub-10ms PostgreSQL vector retrieval execution speed; sub-5ms stateless cryptographic JWT signature verification.

UI Performance

60 FPS responsive animations utilizing client-side stream reading hooks, rate-limiting guards, and strict layout containment supporting dynamic HTML5 semantic structures (<header>, <main>, <section>). Exclusively compiled via the native React 19 compiler with zero legacy hook overhead.

Operational Cost

$0/month. Operates entirely within Google AI Studio, Supabase, and Upstash free-tier thresholds, dropping system maintenance overhead to zero.

Engineering Highlights & Achievements

1

Architected a Decoupled RAG Architecture using Git Submodules to host the YlyaBot agent independently, separating presentation layers from ingestion codebases while keeping global workspace configurations clean.

2

Engineered a robust 5-Tier prioritized Model Fallback Hierarchy (gemini-flash-latest -> gemini-3.5-flash -> gemini-2.5-flash -> gemini-3.1-flash-lite -> gemma-4) to guarantee consistent high-uptime LLM performance.

3

Implemented a Time-Zone Aware Cookie Exclusion system expiring at exactly midnight Pacific Time (America/Los_Angeles) to cache quota-exhausted endpoints, preventing redundant and failing API calls.

4

Designed a high-performance vector matching function inside Supabase using PL/pgSQL, computing multi-dimensional cosine distance metrics natively on the database hardware layer.

Engineering Challenges (STAR Method)

Exhaustive situation-action-result breakdowns showcasing problem-solving and architectural execution.

CHALLENGE 1

Situation & Impediment

Free-tier Gemini API endpoints frequently return 429 Rate Limit Exceeded or Quota Exhausted errors under concurrent recruiter evaluation sessions, causing the AI bot to crash and return blank streams.

Engineering Action

Engineered a prioritized 5-tier fallback model hierarchy coupled with a timezone-aware cookie exclusion engine. The Next.js Server Action synchronously attempts to resolve the first stream chunk from active models. If a model fails due to a quota exhaustion, the server catches the error, writes a cookie marking that model as exhausted until exactly midnight Pacific Time (America/Los_Angeles), and seamlessly falls back to the next model in the priority list.

Architectural Deep Dive

Low-level component relationships, system boundaries, and runtime flows.

The engine matches input queries against vectorized segments of profile.json and individual repository records. High-fidelity embedding is achieved via gemini-embedding-2 compressed natively to 768 dimensions, stored and indexed in Supabase. Recruiter chat queries are processed by a typesafe React Server Action that validates inputs with Zod, checks Upstash Redis rate limits, filters models against active quota exclusion cookies, and streams Gemini tokens securely to the client. Detailed telemetry records are stored atomically inside Supabase, while aggregate session distributions (daily count history, models, countries) are resolved via Upstash Redis pipelines, all surfaced dynamically inside our secure metrics workspace.

Lessons Learned & Core Takeaways

Injecting structural key data (like languages or contact details) directly into system contexts guarantees deterministic truth, while keeping semantic RAG strictly for project implementation logs yields the highest conversational coherence.

Live Interactive Preview

Rendered live in real-time. Direct URL: /ylya-bot

If the preview remains blank, the site's security policies may restrict iframe embedding. Open the link directly instead.
/ylya-bot
Upstash Redis (Atomic Telemetry & Global Counters)
Supabase Database (Detailed Interaction Telemetry)
Stateless JWT (HMAC-SHA256 Sessions)
HTTP Cookies (Pacific Midnight Expiry)
Git Submodules
  • Stars0
  • Forks0
  • Last Updated5/28/2026
  • 5

    Built a mobile-first, high-fidelity chat dashboard utilizing the Vercel AI SDK, optimizing response times to under 150ms time-to-first-token using React Server Actions and modern streamable value client iterators.

    6

    Developed a Secure Telemetry & Real-Time Analytics Dashboard (/ylya-bot/metrics) protected by stateless cryptographic JWT authentication, combining live PostgreSQL database logs with high-performance rolling Redis aggregate metrics.

    Quantifiable Result

    Completely resolved free-tier API quota exhaustions with zero user downtime, maintaining 100% agent uptime while keeping monthly maintenance costs at absolute zero ($0.00).

    CHALLENGE 2

    Situation & Impediment

    Traditional RAG pipelines require continuous polling servers or expensive always-on extraction endpoints to index portfolio data changes.

    Engineering Action

    Engineered an event-driven ingestion script executed purely within serverless ephemeral GitHub Actions runners, triggered instantly by code pushes.

    Quantifiable Result

    Completely eliminated runtime compute overhead and reduced system operational maintenance costs to zero.

    CHALLENGE 3

    Situation & Impediment

    Building real-time dashboards in modern React frameworks regularly triggers hydration mismatches when server-rendered timestamps and client-rendered layouts conflict, especially on high-frequency mobile viewport states.

    Engineering Action

    Adopted pure utility-first layouts, isolated dynamic locks inside deferred CSS micro-animation states rather than synchronous React render frames, migrated client-side telemetry queries to server-rendered Next.js Async Components, and replaced legacy useMemo hooks with native React 19 Compiler instructions.

    Quantifiable Result

    Fully resolved layout shift flashes and hydration anomalies, creating a rock-solid, fully responsive mobile analytics console that compiles with zero console warnings.