Technologies: Python | React | FastAPI | ChromaDB | LangChain
I engineered a RAG pipeline and architected a containerized deployment for a specialized LLM. This involved implementing Parameter-Efficient Fine-Tuning (PEFT) using LoRA to adapt the model to specific domains, ensuring high-performance inference through a FastAPI backend and a React/Chainlit frontend.
I also specified an evaluation framework using DeepEval and 'LLM-as-a-judge' methodologies to systematically measure the model's faithfulness and reasoning capabilities.