Source code for the blog entry: https://samarkanov.info/blog/2025/jun/automating-airflow-with-an-llm-agent.html
I'm developing an LLM agent designed to autonomously generate and trigger an Apache Airflow pipeline that ingests data series, calculates moving averages, and exports the results into an HTML report. As part of this exercise I'm evaluating the agent’s capacity for self-correction: I want to see if it can complete the task in a single execution, recover from runtime errors, and adapt its generated code dynamically. By granting the LLM agency over a suite of specialized tools (which, hopefully, prevent hallucinations), I aim to determine how effectively the model can take responsibility for tool selection to maintain a reliable, end-to-end workflow.
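
To make the target of the pipeline concrete, here is a minimal sketch of the two computational steps the generated DAG is expected to cover: computing a moving average and rendering it as HTML. This is not the agent's output or the code from this repository; the function names `moving_average` and `render_report`, the window size, and the table layout are all assumptions chosen for illustration, and the sketch uses plain Python rather than Airflow operators.

```python
from html import escape


def moving_average(values, window):
    """Simple moving average: the mean of each consecutive `window`-sized slice.

    Hypothetical helper for illustration; not taken from the repository.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if window > len(values):
        return []  # not enough points to fill a single window
    # Keep a running sum so each step is O(1) instead of re-summing the slice.
    total = sum(values[:window])
    averages = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        averages.append(total / window)
    return averages


def render_report(title, averages):
    """Render the averages as a bare-bones HTML table (assumed layout)."""
    rows = "\n".join(
        f"<tr><td>{i}</td><td>{avg:.2f}</td></tr>"
        for i, avg in enumerate(averages)
    )
    return (
        f"<html><body><h1>{escape(title)}</h1>\n"
        "<table><tr><th>Index</th><th>Moving average</th></tr>\n"
        f"{rows}\n</table></body></html>"
    )


if __name__ == "__main__":
    series = [1, 2, 3, 4, 5]  # made-up sample data
    print(render_report("Moving averages", moving_average(series, 3)))
```

In an Airflow setting, each of these functions would plausibly become its own task, with the ingestion step feeding `moving_average` and its result passed on to `render_report`.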