...The system operates in a loop in which the agent modifies code, evaluates the result against a measurable metric, and keeps or discards each change based on performance. It generalizes the concept of autoresearch beyond machine learning, so the same loop can target test coverage, latency, lint error counts, and overall code quality. Developers define a goal and a verification command, and the agent runs experiments continuously until it reaches the desired outcome. The framework supports multiple operational modes, including debugging, planning, security auditing, and release validation. It can run unattended for extended periods, producing logs of experiments and improvements. ...
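
The modify, evaluate, keep-or-discard loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the framework's actual implementation: the names `Experiment`, `metric_from_command`, and `run_loop` are hypothetical, the metric is assumed to be a single number where higher is better, and the verification command is assumed to print that number on the last line of its output.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple, TypeVar

S = TypeVar("S")  # whatever represents "the current code" in this sketch


@dataclass
class Experiment:
    """One logged attempt: the candidate, its score, and the decision."""
    candidate: object
    score: float
    kept: bool


def metric_from_command(cmd: Sequence[str]) -> float:
    """Hypothetical helper: run a verification command and parse a number.

    Assumes the command exits with status 0 and prints the metric as the
    last line of stdout; a real verification command may report differently.
    """
    result = subprocess.run(list(cmd), capture_output=True, text=True, check=True)
    return float(result.stdout.strip().splitlines()[-1])


def run_loop(
    state: S,
    propose: Callable[[S], S],
    measure: Callable[[S], float],
    steps: int,
) -> Tuple[S, float, List[Experiment]]:
    """Keep a proposed change only if it strictly improves the metric."""
    best, best_score = state, measure(state)
    log: List[Experiment] = []
    for _ in range(steps):
        candidate = propose(best)          # the agent modifies the code
        score = measure(candidate)         # evaluate against the metric
        kept = score > best_score          # assumption: higher is better
        if kept:
            best, best_score = candidate, score
        # A discarded candidate is simply dropped; `best` is unchanged.
        log.append(Experiment(candidate, score, kept))
    return best, best_score, log
```

In this sketch `measure` is any callable, so it could wrap `metric_from_command` around a coverage or benchmark command, and the returned `log` plays the role of the experiment log. For a metric where lower is better, such as latency or lint error count, one option is to negate the value inside `measure`.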