Context and advantages of the position
You will work in the MAGNET team at Inria Lille, more specifically on the Reasoning Core project.
The work is funded by ANR Adada, and you will collaborate with the PI, a PhD student, and an engineer.
Assigned mission
The internship will focus on the design of a reinforcement learning environment for procedural task generation in code completion and understanding.
Unlike existing benchmarks that rely on static datasets, the objective is to create renewable, verifiable tasks with an adaptive difficulty knob, enabling curriculum learning.
This will involve:
Studying existing environments for procedural reasoning and code generation, such as Reasoning Gym (Shen et al., Reasoning Gym: Procedurally Generated Reasoning Environments for Pretraining and Evaluating LLMs, 2024), DELTA-Code (Sun et al., DELTA-Code: A Benchmark for RL-based Algorithm Learning in Code LLMs, ICLR 2025), StepCoder (Dou et al., StepCoder: Improving LLM-Based Code Generation via Step-wise RL, 2024), and RLEF (Gehring et al., Reinforcement Learning from Execution Feedback for Code Generation, NeurIPS 2024).
Identifying gaps in existing approaches, in particular the lack of procedurally generated static analysis tasks (e.g., call-graph construction, error detection, performance profiling).
Proposing new generators for Python code tasks that produce infinite, verifiable problem instances, with evaluation grounded in unit tests, execution traces, or static analyzers.
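To make the idea of a generator with a difficulty knob and verifiable instances concrete, here is a minimal sketch. The names `generate_task` and `score`, the task family (predicting the final value of a variable), and the reward scheme are illustrative assumptions, not the project's actual design. Ground truth is obtained by executing the generated program.

```python
import random


def generate_task(seed: int, difficulty: int) -> dict:
    """Build a small straight-line Python program and its verified answer.

    Hypothetical task family: `difficulty` is the number of chained assignments.
    """
    rng = random.Random(seed)  # seeded, so every instance is reproducible
    lines = [f"x0 = {rng.randint(1, 9)}"]
    for i in range(1, difficulty + 1):
        op = rng.choice(["+", "-", "*"])
        operand = rng.randint(1, 9)
        prev = rng.randrange(i)  # reuse any earlier variable
        lines.append(f"x{i} = x{prev} {op} {operand}")
    program = "\n".join(lines)

    namespace: dict = {}
    exec(program, {}, namespace)  # ground truth comes from running the program
    return {
        "prompt": f"What is the value of x{difficulty} after running:\n{program}",
        "program": program,
        "answer": namespace[f"x{difficulty}"],
    }


def score(task: dict, response: str) -> float:
    """Binary reward: 1.0 if the response parses to the verified answer."""
    try:
        return float(int(response.strip()) == task["answer"])
    except ValueError:
        return 0.0
```

Because the generator is seeded, new seeds yield fresh instances on demand while any single instance can be regenerated and re-verified exactly.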
Main activities
Implement procedural generators for code tasks (Python focus), inspired by TinyPy-style grammar synthesis (Naïr et al., Curriculum Learning for Code Execution with Synthetic Python Programs, 2023) and code perturbation methods.
Integrate these generators into a Gym-style RL environment with automatic scoring and curriculum scheduling (easy → hard).
Explore a novel task family (e.g., static code reasoning, code completion, debugging, refactoring, performance analysis, or multi-file repository puzzles) and evaluate whether LLMs can improve through curriculum-based RL.
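The activities above can be pictured with a small sketch of a Gym-style environment with automatic scoring and an easy-to-hard curriculum. This is an assumption-laden illustration: the class `CurriculumCodeEnv`, its parameters (`window`, `promote_at`), the toy task, and the promotion rule are all hypothetical, and the code only mimics the `reset`/`step` convention without depending on any RL library.

```python
import random
from collections import deque


class CurriculumCodeEnv:
    """Gym-style single-step environment with an easy-to-hard curriculum."""

    def __init__(self, seed=0, max_level=5, window=4, promote_at=0.75):
        self.rng = random.Random(seed)
        self.level = 1
        self.max_level = max_level
        self.recent = deque(maxlen=window)  # rewards earned at the current level
        self.promote_at = promote_at
        self.answer = None

    def reset(self) -> str:
        # Toy task: predict what a generated one-line program prints.
        # Higher levels sum longer lists.
        values = [self.rng.randint(0, 9) for _ in range(self.level + 1)]
        self.answer = str(sum(values))  # expected output, computed directly
        return f"print(sum({values}))"

    def step(self, action: str):
        reward = float(action.strip() == self.answer)  # automatic scoring
        self.recent.append(reward)
        window_full = len(self.recent) == self.recent.maxlen
        if window_full and sum(self.recent) / len(self.recent) >= self.promote_at:
            # Promote once the recent success rate clears the threshold.
            self.level = min(self.level + 1, self.max_level)
            self.recent.clear()
        # Single-step episode: no next observation, episode is done.
        return None, reward, True, {"level": self.level}
```

A policy would call `reset()` to obtain a program, answer through `step(...)`, and read the current difficulty from the returned `info` dictionary. A real environment would replace the toy task with the procedural generators and a richer verifier (unit tests, execution traces, or static analyzers).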
Skills
Excellent Python coding skills and code understanding
Critical thinking and scientific culture
Good vibe coding capabilities
Some perfectionism and a taste for iteration and refinement
Ability to process scientific literature
Benefits
Remuneration
Internship allowance at the rate currently in force