sileod/reasoning-core

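A suite of procedural reasoning data generators for language model pre-training and post-training. It generates training data and evaluation samples for symbolic reasoning tasks such as logic, mathematics, and planning, algorithmically and deterministically.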

# Reasoning core ◉

reasoning-core is a suite of text-based procedural data generators for pre-training and post-training language models.
It centers on expressive formal and algorithmic tasks, including full first-order logic, formal mathematics based on TPTP, planning, and CFG syntax tasks.

We release more than 10 billion tokens of pre-generated data 🤗 [https://hf.co/collections/reasoning-core/datasets](https://huggingface.co/collections/reasoning-core/datasets)

# Standalone

```
# Install first (shell command): uv pip install reasoning-core
from reasoning_core import list_tasks, get_task, score_answer

T = get_task('arithmetics')()        # look up the task class by name and instantiate it
x = T.generate_example()             # generate one example
assert score_answer(x.answer, x)==1  # the reference answer scores 1
```

# Task examples and task authoring guide

[Gallery](https://github.com/sileod/reasoning-core/blob/main/GALLERY.md) (names link to the task code)

[`planning`](GALLERY.md#planning) · [`table_qa`](GALLERY.md#table_qa) · [`table_conversion`](GALLERY.md#table_conversion) · [`equation_system`](GALLERY.md#equation_system) · [`code_execution`](GALLERY.md#code_execution) · [`diff_prediction`](GALLERY.md#diff_prediction) · [`diff_patching`](GALLERY.md#diff_patching) · [`regex_following`](GALLERY.md#regex_following) · [`regex_induction`](GALLERY.md#regex_induction) · [`graph_pathfinding`](GALLERY.md#graph_pathfinding) · [`graph_node_centrality`](GALLERY.md#graph_node_centrality) · [`graph_isomorphism`](GALLERY.md#graph_isomorphism) · [`arithmetics`](GALLERY.md#arithmetics) · [`symbolic_arithmetics`](GALLERY.md#symbolic_arithmetics) · [`sequential_induction`](GALLERY.md#sequential_induction) · [`conjecture_entailment`](GALLERY.md#conjecture_entailment) · [`proof_reconstruction`](GALLERY.md#proof_reconstruction) · [`bayesian_association`](GALLERY.md#bayesian_association) · [`bayesian_intervention`](GALLERY.md#bayesian_intervention) · [`logic_nli`](GALLERY.md#logic_nli) · [`evidence_retrieval`](GALLERY.md#evidence_retrieval) · [`parsability`](GALLERY.md#parsability) · [`parsing`](GALLERY.md#parsing) · [`continuation`](GALLERY.md#continuation) · [`set_intersection`](GALLERY.md#set_intersection) · [`set_missing_element`](GALLERY.md#set_missing_element) · [`count_elements`](GALLERY.md#count_elements) · [`set_equality`](GALLERY.md#set_equality)

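
The standalone snippet above generates an example and then checks that its own reference answer scores 1. As a rough illustration of that generate-then-verify pattern, here is a minimal self-contained sketch that does not depend on the library. `ToyExample`, `ToyArithmeticTask`, and `toy_score_answer` are hypothetical names invented for this sketch; they only mimic the shape of the interface shown above and are not how reasoning-core implements its tasks.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyExample:
    """One generated problem: a prompt plus its reference answer."""
    prompt: str
    answer: str


class ToyArithmeticTask:
    """Hypothetical stand-in for a task class; not part of reasoning_core."""

    def __init__(self, seed=None):
        # A private RNG makes generation reproducible for a given seed.
        self.rng = random.Random(seed)

    def generate_example(self, level=1):
        # Higher levels chain more operands, so difficulty is a single knob.
        n_operands = level + 1
        total = self.rng.randint(1, 9)
        expr = str(total)
        for _ in range(n_operands - 1):
            op = self.rng.choice("+-")
            value = self.rng.randint(1, 9)
            total = total + value if op == "+" else total - value
            expr += f" {op} {value}"
        # The answer is computed while the expression is built,
        # so it is correct by construction.
        return ToyExample(prompt=f"Evaluate: {expr}", answer=str(total))


def toy_score_answer(prediction, example):
    """Return 1 for an exact match after whitespace stripping, else 0."""
    return int(prediction.strip() == example.answer)


task = ToyArithmeticTask(seed=0)
x = task.generate_example(level=2)
assert toy_score_answer(x.answer, x) == 1
assert toy_score_answer("not a number", x) == 0
```

Because the generator computes the answer alongside the problem, every example comes with a reward signal that can be checked automatically, and a seed is enough to regenerate the same data.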
[Task authoring guide](https://github.com/sileod/reasoning_core/blob/main/TASK_AUTHORING_GUIDE.md)

# Parallel generation script

Run `bash run_generate.sh` for multi-threaded generation; the output is written as JSON files that Hugging Face Datasets can read.

# Integrations

### Prime Environment Hub

```
# Notebook cell: lines starting with ! are shell commands
#!pip install uv #install uv if needed
!uv tool install prime --with openai -q
!uv tool run prime -- env install sileod/reasoning-core-env

from verifiers import load_environment
import os; from openai import OpenAI

env = load_environment("reasoning-core-env")

client = OpenAI( base_url="https://openrouter.ai/api/v1", api_key=os.getenv("OPENROUTER_API_KEY")) #🔑

results = env.evaluate(client=client, model="gpt-4.1-mini", num_examples=20, rollouts_per_example=1)
df=env.make_dataset(results).to_pandas()
```

### Reasoning gym

We use a custom interface that is compatible with Reasoning Gym (RG). Our tasks are mostly orthogonal to RG's and can be imported into it.

```
import reasoning_gym, reasoning_core
from reasoning_gym.composite import DatasetSpec

reasoning_core.register_to_reasoning_gym() # registers RC tasks into RG

specs = [
    DatasetSpec(name='leg_counting', weight=1, config={}), #from reasoning_gym 🏋
    DatasetSpec(name='arithmetics', weight=1, config={}), #from reasoning_core ◉
]
D=reasoning_gym.create_dataset('composite', size=10, seed=42, datasets=specs)
```

The reverse also works: RG tasks can be used from reasoning-core.

```
from reasoning_core import get_task

t=get_task('reasoning_gym')
t.generate_example(level=1, rg_task='lcm') #or unspecified for random task
```

### Openreward

https://openreward.ai/dsileo/reasoning-core

## Citation and paper

```
@article{reasoningcore2026,
  title={Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training},
  author={Lacombe, Valentin and Quesnel, Valentin and Sileo, Damien},
  journal={arXiv preprint arXiv:2603.02208},
  year={2026},
  url={https://arxiv.org/abs/2603.02208}
}
```

https://arxiv.org/abs/2603.02208

Contact: damien.sileo@inria.fr