pgmpy/pgmpy

GitHub: pgmpy/pgmpy

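pgmpy is a full-featured Python library for causal AI. It supports building, learning, inference, and simulation for probabilistic graphical models such as Bayesian networks.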

Stars: 3277 | Forks: 1127

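pgmpy is a Python library for causal and probabilistic modeling with graphical models. It provides a unified API for building, learning, and analyzing models such as Bayesian networks, dynamic Bayesian networks, directed acyclic graphs (DAGs), and structural equation models (SEMs). By combining tools for probabilistic inference and causal inference, pgmpy lets users move between predictive and causal analyses without switching libraries.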
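To make the difference between predictive and causal analysis concrete, here is a minimal sketch in plain Python. It deliberately does not use pgmpy's API: the three-node network (`Z -> X`, `Z -> Y`, `X -> Y`), the variable names, and every probability below are hypothetical values chosen for illustration. It contrasts conditioning on `X` (a predictive query) with intervening on `X` (a causal query, computed here by backdoor adjustment over `Z`).

```python
from itertools import product

# Hypothetical toy network, all variables binary: Z -> X, Z -> Y, X -> Y.
# Z confounds the relationship between X and Y.
P_Z = {0: 0.5, 1: 0.5}                                      # P(Z = z)
P_X_GIVEN_Z = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}    # P_X_GIVEN_Z[z][x]
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (0, 1): 0.5,
                 (1, 0): 0.3, (1, 1): 0.7}                  # P(Y = 1 | x, z), keyed by (x, z)


def joint(z, x, y):
    """Joint probability P(Z=z, X=x, Y=y) from the factorization of the DAG."""
    p_y1 = P_Y1_GIVEN_XZ[(x, z)]
    p_y = p_y1 if y == 1 else 1.0 - p_y1
    return P_Z[z] * P_X_GIVEN_Z[z][x] * p_y


def observational(x):
    """Predictive query P(Y=1 | X=x), computed by enumeration."""
    numerator = sum(joint(z, x, 1) for z in (0, 1))
    denominator = sum(joint(z, x, y) for z, y in product((0, 1), repeat=2))
    return numerator / denominator


def interventional(x):
    """Causal query P(Y=1 | do(X=x)) via backdoor adjustment over Z."""
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in (0, 1))


if __name__ == "__main__":
    for x in (0, 1):
        print(f"P(Y=1 | X={x})     = {observational(x):.2f}")
        print(f"P(Y=1 | do(X={x})) = {interventional(x):.2f}")
```

With these numbers, observing `X=1` instead of `X=0` raises the probability of `Y=1` from 0.18 to 0.62, while forcing `X=1` instead of `X=0` raises it only from 0.30 to 0.50. The gap is the part of the association that flows through the confounder `Z`, and separating the two is what do-calculus-based tools are for.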

| | **[Documentation](https://pgmpy.org/)** · **[Examples](https://pgmpy.org/examples.html)** · [Tutorials](https://github.com/pgmpy/pgmpy_tutorials) |
|---|---|
| **Open Source** | [![GitHub License](https://img.shields.io/github/license/pgmpy/pgmpy)](https://github.com/pgmpy/pgmpy/blob/main/LICENSE) |
| **Tutorials** | [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/pgmpy/pgmpy/dev?filepath=examples) |
| **Community** | [![!discord](https://img.shields.io/static/v1?logo=discord&label=discord&message=chat&color=lightgreen)](https://discord.gg/DRkdKaumBs) [![!slack](https://img.shields.io/static/v1?logo=linkedin&label=LinkedIn&message=news&color=lightblue)](https://www.linkedin.com/company/pgmpy/) |
| **CI/CD** | [![github-actions](https://img.shields.io/github/actions/workflow/status/pgmpy/pgmpy/ci.yml?logo=github)](https://github.com/pgmpy/pgmpy/actions/workflows/ci.yml) [![asv](http://img.shields.io/badge/benchmarked%20by-asv-blue.svg?style=flat)](http://pgmpy.org/pgmpy-benchmarks/) [![platform](https://img.shields.io/conda/pn/conda-forge/pgmpy)](https://github.com/pgmpy/pgmpy) |
| **Code** | [![!pypi](https://img.shields.io/pypi/v/pgmpy?color=orange)](https://pypi.org/project/pgmpy/) [![!conda](https://img.shields.io/conda/vn/conda-forge/pgmpy)](https://anaconda.org/conda-forge/pgmpy) [![!python-versions](https://img.shields.io/pypi/pyversions/pgmpy)](https://www.python.org/) [![!black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) |
| **Downloads** | ![PyPI - Downloads](https://img.shields.io/pypi/dw/pgmpy) ![PyPI - Downloads](https://img.shields.io/pypi/dm/pgmpy) [![Downloads](https://static.pepy.tech/personalized-badge/pgmpy?period=total&units=international_system&left_color=grey&right_color=blue&left_text=cumulative%20(pypi))](https://pepy.tech/project/pgmpy) |
| **Supported by** | [![GC.OS Sponsored](https://img.shields.io/badge/GC.OS-Sponsored%20Project-orange.svg?style=flat&colorA=0eac92&colorB=2077b4)](https://gc-os-ai.github.io/) [![FLOSS/FUND](https://floss.fund//static/badge.svg)](https://floss.fund/) |

## Key Features

| Feature | Description |
|--------|-------------|
| [**Causal Discovery / Structure Learning**](https://pgmpy.org/examples/Structure%20Learning%20in%20Bayesian%20Networks.html) | Learn the model structure from data, optionally incorporating **expert knowledge**. |
| [**Causal Validation**](https://pgmpy.org/metrics/metrics.html) | Assess how compatible a causal structure is with the data. |
| [**Parameter Learning**](https://pgmpy.org/examples/Learning%20Parameters%20in%20Discrete%20Bayesian%20Networks.html) | Estimate model parameters (e.g., conditional probability distributions) from observed data. |
| [**Probabilistic Inference**](https://pgmpy.org/examples/Inference%20in%20Discrete%20Bayesian%20Networks.html) | Compute posterior distributions conditioned on observed evidence. |
| [**Causal Inference**](https://pgmpy.org/examples/Causal%20Inference.html) | Compute interventional and counterfactual distributions using do-calculus. |
| [**Simulation**](https://github.com/pgmpy/pgmpy/blob/dev/examples/Simulating_Data.ipynb) | Generate synthetic data under specified evidence or interventions. |

### Resources and Links

- **Example notebooks:** [Examples](https://github.com/pgmpy/pgmpy/tree/dev/examples)
- **Tutorial notebooks:** [Tutorials](https://github.com/pgmpy/pgmpy_notebook)
- **Blog posts:** [Medium](https://medium.com/@ankurankan_23083)
- **Documentation:** [Website](https://pgmpy.org/)
- **Bug reports and feature requests:** [GitHub Issues](https://github.com/pgmpy/pgmpy/issues)
- **Questions:** [discord](https://discord.gg/DRkdKaumBs) · [Stack Overflow](https://stackoverflow.com/questions/tagged/pgmpy)

## Quickstart

### Installation

pgmpy is available on both [PyPI](https://pypi.org/project/pgmpy/) and [anaconda](https://anaconda.org/conda-forge/pgmpy). To install from PyPI, use:

```bash
pip install pgmpy
```

To install from conda-forge, use:

```bash
conda install conda-forge::pgmpy
```

### Examples

#### Discrete Data

```python
from pgmpy.estimators import PC
from pgmpy.models import DiscreteBayesianNetwork
from pgmpy.utils import get_example_model

# Load a discrete Bayesian network and simulate data from it.
discrete_bn = get_example_model("alarm")
alarm_df = discrete_bn.simulate(n_samples=100)

# Learn the network structure from the simulated data.
dag = PC(data=alarm_df).estimate(ci_test="chi_square", return_type="dag")

# Learn the parameters from the data.
discrete_bn = DiscreteBayesianNetwork(dag.edges())
discrete_bn.add_nodes_from(dag.nodes())  # also keep nodes that have no edges in the learned DAG
dag_fitted = discrete_bn.fit(alarm_df)
dag_fitted.get_cpds()

# Drop a column and predict it using the learned model.
evidence_df = alarm_df.drop(columns=["FIO2"])
pred_FIO2 = dag_fitted.predict(evidence_df)
```

#### Linear Gaussian Data

```python
from pgmpy.estimators import PC
from pgmpy.models import LinearGaussianBayesianNetwork
from pgmpy.utils import get_example_model

# Load an example Gaussian Bayesian network and simulate data from it.
gaussian_bn = get_example_model("ecoli70")
ecoli_df = gaussian_bn.simulate(n_samples=100)

# Learn the network structure from the simulated data.
dag = PC(data=ecoli_df).estimate(ci_test="pearsonr", return_type="dag")

# Learn the parameters from the data.
gaussian_bn = LinearGaussianBayesianNetwork(dag.edges())
dag_fitted = gaussian_bn.fit(ecoli_df)
dag_fitted.get_cpds()

# Drop a column and predict it using the learned model.
evidence_df = ecoli_df.drop(columns=["ftsJ"])
pred_ftsJ = dag_fitted.predict(evidence_df)
```

#### Mixed Data with Arbitrary Relationships

```python
from pgmpy.global_vars import config

config.set_backend("torch")

import pyro.distributions as dist

from pgmpy.models import FunctionalBayesianNetwork
from pgmpy.factors.hybrid import FunctionalCPD

# Create a Bayesian network with a mix of discrete and continuous variables.
func_bn = FunctionalBayesianNetwork(
    [
        ("x1", "w"),
        ("x2", "w"),
        ("x1", "y"),
        ("x2", "y"),
        ("w", "y"),
        ("y", "z"),
        ("w", "z"),
        ("y", "c"),
        ("w", "c"),
    ]
)

# Define a functional CPD for each node and add them to the model.
cpd_x1 = FunctionalCPD("x1", fn=lambda _: dist.Normal(0.0, 1.0))
cpd_x2 = FunctionalCPD("x2", fn=lambda _: dist.Normal(0.5, 1.2))

# Continuous mediator: w = 0.7*x1 - 0.3*x2 + ε
cpd_w = FunctionalCPD(
    "w",
    fn=lambda parents: dist.Normal(0.7 * parents["x1"] - 0.3 * parents["x2"], 0.5),
    parents=["x1", "x2"],
)

# Bernoulli target with a logistic link:
# y ~ Bernoulli(sigmoid(-0.7 + 1.5*x1 + 0.8*x2 + 1.2*w))
cpd_y = FunctionalCPD(
    "y",
    fn=lambda parents: dist.Bernoulli(
        logits=(-0.7 + 1.5 * parents["x1"] + 0.8 * parents["x2"] + 1.2 * parents["w"])
    ),
    parents=["x1", "x2", "w"],
)

# Downstream Bernoulli variable influenced by y and w
cpd_z = FunctionalCPD(
    "z",
    fn=lambda parents: dist.Bernoulli(
        logits=(-1.2 + 0.8 * parents["y"] + 0.2 * parents["w"])
    ),
    parents=["y", "w"],
)

# Continuous outcome depending on y and w: c = 0.2 + 0.5*y + 0.3*w + ε
cpd_c = FunctionalCPD(
    "c",
    fn=lambda parents: dist.Normal(0.2 + 0.5 * parents["y"] + 0.3 * parents["w"], 0.7),
    parents=["y", "w"],
)

func_bn.add_cpds(cpd_x1, cpd_x2, cpd_w, cpd_y, cpd_z, cpd_c)
func_bn.check_model()

# Simulate data from the model.
df_func = func_bn.simulate(n_samples=1000, seed=123)

# For learning and inference in Functional Bayesian Networks, see the example notebook:
# https://github.com/pgmpy/pgmpy/blob/dev/examples/Functional_Bayesian_Network_Tutorial.ipynb
```

## Contributing

We welcome all contributions to pgmpy, not just code. Please refer to the [contributing guide](https://github.com/pgmpy/pgmpy/blob/dev/Contributing.md) for details. We also offer mentorship for new contributors and maintain a list of potential [mentored projects](https://github.com/pgmpy/pgmpy/wiki/Mentored-Projects). If you are interested in contributing to pgmpy, please join our [discord](https://discord.gg/DRkdKaumBs) server and introduce yourself. We will be happy to help you get started.
