pgmpy/pgmpy
GitHub: pgmpy/pgmpy
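pgmpy is a full-featured Python library for causal AI. It supports building, learning, inference, and simulation with probabilistic graphical models such as Bayesian networks.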
Stars: 3192 | Forks: 1019
| | **[Documentation](https://pgmpy.org/)** · **[Examples](https://pgmpy.org/examples.html)** · [Tutorials](https://github.com/pgmpy/pgmpy_tutorials) |
|---|---|
| **Open source** | [License](https://github.com/pgmpy/pgmpy/blob/main/LICENSE) |
| **Tutorials** | [Binder](https://mybinder.org/v2/gh/pgmpy/pgmpy/dev?filepath=examples) |
| **Community** | [Discord](https://discord.gg/DRkdKaumBs) · [LinkedIn](https://www.linkedin.com/company/pgmpy/) |
| **CI/CD** | [CI workflow](https://github.com/pgmpy/pgmpy/actions/workflows/ci.yml) · [Benchmarks](http://pgmpy.org/pgmpy-benchmarks/) · [GitHub](https://github.com/pgmpy/pgmpy) |
| **Code** | [PyPI](https://pypi.org/project/pgmpy/) · [conda-forge](https://anaconda.org/conda-forge/pgmpy) · [Python](https://www.python.org/) · [Black](https://github.com/psf/black) |
| **Downloads** | [pepy.tech](https://pepy.tech/project/pgmpy) |
| **Supported by** | [GC.OS](https://gc-os-ai.github.io/) · [FLOSS/fund](https://floss.fund/) |
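## Key Features
| Feature | Description |
|--------|-------------|
| [**Causal Discovery / Structure Learning**](https://pgmpy.org/examples/Structure%20Learning%20in%20Bayesian%20Networks.html) | Learn the model structure from data, optionally incorporating **expert knowledge**. |
| [**Causal Validation**](https://pgmpy.org/metrics/metrics.html) | Assess how compatible a causal structure is with the data. |
| [**Parameter Learning**](https://pgmpy.org/examples/Learning%20Parameters%20in%20Discrete%20Bayesian%20Networks.html) | Estimate model parameters (e.g., conditional probability distributions) from observed data. |
| [**Probabilistic Inference**](https://pgmpy.org/examples/Inference%20in%20Discrete%20Bayesian%20Networks.html) | Compute posterior distributions conditioned on observed evidence. |
| [**Causal Inference**](https://pgmpy.org/examples/Causal%20Inference.html) | Compute interventional and counterfactual distributions using do-calculus. |
| [**Simulation**](https://github.com/pgmpy/pgmpy/blob/dev/examples/Simulating_Data.ipynb) | Generate synthetic data under specified evidence or interventions. |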
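
The difference between the "Probabilistic Inference" and "Causal Inference" rows can be illustrated with a tiny hand-built example. The sketch below does not use pgmpy; it is plain Python over a hypothetical three-variable model (`Z -> X`, `Z -> Y`, `X -> Y`) whose probabilities are invented for illustration. It computes the observational quantity P(Y=1 | X=1) by conditioning, and the interventional quantity P(Y=1 | do(X=1)) by backdoor adjustment over the confounder `Z`.

```python
# Toy confounded model (all numbers are made up for illustration):
#   Z -> X, Z -> Y, X -> Y, with every variable binary.
p_z = {0: 0.5, 1: 0.5}                      # P(Z = z)
p_x1_given_z = {0: 0.2, 1: 0.8}             # P(X = 1 | Z = z)
p_y1_given_xz = {(0, 0): 0.1, (1, 0): 0.3,  # P(Y = 1 | X = x, Z = z)
                 (0, 1): 0.5, (1, 1): 0.7}


def p_x_given_z(x, z):
    """P(X = x | Z = z) for binary x."""
    return p_x1_given_z[z] if x == 1 else 1.0 - p_x1_given_z[z]


def observational(x):
    """P(Y = 1 | X = x): each z is weighted by the posterior P(z | X = x)."""
    p_x = sum(p_x_given_z(x, z) * p_z[z] for z in p_z)
    return sum(
        p_y1_given_xz[(x, z)] * p_x_given_z(x, z) * p_z[z] / p_x for z in p_z
    )


def interventional(x):
    """P(Y = 1 | do(X = x)): each z is weighted by the prior P(z)."""
    return sum(p_y1_given_xz[(x, z)] * p_z[z] for z in p_z)


p_obs = observational(1)   # roughly 0.62 for the numbers above
p_do = interventional(1)   # roughly 0.50 for the numbers above
print(p_obs, p_do)
```

The two values differ because observing `X = 1` also carries information about `Z`, while setting `X` by intervention does not. In a real workflow the library's inference classes would replace these hand-written sums.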
### Resources and Links
- **Example Notebooks:** [Examples](https://github.com/pgmpy/pgmpy/tree/dev/examples)
- **Tutorial Notebooks:** [Tutorials](https://github.com/pgmpy/pgmpy_notebook)
- **Blog Posts:** [Medium](https://medium.com/@ankurankan_23083)
- **Documentation:** [Website](https://pgmpy.org/)
- **Bug Reports and Feature Requests:** [GitHub Issues](https://github.com/pgmpy/pgmpy/issues)
- **Questions:** [Discord](https://discord.gg/DRkdKaumBs) · [Stack Overflow](https://stackoverflow.com/questions/tagged/pgmpy)
## Quickstart
### Installation
pgmpy is available on both [PyPI](https://pypi.org/project/pgmpy/) and [conda-forge](https://anaconda.org/conda-forge/pgmpy). To install from PyPI, run:
```
pip install pgmpy
```
To install from conda-forge, run:
```
conda install conda-forge::pgmpy
```
### Examples
#### Discrete Data
```
from pgmpy.utils import get_example_model

# Load a discrete Bayesian network and simulate data from it.
discrete_bn = get_example_model("alarm")
alarm_df = discrete_bn.simulate(n_samples=100)

# Learn the network structure from the simulated data.
from pgmpy.estimators import PC

dag = PC(data=alarm_df).estimate(ci_test="chi_square", return_type="dag")

# Learn the parameters from the data.
from pgmpy.models import DiscreteBayesianNetwork

discrete_bn = DiscreteBayesianNetwork(dag.edges())
# Also add nodes that have no edges in the learned DAG.
discrete_bn.add_nodes_from(dag.nodes())
dag_fitted = discrete_bn.fit(alarm_df)
dag_fitted.get_cpds()

# Drop a column and predict it using the learned model.
evidence_df = alarm_df.drop(columns=["FIO2"])
pred_FIO2 = dag_fitted.predict(evidence_df)
```
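
To make the parameter-learning step above more concrete: for discrete data, a maximum-likelihood estimate of a conditional probability distribution is just a table of normalized counts. The sketch below is plain Python, not pgmpy's implementation, and assumes maximum-likelihood estimation with a single parent. The helper `mle_cpd` and the `rain`/`wet` records are hypothetical.

```python
from collections import Counter


def mle_cpd(rows, child, parent):
    """Estimate P(child | parent) from a list of dict records by counting."""
    joint = Counter((r[parent], r[child]) for r in rows)
    parent_totals = Counter(r[parent] for r in rows)
    # Each joint count is divided by the count of its parent state.
    return {(p, c): n / parent_totals[p] for (p, c), n in joint.items()}


# Hypothetical observations: did it rain, and was the grass wet?
rows = [
    {"rain": 1, "wet": 1},
    {"rain": 1, "wet": 1},
    {"rain": 1, "wet": 0},
    {"rain": 0, "wet": 0},
    {"rain": 0, "wet": 0},
    {"rain": 0, "wet": 1},
    {"rain": 0, "wet": 0},
]

cpd = mle_cpd(rows, child="wet", parent="rain")
for (parent_state, child_state), prob in sorted(cpd.items()):
    print(f"P(wet={child_state} | rain={parent_state}) = {prob:.3f}")
```

With only a handful of rows the estimates are noisy, and parent/child combinations that never occur are simply absent from the table. That is one reason smoothed or Bayesian estimators are often preferred on small samples.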
#### Linear Gaussian Data
```
from pgmpy.utils import get_example_model

# Load an example Gaussian Bayesian network and simulate data from it.
gaussian_bn = get_example_model("ecoli70")
ecoli_df = gaussian_bn.simulate(n_samples=100)

# Learn the network structure from the simulated data.
from pgmpy.estimators import PC

dag = PC(data=ecoli_df).estimate(ci_test="pearsonr", return_type="dag")

# Learn the parameters from the data.
from pgmpy.models import LinearGaussianBayesianNetwork

gaussian_bn = LinearGaussianBayesianNetwork(dag.edges())
dag_fitted = gaussian_bn.fit(ecoli_df)
dag_fitted.get_cpds()

# Drop a column and predict it using the learned model.
evidence_df = ecoli_df.drop(columns=["ftsJ"])
pred_ftsJ = dag_fitted.predict(evidence_df)
```
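
As background for the parameter-learning step above, a linear Gaussian CPD models a node as `child ~ Normal(intercept + slope * parent, variance)`, so fitting it amounts to a linear regression of the child on its parents. The sketch below is plain Python for the single-parent case and is not pgmpy's implementation. The helper `fit_linear_gaussian` and the data points are hypothetical.

```python
def fit_linear_gaussian(xs, ys):
    """Least-squares fit of ys ~ Normal(intercept + slope * xs, variance)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Maximum-likelihood residual variance (divides by n, not n - 1).
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    variance = sum(r * r for r in residuals) / n
    return intercept, slope, variance


# Hypothetical parent/child measurements lying close to y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

intercept, slope, variance = fit_linear_gaussian(xs, ys)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, variance={variance:.4f}")
```

With several parents the same idea generalizes to multiple regression, with one coefficient per parent.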
#### Mixed Data with Arbitrary Relationships
```
from pgmpy.global_vars import config

config.set_backend("torch")

import pyro.distributions as dist

from pgmpy.models import FunctionalBayesianNetwork
from pgmpy.factors.hybrid import FunctionalCPD

# Create a Bayesian network with a mix of discrete and continuous variables.
func_bn = FunctionalBayesianNetwork(
    [
        ("x1", "w"),
        ("x2", "w"),
        ("x1", "y"),
        ("x2", "y"),
        ("w", "y"),
        ("y", "z"),
        ("w", "z"),
        ("y", "c"),
        ("w", "c"),
    ]
)

# Define a functional CPD for each node and add them to the model.
cpd_x1 = FunctionalCPD("x1", fn=lambda _: dist.Normal(0.0, 1.0))
cpd_x2 = FunctionalCPD("x2", fn=lambda _: dist.Normal(0.5, 1.2))

# Continuous mediator: w = 0.7*x1 - 0.3*x2 + ε
cpd_w = FunctionalCPD(
    "w",
    fn=lambda parents: dist.Normal(0.7 * parents["x1"] - 0.3 * parents["x2"], 0.5),
    parents=["x1", "x2"],
)

# Bernoulli target with a logistic link:
# y ~ Bernoulli(sigmoid(-0.7 + 1.5*x1 + 0.8*x2 + 1.2*w))
cpd_y = FunctionalCPD(
    "y",
    fn=lambda parents: dist.Bernoulli(
        logits=(-0.7 + 1.5 * parents["x1"] + 0.8 * parents["x2"] + 1.2 * parents["w"])
    ),
    parents=["x1", "x2", "w"],
)

# Downstream Bernoulli variable influenced by y and w.
cpd_z = FunctionalCPD(
    "z",
    fn=lambda parents: dist.Bernoulli(
        logits=(-1.2 + 0.8 * parents["y"] + 0.2 * parents["w"])
    ),
    parents=["y", "w"],
)

# Continuous outcome that depends on y and w: c = 0.2 + 0.5*y + 0.3*w + ε
cpd_c = FunctionalCPD(
    "c",
    fn=lambda parents: dist.Normal(0.2 + 0.5 * parents["y"] + 0.3 * parents["w"], 0.7),
    parents=["y", "w"],
)

func_bn.add_cpds(cpd_x1, cpd_x2, cpd_w, cpd_y, cpd_z, cpd_c)
func_bn.check_model()

# Simulate data from the model.
df_func = func_bn.simulate(n_samples=1000, seed=123)

# For learning and inference with Functional Bayesian Networks, see the example notebook:
# https://github.com/pgmpy/pgmpy/blob/dev/examples/Functional_Bayesian_Network_Tutorial.ipynb
```
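
To show what simulating from such a model involves conceptually, the sketch below draws samples by hand for the `x1`, `x2`, `w`, and `y` nodes using only the Python standard library. It is an illustration of ancestral sampling (sampling each node after its parents), not a description of how pgmpy or pyro implement `simulate`. The helper names `sigmoid` and `sample_row` are hypothetical, and the coefficients are copied from the CPD definitions above.

```python
import math
import random


def sigmoid(t):
    """Logistic function: maps a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def sample_row(rng):
    """Draw one joint sample, visiting nodes in topological order."""
    # Root nodes have no parents.
    x1 = rng.gauss(0.0, 1.0)
    x2 = rng.gauss(0.5, 1.2)
    # Continuous mediator: its mean is a linear function of its parents.
    w = rng.gauss(0.7 * x1 - 0.3 * x2, 0.5)
    # Bernoulli node with a logistic link on its parents.
    p_y = sigmoid(-0.7 + 1.5 * x1 + 0.8 * x2 + 1.2 * w)
    y = 1 if rng.random() < p_y else 0
    return {"x1": x1, "x2": x2, "w": w, "y": y}


rng = random.Random(123)  # fixed seed so repeated runs give the same draws
samples = [sample_row(rng) for _ in range(1000)]

share_y1 = sum(row["y"] for row in samples) / len(samples)
print(f"share of y = 1 in {len(samples)} samples: {share_y1:.3f}")
```

Because every node is drawn only after its parents, the samples follow the joint distribution implied by the graph. The `z` and `c` nodes could be added in the same way.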
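## Contributing
We welcome all contributions to pgmpy, not just code. Please refer to the
[contributing guide](https://github.com/pgmpy/pgmpy/blob/dev/Contributing.md)
for more details. We also offer mentorship for new contributors and maintain a
list of potential [mentored projects](https://github.com/pgmpy/pgmpy/wiki/Mentored-Projects).
If you are interested in contributing to pgmpy, please join our
[Discord](https://discord.gg/DRkdKaumBs) server and introduce yourself. We will
be happy to help you get started.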
