tafia/quick-xml

GitHub: tafia/quick-xml

A high-performance, near zero-copy XML parsing and writing library written in Rust, with Serde serialization support.

Stars: 1476 | Forks: 277

# quick-xml

![status](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/333a2d1f45194017.svg)
[![Crate](https://img.shields.io/crates/v/quick-xml.svg)](https://crates.io/crates/quick-xml)
[![docs.rs](https://docs.rs/quick-xml/badge.svg)](https://docs.rs/quick-xml)
[![codecov](https://img.shields.io/codecov/c/github/tafia/quick-xml)](https://codecov.io/gh/tafia/quick-xml)
[![MSRV](https://img.shields.io/badge/rustc-1.56.0+-ab6000.svg)](https://blog.rust-lang.org/2021/10/21/Rust-1.56.0.html)

High performance XML pull reader/writer.

The reader:
- is almost zero-copy (uses `Cow` whenever possible)
- is easy on memory allocation (the API provides a way to reuse buffers)
- supports various encodings (with the `encoding` feature), namespace resolution, and special characters.

Syntax is inspired by [xml-rs](https://github.com/netvl/xml-rs).

## Example

### Reader

```
use quick_xml::events::Event;
use quick_xml::reader::Reader;

let xml = r#"<tag1 att1 = "test">
                <tag2><!--Test comment-->Test</tag2>
                <tag2>Test 2</tag2>
             </tag1>"#;
let mut reader = Reader::from_str(xml);
reader.config_mut().trim_text(true);

let mut count = 0;
let mut txt = Vec::new();
let mut buf = Vec::new();

// The `Reader` does not implement `Iterator` because it outputs borrowed data (`Cow`s)
loop {
    // NOTE: this is the generic case when we don't know about the input BufRead.
    // when the input is a &str or a &[u8], we don't actually need to use another
    // buffer, we could directly call `reader.read_event()`
    match reader.read_event_into(&mut buf) {
        Err(e) => panic!("Error at position {}: {:?}", reader.error_position(), e),
        // exits the loop when reaching end of file
        Ok(Event::Eof) => break,

        Ok(Event::Start(e)) => {
            match e.name().as_ref() {
                b"tag1" => println!("attributes values: {:?}",
                                    e.attributes().map(|a| a.unwrap().value)
                                    .collect::<Vec<_>>()),
                b"tag2" => count += 1,
                _ => (),
            }
        }
        Ok(Event::Text(e)) => txt.push(e.decode().unwrap().into_owned()),

        // There are several other `Event`s we do not consider here
        _ => (),
    }
    // if we don't keep a borrow elsewhere, we can clear the buffer to keep memory usage low
    buf.clear();
}
```

### Writer

```
use quick_xml::events::{Event, BytesEnd, BytesStart};
use quick_xml::reader::Reader;
use quick_xml::writer::Writer;
use std::io::Cursor;

let xml = r#"<this_tag k1="v1" k2="v2"><child>text</child></this_tag>"#;
let mut reader = Reader::from_str(xml);
reader.config_mut().trim_text(true);
let mut writer = Writer::new(Cursor::new(Vec::new()));
loop {
    match reader.read_event() {
        Ok(Event::Start(e)) if e.name().as_ref() == b"this_tag" => {

            // creates a new element ... alternatively we could reuse `e` by calling
            // `e.into_owned()`
            let mut elem = BytesStart::new("my_elem");

            // collect existing attributes
            elem.extend_attributes(e.attributes().map(|attr| attr.unwrap()));

            // copy existing attributes, adds a new my-key="some value" attribute
            elem.push_attribute(("my-key", "some value"));

            // writes the event to the writer
            assert!(writer.write_event(Event::Start(elem)).is_ok());
        },
        Ok(Event::End(e)) if e.name().as_ref() == b"this_tag" => {
            assert!(writer.write_event(Event::End(BytesEnd::new("my_elem"))).is_ok());
        },
        Ok(Event::Eof) => break,
        // we can either move or borrow the event to write, depending on your use-case
        Ok(e) => assert!(writer.write_event(e).is_ok()),
        Err(e) => panic!("Error at position {}: {:?}", reader.error_position(), e),
    }
}

let result = writer.into_inner().into_inner();
let expected = r#"<my_elem k1="v1" k2="v2" my-key="some value"><child>text</child></my_elem>"#;
assert_eq!(result, expected.as_bytes());
```

## Serde

When using the `serialize` feature, quick-xml can be used with serde's `Serialize`/`Deserialize` traits.
The mapping between XML and Rust types, and in particular the syntax that allows you to distinguish *elements* from *attributes*, is described in detail in the documentation for [deserialization](https://docs.rs/quick-xml/latest/quick_xml/de/).

### Parsing the "value" of a tag

If you have an input of the form `<foo abc="xyz">bar</foo>` and you want to get at the `bar`, you can use either the special name `$text` or the special name `$value`:

```
use serde::Deserialize;

#[derive(Deserialize)]
struct Foo {
    // `@` marks an XML attribute
    #[serde(rename = "@abc")]
    pub abc: String,
    // `$text` captures the text content of the element
    #[serde(rename = "$text")]
    pub body: String,
}
```

Read about the difference between the two in the [documentation](https://docs.rs/quick-xml/latest/quick_xml/de/index.html#difference-between-text-and-value-special-names).

### Performance

Note that although the Serde support is not focused on performance (there are several unnecessary copies), it remains about 10x faster than serde-xml-rs.

# Features

- `encoding`: support non utf8 XML
- `serialize`: support serde `Serialize`/`Deserialize`

## Performance

Benchmarking is hard, and the results depend on your input file and your machine.

Here on my particular file, quick-xml is around **50 times faster** than the [xml-rs](https://crates.io/crates/xml-rs) crate.

```
// quick-xml benches
test bench_quick_xml            ... bench:     198,866 ns/iter (+/- 9,663)
test bench_quick_xml_escaped    ... bench:     282,740 ns/iter (+/- 61,625)
test bench_quick_xml_namespaced ... bench:     389,977 ns/iter (+/- 32,045)

// same bench with xml-rs
test bench_xml_rs               ... bench:  14,468,930 ns/iter (+/- 321,171)

// serde-xml-rs vs serialize feature
test bench_serde_quick_xml      ... bench:   1,181,198 ns/iter (+/- 138,290)
test bench_serde_xml_rs         ... bench:  15,039,564 ns/iter (+/- 783,485)
```

For a feature and performance comparison, you can also have a look at RazrFalcon's [parser comparison table](https://github.com/RazrFalcon/roxmltree#parsing).

## Contribute

Any PR is welcome!

## License

MIT
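
## Appendix: a sketch of the `Cow` idea

The reader description in this README says it is almost zero-copy because it uses `Cow` whenever possible. The following is a minimal, standard-library-only sketch of that general pattern, not quick-xml's actual implementation or API. The helper name `unescape_basic` is hypothetical, and it handles only the five predefined XML entities. The point it illustrates: text that needs no unescaping is returned as a borrow of the input, and a new `String` is allocated only when a `&` is present.

```rust
use std::borrow::Cow;

/// Hypothetical helper for illustration only (not part of quick-xml).
/// Replaces the five predefined XML entities, borrowing the input
/// unchanged when there is nothing to replace.
fn unescape_basic(input: &str) -> Cow<'_, str> {
    // Fast path: without '&' there can be no entity reference,
    // so hand back a borrow and allocate nothing.
    if !input.contains('&') {
        return Cow::Borrowed(input);
    }

    // Slow path: build an owned string, copying text between references.
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(pos) = rest.find('&') {
        out.push_str(&rest[..pos]);
        rest = &rest[pos..];
        let (replacement, consumed) = if rest.starts_with("&lt;") {
            ('<', 4)
        } else if rest.starts_with("&gt;") {
            ('>', 4)
        } else if rest.starts_with("&amp;") {
            ('&', 5)
        } else if rest.starts_with("&quot;") {
            ('"', 6)
        } else if rest.starts_with("&apos;") {
            ('\'', 6)
        } else {
            // Unknown or malformed reference: keep the '&' literally.
            ('&', 1)
        };
        out.push(replacement);
        rest = &rest[consumed..];
    }
    out.push_str(rest);
    Cow::Owned(out)
}
```

Calling `unescape_basic("plain text")` yields a `Cow::Borrowed` pointing at the original slice, while `unescape_basic("a &lt; b")` yields a `Cow::Owned` string. Callers that only read the text can treat both cases as `&str`, which is what makes the borrowed path cheap.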
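Tags: Crate, Io, Parser, Pull parsing, Rust, Serialization, XML, XML reading and writing, memory safety, deserialization, graphical interface, development components, open-source library, search engine crawler, data structures, encoding conversion, network traffic auditing, notification system, zero-copy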