To Tag, or Not to Tag: Translating C's Unions to Rust's Tagged Unions
Abstract
Automatic C-to-Rust translation is a promising way to enhance the reliability of legacy system software. However, C2Rust, an industrially developed translator, generates Rust code with unsafe features, undermining the translation's objective. While researchers have proposed techniques to remove unsafe features in C2Rust-generated code, these efforts have targeted only a limited subset of unsafe features. One important unsafe feature remaining unaddressed is a union, a type consisting of multiple fields sharing the same memory storage. Programmers often place a union with a tag in a struct to record the last-written field, but they can still access wrong fields. In contrast, Rust's tagged unions combine tags and unions at the language level, ensuring correct value access. In this work, we propose techniques to replace unions with tagged unions during C-to-Rust translation. We develop a static analysis that facilitates such replacement by identifying tag fields and the corresponding tag values. The analysis involves a must-points-to analysis computing struct field values and a heuristic interpreting these results. To enhance efficiency, we adopt intraprocedural function-wise analysis, allowing selective analysis of functions. Our evaluation on 36 real-world C programs shows that the proposed approach is (1) precise, identifying 74 tag fields with no false positives and only five false negatives, (2) mostly correct, with 17 out of 23 programs passing tests post-transformation, and (3) efficient, capable of analyzing and transforming 141k LOC in 4,910 seconds.
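As an illustration of the contrast the abstract draws, the following is a minimal sketch, not the paper's actual output. All names (`RawValue`, `RawTagged`, `Value`, `to_tagged`, `describe`) and the tag convention (0 for the integer field, 1 for the float field) are hypothetical. It places a C-style struct that pairs a tag with an untagged union next to a Rust tagged union (an `enum`) that carries the same information.

```rust
// Hypothetical C-style layout, roughly what a direct translation keeps:
// a tag field stored next to an untagged union.
#[repr(C)]
#[derive(Clone, Copy)]
pub union RawValue {
    pub int_val: i32,
    pub float_val: f32,
}

#[repr(C)]
#[derive(Clone, Copy)]
pub struct RawTagged {
    // Assumed convention: 0 => int_val was written last, 1 => float_val.
    // Nothing in the type system enforces this convention.
    pub tag: u32,
    pub value: RawValue,
}

// Tagged union: the tag and the payload are a single language-level value.
pub enum Value {
    Int(i32),
    Float(f32),
}

// Convert the C-style pair into the tagged union.
// Reading a union field is unsafe because the compiler cannot know
// which field was written last; here the tag is trusted to say so.
pub fn to_tagged(raw: &RawTagged) -> Option<Value> {
    match raw.tag {
        0 => Some(Value::Int(unsafe { raw.value.int_val })),
        1 => Some(Value::Float(unsafe { raw.value.float_val })),
        _ => None, // unknown tag value: refuse to guess
    }
}

// With the enum, the payload is only reachable through a match on the
// variant, so reading the wrong field is a compile-time impossibility.
pub fn describe(v: &Value) -> String {
    match v {
        Value::Int(i) => format!("int {}", i),
        Value::Float(f) => format!("float {}", f),
    }
}
```

In this sketch, code that works with `Value` needs no `unsafe` block at all, whereas every read of `RawValue` does. Identifying which field plays the role of `tag`, and which tag values correspond to which union fields, is the part that the static analysis described in the abstract is meant to automate.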
Authors (2)
Jaemin Hong
Sukyoung Ryu
Quick Access
- Year Published
- 2024
- Language
- en
- Source Database
- arXiv
- Access
- Open Access ✓