The Checked C project is extending the C programming language so that programmers can write more secure and reliable C programs. The project is developing an extension to C called Checked C that adds checking to C to detect or prevent common programming errors such as buffer overruns, out-of-bounds memory accesses, and incorrect type casts. The extension is designed to be used for existing system software written in C.
Finding out more
Here is a quick list of resources:
- Visit the Checked C Wiki on GitHub to learn more about the Checked C extension.
- Download a fork of the clang compiler for trying out Checked C on Windows here.
- The latest version of the Checked C language extension design is available on GitHub here.
- We do all of our ongoing work on GitHub, with repos for the language design and compiler implementation.
Checked C is an open, collaborative project. Developers and researchers are welcome to try it out, provide feedback, or contribute to the efforts.
Researchers working on the project include Michael Hicks, Ray Chen, and Hasan Touma at the University of Maryland; Aravind Machiry at UC Santa Barbara; Jorge Navas at SRI; Arie Gurfinkel at the University of Waterloo; and Andrew Ruef. They have been working on a tool for converting C programs to Checked C, as well as converting existing code and providing feedback on the language design. We have also worked in the past with researchers at Samsung. We are grateful to all of our collaborators and interns for their contributions to the project.
David Tarditi gave a research talk on Checked C at the University of Washington in October 2016. The talk is available on YouTube.
Current interns (2020)
- Yahui Sun (Texas A&M): Yahui worked on converting networking code in MUSL, a widely-used C runtime, to use Checked C. He also worked on improving error messages explaining why bounds declarations cannot be proved to be valid.
- Esmaeil Mohammadian Koruyeh (University of California – Riverside): Esmaeil worked on converting string processing code in MUSL, a widely-used C runtime, to use Checked C. He also worked on extending the kinds of conditional tests that result in widening of pointers to null-terminated arrays.
Past Interns
2019
- Jie Zhou (University of Rochester): Jie is working on how to dynamically detect memory management errors such as use after free.
- Pardis Pashakhanloo (Univ. of Pennsylvania): Pardis is improving the static checking of bounds declarations in the Checked C compiler.
- Abel Nieto (Univ. of Waterloo): Abel is implementing support for generic data structures in the Checked C compiler.
2018
- Shen Liu (Penn State): Shen worked on static checking for Checked C, including inferring widened bounds for null-terminated arrays.
- Prabhu Karthikeyan Rajasekaran (UC Irvine): Prabhu added bounds-safe interfaces for generic functions to Checked C and investigated using Checked C in Linux.
- Anna Kornfeld Simpson (University of Washington): Anna evaluated Checked C on real-world code, modifying several open-source code bases to use Checked C.
2017
- Sam Elliott (University of Washington): Sam worked on implementing the dynamic checking for Checked C. He wrote a technical report describing his work in detail.
- Jay Lim (Rutgers University): Jay extended the Checked C implementation of clang to support polymorphically-typed functions. This provides a type-safe replacement for many uses of void pointers in C.
2016
- Andrew Ruef (University of Maryland): Andrew wrote a tool for rewriting C programs to use Checked C extensions, specifically the ptr type.
Detailed Description
Most system software is written in C or C++, which is based on C. System software includes operating systems, browsers, databases, and programming language interpreters. System software is the “infrastructure” software that the world runs on.
There are certain kinds of programming errors such as buffer overruns and incorrect type casts that programmers can make when writing C or C++ programs. These errors can lead to security vulnerabilities or software reliability problems. The Checked C extension will let programmers add checking to their programs to detect these kinds of errors when a program runs or while it is being written. Existing system software can be modified incrementally in a backwards-compatible fashion to have this checking.
In C, programmers use pointers to access data. A pointer is the address of a memory cell. It is easy for programmers to make mistakes when working with pointers, such that a program reads or writes the wrong data. These mistakes can cause programs to crash, misbehave, or allow the program to be taken over by a malicious adversary. Checked C allows programmers to better describe how they intend to use pointers and the range of memory occupied by data that a pointer points to. This information is then used to add checking at runtime to detect mistakes where the wrong data is accessed, instead of the error occurring silently and without detection. This information can also be used to detect programming errors while the program is being written. The checking is called “bounds-checking” because it checks that data is being accessed within its intended bounds. The name Checked C reflects the fact that static and dynamic checking is being added to C.
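To make the idea of a bounds check concrete, here is a minimal sketch in plain C of the kind of test that a bounds declaration lets a compiler insert automatically. The function name `read_element` and its error-code convention are hypothetical and chosen only for illustration. The Checked C declaration shown in the comment is our approximation of the extension's syntax and should be checked against the language design document.

```c
#include <stddef.h>

/*
 * Read buf[index] only if the access stays inside the declared range.
 *
 * In plain C the relationship between `buf` and `count` lives only in
 * the programmer's head. In Checked C it could be declared, roughly, as
 *     _Array_ptr<const int> buf : count(count)
 * (approximate syntax, an assumption here), so that the compiler can
 * check accesses itself. This sketch writes the check out by hand.
 *
 * Returns 0 on success and -1 if the access would be out of bounds.
 * Returning an error code is a simplification for illustration.
 */
int read_element(const int *buf, size_t count, size_t index, int *out)
{
    if (buf == NULL || out == NULL) {
        return -1;              /* nothing valid to read from or write to */
    }
    if (index >= count) {
        return -1;              /* access falls outside the declared bounds */
    }
    *out = buf[index];          /* safe: 0 <= index < count */
    return 0;
}
```

With a three-element array, `read_element(data, 3, 1, &v)` succeeds, while `read_element(data, 3, 3, &v)` is rejected instead of silently reading past the end of the array.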
Many programming languages already have bounds checking. C# and Java are examples of such languages. However, those languages automatically add the information needed for bounds checking to data structures. This is a problem for system software, where the programmer needs precise control over what a program is doing. In Checked C, the programmer controls the placement of information needed for bounds-checking and how the information flows through the program, so the programmer retains precise control over what a program is doing.
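The point about programmer control can be sketched as follows. In this plain C example, the bound for `data` comes from a `len` field that the program already maintains, so no hidden length is added to the data structure. The `struct message` type and the `sum_prefix` function are hypothetical names used only for illustration, and the Checked C annotation in the comment is an approximation of the extension's syntax rather than a verified declaration.

```c
#include <stddef.h>

/* Hypothetical message buffer; the layout is exactly what the programmer wrote. */
struct message {
    size_t len;           /* length the program already tracks */
    unsigned char *data;  /* in Checked C this might be declared roughly as
                             _Array_ptr<unsigned char> data : count(len);
                             (approximate syntax, an assumption here) */
};

/*
 * Sum the first n bytes of the message, using the existing `len` field
 * as the bound. Returns -1 if the request would read out of bounds.
 */
long sum_prefix(const struct message *m, size_t n)
{
    if (m == NULL || m->data == NULL || n > m->len) {
        return -1;        /* request exceeds the programmer-declared bounds */
    }
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += m->data[i];   /* safe: i < n <= m->len */
    }
    return total;
}
```

Because the bound is an ordinary field chosen by the programmer, the structure keeps the same layout it would have in unchecked C, which matters for system software that shares data structures with hardware or with other code.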