Microsoft Office is very widely used software that can be found on almost any standard computer. It is also integrated into many products of the Microsoft/Windows ecosystem, such as Outlook and Office Online. In this blog post, we describe our attempts to fuzz a specific component in Microsoft Office and how the results affect this whole ecosystem. We also evaluate the pros and cons of the different fuzzing solutions we experimented with in the course of our research.
We chose the MSGraph COM component (MSGraph.Chart.8, GRAPH.EXE) as our fuzzing target, as it is quite an old piece of code that has existed since at least Office 2003. To our knowledge, this component has not received much attention from the security community until now, making it fertile ground for bugs.
MSGraph is a component that can be embedded inside many Microsoft Office products (such as Word, Outlook, PowerPoint, etc.), and is used to display graphs and charts. In terms of attack surface, MSGraph is quite similar to Microsoft Equation Editor 3.0. However, unlike Microsoft Equation Editor, MSGraph is still updated in every Office patch and receives the latest mitigations (such as ASLR and DEP), which makes successful exploitation harder. We later found that this attack surface also applies to other Microsoft Office products, including Excel and Office Online, that share the same code.
MSGraph is a symbol-less piece of software that utilizes the Windows COM model in some parts of its code. This makes MSGraph a not-so-trivial target to harness and fuzz. On top of that, MSGraph specifically, and Office in general, utilizes and runs a very large number of components and external DLLs, making the process of reverse-engineering harder. As this is a graybox target (we do not have source code, but we do have the ability to modify the binary, place breakpoints and examine the disassembly), harnessing the program can be tricky. There is no obvious “this_function_parses_stuff” function exported, and even the function that we identified as the parsing function contained some GUI calls that made the fuzzing process slow and clumsy.
In fuzzing terminology, a “harness” usually refers to a small program that triggers the functionality we want to fuzz. To learn more about this topic, we recommend reading our previous blogpost: 50 CVEs in 50 Days: Fuzzing Adobe Reader. The simplest way to create a harness for graybox targets is to utilize an exported function. This didn’t work in our case as we didn’t have such a function. An alternative is to trace and reverse-engineer the execution process of the application to find which function is responsible for the parsing logic and then call it directly.
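To make the idea of a harness concrete, here is a minimal sketch in Python. It is not our MSGraph harness: `parse_records` is a hypothetical stand-in for a target's parsing routine, and the record layout (2-byte type, 2-byte length, payload) is an assumption made purely for illustration. The point is the shape of `harness`: read one input file and hand it directly to the parser, with no GUI in the way.

```python
import struct


def parse_records(data: bytes) -> list[tuple[int, bytes]]:
    """Hypothetical stand-in for the target's parsing routine.

    Parses a sequence of records, each made of a 2-byte little-endian
    type, a 2-byte little-endian length, and `length` bytes of payload.
    """
    records = []
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError("truncated record header")
        rec_type, rec_len = struct.unpack_from("<HH", data, offset)
        offset += 4
        if offset + rec_len > len(data):
            raise ValueError("record payload runs past end of input")
        records.append((rec_type, data[offset:offset + rec_len]))
        offset += rec_len
    return records


def harness(path: str) -> int:
    """Load one input file and pass it straight to the parser."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        parse_records(data)
    except ValueError:
        return 1  # input rejected cleanly; this is not a crash
    return 0
```

A fuzzer would call `harness(path)` once per generated input. For a native graybox target the equivalent harness is usually a small program that loads the binary and calls the parsing function by address, which is why locating that function matters so much.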
While IDA was processing the binary (which took some time), we decided to test our luck and try a slightly mutated input. We used radamsa to mutate a legitimate input file and we manually tested the results (by clicking buttons in the GUI). We managed to find a crashing input (int29h / Assertion Error) on our 8th attempt. Talk about luck…
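The kind of "dumb" mutation we relied on here can be sketched in a few lines of Python. This is not radamsa and does not reproduce its mutation strategies; it is a much simpler stand-in that overwrites a handful of random bytes in a legitimate input. The function names and the `max_flips` parameter are our own, chosen for illustration.

```python
import random


def mutate(data: bytes, rng: random.Random, max_flips: int = 4) -> bytes:
    """Return a copy of `data` with a few random bytes overwritten."""
    if not data:
        return data
    buf = bytearray(data)
    for _ in range(rng.randint(1, max_flips)):
        pos = rng.randrange(len(buf))
        buf[pos] = rng.randrange(256)  # may occasionally equal the old value
    return bytes(buf)


def make_variants(seed_input: bytes, count: int, seed: int = 0) -> list[bytes]:
    """Produce `count` mutated variants of one legitimate input."""
    rng = random.Random(seed)
    return [mutate(seed_input, rng) for _ in range(count)]
```

Each variant would then be written to disk and opened in the target by hand, exactly as we did with the radamsa output.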
Although this crash is not interesting by itself, it definitely helped us identify which code is responsible for the parsing process, or at the very least, is prone to bugs. This is also when we understood why it was probably a good idea for Microsoft to “isolate” this component inside a COM object. In case this component happens to crash, the whole Word document still remains intact.
As we mentioned earlier, our target is graybox, so we have to use some Dynamic Binary Instrumentation (DBI) engine to instrument our target in order to collect coverage and fuzz efficiently. We tested multiple DBI solutions for this purpose and here are our results. You should remember, however, that MSGraph is a complicated target and these results can vary across different targets!
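To show why coverage matters, here is a toy coverage-guided loop in Python. Everything in it is hypothetical: `toy_target` stands in for an instrumented binary and simply reports which "basic blocks" an input reached, which is the information a DBI engine would collect for us. The loop keeps any mutated input that reaches a block not seen before, so later mutations can build on it.

```python
import random


def toy_target(data: bytes) -> set[int]:
    """Hypothetical target: returns the IDs of the blocks it executed."""
    covered = {0}
    if len(data) > 0 and data[0] == ord("G"):
        covered.add(1)
        if len(data) > 1 and data[1] == ord("R"):
            covered.add(2)
            if len(data) > 2 and data[2] == ord("A"):
                covered.add(3)
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed_input: bytes, iterations: int, seed: int = 0):
    """Keep any mutated input that reaches a block not seen before."""
    rng = random.Random(seed)
    corpus = [seed_input]
    seen = set(toy_target(seed_input))
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = toy_target(candidate)
        if not coverage <= seen:  # at least one new block was reached
            seen |= coverage
            corpus.append(candidate)
    return corpus, seen
```

Without the coverage feedback, the fuzzer would have to guess the whole nested prefix in one go; with it, each correct byte is kept and the next mutation starts from there.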
| Tracing Solution | Pros + | Cons – |
|---|---|---|
| (unnamed) | Integration within WinAFL. | Newer builds for Windows are not supported. |
| (unnamed) | Javascript Bindings. | Difficulties on our side. |
| (unnamed) | Designed to be easily modified. | Requires IDA. |
| TinyInst | Worked for us out-of-the-box. | Relatively new. |
In theory, we could have invested more time to understand why each solution did not work for us, and we would probably have done so if TinyInst had not worked so well for us.
As TinyInst ended up working for us out-of-the-box and has a good integration with Jackalope, we decided to use Jackalope as our fuzzing solution. We also used gflags to help us catch heap-related bugs. After about a week of on-and-off fuzzing, Jackalope managed to find 4 different bugs in MSGraph. These are the CVEs:
As Jackalope is still in the early development stages, it lacks some statistical information that is currently present in AFL, for example. Usually this information is used as a convenient way to measure the efficiency of the fuzzing process. However, working out-of-the-box is a major plus 🙂
Another great feature of Jackalope is that it is easily customizable and hackable. Adding a custom mutator to the fuzzer was straightforward and increased our fuzzing effectiveness at very little development cost.
After we identified the vulnerable function inside MSGraph, we found through code similarity checks that the vulnerable function is commonly used across multiple different Microsoft Office products, such as Excel (EXCEL.EXE), Office Online Server (EXCELCNV.EXE) and Excel for OSX. We successfully reproduced some of the bugs in these products.
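As a rough illustration of what a code similarity check does, the sketch below compares two functions by the overlap of their raw byte n-grams (Jaccard similarity). This is an assumption-laden simplification and not the method we used: real binary-diffing tools compare normalized instructions and control-flow structure rather than raw bytes, because addresses and offsets differ between builds. The function names and the choice of 4-byte windows are ours.

```python
def ngrams(code: bytes, n: int = 4) -> set[bytes]:
    """Return the set of all n-byte windows in a function's raw bytes."""
    return {code[i:i + n] for i in range(len(code) - n + 1)}


def similarity(func_a: bytes, func_b: bytes, n: int = 4) -> float:
    """Jaccard similarity of the two functions' n-gram sets (0.0 to 1.0)."""
    a, b = ngrams(func_a, n), ngrams(func_b, n)
    if not a and not b:
        return 1.0  # both too short to compare; treat as identical
    return len(a & b) / len(a | b)
```

A score close to 1.0 between a function in GRAPH.EXE and one in another binary would flag the pair as a candidate for manual comparison in a disassembler.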
As you can see in the images below, the same crash can be reproduced on these different products:
Even though we researched a single component of Microsoft Office, we managed to find several vulnerabilities that affect multiple products in this ecosystem. The result of this research was a set of files that could be embedded in different ways to potentially exploit different Office products across multiple platforms. As a bonus, we also had the opportunity to experiment with multiple fuzzing solutions. We hope you find our notes useful.
Disclosure Timeline