Leveraging Optimization Methods for Dynamically Assisted Control-Flow Integrity Mechanisms
Computer Systems Laboratory
Institute of Computing
Unicamp
João Moreira, Lucas Teixeira, Edson Borin, Sandro Rigo
Overview
[Figure: software process address space under attack, with labels for attack code, return address, locals, buffer, and string]
• Any help? Control Flow Integrity (CFI)
• How? DBM tools (QEMU, DynamoRIO, Pin)
  ●  Binary Translation
  ●  Optimization
  ●  Instrumentation
• Cost: overhead
Background
•  Kiriansky et al. Secure Execution via Program Shepherding. USENIX Security Symposium, 2002.
•  Moreira et al. Asynchronous Program Flow Verification. AMAS-BT, 2012.
•  Jee et al. ShadowReplica: Efficient Parallelization of Dynamic Data Flow Tracking. ACM SIGSAC Conference on Computer & Communications Security (CCS '13).
Contributions
• This work describes an asynchronous CFI mechanism implemented over a DBM tool
• The main contributions of this paper are:
  • Description of an asynchronous CFI implementation
  • Study of communication efficiency
  • Support for binary instrumentation and CFI on QEMU
  • Guidelines for implementing complex CFI policies
Outline
●  Asynchronous CFI on QEMU
●  Optimizations on dynamically assisted CFI
●  Evaluation
●  Conclusions
Control Flow Integrity
• Assurance of correct program flow
• Control flow transfers are monitored
• Different security policies:
  • Deny control transfers to suspicious memory areas
  • ret instructions may only target addresses that follow calls
• Relaxed vs. strict CFI policies
  • Strict policies may restrict legitimate program behavior
• Indirect control transfers are a particular challenge
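The two example policies above can be expressed as simple predicates. The following is a minimal sketch, not the authors' implementation: the `Region` type, the memory layout, and the set of return sites are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int
    end: int          # exclusive upper bound
    executable: bool


def target_allowed(target, regions):
    """Relaxed policy: the target must fall inside an executable region."""
    return any(r.start <= target < r.end and r.executable for r in regions)


def ret_allowed(target, call_return_sites):
    """Stricter policy: a ret may only land on an address that follows a call."""
    return target in call_return_sites


# Hypothetical layout: one code region and one non-executable stack region.
REGIONS = [Region(0x1000, 0x2000, True), Region(0x7000, 0x8000, False)]
# Hypothetical addresses of the instructions that follow each call.
CALL_RETURN_SITES = {0x1005, 0x1020}
```

A return to `0x1008` passes the relaxed check (it is inside the code region) but fails the stricter one, which illustrates the relaxed vs. strict trade-off.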
QEMU
• Open-source machine emulator
• Runs applications and operating systems compiled for other platforms
• Two-stage dynamic binary translation:
  • Translates original instructions into an intermediate representation (IR)
  • Translates IR into native code
• Acceptable performance
Instrumenting with QEMU
• QEMU helper functions
  • C functions that implement instruction behavior
  • Compiled with QEMU
  • The translator emits a call to the helper instead of translating a complex instruction
  • Have access to the CPU state
• Front-end stage of translation is modified
  • Instrumentation is included as helper functions
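The idea of adding instrumentation in the translator front end can be illustrated with a toy model. This sketch is not QEMU code and does not use QEMU's API: the instruction tuples, the `call_helper` op, and the `log_transfer` helper name are all hypothetical.

```python
# Toy model of a translator front end; none of these names come from QEMU.
CONTROL_FLOW = {"call", "ret"}


def translate_block(guest_insns):
    """Translate (pc, mnemonic, operand) tuples into a list of IR ops."""
    ir = []
    for pc, mnemonic, operand in guest_insns:
        if mnemonic in CONTROL_FLOW:
            # Instrumentation: emit a helper call before the control transfer.
            ir.append(("call_helper", "log_transfer", pc, mnemonic, operand))
        ir.append(("op", mnemonic, operand))
    return ir


def run(ir, log):
    """Execute helper calls only; ordinary ops are ignored in this toy model."""
    for op in ir:
        if op[0] == "call_helper":
            _, _name, pc, kind, operand = op
            log.append((kind, pc, operand))


block = [(0x1000, "mov", None), (0x1002, "call", 0x1100), (0x1007, "ret", None)]
ir = translate_block(block)
log = []
run(ir, log)
```

Because the helper call is emitted at translation time, the guest binary itself is never modified; only the translated code carries the logging.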
Asynchronous CFI
• Two processes:
  1. QEMU running the application
    • QEMU instruments the application
    • Outputs control transfer information
  2. Verification
    • Analyzes the emitted control transfer information
    • Checks for control flow corruption
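A minimal sketch of the producer/verifier split, assuming a pipe as the channel and a fixed-size binary record. The record layout (one kind byte plus a 64-bit address) is an assumption made for illustration; the slides do not specify the format. Both ends run in one process here to keep the example deterministic, whereas in the two-process design each end would live in its own process.

```python
import os
import struct

# Hypothetical record: 1-byte event kind followed by a 64-bit address.
RECORD = struct.Struct("<BQ")
CALL, RET = 0, 1


def emit(fd, kind, addr):
    """Producer side: write one control transfer record to the pipe."""
    os.write(fd, RECORD.pack(kind, addr))


def read_records(fd):
    """Verifier side: yield (kind, addr) until the writer closes the pipe."""
    with os.fdopen(fd, "rb") as stream:
        while True:
            chunk = stream.read(RECORD.size)
            if len(chunk) < RECORD.size:
                return
            yield RECORD.unpack(chunk)


read_fd, write_fd = os.pipe()
emit(write_fd, CALL, 0x1005)
emit(write_fd, RET, 0x1005)
os.close(write_fd)
records = list(read_records(read_fd))
```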
Asynchronous CFI
• Instrumented CF instructions
• Control transfer information is logged
• Logs are exported for external verification
• No binary modification or compiler support required
• Implements a call/ret parity check mechanism
• Two flavours: shadow-stack based or hash-map based
• Focused on detecting stack corruption
• Does not tackle indirect branches yet
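The two flavours of the call/ret parity check can be sketched as follows. This is an illustration under assumptions, not the authors' code: events are modeled as `("call", return_address)` and `("ret", target)` pairs, and the hash map is modeled as a counter of pending return addresses.

```python
from collections import Counter


def verify_shadow_stack(events):
    """Each ret must target the return address of the most recent open call."""
    stack = []
    for kind, addr in events:
        if kind == "call":
            stack.append(addr)
        elif kind == "ret":
            if not stack or stack.pop() != addr:
                return False
    return True


def verify_hash_map(events):
    """Each ret must target some return address that is still pending."""
    pending = Counter()
    for kind, addr in events:
        if kind == "call":
            pending[addr] += 1
        elif kind == "ret":
            if pending[addr] == 0:
                return False
            pending[addr] -= 1
    return True


normal = [("call", 0x10), ("call", 0x20), ("ret", 0x20), ("ret", 0x10)]
corrupted = [("call", 0x10), ("ret", 0x666)]
out_of_order = [("call", 0x10), ("call", 0x20), ("ret", 0x10)]
```

In this model, the hash-map variant accepts `out_of_order` because it does not track nesting, while the shadow stack rejects it. Both reject a return to an address that no call produced.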
Asynchronous vs. Online
• Asynchronous approach advantages:
  • Parallelizable architecture
  • Decouples analysis overhead from the main process
  • Leaves room for more complex analyses
• Asynchronous approach disadvantages:
  • Delayed reaction mechanisms
  • Additional communication overhead
Communication
• Shared memory
  • Does not use messages
• Pipe-based raw communication
  • One message for every operation
• Pipe-based call buffer
  • Calls are stored in a buffer
  • When a ret occurs, the whole buffer is sent in one message
  • Reduces the overall number of messages
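The difference between the raw and call-buffer pipe schemes can be sketched by counting messages. This is a simplified model under assumptions: events are `(kind, address)` pairs, the ret travels in the same message as the buffered calls, and leftover calls are flushed at the end of the trace.

```python
def raw_messages(events):
    """Raw scheme: one message for every logged operation."""
    return [[event] for event in events]


def call_buffer_messages(events):
    """Call-buffer scheme: calls accumulate and are sent when a ret occurs."""
    messages, buffer = [], []
    for event in events:
        buffer.append(event)
        if event[0] == "ret":
            messages.append(buffer)
            buffer = []
    if buffer:
        messages.append(buffer)  # flush leftover calls at exit (assumption)
    return messages


# Hypothetical trace: three nested calls followed by their returns.
trace = [("call", 0x10), ("call", 0x20), ("call", 0x30),
         ("ret", 0x30), ("ret", 0x20), ("ret", 0x10)]
raw = raw_messages(trace)
buffered = call_buffer_messages(trace)
```

For this trace the raw scheme sends six messages and the call buffer sends three.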
Communication
• Call buffer worst-case scenarios:
  • Consecutive leaf functions:
    multiple messages with only two logged instructions each
  • Consecutively called functions that each make a single call:
    long first message followed by many single-instruction messages
• Amortizing the leaf function worst-case scenario:
  • Hybrid verification approach
  • Mixes the call buffer with online verification
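The two worst cases can be reproduced with a small model that only tracks message sizes. This is a sketch under the assumption that each message carries the buffered calls plus the ret that triggered the send; the traces are hypothetical.

```python
def message_sizes(kinds):
    """Sizes of the messages a call buffer sends for a sequence of event kinds."""
    sizes, pending = [], 0
    for kind in kinds:
        pending += 1
        if kind == "ret":
            sizes.append(pending)
            pending = 0
    return sizes


leaves = ["call", "ret"] * 4          # four consecutive leaf calls
chain = ["call"] * 4 + ["ret"] * 4    # four nested functions, one call each

leaf_sizes = message_sizes(leaves)
chain_sizes = message_sizes(chain)
```

In the leaf case every message carries just a call and its ret, so buffering gains little; in the chain case the first message is long and each remaining ret is sent alone.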
Evaluation
• Two sets of applications:
• Synthetic exploits based on public vulnerabilities
• Subset of applications from the MiBench benchmark suite
Evaluation
• Attack detection:
  • Success
[Figure a): normal program flow. Figures b) and c): corrupted program flow]
Evaluation
• Attack detection:
  • Success
  • Hash-map policy may be bypassed for b, under certain circumstances
[Figure b): corrupted program flow]
Evaluation
• Attack detection:
  • Success
[Figure c): corrupted program flow]
Evaluation
[Figure: slowdown in comparison with the non-instrumented QEMU version]
Evaluation
[Figure: speedup of each communication scheme in comparison with the non-optimized instrumented QEMU version]
Evaluation
• Hybrid verification is the most efficient approach:
  • 1.46x average slowdown
  • Worst slowdown is 2.22x
  • 13.84x speedup on CRC32
• Computation sensitive:
  • More complex analysis will increase the slowdown
Conclusions
• Implementation of efficient asynchronous CFI
• Several communication schemes
• Binary instrumentation on QEMU
• Guidelines for more complex CFI analyses and policies
Thank You