PieSlicer: Dynamically Improving Response Time for Cloud-based CNN Inference

Samuel S. Ogden, Xiangnan Kong, Tian Guo

Research output: Contribution to journal › Article › peer-review


Executing deep-learning inference on cloud servers enables the use of high-complexity models for mobile devices with limited resources. However, pre-execution time (the time it takes to prepare and transfer data to the cloud) is variable and can take orders of magnitude longer than inference execution itself. This pre-execution time can be reduced by dynamically deciding the order of two essential steps, preprocessing and data transfer, to better take advantage of on-device resources and network conditions. In this work, we present PieSlicer, a system for making dynamic preprocessing decisions to improve cloud inference performance using linear regression models. PieSlicer then leverages these models to select the appropriate preprocessing location. We show that for image classification applications PieSlicer reduces median and 99th percentile pre-execution time by up to 50.2 ms and 217.2 ms, respectively, when compared to static preprocessing methods.
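The decision described in the abstract can be illustrated with a minimal sketch. It is not PieSlicer's implementation: the choice of input size as the only regression feature, the coefficient values, and all names (`LinearModel`, `choose_preprocessing_location`) are assumptions made for illustration. The sketch estimates pre-execution time for both orderings (preprocess on the device and then transfer the smaller result, or transfer the raw input and preprocess in the cloud) and picks the smaller estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearModel:
    """Hypothetical one-feature linear model: time_ms = intercept + slope * size_bytes."""
    intercept: float
    slope: float

    def predict(self, size_bytes: float) -> float:
        return self.intercept + self.slope * size_bytes


def choose_preprocessing_location(
    raw_bytes: float,
    preprocessed_bytes: float,
    bandwidth_bytes_per_ms: float,
    on_device_model: LinearModel,
    in_cloud_model: LinearModel,
) -> tuple[str, float]:
    """Return (location, estimated pre-execution time in ms) for the cheaper ordering."""
    # Ordering 1: preprocess on the device, then send the smaller preprocessed input.
    on_device_ms = (
        on_device_model.predict(raw_bytes)
        + preprocessed_bytes / bandwidth_bytes_per_ms
    )
    # Ordering 2: send the raw input, then preprocess it on the cloud server.
    in_cloud_ms = (
        raw_bytes / bandwidth_bytes_per_ms
        + in_cloud_model.predict(raw_bytes)
    )
    if on_device_ms <= in_cloud_ms:
        return "device", on_device_ms
    return "cloud", in_cloud_ms


# Illustrative coefficients only; they are not measurements from the paper.
DEVICE_MODEL = LinearModel(intercept=5.0, slope=2e-5)  # slower mobile CPU
CLOUD_MODEL = LinearModel(intercept=1.0, slope=2e-6)   # faster server CPU

if __name__ == "__main__":
    for bandwidth in (100.0, 100_000.0):  # bytes per millisecond
        location, estimate = choose_preprocessing_location(
            raw_bytes=2_000_000,
            preprocessed_bytes=150_000,
            bandwidth_bytes_per_ms=bandwidth,
            on_device_model=DEVICE_MODEL,
            in_cloud_model=CLOUD_MODEL,
        )
        print(f"bandwidth={bandwidth:>9.0f} B/ms -> {location} ({estimate:.1f} ms)")
```

With these placeholder numbers, a slow link favors preprocessing on the device, because shrinking the payload outweighs the slower CPU, while a fast link favors sending the raw input to the cloud. In practice the models would be fitted to measured timings rather than hard-coded.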
Original language: American English
Journal: ICPE '21: Proceedings of the ACM/SPEC International Conference on Performance Engineering
State: Published - 2021
Externally published: Yes


Keywords

  • Cloud inference
  • mobile deep learning
  • performance modeling


  • Computer Sciences
