To evaluate the performance of mobile app designs, designers and researchers employ techniques such as A/B testing, usability testing, and analytics-driven testing. While these strategies are useful for evaluating known designs, comparing many divergent solutions to identify the most performant one remains costly and difficult. This paper introduces a design performance testing approach that leverages existing app implementations and crowd workers to enable comparative testing at scale. We realize this approach in ZIPT, a zero-integration performance testing platform that allows designers to collect detailed design and interaction data from any Android app, including apps they do not own and did not build. Designers can deploy scripted tests via ZIPT to collect aggregate user performance metrics (e.g., completion rate, time on task) and qualitative feedback on third-party apps. Through case studies, we demonstrate that designers can use ZIPT’s aggregate data and visualizations to understand the relative performance of interaction patterns found in the wild and to identify usability issues in existing Android apps.