Jian Fang, Jianyu Chen, et al.
FCCM 2019
A common practice in building conversational agents today is to pipeline off-the-shelf modular services as building blocks. However, because the modules are discrete and run sequentially, the resulting response latency is long. We introduce Sci-Fii, a speculative inference framework that accelerates conversational agent systems built from off-the-shelf modules while leaving the modules themselves unchanged.
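To make the idea of speculation over a sequential pipeline concrete, the following is a minimal Python sketch of one generic approach: a downstream module is started on a guessed upstream output while the upstream module is still running, and the speculative result is kept only if the guess turns out to be correct. This is an illustration under stated assumptions, not a description of how Sci-Fii works internally; the abstract does not specify that. The names `speculative_pipeline`, `transcribe`, and `classify_intent` are hypothetical stand-ins, and the two modules are treated as black boxes that are called but never modified.

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe(audio):
    """Hypothetical stand-in for an off-the-shelf speech-to-text module."""
    return audio.strip().lower()


def classify_intent(text):
    """Hypothetical stand-in for an off-the-shelf intent classifier."""
    return "weather" if "weather" in text else "unknown"


def speculative_pipeline(upstream, downstream, request, guess, pool):
    """Run `downstream` on a guessed upstream output while `upstream` runs.

    Returns (result, hit), where `hit` says whether the guess was correct.
    Both modules are only called, never changed.
    """
    actual_future = pool.submit(upstream, request)
    speculative_future = pool.submit(downstream, guess)

    actual = actual_future.result()
    if actual == guess:
        # Correct guess: the downstream work overlapped with the upstream work.
        return speculative_future.result(), True

    # Wrong guess: discard the speculative result and fall back to the
    # ordinary sequential path.
    return downstream(actual), False


with ThreadPoolExecutor(max_workers=2) as pool:
    hit = speculative_pipeline(
        transcribe, classify_intent,
        "  What is the WEATHER  ", "what is the weather", pool,
    )
    miss = speculative_pipeline(
        transcribe, classify_intent,
        "  Play music ", "what is the weather", pool,
    )
```

In this sketch a correct guess hides the downstream latency behind the upstream call, while a wrong guess costs no more than the sequential pipeline plus the wasted speculative work. How guesses are produced is left open here and is an assumption of the example.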