Abstract
We introduce Reagent, a technology that can be
used in conjunction with automated speech recognition to allow users to query and manipulate ordinary webpages via speech and pointing. Reagent
can be used out-of-the-box with third-party websites, as it requires neither special instrumentation
from website developers nor special domain knowledge to capture semantically meaningful mouse interactions with structured elements such as tables
and plots. When it is unable to infer mappings between domain vocabulary and visible webpage content on its own, Reagent proactively seeks help by
engaging in a voice-based interaction with the user.