Sahaj Garg, co-founder and CTO of Wispr, a voice-to-text AI that turns speech into polished writing, talks with host Amey Ambade about designing systems for the ambiguity that’s inherent in human input (text, voice, multimodal). Sahaj focuses on concrete architectural and training strategies for building robust AI systems. This episode examines the problem of ambiguity, where it shows up, building robust systems, personalization, communicating uncertainty, and evaluation. The conversation begins by exploring the difference between inherent and reducible ambiguity, major categories of ambiguity including lexical, syntactic, and pragmatic, and the additional sources of ambiguity in voice, such as homophones and accents. Garg details how to build systems through model training, including providing additional context and setting up datasets for good annotation. They discuss personalization with a focus on “revealed preferences” (learning from user behavior without explicit feedback) and combating the problem of AI writing that “regresses to the mean.” Finally, they consider how to communicate uncertainty to users without degrading the experience, as well as methods for evaluating ambiguity resolution through offline and online signals.
Brought to you by IEEE Computer Society and IEEE Software magazine.

