
The Gemini API also now offers more granular control over multimodal vision processing, with a media_resolution parameter for configuring how many tokens are used for image, video, and document inputs. Developers can balance visual fidelity against token usage. Resolution can be set using media_resolution_low, media_resolution_medium, or media_resolution_high. Higher resolution boosts the model’s ability to read fine text or identify small details, Google said.
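As a rough illustration of that trade-off, the Python sketch below picks one of the three levels named above and validates it before it goes into a request. The dictionary shape and the helper names (build_generation_config, pick_level) are hypothetical and chosen only for illustration; they are not taken from Google's SDK, and the actual request format may differ.

```python
# The three level names come from the article; everything else here is
# a hypothetical illustration, not the Gemini SDK's real interface.
ALLOWED_LEVELS = (
    "media_resolution_low",
    "media_resolution_medium",
    "media_resolution_high",
)


def pick_level(needs_fine_detail: bool) -> str:
    """Choose high resolution only when small text or details matter."""
    if needs_fine_detail:
        return "media_resolution_high"
    return "media_resolution_low"


def build_generation_config(level: str) -> dict:
    """Return a plain dict carrying the chosen level (assumed shape)."""
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"unknown media resolution level: {level!r}")
    return {"media_resolution": level}


if __name__ == "__main__":
    # A scanned contract with small print would justify the extra tokens.
    print(build_generation_config(pick_level(needs_fine_detail=True)))
```

Defaulting to the low level and opting into the high one keeps token usage down for inputs where fine detail is not needed.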
Starting with Gemini 3, the Gemini API also brings back thought signatures to improve function calling and image generation. Thought signatures are encrypted representations of the model’s internal thought process. By passing these signatures back to the model in subsequent API calls, developers can ensure that Gemini 3 maintains its chain of reasoning across a conversation. This is critical for complex, multi-step agentic workflows where preserving the “why” behind a decision is as important as the decision itself, Google said.
Additionally, developers can now combine structured outputs with Gemini-hosted tools, specifically Grounding with Google Search and URL Context. The combination is especially powerful for building agents that need to fetch live information from the web or from specific web pages and extract the data into a JSON format for downstream tasks, Google said, noting that it has updated the pricing of Grounding with Google Search to better support agentic workflows. The pricing model changes from a flat rate of $35 per 1,000 prompts to a usage-based rate of $14 per 1,000 search queries.
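The effect of the pricing change depends on how many searches each prompt triggers. The Python sketch below compares the two rates quoted above, assuming the old rate was charged once per grounded prompt regardless of how many searches it ran and the new rate is charged once per search query. The function names and the example workload are illustrative assumptions, not figures from Google.

```python
# Rates quoted in the article, in US dollars.
OLD_RATE_PER_1K_PROMPTS = 35.0   # flat rate, billed per grounded prompt
NEW_RATE_PER_1K_QUERIES = 14.0   # usage-based rate, billed per search query


def old_cost(prompts: int) -> float:
    """Cost under the flat per-prompt rate."""
    return prompts / 1000 * OLD_RATE_PER_1K_PROMPTS


def new_cost(prompts: int, queries_per_prompt: float) -> float:
    """Cost under the per-query rate, given an average query count."""
    return prompts * queries_per_prompt / 1000 * NEW_RATE_PER_1K_QUERIES


def break_even_queries_per_prompt() -> float:
    """Average queries per prompt at which both models cost the same."""
    return OLD_RATE_PER_1K_PROMPTS / NEW_RATE_PER_1K_QUERIES


if __name__ == "__main__":
    prompts = 10_000  # hypothetical workload
    print(old_cost(prompts))              # 350.0
    print(new_cost(prompts, 1))           # 140.0
    print(new_cost(prompts, 3))           # 420.0
    print(break_even_queries_per_prompt())  # 2.5
```

Under these assumptions, workloads averaging fewer than 2.5 searches per prompt cost less under the new model, while agents that issue many searches per prompt cost more.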
