OpenAI has revealed its first custom-built AI inference processor, developed in collaboration with semiconductor giant Broadcom. Named Jalapeño, the chip was designed specifically to handle OpenAI’s inference workloads (running pre-trained AI models in response to live user requests), and OpenAI’s own AI models assisted in its design.
The chip is still in testing, but early results indicate materially stronger performance per watt than current alternatives. The announcement highlighted the chip’s efficiency when running real-time coding models, pointing to cost reduction as a primary objective.
The move positions OpenAI alongside Google and Amazon, both of which have developed proprietary AI accelerators to reduce dependence on Nvidia’s GPUs. OpenAI President Greg Brockman had previously outlined the company’s rationale, describing a focus on identifying specific workloads that existing hardware underserves and building silicon that accelerates them.

OpenAI framed Jalapeño as part of a broader strategy to own the full AI infrastructure stack — spanning chip architecture, memory systems, networking, scheduling, and deployment — so that every layer can be optimised around a single goal: making its models faster, more reliable, and cheaper to run.
More compute-intensive tasks such as pre-training are expected to continue relying on Nvidia hardware for the foreseeable future.