aiOla Unveils the First One-step AI Model that Masks Sensitive Information to Preserve Privacy

Insider Brief

aiOla has launched Whisper-NER, the first AI model combining automatic speech recognition (ASR) with built-in named entity recognition (NER) to transcribe and mask sensitive information (e.g., names, addresses) in one step, enhancing privacy, security, and compliance.
Unlike traditional multi-step approaches that leave sensitive data vulnerable, Whisper-NER masks entities during transcription, so sensitive data is never stored or generated. The model can also be configured for applications such as healthcare, compliance, and inventory management.
Built on OpenAI’s Whisper and trained on synthetic datasets, Whisper-NER is open-source, offering the community a secure, efficient, and ethical solution for handling sensitive speech data.

PRESS RELEASE: aiOla, a leader in speech AI technology, announced today the release of a first-of-its-kind AI model for automatic speech recognition with built-in named entity recognition capabilities. aiOla’s model addresses a range of critical challenges for enterprises, including the automatic detection and masking of sensitive information such as names, phone numbers, and addresses, all in one step during audio transcription.

Voice is the most seamless way to interact with technology, making audio transcription a vital part of any speech-powered application. A key challenge in automatic speech recognition is ensuring privacy and security, as users’ speech often includes sensitive data. This risk was underscored in 2023, when a company offering transcription services to healthcare organizations and physicians fell victim to a breach, leading to the theft of data from more than 9 million patients. Companies typically post-process the transcribed text to remove sensitive information. However, this multi-step approach leaves the data vulnerable while it is stored and transferred before that processing, and it creates regulatory and compliance issues.

aiOla’s Whisper-NER model recognizes and masks sensitive information during transcription. Users input an audio file along with the names of entities they want to be identified, for example, “Patient Name”, “Patient Address” or “Phone Number”. The model then transcribes the audio while simultaneously masking the entities so that sensitive personal information isn’t stored, even temporarily, enhancing privacy, security, and compliance. Additionally, for use cases where privacy and security are not a concern, the model offers flexible output options and can be configured to identify and tag entities without masking them. This customization makes the model adaptable to various uses, including speech-powered applications for inventory management, quality control, compliance, inspections, and beyond.
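
To make the two output modes concrete, the sketch below renders the same utterance once in masked form and once in tagged form. It does not call Whisper-NER. The inline tag syntax, the [LABEL] mask token, and the render function are all assumptions made for illustration, and the model's real output format may differ. In the actual model, masking happens during transcription, not as a post-processing step like this one.

```python
import re
from typing import Iterable

# Hypothetical inline tag format: <label>entity text</label>.
# This is an assumption for illustration, not Whisper-NER's documented output.
TAG_PATTERN = re.compile(r"<(?P<label>[^<>/]+)>(?P<text>.*?)</(?P=label)>")


def render(tagged_transcript: str, entity_labels: Iterable[str], mask: bool = True) -> str:
    """Render a tagged transcript in masked or tagged form.

    entity_labels: the entity names the user asked for, e.g. "Patient Name".
    mask=True  -> each requested entity is replaced by a [LABEL] placeholder.
    mask=False -> each requested entity keeps its text and its tags.
    Tags for labels that were not requested are stripped, leaving plain text.
    """
    wanted = {label.lower() for label in entity_labels}

    def replace(match: re.Match) -> str:
        label = match.group("label").lower()
        if label not in wanted:
            return match.group("text")
        if mask:
            return f"[{label.upper()}]"
        return match.group(0)

    return TAG_PATTERN.sub(replace, tagged_transcript)


utterance = (
    "Patient <patient name>Jane Doe</patient name> can be reached at "
    "<phone number>555-0100</phone number>."
)
labels = ["Patient Name", "Phone Number"]

masked = render(utterance, labels, mask=True)
tagged = render(utterance, labels, mask=False)
print(masked)
print(tagged)
```

With mask=True the entity text never appears in the rendered transcript, which mirrors the privacy-oriented mode described above. With mask=False the entities stay visible but labeled, which is the form a use case such as inventory management or inspections would consume.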

“Whisper-NER is the first open-source AI model that not only detects and masks sensitive data but can ensure that sensitive information is never generated in the first place,” said Gill Hetz, VP of Research at aiOla. “Our approach allows us to structure unstructured transcriptions without relying on generic models like ChatGPT, and without requiring separate ASR and NER processes, which can negatively impact privacy and security. Whisper-NER operates as a zero-shot solution, combining both tasks in one elegant step, significantly improving efficiency while maintaining supreme accuracy. This innovation not only boosts performance but also strengthens ethical AI practices, fostering trust in the secure and responsible collection of speech data.”

Whisper-NER, built on top of OpenAI’s Whisper, was trained using a synthetic dataset that combines large amounts of synthetic speech with open NER text datasets. This approach allowed the model to learn both transcription and entity recognition in parallel. aiOla is releasing Whisper-NER as an open-source model on GitHub and Hugging Face, making it accessible to the community, along with a demo for users to explore.

About aiOla:

aiOla’s patented technology comprehends over 100 languages and discerns jargon, abbreviations, and acronyms, demonstrating a low error rate even in noisy environments. aiOla’s technology converts manual processes in critical industries into data-driven, paperless, AI-powered workflows through cutting-edge speech recognition.

Contact:

Gavriel Cohen
Concrete Media for aiOla
aiOla@concrete.media
