As generative AI continues to dominate the headlines, it is often difficult to find actual working business use cases amid the hype. Writer is a San Francisco-based startup developing generative AI writing products for businesses. Today the company announced a new capability for its Palmyra model, called Palmyra Vision, that generates text from images, including graphics and diagrams.

May Habib, co-founder and CEO of the company, says they have made a strategic decision to focus on multimodal content, and that the ability to generate text from images is part of that strategy. “We will concentrate on multimodal inputs, but on text outputs, so text generation and insights conveyed through text,” Habib told TechCrunch.

Following that guiding principle, the company decided to analyze images rather than produce them (at least for now). It reserves the right to create charts and graphs from data at some point, but that is not something it is currently doing. This release focuses on generating text from such images.

The company uses a multi-model approach to produce the Palmyra Vision results, with each model having a specific job in determining what is contained in the image and then generating the text with four nines (99.99%) accuracy, according to Habib.

This has a number of use cases, including an e-commerce site generating text from hundreds of changing images to populate the site with the latest merchandise without requiring a human to keep up with every change, or, more importantly, automatically interpreting insights from charts and graphs. Another example is a compliance audit: a pharmaceutical company could use Palmyra Vision to run an automated FDA compliance check on ad copy to make sure the ad complies with FDA regulations described in an associated document, as shown in the example below.

Palmyra Vision example for a pharmaceutical company comparing an ad against a document with FDA requirements. Photo credit: Writer

Finally, the product can interpret handwritten notes and summarize them as text, but Habib says that requires training the model for individual use cases, such as medicine or insurance, for the accuracy to be there.

Habib says she doesn’t recommend using these tools without human review as part of the workflow. She believes this is essential because any model can hallucinate (make things up) or simply get facts wrong, and it is important for people to check the results. Although the company always recommends this to every customer, and most now understand it, she believes that at some point it will take a more automated workflow to enforce this consistently across all customers, which is what she says the company is working toward.

The company has raised $126 million to date, according to Crunchbase data, and is currently talking to major cloud infrastructure platforms about partnering to help scale the business. The most recent round was a $100 million Series B last September led by Iconiq.

The latest Palmyra version with image-to-text capabilities is available today.
