If you wanted to raise awareness of your big tech company and had $10 million to spend, how would you spend it? On a Super Bowl ad? An F1 sponsorship?

You'd spend it training a generative AI model. Generative models may not be marketing in the traditional sense, but they attract attention, and they're increasingly becoming part of vendors' standard services and products.

Case in point: Databricks' DBRX, a new generative AI model announced today that's comparable to OpenAI's GPT series and Google's Gemini. A base version (DBRX Base) and a fine-tuned version (DBRX Instruct) are available on GitHub and the AI development platform Hugging Face for research and commercial use, and can be run and fine-tuned on public, custom, or otherwise proprietary data.
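For readers who want to poke at the open weights themselves, here is a minimal sketch of loading the instruct variant with the Hugging Face transformers library. It assumes the weights live in a databricks/dbrx-instruct repository, that you've accepted the model's license and authenticated with Hugging Face, and that you have enough GPU memory (Databricks points to multiple H100-class GPUs, as discussed below); none of these specifics are confirmed by this article.

```python
# Illustrative only: the repository name, chat template, and hardware
# assumptions below are not confirmed by the article.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-instruct"  # assumed Hugging Face repo; may be gated behind a license

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to reduce memory use
    device_map="auto",           # spread layers across available GPUs
)

# Build a simple chat prompt and generate a short completion.
messages = [{"role": "user", "content": "Summarize what a mixture-of-experts model is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```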

“DBRX was trained to be useful and provide information on a wide variety of topics,” said Naveen Rao, VP of generative AI at Databricks, in an interview with TechCrunch. “DBRX has been optimized and tuned for English, but it's capable of conversing in, and translating into, a wide range of languages, such as French, Spanish and German.”

Databricks describes DBRX as “open source,” in the same vein as “open source” models like Meta’s Llama 2 and AI startup Mistral’s models. (Whether these models truly meet the definition of open source is the subject of vigorous debate.)

Databricks says it spent about $10 million and eight months training DBRX, which it claims (quoting a press release) “outperforms all existing open source models on standard benchmarks.”

But, and here's the marketing rub, it's exceptionally difficult to use DBRX unless you're a Databricks customer.

That's because running DBRX in its standard configuration requires a server or PC with at least four Nvidia H100 GPUs. A single H100 costs thousands of dollars, and quite possibly much more. That may be pocket change for the average enterprise, but for many developers and solopreneurs it's well out of reach.

And there's fine print. According to Databricks, companies with more than 700 million active users will face “certain limitations” comparable to Meta's for Llama 2, and all users must agree to terms ensuring that they use DBRX “responsibly.” (Databricks hadn't volunteered the specifics of those terms as of publication time.)

Databricks pitches its Mosaic AI Foundation Model product as the managed answer to these obstacles: in addition to running DBRX and other models, it provides a training stack for fine-tuning DBRX on custom data. Customers can privately host DBRX using Databricks' Model Serving offering, Rao suggested, or they can work with Databricks to deploy DBRX on the hardware of their choice.

Rao added:

We're focused on making the Databricks platform the best choice for custom model building, so ultimately the benefit to Databricks is more users on our platform. DBRX is a demonstration of our best-in-class pre-training and tuning platform, which customers can use to build their own models from scratch. It's an easy way for customers to get started with Databricks Mosaic AI's generative AI tools. And DBRX is highly capable out of the box and can be tuned for excellent performance on specific tasks at better economics than large, closed models.

Databricks says that DBRX runs up to 2x faster than Llama 2, partly thanks to its mixture-of-experts (MoE) architecture. MoE, which DBRX shares with Mistral's newer models and Google's recently announced Gemini 1.5 Pro, essentially breaks data processing tasks into multiple subtasks and then delegates those subtasks to smaller, specialized “expert” models; because only a few experts are active for any given input, the model can run faster than a dense model of comparable size.

Most MoE models have eight experts. DBRX has 16, which Databricks says improves quality.
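To make the idea concrete, here is a toy PyTorch sketch of an MoE feed-forward layer with learned top-k routing. The layer sizes, the top_k value, and the routing scheme are illustrative assumptions for demonstration only; this is not DBRX's actual architecture.

```python
# A toy mixture-of-experts layer: a router scores each token against every
# expert, the top-k experts are selected per token, and their outputs are
# combined with softmax-normalized weights. Dimensions are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, n_experts=16, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # per-token routing scores
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model)
            )
            for _ in range(n_experts)
        ])

    def forward(self, x):                        # x: (batch, seq, d_model)
        scores = self.router(x)                  # (batch, seq, n_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            idx = chosen[..., slot]              # which expert each token picked in this slot
            w = weights[..., slot].unsqueeze(-1)
            for e, expert in enumerate(self.experts):
                mask = (idx == e).unsqueeze(-1)  # tokens routed to expert e
                if mask.any():
                    out = out + mask * w * expert(x)
        return out

layer = MoELayer()
tokens = torch.randn(2, 8, 512)                  # dummy batch of token embeddings
print(layer(tokens).shape)                       # torch.Size([2, 8, 512])
```

In a real MoE model, each token's computation only runs through its selected experts, which is what makes inference cheaper than a dense model of comparable total parameter count; the naive loop above applies every expert to every token for clarity rather than efficiency.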

However, quality is relative.

While Databricks claims that DBRX outperforms the Llama 2 and Mistral models on certain language understanding, programming, math, and logic benchmarks, DBRX falls short of arguably the leading generative AI model, OpenAI's GPT-4, in most areas outside of niche use cases like database programming language generation.

Rao admits that DBRX has other limitations as well, namely that, like every other generative AI model, it can fall victim to “hallucinating” answers to queries, despite Databricks' work on safety testing and red teaming. Because the model has simply been trained to associate words or phrases with certain concepts, if those associations aren't entirely accurate, its answers won't always be accurate.

Additionally, unlike some recent flagship generative AI models, including Gemini, DBRX is not multimodal. (It can only process and generate text, not images.) And we don't know exactly what sources of data were used to train it; Rao would only disclose that no Databricks customer data was used in training DBRX.

“We trained DBRX on a large amount of data from a diverse range of sources,” he added. “We used open data sets that the community knows, loves and uses every day.”

I asked Rao whether any of the DBRX training datasets were copyrighted or licensed, or showed obvious signs of bias (e.g. racial bias), and he didn't answer directly, saying only, “We've been careful about the data used, and conducted red teaming exercises to improve the model's weak spots.” Generative AI models have a tendency to regurgitate their training data, which is a major concern for commercial users of models trained on unlicensed, copyrighted, or blatantly biased data. In the worst-case scenario, a user could find themselves in ethical and legal trouble for unwittingly incorporating IP-infringing or biased work from a model into their projects.

Some companies that train and release generative AI models offer policies covering the legal fees arising from possible infringement. Databricks doesn't at present; Rao says the company is “exploring scenarios” under which it might.

Given these and the other ways in which DBRX misses the mark, the model seems like a tough sell to anyone but current or would-be Databricks customers. Databricks' rivals in generative AI, including OpenAI, offer equally if not more compelling technologies at very competitive pricing. And plenty of generative AI models come closer to the commonly understood definition of open source than DBRX.

Rao promises that Databricks will continue to refine DBRX and release new versions as the company's Mosaic Labs research and development team, the team behind DBRX, explores new generative AI avenues.

“DBRX is moving the open source model space forward and challenging us to build future models even more efficiently,” he said. “We'll be releasing variants as we apply techniques to improve output quality in terms of reliability, safety and bias … We see the open model as a platform on which our customers can build custom capabilities with our tools.”

Given where DBRX currently stands relative to its peers, it has an exceptionally long way to go.

This article was originally published at techcrunch.com