NVIDIA has released Chat with RTX, a tech demo of how AI chatbots can be run locally on Windows PCs using its RTX GPUs.

The standard way to use an AI chatbot is through an online platform like ChatGPT, or by running queries via an API, with inference happening on cloud computing servers. The drawbacks of this approach are cost, latency, and the privacy concerns of personal or corporate data traveling back and forth.

NVIDIA’s RTX range of GPUs now makes it possible to run an LLM locally on your Windows PC, even when you’re not connected to the internet.

Chat with RTX lets users create a customized chatbot using either Mistral or Llama 2. It combines retrieval-augmented generation (RAG) with NVIDIA’s inference-optimizing TensorRT-LLM software.

You can point Chat with RTX at a folder on your PC and then ask it questions about the files it contains. It supports various file formats, including .txt, .pdf, .doc/.docx, and .xml.
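To make the idea concrete, here is a minimal sketch of the general RAG pattern over a folder of local text files. It is purely illustrative and not NVIDIA’s implementation: the sentence-transformers embedding model is an assumption, and local_llm_generate is a hypothetical placeholder for a Mistral or Llama 2 model running locally via TensorRT-LLM.

```python
# Illustrative sketch of retrieval-augmented generation (RAG) over local files.
# Not NVIDIA's implementation; assumes the sentence-transformers package for
# embeddings. local_llm_generate() is a hypothetical stand-in for a local model.
from pathlib import Path
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def local_llm_generate(prompt: str) -> str:
    # Hypothetical placeholder: in Chat with RTX the answer would come from
    # Mistral or Llama 2 served locally via TensorRT-LLM.
    return "(local model output for: " + prompt[:80] + "...)"

def load_documents(folder: str) -> list[str]:
    # Read every .txt file in the folder; real tools also parse .pdf, .doc/.docx and .xml.
    return [p.read_text(errors="ignore") for p in Path(folder).glob("*.txt")]

def answer(question: str, folder: str, top_k: int = 3) -> str:
    docs = load_documents(folder)
    doc_vecs = embedder.encode(docs, normalize_embeddings=True)
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    # Vectors are normalized, so a dot product gives cosine similarity.
    scores = doc_vecs @ q_vec
    top_docs = [docs[i] for i in np.argsort(scores)[::-1][:top_k]]
    context = "\n\n".join(top_docs)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return local_llm_generate(prompt)
```

Everything here runs on the local machine: the documents, their embeddings, and the prompt never leave the PC, which is the same property that makes Chat with RTX fast and private.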

Because the LLM analyzes locally stored files with inference happening on your machine, responses are fast and none of your data travels over potentially unsecured networks.

You can also give it a YouTube video URL and ask it questions about the video. That requires internet access, but it’s a great way to get answers without having to watch a long video.

Chat with RTX is a free download, but you’ll need to be running Windows 10 or 11 on a PC with a GeForce RTX 30 Series GPU or higher and a minimum of 8GB of VRAM.

Chat with RTX is a demo rather than a finished product. It’s somewhat buggy and doesn’t remember context, so you can’t ask it follow-up questions. But it’s a nice preview of how we’ll use LLMs in the future.

Running an AI chatbot locally, with zero API call costs and very little latency, is likely how most users will eventually interact with LLMs. The open-source approach taken by companies like Meta means on-device AI could drive adoption of their free models rather than proprietary ones like GPT.

That said, mobile and laptop users may have to wait a while before the computing power of an RTX GPU fits into smaller devices.

This article was originally published at dailyai.com