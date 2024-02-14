Nvidia is the first of the top four AI giants to release a free chatbot dedicated to local offline use. The new Chat with RTX app allows users to use open language models locally without connecting to the cloud. The only real drawback with "Chat with RTX" is its sky-high system requirements and huge download size, particularly when compared with cloud solutions. You'll need an RTX 30-series GPU or later with at least 8GB of VRAM and Windows 10 or 11 to run it.



Chat with RTX's main selling points are its offline capability and user customizability. Nvidia's chatbot lets you point it at your own files so it can produce answers tailored to your needs. You can import various file formats, including .txt, PDF, Word, and XML documents. YouTube videos can also be downloaded and parsed by the chatbot.

For example, I fed Chat with RTX two YouTube videos, one from MKBHD and one from Formula 1. I asked the chatbot about specific topics in each video, and it responded with concise answers taken directly from each video. When you ask a question that relates to a topic in a file or video you've loaded into the chatbot, it tells you which file or video its answer came from.



It's a pretty cool application that gives you more control than online solutions like Copilot and Gemini. However, because it runs locally, it won't search the internet for answers. Depending on the use case, this can be a positive or a negative.

(Image credit: Nvidia)

Chat with RTX uses the Mistral or Llama 2 open-source language models combined with Retrieval-Augmented Generation (RAG), a technique that retrieves relevant passages from your own data and feeds them to the model alongside your question. To bring it to desktops, Nvidia is using its TensorRT-LLM software to allow the app to better utilize the tensor cores of RTX 30- or RTX 40-series GPUs.
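
To make the RAG idea concrete, here is a minimal Python sketch of the retrieval step. It is not Nvidia's implementation: the file names, the sample text, and the helpers `retrieve` and `build_prompt` are all hypothetical, and simple word-count similarity stands in for the embedding search a real system would likely use. It only shows the general pattern of picking the most relevant source, reporting where the answer came from, and building a prompt for a language model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and split it into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two word-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents):
    # Return (source_name, text) for the document most similar to the question.
    q = Counter(tokenize(question))
    return max(
        documents.items(),
        key=lambda item: cosine(q, Counter(tokenize(item[1]))),
    )


def build_prompt(question, documents):
    # Put the retrieved passage in front of the question; a real app would
    # send this prompt to a local model such as Mistral or Llama 2.
    source, passage = retrieve(question, documents)
    prompt = f"Context from {source}:\n{passage}\n\nQuestion: {question}\nAnswer:"
    return source, prompt


# Hypothetical stand-ins for user files (assumed content, for illustration only).
documents = {
    "phone_review.txt": "The phone has a bright display and the battery lasts two days.",
    "race_recap.txt": "The driver won the race after a late pit stop for fresh tyres.",
}

source, prompt = build_prompt("How long does the battery last?", documents)
print(source)  # the file the answer would be drawn from
```

Because the retriever returns the source name along with the passage, the app can tell the user which file an answer came from, much like the behavior described earlier.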



The application is in its demonstration phase, so don't expect it to produce perfect answers. RTX 20-series owners are out of luck as well. It's not entirely clear why the original Turing architecture GPUs aren't supported, though Nvidia suggested time constraints were a factor and that support could come to 20-series cards in the future.