(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware
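
Here is a minimal sketch, assuming Hugging Face diffusers and peft, of the core LoRA step: attaching a low-rank adapter to the FLUX.1-dev transformer so only those weights train. The rank, alpha, and target modules are illustrative choices, not the article's exact recipe.

import torch
from diffusers import FluxPipeline
from peft import LoraConfig

# Load the base pipeline; bfloat16 keeps memory use closer to consumer-GPU budgets.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)

# Freeze the base transformer, then inject trainable low-rank adapters
# into its attention projections.
pipe.transformer.requires_grad_(False)
lora_config = LoraConfig(
    r=16,                       # low-rank dimension (illustrative)
    lora_alpha=16,
    init_lora_weights="gaussian",
    target_modules=["to_k", "to_q", "to_v", "to_out.0"],
)
pipe.transformer.add_adapter(lora_config)

trainable = sum(p.numel() for p in pipe.transformer.parameters() if p.requires_grad)
print(f"trainable LoRA parameters: {trainable:,}")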

Step Inside the Vault: The ‘Borderlands’ Series Arrives on GeForce NOW

GeForce NOW is throwing open the vault doors to welcome the legendary Borderlands series to the cloud. Whether a seasoned Vault Hunter or new to the mayhem of Pandora, prepare to experience the high-octane action and humor that define the series, which includes Borderlands Game of the Year Enhanced, Borderlands 2, Borderlands 3 and Borderlands:…
Read Article

Run Multimodal Extraction for More Efficient AI Pipelines Using One GPU

As enterprises generate and consume increasing volumes of diverse data, extracting insights from multimodal documents, like PDFs and presentations, has become a major challenge. Traditional text-only extraction and basic retrieval-augmented generation (RAG) pipelines fall short, failing to capture the full value of these complex documents. The result? Missed insights, inefficient workflows…
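
As a generic illustration (not the NVIDIA extraction pipeline described in the post), the sketch below uses PyMuPDF to pull both the text and the embedded images from each page of a PDF, so a downstream pipeline has more than plain text to index. The file path is a placeholder.

import fitz  # PyMuPDF

doc = fitz.open("quarterly_report.pdf")
pages = []
for page in doc:
    images = []
    for xref, *_ in page.get_images(full=True):
        images.append(doc.extract_image(xref)["image"])   # raw image bytes
    pages.append({
        "page": page.number,
        "text": page.get_text(),
        "images": images,
    })

print(f"extracted {len(pages)} pages, "
      f"{sum(len(p['images']) for p in pages)} embedded images")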

Source

Real-Time IT Incident Detection and Intelligence with NVIDIA NIM Inference Microservices and ITMonitron

In today’s fast-paced IT environment, not all incidents begin with obvious alarms. They may start as subtle, scattered signals: a missed alert, a quiet SLO breach, or a degraded service that slowly impacts users. Designed by the NVIDIA IT team, ITMonitron is an internal tool that helps make sense of these faint signals. By combining real-time telemetry with NVIDIA NIM inference microservices…
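
NIM microservices expose an OpenAI-compatible endpoint, so the basic pattern can be sketched with the standard openai client. The endpoint URL, model id, and telemetry strings below are illustrative assumptions, not ITMonitron's actual configuration.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",   # assumed local NIM endpoint
    api_key="not-needed-for-local-nim",
)

signals = [
    "alert: checkout-api p99 latency 2.3s (SLO 500ms)",
    "log: payment-service retries up 4x in 10 min",
]

resp = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",    # assumed NIM model id
    messages=[{
        "role": "user",
        "content": "Summarize these IT signals and flag a likely incident:\n"
                   + "\n".join(signals),
    }],
)
print(resp.choices[0].message.content)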

Source

Finding the Best Chunking Strategy for Accurate AI Responses

A chunking strategy is the method of breaking down large documents into smaller, manageable pieces for AI retrieval. It determines how effectively relevant information is fetched for accurate AI responses; poor chunking leads to irrelevant results, inefficiency, and reduced business value. With so many options available—page-level, section-level, or token-based chunking with various sizes—how do…
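
For a concrete feel of one of those options, here is a minimal token-based chunker with overlap. The whitespace tokenization and the chunk sizes are illustrative stand-ins, not a recommendation from the article.

def chunk_by_tokens(text, chunk_size=256, overlap=32):
    """Split text into overlapping chunks of roughly `chunk_size` tokens."""
    tokens = text.split()                  # whitespace "tokens" for simplicity
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break                          # last window reached the end
    return chunks

chunks = chunk_by_tokens("some long document text " * 500)
print(len(chunks), "chunks;", len(chunks[0].split()), "tokens in the first")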

Source

How Early Access to NVIDIA GB200 Systems Helped LMArena Build a Model to Evaluate LLMs

LMArena at the University of California, Berkeley is making it easier to see which large language models excel at specific tasks, thanks to help from NVIDIA and Nebius. Its rankings, powered by the Prompt-to-Leaderboard (P2L) model, collect votes from humans on which AI performs best in areas such as math, coding, or creative writing. “We capture user preferences across tasks and apply…
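
The ranking idea can be illustrated with a plain, global Bradley-Terry fit over pairwise votes; this is a simplification with made-up votes, not the per-prompt P2L model itself.

import numpy as np

# Hypothetical vote data: (winner_index, loser_index) pairs for 3 models.
votes = [(0, 1), (0, 2), (1, 2), (0, 1), (2, 1)]
n_models = 3

scores = np.zeros(n_models)              # log-strengths, one per model
lr = 0.1
for _ in range(500):
    grad = np.zeros(n_models)
    for w, l in votes:
        # P(winner beats loser) under Bradley-Terry with a logistic link.
        p = 1.0 / (1.0 + np.exp(scores[l] - scores[w]))
        grad[w] += 1.0 - p
        grad[l] -= 1.0 - p
    scores += lr * grad                   # gradient ascent on the log-likelihood

ranking = np.argsort(-scores)            # best model first
print(ranking, scores.round(2))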

Source

Compiler Explorer: The Kernel Playground for CUDA Developers

Have you ever wondered exactly what the CUDA compiler generates when you write GPU kernels? Ever wanted to share a minimal CUDA example with a colleague effortlessly, without the need for them to install a specific CUDA toolkit version first? Or perhaps you’re completely new to CUDA and looking for an easy way to start without needing to install anything or even having a GPU on hand?
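
Compiler Explorer also has a public REST API, so it can be driven from a script. The sketch below is hedged: the compiler id is an assumption; the available CUDA compilers can be listed first via GET https://godbolt.org/api/compilers/cuda.

import requests

kernel = r"""
__global__ void axpy(float a, const float* x, float* y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}
"""

resp = requests.post(
    "https://godbolt.org/api/compiler/nvcc125u1/compile",   # assumed compiler id
    json={"source": kernel, "options": {"userArguments": "-O3 -arch=sm_80"}},
    headers={"Accept": "application/json"},
    timeout=30,
)
resp.raise_for_status()
# The JSON payload returns the generated assembly as a list of {"text": ...} lines.
for line in resp.json().get("asm", []):
    print(line.get("text", ""))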

Source

Improved Performance and Monitoring Capabilities with NVIDIA Collective Communications Library 2.26

The NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multinode communication primitives optimized for NVIDIA GPUs and networking. NCCL is a central piece of software for multi-GPU deep learning training. It handles any kind of inter-GPU communication, be it over PCI, NVIDIA NVLink, or networking. It uses advanced topology detection, optimized communication graphs…
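
As a quick way to exercise NCCL without touching its C API, the sketch below uses PyTorch's torch.distributed with the NCCL backend to run a single-node all-reduce; the tensor values and launch parameters are illustrative.

# Launch with: torchrun --nproc_per_node=<num_gpus> allreduce_demo.py
import os
import torch
import torch.distributed as dist

def main():
    # torchrun sets RANK, WORLD_SIZE, and LOCAL_RANK for us.
    dist.init_process_group(backend="nccl")      # NCCL handles the GPU collectives
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)

    # Each rank contributes its own values; all_reduce sums them across GPUs.
    x = torch.ones(4, device="cuda") * (dist.get_rank() + 1)
    dist.all_reduce(x, op=dist.ReduceOp.SUM)
    print(f"rank {dist.get_rank()}: {x.tolist()}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()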

Source

AI in Manufacturing and Operations at NVIDIA: Accelerating ML Models with NVIDIA CUDA-X Data Science

NVIDIA leverages data science and machine learning to optimize chip manufacturing and operations workflows—from wafer fabrication and circuit probing to packaged chip testing. These stages generate terabytes of data, and turning that data into actionable insights at speed and scale is critical to ensuring quality, throughput, and cost efficiency. Over the years, we’ve developed robust ML pipelines…
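
As a generic sketch of the CUDA-X Data Science pattern (cuDF for GPU dataframes, cuML for GPU machine learning), not NVIDIA's internal pipeline, with a hypothetical file and column names:

import cudf
from cuml.ensemble import RandomForestClassifier

# Load hypothetical test-floor measurements straight into GPU memory.
df = cudf.read_parquet("wafer_test_results.parquet")

features = df.drop(columns=["pass_fail"])
labels = df["pass_fail"].astype("int32")     # cuML classifiers expect int32 labels

# Train a GPU random forest; hyperparameters are placeholders.
model = RandomForestClassifier(n_estimators=100, max_depth=12)
model.fit(features, labels)

print("train accuracy:", model.score(features, labels))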

Source

Benchmarking LLM Inference Costs for Smarter Scaling and Deployment

This is the third post in the large language model latency-throughput benchmarking series, which aims to instruct developers on how to determine the cost of LLM inference by estimating the total cost of ownership (TCO). See LLM Inference Benchmarking: Fundamental Concepts for background knowledge on common metrics for benchmarking and parameters. See LLM Inference Benchmarking Guide: NVIDIA…
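
The core TCO arithmetic the series builds toward can be sketched in a few lines; the dollar figures and throughput below are placeholders, not benchmark results.

def cost_per_million_tokens(gpu_hourly_cost_usd, num_gpus, throughput_tokens_per_sec):
    """Serving cost per one million generated tokens for a deployment."""
    tokens_per_hour = throughput_tokens_per_sec * 3600
    hourly_cost = gpu_hourly_cost_usd * num_gpus
    return hourly_cost / tokens_per_hour * 1_000_000

# Example: 8 GPUs at $3.00/GPU-hour sustaining 12,000 tokens/s in aggregate.
print(f"${cost_per_million_tokens(3.00, 8, 12_000):.2f} per 1M tokens")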

Source