Categories
Misc

Microsoft Experience Centers Display Scalable, Real-Time Graphics With NVIDIA RTX and Mosaic Technology

When customers walk into a Microsoft Experience Center in New York City, Sydney or London, they’re instantly met with stunning graphics displayed on multiple screens and high-definition video walls inside a multi-story building. Built to showcase the latest technologies, Microsoft Experience Centers surround customers with vibrant, immersive graphics as they explore new products, watch technical…

Categories
Misc

Upcoming Event: Guide to Minimizing Jetson Disk Usage

Learn the steps for reducing disk usage on NVIDIA Jetson in this webinar on November 1.

Categories
Misc

Make Gaming a Priority: Special Membership Discount Hits GeForce NOW for Limited Time

This spook-tacular Halloween edition of GFN Thursday features a special treat: 40% off a six-month GeForce NOW Priority Membership — get it for just $29.99 for a limited time. Several sweet new games are also joining the GeForce NOW library. Creatures of the night can now stream vampire survival game V Rising from the cloud.

Categories
Misc

Upcoming Event: Improve Your Cybersecurity Posture with AI

Find out how federal agencies are adopting AI to improve cybersecurity in this November 16 webinar featuring Booz Allen Hamilton.

Categories
Misc

Explainer: What Is Edge AI and How Does It Work?

Edge AI is the deployment of AI applications in devices throughout the physical world. It’s called “edge AI” because the AI computation is done near the user at the edge of the network, close to where the data is located, rather than centrally in a cloud computing facility or private data center.

Categories
Offsites

Natural Language Assessment: A New Framework to Promote Education

Whether it’s a professional honing their skills or a child learning to read, coaches and educators play a key role in assessing the learner’s answer to a question in a given context and guiding them towards a goal. These interactions have unique characteristics that set them apart from other forms of dialogue, yet are not available when learners practice alone at home. In the field of natural language processing, this type of capability has not received much attention and is technologically challenging. We set out to explore how we can use machine learning to assess answers in a way that facilitates learning.

In this blog, we introduce an important natural language understanding (NLU) capability called Natural Language Assessment (NLA), and discuss how it can be helpful in the context of education. While typical NLU tasks focus on the user’s intent, NLA allows for the assessment of an answer from multiple perspectives. In situations where a user wants to know how good their answer is, NLA can offer an analysis of how close the answer is to what is expected. In situations where there may not be a “correct” answer, NLA can offer subtle insights that include topicality, relevance, verbosity, and beyond. We formulate the scope of NLA, present a practical model for carrying out topicality NLA, and showcase how NLA has been used to help job seekers practice answering interview questions with Google’s new interview prep tool, Interview Warmup.

Overview of Natural Language Assessment (NLA)

The goal of NLA is to evaluate the user’s answer against a set of expectations. Consider the following components for an NLA system interacting with students (a minimal code sketch of these components follows the list):

  • A question presented to the student
  • Expectations that define what we expect to find in the answer (e.g., a concrete textual answer, a set of topics we expect the answer to cover, conciseness)
  • An answer provided by the student
  • An assessment output (e.g., correctness, missing information, too specific or general, stylistic feedback, pronunciation, etc.)
  • [Optional] A context (e.g., a chapter in a book or an article)
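
A minimal sketch of how these components might be represented in code; the field names and types below are illustrative assumptions, not part of any published API:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class NLAInput:
    """Hypothetical container for one NLA interaction."""
    question: str                      # the question presented to the student
    expectations: List[str]            # e.g., a reference answer or a set of expected topics
    answer: str                        # the answer provided by the student
    context: Optional[str] = None      # optional context, e.g., a chapter in a book

@dataclass
class NLAAssessment:
    """Hypothetical assessment output covering several perspectives."""
    covered_topics: List[str] = field(default_factory=list)
    missing_information: List[str] = field(default_factory=list)
    too_general: bool = False
    uncertain: bool = False            # e.g., the student hedged ("I am not exactly sure...")
```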

With NLA, both the expectations about the answer and the assessment of the answer can be very broad. This enables teacher-student interactions that are more expressive and subtle. Here are two examples:

  1. A question with a concrete correct answer: Even in situations where there is a clear correct answer, it can be helpful to assess the answer more subtly than simply correct or incorrect. Consider the following:

    Context: Harry Potter and the Philosopher’s Stone
    Question: “What is Hogwarts?”
    Expectation: “Hogwarts is a school of Witchcraft and Wizardry” [expectation is given as text]
    Answer: “I am not exactly sure, but I think it is a school.”

    The answer may be missing salient details, but labeling it as incorrect wouldn’t be entirely accurate or useful to the user. NLA can offer a more subtle understanding by, for example, identifying that the student’s answer is too general, and also that the student is uncertain.

    Illustration of the NLA process from input question, answer and expectation to assessment output

    This kind of subtle assessment, along with noting the uncertainty the student expressed, can be important in helping students build skills in conversational settings.

  2. Topicality expectations: There are many situations in which a concrete answer is not expected. For example, if a student is asked an opinion question, there is no concrete textual expectation. Instead, there’s an expectation of relevance and opinionation, and perhaps some level of succinctness and fluency. Consider the following interview practice setup:

    Question: “Tell me a little about yourself?”
    Expectations: { “Education”, “Experience”, “Interests” } (a set of topics)
    Answer: “Let’s see. I grew up in the Salinas valley in California and went to Stanford where I majored in economics but then got excited about technology so next I ….”

    In this case, a useful assessment output would map the user’s answer to a subset of the topics covered, possibly along with a markup of which parts of the text relate to which topic. This can be challenging from an NLP perspective as answers can be long, topics can be mixed, and each topic on its own can be multi-faceted.

A Topicality NLA Model

In principle, topicality NLA is a standard multi-class task for which one can readily train a classifier using standard techniques. However, training data for such scenarios is scarce, and it would be costly and time-consuming to collect for each question and topic. Our solution is to break each topic into granular components that can be identified using large language models (LLMs) with straightforward generic tuning.

We map each topic to a list of underlying questions and define that a sentence covers a topic if it contains an answer to one of that topic’s underlying questions. For the topic “Experience” we might choose underlying questions such as:

  • Where did you work?
  • What did you study?

While for the topic “Interests” we might choose underlying questions such as:

  • What are you interested in?
  • What do you enjoy doing?

These underlying questions are designed through an iterative manual process. Importantly, since these questions are sufficiently granular, current language models (see details below) can capture their semantics. This allows us to offer a zero-shot setting for the NLA topicality task: once trained (more on the model below), it is easy to add new questions and new topics, or adapt existing topics by modifying their underlying content expectations, without the need to collect topic-specific data. The figure below shows the model’s predictions for the sentence “I’ve worked in retail for 3 years” for the two topics described above:

A diagram of how the model uses underlying questions to predict the topic most likely to be covered by the user’s answer.

Since an underlying question for the topic “Experience” was matched, the sentence would be classified as “Experience”.
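
A minimal sketch of this zero-shot scheme, assuming a hypothetical compatibility_score function that stands in for the trained <underlying question, answer> model described in the next section; the topic lists come from the examples above, and the threshold is an illustrative assumption:

```python
from typing import Dict, List

# Topics mapped to manually designed underlying questions (examples from this post).
TOPIC_QUESTIONS: Dict[str, List[str]] = {
    "Experience": ["Where did you work?", "What did you study?"],
    "Interests": ["What are you interested in?", "What do you enjoy doing?"],
}

def compatibility_score(underlying_question: str, sentence: str) -> float:
    """Stand-in for the trained <underlying question, answer> compatibility model."""
    raise NotImplementedError("Replace with a call to the trained model.")

def detect_topics(sentence: str, threshold: float = 0.5) -> List[str]:
    """Label the sentence with every topic whose underlying questions it answers."""
    detected = []
    for topic, questions in TOPIC_QUESTIONS.items():
        if any(compatibility_score(q, sentence) >= threshold for q in questions):
            detected.append(topic)
    return detected

# With a real scoring model plugged in, detect_topics("I've worked in retail for 3 years")
# would be expected to return ["Experience"], matching the example above.
```

Adding a new topic then amounts to adding an entry to the mapping of underlying questions; no topic-specific training data is needed.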

Application: Helping Job Seekers Prepare for Interviews

Interview Warmup is a new tool developed in collaboration with job seekers to help them prepare for interviews in fast-growing fields of employment such as IT Support and UX Design. It allows job seekers to practice answering questions selected by industry experts and to become more confident and comfortable with interviewing. Working with job seekers to understand the challenges they face in preparing for interviews, and how an interview practice tool could be most useful, inspired our research and the application of topicality NLA.

We build the topicality NLA model (once for all questions and topics) as follows: we train an encoder-only T5 model (EncT5 architecture) with 350 million parameters on question-answer data to predict the compatibility of an <underlying question, answer> pair. We rely on data from SQuAD 2.0, which was processed to produce <question, answer, label> triplets.
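
The post doesn’t detail how SQuAD 2.0 was processed into triplets; the sketch below shows one plausible scheme (an assumption, not the team’s actual pipeline), pairing each question with one of its gold answers as a positive and with an answer from another question in the same paragraph as a negative:

```python
import json
import random
from typing import List, Tuple

def squad2_to_triplets(path: str, seed: int = 0) -> List[Tuple[str, str, int]]:
    """Hypothetical preprocessing: derive <question, answer, label> triplets from SQuAD 2.0.

    Positives pair a question with one of its gold answers (label 1); negatives pair it
    with an answer belonging to a different question in the same paragraph (label 0).
    """
    rng = random.Random(seed)
    with open(path) as f:
        squad = json.load(f)

    triplets: List[Tuple[str, str, int]] = []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            qas = paragraph["qas"]
            # All gold answer texts in this paragraph, used for negative sampling.
            all_answers = [a["text"] for qa in qas for a in qa.get("answers", [])]
            for qa in qas:
                answers = [a["text"] for a in qa.get("answers", [])]
                if answers:  # answerable question -> positive example
                    triplets.append((qa["question"], answers[0], 1))
                negatives = [a for a in all_answers if a not in answers]
                if negatives:  # mismatched answer -> negative example
                    triplets.append((qa["question"], rng.choice(negatives), 0))
    return triplets
```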

In the Interview Warmup tool, users can switch between talking points to see which ones were detected in their answer.

The tool does not grade or judge answers. Instead, it enables users to practice and identify ways to improve on their own. After a user replies to an interview question, their answer is parsed sentence by sentence with the topicality NLA model. They can then switch between different talking points to see which ones were detected in their answer. We know that there are many potential pitfalls in signaling to a user that their response is “good”, especially as we only detect a limited set of topics. Instead, we keep the control in the user’s hands and only use ML to help users make their own discoveries about how to improve.
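
At inference time, the sentence-by-sentence parsing might be wired up roughly as follows; the naive sentence splitter and the detect_topics stand-in are illustrative assumptions, not the production implementation:

```python
import re
from typing import Dict, List

def detect_topics(sentence: str) -> List[str]:
    """Stand-in for the topicality NLA model applied to a single sentence."""
    raise NotImplementedError("Replace with the trained model.")

def talking_points_covered(answer: str) -> Dict[str, List[str]]:
    """Map each detected talking point to the sentences that triggered it."""
    # Naive sentence splitting; a production system would use a proper segmenter.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    covered: Dict[str, List[str]] = {}
    for sentence in sentences:
        for topic in detect_topics(sentence):
            covered.setdefault(topic, []).append(sentence)
    return covered
```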

So far, the tool has had great results helping job seekers around the world, including in the US, and we have recently expanded it to Africa. We plan to continue working with job seekers to iterate and make the tool even more helpful to the millions of people searching for new jobs.

A short film showing how Interview Warmup and its NLA capabilities were developed in collaboration with job seekers.

Conclusion

Natural Language Assessment (NLA) is a technologically challenging and interesting research area. It paves the way for new conversational applications that promote learning by enabling the nuanced assessment and analysis of answers from multiple perspectives. Working together with communities, from job seekers and businesses to classroom teachers and students, we can identify situations where NLA has the potential to help people learn, engage, and develop skills across an array of subjects, and we can responsibly build applications that empower users to assess their own abilities and discover ways to improve.

Acknowledgements

This work is made possible through a collaboration spanning several teams across Google. We’d like to acknowledge contributions from Google Research Israel, Google Creative Lab, and Grow with Google teams among others.

Categories
Misc

Jetson-Driven Grub Getter: Cartken Rolls Out Robots-as-a-Service for Deliveries

There’s a new sidewalk-savvy robot, and it’s delivering coffee, grub and a taste of fun. The bot is garnering interest for Oakland, Calif., startup Cartken. The company, founded in 2019, has rapidly deployed robots for a handful of customer applications, including for Starbucks and Grubhub deliveries. Cartken CEO Chris Bersch said that he and co-founders…

Categories
Misc

MONAI Drives Medical AI on Google Cloud with Medical Imaging Suite

Medical imaging is an essential instrument for healthcare, powering screening, diagnostics, and treatment workflows around the world. Innovations and breakthroughs in computer vision are transforming the healthcare landscape with new SDKs accelerating this renaissance. 

MONAI, the Medical Open Network for AI, houses many of these SDKs in its open-source suite built to drive medical AI workflows. To learn more about MONAI, see Open-Source Healthcare AI Innovation Continues to Expand with MONAI v1.0.

To run these SDKs in the cloud and connect them to the medical imaging ecosystem, organizations need platforms that are accessible, secure, and strategically integrated into infrastructure such as storage and networking.

The recently announced Google Cloud Medical Imaging Suite is one such platform. It enables the development of imaging AI to support faster, more accurate diagnoses, increase productivity for healthcare workers, and improve access to better care and outcomes for patients. Google Cloud has adopted MONAI into its Medical Imaging Suite, providing radiologists and pathologists with critical and compelling tools that simplify the development and adoption of AI in their clinical practice.

Data interoperability for medical imaging workflows

The Google Cloud Medical Imaging Suite addresses common pain points organizations face in developing artificial intelligence and machine learning (ML) models, and it uses AI and ML to enable data interoperability. It includes services for imaging storage with the Cloud Healthcare API, allowing easy and secure data exchange using DICOMweb. The enterprise-grade development environment is fully managed, highly scalable, and includes services for de-identification.
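
As an illustration of that DICOMweb exchange, the sketch below issues a QIDO-RS study search against a Cloud Healthcare API DICOM store; the project, location, dataset, and store names are placeholders:

```python
import google.auth
import google.auth.transport.requests
import requests

# Placeholder resource names; replace with your own project and DICOM store.
BASE = (
    "https://healthcare.googleapis.com/v1/projects/my-project/locations/us-central1/"
    "datasets/my-dataset/dicomStores/my-dicom-store/dicomWeb"
)

# Obtain an OAuth token from Application Default Credentials.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())

# QIDO-RS: search for CT studies, returned as DICOM JSON.
response = requests.get(
    f"{BASE}/studies",
    headers={
        "Authorization": f"Bearer {credentials.token}",
        "Accept": "application/dicom+json",
    },
    params={"ModalitiesInStudy": "CT", "limit": "10"},
)
response.raise_for_status()
studies = response.json() if response.status_code == 200 else []  # 204 means no matches
print(f"Found {len(studies)} studies")
```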

The Medical Imaging Suite also includes Imaging Lab, helping to automate the highly manual and repetitive task of labeling medical images with AI-assisted annotation tools from MONAI. The Google Cloud Medical Imaging Lab is an extension of the base Jupyter environment which is packaged with the Google Cloud Deep Learning VM (DLVM) product. 

This extension is accomplished by adding software packages to the base DLVM image that provide graphical capabilities in the Jupyter environment. This makes it possible to develop Python notebooks that interact with several medical imaging applications. The graphical environment includes the popular image analysis application 3D Slicer, pre-installed with the MONAILabel plugin. 

Figure 1. The different software layers of the Google Cloud Medical Imaging Lab package

The Jupyter-based architecture allows data scientists to leverage the power of the Python language, including PyTorch models, and to quickly visualize the results using graphical applications such as 3D Slicer. The MONAILabel server is configured with secure access to the Google Cloud Healthcare API so that it can store images and the results of image annotation in the DICOM format. 
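
Writing an annotation result back into the same DICOM store follows the DICOMweb STOW-RS pattern; a sketch under the same placeholder names, assuming the annotation workflow has produced a DICOM Segmentation (SEG) file:

```python
import google.auth
import google.auth.transport.requests
import requests

DICOMWEB = (
    "https://healthcare.googleapis.com/v1/projects/my-project/locations/us-central1/"
    "datasets/my-dataset/dicomStores/my-dicom-store/dicomWeb"
)

credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())

# STOW-RS: store a DICOM Segmentation object produced by the annotation step.
with open("segmentation.dcm", "rb") as f:
    response = requests.post(
        f"{DICOMWEB}/studies",
        headers={
            "Authorization": f"Bearer {credentials.token}",
            "Content-Type": "application/dicom",
            "Accept": "application/dicom+json",
        },
        data=f.read(),
    )
response.raise_for_status()
```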

Figure 2. The end-to-end deployment of the Google Cloud Medical Imaging Lab 

The Google Cloud Medical Imaging Suite also includes services to build cohorts and image datasets, enabling organizations to view and search petabytes of imaging data to perform advanced analytics and create training datasets with zero operational overhead using BigQuery and Looker. 

Imaging AI pipelines help accelerate the development of scalable AI models, while imaging deployment offers flexible options for running models in the cloud, on-premises, or at the edge. Both services are included in the suite and allow organizations to meet diverse sovereignty, data security, and privacy requirements while providing centralized management and policy enforcement.

Transforming the end-to-end medical AI lifecycle

MONAI provides a suite of open source tools for training, labeling, and deploying medical models into the imaging ecosystem. With regular updates and feature releases, MONAI continues to add critical and compelling components to simplify the development and adoption of AI into clinical practice. 

MONAI adds critical-path services to the Google Cloud Medical Imaging Suite, providing data scientists and developers on the Medical Imaging Suite with the following:

  • MONAI Label: Integrated into medical-grade imaging viewers like OHIF and 3D Slicer, with support for pathology and enterprise imaging viewers. Users can quickly create an active learning annotation framework to segment organs and pathologies in seconds. This establishes ground truth that can drive model training.
  • MONAI Core: A PyTorch-based library for deep learning that includes the domain-optimized capabilities data scientists and researchers need to develop medical imaging training workflows. Use the MONAI Bundle, a self-contained model package with pretrained weights and training scripts, to quickly start fine-tuning a model (see the sketch after this list).
  • MONAI Deploy: Delivers a quick, easy, and standardized way to define a model specification using an industry standard called MONAI Deploy Application Packages (MAPs). Turn a model into an application and run the application in a real-world clinical environment.
  • MONAI Model Zoo: A hub for sharing pretrained models, enabling data scientists and clinical researchers to jump-start their AI development. Browse the Model Zoo for a model that can support your training, or submit your model to help further MONAI’s goal of a common standard for reproducible research and collaboration.
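
As referenced in the MONAI Core item above, here is a minimal sketch of pulling a pretrained bundle from the Model Zoo and loading its network for fine-tuning; the bundle name is an example, and the exact bundle API calls should be checked against your MONAI version:

```python
import torch
from monai.bundle import download, load

# Download an example pretrained bundle from the MONAI Model Zoo.
download(name="spleen_ct_segmentation", bundle_dir="./bundles")

# Load the bundle's network with its pretrained weights; the result is a torch.nn.Module.
model = load(name="spleen_ct_segmentation", bundle_dir="./bundles")

# Fine-tune with ordinary PyTorch.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
model.train()
```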

MONAI, integrated into the Google Cloud Medical Imaging Suite, is poised to transform the end-to-end medical AI lifecycle. From labeling data to training models and running them at scale, MONAI on the Medical Imaging Suite is being integrated into medical ecosystems using interoperable industry standards, backed by hardware and software services in the cloud.

Visit the Google Cloud Medical Imaging Suite to get started.

Categories
Offsites

Open Images V7 — Now Featuring Point Labels

Open Images is a computer vision dataset covering ~9 million images with labels spanning thousands of object categories. Researchers around the world use Open Images to train and evaluate computer vision models. Since the initial release of Open Images in 2016, which included image-level labels covering 6k categories, we have provided multiple updates to enrich annotations and expand the potential use cases of the dataset. Through several releases, we have added image-level labels for over 20k categories on all images and bounding box annotations, visual relations, instance segmentations, and localized narratives (synchronized voice, mouse trace, and text caption) on a subset of 1.9M images.

Today, we are happy to announce the release of Open Images V7, which expands the Open Images dataset even further with a new annotation type called point-level labels and includes a new all-in-one visualization tool that allows a better exploration of the rich data available.

Point Labels

The main strategy used to collect the new point-level label annotations leveraged suggestions from a machine learning (ML) model and human verification. First, the ML model selected points of interest and asked a yes or no question, e.g., “is this point on a pumpkin?”. Then, human annotators spent an average of 1.1 seconds answering the yes or no questions. We aggregated the answers from different annotators over the same question and assigned a final “yes”, “no”, or “unsure” label to each annotated point.

Illustration of the annotations interface.
(Image by Lenore Edman, under CC BY 2.0 license)

For each annotated image, we provide a collection of points, each with a “yes” or “no” label for a given class. These points provide sparse information that can be used for the semantic segmentation task. We collected a total of 38.6M new point annotations (12.4M with “yes” labels) that cover 5.8 thousand classes and 1.4M images.
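
As a hedged sketch of how such sparse points can supervise training, the loss below evaluates per-class binary cross-entropy only at the annotated points and ignores every unannotated pixel; the tensor shapes and setup are illustrative assumptions, not the paper’s training recipe:

```python
import torch
import torch.nn.functional as F

def sparse_point_loss(logits: torch.Tensor,
                      ys: torch.Tensor, xs: torch.Tensor,
                      classes: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Binary cross-entropy evaluated only at annotated points.

    logits:  (C, H, W) per-class scores for one image.
    ys, xs:  (N,) integer coordinates of the N annotated points.
    classes: (N,) integer index of the class each point was asked about.
    labels:  (N,) float tensor, 1.0 for a "yes" answer, 0.0 for a "no" answer.
    """
    point_logits = logits[classes, ys, xs]  # logit of the queried class at each point
    return F.binary_cross_entropy_with_logits(point_logits, labels)
```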

By focusing on point labels, we expanded the number of images annotated and categories covered. We also concentrated the efforts of our annotators on efficiently collecting useful information. Compared to our instance segmentation, the new points include 16x more classes and cover more images. The new points also cover 9x more classes than our box annotations. Compared to existing segmentation datasets, like PASCAL VOC, COCO, Cityscapes, LVIS, or ADE20K, our annotations cover more classes and more images than previous work. The new point label annotations are the first type of annotation in Open Images that provides localization information for both things (countable objects, like cars, cats, and catamarans), and stuff categories (uncountable objects like grass, granite, and gravel). Overall, the newly collected data is roughly equivalent to two years of human annotation effort.

Our initial experiments show that this type of sparse data is suitable for both training and evaluating segmentation models. Training a model directly on sparse data allows us to reach comparable quality to training on dense annotations. Similarly, we show that one can directly compute the traditional semantic segmentation intersection-over-union (IoU) metric over sparse data. The ranking across different methods is preserved, and the sparse IoU values are an accurate estimate of their dense counterparts. See our paper for more details.
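
A hedged sketch of the sparse IoU idea for a single class: count true positives, false positives, and false negatives only where point annotations exist (the estimator in the paper may differ in its details):

```python
import numpy as np

def sparse_iou(pred_mask: np.ndarray,
               ys: np.ndarray, xs: np.ndarray,
               labels: np.ndarray) -> float:
    """Estimate IoU for one class from sparse point labels.

    pred_mask: (H, W) boolean prediction for the class.
    ys, xs:    integer coordinates of the annotated points for this class.
    labels:    boolean array, True for "yes" points, False for "no" points.
    """
    pred_at_points = pred_mask[ys, xs]
    tp = np.sum(pred_at_points & labels)    # predicted positive on a "yes" point
    fp = np.sum(pred_at_points & ~labels)   # predicted positive on a "no" point
    fn = np.sum(~pred_at_points & labels)   # missed a "yes" point
    denom = tp + fp + fn
    return float(tp / denom) if denom else float("nan")
```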

Below, we show four example images with their point-level labels, illustrating the rich and diverse information these annotations provide. Circles ⭘ are “yes” labels, and squares are “no” labels.

Four example images with point-level labels.
Images by Richie Diesterheft, John AM Nueva, Sarah Ackerman, and C Thomas, all under CC BY 2.0 license.

New Visualizers

In addition to the new data release, we also expanded the available visualizations of the Open Images annotations. The Open Images website now includes dedicated visualizers to explore the localized narratives annotations, the new point-level annotations, and a new all-in-one view. This new all-in-one view is available for the subset of 1.9M densely annotated images and allows one to explore the rich annotations that Open Images has accumulated over seven releases. On average, these images have 6.7 image labels (classes), 8.3 boxes, 1.7 relations, 1.5 masks, 0.4 localized narratives, and 34.8 point labels each.

Below, we show two example images with various annotations in the all-in-one visualizer. The figures show the image-level labels, bounding boxes, box relations, instance masks, localized narrative mouse trace and caption, and point-level labels. Classes marked with + have positive annotations (of any kind), while classes marked with − have only negative annotations (image-level or point-level).

Two example images with various annotations in the all-in-one visualizer.
Images by Jason Paris, and Rubén Vique, all under CC BY 2.0 license.

Conclusion

We hope that this new data release will enable computer vision research to cover ever more diverse and challenging scenarios. As the quality of automated semantic segmentation models improves over common classes, we want to move towards the long tail of visual concepts, and sparse point annotations are a step in that direction. More and more works are exploring how to use such sparse annotations (e.g., as supervision for instance segmentation or semantic segmentation), and Open Images V7 contributes to this research direction. We are looking forward to seeing what you will build next.

Acknowledgements

Thanks to Vittorio Ferrari, Jordi Pont-Tuset, Alina Kuznetsova, Ashlesha Sadras, and the annotators team for their support creating this new data release.

Categories
Misc

Upcoming Event: Retail Edge Computing 101: An Introduction to the Edge for Retail

Join us on November 9 for the Retail Edge Computing 101: An Introduction to the Edge for Retail webinar to get everything you wanted to know about the edge from the leader in AI and accelerated computing.