University of Chicago Uses SciVis for a Billion Cells, COVID-19, and Invisible Monsters
Teodora (Dora) Szasz and Donna Nemshick discuss how scientific visualization (SciVis) is opening up new insights into healthcare and other research areas that change our understanding of the world. It’s an evolving field where extracting visualizations from large datasets can improve analysis, diagnosis, and patient outcomes while reducing costs. SciVis has been transformative in unveiling a billion simulated cells for tumor research, it helped create a visualization of the COVID-19 virus model, and it is pioneering other key areas of medical care. It has even been used to study the images in children’s books for gender inequality and hidden discrimination, which Dora calls the "invisible monsters."
Teodora (Dora) Szasz
Computational scientist, Research Computing Center at the University of Chicago
Donna Nemshick
Performance validation lead, Intel Advanced Rendering and Visualization Architecture Group
Intel® oneAPI Rendering Toolkit
Accelerate high-fidelity rendering and visualization applications with a set of rendering and ray-tracing libraries optimized for XPU performance at any scale.
Use Visualization to See a Billion Cells and Combat Invisible Monsters
Guests: Teodora (Dora) Szasz of the University of Chicago and Donna Nemshick of Intel Corporation
Welcome to Code Together, an interview series exploring the possibilities of cross-architecture development with those at the forefront. Scientific visualization is providing new insights in healthcare and advancing medical research in ways that we couldn't imagine a decade ago. It's an evolving and promising field where extracting visualizations from very large datasets can improve analysis, diagnosis, and patient outcomes while reducing costs. Today we'll hear from two guests who are working in scientific visualization in different ways.
Teodora (Dora) Szasz: At the University of Chicago, my role was to help researchers across different departments: hospitals, public policy, chemistry, and so on. I remember one story when the surgery department contacted me and brought me a large dataset of a billion simulated cells coming from a supercomputing simulation. They were trying to study intestinal immune diseases, which affect one in one hundred people worldwide. When I got the data, I was trying the different tools that existed for visualization: Python* libraries, OpenCV, OpenGL*. Then I heard about a hackathon that was organized at the...and that's where I met this amazing community. I met Jim Jeffers from Intel and Paul Navrátil from the Texas Advanced Computing Center (TACC). When I arrived there with my data, the timing was good, because they were testing their newly developed tool, and it was like magic: plug and play. I could see the visualization of my billion cells in multiple colors and different orientations. We also have a lot of visual intelligence to work with: a very big screen that we call the wall of knowledge. When I brought the billion cells and visualized them on that screen, it was a different experience to be able to rotate and zoom in and see the differences in the patterns. Amazing.
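To make that workflow concrete, here is a minimal sketch of how a large simulated-cell dataset might be loaded and explored interactively in Python. It uses PyVista (a wrapper around VTK) with randomly generated stand-in data; the actual tool, data format, and scale of the hackathon workflow are not specified in the interview.

```python
# Minimal sketch: render a large point cloud of simulated cells with PyVista.
# Illustrative only; not the actual tool or data used at the hackathon.
import numpy as np
import pyvista as pv

N = 1_000_000  # stand-in for the billion-cell dataset, scaled down for a workstation

# Hypothetical data: random cell positions plus a per-cell scalar (e.g., cell type)
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 100.0, size=(N, 3))
cell_type = rng.integers(0, 5, size=N)

cloud = pv.PolyData(points)
cloud["cell_type"] = cell_type

plotter = pv.Plotter()
plotter.add_points(cloud, scalars="cell_type", point_size=2.0,
                   render_points_as_spheres=True, cmap="tab10")
plotter.show()  # interactive window: rotate, zoom, and inspect patterns
```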
Donna Nemshick: So visualization has really changed your research dramatically. Being able to visualize and manipulate these big datasets on that giant wall of knowledge is truly amazing. I'm curious: how has [COVID-19] impacted your work, not being able to stand in front of that screen to do your research and really understand the depths of the medical processes you're looking at?
Dora: On my last day there before [COVID-19], we had just been able to reproduce on the screen this impressive visualization of a massive star, a hundred thousand times bigger than the sun, using ParaView*. It was amazing. I was just getting started and super excited about using the wall when the pandemic came, so I couldn't go into the laboratory anymore.
So I transitioned to projects that are less about visualization and more about computer vision and deep learning. One of the projects I'm working on, and that I love, is really impressive: it is measuring structural inequality in children's books. Of course, I'm interested in the images in these books, and I like to call this the invisible monster, just to relate to children. This invisible monster is the discrimination that is built into social structures and institutions and that we don't really notice. We're trying to understand how the images in these books influence what children think is possible, not only for themselves but also for others. For example, if a man never sees a woman as president, he might think it is not even possible to vote for a woman as president, because he cannot envision that possibility.
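The interview doesn't describe the project's pipeline, but a first step in this kind of image analysis might look like the hypothetical sketch below: detecting character faces on scanned pages with OpenCV so they can later be passed to a trained classifier for representation statistics. The "book_scans" directory is made up, and an off-the-shelf detector trained on photographs may miss illustrated characters.

```python
# Hypothetical sketch of an early pipeline step: detect faces on scanned
# children's-book pages and count them per page. A real pipeline would then
# crop each detection and feed it to a trained model for representation
# statistics; those steps are not shown here.
import cv2
from pathlib import Path

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

faces_per_page = {}
for page in sorted(Path("book_scans").glob("*.png")):  # hypothetical directory
    img = cv2.imread(str(page))
    if img is None:
        continue
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    boxes = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    faces_per_page[page.name] = len(boxes)
    # Crops for a downstream classifier would be img[y:y+h, x:x+w] per box.

print(faces_per_page)
```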
Donna: It was great that you were able to almost reinvent yourself and take on new projects, still working with visualization and imagery, in the context of diversity. Has [COVID-19] led you to research directly related to [COVID-19] itself?
Dora: I started supporting a team in the section of pulmonary and critical-care medicine at [the University of] Chicago. It was a project that uses artificial intelligence to predict the maximal oxygen support that COVID-19 patients will need, based on their chest X-rays and their clinical data. At the beginning of the pandemic, there was a crisis in hospital beds and ICU beds, and in some countries not everyone was lucky enough to get the really good care they needed when they got infected with [COVID-19]. This team was dubbed the pioneer in placing these critical patients into what we call oxygenated helmets. Instead of pursuing the more clinically involved option of a ventilator, which requires intubation and keeping a patient in [the] ICU for a certain number of days, the team assessed whether that level of oxygen support was really needed; if a patient needed less, they put the patient in a helmet.
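A generic way to combine a chest X-ray with tabular clinical data, sketched below in PyTorch, is a two-branch network whose image and clinical features are fused before a classification head. This is an illustrative architecture only, not the UChicago team's actual model; the feature sizes, the number of clinical variables, and the four support levels are assumptions.

```python
# Illustrative multimodal predictor: chest X-ray plus clinical data in,
# a maximal-oxygen-support level out. Generic sketch, not the team's model.
import torch
import torch.nn as nn
from torchvision import models

class OxygenSupportNet(nn.Module):
    def __init__(self, n_clinical: int, n_levels: int = 4):
        super().__init__()
        # Image branch: ResNet-18 as a feature extractor
        # (weights=None keeps this offline; use pretrained weights in practice).
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()  # expose 512-d image features
        self.image_branch = backbone
        # Tabular branch for clinical variables (vitals, labs, age, ...)
        self.clinical_branch = nn.Sequential(
            nn.Linear(n_clinical, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
        # Fused head predicts one of n_levels ordered support levels
        # (e.g., room air, low-flow oxygen, helmet, ventilator: assumed labels)
        self.head = nn.Linear(512 + 64, n_levels)

    def forward(self, xray: torch.Tensor, clinical: torch.Tensor):
        feats = torch.cat([self.image_branch(xray),
                           self.clinical_branch(clinical)], dim=1)
        return self.head(feats)

model = OxygenSupportNet(n_clinical=10)
logits = model(torch.randn(2, 3, 224, 224), torch.randn(2, 10))  # smoke test
print(logits.shape)  # torch.Size([2, 4])
```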
Donna: What a novel idea: being able to give acute care and advance notification to patients who may need intensive care when beds are scarce. What great research. Let's circle back a little more to visualization. You mentioned big data sizes and clusters; can you share a bit more insight into your computing infrastructure?
Dora: It's very exciting, because our cluster is named Midway, and this year we launched Midway3, a high-performance computing cluster of about 220 nodes and over 10,000 cores; together with Midway2, that makes around 40,000 cores. But what's special about Midway3, and why I'm so excited, is that it's really dedicated to AI-intensive jobs and deep learning. It's the first time we built a system with both Intel® [processors] and AMD* processors, and we also have NVIDIA* A100, V100, and RTX 6000 GPU capabilities nowadays. The difference between our previous cluster and this one for deep learning is dramatic; I can really see the performance going way up.
Donna: How has the performance of this infrastructure impacted your research?
Dora: For example, on Midway3, all my projects include deep learning. Any time I go to train a model, instead of waiting a day for the training to finish, it might happen in several hours. We're also really excited because Midway3 was used this year to...I don't know if you know, but a University of Chicago team was the first to generate a computational model of the [COVID-19] virus. I know they used Midway, the Frontera system at TACC, and other systems. It's interesting to have this computational power to generate models, and I'm looking forward to maybe doing in-situ validation and seeing what the model looks like while the simulation is being generated.
Donna: So, then, on top of the infrastructure, we've got this layer of software, and you had mentioned here the [Intel®] oneAPI Rendering Toolkit. How do you use the ray-tracing APIs in your research?
Dora: These days we're looking mostly at the medical field. I'm really interested, for example, in simulating how cancer lesions might develop over time, or how a tumor grows in certain areas of the body. Using imaging, we know the location of the tumor; but using simulation, and basically knowing the mechanical process of cancer growth, we can also look at how the cancer spreads into the surrounding tissues. So with [the Intel oneAPI Rendering Toolkit], it would be really interesting to visualize those structures, segment them, and watch over time, by simulation, how they could develop: into what surrounding regions, and in what shapes and forms.
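As an illustration of the kind of simulation Dora describes, the sketch below evolves a Fisher-KPP reaction-diffusion model of tumor cell density on a 2-D grid. The interview doesn't specify the team's model, so the equation choice and all parameters here are assumptions picked for simplicity.

```python
# Toy tumor-growth simulation: Fisher-KPP reaction-diffusion model,
# du/dt = D * laplacian(u) + rho * u * (1 - u), on a 2-D grid.
# Assumed model and parameters; illustrative only.
import numpy as np

D, rho, dt, dx = 0.1, 0.5, 0.1, 1.0      # diffusion, growth rate, time/space steps
u = np.zeros((128, 128))                 # tumor cell density field
u[60:68, 60:68] = 1.0                    # initial lesion

def laplacian(f):
    # 5-point finite-difference Laplacian with periodic boundaries
    return (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
            np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f) / dx**2

for step in range(2000):
    u += dt * (D * laplacian(u) + rho * u * (1.0 - u))
    if step % 500 == 0:
        # In a real pipeline, each snapshot would go to a volume renderer
        # (e.g., the ray-tracing libraries in the oneAPI Rendering Toolkit).
        print(f"step {step}: tumor area ~ {(u > 0.5).sum() * dx**2:.0f}")
```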
Donna: In your simulation, do you generate a whole lot of data and look at it at the back end, or do you use in-situ-type simulation where, in real time, you're observing the data as it's being generated?
Dora: Here's how I like to describe in-situ to people who don't know what it is. Imagine I'm buying a car, and I give the manufacturer all the specifications: the color, the seats, the sound, the music system, everything. Then they come back to me six months later and say, here's your car, bye. I don't want it like that; I'd like to visit them every week, check on the progress, and adjust little parameters. More and more, we are transitioning to generating visualization in real time as the simulation is moving on and advancing. We can check the results at any time, and we can go back and change parameters if we see the simulation going in a direction we don't want. Right now, for the majority of our in-situ visualization, we are using ParaView and Pathway for processing.
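The pattern behind the car analogy can be shown in a few lines: rather than producing one result at the end, the simulation loop emits a snapshot every few steps and can adjust a parameter mid-run. Real deployments use ParaView's in-situ tooling; this toy Python loop with matplotlib only illustrates the idea, and the smoothing "simulation" and steering rule below are made up.

```python
# Toy illustration of the in-situ pattern: emit snapshots during the run
# and steer a parameter, instead of waiting for the final result.
import numpy as np
import matplotlib.pyplot as plt

field = np.random.default_rng(1).random((64, 64))  # hypothetical sim state
diffusivity = 0.2                                  # a steerable parameter

for step in range(1, 101):
    # One (toy) simulation step: simple diffusion-like smoothing
    field += diffusivity * (
        np.roll(field, 1, 0) + np.roll(field, -1, 0) +
        np.roll(field, 1, 1) + np.roll(field, -1, 1) - 4 * field) / 4

    if step % 20 == 0:                   # the "visit every week" checkpoint
        plt.imsave(f"snapshot_{step:03d}.png", field, cmap="viridis")
        if field.std() < 0.01:           # e.g., the simulation is flat-lining
            diffusivity /= 2             # steer: change a parameter mid-run
```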
Donna: What I do, in the Advanced Rendering and Visualization Architecture team, is take those components that you know and love out in open source, be it Intel® Embree, [Intel®] OSPRay, [Intel®] Open Volume Kernel Library, [Intel®] OSPRay Studio, or [Intel®] Open Image Denoise, and add a little layer of glue to package them into the [Intel] oneAPI Rendering Toolkit bundle. What that all means is that you get the same great APIs, with no changes. I pull the source code from the GitHub* releases and distribute easy-to-use binaries in a toolkit bundle. You install the toolkit on your Windows* machine or your Linux* workstation, and the installation downloads everything you need to run those libraries. That's the ease of use we've brought to the user community through the [Intel] oneAPI Rendering Toolkit: it makes your development really streamlined and makes updating the components within your applications as easy as the click of a button.
Dora: I think it's all about generating real-time visualization and being able to get immersive experiences. We sometimes work with simulation centers, where they have simulations for tapping into tissues and seeing the development of certain tumors. On another note, I couldn't help being so impressed by Donna's mention of how the [Intel] OSPRay library and the oneAPI framework and toolkit are really easy to install and reproduce, and are open source. I wish code in academia were as easy to install and reproduce. So this year, maybe because I had more time to sit down with the code, I got really into advocating for bringing more industry resources and best practices for writing code and creating products into academia: reproducibility, great documentation like the [Intel] OSPRay library has, and collaboration.
Donna: I am thinking big, big, big, and big. I'm thinking big data, and data getting even bigger; big compute power on HPC [high-performance computing] clusters; big storage capability; and the powerful [Intel®] Optane™ memory that can drop right into your machines and expand your memory footprint into the terabytes.