Increased traffic congestion in the Seattle area is a good analogy for a similar increase in congestion on high-performance computing (HPC) systems, according to scientists at Pacific Northwest National Laboratory (PNNL).
More complex workloads, such as training artificial intelligence (AI) models, are responsible for HPC bottlenecks, say scientists in a paper published in The Next Wave, the National Security Agency’s review of emerging technologies.
“We can solve congestion through how we create the network,” said Sinan Aksoy, senior data scientist and team leader at PNNL who specializes in the mathematical field of graph theory and complex networks.
In HPC systems, hundreds of individual computer servers, known as nodes, function as a single supercomputer. The arrangement of the nodes and the links between them is the topology of the network.
HPC congestion occurs when data exchange between nodes funnels over the same link, creating a bottleneck.
HPC system bottlenecks are more common today than when the systems were first designed, explain Aksoy and his colleagues Roberto Gioiosa, a computer scientist in PNNL’s HPC group, and Stephen Young, a mathematician in PNNL’s mathematics group, in The Next Wave.
That’s because the way people use HPC systems today differs from how they were used when the systems were first developed.
“This is a life-changing artifact,” Gioiosa said. “We didn’t have Facebook 20 years ago, we didn’t have this big data, we didn’t have big AI models, we didn’t have ChatGPT.”
Big technology expands
Beginning in the 1990s, the information technology industry began to flourish. New companies disrupted the Seattle-area economy and reshaped where people live and work. The resulting traffic patterns have become less predictable, less structured, and more congested, particularly along the east-west axis, where traffic across Lake Washington is limited to two bridges.
Traditional HPC network topologies resemble the Seattle-area road network, according to PNNL researchers. The topologies are optimized for physical simulations of things like interactions between molecules or regional climate systems, not modern AI workloads.
In physics simulations, calculations on one server inform calculations on neighboring servers. As a result, network topologies optimize data exchange between neighboring servers.
For example, in a physical simulation of a regional climate system, one server might simulate the climate over Seattle and another the climate over the waters of Puget Sound west of Seattle.
“The Puget Sound climate model isn’t going to affect what’s happening in New York City — I mean, ultimately it is — but it actually needs to talk to the Seattle model, so I might as well plug the Puget Sound computer and the Seattle computers next to each other,” Young said.
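The paper does not spell out a particular wiring diagram, but the nearest-neighbor idea the researchers describe can be sketched as a simple 2D mesh, in which each server links only to the servers directly adjacent to it. The grid dimensions and helper name below are illustrative, not taken from any actual HPC system.

```python
def mesh_neighbors(rows, cols):
    """2D mesh topology: each node links to its up/down/left/right
    neighbors -- the nearest-neighbor communication pattern that
    physics simulations exploit."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            nbrs = []
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    nbrs.append((nr, nc))
            adj[(r, c)] = nbrs
    return adj

grid = mesh_neighbors(4, 4)
# A corner node has 2 links, an interior node 4; a simulation that only
# talks to its neighbors never needs to cross the whole network.
assert len(grid[(0, 0)]) == 2 and len(grid[(1, 1)]) == 4
```

On a topology like this, the “Seattle” and “Puget Sound” servers would occupy adjacent grid cells, so their frequent data exchanges travel exactly one hop.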
Communication patterns in data analytics and AI applications are erratic and unpredictable. Calculations on a server can inform calculations on a computer across the room. Running these workloads on traditional HPC networks is like driving around the greater Seattle region today on a rush hour treasure hunt, according to Gioiosa.
Network expansion
To overcome HPC bottlenecks, the PNNL research team proposed using graph theory, a mathematical field that studies the relationships and connections among collections of points, or nodes.
Young and Aksoy are experts in expanders, a class of graphs that can spread network traffic so that “there will always be many options for getting from point A to point B,” Aksoy explained.
Their network, called SpectralFly, exhibits perfect mathematical symmetry: each node is connected to the same number of other nodes, and each node’s connections look the same throughout the network.
Because every node’s routing options look identical to those of any other node in the network, it is also easier for computer programmers to route information through the network, Aksoy added.
“It’s the same roadmap wherever you are, so it’s much less computationally expensive to figure out how to route information on this network,” he said, noting that this feature is like a city where the directions between neighborhoods are the same no matter where you start.
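The actual SpectralFly construction is not reproduced here, but the symmetry being described — every node has the same degree, and the network looks the same from every node — can be illustrated with a small circulant graph, a standard vertex-transitive family. The node count and offsets below are arbitrary choices for the sketch.

```python
from collections import deque

def circulant_graph(n, offsets):
    """Circulant graph: node i links to i +/- o (mod n) for each offset o.
    Circulant graphs are vertex-transitive, so the topology 'looks the
    same' from every node, like the symmetric network described above."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for o in offsets:
            adj[i].add((i + o) % n)
            adj[i].add((i - o) % n)
    return adj

def distance_profile(adj, src):
    """BFS hop counts from src to every node, sorted for comparison."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return sorted(dist.values())

g = circulant_graph(16, [1, 3, 5])
# Every node has the same number of links...
assert all(len(nbrs) == 6 for nbrs in g.values())
# ...and the same hop-count profile to the rest of the network, so a
# routing rule that works from one node works identically from any other.
assert all(distance_profile(g, s) == distance_profile(g, 0) for s in g)
```

The second assertion is the “same roadmap wherever you are” property: because all starting points are interchangeable, the router never needs node-specific logic.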
Simulation results
The PNNL research team ran simulations of their SpectralFly network across workloads from traditional physics-based simulations to AI model training and compared the results with those of other types of HPC network topologies.
They found that SpectralFly outperformed other network topologies on modern AI workloads and achieved comparable performance on traditional workloads, indicating it could serve as a hybrid topology for people looking to do traditional science and AI on the same HPC system.
“We are trying to merge the two worlds, the traditional and the emerging one, so that we can still do science and we can also do artificial intelligence and big data,” Gioiosa said.
###
About PNNL
Pacific Northwest National Laboratory draws on its distinctive strengths in chemistry, earth science, biology, and data science to advance scientific knowledge and address the challenges of sustainable energy and national security. Founded in 1965, PNNL is managed by Battelle for the Department of Energy’s Office of Science, which is the single largest supporter of basic research in the physical sciences in the United States. The DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science. For more information about PNNL, visit the PNNL News Center. Follow us on Twitter, Facebook, LinkedIn and Instagram.