Using geometry and physics to describe feature learning in deep neural networks

A simulation of a hand-held folding ruler, which the team uses as an analogy for DNN training in different regimes. Credit: Shi, Pan and Dokmanić.

Deep neural networks (DNNs), the machine learning algorithms underpinning large language models (LLMs) and other artificial intelligence (AI) models, learn to analyze vast amounts of data and make accurate predictions. These networks are organized in layers, each of which transforms input data into "features" that guide the analysis performed by the next layer.

The process through which DNNs learn features has been the focus of numerous research studies and is ultimately the key to these models' good performance on a variety of tasks. Recently, some computer scientists have begun exploring the possibility of modeling feature learning in DNNs using approaches rooted in geometry and physics.

Researchers at the University of Basel and the University of Science and Technology of China uncovered a phase diagram, a graph similar to those used in thermodynamics to describe the liquid, gas and solid phases of water, that captures DNNs in different regimes. Their paper, published in Physical Review Letters, models a DNN as a chain of blocks connected by springs, a simple mechanical system often used to study the interplay between linear (spring) and nonlinear (friction) forces.

"Cheng and I were at a workshop where there was an impressive talk on the 'law of data separation,'" said Dokmanić. "The layers of a deep neural network (but also the layers of biological neural networks, such as the human visual cortex) progressively simplify the data representations, making them easier to work with.

"The deeper you go in the network, the more regular, the more geometric these representations become, meaning the data are better separated and easier to distinguish (e.g., representations of cats and dogs). Data separation is a way of measuring this.

"The talk showed that in well-trained neural nets these data-separation 'summary figures' often behave in a remarkably simple way, even for the most complex deep neural networks trained on complex data: each layer improves the separation over the one before it by a comparable amount."
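To make the idea of such a "summary figure" concrete, below is a minimal NumPy sketch that computes a Fisher-style separation measure (within-class scatter relative to between-class scatter) after each layer of a toy, randomly initialized ReLU network fed two Gaussian classes. The metric, network and data here are assumptions chosen for illustration only; they are not the exact quantities from the paper, and the clean layer-by-layer improvement described in the article would only be expected in a trained network.

```python
import numpy as np

def separation_fuzziness(features, labels):
    """Fisher-style measure of class overlap: within-class scatter relative
    to between-class scatter.  Smaller values mean the classes are better
    separated.  This is one common proxy for 'data separation', not
    necessarily the paper's exact formula."""
    classes = np.unique(labels)
    global_mean = features.mean(axis=0)
    dim = features.shape[1]
    sw = np.zeros((dim, dim))   # within-class scatter
    sb = np.zeros((dim, dim))   # between-class scatter
    for c in classes:
        x_c = features[labels == c]
        mu = x_c.mean(axis=0)
        sw += (x_c - mu).T @ (x_c - mu)
        diff = (mu - global_mean)[:, None]
        sb += len(x_c) * (diff @ diff.T)
    return np.trace(sw @ np.linalg.pinv(sb))

# Toy setup: two Gaussian classes pushed through random ReLU layers.
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(-1.0, 1.0, (200, 32)),
               rng.normal(+1.0, 1.0, (200, 32))])
y = np.array([0] * 200 + [1] * 200)

h = x
for layer in range(6):
    w = rng.normal(0.0, 1.0 / np.sqrt(h.shape[1]), (h.shape[1], 32))
    h = np.maximum(h @ w, 0.0)   # ReLU layer
    print(f"layer {layer + 1}: fuzziness = {separation_fuzziness(h, y):.3f}")
```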

The team found that this "law of data separation" held in networks trained with commonly used "hyperparameters," such as particular learning rates and noise levels, but not with other hyperparameter choices. They wanted to understand why this happens and how it relates to DNNs learning good features, so they set out to find a suitable theoretical explanation for these intriguing observations.

"At the same time, we were involved in some geophysics projects where people use the spring-block model as a phenomenological model of fault and earthquake dynamics," said Dokmanić. "The phenomenology of data separation reminded us of it. We considered many other analogies, too. For example, Cheng thought that equal data separation was like a coat hanger being pulled apart, while I thought it was a bit like a folding ruler.

"We spent the winter holidays exchanging photos and videos of various 'layered' household items and tools, including coat hangers, folding rulers and so on, and debating whether a particular contraption was a good model for a given deep neural net."

After identifying various candidate theoretical models and layered physical systems that could be used to study how DNNs learn features, the researchers ultimately decided to focus on spring-block models. These models have already proven valuable for studying a wide range of real-world phenomena, including earthquakes and material failure.

Deep neural network data representations in the team's spring-block theory. Credit: Shi, Pan and Dokmanić.

"We have shown that this data separation behaves in the same way as a chain of blocks connected by springs sliding on a rough surface (but also like other mechanical systems, such as folding rulers).

"How much a layer simplifies the data is analogous to how much a spring extends. Nonlinearity in the network is analogous to the friction between the blocks and the surface. And we can add noise to both systems."

Viewing the two systems through the lens of the law of data separation, Dokmanić and his colleagues found that DNNs behave just like spring-block chains. A DNN responds to the training loss (that is, the demand that it describe the observed data) by separating the data layer by layer. Similarly, a spring-block chain responds to an applied force by elongating layer by layer.

"The more nonlinearity there is, the greater the imbalance between the outer (deep) and inner (shallow) layers: the deep layers learn/separate more," said Dokmanić.

"However, if we add noise to training, or start shaking the spring-block system, the blocks spend some time 'in the air' without experiencing friction, and that gives the springs a chance to equalize the separation. This is, in fact, a well-known effect in geophysics and engineering."
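The following is a minimal NumPy sketch of the mechanical side of this analogy, under assumptions chosen purely for illustration (unit spring stiffness, a simple overdamped stick-slip rule, and random "airborne" moments standing in for shaking); it is not the paper's exact model. One end of a spring-block chain is dragged over a frictional surface, and the resulting spring extensions play the role of per-layer separation gains: friction tends to concentrate the stretch near the dragged end, while shaking spreads it more evenly along the chain.

```python
import numpy as np

def run_chain(n_blocks=8, friction=0.5, noise=0.0, pull_speed=0.1,
              steps=20000, dt=1e-2, seed=0):
    """Overdamped chain of blocks joined by unit-stiffness springs on a
    rough surface.  One end is fixed, the other is dragged at constant
    speed; an interior block slides only while the net spring force on it
    exceeds the friction threshold.  With noise > 0, each block is randomly
    'airborne' (frictionless) on some steps, mimicking shaking.
    Returns the final spring extensions."""
    rng = np.random.default_rng(seed)
    x = np.arange(n_blocks + 1, dtype=float)    # x[0] is the fixed wall, x[-1] is dragged
    for _ in range(steps):
        x[-1] += pull_speed * dt                # drag the outer end
        ext = np.diff(x) - 1.0                  # spring extensions (rest length 1)
        force = np.zeros(n_blocks + 1)
        force[1:-1] = ext[1:] - ext[:-1]        # net spring force on interior blocks
        threshold = np.where(rng.random(n_blocks + 1) < noise, 0.0, friction)
        sliding = np.abs(force) > threshold
        x[1:-1] += dt * np.where(sliding[1:-1], force[1:-1], 0.0)
    return np.diff(x) - 1.0

# Quiet chain: extensions tend to grow toward the dragged end (the analogue
# of deeper layers separating the data more).  Shaken chain: extensions are
# redistributed more evenly, as the article describes for training noise.
print("quiet :", np.round(run_chain(noise=0.0), 2))
print("shaken:", np.round(run_chain(noise=0.3), 2))
```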

This recent study introduces a new theoretical approach for studying DNNs and how they learn features over time. In the future, this perspective could help deepen the current understanding of deep learning algorithms and of the processes through which they learn to tackle specific tasks.

"Most existing theoretical results treat simplified networks that are missing key aspects of the real deep nets used in practice," said Dokmanić.

"These works study the same effect in a stylized model, but the success of deep nets rests on the accumulation of many factors (depth, nonlinearity, noise, learning rate, normalization, ...). We, by contrast, did not have to strip away any of these ingredients."

The spring-block theory used by the researchers has so far proven to be a simple and effective way to understand the ability of DNNs to generalize in different settings. In their paper, Dokmanić and his colleagues successfully used it to compute DNN data-separation curves during training, and found that the shape of these curves is indicative of how well a trained network performs on unseen data.







Folding-ruler experiments and videos of DNN training in various regimes. Credit: Physical Review Letters (2025). DOI: 10.1103/YS4N-2TJ3

"Since we also understand how to change the shape of a data-separation curve in any direction via noise and nonlinearity, this gives us a (potentially) powerful tool to accelerate the training of very large nets," said Dokmanić.

"Most people have strong intuition about springs and blocks, but not about deep neural networks. Our theory says that we can make interesting, useful, true statements about deep networks by reasoning about a simple mechanical system we understand intuitively. That is very appealing, because these networks contain billions of neurons."

The theoretical model introduced by this team of researchers could soon be used by both theorists and computer scientists to further investigate the underpinnings of deep learning algorithms. As part of their next studies, Dokmanić and his colleagues hope to use their theoretical approach to also examine feature learning from a microscopic point of view.

"We are getting close to a first-principles explanation of the spring-block trends (or perhaps the folding-ruler trends)," said Dokmanić.

"The other direction we are exploring is to really double down on putting this into practice to improve deep net training, especially for very large transformer-based networks such as large language models. Having a proxy that is much cheaper to evaluate than full training is generally a promising way to improve training."

Beyond showing how DNN training can be carefully engineered to improve models' ability to generalize to other tasks, the researchers' framework could also lead to diagnostic tools for large neural networks. For example, such tools could help identify areas that need improvement to boost a model's performance, much like stress maps are used in structural mechanics to identify regions of concentrated stress.

Dokmanić added, "By analyzing the internal load distribution in a neural net, we could find layers or regions that carry a disproportionate burden, which may indicate overfitting and hurt generalization, or layers that are barely used, which indicates redundancy."
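As a toy illustration of one diagnostic in this spirit, the sketch below measures, for each layer of a small randomly initialized ReLU network, the fraction of units that almost never activate, a rough proxy for "barely used" capacity. The architecture, the activity threshold and the rarely-active criterion are all assumptions made for this example, not a tool from the paper; in practice one would inspect a trained network rather than a random one.

```python
import numpy as np

def layer_usage_report(layers, x, active_threshold=0.05):
    """For each ReLU layer (weight, bias), report the fraction of units that
    activate on fewer than `active_threshold` of the inputs -- a rough,
    illustrative proxy for 'barely used' capacity (redundancy), in the spirit
    of the load-distribution diagnostics described above."""
    h = x
    for i, (w, b) in enumerate(layers):
        h = np.maximum(h @ w + b, 0.0)                   # ReLU layer
        active_frac = (h > 0).mean(axis=0)               # per-unit activation rate
        rarely_used = (active_frac < active_threshold).mean()
        print(f"layer {i + 1}: {rarely_used:.0%} of units rarely active")
    return h

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 64))
# Toy untrained MLP; negative biases are used here only so that some units
# end up under-used and the report has something to show.
layers = [(rng.normal(0.0, 1.0 / np.sqrt(64), (64, 64)),
           rng.normal(-1.0, 1.0, 64))
          for _ in range(5)]
layer_usage_report(layers, x)
```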

This article was written by Ingrid Fadelli and edited by Gaby Clark, with fact-checking and review by Robert Egan.

More information:
Cheng Shi et al, Spring-block theory of feature learning in deep neural networks, Physical Review Letters (2025). DOI: 10.1103/YS4N-2TJ3

© 2025 Science X Network

Citation: Using geometry and physics to describe feature learning in deep neural networks (2025, August 10), retrieved from https://phys.org/news/2025-08-geometry-physics-eep-neural.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.
