LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 133

  1. Article ; Online: Designing Universally-Approximating Deep Neural Networks: A First-Order Optimization Approach.

    Wu, Zhoutong / Xiao, Mingqing / Fang, Cong / Lin, Zhouchen

    IEEE transactions on pattern analysis and machine intelligence

    2024  Volume PP

    Abstract Universal approximation capability, also referred to as universality, is an important property of deep neural networks, endowing them with the potency to accurately represent the underlying target function in learning tasks. In practice, the architecture of deep neural networks largely influences the performance of the models. However, most existing methodologies for designing neural architectures, such as heuristic manual design or neural architecture search, ignore the universal approximation property, thus losing a potential safeguard on performance. In this paper, we propose a unified framework to design the architectures of deep neural networks with a universality guarantee based on first-order optimization algorithms, where the forward pass is interpreted as the updates of an optimization algorithm. The (explicit or implicit) network is designed by replacing each gradient term in the algorithm with a learnable module similar to a two-layer network, or its derivatives. Specifically, we explore the realm of width-bounded neural networks, a common practical scenario, showcasing their universality. Moreover, adding operations of normalization, downsampling, and upsampling does not hurt the universality. To the best of our knowledge, this is the first work showing that width-bounded networks with a universal approximation guarantee can be designed in a principled way. Our framework can inspire a variety of neural architectures, including some renowned structures such as ResNet and DenseNet, as well as novel innovations. The experimental results on image classification problems demonstrate that the newly inspired networks are competitive and surpass the baselines of ResNet, DenseNet, as well as the advanced ConvNeXt and ViT, testifying to the effectiveness of our framework.
    Language English
    Publishing date 2024-03-25
    Publishing country United States
    Document type Journal Article
    ISSN 1939-3539
    ISSN (online) 1939-3539
    DOI 10.1109/TPAMI.2024.3380007
    Database MEDical Literature Analysis and Retrieval System OnLINE

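The design recipe sketched in record 1 above can be made concrete with a toy example. The following minimal numpy sketch unrolls plain gradient descent and swaps each gradient term for a learnable two-layer module; the module shapes, depth, and step size are illustrative assumptions, no training is shown, and this is not the paper's actual architecture.

```python
# A minimal sketch (numpy, untrained random weights) of the design idea in
# record 1: take a first-order method, here plain gradient descent
#   x_{k+1} = x_k - eta * grad f(x_k),
# and replace the gradient term with a learnable two-layer module.
import numpy as np

rng = np.random.default_rng(0)
d, hidden, depth, eta = 16, 32, 4, 0.1

def two_layer_module(x, W1, W2):
    """Learnable stand-in for a gradient term: W2 @ relu(W1 @ x)."""
    return W2 @ np.maximum(W1 @ x, 0.0)

# One (W1, W2) pair per unrolled "iteration" = one network layer.
params = [(rng.normal(scale=0.1, size=(hidden, d)),
           rng.normal(scale=0.1, size=(d, hidden))) for _ in range(depth)]

def forward(x):
    # The forward pass mirrors the optimization trajectory.
    for W1, W2 in params:
        x = x - eta * two_layer_module(x, W1, W2)  # "gradient" step
    return x

print(forward(rng.normal(size=d)).shape)  # (16,)
```

Note the residual form x - eta * module(x): unrolling a first-order update naturally yields skip connections of the kind found in ResNet.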

  2. Article ; Online: Towards Understanding Convergence and Generalization of AdamW.

    Zhou, Pan / Xie, Xingyu / Lin, Zhouchen / Yan, Shuicheng

    IEEE transactions on pattern analysis and machine intelligence

    2024  Volume PP

    Abstract AdamW modifies Adam by adding a decoupled weight decay to decay network weights per training iteration. For adaptive algorithms, this decoupled weight decay does not affect specific optimization steps, and differs from the widely used $\ell_2$-regularizer …
    Language English
    Publishing date 2024-03-27
    Publishing country United States
    Document type Journal Article
    ISSN 1939-3539
    ISSN (online) 1939-3539
    DOI 10.1109/TPAMI.2024.3382294
    Database MEDical Literature Analysis and Retrieval System OnLINE

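The decoupling described in record 2 above is easy to see in update-rule form. Below is a toy numpy sketch contrasting AdamW's decoupled weight decay with the $\ell_2$-regularized variant, where the decay term is folded into the gradient and hence rescaled by the adaptive moments; the hyperparameters are illustrative defaults, not the paper's settings.

```python
# Toy numpy sketch of one Adam-style step with the two weight-decay styles.
import numpy as np

def adam_step(theta, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8,
              lam=1e-2, decoupled=True):
    if not decoupled:          # l2-Adam: decay is folded into the gradient,
        g = g + lam * theta    # so it gets rescaled by the adaptive step.
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)            # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)            # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    if decoupled:              # AdamW: decay weights directly, per iteration.
        theta = theta - lr * lam * theta
    return theta, m, v

theta = np.ones(3); m = v = np.zeros(3)
theta, m, v = adam_step(theta, g=np.array([0.1, -0.2, 0.3]), m=m, v=v, t=1)
print(theta)
```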

  3. Book ; Online: On the $O(\frac{\sqrt{d}}{T^{1/4}})$ Convergence Rate of RMSProp and Its Momentum Extension Measured by $\ell_1$ Norm

    Li, Huan / Lin, Zhouchen

    Better Dependence on the Dimension

    2024  

    Abstract Although adaptive gradient methods have been extensively used in deep learning, their convergence rates have not been thoroughly studied, particularly with respect to their dependence on the dimension. This paper considers the classical RMSProp and its momentum extension and establishes the convergence rate of $\frac{1}{T}\sum_{k=1}^TE\left[\|\nabla f(x^k)\|_1\right]\leq O(\frac{\sqrt{d}}{T^{1/4}})$ measured by $\ell_1$ norm without the bounded gradient assumption, where $d$ is the dimension of the optimization variable and $T$ is the iteration number. Since $\|x\|_2\ll\|x\|_1\leq\sqrt{d}\|x\|_2$ for problems with extremely large $d$, our convergence rate can be considered to be analogous to the $\frac{1}{T}\sum_{k=1}^TE\left[\|\nabla f(x^k)\|_2\right]\leq O(\frac{1}{T^{1/4}})$ one of SGD measured by $\ell_1$ norm.
    Keywords Mathematics - Optimization and Control ; Computer Science - Artificial Intelligence
    Subject code 519
    Publishing date 2024-02-01
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

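For reference alongside record 3 above, here is a plain numpy sketch of classical RMSProp with a heavy-ball momentum extension, the algorithm family analyzed there. The hyperparameters and the toy quadratic objective are assumptions for illustration; the $\sqrt{d}$ factor in the $\ell_1$-norm rate enters via $\|x\|_2 \leq \|x\|_1 \leq \sqrt{d}\,\|x\|_2$.

```python
# Plain RMSProp with heavy-ball momentum on a toy objective.
import numpy as np

def rmsprop_momentum(grad, x0, lr=1e-2, beta=0.99, mu=0.9, eps=1e-8, T=500):
    x, v, buf = x0.copy(), np.zeros_like(x0), np.zeros_like(x0)
    for _ in range(T):
        g = grad(x)
        v = beta * v + (1 - beta) * g * g   # running second-moment estimate
        step = g / (np.sqrt(v) + eps)       # coordinate-wise adaptive scaling
        buf = mu * buf + step               # heavy-ball momentum buffer
        x = x - lr * buf
    return x

# Toy quadratic f(x) = 0.5 * ||x||^2, so grad f(x) = x.
x_last = rmsprop_momentum(lambda x: x, x0=np.full(10, 5.0))
print(np.linalg.norm(x_last, 1))  # l1 gradient norm at the last iterate
```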

  4. Article ; Online: Sampling complex topology structures for spiking neural networks.

    Yan, Shen / Meng, Qingyan / Xiao, Mingqing / Wang, Yisen / Lin, Zhouchen

    Neural networks : the official journal of the International Neural Network Society

    2024  Volume 172, Page(s) 106121

    Abstract Spiking Neural Networks (SNNs) have been considered a potential competitor to Artificial Neural Networks (ANNs) due to their high biological plausibility and energy efficiency. However, the architecture design of SNNs has not been well studied. Previous studies either use ANN architectures or directly search for SNN architectures under a highly constrained search space. In this paper, we aim to introduce much more complex connection topologies to SNNs to further exploit the potential of SNN architectures. To this end, we propose the topology-aware search space, which is the first search space that enables a more diverse and flexible design for both the spatial and temporal topology of the SNN architecture. Then, to efficiently obtain an architecture from our search space, we propose the spatio-temporal topology sampling (STTS) algorithm. By leveraging the benefits of random sampling, STTS can yield powerful architectures without the need for an exhaustive search process, making it significantly more efficient than alternative search strategies. Extensive experiments on CIFAR-10, CIFAR-100, and ImageNet demonstrate the effectiveness of our method. Notably, we obtain 70.79% top-1 accuracy on ImageNet with only 4 time steps, 1.79% higher than the second-best model. Our code is available at https://github.com/stiger1000/Random-Sampling-SNN.
    MeSH term(s) Neural Networks, Computer ; Algorithms
    Language English
    Publishing date 2024-01-10
    Publishing country United States
    Document type Journal Article
    ZDB-ID 740542-x
    ISSN 1879-2782 ; 0893-6080
    ISSN (online) 1879-2782
    ISSN 0893-6080
    DOI 10.1016/j.neunet.2024.106121
    Database MEDical Literature Analysis and Retrieval System OnLINE

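The random-sampling idea in record 4 above can be illustrated schematically: instead of searching exhaustively, draw a random feed-forward DAG for the spatial topology and random self-connections for the temporal topology. The sketch below is a schematic stand-in under assumed edge probabilities, not the paper's actual search space or the STTS algorithm.

```python
# Sample a random spatial DAG plus temporal self-connections.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, p_edge, p_temporal = 6, 0.5, 0.5

# Spatial topology: upper-triangular adjacency => a valid feed-forward DAG.
spatial = np.triu(rng.random((n_nodes, n_nodes)) < p_edge, k=1)
# Temporal topology: which nodes feed their own state to the next time step.
temporal = rng.random(n_nodes) < p_temporal

for j in range(n_nodes):
    preds = np.flatnonzero(spatial[:, j])
    print(f"node {j}: inputs from {preds.tolist()}, "
          f"recurrent over time: {bool(temporal[j])}")
```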

  5. Article ; Online: Efficient learning of Scale-Adaptive Nearly Affine Invariant Networks.

    Shen, Zhengyang / Qiu, Yeqing / Liu, Jialun / He, Lingshen / Lin, Zhouchen

    Neural networks : the official journal of the International Neural Network Society

    2024  Volume 174, Page(s) 106229

    Abstract Recent research has demonstrated the significance of incorporating invariance into neural networks. However, existing methods require direct sampling over the entire transformation set, notably computationally taxing for large groups like the affine group. In this study, we propose a more efficient approach by addressing the invariances of the subgroups within a larger group. For tackling affine invariance, we split it into the Euclidean group E(n) and the uni-axial scaling group US(n), handling each invariance individually. We employ an E(n)-invariant model for E(n)-invariance and average model outputs over data augmented from a US(n) distribution for US(n)-invariance. Our method maintains a favorable computational complexity of O(N…
    MeSH term(s) Neural Networks, Computer ; Learning
    Language English
    Publishing date 2024-03-11
    Publishing country United States
    Document type Journal Article
    ZDB-ID 740542-x
    ISSN 1879-2782 ; 0893-6080
    ISSN (online) 1879-2782
    ISSN 0893-6080
    DOI 10.1016/j.neunet.2024.106229
    Database MEDical Literature Analysis and Retrieval System OnLINE

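Record 5 above combines an E(n)-invariant model with output averaging over uni-axial scalings. The toy numpy sketch below illustrates that averaging trick on a point cloud, using mean pairwise distance as a stand-in E(n)-invariant model; the sampling distribution over scales is an assumption.

```python
# Approximate uni-axial scaling invariance by averaging an E(n)-invariant
# model over inputs augmented with random axis-wise scalings.
import numpy as np

rng = np.random.default_rng(0)

def en_invariant_model(points):
    """Toy E(n)-invariant readout: mean pairwise distance."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1).mean()

def nearly_scale_invariant(points, n_samples=64):
    """Average the E(n)-invariant model over sampled uni-axial scalings."""
    outs = []
    for _ in range(n_samples):
        s = np.exp(rng.normal(scale=0.3))    # random scale factor
        axis = rng.integers(points.shape[1])
        scaled = points.copy()
        scaled[:, axis] *= s                 # uni-axial scaling of one axis
        outs.append(en_invariant_model(scaled))
    return float(np.mean(outs))

pts = rng.normal(size=(8, 2))
print(nearly_scale_invariant(pts))
```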

  6. Article: Efficient and generalizable cross-patient epileptic seizure detection through a spiking neural network.

    Zhang, Zongpeng / Xiao, Mingqing / Ji, Taoyun / Jiang, Yuwu / Lin, Tong / Zhou, Xiaohua / Lin, Zhouchen

    Frontiers in neuroscience

    2024  Volume 17, Page(s) 1303564

    Abstract Introduction: Epilepsy is a global chronic disease that brings pain and inconvenience to patients, and the electroencephalogram (EEG) is the main analytical tool. For clinical aid that can be applied to any patient, an automatic cross-patient epileptic seizure detection algorithm is of great significance. Spiking neural networks (SNNs) are modeled on biological neurons and are energy-efficient on neuromorphic hardware, and can therefore be expected to better handle brain signals and benefit real-world, low-power applications. However, automatic epileptic seizure detection has rarely considered SNNs.
    Methods: In this article, we explore SNNs for cross-patient seizure detection and find that SNNs can achieve performance comparable to, or even better than, state-of-the-art artificial neural networks (ANNs). We propose an EEG-based spiking neural network (EESNN) with a recurrent spiking convolution structure, which may better take advantage of the temporal and biological characteristics of EEG signals.
    Results: We extensively evaluate the performance of different SNN structures, training methods, and time settings, which builds a solid basis for understanding and evaluation of SNNs in seizure detection. Moreover, we show that our EESNN model can achieve energy reduction by several orders of magnitude compared with ANNs according to the theoretical estimation.
    Discussion: These results show the potential for building high-performance, low-power neuromorphic systems for seizure detection and also broaden real-world application scenarios of SNNs.
    Language English
    Publishing date 2024-01-10
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2411902-7
    ISSN 1662-453X ; 1662-4548
    ISSN (online) 1662-453X
    ISSN 1662-4548
    DOI 10.3389/fnins.2023.1303564
    Database MEDical Literature Analysis and Retrieval System OnLINE

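Record 6 does not detail the EESNN architecture, but its recurrent spiking convolution builds on standard spiking dynamics. As background, here is a generic discrete-time leaky integrate-and-fire (LIF) neuron with a recurrent connection; all constants and the random input standing in for EEG features are illustrative assumptions, not the paper's model.

```python
# Generic discrete-time LIF dynamics with a recurrent connection.
import numpy as np

rng = np.random.default_rng(0)
T, n = 20, 4                       # time steps, neurons
tau, v_th = 2.0, 1.0               # leak constant and firing threshold
W_in = rng.normal(scale=0.5, size=(n, n))
W_rec = rng.normal(scale=0.3, size=(n, n))

v = np.zeros(n)                    # membrane potentials
s = np.zeros(n)                    # spikes from the previous step
for t in range(T):
    x = rng.normal(size=n)         # stand-in for an EEG-derived input
    v = v * (1 - 1 / tau) + W_in @ x + W_rec @ s   # leaky integration
    s = (v >= v_th).astype(float)  # fire where threshold is crossed
    v = np.where(s > 0, 0.0, v)    # hard reset after a spike
    print(t, s.astype(int))
```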

  7. Article ; Online: SPIDE: A purely spike-based method for training feedback spiking neural networks.

    Xiao, Mingqing / Meng, Qingyan / Zhang, Zongpeng / Wang, Yisen / Lin, Zhouchen

    Neural networks : the official journal of the International Neural Network Society

    2023  Volume 161, Page(s) 9–24

    Abstract Spiking neural networks (SNNs) with event-based computation are promising brain-inspired models for energy-efficient applications on neuromorphic hardware. However, most supervised SNN training methods, such as conversion from artificial neural networks or direct training with surrogate gradients, require complex computation rather than the spike-based operations of spiking neurons during training. In this paper, we study spike-based implicit differentiation on the equilibrium state (SPIDE), which extends the recently proposed training method, implicit differentiation on the equilibrium state (IDE), to supervised learning with purely spike-based computation, demonstrating the potential for energy-efficient training of SNNs. Specifically, we introduce ternary spiking neuron couples and prove that implicit differentiation can be solved by spikes based on this design, so the whole training procedure, including both forward and backward passes, is carried out as event-driven spike computation, and weights are updated locally with two-stage average firing rates. Then we propose to modify the reset membrane potential to reduce the approximation error of spikes. With these key components, we can train SNNs with flexible structures in a small number of time steps and with firing sparsity during training, and the theoretical estimation of energy costs demonstrates the potential for high efficiency. Meanwhile, experiments show that even with these constraints, our trained models can still achieve competitive results on MNIST, CIFAR-10, CIFAR-100, and CIFAR10-DVS.
    MeSH term(s) Feedback ; Action Potentials/physiology ; Neural Networks, Computer ; Membrane Potentials ; Computers
    Language English
    Publishing date 2023-01-24
    Publishing country United States
    Document type Journal Article
    ZDB-ID 740542-x
    ISSN 1879-2782 ; 0893-6080
    ISSN (online) 1879-2782
    ISSN 0893-6080
    DOI 10.1016/j.neunet.2023.01.026
    Database MEDical Literature Analysis and Retrieval System OnLINE

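SPIDE in record 7 above extends implicit differentiation on the equilibrium state (IDE) to spike-based computation. The numpy sketch below shows only the vanilla IDE idea it builds on: iterate to a fixed point, then backpropagate through the equilibrium by solving a linear system via the implicit function theorem. The spike-based solver of SPIDE itself is not reproduced here, and the toy loss is an assumption.

```python
# Implicit differentiation at a fixed point z* = f(z*, x).
import numpy as np

rng = np.random.default_rng(0)
n = 5
W = rng.normal(scale=0.3, size=(n, n))
U = rng.normal(scale=0.5, size=(n, n))
x = rng.normal(size=n)

f = lambda z: np.tanh(W @ z + U @ x)

z = np.zeros(n)
for _ in range(100):               # forward: fixed-point iteration
    z = f(z)

# Backward: dL/dz* for a toy loss L = 0.5 * ||z*||^2 is just z*.
dL_dz = z
# Jacobian of f at the equilibrium: diag(1 - tanh^2) @ W.
J = np.diag(1 - np.tanh(W @ z + U @ x) ** 2) @ W
# Implicit function theorem: solve (I - J)^T g = dL/dz instead of unrolling.
g = np.linalg.solve((np.eye(n) - J).T, dL_dz)
print(g)
```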

  8. Article ; Online: Learning Deep Sparse Regularizers With Applications to Multi-View Clustering and Semi-Supervised Classification.

    Wang, Shiping / Chen, Zhaoliang / Du, Shide / Lin, Zhouchen

    IEEE transactions on pattern analysis and machine intelligence

    2022  Volume 44, Issue 9, Page(s) 5042–5055

    Abstract Sparsity-constrained optimization problems are common in machine learning, such as sparse coding, low-rank minimization, and compressive sensing. However, most previous studies focused on constructing various hand-crafted sparse regularizers, while little work was devoted to learning adaptive sparse regularizers from given input data for specific tasks. In this paper, we propose a deep sparse regularizer learning model that learns data-driven sparse regularizers adaptively. Via the proximal gradient algorithm, we find that learning the sparse regularizer is equivalent to learning a parameterized activation function. This encourages us to learn sparse regularizers in the deep learning framework. Therefore, we build a neural network composed of multiple blocks, each being differentiable and reusable. All blocks contain learnable piecewise-linear activation functions, which correspond to the sparse regularizer to be learned. Furthermore, the proposed model is trained with back-propagation, and all parameters in this model are learned end-to-end. We apply our framework to multi-view clustering and semi-supervised classification tasks to learn a latent compact representation. Experimental results demonstrate the superiority of the proposed framework over state-of-the-art multi-view learning models.
    Language English
    Publishing date 2022-08-04
    Publishing country United States
    Document type Journal Article
    ISSN 1939-3539
    ISSN (online) 1939-3539
    DOI 10.1109/TPAMI.2021.3082632
    Database MEDical Literature Analysis and Retrieval System OnLINE

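The key observation in record 8 above, that one proximal gradient step turns the regularizer into an activation function, has a familiar special case: the prox of the $\ell_1$ norm is soft-thresholding. The sketch below shows that fixed activation and a learnable piecewise-linear stand-in initialized at it; the knot-based parameterization is an illustrative assumption, not necessarily the paper's.

```python
# One proximal gradient step reads z <- prox_R(z - eta * grad); for R = l1,
# prox is soft-thresholding, a fixed piecewise-linear "activation".
import numpy as np

def soft_threshold(z, lam):
    """prox of lam*||.||_1: the classic fixed sparse activation."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def learnable_pwl(z, knots, values):
    """A learnable piecewise-linear activation standing in for the prox."""
    return np.interp(z, knots, values)

z = np.linspace(-2, 2, 9)
print(soft_threshold(z, 0.5))
# Initialize the learnable activation at soft-thresholding, then let
# training adapt `values` end-to-end (training loop omitted).
knots = np.linspace(-2, 2, 21)
print(learnable_pwl(z, knots, soft_threshold(knots, 0.5)))
```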

  9. Book ; Online: Restarted Nonconvex Accelerated Gradient Descent

    Li, Huan / Lin, Zhouchen

    No More Polylogarithmic Factor in the $O(\epsilon^{-7/4})$ Complexity

    2022  

    Abstract This paper studies accelerated gradient methods for nonconvex optimization with Lipschitz continuous gradient and Hessian. We propose two simple accelerated gradient methods, restarted accelerated gradient descent (AGD) and the restarted heavy ball (HB) method, and establish that our methods achieve an $\epsilon$-approximate first-order stationary point within $O(\epsilon^{-7/4})$ gradient evaluations by elementary proofs. Theoretically, our complexity does not hide any polylogarithmic factors, and thus it improves over the best known one by an $O(\log\frac{1}{\epsilon})$ factor. Our algorithms are simple in the sense that they only consist of Nesterov's classical AGD or Polyak's HB iterations, as well as a restart mechanism. They do not invoke negative curvature exploitation or minimization of regularized surrogate functions as subroutines. In contrast with existing analyses, our elementary proofs use less advanced techniques and do not invoke the analysis of strongly convex AGD or HB. Code is available at https://github.com/lihuanML/RestartAGD.
    Keywords Mathematics - Optimization and Control ; Computer Science - Machine Learning
    Subject code 510
    Publishing date 2022-01-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

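A minimal sketch of the restart mechanism described in record 9 above: run Nesterov's classical AGD and reset the momentum when a restart condition fires. The function-value test used below is a common heuristic stand-in, and the step size and toy nonconvex objective are assumptions; the paper's actual restart criterion and guarantees are in the source.

```python
# Restarted Nesterov AGD on a toy nonconvex objective.
import numpy as np

def restarted_agd(f, grad, x0, lr=0.1, T=200):
    x, y, k = x0.copy(), x0.copy(), 0
    for _ in range(T):
        x_new = y - lr * grad(y)             # gradient step at extrapolated y
        mom = k / (k + 3)                    # Nesterov-style momentum weight
        y = x_new + mom * (x_new - x)        # extrapolation
        if f(x_new) > f(x):                  # restart heuristic: kill momentum
            y, k = x_new, 0
        else:
            k += 1
        x = x_new
    return x

f = lambda x: 0.5 * np.sum(x ** 2) + np.sum(np.cos(x))   # nonconvex toy
grad = lambda x: x - np.sin(x)
print(restarted_agd(f, grad, np.full(3, 2.0)))           # approaches 0
```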

  10. Book ; Online: Code Prompting

    Hu, Yi / Yang, Haotong / Lin, Zhouchen / Zhang, Muhan

    a Neural Symbolic Method for Complex Reasoning in Large Language Models

    2023  

    Abstract Large language models (LLMs) have scaled up to unlock a wide range of complex reasoning tasks with the aid of various prompting methods. However, current prompting methods generate natural-language intermediate steps to help reasoning, which can cause imperfect task reduction and confusion. To mitigate such limitations, we explore code prompting, a neural symbolic prompting method with both zero-shot and few-shot versions, which triggers code as the intermediate steps. We conduct experiments on 7 widely used benchmarks involving symbolic reasoning and arithmetic reasoning. Code prompting generally outperforms chain-of-thought (CoT) prompting. To further understand the performance and limitations of code prompting, we perform extensive ablation studies and error analyses, and identify several exclusive advantages of using symbolic prompting compared to natural language. We also consider the ensemble of code prompting and CoT prompting to combine the strengths of both. Finally, we show through experiments how code annotations and their locations affect code prompting.
    Keywords Computer Science - Computation and Language ; Computer Science - Artificial Intelligence
    Subject code 401
    Publishing date 2023-05-29
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

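Finally, the zero-shot variant of code prompting in record 10 above can be pictured as a prompt template that asks for code rather than natural-language steps. The snippet below is a schematic illustration only: the prompt wording, the example problem, and the hard-coded candidate completion are all assumptions, and no actual LLM call is made.

```python
# Schematic zero-shot code prompt: code as the intermediate reasoning step.
QUESTION = "Tom has 3 boxes with 12 apples each. He gives away 7. How many remain?"

code_prompt = (
    "Solve the problem by writing Python code, then state the answer.\n"
    "Use code as your intermediate reasoning steps.\n\n"
    f"Problem: {QUESTION}\n"
    "# Python solution:\n"
)

# One would send `code_prompt` to an LLM and execute or read the returned
# code. A possible returned intermediate step might look like this:
candidate = "total = 3 * 12 - 7\nprint(total)"
exec(candidate)   # -> 29, the final answer read off the code's output
```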
