LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-10 of 33

  1. Book ; Online: Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models

    Lovenia, Holy / Dai, Wenliang / Cahyawijaya, Samuel / Ji, Ziwei / Fung, Pascale

    2023  

    Abstract Object hallucination poses a significant challenge in vision-language (VL) models, often leading to the generation of nonsensical or unfaithful responses with non-existent objects. However, the absence of a general measurement for evaluating object hallucination in VL models has hindered our understanding and ability to mitigate this issue. In this work, we present NOPE (Negative Object Presence Evaluation), a novel benchmark designed to assess object hallucination in VL models through visual question answering (VQA). We propose a cost-effective and scalable approach utilizing large language models to generate 29.5k synthetic negative pronoun (NegP) data of high quality for NOPE. We extensively investigate the performance of 10 state-of-the-art VL models in discerning the non-existence of objects in visual questions, where the ground truth answers are denoted as NegP (e.g., "none"). Additionally, we evaluate their standard performance on visual questions on 9 other VQA datasets. Through our experiments, we demonstrate that no VL model is immune to the vulnerability of object hallucination, as all models achieve accuracy below 10% on NegP. Furthermore, we uncover that lexically diverse visual questions, question types with large scopes, and scene-relevant objects capitalize the risk of object hallucination in VL models.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Computation and Language
    Subject code 004
    Publishing date 2023-10-08
    Publishing country US
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
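    As a concrete illustration of the NegP evaluation described above: a model answers visual questions whose gold answer is a negative pronoun such as "none", and NegP accuracy is the share of exact matches. A minimal sketch in Python, assuming a simple string-match scorer; the pronoun list and example data are hypothetical, not drawn from the NOPE benchmark itself.

        # Hypothetical NegP scorer: counts answers matching a negative-pronoun
        # gold label. Illustrative only; not the NOPE evaluation code.
        NEGATIVE_PRONOUNS = {"none", "nothing", "nobody", "no one", "neither", "nowhere"}

        def negp_accuracy(predictions, gold_answers):
            """Fraction of questions answered with the correct negative pronoun."""
            correct = sum(
                1 for pred, gold in zip(predictions, gold_answers)
                if gold.strip().lower() in NEGATIVE_PRONOUNS
                and pred.strip().lower() == gold.strip().lower()
            )
            return correct / len(gold_answers)

        # A model that hallucinates objects rarely answers "none":
        preds = ["a red ball", "none", "two dogs", "nothing"]
        golds = ["none", "none", "none", "nothing"]
        print(negp_accuracy(preds, golds))  # 0.5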

  2. Book ; Online: Directional convergence and alignment in deep learning

    Ji, Ziwei / Telgarsky, Matus

    2020  

    Abstract In this paper, we show that although the minimizers of cross-entropy and related classification losses are off at infinity, network weights learned by gradient flow converge in direction, with an immediate corollary that network predictions, training errors, and the margin distribution also converge. This proof holds for deep homogeneous networks -- a broad class of networks allowing for ReLU, max-pooling, linear, and convolutional layers -- and we additionally provide empirical support not just close to the theory (e.g., the AlexNet), but also on non-homogeneous networks (e.g., the DenseNet). If the network further has locally Lipschitz gradients, we show that these gradients also converge in direction, and asymptotically align with the gradient flow path, with consequences on margin maximization, convergence of saliency maps, and a few other settings. Our analysis complements and is distinct from the well-known neural tangent and mean-field theories, and in particular makes no requirements on network width and initialization, instead merely requiring perfect classification accuracy. The proof proceeds by developing a theory of unbounded nonsmooth Kurdyka-Łojasiewicz inequalities for functions definable in an o-minimal structure, and is also applicable outside deep learning.

    Comment: To appear, NeurIPS 2020
    Keywords Computer Science - Machine Learning ; Computer Science - Neural and Evolutionary Computing ; Mathematics - Optimization and Control ; Statistics - Machine Learning
    Publishing date 2020-06-11
    Publishing country US
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
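    To unpack the abstract's central claim: directional convergence means that the gradient-flow weights w(t), which grow without bound, stabilize after normalization. A sketch of the statement in the abstract's notation (a paraphrase, not the paper's exact theorem):

        \[
          \dot{w}(t) \in -\partial \mathcal{L}\big(w(t)\big),
          \qquad
          \lim_{t \to \infty} \frac{w(t)}{\lVert w(t) \rVert} = \bar{w}
          \ \text{ exists},
        \]

    from which convergence of predictions, training errors, and the margin distribution follows as a corollary.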

  3. Article ; Online: Inhibition of thioredoxin-1 enhances the toxicity of glycolysis inhibitor 2-deoxyglucose by downregulating SLC1A5 expression in colorectal cancer cells.

    Tang, Tianbin / Fang, Daoquan / Ji, Ziwei / Zhong, Zuyue / Zhou, Baojian / Ye, Lechi / Jiang, Lei / Sun, Xuecheng

    Cellular oncology (Dordrecht)

    2023  Volume 47, Issue 2, Page(s) 607–621

    Abstract Background: Targeting glycolysis in cancer is an attractive approach for therapeutic intervention. 2-Deoxyglucose (2DG) is a synthetic glucose analog that inhibits glycolysis. However, its efficacy is limited by the systemic toxicity at high doses. Understanding the mechanism of 2DG resistance is important for further use of this drug in cancer treatment.
    Methods: The expression of thioredoxin-1 (Trx-1) in colorectal cancer (CRC) cells treated with 2DG was detected by Western blotting. The effect of Trx-1 on the cytotoxicity of 2DG in CRC cells was examined in vitro and in vivo. The molecular mechanism of Trx-1-mediated activation of the SLC1A5 promoter was elucidated using in vitro models.
    Results: Inhibition of glycolysis with 2DG increased the expression of Trx-1 in CRC cells. Overexpression of Trx-1 decreased the cytotoxicity of 2DG, whereas knockdown of Trx-1 by shRNA significantly increased the cytotoxicity of 2DG in CRC cells. The Trx-1 inhibitor PX-12 increased the cytotoxicity of 2DG in CRC cells both in vitro and in vivo. In addition, Trx-1 promoted SLC1A5 expression by binding to SP1 and thereby increasing the activity of the SLC1A5 gene promoter. We also found that SLC1A5 expression was upregulated in CRC tissues, and inhibition of SLC1A5 significantly enhanced the inhibitory effect of 2DG on the growth of CRC cells in vitro and in vivo. Overexpression of SLC1A5 reduced the cytotoxicity of 2DG in combination with PX-12 treatment in CRC cells.
    Conclusion: Our results demonstrate a novel adaptive mechanism of glycolytic inhibition in which Trx-1 increases GSH levels by regulating SLC1A5 to rescue cytotoxicity induced by 2DG in CRC cells. Inhibition of glycolysis in combination with inhibition of Trx-1 or SLC1A5 may be a promising strategy for the treatment of CRC.
    MeSH term(s) Humans ; Deoxyglucose/pharmacology ; Thioredoxins/metabolism ; Thioredoxins/genetics ; Colorectal Neoplasms/genetics ; Colorectal Neoplasms/pathology ; Colorectal Neoplasms/metabolism ; Colorectal Neoplasms/drug therapy ; Glycolysis/drug effects ; Down-Regulation/drug effects ; Cell Line, Tumor ; Animals ; Mice, Nude ; Gene Expression Regulation, Neoplastic/drug effects ; Promoter Regions, Genetic/genetics ; Mice ; Mice, Inbred BALB C ; Sp1 Transcription Factor/metabolism ; Xenograft Model Antitumor Assays ; Disulfides ; Imidazoles
    Chemical Substances Deoxyglucose (9G2MP84A8W) ; Thioredoxins (52500-60-4) ; Sp1 Transcription Factor ; 1-methylpropyl-2-imidazolyl disulfide (8PQ9CZ8BTJ) ; Disulfides ; Imidazoles
    Language English
    Publishing date 2023-10-23
    Publishing country Netherlands
    Document type Journal Article
    ZDB-ID 2595109-9
    ISSN (online) 2211-3436
    ISSN 1875-8606 ; 2211-3428
    DOI 10.1007/s13402-023-00887-6
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Book ; Online: Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training

    Dai, Wenliang / Liu, Zihan / Ji, Ziwei / Su, Dan / Fung, Pascale

    2022  

    Abstract Large-scale vision-language pre-trained (VLP) models are prone to hallucinate non-existent visual objects when generating text based on visual information. In this paper, we systematically study the object hallucination problem from three aspects. First, we examine recent state-of-the-art VLP models, showing that they still hallucinate frequently, and models achieving better scores on standard metrics (e.g., CIDEr) could be more unfaithful. Second, we investigate how different types of image encoding in VLP influence hallucination, including region-based, grid-based, and patch-based. Surprisingly, we find that patch-based features perform the best and smaller patch resolution yields a non-trivial reduction in object hallucination. Third, we decouple various VLP objectives and demonstrate that token-level image-text alignment and controlled generation are crucial to reducing hallucination. Based on that, we propose a simple yet effective VLP loss named ObjMLM to further mitigate object hallucination. Results show that it reduces object hallucination by up to 17.4% when tested on two benchmarks (COCO Caption for in-domain and NoCaps for out-of-domain evaluation).

    Comment: Accepted at EACL 2023
    Keywords Computer Science - Computation and Language ; Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2022-10-14
    Publishing country US
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
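    To make the object hallucination problem measurable in the spirit of the abstract, one common recipe (a CHAIR-style rate) scores the share of caption objects absent from the image's ground-truth objects. A minimal sketch, assuming object lists have already been extracted; this is illustrative, not the paper's code or its ObjMLM loss.

        # Hypothetical CHAIR-style hallucination rate over extracted object lists.
        def hallucination_rate(caption_objects, image_objects):
            """Share of mentioned objects that do not appear in the image."""
            mentioned = {o.lower() for o in caption_objects}
            present = {o.lower() for o in image_objects}
            if not mentioned:
                return 0.0
            return len(mentioned - present) / len(mentioned)

        # The caption mentions a "frisbee" that the image does not contain:
        print(hallucination_rate(["dog", "frisbee"], ["dog", "grass"]))  # 0.5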

  5. Book ; Online: Contrastive Learning for Inference in Dialogue

    Ishii, Etsuko / Xu, Yan / Wilie, Bryan / Ji, Ziwei / Lovenia, Holy / Chung, Willy / Fung, Pascale

    2023  

    Abstract Inference, especially that derived from inductive processes, is a crucial component of conversation, complementing the information implicitly or explicitly conveyed by a speaker. While recent large language models show remarkable advances in inference tasks, their performance in inductive reasoning, where not all information is present in the context, is far behind deductive reasoning. In this paper, we analyze the behavior of the models based on the task difficulty defined by the semantic information gap -- which distinguishes inductive and deductive reasoning (Johnson-Laird, 1988, 1993). Our analysis reveals that the disparity in information between dialogue contexts and desired inferences poses a significant challenge to the inductive inference process. To mitigate this information gap, we investigate a contrastive learning approach by feeding negative samples. Our experiments suggest that negative samples help models understand what is wrong and improve their inference generation.

    Comment: Accepted to EMNLP 2023
    Keywords Computer Science - Computation and Language
    Subject code 160
    Publishing date 2023-10-19
    Publishing country US
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
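    The contrastive approach mentioned above pushes the model toward the gold inference and away from negative samples. A minimal sketch of one standard instantiation (an InfoNCE-style loss over embedding similarities); the embeddings and the loss form are assumptions for illustration, not the paper's exact objective.

        # InfoNCE-style contrastive loss: the positive must outscore the negatives.
        import numpy as np

        def info_nce(context, positive, negatives, temperature=0.1):
            candidates = np.vstack([positive] + list(negatives))
            scores = candidates @ context / temperature   # similarity scores
            scores -= scores.max()                        # numerical stability
            probs = np.exp(scores) / np.exp(scores).sum()
            return -np.log(probs[0])                      # index 0 = positive

        ctx = np.array([1.0, 0.0])                            # dialogue-context embedding
        pos = np.array([0.9, 0.1])                            # gold inference
        negs = [np.array([-0.8, 0.6]), np.array([0.0, 1.0])]  # negative samples
        print(info_nce(ctx, pos, negs))                       # small loss: positive wins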

  6. Book ; Online: Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks

    Ji, Ziwei / Telgarsky, Matus

    2019  

    Abstract Recent theoretical work has guaranteed that overparameterized networks trained by gradient descent achieve arbitrarily low training error, and sometimes even low test error. The required width, however, is always polynomial in at least one of the sample size $n$, the (inverse) target error $1/\epsilon$, and the (inverse) failure probability $1/\delta$. This work shows that $\widetilde{O}(1/\epsilon)$ iterations of gradient descent with $\widetilde{\Omega}(1/\epsilon^2)$ training examples on two-layer ReLU networks of any width exceeding $\mathrm{polylog}(n,1/\epsilon,1/\delta)$ suffice to achieve a test misclassification error of $\epsilon$. The analysis further relies upon a margin property of the limiting kernel, which is guaranteed positive, and can distinguish between true labels and random labels.
    Keywords Computer Science - Machine Learning ; Mathematics - Optimization and Control ; Statistics - Machine Learning
    Subject code 512
    Publishing date 2019-09-26
    Publishing country US
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
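    In compact form, the guarantee stated in the abstract reads as follows (same notation as the abstract; a paraphrase, not the paper's formal theorem):

        \[
          \text{width} \ge \mathrm{polylog}(n,\, 1/\epsilon,\, 1/\delta),
          \quad
          \widetilde{O}(1/\epsilon) \text{ gradient descent steps},
          \quad
          n = \widetilde{\Omega}(1/\epsilon^{2}) \text{ samples}
          \;\Longrightarrow\;
          \Pr\big[\operatorname{sign} f(x) \neq y\big] \le \epsilon .
        \]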

  7. Book ; Online: Generalization bounds via distillation

    Hsu, Daniel / Ji, Ziwei / Telgarsky, Matus / Wang, Lan

    2021  

    Abstract This paper theoretically investigates the following empirical phenomenon: given a high-complexity network with poor generalization bounds, one can distill it into a network with nearly identical predictions but low complexity and vastly smaller generalization bounds. The main contribution is an analysis showing that the original network inherits this good generalization bound from its distillation, assuming the use of well-behaved data augmentation. This bound is presented both in an abstract and in a concrete form, the latter complemented by a reduction technique to handle modern computation graphs featuring convolutional layers, fully-connected layers, and skip connections, to name a few. To round out the story, a (looser) classical uniform convergence analysis of compression is also presented, as well as a variety of experiments on CIFAR and MNIST demonstrating similar generalization performance between the original network and its distillation.

    Comment: To appear, ICLR 2021
    Keywords Computer Science - Machine Learning
    Publishing date 2021-04-12
    Publishing country US
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
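    The phenomenon analyzed above rests on standard knowledge distillation: a student is trained to match the teacher's softened output distribution. A minimal sketch of that objective on toy logits; the paper's contribution is the generalization analysis, not this training recipe.

        # Distillation objective: KL divergence from the teacher's softened
        # distribution to the student's. Toy logits for illustration.
        import numpy as np

        def softmax(z, temperature=1.0):
            z = z / temperature
            z = z - z.max()          # numerical stability
            e = np.exp(z)
            return e / e.sum()

        def distillation_loss(student_logits, teacher_logits, temperature=2.0):
            p = softmax(teacher_logits, temperature)
            q = softmax(student_logits, temperature)
            return float(np.sum(p * (np.log(p) - np.log(q))))

        print(distillation_loss(np.array([2.0, 0.5, -1.0]),
                                np.array([1.8, 0.6, -0.9])))  # near 0: close match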

  8. Book ; Online: Characterizing the implicit bias via a primal-dual analysis

    Ji, Ziwei / Telgarsky, Matus

    2019  

    Abstract This paper shows that the implicit bias of gradient descent on linearly separable data is exactly characterized by the optimal solution of a dual optimization problem given by a smoothed margin, even for general losses. This is in contrast to prior results, which are often tailored to exponentially-tailed losses. For the exponential loss specifically, with $n$ training examples and $t$ gradient descent steps, our dual analysis further allows us to prove an $O(\ln(n)/\ln(t))$ convergence rate to the $\ell_2$ maximum margin direction, when a constant step size is used. This rate is tight in both $n$ and $t$, which has not been presented by prior work. On the other hand, with a properly chosen but aggressive step size schedule, we prove $O(1/t)$ rates for both $\ell_2$ margin maximization and implicit bias, whereas prior work (including all first-order methods for the general hard-margin linear SVM problem) proved $\widetilde{O}(1/\sqrt{t})$ margin rates, or $O(1/t)$ margin rates to a suboptimal margin, with an implied (slower) bias rate. Our key observations include that gradient descent on the primal variable naturally induces a mirror descent update on the dual variable, and that the dual objective in this setting is smooth enough to give a faster rate.
    Keywords Computer Science - Machine Learning ; Mathematics - Optimization and Control ; Statistics - Machine Learning
    Subject code 510
    Publishing date 2019-06-11
    Publishing country US
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
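    One way to write the two rates claimed in the abstract, for gradient descent on the exponential loss with n examples after t steps (a paraphrase of the abstract, with \bar{u} denoting the \ell_2 maximum margin direction):

        \[
          \text{constant step size:}\quad
          1 - \Big\langle \tfrac{w_t}{\lVert w_t \rVert_2},\, \bar{u} \Big\rangle
          = O\!\Big(\tfrac{\ln n}{\ln t}\Big),
          \qquad
          \text{aggressive schedule:}\quad
          \text{margin and bias gaps} = O(1/t).
        \]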

  9. Book ; Online: Actor-critic is implicitly biased towards high entropy optimal policies

    Hu, Yuzheng / Ji, Ziwei / Telgarsky, Matus

    2021  

    Abstract We show that the simplest actor-critic method -- a linear softmax policy updated with TD through interaction with a linear MDP, but featuring no explicit regularization or exploration -- does not merely find an optimal policy, but moreover prefers high entropy optimal policies. To demonstrate the strength of this bias, the algorithm not only has no regularization, no projections, and no exploration like $\epsilon$-greedy, but is moreover trained on a single trajectory with no resets. The key consequence of the high entropy bias is that uniform mixing assumptions on the MDP, which exist in some form in all prior work, can be dropped: the implicit regularization of the high entropy bias is enough to ensure that all chains mix and an optimal policy is reached with high probability. As auxiliary contributions, this work decouples concerns between the actor and critic by writing the actor update as an explicit mirror descent, provides tools to uniformly bound mixing times within KL balls of policy space, and provides a projection-free TD analysis with its own implicit bias which can be run from an unmixed starting distribution.

    Comment: v2 primarily improved the proofs, with minimal changes to the body
    Keywords Computer Science - Machine Learning
    Subject code 519
    Publishing date 2021-10-21
    Publishing country US
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
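    To see what "no regularization, no exploration" means concretely, here is a minimal actor-critic sketch in the spirit of the abstract: a softmax policy updated with scores from a TD(0) critic, on a single unresetting trajectory. The toy two-state MDP, step sizes, and tabular critic are assumptions for illustration, not the paper's construction.

        # Plain actor-critic: softmax policy + TD(0) critic, no entropy bonus,
        # no projections, no epsilon-greedy, single trajectory with no resets.
        import numpy as np

        rng = np.random.default_rng(0)
        P = np.array([[[0.9, 0.1], [0.1, 0.9]],   # P[s][a] = next-state probs
                      [[0.8, 0.2], [0.2, 0.8]]])
        R = np.array([[1.0, 0.0], [0.0, 1.0]])    # R[s][a]: one good action per state
        gamma = 0.9
        theta = np.zeros((2, 2))                  # softmax policy parameters
        Q = np.zeros((2, 2))                      # tabular critic (a linear special case)

        def policy(s):
            z = theta[s] - theta[s].max()
            p = np.exp(z)
            return p / p.sum()

        s = 0
        for _ in range(5000):
            a = rng.choice(2, p=policy(s))
            s2 = rng.choice(2, p=P[s, a])
            a2 = rng.choice(2, p=policy(s2))
            Q[s, a] += 0.1 * (R[s, a] + gamma * Q[s2, a2] - Q[s, a])  # TD(0) update
            grad = -policy(s)
            grad[a] += 1.0                        # grad of log pi(a|s) for softmax
            theta[s] += 0.01 * Q[s, a] * grad     # actor step, no regularizer
            s = s2

        print(policy(0), policy(1))  # each concentrates on its rewarding action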

  10. Book ; Online: Early-stopped neural networks are consistent

    Ji, Ziwei / Li, Justin D. / Telgarsky, Matus

    2021  

    Abstract This work studies the behavior of shallow ReLU networks trained with the logistic loss via gradient descent on binary classification data where the underlying data distribution is general, and the (optimal) Bayes risk is not necessarily zero. In this setting, it is shown that gradient descent with early stopping achieves population risk arbitrarily close to optimal in terms of not just logistic and misclassification losses, but also in terms of calibration, meaning the sigmoid mapping of its outputs approximates the true underlying conditional distribution arbitrarily finely. Moreover, the necessary iteration, sample, and architectural complexities of this analysis all scale naturally with a certain complexity measure of the true conditional model. Lastly, while it is not shown that early stopping is necessary, it is shown that any univariate classifier satisfying a local interpolation property is inconsistent.
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Publishing date 2021-06-10
    Publishing country US
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
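    The algorithmic core above is gradient descent on the logistic loss halted by a validation criterion. A minimal sketch with logistic regression standing in for the shallow ReLU network, on synthetic noisy-label data so that the Bayes risk is nonzero, matching the setting; all names and constants here are illustrative.

        # Early stopping: track held-out logistic loss, stop once it plateaus.
        import numpy as np

        rng = np.random.default_rng(1)
        X = rng.normal(size=(400, 2))
        y = (X[:, 0] + 0.5 * rng.normal(size=400) > 0).astype(float)  # noisy labels
        Xtr, ytr, Xva, yva = X[:300], y[:300], X[300:], y[300:]

        def logistic_loss(w, X, y):
            z = X @ w
            return np.mean(np.logaddexp(0.0, -z) + (1 - y) * z)

        w = np.zeros(2)
        best_w, best_val, patience = w.copy(), np.inf, 0
        for t in range(10_000):
            p = 1 / (1 + np.exp(-(Xtr @ w)))
            w -= 0.1 * Xtr.T @ (p - ytr) / len(ytr)   # gradient step on train loss
            val = logistic_loss(w, Xva, yva)
            if val < best_val - 1e-6:
                best_val, best_w, patience = val, w.copy(), 0
            else:
                patience += 1
                if patience >= 50:                    # validation loss has plateaued
                    break

        print(t, best_val)  # stopping iteration and best held-out loss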
