LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-10 of 84

  1. Article ; Online: In-Domain GAN Inversion for Faithful Reconstruction and Editability.

    Zhu, Jiapeng / Shen, Yujun / Xu, Yinghao / Zhao, Deli / Chen, Qifeng / Zhou, Bolei

    IEEE transactions on pattern analysis and machine intelligence

    2024  Volume 46, Issue 5, Page(s) 2607–2621

    Abstract Generative Adversarial Networks (GANs) have significantly advanced image synthesis by mapping randomly sampled latent codes to high-fidelity synthesized images. However, applying well-trained GANs to real image editing remains challenging. A common solution is to find an approximate latent code that can adequately recover the input image to be edited, a task also known as GAN inversion. To invert a GAN model, prior works typically focus on reconstructing the target image at the pixel level, yet few studies examine whether the inverted result can support manipulation at the semantic level. This work fills this gap by proposing in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer, to regularize the inverted code in the native latent space of the pre-trained GAN model. In this way, we manage to sufficiently reuse the knowledge learned by GANs for image reconstruction, facilitating a wide range of editing applications without any retraining. We further provide comprehensive analyses of the effects of the encoder structure, the starting inversion point, and the inversion parameter space, and observe a trade-off between reconstruction quality and editability. This trade-off sheds light on how a GAN model represents an image with various semantics encoded in the learned latent distribution.
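
    A minimal sketch of the two-stage inversion this abstract describes, assuming hypothetical PyTorch callables G (the pre-trained generator) and E (the domain-guided encoder); the step count and loss weight are illustrative, not the paper's settings.

        import torch

        def invert(G, E, x, steps=200, lr=0.01, lam=2.0):
            # Start from the domain-guided encoder's code, then optimize it so
            # that G(z) matches x while E pulls z back toward the native domain.
            z = E(x).detach().clone().requires_grad_(True)
            opt = torch.optim.Adam([z], lr=lr)
            for _ in range(steps):
                x_rec = G(z)
                loss_rec = torch.nn.functional.mse_loss(x_rec, x)     # pixel level
                loss_dom = torch.nn.functional.mse_loss(E(x_rec), z)  # in-domain term
                loss = loss_rec + lam * loss_dom
                opt.zero_grad()
                loss.backward()
                opt.step()
            return z.detach()
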
    Language English
    Publishing date 2024-04-03
    Publishing country United States
    Document type Journal Article
    ISSN (online) 1939-3539
    DOI 10.1109/TPAMI.2023.3310872
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: Development and Validation of Esophageal Squamous Cell Carcinoma Risk Prediction Models Based on an Endoscopic Screening Program.

    Han, Junming / Guo, Xiaolei / Zhao, Li / Zhang, Huan / Ma, Siqi / Li, Yan / Zhao, Deli / Wang, Jialin / Xue, Fuzhong

    JAMA network open

    2023  Volume 6, Issue 1, Page(s) e2253148

    Abstract Importance: Assessment tools are lacking for screening of esophageal squamous cell cancer (ESCC) in China, especially for the follow-up stage. Risk prediction to optimize the screening procedure is urgently needed.
    Objective: To develop and validate ESCC prediction models for identifying people at high risk for follow-up decision-making.
    Design, setting, and participants: This open, prospective multicenter diagnostic study has been performed since September 1, 2006, in Shandong Province, China. This study used baseline and follow-up data until December 31, 2021. The data were analyzed between April 6 and May 31, 2022. Eligibility criteria consisted of rural residents aged 40 to 69 years who had no contraindications for endoscopy. Among 161 212 eligible participants, those diagnosed with cancer or who had cancer at baseline, did not complete the questionnaire, were younger than 40 years or older than 69 years, or were found to have severe dysplasia or worse lesions were excluded from the analysis.
    Exposures: Risk factors obtained by questionnaire and endoscopy.
    Main outcomes and measures: Pathological diagnosis of ESCC and confirmation by cancer registry data.
    Results: In this diagnostic study of 104 129 participants (56.39% women; mean [SD] age, 54.31 [7.64] years), 59 481 (mean [SD] age, 53.83 [7.64] years; 58.55% women) formed the derivation set while 44 648 (mean [SD] age, 54.95 [7.60] years; 53.51% women) formed the validation set. A total of 252 new cases of ESCC were diagnosed during 424 903.50 person-years of follow-up in the derivation cohort and 61 new cases from 177 094.10 person-years follow-up in the validation cohort. Model A included the covariates age, sex, and number of lesions; model B included age, sex, smoking status, alcohol use status, body mass index, annual household income, history of gastrointestinal tract diseases, consumption of pickled food, number of lesions, distinct lesions, and mild or moderate dysplasia. The Harrell C statistic of model A was 0.80 (95% CI, 0.77-0.83) in the derivation set and 0.90 (95% CI, 0.87-0.93) in the validation set; the Harrell C statistic of model B was 0.83 (95% CI, 0.81-0.86) and 0.91 (95% CI, 0.88-0.95), respectively. The models also had good calibration performance and clinical usefulness.
    Conclusions and relevance: The findings of this diagnostic study suggest that the models developed are suitable for selecting high-risk populations for follow-up decision-making and optimizing the cancer screening process.
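
    The reported Harrell C statistics point to a time-to-event model; the sketch below therefore fits a Cox proportional hazards model with the lifelines library, using its bundled demo dataset since the screening cohort data are not public. The model class and column names are assumptions.

        from lifelines import CoxPHFitter
        from lifelines.datasets import load_rossi

        df = load_rossi()              # stand-in data: duration "week", event "arrest"
        cph = CoxPHFitter()
        cph.fit(df, duration_col="week", event_col="arrest")
        print(cph.concordance_index_)  # Harrell C statistic on the fitted data
        # For the study's models, the covariates would instead be age, sex, and
        # number of lesions (model A), plus lifestyle and endoscopic findings (model B).
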
    MeSH term(s) Humans ; Female ; Middle Aged ; Male ; Esophageal Squamous Cell Carcinoma/diagnosis ; Esophageal Squamous Cell Carcinoma/epidemiology ; Esophageal Neoplasms/diagnosis ; Esophageal Neoplasms/epidemiology ; Esophageal Neoplasms/pathology ; Prospective Studies ; Risk Factors ; Endoscopy, Gastrointestinal
    Language English
    Publishing date 2023-01-03
    Publishing country United States
    Document type Multicenter Study ; Journal Article ; Research Support, Non-U.S. Gov't
    ISSN (online) 2574-3805
    DOI 10.1001/jamanetworkopen.2022.53148
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article: Purine salvage-associated metabolites as biomarkers for early diagnosis of esophageal squamous cell carcinoma: a diagnostic model-based study.

    Sun, Yawen / Liu, Wenjuan / Su, Mu / Zhang, Tao / Li, Xia / Liu, Wenbin / Cai, Yuping / Zhao, Deli / Yang, Ming / Zhu, Zhengjiang / Wang, Jialin / Yu, Jinming

    Cell death discovery

    2024  Volume 10, Issue 1, Page(s) 139

    Abstract Esophageal squamous cell carcinoma (ESCC) remains an important health concern in developing countries. Patients with advanced ESCC have a poor prognosis and survival rate, and achieving early diagnosis remains a challenge. Metabolic biomarkers are gradually gaining attention as early diagnostic biomarkers. Hence, this multicenter study comprehensively evaluated metabolic dysregulation in ESCC through an integrated research strategy to identify key metabolite biomarkers of ESCC. First, the metabolic profiles were examined in tissue and serum samples from the discovery cohort (n = 162; ESCC patients, n = 81; healthy volunteers, n = 81), and ESCC tissue-induced metabolite alterations were observed in the serum. Afterward, RNA sequencing of tissue samples (n = 46) was performed, followed by an integrated analysis of metabolomics and transcriptomics. Potential biomarkers for ESCC were then identified by screening gene-metabolite regulatory networks. The diagnostic value of the identified biomarkers was validated in a validation cohort (n = 220), and their biological function was verified. A total of 457 dysregulated metabolites were identified in the serum, of which 36 were induced by tumor tissues. The integrated analyses revealed significant alterations in the purine salvage pathway, wherein the abundance of hypoxanthine/xanthine exhibited a positive correlation with HPRT1 expression and tumor size. A diagnostic model was developed using two purine salvage-associated metabolites. This model could accurately discriminate patients with ESCC from normal individuals, with an area under the curve (AUC) of 0.765 (95% confidence interval (CI): 0.680-0.843) in the external cohort. Hypoxanthine and HPRT1 exerted a synergistic effect in terms of promoting ESCC progression. These findings are anticipated to provide valuable support in developing novel diagnostic approaches for early ESCC and to enhance our comprehension of the metabolic mechanisms underlying this disease.
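
    As a rough illustration of how a two-metabolite diagnostic model of this kind is typically built and scored, here is a logistic-regression sketch on synthetic data; the model class and feature values are assumptions, and only the AUC-based evaluation mirrors the paper's reporting.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 2))                       # mock metabolite abundances
        y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200) > 0).astype(int)

        clf = LogisticRegression().fit(X, y)
        auc = roc_auc_score(y, clf.predict_proba(X)[:, 1])  # paper: AUC 0.765, external cohort
        print(f"AUC = {auc:.3f}")
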
    Language English
    Publishing date 2024-03-14
    Publishing country United States
    Document type Journal Article
    ISSN 2058-7716
    DOI 10.1038/s41420-024-01896-6
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Book ; Online: Region-Based Semantic Factorization in GANs

    Zhu, Jiapeng / Shen, Yujun / Xu, Yinghao / Zhao, Deli / Chen, Qifeng

    2022  

    Abstract Despite the rapid advancement of semantic discovery in the latent space of Generative Adversarial Networks (GANs), existing approaches either are limited to finding global attributes or rely on a number of segmentation masks to identify local attributes. In this work, we present a highly efficient algorithm to factorize the latent semantics learned by GANs with respect to an arbitrary image region. Concretely, we revisit the task of local manipulation with pre-trained GANs and formulate region-based semantic discovery as a dual optimization problem. Through an appropriately defined generalized Rayleigh quotient, we manage to solve such a problem without any annotations or training. Experimental results on various state-of-the-art GAN models demonstrate the effectiveness of our approach, as well as its superiority over prior art in terms of precise control, region robustness, implementation speed, and simplicity of use.
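
    The closed-form tool named here, a generalized Rayleigh quotient, reduces to a generalized eigenproblem. A minimal sketch with random stand-in matrices follows; in the paper, A and B would be derived from the generator for the chosen region, which is omitted here.

        import numpy as np
        from scipy.linalg import eigh

        rng = np.random.default_rng(0)
        M = rng.normal(size=(16, 16))
        A = M @ M.T                    # symmetric (region-of-interest term)
        N = rng.normal(size=(16, 16))
        B = N @ N.T + 16 * np.eye(16)  # symmetric positive definite (complement term)

        w, V = eigh(A, B)              # solves A v = w B v; eigenvalues w ascending
        direction = V[:, -1]           # top eigenvector maximizes a^T A a / a^T B a
        print(w[-1])
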
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2022-02-19
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Book ; Online: UKnow

    Gong, Biao / Xie, Xiaoying / Feng, Yutong / Lv, Yiliang / Shen, Yujun / Zhao, Deli

    A Unified Knowledge Protocol for Common-Sense Reasoning and Vision-Language Pre-training

    2023  

    Abstract This work presents a unified knowledge protocol, called UKnow, which facilitates knowledge-based studies from the perspective of data. Focusing on the visual and linguistic modalities in particular, we categorize data knowledge into five unit types, namely, in-image, in-text, cross-image, cross-text, and image-text, and set up an efficient pipeline to help construct a multimodal knowledge graph from any data collection. Thanks to the logical information naturally contained in a knowledge graph, organizing datasets in the UKnow format opens up more possibilities for data usage than the commonly used image-text pairs. Following the UKnow protocol, we collect, from public international news, a large-scale multimodal knowledge graph dataset that consists of 1,388,568 nodes (571,791 of them vision-related) and 3,673,817 triplets. The dataset is also annotated with rich event tags, including 11 coarse labels and 9,185 fine labels. Experiments on four benchmarks demonstrate the potential of UKnow in supporting common-sense reasoning and boosting vision-language pre-training with a single dataset, benefiting from its unified form of knowledge organization. Code, dataset, and models will be made publicly available.
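
    A toy sketch of how triplets under the five unit types might be represented; the field names and example content are invented for illustration and are not taken from the released protocol.

        from dataclasses import dataclass

        UNIT_TYPES = {"in-image", "in-text", "cross-image", "cross-text", "image-text"}

        @dataclass(frozen=True)
        class Triplet:
            head: str      # source node id
            relation: str
            tail: str      # target node id
            unit: str      # which of the five knowledge unit types this edge encodes

            def __post_init__(self):
                assert self.unit in UNIT_TYPES, f"unknown unit type: {self.unit}"

        kg = [
            Triplet("img:001", "depicts", "entity:flood", "in-image"),
            Triplet("txt:news42", "mentions", "entity:flood", "in-text"),
            Triplet("img:001", "illustrates", "txt:news42", "image-text"),
        ]
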
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-02-14
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  6. Book ; Online: Composer

    Huang, Lianghua / Chen, Di / Liu, Yu / Shen, Yujun / Zhao, Deli / Zhou, Jingren

    Creative and Controllable Image Synthesis with Composable Conditions

    2023  

    Abstract Recent large-scale generative models trained on big data are capable of synthesizing incredible images yet suffer from limited controllability. This work offers a new generation paradigm that allows flexible control of the output image, such as its spatial layout and palette, while maintaining synthesis quality and model creativity. With compositionality as the core idea, we first decompose an image into representative factors, and then train a diffusion model with all these factors as the conditions to recompose the input. At the inference stage, the rich intermediate representations work as composable elements, leading to a huge design space (i.e., exponentially proportional to the number of decomposed factors) for customizable content creation. Notably, our approach, which we call Composer, supports various levels of conditions, such as a text description as the global information, a depth map and sketch as the local guidance, a color histogram for low-level details, etc. Besides improving controllability, we confirm that Composer serves as a general framework and facilitates a wide range of classical generative tasks without retraining. Code and models will be made available.
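
    A shape-level sketch of the composable conditioning described above: a dictionary of decomposed factors, any subset of which can be supplied at inference. The tensor shapes and the diffusion_step callable are hypothetical placeholders.

        import torch

        conditions = {
            "caption": "a red barn in the snow",  # global: text description
            "depth": torch.zeros(1, 1, 64, 64),   # local: depth-map guidance
            "sketch": torch.zeros(1, 1, 64, 64),  # local: sketch guidance
            "palette": torch.rand(1, 24),         # low-level: color histogram
        }

        def recompose(diffusion_step, x_t, t, conditions, drop=()):
            # One denoising step conditioned on any subset of the factors; dropping
            # or swapping entries is what opens up the combinatorial design space.
            active = {k: v for k, v in conditions.items() if k not in drop}
            return diffusion_step(x_t, t, active)
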

    Comment: Project page: https://damo-vilab.github.io/composer-page/
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Graphics
    Subject code 004
    Publishing date 2023-02-20
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  7. Book ; Online: ViM

    Feng, Yutong / Gong, Biao / Jiang, Jianwen / Lv, Yiliang / Shen, Yujun / Zhao, Deli / Zhou, Jingren

    Vision Middleware for Unified Downstream Transferring

    2023  

    Abstract Foundation models are pre-trained on massive data and transferred to downstream tasks via fine-tuning. This work presents Vision Middleware (ViM), a new learning paradigm that targets unified transferring from a single foundation model to a variety of downstream tasks. ViM consists of a zoo of lightweight plug-in modules, each of which is independently learned on a midstream dataset with a shared frozen backbone. Downstream tasks can then benefit from an adequate aggregation of the module zoo thanks to the rich knowledge inherited from midstream tasks. There are three major advantages of such a design. From the efficiency aspect, the upstream backbone can be trained only once and reused for all downstream tasks without tuning. From the scalability aspect, we can easily append additional modules to ViM with no influence on existing modules. From the performance aspect, ViM can include as many midstream tasks as possible, narrowing the task gap between upstream and downstream. Considering these benefits, we believe that ViM, which the community could maintain and develop together, would serve as a powerful tool to assist foundation models.
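
    A minimal sketch of the frozen-backbone-plus-module-zoo design the abstract outlines; aggregation by fixed per-task weights is one plausible reading, and all names here are invented.

        import torch.nn as nn

        class ViMSketch(nn.Module):
            def __init__(self, backbone, plugins, weights):
                super().__init__()
                self.backbone = backbone              # shared foundation model
                for p in self.backbone.parameters():  # frozen: trained only once
                    p.requires_grad_(False)
                self.zoo = nn.ModuleList(plugins)     # lightweight midstream modules
                self.weights = weights                # per-task aggregation weights

            def forward(self, x):
                feat = self.backbone(x)
                return sum(w * m(feat) for w, m in zip(self.weights, self.zoo))
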
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-03-13
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Book ; Online: CLIP-guided Prototype Modulating for Few-shot Action Recognition

    Wang, Xiang / Zhang, Shiwei / Cen, Jun / Gao, Changxin / Zhang, Yingya / Zhao, Deli / Sang, Nong

    2023  

    Abstract Learning from large-scale contrastive language-image pre-training like CLIP has shown remarkable success in a wide range of downstream tasks recently, but it is still under-explored on the challenging few-shot action recognition (FSAR) task. In this work, we aim to transfer the powerful multimodal knowledge of CLIP to alleviate the inaccurate prototype estimation issue due to data scarcity, which is a critical problem in low-shot regimes. To this end, we present a CLIP-guided prototype modulating framework called CLIP-FSAR, which consists of two key components: a video-text contrastive objective and a prototype modulation. Specifically, the former bridges the task discrepancy between CLIP and the few-shot video task by contrasting videos and corresponding class text descriptions. The latter leverages the transferable textual concepts from CLIP to adaptively refine visual prototypes with a temporal Transformer. By this means, CLIP-FSAR can take full advantage of the rich semantic priors in CLIP to obtain reliable prototypes and achieve accurate few-shot classification. Extensive experiments on five commonly used benchmarks demonstrate the effectiveness of our proposed method, and CLIP-FSAR significantly outperforms existing state-of-the-art methods under various settings. The source code and models will be publicly available at https://github.com/alibaba-mmai-research/CLIP-FSAR.
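
    A sketch of the video-text contrastive objective described above, assuming precomputed CLIP video embeddings and class-text embeddings; the temperature value is illustrative.

        import torch.nn.functional as F

        def video_text_contrastive(video_emb, text_emb, labels, tau=0.07):
            # video_emb: (B, d) few-shot video features; text_emb: (C, d) embeddings
            # of the class text descriptions; labels: (B,) ground-truth class indices.
            v = F.normalize(video_emb, dim=-1)
            t = F.normalize(text_emb, dim=-1)
            logits = v @ t.t() / tau  # (B, C) temperature-scaled cosine similarities
            return F.cross_entropy(logits, labels)
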

    Comment: This work has been submitted to the Springer for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2023-03-06
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  9. Book ; Online: Rethinking Efficient Tuning Methods from a Unified Perspective

    Jiang, Zeyinzi / Mao, Chaojie / Huang, Ziyuan / Lv, Yiliang / Zhao, Deli / Zhou, Jingren

    2023  

    Abstract Parameter-efficient transfer learning (PETL) based on large-scale pre-trained foundation models has achieved great success in various downstream applications. Existing tuning methods, such as prompt, prefix, and adapter, perform task-specific lightweight adjustments to different parts of the original architecture. However, they take effect on only some parts of the pre-trained models, i.e., only the feed-forward layers or the self-attention layers, which leaves the remaining frozen structures unable to adapt to the data distributions of downstream tasks. Further, the existing structures are strongly coupled with the Transformers, hindering parameter-efficient deployment as well as the design flexibility for new approaches. In this paper, we revisit the design paradigm of PETL and derive a unified framework, U-Tuning, for parameter-efficient transfer learning, which is composed of an operation with frozen parameters and a unified tuner that adapts the operation for downstream applications. The U-Tuning framework can simultaneously encompass existing methods and derive new approaches for parameter-efficient transfer learning, which achieve on-par or better performance on the CIFAR-100 and FGVC datasets compared with existing PETL methods.
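
    The core decomposition, a frozen operation plus a trainable unified tuner applied in parallel, can be sketched as follows; the bottleneck tuner is one plausible instantiation, not necessarily the paper's.

        import torch.nn as nn

        class UTuned(nn.Module):
            def __init__(self, op, dim, bottleneck=16):
                super().__init__()
                self.op = op                   # pre-trained operation, frozen
                for p in self.op.parameters():
                    p.requires_grad_(False)
                self.tuner = nn.Sequential(    # the only trainable parameters
                    nn.Linear(dim, bottleneck), nn.GELU(), nn.Linear(bottleneck, dim),
                )

            def forward(self, x):
                return self.op(x) + self.tuner(x)  # y = OP(x) + U-Tuner(x)
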
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-03-01
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  10. Book ; Online: Scanning Only Once

    Pan, Yulin / He, Xiangteng / Gong, Biao / Lv, Yiliang / Shen, Yujun / Peng, Yuxin / Zhao, Deli

    An End-to-end Framework for Fast Temporal Grounding in Long Videos

    2023  

    Abstract Video temporal grounding aims to pinpoint a video segment that matches the query description. Despite the recent advance in short-form videos (e.g., in minutes), temporal grounding in long videos (e.g., in hours) is still at its early stage. To address this challenge, a common practice is to employ a sliding window, yet this can be inefficient and inflexible due to the limited number of frames within the window. In this work, we propose an end-to-end framework for fast temporal grounding, which is able to model an hours-long video with one-time network execution. Our pipeline is formulated in a coarse-to-fine manner, where we first extract context knowledge from non-overlapped video clips (i.e., anchors), and then supplement the anchors that respond strongly to the query with detailed content knowledge. Besides the remarkably high pipeline efficiency, another advantage of our approach is the capability of capturing long-range temporal correlation, thanks to modeling the entire video as a whole, which facilitates more accurate grounding. Experimental results suggest that, on the long-form video datasets MAD and Ego4d, our method significantly outperforms the state of the art, achieving 14.6x / 102.8x higher efficiency respectively. The code will be released at https://github.com/afcedf/SOONet.git
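
    A coarse-to-fine sketch of the one-pass anchor scheme described: score non-overlapping anchors once against the query, then compute fine-grained scores only for the top candidates. The dot-product scoring and all shapes are simplifying assumptions.

        import torch

        def ground(video_feats, query_feat, anchor_len=128, k=4):
            # video_feats: (T, d) frame features from a single scan of the video;
            # query_feat: (d,) query embedding.
            anchors = video_feats.split(anchor_len)          # non-overlapping clips
            coarse = torch.stack([a.mean(0) @ query_feat for a in anchors])
            top = coarse.topk(min(k, len(anchors))).indices  # anchors worth refining
            fine = {int(i): anchors[int(i)] @ query_feat for i in top}  # per-frame scores
            return top, fine
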

    Comment: 12 pages, 8 figures
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004 ; 006
    Publishing date 2023-03-14
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
