LIVIVO - The Search Portal for Life Sciences


Search results

Results 1–10 of 762,844 total


  1. Article ; Online: X

    Zeng, Yan / Zhang, Xinsong / Li, Hang / Wang, Jiawei / Zhang, Jipeng / Zhou, Wangchunshu

    IEEE transactions on pattern analysis and machine intelligence

    2024  Volume 46, Issue 5, Page(s) 3156–3168


    Abstract Vision language pre-training aims to learn alignments between vision and language from a large amount of data. Most existing methods only learn image-text alignments. Some others utilize pre-trained object detectors to leverage vision language alignments at the object level. In this paper, we propose to learn multi-grained vision language alignments by a unified pre-training framework that learns multi-grained aligning and multi-grained localization simultaneously. Based on it, we present X
    Language English
    Publishing date 2024-04-03
    Publishing country United States
    Document type Journal Article
    ISSN 1939-3539
    ISSN (online) 1939-3539
    DOI 10.1109/TPAMI.2023.3339661
    Database MEDical Literature Analysis and Retrieval System OnLINE


  2. Book ; Online: X-HRNet

    Zhou, Yixuan / Wang, Xuanhan / Xu, Xing / Zhao, Lei / Song, Jingkuan

    Towards Lightweight Human Pose Estimation with Spatially Unidimensional Self-Attention

    2023  


    Abstract High-resolution representation is necessary for human pose estimation to achieve high performance, and the ensuing problem is high computational complexity. In particular, predominant pose estimation methods estimate human joints by 2D single-peak heatmaps. Each 2D heatmap can be horizontally and vertically projected to and reconstructed by a pair of 1D heat vectors. Inspired by this observation, we introduce a lightweight and powerful alternative, Spatially Unidimensional Self-Attention (SUSA), to the pointwise (1x1) convolution that is the main computational bottleneck in the depthwise separable 3x3 convolution. Our SUSA reduces the computational complexity of the pointwise (1x1) convolution by 96% without sacrificing accuracy. Furthermore, we use the SUSA as the main module to build our lightweight pose estimation backbone X-HRNet, where `X' represents the estimated cross-shape attention vectors. Extensive experiments on the COCO benchmark demonstrate the superiority of our X-HRNet, and comprehensive ablation studies show the effectiveness of the SUSA modules. The code is publicly available at https://github.com/cool-xuan/x-hrnet.
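    The projection observation in this abstract is easy to verify numerically. The sketch below is a minimal NumPy illustration (not the authors' code; the Gaussian heatmap and all sizes are hypothetical): it projects a 2D single-peak heatmap onto a pair of 1D heat vectors and recovers the joint location from their argmaxes alone.

```python
import numpy as np

def gaussian_heatmap(h, w, cy, cx, sigma=2.0):
    """2D single-peak (Gaussian) heatmap centred on a joint at (cy, cx)."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def project_to_1d(heatmap):
    """Project a 2D heatmap onto a pair of 1D heat vectors."""
    vec_x = heatmap.sum(axis=0)  # vertical projection onto the x-axis
    vec_y = heatmap.sum(axis=1)  # horizontal projection onto the y-axis
    return vec_x, vec_y

hm = gaussian_heatmap(64, 48, cy=20, cx=30)
vx, vy = project_to_1d(hm)
# The joint location is recoverable from the two 1D argmaxes alone,
# which is what makes the 1D reformulation attractive.
assert (int(np.argmax(vx)), int(np.argmax(vy))) == (30, 20)
```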

    Comment: Accepted by ICME 2022
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-10-12
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  3. Book ; Online: X-MLP

    Wang, Xinyue / Cai, Zhicheng / Peng, Chenglei

    A Patch Embedding-Free MLP Architecture for Vision

    2023  


    Abstract Convolutional neural networks (CNNs) and vision transformers (ViT) have achieved great success in computer vision. Recently, research into multi-layer perceptron (MLP) architectures for vision has become popular again. Vision MLPs are designed to be independent of convolutions and self-attention operations. However, existing vision MLP architectures always depend on convolution for patch embedding. Thus we propose X-MLP, an architecture constructed entirely from fully connected layers and free from patch embedding. It decouples the features extremely and utilizes MLPs to exchange information across the width, height, and channel dimensions independently and alternately. X-MLP is tested on ten benchmark datasets, obtaining better performance than other vision MLP models on all of them. It even surpasses CNNs by a clear margin on various datasets. Furthermore, by mathematically restoring the spatial weights, we visualize the information communication between any pair of pixels in the feature map and observe the phenomenon of capturing long-range dependency.
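    The axis-by-axis mixing described here can be sketched with plain fully connected layers. The following is a minimal NumPy illustration (not the paper's implementation; shapes, initialization, and the ReLU nonlinearity are assumptions) that mixes a (H, W, C) feature map along height, width, and channel alternately, with no patch embedding step.

```python
import numpy as np

def axis_mlp(x, axis, rng):
    """Mix information along one axis of an (H, W, C) map with a fully connected layer."""
    d = x.shape[axis]
    w = rng.standard_normal((d, d)) / np.sqrt(d)  # simple scaled-random weights
    x = np.moveaxis(x, axis, -1)  # move the target axis last
    x = np.maximum(x @ w, 0.0)    # linear map + ReLU
    return np.moveaxis(x, -1, axis)  # move it back

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32, 16))  # (H, W, C) feature map, directly from pixels
for axis in (0, 1, 2):                 # height, width, channel -- alternately
    x = axis_mlp(x, axis, rng)
assert x.shape == (32, 32, 16)         # shape is preserved by each mixing step
```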

    Comment: IJCNN 2023
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2023-07-02
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  4. Book ; Online: X-Adapter

    Ran, Lingmin / Cun, Xiaodong / Liu, Jia-Wei / Zhao, Rui / Zijie, Song / Wang, Xintao / Keppo, Jussi / Shou, Mike Zheng

    Adding Universal Compatibility of Plugins for Upgraded Diffusion Model

    2023  


    Abstract We introduce X-Adapter, a universal upgrader to enable the pretrained plug-and-play modules (e.g., ControlNet, LoRA) to work directly with the upgraded text-to-image diffusion model (e.g., SDXL) without further retraining. We achieve this goal by training an additional network to control the frozen upgraded model with the new text-image data pairs. In detail, X-Adapter keeps a frozen copy of the old model to preserve the connectors of different plugins. Additionally, X-Adapter adds trainable mapping layers that bridge the decoders from models of different versions for feature remapping. The remapped features will be used as guidance for the upgraded model. To enhance the guidance ability of X-Adapter, we employ a null-text training strategy for the upgraded model. After training, we also introduce a two-stage denoising strategy to align the initial latents of X-Adapter and the upgraded model. Thanks to our strategies, X-Adapter demonstrates universal compatibility with various plugins and also enables plugins of different versions to work together, thereby expanding the functionalities of the diffusion community. To verify the effectiveness of the proposed method, we conduct extensive experiments and the results show that X-Adapter may facilitate wider application in the upgraded foundational diffusion model.
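    The frozen-copy-plus-mapper design can be sketched in the abstract. This NumPy sketch (all dimensions and the additive-guidance form are hypothetical; the real X-Adapter bridges U-Net decoder features of specific model versions) shows a trainable mapping layer remapping an old-model decoder feature into the upgraded model's feature space.

```python
import numpy as np

class Mapper:
    """Trainable mapping layer bridging decoder features of two model versions.

    Near-zero initialization, so the upgraded model is untouched at the start
    of training (a common choice for adapter-style modules)."""
    def __init__(self, d_old, d_new, rng):
        self.w = rng.standard_normal((d_old, d_new)) * 0.01

    def __call__(self, feat_old):
        return feat_old @ self.w

rng = np.random.default_rng(0)
d_old, d_new = 320, 640                       # hypothetical decoder widths
mapper = Mapper(d_old, d_new, rng)

feat_old = rng.standard_normal((64, d_old))   # feature from the frozen old-model copy
feat_new = rng.standard_normal((64, d_new))   # feature from the frozen upgraded model
guided = feat_new + mapper(feat_old)          # remapped feature used as guidance
assert guided.shape == (64, d_new)
```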

    Comment: Project page: https://showlab.github.io/X-Adapter/
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Artificial Intelligence ; Computer Science - Multimedia
    Subject code 004
    Publishing date 2023-12-04
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  5. Book ; Online: X-Mesh

    Ma, Yiwei / Zhang, Xiaioqing / Sun, Xiaoshuai / Ji, Jiayi / Wang, Haowei / Jiang, Guannan / Zhuang, Weilin / Ji, Rongrong

    Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance

    2023  


    Abstract Text-driven 3D stylization is a complex and crucial task in the fields of computer vision (CV) and computer graphics (CG), aimed at transforming a bare mesh to fit a target text. Prior methods adopt text-independent multilayer perceptrons (MLPs) to predict the attributes of the target mesh with the supervision of CLIP loss. However, such text-independent architecture lacks textual guidance during predicting attributes, thus leading to unsatisfactory stylization and slow convergence. To address these limitations, we present X-Mesh, an innovative text-driven 3D stylization framework that incorporates a novel Text-guided Dynamic Attention Module (TDAM). The TDAM dynamically integrates the guidance of the target text by utilizing text-relevant spatial and channel-wise attentions during vertex feature extraction, resulting in more accurate attribute prediction and faster convergence speed. Furthermore, existing works lack standard benchmarks and automated metrics for evaluation, often relying on subjective and non-reproducible user studies to assess the quality of stylized 3D assets. To overcome this limitation, we introduce a new standard text-mesh benchmark, namely MIT-30, and two automated metrics, which will enable future research to achieve fair and objective comparisons. Our extensive qualitative and quantitative experiments demonstrate that X-Mesh outperforms previous state-of-the-art methods.

    Comment: Technical report
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 410
    Publishing date 2023-03-28
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  6. Book ; Online: X-Adv

    Liu, Aishan / Guo, Jun / Wang, Jiakai / Liang, Siyuan / Tao, Renshuai / Zhou, Wenbo / Liu, Cong / Liu, Xianglong / Tao, Dacheng

    Physical Adversarial Object Attacks against X-ray Prohibited Item Detection

    2023  


    Abstract Adversarial attacks are valuable for evaluating the robustness of deep learning models. Existing attacks are primarily conducted on the visible light spectrum (e.g., pixel-wise texture perturbation). However, attacks targeting texture-free X-ray images remain underexplored, despite the widespread application of X-ray imaging in safety-critical scenarios such as the X-ray detection of prohibited items. In this paper, we take the first step toward the study of adversarial attacks targeted at X-ray prohibited item detection, and reveal the serious threats posed by such attacks in this safety-critical scenario. Specifically, we posit that successful physical adversarial attacks in this scenario should be specially designed to circumvent the challenges posed by color/texture fading and complex overlapping. To this end, we propose X-Adv to generate physically printable metals that act as an adversarial agent capable of deceiving X-ray detectors when placed in luggage. To resolve the issues associated with color/texture fading, we develop a differentiable converter that facilitates the generation of 3D-printable objects with adversarial shapes, using the gradients of a surrogate model rather than directly generating adversarial textures. To place the printed 3D adversarial objects in luggage with complex overlapped instances, we design a policy-based reinforcement learning strategy to find locations eliciting strong attack performance in worst-case scenarios whereby the prohibited items are heavily occluded by other items. To verify the effectiveness of the proposed X-Adv, we conduct extensive experiments in both the digital and the physical world (employing a commercial X-ray security inspection system for the latter case). Furthermore, we present the physical-world X-ray adversarial attack dataset XAD.

    Comment: Accepted by USENIX Security 2023
    Keywords Computer Science - Cryptography and Security ; Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2023-02-19
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  7. Book ; Online: X-Transfer

    Zhang, Lei / Chen, Hao / Hu, Shu / Zhu, Bin / Lin, Ching Sheng / Wu, Xi / Hu, Jinrong / Wang, Xin

    A Transfer Learning-Based Framework for GAN-Generated Fake Image Detection

    2023  


    Abstract Generative adversarial networks (GANs) have advanced remarkably in diverse domains, especially image generation and editing. However, the misuse of GANs for generating deceptive images, such as face replacement, raises significant security concerns, which have gained widespread attention. Therefore, it is urgent to develop effective detection methods to distinguish between real and fake images. Current research centers on the application of transfer learning. Nevertheless, it encounters challenges such as knowledge forgetting from the original dataset and inadequate performance when dealing with imbalanced data during training. To alleviate this issue, this paper introduces a novel GAN-generated image detection algorithm called X-Transfer, which enhances transfer learning by utilizing two neural networks that employ interleaved parallel gradient transmission. In addition, we combine AUC loss and cross-entropy loss to improve the model's performance. We carry out comprehensive experiments on multiple facial image datasets. The results show that our model outperforms the general transferring approach, and the best metric reaches 99.04%, an improvement of approximately 10%. Furthermore, we demonstrate excellent performance on non-face datasets, validating its generality and broader application prospects.
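    Combining an AUC surrogate with cross-entropy can be illustrated in a few lines. This NumPy sketch is not the paper's code: the pairwise squared-hinge surrogate, the margin, and the equal weighting are all assumptions; it only shows the general shape of such a combined objective for a fake-vs-real detector.

```python
import numpy as np

def bce(scores, labels, eps=1e-7):
    """Binary cross-entropy on raw logits (labels: 1 = fake, 0 = real)."""
    p = 1.0 / (1.0 + np.exp(-scores))
    return -np.mean(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))

def auc_hinge(scores, labels, margin=1.0):
    """Pairwise squared-hinge surrogate for AUC: every fake score should
    exceed every real score by at least the margin."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = margin - (pos[:, None] - neg[None, :])
    return np.mean(np.maximum(diff, 0.0) ** 2)

def combined_loss(scores, labels, alpha=0.5):
    """Weighted sum of the AUC surrogate and cross-entropy."""
    return alpha * auc_hinge(scores, labels) + (1 - alpha) * bce(scores, labels)

scores = np.array([2.0, 1.5, -1.0, -2.0])  # detector logits
labels = np.array([1, 1, 0, 0])            # perfectly separated by a wide margin
assert auc_hinge(scores, labels) == 0.0    # all pairs clear the margin
assert combined_loss(scores, labels) > 0.0 # cross-entropy term is still nonzero
```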

    Comment: 9 pages, 3 figures, and 6 tables; references added
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2023-10-06
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  8. Book ; Online: X-Dreamer

    Ma, Yiwei / Fan, Yijun / Ji, Jiayi / Wang, Haowei / Sun, Xiaoshuai / Jiang, Guannan / Shu, Annan / Ji, Rongrong

    Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation

    2023  


    Abstract In recent times, automatic text-to-3D content creation has made significant progress, driven by the development of pretrained 2D diffusion models. Existing text-to-3D methods typically optimize the 3D representation to ensure that the rendered image aligns well with the given text, as evaluated by the pretrained 2D diffusion model. Nevertheless, a substantial domain gap exists between 2D images and 3D assets, primarily attributed to variations in camera-related attributes and the exclusive presence of foreground objects. Consequently, employing 2D diffusion models directly for optimizing 3D representations may lead to suboptimal outcomes. To address this issue, we present X-Dreamer, a novel approach for high-quality text-to-3D content creation that effectively bridges the gap between text-to-2D and text-to-3D synthesis. The key components of X-Dreamer are two innovative designs: Camera-Guided Low-Rank Adaptation (CG-LoRA) and Attention-Mask Alignment (AMA) Loss. CG-LoRA dynamically incorporates camera information into the pretrained diffusion models by employing camera-dependent generation for trainable parameters. This integration enhances the alignment between the generated 3D assets and the camera's perspective. AMA loss guides the attention map of the pretrained diffusion model using the binary mask of the 3D object, prioritizing the creation of the foreground object. This module ensures that the model focuses on generating accurate and detailed foreground objects. Extensive evaluations demonstrate the effectiveness of our proposed method compared to existing text-to-3D approaches. Our project webpage: https://xmuxiaoma666.github.io/Projects/X-Dreamer .
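    Camera-conditioned low-rank adaptation, as CG-LoRA is described here, can be sketched as a small hypernetwork that generates LoRA factors from a camera embedding. The following NumPy illustration is hypothetical throughout (dimensions, the linear hypernetwork, and the camera embedding are assumptions, not the paper's architecture); it only shows the idea of trainable parameters that depend on the camera.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 64, 4                          # feature dim and LoRA rank (hypothetical)
w_frozen = rng.standard_normal((d, d))  # frozen pretrained weight

# Camera-dependent generation of the low-rank factors: linear maps from a
# camera embedding (e.g. azimuth/elevation features) to the LoRA matrices.
cam = rng.standard_normal(8)                      # camera embedding
hyper_a = rng.standard_normal((8, d * r)) * 0.01  # near-zero init
hyper_b = rng.standard_normal((8, r * d)) * 0.01

A = (cam @ hyper_a).reshape(d, r)
B = (cam @ hyper_b).reshape(r, d)

x = rng.standard_normal(d)
y = x @ (w_frozen + A @ B)  # frozen weight plus camera-conditioned low-rank update
assert y.shape == (d,)
```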

    Comment: Technical report
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-11-30
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  9. Book ; Online: PaLI-X

    Chen, Xi / Djolonga, Josip / Padlewski, Piotr / Mustafa, Basil / Changpinyo, Soravit / Wu, Jialin / Ruiz, Carlos Riquelme / Goodman, Sebastian / Wang, Xiao / Tay, Yi / Shakeri, Siamak / Dehghani, Mostafa / Salz, Daniel / Lucic, Mario / Tschannen, Michael / Nagrani, Arsha / Hu, Hexiang / Joshi, Mandar / Pang, Bo /
    Montgomery, Ceslee / Pietrzyk, Paulina / Ritter, Marvin / Piergiovanni, AJ / Minderer, Matthias / Pavetic, Filip / Waters, Austin / Li, Gang / Alabdulmohsin, Ibrahim / Beyer, Lucas / Amelot, Julien / Lee, Kenton / Steiner, Andreas Peter / Li, Yang / Keysers, Daniel / Arnab, Anurag / Xu, Yuanzhong / Rong, Keran / Kolesnikov, Alexander / Seyedhosseini, Mojtaba / Angelova, Anelia / Zhai, Xiaohua / Houlsby, Neil / Soricut, Radu

    On Scaling up a Multilingual Vision and Language Model

    2023  


    Abstract We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of size of the components and the breadth of its training task mixture. Our model achieves new levels of performance on a wide range of varied and complex tasks, including multiple image-based captioning and question-answering tasks, image-based document understanding and few-shot (in-context) learning, as well as object detection, video question answering, and video captioning. PaLI-X advances the state-of-the-art on most vision-and-language benchmarks considered (25+ of them). Finally, we observe emerging capabilities, such as complex counting and multilingual object detection, tasks that are not explicitly in the training mix.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Computation and Language ; Computer Science - Machine Learning
    Publishing date 2023-05-29
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  10. Article: X-ray Detectors Based on Ga

    Zhang, Chongyang / Dou, Wenjie / Yang, Xun / Zang, Huaping / Chen, Yancheng / Fan, Wei / Wang, Shaoyi / Zhou, Weimin / Chen, Xuexia / Shan, Chongxin

    Materials (Basel, Switzerland)

    2023  Volume 16, Issue 13


    Abstract X-ray detectors have numerous applications in medical imaging, industrial inspection, and crystal structure analysis. Gallium oxide (Ga
    Language English
    Publishing date 2023-06-30
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2487261-1
    ISSN 1996-1944
    DOI 10.3390/ma16134742
    Database MEDical Literature Analysis and Retrieval System OnLINE

