LIVIVO - The Search Portal for Life Sciences

Search results

Hits 1 - 10 of 64 in total

  1. Article ; Online: LF2MV: Learning an Editable Meta-View Towards Light Field Representation.

    Xia, Menghan / Echevarria, Jose / Xie, Minshan / Wong, Tien-Tsin

    IEEE transactions on visualization and computer graphics

    2024  Volume 30, Issue 3, Page(s) 1672–1684

    Abstract Light fields are 4D scene representations that are typically structured as arrays of views or several directional samples per pixel in a single view. However, this highly correlated structure is not very efficient to transmit and manipulate, especially for editing. To tackle this issue, we propose a novel representation learning framework that can encode the light field into a single meta-view that is both compact and editable. Specifically, the meta-view is composed of three visual channels and a complementary meta channel embedded with geometric and residual appearance information. The visual channels can be edited using existing 2D image editing tools before reconstructing the whole edited light field. To facilitate edit propagation against occlusion, we design a special editing-aware decoding network that consistently propagates the visual edits to the whole light field upon reconstruction. Extensive experiments show that our proposed method achieves competitive representation accuracy while enabling consistent edit propagation.
    Language English
    Publication date 2024-01-30
    Country of publication United States
    Document type Journal Article
    ISSN 1941-0506
    ISSN (online) 1941-0506
    DOI 10.1109/TVCG.2022.3220773
    Data source MEDical Literature Analysis and Retrieval System OnLINE
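
    The abstract above describes an encode/decode pipeline. The following is a minimal PyTorch sketch of the meta-view idea under stated assumptions: a 5x5 view grid, toy layer widths, and a plain convolutional decoder in place of the paper's editing-aware decoding network. It illustrates the representation only, not the authors' architecture.

        import torch
        import torch.nn as nn

        class MetaViewCodec(nn.Module):
            """Encode a stack of light-field views into one 4-channel meta-view
            (3 editable visual channels + 1 meta channel) and decode it back."""
            def __init__(self, num_views: int = 25):
                super().__init__()
                self.encoder = nn.Sequential(
                    nn.Conv2d(num_views * 3, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 4, 3, padding=1),
                )
                self.decoder = nn.Sequential(
                    nn.Conv2d(4, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, num_views * 3, 3, padding=1),
                )

            def forward(self, views):
                meta_view = self.encoder(views)   # (B, 4, H, W), editable as a 2D image
                recon = self.decoder(meta_view)   # (B, num_views*3, H, W)
                return meta_view, recon

        codec = MetaViewCodec()
        lf = torch.rand(1, 25 * 3, 64, 64)        # toy 5x5 light field
        meta, recon = codec(lf)
        loss = nn.functional.l1_loss(recon, lf)   # reconstruction objective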

  2. Article ; Online: Taming Reversible Halftoning via Predictive Luminance.

    Lau, Cheuk-Kit / Xia, Menghan / Wong, Tien-Tsin

    IEEE transactions on visualization and computer graphics

    2023  Volume PP

    Abstract Traditional halftoning usually drops colors when dithering images with binary dots, which makes it difficult to recover the original color information. We propose a novel halftoning technique that converts a color image into a binary halftone with full restorability to its original version. Our base halftoning technique consists of two convolutional neural networks (CNNs) to produce the reversible halftone patterns, and a noise incentive block (NIB) to mitigate the flatness degradation issue of CNNs. Furthermore, to tackle the conflict between blue-noise quality and restoration accuracy in our base method, we propose a predictor-embedded approach to offload predictable information from the network, which in our case is the luminance information that can be inferred from the halftone pattern. This approach gives the network more flexibility to produce halftones with better blue-noise quality without compromising the restoration quality. We conducted detailed studies on the multiple-stage training method and the loss weightings, and compared our predictor-embedded method with our base method in terms of halftone spectrum analysis, halftone accuracy, restoration accuracy, and data embedding. Our entropy evaluation shows that our halftone contains less encoded information than that of our base method. The experiments show that our predictor-embedded method gains more flexibility to improve the blue-noise quality of halftones and maintains comparable restoration quality with a higher tolerance for disturbances.
    Language English
    Publication date 2023-05-23
    Country of publication United States
    Document type Journal Article
    ISSN 1941-0506
    ISSN (online) 1941-0506
    DOI 10.1109/TVCG.2023.3278691
    Data source MEDical Literature Analysis and Retrieval System OnLINE
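
    As a rough illustration of the pipeline described above, the sketch below pairs a dithering CNN with a restoring CNN and feeds the first an extra noise channel, standing in for the paper's noise incentive block. The layer counts, the straight-through binarization, and the omission of the luminance predictor branch are all assumptions, not the authors' design.

        import torch
        import torch.nn as nn

        class ReversibleHalftone(nn.Module):
            """Dither RGB into a binary halftone, then restore RGB from it."""
            def __init__(self):
                super().__init__()
                # RGB + 1 noise channel (noise-incentive stand-in) -> dot probabilities.
                self.halftoner = nn.Sequential(
                    nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
                )
                # Binary halftone -> restored RGB.
                self.restorer = nn.Sequential(
                    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 3, 3, padding=1),
                )

            def forward(self, rgb):
                noise = torch.rand_like(rgb[:, :1])
                soft = self.halftoner(torch.cat([rgb, noise], dim=1))
                # Straight-through binarization: hard dots forward, soft gradients backward.
                halftone = (soft > 0.5).float() + soft - soft.detach()
                return halftone, self.restorer(halftone)

        model = ReversibleHalftone()
        img = torch.rand(1, 3, 64, 64)
        halftone, restored = model(img)
        loss = nn.functional.l1_loss(restored, img)   # restorability objective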

  3. Article ; Online: Scale-Arbitrary Invertible Image Downscaling.

    Xing, Jinbo / Hu, Wenbo / Xia, Menghan / Wong, Tien-Tsin

    IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

    2023  Volume 32, Page(s) 4259–4274

    Abstract Conventional social media platforms usually downscale high-resolution (HR) images to restrict their resolution to a specific size to save transmission/storage cost, which makes those visual details inaccessible to other users. To bypass this obstacle, recent invertible image downscaling methods jointly model the downscaling/upscaling problems and achieve impressive performance. However, they only consider fixed integer scale factors and may be inapplicable to the generic downscaling needed to meet the resolution restrictions posed by social media platforms. In this paper, we propose an effective and universal Scale-Arbitrary Invertible Image Downscaling Network (AIDN) to downscale HR images with arbitrary scale factors in an invertible manner. In particular, the HR information is embedded in the downscaled low-resolution (LR) counterparts in a nearly imperceptible form, such that our AIDN can restore the original HR images solely from the LR images. The key to supporting arbitrary scale factors is our proposed Conditional Resampling Module (CRM), which conditions the downscaling/upscaling kernels and sampling locations on both the scale factors and the image content. Extensive experimental results demonstrate that our AIDN achieves top performance for invertible downscaling with both arbitrary integer and non-integer scale factors. Both quantitative and qualitative evaluations also show that our AIDN is robust to lossy image compression. The source code and trained models are publicly available at https://github.com/Doubiiu/AIDN.
    Language English
    Publication date 2023-07-28
    Country of publication United States
    Document type Journal Article
    ISSN 1941-0042
    ISSN (online) 1941-0042
    DOI 10.1109/TIP.2023.3296891
    Data source MEDical Literature Analysis and Retrieval System OnLINE
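
    The core of AIDN is the Conditional Resampling Module. The toy sketch below (not the authors' code, which lives at https://github.com/Doubiiu/AIDN) shows only the conditioning idea: a small network predicts sampling offsets from pixel coordinates and the scale factor, and grid_sample then resamples at arbitrary, non-integer scales. The offset network's shape and the 0.05 offset range are invented for illustration.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class ConditionalResampler(nn.Module):
            """Resample an image at an arbitrary scale, with sampling locations
            conditioned on pixel position and the scale factor."""
            def __init__(self):
                super().__init__()
                # Maps (x, y, scale) per output pixel to a small offset correction.
                self.offset_net = nn.Sequential(
                    nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 2), nn.Tanh()
                )

            def forward(self, img, scale):
                b, _, h, w = img.shape
                oh, ow = int(h * scale), int(w * scale)
                ys = torch.linspace(-1, 1, oh)
                xs = torch.linspace(-1, 1, ow)
                gy, gx = torch.meshgrid(ys, xs, indexing="ij")
                coords = torch.stack([gx, gy], dim=-1)                    # (oh, ow, 2)
                cond = torch.cat([coords, torch.full((oh, ow, 1), scale)], dim=-1)
                offsets = 0.05 * self.offset_net(cond)                    # learned shift
                grid = (coords + offsets).unsqueeze(0).expand(b, -1, -1, -1)
                return F.grid_sample(img, grid, align_corners=True)

        resampler = ConditionalResampler()
        hr = torch.rand(1, 3, 64, 64)
        lr = resampler(hr, scale=1 / 1.7)   # non-integer downscaling factor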

  4. Article ; Online: Point Set Self-Embedding.

    Li, Ruihui / Li, Xianzhi / Wong, Tien-Tsin / Fu, Chi-Wing

    IEEE transactions on visualization and computer graphics

    2023  Volume 29, Issue 7, Page(s) 3226–3237

    Abstract This work presents an innovative method for point set self-embedding that encodes the structural information of a dense point set into its sparser version in a visual but imperceptible form. The self-embedded point set can function as the ordinary downsampled one and be visualized efficiently on mobile devices. In particular, we can leverage the self-embedded information to fully restore the original point set for detailed analysis on remote servers. This task is challenging, since both the self-embedded point set and the restored point set should resemble the original one. To achieve a learnable self-embedding scheme, we design a novel framework with two jointly-trained networks: one to encode the input point set into its self-embedded sparse point set and the other to leverage the embedded information for inverting the original point set back. Further, we develop a pair of up-shuffle and down-shuffle units in the two networks, and formulate loss terms to encourage the shape similarity and point distribution in the results. Extensive qualitative and quantitative results demonstrate the effectiveness of our method on both synthetic and real-scanned datasets. The source code and trained models will be publicly available at https://github.com/liruihui/Self-Embedding.
    Language English
    Publication date 2023-05-26
    Country of publication United States
    Document type Journal Article
    ISSN 1941-0506
    ISSN (online) 1941-0506
    DOI 10.1109/TVCG.2022.3155808
    Data source MEDical Literature Analysis and Retrieval System OnLINE
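
    A minimal sketch of the self-embedding scheme described above, with toy stand-ins: strided downsampling in place of a proper point sampler, a per-point MLP that nudges the kept points so they carry embedded structure, and an expansion MLP in place of the paper's up-shuffle/down-shuffle units. All dimensions and the 0.01 perturbation scale are assumptions.

        import torch
        import torch.nn as nn

        class SelfEmbedder(nn.Module):
            """Downsample a point set, embed structure into the kept points,
            and expand each embedded point back into `ratio` restored points."""
            def __init__(self, ratio: int = 4):
                super().__init__()
                self.ratio = ratio
                self.embed = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 3))
                self.expand = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                            nn.Linear(64, 3 * ratio))

            def forward(self, pts):
                # pts: (B, N, 3); naive strided downsampling stands in for FPS.
                sparse = pts[:, :: self.ratio]
                embedded = sparse + 0.01 * self.embed(sparse)   # near-imperceptible shift
                b, m, _ = embedded.shape
                restored = embedded.unsqueeze(2) + self.expand(embedded).view(b, m, self.ratio, 3)
                return embedded, restored.reshape(b, -1, 3)

        model = SelfEmbedder()
        dense = torch.rand(1, 1024, 3)
        embedded, restored = model(dense)   # (1, 256, 3) and (1, 1024, 3)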

  5. Book ; Online: Manga Rescreening with Interpretable Screentone Representation

    Xie, Minshan / Li, Chengze / Wong, Tien-Tsin

    2023  

    Abstract The process of adapting or repurposing manga pages is a time-consuming task that requires manga artists to manually work on every single screentone region and apply new patterns to create novel screentones across multiple panels. To address this issue, we propose an automatic manga rescreening pipeline that aims to minimize the human effort involved in manga adaptation. Our pipeline automatically recognizes screentone regions and generates novel screentones with newly specified characteristics (e.g., intensity or type). Existing manga generation methods have limitations in understanding and synthesizing complex tone- or intensity-varying regions. To overcome these limitations, we propose a novel interpretable representation of screentones that disentangles their intensity and type features, enabling better recognition and synthesis of screentones. This interpretable screentone representation reduces ambiguity in recognizing intensity-varying regions and provides fine-grained controls during screentone synthesis by decoupling and anchoring the type or the intensity feature. Our proposed method is demonstrated to be effective and convenient through various experiments, showcasing the superiority of the newly proposed pipeline with the interpretable screentone representations.

    Comment: 10 pages, 11 figures
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Electrical Engineering and Systems Science - Image and Video Processing
    Subject/category (code) 004
    Publication date 2023-06-06
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
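
    The key idea above is a screentone code split into an intensity scalar and a type vector, so one factor can be edited while the other is anchored. The sketch below illustrates that disentanglement with invented dimensions and a toy encoder/generator pair; it is not the authors' model.

        import torch
        import torch.nn as nn

        class ScreentoneCodec(nn.Module):
            """Encode a screentone patch into [intensity | type] and resynthesize,
            optionally overriding the intensity while keeping the type."""
            def __init__(self, type_dim: int = 8):
                super().__init__()
                self.encoder = nn.Sequential(
                    nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                    nn.Linear(32, 1 + type_dim),            # [intensity | type]
                )
                self.generator = nn.Sequential(
                    nn.Linear(1 + type_dim, 32 * 8 * 8), nn.ReLU(),
                    nn.Unflatten(1, (32, 8, 8)),
                    nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
                )

            def forward(self, patch, new_intensity=None):
                code = self.encoder(patch)
                intensity, tone_type = code[:, :1], code[:, 1:]
                if new_intensity is not None:               # intensity edit, type anchored
                    intensity = torch.full_like(intensity, new_intensity)
                return self.generator(torch.cat([intensity, tone_type], dim=1))

        codec = ScreentoneCodec()
        patch = torch.rand(1, 1, 16, 16)
        rescreened = codec(patch, new_intensity=0.8)   # re-screen with a new intensity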

  6. Book ; Online: Taming Reversible Halftoning via Predictive Luminance

    Lau, Cheuk-Kit / Xia, Menghan / Wong, Tien-Tsin

    2023  

    Abstract Traditional halftoning usually drops colors when dithering images with binary dots, which makes it difficult to recover the original color information. We propose a novel halftoning technique that converts a color image into a binary halftone with full restorability to its original version. Our base halftoning technique consists of two convolutional neural networks (CNNs) to produce the reversible halftone patterns, and a noise incentive block (NIB) to mitigate the flatness degradation issue of CNNs. Furthermore, to tackle the conflict between blue-noise quality and restoration accuracy in our base method, we propose a predictor-embedded approach to offload predictable information from the network, which in our case is the luminance information that can be inferred from the halftone pattern. This approach gives the network more flexibility to produce halftones with better blue-noise quality without compromising the restoration quality. We conducted detailed studies on the multiple-stage training method and the loss weightings, and compared our predictor-embedded method with our base method in terms of halftone spectrum analysis, halftone accuracy, restoration accuracy, and data embedding. Our entropy evaluation shows that our halftone contains less encoded information than that of our base method. The experiments show that our predictor-embedded method gains more flexibility to improve the blue-noise quality of halftones and maintains comparable restoration quality with a higher tolerance for disturbances.

    Comment: to be published in IEEE Transactions on Visualization and Computer Graphics
    Keywords Electrical Engineering and Systems Science - Image and Video Processing ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Multimedia
    Subject/category (code) 006
    Publication date 2023-06-14
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)

  7. Book ; Online: Highly Detailed and Temporal Consistent Video Stylization via Synchronized Multi-Frame Diffusion

    Xie, Minshan / Liu, Hanyuan / Li, Chengze / Wong, Tien-Tsin

    2023  

    Abstract Text-guided video-to-video stylization transforms the visual appearance of a source video into a different appearance guided by textual prompts. Existing text-guided image diffusion models can be extended for stylized video synthesis. However, they struggle to generate videos with both highly detailed appearance and temporal consistency. In this paper, we propose a synchronized multi-frame diffusion framework to maintain both the visual details and the temporal consistency. Frames are denoised in a synchronous fashion, and, more importantly, information from different frames is shared from the beginning of the denoising process. Such information sharing ensures that a consensus among frames, in terms of overall structure and color distribution, can be reached in the early stage of the denoising process, before it is too late. The optical flow from the original video serves as the connection, and hence the venue for information sharing, among frames. Extensive experiments demonstrate the effectiveness of our method in generating high-quality and diverse results. Our method shows superior qualitative and quantitative results compared to state-of-the-art video editing methods.

    Comment: 11 pages, 11 figures
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject/category (code) 004
    Publication date 2023-11-24
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
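
    A toy sketch of the synchronized denoising loop described above: all frames take each denoising step together, and after every step each frame is blended with its flow-warped neighbour so structure and color reach a consensus early. The stand-in denoiser, the blending weight, and the zero flows are placeholders, not the paper's diffusion model.

        import torch
        import torch.nn.functional as F

        def warp(frame, flow):
            # Placeholder flow warp: flow gives normalized sampling offsets.
            b, _, h, w = frame.shape
            ys = torch.linspace(-1, 1, h)
            xs = torch.linspace(-1, 1, w)
            gy, gx = torch.meshgrid(ys, xs, indexing="ij")
            grid = torch.stack([gx, gy], -1).unsqueeze(0) + flow.permute(0, 2, 3, 1)
            return F.grid_sample(frame, grid, align_corners=True)

        def synchronized_denoise(frames, flows, denoise_step, steps=50, share=0.3):
            # frames: list of (1, 3, H, W) noisy latents; flows[i]: frame i+1 -> frame i.
            for t in reversed(range(steps)):
                frames = [denoise_step(f, t) for f in frames]    # synchronous step
                for i in range(len(frames) - 1):                 # share info via flow
                    warped = warp(frames[i + 1], flows[i])
                    frames[i] = (1 - share) * frames[i] + share * warped
            return frames

        denoise_step = lambda x, t: 0.98 * x                     # stand-in denoiser
        frames = [torch.randn(1, 3, 32, 32) for _ in range(4)]
        flows = [torch.zeros(1, 2, 32, 32) for _ in range(3)]
        video = synchronized_denoise(frames, flows, denoise_step)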

  8. Book ; Online: Scale-arbitrary Invertible Image Downscaling

    Xing, Jinbo / Hu, Wenbo / Wong, Tien-Tsin

    2022  

    Abstract Conventional social media platforms usually downscale HR images to restrict their resolution to a specific size to save transmission/storage cost, which makes the subsequent super-resolution (SR) highly ill-posed. Recent invertible image downscaling methods jointly model the downscaling/upscaling problems and achieve significant improvements. However, they only consider fixed integer scale factors, which cannot downscale HR images of various resolutions to meet the resolution restrictions of social media platforms. In this paper, we propose a scale-Arbitrary Invertible image Downscaling Network (AIDN) to natively downscale HR images with arbitrary scale factors. Meanwhile, the HR information is embedded in the downscaled low-resolution (LR) counterparts in a nearly imperceptible form, such that our AIDN can also restore the original HR images solely from the LR images. The key to supporting arbitrary scale factors is our proposed Conditional Resampling Module (CRM), which conditions the downscaling/upscaling kernels and sampling locations on both the scale factors and the image content. Extensive experimental results demonstrate that our AIDN achieves top performance for invertible downscaling with both arbitrary integer and non-integer scale factors. Code will be released upon publication.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Electrical Engineering and Systems Science - Image and Video Processing
    Publication date 2022-01-29
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)

  9. Book ; Online: CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior

    Xing, Jinbo / Xia, Menghan / Zhang, Yuechen / Cun, Xiaodong / Wang, Jue / Wong, Tien-Tsin

    2023  

    Abstract Speech-driven 3D facial animation has been widely studied, yet a gap remains in achieving realism and vividness due to the highly ill-posed nature of the problem and the scarcity of audio-visual data. Existing works typically formulate the cross-modal mapping as a regression task, which suffers from the regression-to-mean problem and leads to over-smoothed facial motions. In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of a learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty. The codebook is learned by self-reconstruction over real facial motions and is thus embedded with realistic facial motion priors. Over the discrete motion space, a temporal autoregressive model is employed to sequentially synthesize facial motions from the input speech signal, which guarantees lip-sync as well as plausible facial expressions. We demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. A user study further confirms its superiority in perceptual quality.

    Comment: CVPR2023 Camera-Ready. Project Page: https://doubiiu.github.io/projects/codetalker/, Code: https://github.com/Doubiiu/CodeTalker
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject/category (code) 004
    Publication date 2023-01-06
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
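
    The abstract casts animation as code query over a discrete motion space. The sketch below illustrates that idea with invented dimensions: an nn.Embedding plays the learned codebook, nearest-neighbour lookup does the quantization, and a GRU stands in for the paper's temporal autoregressive model mapping speech features to code indices.

        import torch
        import torch.nn as nn

        class MotionCodeQuery(nn.Module):
            """Quantize facial motions against a codebook and query code indices
            per frame from speech features."""
            def __init__(self, code_dim=64, num_codes=256, audio_dim=80):
                super().__init__()
                self.codebook = nn.Embedding(num_codes, code_dim)    # learned motion codes
                self.gru = nn.GRU(audio_dim, 128, batch_first=True)  # autoregressive stand-in
                self.head = nn.Linear(128, num_codes)

            def quantize(self, motion_feat):
                # Snap continuous motion features (B, T, code_dim) to nearest codes.
                dists = (motion_feat.unsqueeze(2) - self.codebook.weight).pow(2).sum(-1)
                idx = dists.argmin(-1)
                return self.codebook(idx), idx

            def forward(self, audio_feat):
                # Query one code index per frame from speech features (B, T, audio_dim).
                h, _ = self.gru(audio_feat)
                idx = self.head(h).argmax(-1)
                return self.codebook(idx)                            # (B, T, code_dim)

        model = MotionCodeQuery()
        speech = torch.rand(1, 100, 80)   # 100 frames of toy audio features
        motion_seq = model(speech)        # motion sequence constrained to the discrete prior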

  10. Book ; Online: Point Set Self-Embedding

    Li, Ruihui / Li, Xianzhi / Wong, Tien-Tsin / Fu, Chi-Wing

    2022  

    Abstract This work presents an innovative method for point set self-embedding that encodes the structural information of a dense point set into its sparser version in a visual but imperceptible form. The self-embedded point set can function as the ordinary downsampled one and be visualized efficiently on mobile devices. In particular, we can leverage the self-embedded information to fully restore the original point set for detailed analysis on remote servers. This task is challenging, since both the self-embedded point set and the restored point set should resemble the original one. To achieve a learnable self-embedding scheme, we design a novel framework with two jointly-trained networks: one to encode the input point set into its self-embedded sparse point set and the other to leverage the embedded information for inverting the original point set back. Further, we develop a pair of up-shuffle and down-shuffle units in the two networks, and formulate loss terms to encourage the shape similarity and point distribution in the results. Extensive qualitative and quantitative results demonstrate the effectiveness of our method on both synthetic and real-scanned datasets.

    Comment: Accepted by IEEE Transactions on Visualization and Computer Graphics (IEEE TVCG), 2022. All resources can be found at https://liruihui.github.io/
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Publication date 2022-02-28
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
