LIVIVO - The Search Portal for Life Sciences


Search results

Results 1–10 of 135

  1. Article ; Online: TraverseNet: Unifying Space and Time in Message Passing for Traffic Forecasting.

    Wu, Zonghan / Zheng, Da / Pan, Shirui / Gan, Quan / Long, Guodong / Karypis, George

    IEEE transactions on neural networks and learning systems

    2024  Volume 35, Issue 2, Page(s) 2003–2013

    Abstract This article aims to unify spatial dependency and temporal dependency in a non-Euclidean space while capturing the inner spatial-temporal dependencies of traffic data. For spatial-temporal attribute entities with topological structure, space-time is consecutive and unified, while each node's current status is influenced by its neighbors' past states over periods that vary per neighbor. Most spatial-temporal neural networks for traffic forecasting process spatial dependency and temporal correlation separately, which gravely impairs spatial-temporal integrity and ignores the fact that a node's temporal dependency on each neighbor can be delayed and dynamic. To model this condition, we propose TraverseNet, a novel spatial-temporal graph neural network that views space and time as an inseparable whole and mines spatial-temporal graphs while exploiting the evolving spatial-temporal dependencies of each node via message-traverse mechanisms. Ablation and parameter studies validate the effectiveness of the proposed TraverseNet; the detailed implementation can be found at https://github.com/nnzhan/TraverseNet.
    Language English
    Publishing date 2024-02-05
    Publishing country United States
    Document type Journal Article
    ISSN (online) 2162-2388
    DOI 10.1109/TNNLS.2022.3186103
    Database MEDLINE (Medical Literature Analysis and Retrieval System Online)
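
    The message-traverse mechanism described above can be illustrated with a toy example. The sketch below is a minimal, self-contained reconstruction (all names, sizes, and the softmax-attention choice are my assumptions, not the authors' code, which lives at the linked repository): when node i is updated at time t, messages "traverse" each neighbor's past states within a window rather than using only the neighbor's state at time t.

```python
# Toy message-traverse step: node i at time t attends over each neighbor's
# states in the window [t-W, t], so delayed and dynamic neighbor influence
# can be picked up. Purely illustrative; features and weights are random.
import numpy as np

rng = np.random.default_rng(0)
N, T, D, W = 4, 6, 8, 3                  # nodes, timesteps, feature dim, window
X = rng.normal(size=(N, T, D))           # node features over time
A = np.array([[0, 1, 1, 0],              # toy road-network adjacency
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]])
Wq, Wk, Wv = (rng.normal(size=(D, D)) for _ in range(3))

def traverse_step(t):
    """Update every node at time t from its neighbors' past states."""
    out = np.zeros((N, D))
    for i in range(N):
        q = X[i, t] @ Wq
        keys, vals = [], []
        for j in np.nonzero(A[i])[0]:
            for s in range(max(0, t - W), t + 1):   # traverse j's past
                keys.append(X[j, s] @ Wk)
                vals.append(X[j, s] @ Wv)
        scores = np.array(keys) @ q / np.sqrt(D)
        att = np.exp(scores - scores.max()); att /= att.sum()
        out[i] = att @ np.array(vals)               # weighted sum of messages
    return out

print(traverse_step(t=5).shape)                     # -> (4, 8)
```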


  2. Book ; Online: Extending Input Contexts of Language Models through Training on Segmented Sequences

    Karypis, Petros / McAuley, Julian / Karypis, George

    2023  

    Abstract Effectively training language models on long inputs poses many technical challenges. As a cost consideration, language models are pretrained on a fixed sequence length before being adapted to longer sequences. We explore various methods for adapting models to longer inputs by training on segmented sequences, together with an interpolation-based method for extending absolute positional embeddings. We develop a training procedure that extends the input context size of pretrained models with no architectural changes and no more memory cost than training on the original input lengths. By sub-sampling segments from long inputs while maintaining their original positions, the model learns new positional interactions. Our method benefits both models trained with absolute positional embeddings, by extending their input contexts, and popular relative positional embedding methods, which show reduced perplexity on sequences longer than those they were trained on. We demonstrate that our method can extend input contexts by a factor of 4x while improving perplexity.

    Comment: 11 pages, 3 figures
    Keywords Computer Science - Computation and Language ; Computer Science - Machine Learning
    Publishing date 2023-10-23
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
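
    A minimal sketch of the segmentation idea as I read the abstract (the function name and sizes are illustrative assumptions): sample segments from a long input but keep each token's original absolute position id, so the model is exposed to the larger position range without ever holding the full sequence.

```python
# Toy sketch: sub-sample contiguous segments from a long sequence but keep
# each token's ORIGINAL position id, so positions beyond the training length
# are seen without materializing the whole input.
import numpy as np

rng = np.random.default_rng(0)

def segmented_batch(tokens, n_segments=2, seg_len=4):
    """Return (token_ids, position_ids) covering n_segments random slices."""
    starts = np.sort(rng.choice(len(tokens) - seg_len, size=n_segments, replace=False))
    tok = np.concatenate([tokens[s:s + seg_len] for s in starts])
    pos = np.concatenate([np.arange(s, s + seg_len) for s in starts])  # original positions
    return tok, pos

long_input = np.arange(100, 132)     # a 32-token "document"
tok, pos = segmented_batch(long_input)
print(tok)                           # tokens from two 4-token segments
print(pos)                           # their original absolute positions
```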


  3. Book ; Online: HeMI: Multi-view Embedding in Heterogeneous Graphs

    Mavromatis, Costas / Karypis, George

    2021  

    Abstract Many real-world graphs are heterogeneous by nature, involving different types of nodes and relations between nodes. Representation learning of heterogeneous graphs (HGs) embeds the rich structure and semantics of such graphs into a low-dimensional space and facilitates various data mining tasks, such as node classification, node clustering, and link prediction. In this paper, we propose a self-supervised method that learns HG representations by relying on knowledge exchange and discovery among different HG structural semantics (meta-paths). Specifically, by maximizing the mutual information of meta-path representations, we promote meta-path information fusion and consensus and ensure that globally shared semantics are encoded. Through extensive experiments on node classification, node clustering, and link prediction tasks, we show that the proposed self-supervision both outperforms and improves competing methods, by 1% and up to 10% across all tasks.
    Keywords Computer Science - Machine Learning
    Subject code 006
    Publishing date 2021-09-14
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
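
    Maximizing mutual information between meta-path views is commonly implemented with a contrastive (InfoNCE-style) lower bound; the sketch below shows that generic recipe on random embeddings and is an assumption about the mechanism, not HeMI's actual objective or code.

```python
# Generic InfoNCE across two meta-path views: the matching node pair is
# scored against all mismatched pairs, a standard MI lower bound.
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 8
z_view_a = rng.normal(size=(n, d))   # node embeddings from one meta-path
z_view_b = rng.normal(size=(n, d))   # same nodes from another meta-path

def infonce(za, zb, tau=0.5):
    za = za / np.linalg.norm(za, axis=1, keepdims=True)
    zb = zb / np.linalg.norm(zb, axis=1, keepdims=True)
    logits = za @ zb.T / tau                      # (n, n) cross-view similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))           # matching pairs on the diagonal

print(f"cross-view InfoNCE loss: {infonce(z_view_a, z_view_b):.3f}")
```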


  4. Book ; Online: Position-based Hash Embeddings For Scaling Graph Neural Networks

    Kalantzi, Maria / Karypis, George

    2021  

    Abstract Graph Neural Networks (GNNs) bring the power of deep representation learning to graph and relational data and achieve state-of-the-art performance in many applications. GNNs compute node representations by taking into account the topology of the node's ego-network and the features of the ego-network's nodes. When the nodes do not have high-quality features, GNNs learn an embedding layer to compute node embeddings and use them as input features. However, the size of the embedding layer is linear in the product of the number of nodes in the graph and the dimensionality of the embedding, and it does not scale to big data and graphs with hundreds of millions of nodes. To reduce the memory associated with this embedding layer, hashing-based approaches, commonly used in applications like NLP and recommender systems, can potentially be used. However, a direct application of these ideas fails to exploit the fact that in many real-world graphs, nodes that are topologically close tend to be related to each other (homophily), and as such their representations will be similar. In this work, we present approaches that take advantage of the nodes' position in the graph to dramatically reduce the memory required, with minimal, if any, degradation in the quality of the resulting GNN model. Our approaches decompose a node's embedding into two components: a position-specific component and a node-specific component. The position-specific component models homophily, and the node-specific component models the node-to-node variation. Extensive experiments using different datasets and GNN models show that our methods reduce the memory requirements by 88% to 97% while achieving, in nearly all cases, better classification accuracy than other competing approaches, including the full embeddings.

    Comment: 11 pages
    Keywords Computer Science - Machine Learning ; Computer Science - Neural and Evolutionary Computing
    Subject code 000 ; 006
    Publishing date 2021-08-31
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
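
    A rough sketch of the decomposition idea (the partitioner, hash function, and table sizes are placeholders of mine; the paper's scheme may differ): the position-specific part is shared by all nodes in the same graph partition, while the node-specific part comes from a small hashed table, so memory no longer grows linearly with the node count.

```python
# Hypothetical decomposition: embedding(v) = partition-table row + hashed row.
# Memory is (num_parts + hash_buckets) x dim instead of num_nodes x dim.
import numpy as np

rng = np.random.default_rng(0)
num_nodes, dim = 1_000_000, 64
num_parts, hash_buckets = 256, 10_000

part_emb = rng.normal(size=(num_parts, dim))             # position-specific table
hash_emb = rng.normal(size=(hash_buckets, dim))          # node-specific table
partition = rng.integers(0, num_parts, size=num_nodes)   # stand-in for a graph partitioner

def node_embedding(v: int) -> np.ndarray:
    pos = part_emb[partition[v]]                      # models homophily
    node = hash_emb[(v * 2654435761) % hash_buckets]  # simple multiplicative hash
    return pos + node

print(node_embedding(123_456).shape)   # (64,), using ~1% of the full table's memory
```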


  5. Article ; Online: Benchmarking Accuracy and Generalizability of Four Graph Neural Networks Using Large In Vitro ADME Datasets from Different Chemical Spaces.

    Broccatelli, Fabio / Trager, Richard / Reutlinger, Michael / Karypis, George / Li, Mufei

    Molecular informatics

    2022  Volume 41, Issue 8, Page(s) e2100321

    Abstract In this work, we benchmark a variety of single- and multi-task graph neural network (GNN) models against lower-bar and higher-bar traditional machine learning approaches that employ human-engineered molecular features. We consider four GNN variants: Graph Convolutional Network (GCN), Graph Attention Network (GAT), Message Passing Neural Network (MPNN), and Attentive Fingerprint (AttentiveFP). So far, deep learning models have primarily been benchmarked against lower-bar traditional models based solely on fingerprints, while more realistic benchmarks employing fingerprints, whole-molecule descriptors, and predictions from other related endpoints (e.g., LogD7.4) appear to be scarce for industrial ADME datasets. In addition to time-split test sets based on Genentech data, this study benefits from the availability of measurements from an external chemical space (Roche data). We identify GAT as a promising approach to implementing deep learning models. While all the deep learning models significantly outperform lower-bar benchmark traditional models based solely on fingerprints, only GATs seem to offer a small but consistent improvement over higher-bar benchmark traditional models. Finally, the accuracy of in vitro assays from different laboratories predicting the same experimental endpoints appears to be comparable with the accuracy of GAT single-task models, suggesting that most of the observed model error is a function of experimental error propagation.
    MeSH term(s) Benchmarking ; Humans ; Machine Learning ; Neural Networks, Computer
    Language English
    Publishing date 2022-02-23
    Publishing country Germany
    Document type Journal Article
    ZDB-ID 2537668-8
    ISSN (online) 1868-1751
    ISSN 1868-1743
    DOI 10.1002/minf.202100321
    Database MEDLINE (Medical Literature Analysis and Retrieval System Online)
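
    The record above singles out GAT as the most promising variant. For readers unfamiliar with it, here is a generic single-head GAT attention layer in plain numpy, following the published GAT formulation rather than the benchmarked implementation; the graph, features, and weights are random stand-ins.

```python
# Single-head GAT layer: score each neighbor with a shared attention vector,
# softmax over the neighborhood, then aggregate transformed neighbor features.
import numpy as np

rng = np.random.default_rng(0)
N, F_in, F_out = 5, 6, 4
X = rng.normal(size=(N, F_in))        # toy atom features
A = np.eye(N, dtype=int)              # adjacency with self-loops
A[0, 1] = A[1, 0] = A[1, 2] = A[2, 1] = A[2, 3] = A[3, 2] = A[3, 4] = A[4, 3] = 1

W = rng.normal(size=(F_in, F_out))    # shared linear transform
a = rng.normal(size=(2 * F_out,))     # attention vector

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

H = X @ W
out = np.zeros_like(H)
for i in range(N):
    nbrs = np.nonzero(A[i])[0]
    e = leaky_relu(np.array([a @ np.concatenate([H[i], H[j]]) for j in nbrs]))
    alpha = np.exp(e - e.max()); alpha /= alpha.sum()   # softmax over neighbors
    out[i] = alpha @ H[nbrs]          # attention-weighted neighbor aggregation

print(out.shape)                      # -> (5, 4)
```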


  6. Book ; Online: Better Context Makes Better Code Language Models: A Case Study on Function Call Argument Completion

    Pei, Hengzhi / Zhao, Jinman / Lausen, Leonard / Zha, Sheng / Karypis, George

    2023  

    Abstract Pretrained code language models have enabled great progress towards program synthesis. However, common approaches only consider in-file local context and thus miss information and constraints imposed by other parts of the codebase and its external dependencies. Existing code completion benchmarks also lack such context. To resolve these restrictions, we curate a new dataset of permissively licensed Python packages that includes full projects and their dependencies, and we provide tools to extract non-local information with the help of program analyzers. We then focus on the task of function call argument completion, which requires predicting the arguments to function calls. We show that existing code completion models do not yield good results on our completion task. To better solve this task, we query a program analyzer for information relevant to a given function call and consider ways to provide the analyzer results to different code completion models during inference and training. Our experiments show that providing access to the function implementation and function usages greatly improves the argument completion performance. Our ablation study provides further insights into how different types of information available from the program analyzer, and different ways of incorporating that information, affect model performance.

    Comment: 12 pages. Accepted to AAAI 2023
    Keywords Computer Science - Software Engineering ; Computer Science - Machine Learning ; I.2.2 ; I.2.7
    Subject code 005
    Publishing date 2023-06-01
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
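
    A hedged sketch of the overall recipe (the helper names and prompt format are mine; the paper's tooling and models differ): obtain non-local facts about the callee from a program analyzer, with Python's built-in inspect module standing in for one here, and prepend them to the completion prompt.

```python
# Hypothetical pipeline: analyzer facts about the callee are serialized as
# comments and prepended to the partial call a code LM should complete.
import inspect

def example_callee(path: str, mode: str = "r", retries: int = 3):
    """Open `path`, retrying on failure."""  # illustrative target function

def build_prompt(callee, partial_call: str) -> str:
    sig = inspect.signature(callee)          # "analyzer"-provided signature
    doc = inspect.getdoc(callee) or ""
    context = f"# signature: {callee.__name__}{sig}\n# doc: {doc}\n"
    return context + partial_call

print(build_prompt(example_callee, "result = example_callee("))
# A code model would now complete the argument list with the callee's
# signature and docstring available as extra, non-local context.
```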


  7. Book ; Online: DistTGL: Distributed Memory-Based Temporal Graph Neural Network Training

    Zhou, Hongkuan / Zheng, Da / Song, Xiang / Karypis, George / Prasanna, Viktor

    2023  

    Abstract Memory-based Temporal Graph Neural Networks are powerful tools in dynamic graph representation learning and have demonstrated superior performance in many real-world applications. However, their node memory favors smaller batch sizes to capture more dependencies in graph events and needs to be maintained synchronously across all trainers. As a result, existing frameworks suffer from accuracy loss when scaling to multiple GPUs. Even worse, the tremendous overhead of synchronizing the node memory makes it impractical to deploy them on distributed GPU clusters. In this work, we propose DistTGL -- an efficient and scalable solution to train memory-based TGNNs on distributed GPU clusters. DistTGL has three improvements over existing solutions: an enhanced TGNN model, a novel training algorithm, and an optimized system. In experiments, DistTGL achieves near-linear convergence speedup, outperforming the state-of-the-art single-machine method by 14.5% in accuracy and 10.17x in training throughput.

    Comment: SC'23
    Keywords Computer Science - Machine Learning
    Subject code 006
    Publishing date 2023-07-14
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
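
    To make the synchronization problem concrete, below is a toy version of the node memory that memory-based TGNNs maintain (the update rule is a simplified stand-in of mine, not DistTGL's model): every graph event rewrites the memories of both endpoints, and that per-node state is exactly what distributed trainers must keep consistent.

```python
# Toy node memory for a temporal GNN: each event (src, dst, t) updates both
# endpoints' memory vectors from a message built on the current memories and
# the elapsed time. Simplified, illustrative update rule only.
import numpy as np

rng = np.random.default_rng(0)
N, D = 6, 8
memory = np.zeros((N, D))                 # one memory vector per node
last_update = np.zeros(N)                 # timestamp of each node's last event
Wm = rng.normal(size=(2 * D + 1, D)) * 0.1

def on_event(src, dst, t):
    """Update both endpoints' memories when edge (src, dst) occurs at time t."""
    for u, v in ((src, dst), (dst, src)):
        dt = t - last_update[u]
        msg = np.concatenate([memory[u], memory[v], [dt]])  # raw message
        memory[u] = np.tanh(msg @ Wm)     # simplified memory cell
        last_update[u] = t

for src, dst, t in [(0, 1, 1.0), (1, 2, 2.5), (0, 2, 3.0)]:
    on_event(src, dst, t)                 # the state distributed trainers sync
print(memory[:3])
```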


  8. Book ; Online: Heterogeneous Molecular Graph Neural Networks for Predicting Molecule Properties

    Shui, Zeren / Karypis, George

    2020  

    Abstract As they carry great potential for modeling complex interactions, graph neural network (GNN)-based methods have been widely used to predict quantum mechanical properties of molecules. Most of the existing methods treat molecules as molecular graphs in which atoms are modeled as nodes. They characterize each atom's chemical environment by modeling its pairwise interactions with the other atoms in the molecule. Although these methods achieve great success, a limited number of works explicitly take many-body interactions, i.e., interactions among three or more atoms, into consideration. In this paper, we introduce a novel graph representation of molecules, the heterogeneous molecular graph (HMG), in which nodes and edges are of various types, to model many-body interactions. HMGs have the potential to carry complex geometric information. To leverage the rich information stored in HMGs for chemical prediction problems, we build heterogeneous molecular graph neural networks (HMGNN) on the basis of a neural message passing scheme. HMGNN incorporates global molecule representations and an attention mechanism into the prediction process. The predictions of HMGNN are invariant to translation and rotation of atom coordinates and to permutation of atom indices. Our model achieves state-of-the-art performance in 9 out of 12 tasks on the QM9 dataset.

    Comment: To appear as a conference paper at ICDM 2020
    Keywords Computer Science - Machine Learning ; Physics - Computational Physics ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2020-09-26
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
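
    A sketch of what a heterogeneous molecular graph with explicit three-body nodes might look like, reconstructed from the abstract (the node schema and the toy molecule are my assumptions, not the paper's construction): each bonded triplet j-i-k becomes an "angle" node, so many-body geometry is a first-class node type that message passing can visit.

```python
# Build atom nodes plus heterogeneous "angle" nodes, one per bonded triplet,
# carrying the three-body geometry (the bond angle) explicitly.
import numpy as np
from itertools import combinations

coords = np.array([[0.0, 0.0, 0.0],      # a bent triatomic toy molecule
                   [1.0, 0.0, 0.0],
                   [1.5, 0.9, 0.0]])
bonds = [(0, 1), (1, 2)]

neighbors = {i: set() for i in range(len(coords))}
for a, b in bonds:
    neighbors[a].add(b); neighbors[b].add(a)

angle_nodes = []
for i, nbrs in neighbors.items():
    for j, k in combinations(sorted(nbrs), 2):   # triplet j - i - k
        u, v = coords[j] - coords[i], coords[k] - coords[i]
        cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
        angle_nodes.append({"center": i, "ends": (j, k),
                            "angle_deg": float(np.degrees(np.arccos(cos)))})

print(angle_nodes)   # one heterogeneous 'angle' node for the 0-1-2 triplet
```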


  9. Book ; Online: Graph InfoClust: Leveraging cluster-level node information for unsupervised graph representation learning

    Mavromatis, Costas / Karypis, George

    2020  

    Abstract Unsupervised (or self-supervised) graph representation learning is essential for facilitating various graph data mining tasks when external supervision is unavailable. The challenge is to encode the information about the graph structure and the attributes associated with the nodes and edges into a low-dimensional space. Most existing unsupervised methods promote similar representations across nodes that are topologically close. Recently, it was shown that leveraging additional graph-level information, e.g., information that is shared among all nodes, encourages the representations to be mindful of the global properties of the graph, which greatly improves their quality. However, in most graphs, there is significantly more structure that can be captured; e.g., nodes tend to belong to (multiple) clusters that represent structurally similar nodes. Motivated by this observation, we propose a graph representation learning method called Graph InfoClust (GIC) that seeks to additionally capture cluster-level information content. These clusters are computed by a differentiable K-means method and are jointly optimized by maximizing the mutual information between nodes of the same clusters. This optimization leads the node representations to capture richer information and nodal interactions, which improves their quality. Experiments show that GIC outperforms state-of-the-art methods in various downstream tasks (node classification, link prediction, and node clustering) with a 0.9% to 6.1% gain over the best competing approach, on average.
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2020-09-15
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
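
    The differentiable K-means step can be sketched generically (this is my reconstruction of the idea, not GIC's code): soft-assign node embeddings to learnable centroids via a softmax over negative squared distances, keeping the clustering end-to-end differentiable.

```python
# Soft K-means: softmax over negative squared distances gives differentiable
# cluster assignments; cluster summaries are assignment-weighted means.
import numpy as np

rng = np.random.default_rng(0)
n, d, K, tau = 10, 8, 3, 1.0
Z = rng.normal(size=(n, d))               # node embeddings
C = rng.normal(size=(K, d))               # cluster centroids (learnable)

dist2 = ((Z[:, None, :] - C[None, :, :]) ** 2).sum(-1)   # (n, K) distances
logits = -dist2 / tau
logits -= logits.max(axis=1, keepdims=True)
R = np.exp(logits); R /= R.sum(axis=1, keepdims=True)    # soft assignments

summaries = (R.T @ Z) / R.sum(axis=0)[:, None]           # per-cluster summary
print(R.round(2))        # each row: a node's membership over the 3 clusters
print(summaries.shape)   # -> (3, 8)
```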


  10. Book ; Online: Context-aware Non-linear and Neural Attentive Knowledge-based Models for Grade Prediction

    Morsy, Sara / Karypis, George

    2020  

    Abstract Grade prediction for future courses not yet taken by students is important, as it can help them and their advisers during the process of course selection, as well as in designing personalized degree plans and modifying them based on performance. One of the successful approaches for accurately predicting a student's grades in future courses is Cumulative Knowledge-based Regression Models (CKRM). CKRM learns shallow linear models that predict a student's grades as the similarity between his/her knowledge state and the target course. However, prior courses taken by a student can contribute differently to the estimate of the student's knowledge state and towards each target course, which cannot be captured by linear models. Moreover, CKRM and other grade prediction methods ignore the effect of concurrently taken courses on a student's performance in a target course. In this paper, we propose context-aware non-linear and neural attentive models that can potentially better estimate a student's knowledge state from his/her prior course information, as well as model the interactions between a target course and concurrent courses. Compared to the competing methods, our experiments on a large real-world dataset consisting of more than 1.5M grades show the effectiveness of the proposed models in accurately predicting students' grades. Moreover, the attention weights learned by the neural attentive model can help students better design their degree plans.

    Comment: arXiv admin note: substantial text overlap with arXiv:1904.11858
    Keywords Computer Science - Machine Learning ; Computer Science - Computers and Society ; Statistics - Machine Learning
    Subject code 370 ; 006
    Publishing date 2020-03-09
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
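
    A minimal sketch of the neural attentive idea (course names, dimensions, and the scoring function are placeholders of mine): the target course supplies the query over prior-course embeddings, so each prior course contributes differently to the knowledge state per target course, which a plain linear sum cannot express.

```python
# Target-course-conditioned attention over prior-course embeddings: the
# knowledge state is an attention-weighted sum, and the learned weights show
# which prior courses mattered for this target course.
import numpy as np

rng = np.random.default_rng(0)
d = 8
course_emb = {c: rng.normal(size=d) for c in ["calc1", "calc2", "physics1", "linalg"]}

def predict_grade(prior, target):
    P = np.stack([course_emb[c] for c in prior])     # prior-course embeddings
    q = course_emb[target]                           # target course as query
    scores = P @ q / np.sqrt(d)
    att = np.exp(scores - scores.max()); att /= att.sum()
    knowledge = att @ P                              # context-aware knowledge state
    return knowledge @ q, dict(zip(prior, att.round(2)))

pred, weights = predict_grade(["calc1", "physics1", "linalg"], "calc2")
print(weights)   # per-course attention for this particular target course
```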
