Implementation of Perceiver AR, DeepMind's long-context attention network based on the Perceiver architecture, in PyTorch
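For orientation, a minimal single-head sketch of the core Perceiver AR idea — only the last `n_latents` positions act as queries, but they attend causally over the whole sequence, so cost scales with `n_latents * seq_len` rather than `seq_len ** 2`. All names here are illustrative, not the repo's API:

```python
import torch
import torch.nn.functional as F

def perceiver_ar_attention(x, n_latents, wq, wk, wv):
    """Single-head sketch: queries come only from the latent (last)
    positions, keys/values from the full sequence, with causal masking."""
    seq_len, dim = x.shape
    q = x[-n_latents:] @ wq          # queries: latent positions only
    k = x @ wk                       # keys/values: whole sequence
    v = x @ wv
    scores = q @ k.t() / dim ** 0.5  # (n_latents, seq_len)
    # causal mask: latent i sits at global position seq_len - n_latents + i
    pos_q = torch.arange(seq_len - n_latents, seq_len).unsqueeze(1)
    pos_k = torch.arange(seq_len).unsqueeze(0)
    scores = scores.masked_fill(pos_k > pos_q, float("-inf"))
    return F.softmax(scores, dim=-1) @ v
```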
My own attempt at a long-context genomics model, leveraging recent advances in long-context attention modeling (FlashAttention plus other hierarchical methods)
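As an aside on the FlashAttention ingredient: in recent PyTorch the fused kernel is reachable directly through `scaled_dot_product_attention` (the `sdpa_kernel` context manager needs PyTorch >= 2.3 and a supported GPU; shapes below are illustrative):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel  # PyTorch >= 2.3

# Illustrative shapes: (batch, heads, seq_len, head_dim). A long genomic
# sequence stays tractable because the fused kernel never materializes
# the (seq_len x seq_len) attention matrix.
q = torch.randn(1, 8, 65536, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

with sdpa_kernel(SDPBackend.FLASH_ATTENTION):  # force the flash backend
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```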
RAN: Recurrent Attention Networks for Long-Text Modeling | Findings of ACL 2023
LongQLoRA: Extend Context Length of LLMs Efficiently
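LongQLoRA builds on position interpolation (combined with QLoRA finetuning). A sketch of that ingredient alone, with a hypothetical `rope_frequencies` helper rather than the repo's API:

```python
import torch

def rope_frequencies(dim, positions, base=10000.0, scale=1.0):
    """Rotary embedding angles with position interpolation: dividing
    positions by `scale` squeezes a longer context into the position
    range the model was pretrained on."""
    inv_freq = 1.0 / base ** (torch.arange(0, dim, 2).float() / dim)
    angles = (positions.float() / scale).unsqueeze(-1) * inv_freq
    return angles.cos(), angles.sin()

# extend a model trained at 4k to 16k: interpolate by a factor of 4
cos, sin = rope_frequencies(64, torch.arange(16384), scale=4.0)
```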
[DEPRECATED] Very fast, large music transformer with 8k sequence length, efficient heptabit MIDI note encoding, true full MIDI instrument range, chord counters, and outro tokens
Streamlined variant of Long-Range Arena with pinned dependencies, automated data downloads, and deterministic shuffling.
Implementation of Recurrent Memory Transformer (NeurIPS 2022 paper) in PyTorch
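A toy loop over segments capturing the recurrent-memory idea: memory tokens are concatenated to each segment and their transformed states carry over to the next segment. The paper's causal variant uses separate read and write memory tokens; this bidirectional sketch folds them together, and the module is hypothetical:

```python
import torch
import torch.nn as nn

class RecurrentMemorySketch(nn.Module):
    """Memory tokens prepended to each segment; their output states
    become the recurrent state fed into the next segment."""
    def __init__(self, dim=256, n_mem=8, depth=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.memory = nn.Parameter(torch.randn(1, n_mem, dim))
        self.n_mem = n_mem

    def forward(self, segments):  # list of (batch, seg_len, dim) tensors
        mem = self.memory.expand(segments[0].size(0), -1, -1)
        outs = []
        for seg in segments:
            h = self.encoder(torch.cat([mem, seg], dim=1))
            mem, out = h[:, :self.n_mem], h[:, self.n_mem:]
            outs.append(out)
        return torch.cat(outs, dim=1), mem
```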
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
ACL 2024 | LooGLE: Long-Context Evaluation for Long-Context Language Models
Finetuning and evaluating LLMs to extract GHG emissions from PDF reports using RAG and grammar-based decoding.
Implementation of paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
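LM-Infinite's Λ-shaped mask is simple enough to sketch directly: each position attends to a few global tokens at the start of the sequence plus a causal local window. This omits the paper's distance-ceiling trick for position encodings, and the defaults are illustrative:

```python
import torch

def lambda_mask(seq_len, n_global=4, window=2048):
    """Λ-shaped attention mask: keep the first `n_global` keys plus a
    causal local window, so attention cost stays linear at generation
    time. Returns True where attention is allowed."""
    i = torch.arange(seq_len).unsqueeze(1)   # query positions
    j = torch.arange(seq_len).unsqueeze(0)   # key positions
    causal = j <= i
    keep = (j < n_global) | (i - j < window)
    return causal & keep
```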
Needle-in-a-haystack testing for LLMs
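The setup is small enough to sketch end to end: bury one target fact at a controlled depth inside long filler text and check whether the model retrieves it. The filler sentences, needle, and question below are placeholders:

```python
import random

def build_haystack(needle, filler_sentences, total_sents=2000, depth=0.5):
    """Insert the needle `depth` of the way through the filler context,
    then append a retrieval question."""
    sents = [random.choice(filler_sentences) for _ in range(total_sents)]
    sents.insert(int(total_sents * depth), needle)
    context = " ".join(sents)
    question = "What is the magic number mentioned in the text?"
    return f"{context}\n\n{question}"

prompt = build_haystack(
    "The magic number is 48731.",
    ["The sky was clear that day.", "Traffic moved slowly downtown."],
)
```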
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
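A simplified sketch of InfLLM's block-retrieval idea: distant keys are grouped into blocks, each summarized by a representative, and the current query attends only to the top-k most relevant blocks. The repo scores selected representative tokens rather than the mean key used here, and all names are hypothetical:

```python
import torch

def select_memory_blocks(q, past_k, block=128, topk=4):
    """Training-free memory lookup: summarize each block of past keys
    by its mean, score blocks against the query, return the keys of
    the top-k blocks for attention."""
    n_blocks = past_k.size(0) // block
    blocks = past_k[: n_blocks * block].view(n_blocks, block, -1)
    reps = blocks.mean(dim=1)                 # (n_blocks, dim)
    scores = reps @ q                         # relevance per block
    idx = scores.topk(min(topk, n_blocks)).indices
    return blocks[idx].reshape(-1, past_k.size(-1))  # selected keys
```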
The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"
LongAlign: A Recipe for Long Context Alignment Encompassing Data, Training, and Evaluation
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
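A toy two-scale stack in the MEGABYTE spirit: a global model runs over patch embeddings (seq_len / patch positions) and a small local model predicts the bytes inside each patch. Causal masking and the paper's patch-offsetting trick are omitted for brevity, and the module is hypothetical:

```python
import torch
import torch.nn as nn

class MegabyteSketch(nn.Module):
    """Global model over concatenated-byte patch embeddings, local
    model over the bytes within each patch."""
    def __init__(self, dim=256, patch=8, vocab=256):
        super().__init__()
        self.patch = patch
        self.byte_emb = nn.Embedding(vocab, dim)
        g_layer = nn.TransformerEncoderLayer(dim * patch, 8, batch_first=True)
        l_layer = nn.TransformerEncoderLayer(dim, 4, batch_first=True)
        self.global_model = nn.TransformerEncoder(g_layer, 2)
        self.local_model = nn.TransformerEncoder(l_layer, 2)
        self.to_logits = nn.Linear(dim, vocab)

    def forward(self, bytes_in):  # (batch, seq), seq % patch == 0
        b, n = bytes_in.shape
        x = self.byte_emb(bytes_in)                      # (b, n, dim)
        patches = x.view(b, n // self.patch, -1)         # bytes -> patches
        ctx = self.global_model(patches).view(b, n, -1)  # back to bytes
        h = self.local_model((x + ctx).view(b * n // self.patch, self.patch, -1))
        return self.to_logits(h).view(b, n, -1)
```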
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)
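The compressive-memory update at the heart of Infini-attention is compact enough to sketch. This shows the paper's linear update variant (sigma = elu + 1, linear-attention style); the learned gate that blends memory output with local attention is omitted:

```python
import torch
import torch.nn.functional as F

def infini_memory_step(q, k, v, mem, z):
    """One segment: retrieve from the running memory with the current
    queries, then fold this segment's keys/values into it.
    Shapes: q, k (seg, dim_k); v (seg, dim_v); mem (dim_k, dim_v);
    z (dim_k, 1), the normalizer. Initialize mem and z to zeros."""
    sq, sk = F.elu(q) + 1, F.elu(k) + 1
    retrieved = (sq @ mem) / (sq @ z).clamp(min=1e-6)  # (seg, dim_v)
    mem = mem + sk.t() @ v                             # accumulate KV
    z = z + sk.sum(dim=0, keepdim=True).t()            # accumulate keys
    return retrieved, mem, z
```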
"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang.
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
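For reference, a naive sequential version of Mamba's selective scan (the real implementation fuses this recurrence into a parallel kernel; this sketch drops the batch dimension):

```python
import torch

def selective_scan_ref(x, dt, A, B, C):
    """h_t = exp(dt_t * A) * h_{t-1} + dt_t * B_t * x_t;  y_t = C_t . h_t.
    Selectivity: B, C, dt are input-dependent.
    Shapes: x, dt (L, d_inner); A (d_inner, d_state); B, C (L, d_state)."""
    seq_len, d_inner = x.shape
    d_state = A.shape[1]
    h = torch.zeros(d_inner, d_state)
    ys = []
    for t in range(seq_len):
        dA = torch.exp(dt[t].unsqueeze(-1) * A)               # discretized A
        dBx = dt[t].unsqueeze(-1) * B[t] * x[t].unsqueeze(-1)  # input term
        h = dA * h + dBx
        ys.append((h * C[t]).sum(-1))                          # (d_inner,)
    return torch.stack(ys)
```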