
Transformer (7)

Searching for Best Practices in Retrieval-Augmented Generation (Review) | Searching for Best Practices in Retrieval-Augmented Generation: "Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating up-to-date information, mitigating hallucinations, and enhancing response quality, particularly in specialized domains. While many RAG approaches have been proposed.." (arxiv.org) It feels like a while since my last paper review. I have been following current trends and techniques on LinkedIn, where this came up as "a must-read paper for optimizing RAG.. 2024. 7. 17.
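Since this post is about tuning RAG pipelines, a minimal retrieve-then-generate sketch in Python may help readers who are new to the idea. It is not taken from the paper: the toy corpus, the bag-of-words cosine retrieval, and the prompt template are illustrative stand-ins for the dense embedding model, vector index, and LLM call a real pipeline would use.

# Minimal retrieve-then-generate sketch (illustrative, not from the paper).
# Retrieval here is a toy cosine similarity over bag-of-words vectors;
# a real pipeline would use an embedding model and a vector index.
from collections import Counter
import math

docs = [
    "Infini-attention adds a compressive memory to scale attention to long inputs.",
    "Jamba interleaves Transformer and Mamba layers with mixture-of-experts blocks.",
    "Retrieval-augmented generation grounds model answers in retrieved documents.",
]

def bow(text):
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in a)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve(query, k=2):
    q = bow(query)
    return sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

def build_prompt(query):
    context = "\n".join(retrieve(query))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("What does retrieval-augmented generation do?"))

The paper's experiments are about choosing each of these modules (chunking, embedding, retrieval, reranking, prompting) well; the sketch only shows where those choices plug in.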
Code Review of infini-attention, an Effective Attention Mechanism https://github.com/jlamprou/Infini-Attention/blob/main/infiniAttention.py Infini-Attention/infiniAttention.py at main · jlamprou/Infini-Attention: Efficient Infinite Context Transformers with Infini-attention, PyTorch implementation + QwenMoE implementation + training script + 1M context keypass retrieval - jlamprou/Infini-Attention (github.com) + Since the blog does not render well, https://github.com/jh941213/Code_revi.. 2024. 4. 18.
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention (Paper Review) The paper I am introducing today covers infini-attention, a new mechanism for handling long contexts effectively. https://arxiv.org/abs/2404.07143 Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention: "This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. A key component in our proposed approach.." 2024. 4. 16.
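For context alongside the review, here is a minimal single-head PyTorch sketch of the mechanism as the abstract describes it: local softmax attention over the current segment is combined, through a learned gate, with a retrieval from a compressive linear-attention memory that is updated segment by segment. The ELU+1 feature map, the additive memory update, and the scalar gate follow my reading of the paper and are simplified; the jlamprou/Infini-Attention repo linked above has a complete implementation.

# Single-head sketch of the infini-attention idea (simplified assumptions).
import torch
import torch.nn.functional as F

def elu1(x):                      # feature map sigma(x) = ELU(x) + 1, keeps values positive
    return F.elu(x) + 1.0

def infini_attention_segment(q, k, v, M, z, beta):
    # q, k, v: (seg_len, d); M: (d, d) compressive memory; z: (d,) normalizer; beta: scalar gate
    d = q.size(-1)

    # 1) retrieval from the compressive memory (linear attention)
    sq = elu1(q)
    a_mem = (sq @ M) / (sq @ z).clamp(min=1e-6).unsqueeze(-1)

    # 2) local causal softmax attention within the current segment
    scores = (q @ k.transpose(0, 1)) / d ** 0.5
    mask = torch.triu(torch.ones_like(scores), diagonal=1).bool()
    a_dot = F.softmax(scores.masked_fill(mask, float("-inf")), dim=-1) @ v

    # 3) gate the two streams with a learned scalar
    g = torch.sigmoid(beta)
    out = g * a_mem + (1.0 - g) * a_dot

    # 4) fold this segment's keys/values into the memory for later segments
    sk = elu1(k)
    M = M + sk.transpose(0, 1) @ v
    z = z + sk.sum(dim=0)
    return out, M, z

Starting from M = torch.zeros(d, d), z = torch.zeros(d) and a learnable beta, the same function is applied to each consecutive segment, so memory and compute stay bounded regardless of total sequence length.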
Jamba: A Hybrid Transformer-Mamba Language Model (Review) Jamba: A Hybrid Transformer-Mamba Language Model: "We present Jamba, a new base large language model based on a novel hybrid Transformer-Mamba mixture-of-experts (MoE) architecture. Specifically, Jamba interleaves blocks of Transformer and Mamba layers, enjoying the benefits of both model families. MoE is.." (arxiv.org) ai21labs/Jamba-v0.1 · Hugging Face: Model Card for Jamba. Jamba is a state-of-the-art.. 2024. 4. 2.
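As a rough illustration of the interleaving the abstract describes, the Python sketch below only builds a layer schedule; the ratios used (one attention layer per 8-layer block, MoE on every other MLP) are my assumptions from reading the paper, and the actual Mamba, attention, and MoE modules are left out.

# Schedule-only sketch of a hybrid Transformer-Mamba stack (illustrative ratios).
from dataclasses import dataclass

@dataclass
class LayerSpec:
    mixer: str   # "attention" or "mamba"
    mlp: str     # "dense" or "moe"

def jamba_like_schedule(n_blocks=4, layers_per_block=8, attn_index=4, moe_every=2):
    # attn_index and moe_every are hypothetical knobs: one attention layer per block,
    # and an MoE MLP on every other layer across the whole stack.
    layers = []
    for _ in range(n_blocks):
        for i in range(layers_per_block):
            mixer = "attention" if i == attn_index else "mamba"
            mlp = "moe" if len(layers) % moe_every == 1 else "dense"
            layers.append(LayerSpec(mixer, mlp))
    return layers

for idx, spec in enumerate(jamba_like_schedule(n_blocks=1)):
    print(idx, spec.mixer, spec.mlp)

The point of the design, as the abstract puts it, is that Mamba layers keep the KV-cache and compute cost of long contexts low while the occasional attention layer and MoE MLPs recover modeling capacity.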