Models

bart-large-cnn (Beta)
Summarization · facebook

BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. You can use this model for text summarization.

bge-base-en-v1.5
Text Embeddings · baai

BAAI general embedding (Base) model that transforms any given text into a 768-dimensional vector

bge-large-en-v1.5
Text Embeddings · baai

BAAI general embedding (Large) model that transforms any given text into a 1024-dimensional vector

bge-small-en-v1.5
Text Embeddings · baai

BAAI general embedding (Small) model that transforms any given text into a 384-dimensional vector
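
The three BGE entries differ mainly in output dimension (384, 768, or 1024). The following is a minimal sketch, assuming embeddings arrive as plain lists of floats, of how two such vectors might be compared with cosine similarity. The `EMBEDDING_DIMS` table and the placeholder vectors are illustrative only; they are not output from the models.

```python
import math

# Output dimensions as listed in the catalog entries above.
EMBEDDING_DIMS = {
    "bge-small-en-v1.5": 384,
    "bge-base-en-v1.5": 768,
    "bge-large-en-v1.5": 1024,
}


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension (same model)")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors standing in for real embeddings (hypothetical data).
dim = EMBEDDING_DIMS["bge-base-en-v1.5"]
vec_a = [1.0] * dim
vec_b = [1.0] * (dim // 2) + [0.0] * (dim // 2)
score = cosine_similarity(vec_a, vec_b)
print(f"cosine similarity: {score:.4f}")
```

Vectors produced by different BGE sizes have different lengths, so they cannot be compared with each other; the length check above makes that failure explicit.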

deepseek-coder-6.7b-base-awq (Beta)
Text Generation · thebloke

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese.

deepseek-coder-6.7b-instruct-awq (Beta)
Text Generation · thebloke

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese.

deepseek-math-7b-instruct (Beta)
Text Generation · deepseek-ai

DeepSeekMath-Instruct 7B is an instruction-tuned mathematics model derived from DeepSeekMath-Base 7B. DeepSeekMath is initialized with DeepSeek-Coder-v1.5 7B and continues pre-training for 500B tokens on math-related tokens sourced from Common Crawl, together with natural language and code data.

deepseek-r1-distill-qwen-32b
Text Generation · deepseek-ai

Deepseek R1 Distilled Qwen 32B

detr-resnet-50 (Beta)
Object Detection · facebook

DEtection TRansformer (DETR) model trained end-to-end on COCO 2017 object detection (118k annotated images).

discolm-german-7b-v1-awq (Beta)
Text Generation · thebloke

DiscoLM German 7b is a Mistral-based large language model with a focus on German-language applications. AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization.

distilbert-sst-2-int8
Text Classification · HuggingFace

Distilled BERT model that was finetuned on SST-2 for sentiment classification

dreamshaper-8-lcm (Beta)
Text-to-Image · lykon

Stable Diffusion model that has been fine-tuned to be better at photorealism without sacrificing range.

falcon-7b-instruct (Beta)
Text Generation · tiiuae

Falcon-7B-Instruct is a 7B-parameter causal decoder-only model built by TII, based on Falcon-7B and finetuned on a mixture of chat/instruct datasets.

flux-1-schnell
Text-to-Image · black-forest-labs

FLUX.1 [schnell] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions.

gemma-2b-it-lora (Beta)
Text Generation · Google

This is a Gemma-2B base model that Cloudflare dedicates for inference with LoRA adapters. Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.

• LoRA

gemma-7b-it-lora (Beta)
Text Generation · Google

This is a Gemma-7B base model that Cloudflare dedicates for inference with LoRA adapters. Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.

• LoRA

gemma-7b-it (Beta)
Text Generation · Google

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants.

• LoRA

hermes-2-pro-mistral-7b (Beta)
Text Generation · nousresearch

Hermes 2 Pro on Mistral 7B is the new flagship 7B Hermes! Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, consisting of an updated and cleaned version of the OpenHermes 2.5 Dataset, as well as a newly introduced Function Calling and JSON Mode dataset developed in-house.

• Function calling

llama-2-13b-chat-awq (Beta)
Text Generation · thebloke

Llama 2 13B Chat AWQ is an efficient, accurate and blazing-fast low-bit weight quantized Llama 2 variant.

llama-2-7b-chat-fp16
Text Generation · Meta

Full precision (fp16) generative text model with 7 billion parameters from Meta

llama-2-7b-chat-hf-lora (Beta)
Text Generation · meta-llama

This is a Llama2 base model that Cloudflare dedicated for inference with LoRA adapters. Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted for the Hugging Face Transformers format.

• LoRA

llama-2-7b-chat-int8
Text Generation · Meta

Quantized (int8) generative text model with 7 billion parameters from Meta

llama-3-8b-instruct-awq
Text Generation · Meta

Quantized (int4) generative text model with 8 billion parameters from Meta.
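
The fp16, int8, and int4 variants above trade numeric precision for a smaller weight footprint. As a back-of-the-envelope sketch, the storage needed for the weights alone can be estimated as parameters times bits per weight. The parameter counts below are the nominal figures from the entries (7 billion and 8 billion), and the helper name `weight_memory_bytes` is hypothetical. The estimate ignores activations, the KV cache, and any per-group scale factors a quantization scheme stores alongside the weights.

```python
def weight_memory_bytes(num_params, bits_per_weight):
    """Bytes needed to store the weights alone at the given precision."""
    return num_params * bits_per_weight // 8


GB = 10**9  # decimal gigabytes

# (catalog entry, nominal parameter count, bits per weight)
examples = [
    ("llama-2-7b-chat-fp16", 7 * 10**9, 16),
    ("llama-2-7b-chat-int8", 7 * 10**9, 8),
    ("llama-3-8b-instruct-awq", 8 * 10**9, 4),
]

for name, params, bits in examples:
    size_gb = weight_memory_bytes(params, bits) / GB
    print(f"{name}: ~{size_gb:.1f} GB of weights")
```

Under these assumptions, halving the bits per weight halves the weight footprint, which is the main reason the quantized variants are cheaper to serve.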

llama-3-8b-instruct
Text Generation · Meta

Generation over generation, Meta Llama 3 demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning.

llama-3.1-70b-instruct
Text Generation · Meta

The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.

llama-3.1-8b-instruct-awq
Text Generation · Meta

Quantized (int4) generative text model with 8 billion parameters from Meta.

llama-3.1-8b-instruct-fast
Text Generation · Meta

[Fast version] The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.

llama-3.1-8b-instruct-fp8
Text Generation · Meta

Llama 3.1 8B quantized to FP8 precision

llama-3.1-8b-instruct
Text Generation · Meta

The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.

llama-3.2-11b-vision-instruct
Text Generation · Meta

The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image.

llama-3.2-1b-instruct
Text Generation · Meta

The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.

llama-3.2-3b-instruct
Text Generation · Meta

The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.

llama-3.3-70b-instruct-fp8-fast
Text Generation · Meta

Llama 3.3 70B quantized to fp8 precision, optimized to be faster.

llamaguard-7b-awq (Beta)
Text Generation · thebloke

Llama Guard is a model for classifying the safety of LLM prompts and responses, using a taxonomy of safety risks.

llava-1.5-7b-hf (Beta)
Image-to-Text · llava-hf

LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model, based on the transformer architecture.

m2m100-1.2b
Translation · Meta

Multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation

meta-llama-3-8b-instruct
Text Generation · meta-llama

Generation over generation, Meta Llama 3 demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning.

mistral-7b-instruct-v0.1-awq (Beta)
Text Generation · thebloke

Mistral 7B Instruct v0.1 AWQ is an efficient, accurate and blazing-fast low-bit weight quantized Mistral variant.

mistral-7b-instruct-v0.1
Text Generation · MistralAI

Instruct fine-tuned version of the Mistral-7b generative text model with 7 billion parameters

• LoRA

mistral-7b-instruct-v0.2-lora (Beta)
Text Generation · MistralAI

The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.2.

• LoRA

mistral-7b-instruct-v0.2 (Beta)
Text Generation · MistralAI

The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.2. Mistral-7B-v0.2 has the following changes compared to Mistral-7B-v0.1: 32k context window (vs 8k context in v0.1), rope-theta = 1e6, and no Sliding-Window Attention.

• LoRA

neural-chat-7b-v3-1-awq (Beta)
Text Generation · thebloke

This model is a 7B-parameter LLM fine-tuned on the Intel Gaudi 2 processor from mistralai/Mistral-7B-v0.1, using the open-source dataset Open-Orca/SlimOrca.
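
The mistral-7b-instruct-v0.2 entry above lists rope-theta = 1e6 as one of its changes from v0.1, the base that neural-chat-7b-v3-1 also builds on. The sketch below illustrates what that parameter controls, using the standard rotary position embedding (RoPE) frequency formula. The head dimension of 128 and the comparison value of 1e4 are assumptions chosen for illustration; neither comes from this catalog, and the function names are hypothetical.

```python
import math


def rope_inverse_frequencies(head_dim, theta):
    """Standard RoPE inverse frequencies: theta ** (-2i / head_dim), i = 0 .. head_dim/2 - 1."""
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def longest_wavelength_tokens(head_dim, theta):
    """Wavelength, in token positions, of the slowest-rotating dimension pair."""
    slowest = rope_inverse_frequencies(head_dim, theta)[-1]
    return 2 * math.pi / slowest


HEAD_DIM = 128         # assumed head dimension (not stated in the catalog)
COMPARISON_THETA = 1e4  # assumed smaller value, for comparison only
V02_THETA = 1e6        # rope-theta listed in the v0.2 entry

comparison = longest_wavelength_tokens(HEAD_DIM, COMPARISON_THETA)
v02 = longest_wavelength_tokens(HEAD_DIM, V02_THETA)
print(f"longest wavelength at theta=1e4: {comparison:,.0f} tokens")
print(f"longest wavelength at theta=1e6: {v02:,.0f} tokens")
```

A larger theta slows the rotation of the low-frequency dimension pairs, so positions stay distinguishable over a longer span of tokens. That is consistent with the entry pairing the higher rope-theta with a 32k context window, although the catalog itself does not state the connection.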

openchat-3.5-0106 (Beta)
Text Generation · openchat

OpenChat is an innovative library of open-source language models, fine-tuned with C-RLFT, a strategy inspired by offline reinforcement learning.

openhermes-2.5-mistral-7b-awq (Beta)
Text Generation · thebloke

OpenHermes 2.5 Mistral 7B is a state-of-the-art Mistral fine-tune, a continuation of the OpenHermes 2 model, trained on additional code datasets.

phi-2 (Beta)
Text Generation · Microsoft

Phi-2 is a Transformer-based model with a next-word prediction objective, trained on 1.4T tokens from multiple passes on a mixture of Synthetic and Web datasets for NLP and coding.

qwen1.5-0.5b-chat (Beta)
Text Generation · qwen

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud.

qwen1.5-1.8b-chat (Beta)
Text Generation · qwen

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud.

qwen1.5-14b-chat-awq (Beta)
Text Generation · qwen

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization.

qwen1.5-7b-chat-awq (Beta)
Text Generation · qwen

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization.

resnet-50
Image Classification · Microsoft

50-layer image classification CNN trained on more than 1M images from ImageNet

sqlcoder-7b-2 (Beta)
Text Generation · defog

This model is intended to be used by non-technical users to understand data inside their SQL databases.

stable-diffusion-v1-5-img2img (Beta)
Text-to-Image · runwayml

Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images. Img2img generates a new image from an input image with Stable Diffusion.

stable-diffusion-v1-5-inpainting (Beta)
Text-to-Image · runwayml

Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask.

stable-diffusion-xl-base-1.0 (Beta)
Text-to-Image · Stability.ai

Diffusion-based text-to-image generative model by Stability AI. Generates and modifies images based on text prompts.

stable-diffusion-xl-lightning (Beta)
Text-to-Image · bytedance

SDXL-Lightning is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

starling-lm-7b-beta (Beta)
Text Generation · nexusflow

We introduce Starling-LM-7B-beta, an open large language model (LLM) trained by Reinforcement Learning from AI Feedback (RLAIF). Starling-LM-7B-beta is trained from Openchat-3.5-0106 with our new reward model Nexusflow/Starling-RM-34B and policy optimization method Fine-Tuning Language Models from Human Preferences (PPO).

tinyllama-1.1b-chat-v1.0 (Beta)
Text Generation · tinyllama

The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. This is the chat model finetuned on top of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T.

uform-gen2-qwen-500m (Beta)
Image-to-Text · unum

UForm-Gen is a small generative vision-language model primarily designed for Image Captioning and Visual Question Answering. The model was pre-trained on an internal image captioning dataset and fine-tuned on public instruction datasets: SVIT, LVIS, and VQA datasets.

una-cybertron-7b-v2-bf16 (Beta)
Text Generation · fblgit

Cybertron 7B v2 is a 7B MistralAI-based model, the best in its series. It was trained with SFT, DPO and UNA (Unified Neural Alignment) on multiple datasets.

whisper-large-v3-turbo (Beta)
Automatic Speech Recognition · OpenAI

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation.

whisper-tiny-en (Beta)
Automatic Speech Recognition · OpenAI

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalize to many datasets and domains without the need for fine-tuning. This is the English-only version of the Whisper Tiny model which was trained on the task of speech recognition.

whisper
Automatic Speech Recognition · OpenAI

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.

zephyr-7b-beta-awq (Beta)
Text Generation · thebloke

Zephyr 7B Beta AWQ is an efficient, accurate and blazing-fast low-bit weight quantized Zephyr model variant.