CSSE releases German GPT-2 Medium

25.11.2024

The CSSE is excited to announce the release of kkirchheim/german-gpt2-medium, a GPT-2-style language model pre-trained on German text that offers improved capabilities for German natural language processing. Existing German-only models often fall short in context window size and parameter count. Our model bridges this gap, making it better equipped for tasks that require understanding longer contexts while keeping the parameter count modest. Running the quantized model requires less than 1 GB of VRAM. Technical details are available on the Hugging Face model card.
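
The model can be loaded with the Hugging Face transformers library. A minimal usage sketch follows; the prompt and sampling settings are illustrative, not part of the release:

  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "kkirchheim/german-gpt2-medium"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id)

  # Generate a short German continuation for a sample prompt.
  prompt = "Die Hauptstadt von Deutschland ist"
  inputs = tokenizer(prompt, return_tensors="pt")
  outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))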

Key Features

  • 358M Parameters: Over twice the size of existing German-only GPT-2 models.
  • Extended Context Length: A context window of 2048 tokens, double the standard 1024 tokens in similar models.
  • High-Quality Dataset: Trained on 300 GB of high-quality German text, leveraging the German Colossal, Cleaned Common Crawl Corpus (GC4).
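
The announcement does not state which quantization scheme the sub-1 GB VRAM figure refers to; as a rough check, 358M parameters at one byte per weight (8-bit) occupy about 0.36 GB. The sketch below loads the model in 8 bits via the bitsandbytes backend, which is one common option and an assumed configuration, not necessarily the released one:

  from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

  model_id = "kkirchheim/german-gpt2-medium"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  # 8-bit loading requires the bitsandbytes and accelerate packages and a CUDA device.
  model = AutoModelForCausalLM.from_pretrained(
      model_id,
      quantization_config=BitsAndBytesConfig(load_in_8bit=True),
      device_map="auto",
  )

  # Report the weight memory footprint; expected to be well under 1 GB.
  print(f"Weight memory: {model.get_memory_footprint() / 1e9:.2f} GB")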
