NeuroServicesNews

Open-Source Neural Networks: What They Are, Why They Matter, and How to Run Them Locally


Open-source neural networks are one of the most important trends in the world of AI. The ability to run powerful models on your own computer, without subscriptions, usage limits, or sending data to the cloud, is a game-changer. Let's explore which models are available, how to run them, and when it makes sense to do so.

What Does Open-Source Mean in the Context of AI?

In the world of neural networks, open-source means that the model's weights (its "knowledge" in the form of numerical parameters) and often the training code are available for download and use. However, there are nuances:

Levels of Openness

Fully Open: Code, weights, and training data are all available. Example: OLMo from AI2.

Open Weights: The model weights are available for download and use, but the training data is closed. Example: Llama from Meta, Mistral.

Limited Openness: Weights are available but with restrictions on commercial use or user count. Some models have special licenses.

Why Companies Open Their Models

  • Meta (Llama): To compete with Google and OpenAI and build an ecosystem.
  • Mistral: A startup approach—attract developers first, then monetize via API.
  • Stability AI: Believes in democratizing AI; monetizes through API and enterprise solutions.

Popular Open-Source Models

Language Models (Text)

Llama 3 (Meta)

Llama 3 is Meta's flagship open-source LLM, available in several sizes.

  • Llama 3 8B: A lightweight version that runs on a standard computer.
  • Llama 3 70B: A powerful version that competes with GPT-4.
  • Llama 3 405B: The most powerful version, requiring server-grade hardware.
  • License: Meta Llama License—free for commercial use up to 700 million users.

Mistral / Mixtral

The French startup Mistral releases compact yet powerful models.

  • Mistral 7B: Surprisingly capable for its size.
  • Mixtral 8x7B: Uses a Mixture of Experts architecture for speed and quality.
  • Mistral Large: Competes with the best closed-source models.
  • License: Apache 2.0—maximum freedom.
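The Mixture of Experts idea behind Mixtral can be sketched in a few lines: a gating network scores all experts for each token, but only the top two actually run, which is why a 46B-parameter model computes with only about 13B active parameters per token. This is a simplified illustration of top-2 routing, not Mixtral's actual implementation; the scores here are random stand-ins for real gate outputs.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of gate scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top2(gate_scores):
    """Top-2 routing: keep the two highest-scoring experts and renormalize
    their weights. The other 6 of 8 experts are never executed for this token."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    top2 = ranked[:2]
    weights = softmax([gate_scores[i] for i in top2])
    return list(zip(top2, weights))

# Toy example: 8 experts, random gate scores for one token.
random.seed(0)
scores = [random.gauss(0, 1) for _ in range(8)]
chosen = route_top2(scores)
print(chosen)  # two (expert_index, weight) pairs; weights sum to 1
```

Because only the selected experts' feed-forward layers run, inference cost scales with the two active experts rather than all eight, which is how Mixtral combines a large total parameter count with the speed of a much smaller model.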

Other Notable Models

  • Qwen 2.5 (Alibaba): Excellent support for Chinese and other languages.
  • Gemma 2 (Google): Compact models with good quality.
  • Phi-3 (Microsoft): Small models for edge devices.
  • DeepSeek: Strong performance in programming and mathematics.

Image Models

Stable Diffusion (Stability AI)

The most popular open-source model for image generation.

  • SDXL: The main high-quality model.
  • SD3: The latest version with an improved architecture.
  • Thousands of custom models and LoRAs on CivitAI.
  • Runs on a GPU with 8+ GB of VRAM.

FLUX

FLUX from Black Forest Labs is a new competitor to Stable Diffusion.

  • Excellent quality "out of the box."
  • Better handling of text within images.
  • Heavier to run, but delivers superior results.

Audio Models

Whisper (OpenAI)

Whisper is an open-source model for speech recognition. Despite being from OpenAI, the model is fully open.

  • Supports 99 languages, including Russian.
  • Runs locally.
  • Several sizes: from tiny (39M parameters) to large (1.5B).
  • License: MIT—complete freedom.

Bark

Bark is a model for generating speech from text. It supports realistic speech in multiple languages and can generate laughter, pauses, and emotions. It is fully open-source.

Tools for Local Deployment

Ollama — The Simplest Way

Ollama is a tool that makes running LLMs locally as easy as installing a regular program.

Installation:

curl -fsSL https://ollama.com/install.sh | sh

Running a Model:

ollama run llama3

That's it. The model will download and start. You can interact via the command line or API.

Ollama Features:

  • A catalog of hundreds of models.
  • Automatic memory management.
  • Built-in API server (compatible with OpenAI API).
  • Works on Mac, Linux, Windows.
  • GPU support (NVIDIA, AMD, Apple Silicon).
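Because Ollama's built-in server speaks the OpenAI API dialect, you can talk to a local model with plain HTTP and no extra libraries. A minimal sketch using only the Python standard library, assuming Ollama is running locally on its default port and the llama3 model has already been pulled:

```python
import json
import urllib.request

# Ollama's OpenAI-compatible endpoint on the default port.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model, user_message):
    """Build an OpenAI-style chat payload that Ollama's /v1 endpoint accepts."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,
    }

def ask(model, user_message):
    """Send a chat request to a locally running Ollama server and return the reply."""
    payload = json.dumps(build_chat_request(model, user_message)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (requires a running Ollama server with llama3 pulled):
# print(ask("llama3", "What is a neural network?"))
```

Since the payload and response follow the OpenAI format, tools and SDKs written for the OpenAI API can usually be pointed at Ollama just by changing the base URL.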

LM Studio — For Those Who Prefer a GUI

LM Studio is a desktop application with a graphical interface for running LLMs.

Capabilities:

  • User-friendly chat interface.
  • Model catalog with search and filters.
  • Built-in API server.
  • Resource usage visualization.
  • Side-by-side model comparison.

text-generation-webui — For Advanced Users

text-generation-webui (oobabooga) is a web interface with maximum configurability. It supports all model formats (GGUF, GPTQ, AWQ), extensions for RAG and voice input, and an API for integration.

ComfyUI — For Image Generation

ComfyUI is a visual pipeline builder for Stable Diffusion. Its node-based interface allows you to visually construct the generation process with full control. It supports ControlNet, LoRA, IP-Adapter and has a huge library of community extensions.

Hardware Requirements

Minimum Requirements for Different Models

Model                    Parameters   RAM      VRAM (GPU)   Storage
Mistral 7B (Q4)          7B           8 GB     6 GB         4 GB
Llama 3 8B (Q4)          8B           8 GB     6 GB         5 GB
Mixtral 8x7B (Q4)        46B          32 GB    24 GB        26 GB
Llama 3 70B (Q4)         70B          64 GB    48 GB        40 GB
Stable Diffusion XL      —            16 GB    8 GB         7 GB
Whisper Large            1.5B         8 GB     6 GB         3 GB

Q4 denotes 4-bit quantization: it shrinks a model to roughly a quarter of its 16-bit size with minimal quality loss.
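The size estimates above come down to simple arithmetic: parameters times bits per weight. A quick sketch (real GGUF files run slightly larger because some tensors are kept at higher precision):

```python
def approx_size_gb(n_params_billion, bits_per_weight):
    """Rough model size in gigabytes: parameters × bits per weight / 8.
    Uses 1 GB = 1e9 bytes; actual files are a bit larger due to metadata
    and tensors stored at higher precision."""
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1e9

models = [("Mistral 7B", 7), ("Llama 3 8B", 8),
          ("Mixtral 8x7B", 46), ("Llama 3 70B", 70)]

for name, params in models:
    fp16 = approx_size_gb(params, 16)
    q4 = approx_size_gb(params, 4)
    print(f"{name}: FP16 ≈ {fp16:.0f} GB, Q4 ≈ {q4:.1f} GB")
```

For Mistral 7B this gives 14 GB at FP16 versus 3.5 GB at Q4, which lines up with the ~4 GB storage figure in the table.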

Recommended Configurations

Entry Level (~$500-800):

  • CPU: Any modern CPU.
  • RAM: 16 GB.
  • GPU: NVIDIA RTX 3060 12GB or RTX 4060.
  • Suitable for: 7-8B models, Stable Diffusion.

Mid Level (~$1500-2500):

  • CPU: AMD Ryzen 7 / Intel i7.
  • RAM: 32 GB.
  • GPU: NVIDIA RTX 4070 Ti 16GB or RTX 4080.
  • Suitable for: 13-30B models, fast image generation.

Advanced Level (~$3000+):

  • CPU: AMD Ryzen 9 / Intel i9.
  • RAM: 64+ GB.
  • GPU: NVIDIA RTX 4090 24GB.
  • Suitable for: 70B models (quantized), professional work.

Apple Silicon

Macs with M1/M2/M3 chips are an excellent option for local models:

  • Unified Memory is used as both RAM and VRAM.
  • A MacBook Pro M3 Max with 64 GB RAM can run 70B models.
  • Energy efficiency is significantly higher than a PC with a GPU.
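The "64 GB runs a 70B model" claim follows from the same arithmetic: a 70B model at 4-bit quantization needs about 35 GB for weights, leaving room in unified memory for the KV cache and the OS. A rough fit check, where the overhead allowance is an assumed ballpark figure, not a measured value:

```python
def fits_in_memory(n_params_billion, bits_per_weight, memory_gb, overhead_gb=8.0):
    """Rough check: quantized weight size plus a fixed allowance for the
    KV cache, the OS, and other applications (overhead_gb is an assumption)."""
    weights_gb = n_params_billion * bits_per_weight / 8  # 70B at 4 bits ≈ 35 GB
    return weights_gb + overhead_gb <= memory_gb

print(fits_in_memory(70, 4, 64))   # Q4: 35 + 8 = 43 GB → fits in 64 GB
print(fits_in_memory(70, 16, 64))  # FP16: 140 + 8 GB → does not fit
```

The same check explains the table above: unquantized 16-bit weights for a 70B model would need around 140 GB, which is why quantization is what makes local 70B inference practical at all.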

Open-Source vs. Cloud Services

Advantages of Local Deployment

  1. Privacy: Data never leaves your computer.
  2. No Subscriptions: Download once, use forever.
  3. No Limits: Generate as much as you want.
  4. Customization: Fine-tuning, quantization, any settings.
  5. Offline Work: No internet required.
  6. Speed for Large Volumes: No network latency.

Advantages of the Cloud

  1. Quality: GPT-4 and Claude 4 still outperform open-source models in most tasks.
  2. Simplicity: Register and start working.
  3. No Hardware Needed: Works on any device.
  4. Updates: Models are updated automatically.
  5. Multimodality: Cloud models can do more.

When to Choose Open-Source

  • Confidential data (medicine, finance, law).
  • Large-scale generation (thousands of requests per day).
  • Specific tasks requiring fine-tuning.
  • Offline environments or poor internet.
  • Desire for complete control.

Step-by-Step Tutorial — Running Ollama

Step 1 — Installation

Linux:

curl -fsSL https://ollama.com/install.sh | sh

Mac: Download from ollama.com and install like a regular application.

Windows: Download the installer from ollama.com.

Step 2 — Running Your First Model

ollama run mistral

The model will download automatically (~4 GB) and start. You'll see a prompt for input.

Step 3 — Dialogue

>>> Hello! Tell me about yourself
Hello! I am Mistral, a language model...

Step 4 — Using the API

Ollama automatically starts an API server on port 11434:

curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "What is a neural network?"
}'
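By default /api/generate streams its answer as one JSON object per line, each carrying a "response" fragment, with "done": true on the final line. A stdlib-only sketch of the same call from Python, assuming a local Ollama server with the model already pulled:

```python
import json
import urllib.request

def join_stream_chunks(ndjson_lines):
    """Concatenate the "response" fragments from a stream of JSON lines,
    stopping at the object marked "done": true."""
    parts = []
    for line in ndjson_lines:
        obj = json.loads(line)
        parts.append(obj.get("response", ""))
        if obj.get("done"):
            break
    return "".join(parts)

def generate(model, prompt, url="http://localhost:11434/api/generate"):
    """Call a locally running Ollama server and return the full generated text."""
    payload = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        lines = (line.decode("utf-8") for line in resp if line.strip())
        return join_stream_chunks(lines)

# Example (requires a running Ollama server):
# print(generate("mistral", "What is a neural network?"))
```

Streaming means the first tokens arrive almost immediately; if you prefer a single JSON response instead, add "stream": false to the request body.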

Step 5 — Connecting to Applications

Many applications support Ollama:

  • Open WebUI: A web interface similar to ChatGPT.
  • Continue: An AI assistant for VS Code.
  • Obsidian Copilot: AI in Obsidian notes.

Conclusion

Open-source neural networks are making AI accessible to everyone—without subscriptions, limits, or sharing data with third parties. With tools like Ollama, launching a model takes just one command. Yes, open-source models still lag behind the best commercial ones in general tasks, but the gap is closing rapidly. For many specialized tasks, they are already on par or even better. Try installing Ollama and running Mistral—it takes 5 minutes and could change your approach to working with AI.
