Ollama for Ampere-Based Servers

How to configure View to use the Ampere-optimized version of Ollama

Overview

This guide provides instructions for configuring the OLLAMA_IMAGE environment variable to use Ampere-optimized Ollama containers with your View deployment. You can either set this as a persistent environment variable or configure multiple Ollama instances to run concurrently.

Environment Variable Configuration

Default Configuration

  • Variable: OLLAMA_IMAGE
  • Default Value: ollama/ollama
  • Ampere-optimized: ghcr.io/amperecomputingai/ollama-ampere:0.0.8-ub25
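
Before changing any configuration, you can optionally confirm that the Ampere image is reachable from your server by pulling it directly (this assumes Docker is installed and the host has outbound access to ghcr.io):

docker pull ghcr.io/amperecomputingai/ollama-ampere:0.0.8-ub25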

Checking for Latest Versions

Before proceeding, check the Ampere Computing AI container registry (the ghcr.io/amperecomputingai/ollama-ampere package on GitHub Container Registry) for the latest available version tags.
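
If you have skopeo installed, you can list the currently published tags from the command line:

skopeo list-tags docker://ghcr.io/amperecomputingai/ollama-ampere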

Setting Environment Variables

Temporary Setting (Current Session Only)

To set the environment variable for the current terminal session only:

export OLLAMA_IMAGE=ghcr.io/amperecomputingai/ollama-ampere:0.0.8-ub25

This setting will be lost when you close the terminal or start a new session.
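
To confirm that Compose will actually pick up the override (assuming the View stack is a standard Docker Compose project), you can render the resolved configuration and check the image line:

cd /path/to/View
docker compose config | grep 'image:'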

Persistent Setting (Across Sessions)

To make the environment variable persistent across all new sessions, add it to your shell profile:

For Bash users:

echo 'export OLLAMA_IMAGE=ghcr.io/amperecomputingai/ollama-ampere:0.0.8-ub25' >> ~/.bashrc
source ~/.bashrc

For Zsh users:

echo 'export OLLAMA_IMAGE=ghcr.io/amperecomputingai/ollama-ampere:0.0.8-ub25' >> ~/.zshrc
source ~/.zshrc

For system-wide setting:

echo 'OLLAMA_IMAGE=ghcr.io/amperecomputingai/ollama-ampere:0.0.8-ub25' | sudo tee -a /etc/environment

The append is done through sudo tee -a because shell redirection (>>) would otherwise run without elevated privileges. Also note that /etc/environment expects plain KEY=value lines (no export) and is read at login, so you may need to log out and back in for the change to take effect.

Verifying the Setting

To verify your environment variable is set correctly:

echo $OLLAMA_IMAGE
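
Once View is running, you can also confirm that the container was created from the expected image by inspecting it (the container is named ollama per the compose file):

docker inspect --format '{{.Config.Image}}' ollama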

Applying Configuration Changes

After making any changes to environment variables or the compose.yaml file, you must restart View for the changes to take effect:

cd /path/to/View
./viewctl restart

Single Instance Configuration

By default, View runs a single Ollama instance, so setting the OLLAMA_IMAGE environment variable is sufficient to control which Ollama container image runs. The existing Docker Compose configuration will automatically use your specified image; the ${OLLAMA_IMAGE:-ollama/ollama} interpolation falls back to the stock ollama/ollama image when the variable is unset:

ollama-cpu:
  profiles: ["cpu"]
  image: ${OLLAMA_IMAGE:-ollama/ollama}
  container_name: ollama
  networks:
    - private
  volumes:
    - ./working/ollama:/root/.ollama
  restart: unless-stopped
  depends_on:
    mysql:
      condition: service_healthy
    rabbitmq:
      condition: service_healthy

Multiple Instance Configuration

If you want to run multiple Ollama instances concurrently (for example, to compare performance between standard and Ampere-optimized versions), you can add additional services to your compose.yaml file in the View directory.

Adding an Ampere Instance

Add the following service definition under the services: key in your compose.yaml file:

ollama-ampere:
  image: ghcr.io/amperecomputingai/ollama-ampere:0.0.8-ub25
  container_name: ampere-ollama
  networks:
    - private
  volumes:
    - ./working/ollama:/root/.ollama
  restart: unless-stopped
  depends_on:
    mysql:
      condition: service_healthy
    rabbitmq:
      condition: service_healthy
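
Note that as written, both services mount the same ./working/ollama directory, so they share one model store. If you would rather keep the Ampere instance's models separate, you could point it at its own directory (./working/ollama-ampere is a hypothetical path; choose whatever suits your layout):

ollama-ampere:
  # ...same service definition as above, but with a dedicated model directory
  volumes:
    - ./working/ollama-ampere:/root/.ollama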

Configuring View to Use Multiple Instances

When running multiple instances, you can configure View to use specific instances:

  1. Navigate to the View assistant settings page: /assistant?view=settings
  2. In the "Ollama hostname" field, enter the container name of the desired instance:
    • For the standard instance: ollama
    • For the Ampere instance: ampere-ollama
    • For any custom instance: use the container_name you specified

Remember: After adding new services to compose.yaml, restart View using viewctl restart from the View directory.
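
Once restarted, a quick sanity check is to list the models each instance can see, addressing each container by name (this assumes the Ampere image ships the ollama CLI like the stock image does):

docker exec ollama ollama list
docker exec ampere-ollama ollama list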

Resetting to Default

To revert to the default Ollama image:

Temporary reset:

unset OLLAMA_IMAGE

Permanent reset:

Remove the export line from your shell profile (~/.bashrc or ~/.zshrc), or the OLLAMA_IMAGE line from /etc/environment, then restart your terminal or run:

source ~/.bashrc  # or ~/.zshrc

Important Considerations

⚠️ Important Notice

The Ampere-optimized Ollama instance may not support all models available in the standard Ollama distribution. Compatibility and performance may vary depending on the specific models you intend to use.

These instructions are provided as-is for guidance purposes. View is not responsible for the Ollama codebase, Ampere optimizations, or any issues that may arise from their use. We can only provide guidance based on configurations that have been successful in our testing environment.

Users are encouraged to test thoroughly in their specific environments and refer to the official Ampere Computing AI documentation for technical support related to the Ampere-optimized containers.

Troubleshooting

  • Environment variable not taking effect: Ensure you've restarted your terminal or sourced your profile after making changes, then restart View with viewctl restart
  • Container not starting: Verify the image name and version are correct by checking the Ampere repository
  • Model compatibility issues: Some models may not be optimized for or compatible with the Ampere version
  • Performance concerns: Compare performance between standard and Ampere versions with your specific workload
  • Changes not reflected: Always run viewctl restart from the View directory after modifying configuration files or environment variables
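
When diagnosing startup or model-loading problems, the container logs are usually the fastest source of detail:

docker logs --tail 50 ollama
docker logs --tail 50 ampere-ollama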

For additional support with View-specific configurations, consult the View documentation. For Ollama and Ampere-optimization related issues, refer to the respective project repositories.