Configuration

Wingman has a built-in configuration window that persists configuration locally in the VS Code settings file in your repository. In the future this may move to a dedicated file that you can gitignore separately.
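For illustration, workspace settings persist to .vscode/settings.json. The keys below are hypothetical (the configuration window manages them for you); this is only a sketch of where the values land:

```json
// .vscode/settings.json (a sketch; actual Wingman key names may differ)
{
  "wingman.provider": "Ollama",
  "wingman.chatModel": "deepseek-coder:6.7b-instruct"
}
```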

NOTE

Since Wingman leverages WASM, OS support is currently limited due to the size of the extension. We are working to resolve this. The supported operating systems and architectures are:

  • Windows x64
  • Windows ARM64
  • macOS x64
  • macOS ARM

Storage

Wingman only stores data on your machine. In previous extension versions (prior to v0.7.0), configuration was stored in your repository. The new storage location contains Wingman configuration, project-specific embeddings, and more.

Storage Location:

~/.wingman

Example on macOS:

/Users/username/.wingman
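As a hypothetical illustration of what the directory might contain (the exact layout varies by version):

```
/Users/username/.wingman
├── config/     # Wingman configuration (illustrative)
└── index/      # project-specific embeddings (illustrative)
```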

Supported Models

We aim to support the best models available. You can configure separate models for chat and code completion, which is especially helpful when running AI models locally. Here is a list of the models we support for each provider:

Anthropic

You can use the following models:

  • Claude 3.5 Sonnet
  • Claude 3 Opus

Anthropic prompt caching is used in Composer mode to reduce cost and latency.

NOTE - Unlike with Ollama, your data leaves your machine; it is not private and is not sanitized before being sent.

OpenAI

You can use the following models:

  • GPT-4o
  • GPT-4o-mini
  • GPT-4-Turbo
  • GPT-4
  • o1

NOTE - Unlike with Ollama, your data leaves your machine; it is not private and is not sanitized before being sent.

AzureAI

NOTE - AzureAI applies content filters to models by default, which adds latency and can delay responses; disabling the filters may require additional configuration.

You can use the following models:

  • GPT-4o
  • GPT-4o-mini
  • GPT-4-Turbo
  • GPT-4
  • o1

NOTE - Unlike with Ollama, your data leaves your machine; it is not private and is not sanitized before being sent.

Ollama

NOTE - You can use any quantization of a supported model; you are not limited to a specific one.

Example: deepseek-coder:6.7b-instruct-q4_0
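For example, with the Ollama CLI you can pull that exact quantization and verify it is available locally:

```bash
# Pull a specific quantization of a supported model
ollama pull deepseek-coder:6.7b-instruct-q4_0

# Confirm the tag is available locally
ollama list
```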

Supported Models for Code Completion:

Supported Models for Chat:

Hugging Face

Supported Models for Code Completion:

Supported Models for Chat:

NOTE - Unlike with Ollama, your data leaves your machine; it is not private and is not sanitized before being sent.

Settings

Settings for the extension are broken down into three categories.

AI Provider

Provider settings include which model to use, the endpoint, and the API key. These are saved per provider, allowing you to switch providers on the fly.
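As a rough sketch of the shape of one provider's settings (key names are illustrative, not the extension's actual schema):

```json
{
  "provider": "Anthropic",
  "chatModel": "claude-3-5-sonnet-latest",
  "codeModel": "claude-3-5-sonnet-latest",
  "baseUrl": "https://api.anthropic.com",
  "apiKey": "sk-ant-..."
}
```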

Extension Settings

General extension settings are persisted separately from the AI provider settings. Here is a breakdown of the general settings:

Code completion enabled

Code completion can run automatically, triggered by line returns, spaces, and tabs, or it can be triggered manually with the "Wingman: Code Complete" command.
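If you prefer the manual route, you can bind the command to a hotkey in keybindings.json. The command ID below is hypothetical; check the Keyboard Shortcuts editor for the real one:

```json
// keybindings.json (command ID is hypothetical)
[
  {
    "key": "ctrl+shift+space",
    "command": "wingman.codeComplete"
  }
]
```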

Code streaming

This is an experimental version of code completion that attempts to return results faster by streaming, allowing you to see incremental changes as you accept them.

Code context window

During code completion, this controls how much surrounding text is sent to the AI provider; more context generally produces better completions.

Code max tokens

The maximum number of tokens the code model can generate during code completion.

Chat context window

When using chat, Wingman pulls code from the currently open file around the cursor position. This setting controls how many tokens of that surrounding code are included. See our features guide for advanced use cases.

Chat max tokens

Controls the maximum number of tokens the AI provider will return.
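Taken together, the general settings boil down to a few boolean and numeric knobs. A hypothetical snapshot (names and values are illustrative only):

```json
{
  "codeCompletionEnabled": true,
  "codeStreaming": false,
  "codeContextWindow": 256,
  "codeMaxTokens": 512,
  "chatContextWindow": 128,
  "chatMaxTokens": 4096
}
```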

Embedder Settings

Embedding currently supports Ollama and OpenAI. Choosing a provider exposes many of the same settings as the general chat/code AI providers.

Dimensions

The number of dimensions the embedding model outputs. For example, OpenAI's text-embedding-3-small outputs 1536-dimensional vectors.

Enabled

Defaults to true. When enabled, Wingman will create a vector index (if one doesn't exist) when the extension launches and will begin indexing files on save.

TIP

If enabled is false, documents will not be indexed on save and full index builds will not complete.
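A hypothetical embedder configuration tying these settings together (key names are illustrative; the dimensions value must match what your embedding model actually outputs, e.g. 768 for nomic-embed-text):

```json
{
  "provider": "Ollama",
  "model": "nomic-embed-text",
  "dimensions": 768,
  "enabled": true
}
```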