Wingman has a built-in configuration window that persists configuration locally to the VS Code settings file in your repository. In the future this may move to a dedicated file that you can gitignore separately.
Because Wingman leverages WASM, OS support is currently limited by the size of the extension. We are working to resolve this. Here are the supported operating systems and architectures:
Wingman only stores data on your machine. In extension versions prior to v0.7.0, configuration was stored in your repository. The new storage location contains Wingman configuration, project-specific embeddings, and more.
Storage Location:
/Home Directory/.wingman
Example on macOS:
/Users/username/.wingman
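If you need to locate this directory programmatically (for example, to back it up or clear project embeddings), it resolves from the user's home directory. A minimal Node.js/TypeScript sketch, not part of the extension itself:

```typescript
// Illustrative only: resolving the Wingman storage directory for the current user.
import * as os from "os";
import * as path from "path";

const wingmanDir = path.join(os.homedir(), ".wingman");
console.log(wingmanDir); // e.g. /Users/username/.wingman on macOS
```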
We aim to support the best models available. You can configure separate models for chat and code completion, which is especially helpful when running AI models locally. Here is a list of the models we support for each provider:
You can use the following models:
Anthropic prompt caching is used in Composer mode as an optimization.
NOTE - Unlike when using Ollama, your data is not private and will not be sanitized before being sent.
You can use the following models:
NOTE - Unlike when using Ollama, your data is not private and will not be sanitized before being sent.
NOTE - AzureAI adds latency because content filters are enabled on models by default. This can cause delays in responses and may require additional configuration to disable the content filters.
You can use the following models:
NOTE - Unlike when using Ollama, your data is not private and will not be sanitized before being sent.
NOTE - You can use any quantization of a supported model; you are not limited to a specific one.
Example: deepseek-coder:6.7b-instruct-q4_0
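For reference, the same tag (including its quantization suffix) is what Ollama itself accepts. A minimal sketch against Ollama's local /api/generate endpoint, assuming Ollama is running on its default port with the model already pulled:

```typescript
// Illustrative sketch: passing a quantized model tag to Ollama's local API.
// Assumes Ollama is running on http://localhost:11434 and the model has been pulled.
async function complete(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "deepseek-coder:6.7b-instruct-q4_0", // any quantization of a supported model
      prompt,
      stream: false,
    }),
  });
  const data = await res.json();
  return data.response;
}
```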
Supported Models for Code Completion:
Supported Models for Chat:
Supported Models for Code Completion:
Supported Models for Chat:
NOTE - Unlike when using Ollama, your data is not private and will not be sanitized before being sent.
Settings for the extension are broken down into 3 categories.
Provider settings include which model to use, the endpoint, and the API key. These are saved per provider, allowing you to switch on the fly.
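As a rough illustration of what "saved per provider" means, the shape below is hypothetical and used only for this example; it is not the extension's actual settings schema:

```typescript
// Purely illustrative: a possible per-provider settings shape.
// Field names are assumptions for this example, not Wingman's actual schema.
interface ProviderSettings {
  chatModel: string; // model used for chat
  codeModel: string; // model used for code completion
  baseUrl: string;   // provider endpoint
  apiKey?: string;   // not needed for local providers such as Ollama
}

const providers: Record<string, ProviderSettings> = {
  Ollama: {
    chatModel: "deepseek-coder:6.7b-instruct-q4_0",
    codeModel: "deepseek-coder:6.7b-instruct-q4_0",
    baseUrl: "http://localhost:11434",
  },
};
```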
General extension settings are persisted separately from the AI provider settings. Here is a breakdown of the general settings:
Code completion can run automatically, triggered by line returns, spaces, and tabs, or it can be invoked manually by hotkeying the "Wingman: Code Complete" command.
This is an experimental version of code completion that attempts to return results faster, allowing you to see incremental changes as you accept them.
During code completion, this controls the amount of surrounding text passed to the AI provider, which improves auto-completion results.
The maximum number of tokens the code model can generate during code completion.
When using chat, Wingman pulls code from the currently open file around the cursor position. This controls how many tokens of that code are included around the cursor. See our features guide for advanced use cases.
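Conceptually, both context-window settings budget a number of tokens of text around the cursor. A rough sketch, assuming ~4 characters per token (an approximation for illustration, not how the extension tokenizes):

```typescript
// Illustrative only: gather text around the cursor within a token budget.
// Uses ~4 characters per token as a rough approximation.
function surroundingContext(documentText: string, cursorOffset: number, maxTokens: number): string {
  const budgetChars = maxTokens * 4;
  const start = Math.max(0, cursorOffset - Math.floor(budgetChars / 2));
  const end = Math.min(documentText.length, cursorOffset + Math.floor(budgetChars / 2));
  return documentText.slice(start, end);
}
```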
Controls the maximum number of tokens the AI provider will return.
Embeddings currently support Ollama and OpenAI. Choosing a provider mirrors many of the general settings used by each AI provider for chat/code.
The dimensions of the vectors the embedding model outputs.
Defaults to true. When enabled, a vector index is created (if it doesn't exist) when the extension launches, and files are indexed on save.
If enabled is set to false, documents will not be indexed on save and full index builds will not complete.
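If you are unsure what value to use for the dimensions setting, you can inspect the embedding model's output directly. A minimal sketch against Ollama's local embeddings endpoint; the model name here is just an example:

```typescript
// Illustrative only: request an embedding from Ollama and report its length,
// which is the value to use for the "dimensions" setting.
async function embeddingDimensions(text: string): Promise<number> {
  const res = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text", prompt: text }),
  });
  const { embedding } = await res.json();
  return embedding.length; // e.g. 768 for nomic-embed-text
}
```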