Corporate Laptop

Everything on one machine: VS Code + Cline Local + LM Studio. Minimal to set up and privacy-friendly.

Who this is for: Standard corporate laptops or desktops, possibly with limited GPU. If you have a strong GPU workstation or server, see Remote GPU Server.

Topology

All components run locally.

laptop (local):
- VS Code with the Cline Local extension
- LM Studio serving the API at http://127.0.0.1:1234
- Model: Qwen2.5‑Coder 32B (preferred)

Recommended Models

Preferred: Qwen2.5‑Coder‑32B‑Instruct. On constrained hardware, a 4‑bit quantized build of the same model is a low-resource fallback; expect quality and latency trade‑offs. See Quick Tips.

Step‑by‑Step

  1. Install Cline Local (VSIX)
    Download the latest release VSIX from Releases.
    In VS Code: Extensions → ••• → Install from VSIX… → pick the file → Reload.
  2. Install LM Studio
    Download from https://lmstudio.ai and install. Launch LM Studio.
  3. Download a model
    In LM Studio, search for Qwen2.5‑Coder‑32B‑Instruct (or Qwen3‑Coder‑30B‑A3B). Download the variant your hardware supports.
    If VRAM is limited, try a 4‑bit quantized build; quality may drop versus full precision.
  4. Start the LM Studio API server
    - Open the Server tab in LM Studio
    - Host: 127.0.0.1   Port: 1234
    - Enable CORS and keep‑alive
    - Start server; ensure the model is loaded
    Test locally:
    curl http://127.0.0.1:1234/v1/models
  5. Configure Cline Local
    In Cline Local settings (within VS Code):
    - Provider: LM Studio
    - Endpoint: http://127.0.0.1:1234
    - Model: the exact name LM Studio shows (e.g., qwen2.5-coder-32b-instruct)
  6. Run a quick test
    Start a small coding task and confirm tokens stream. If not, see troubleshooting below.
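Once the server is up and Cline Local is configured, the whole local loop can be smoke-tested from a terminal. This is a sketch assuming the defaults above (server on 127.0.0.1:1234, model name qwen2.5-coder-32b-instruct, a model already loaded); substitute the exact model name LM Studio reports.

```shell
# List the models the server exposes; the name you put in Cline Local
# must appear in this list exactly.
curl -s http://127.0.0.1:1234/v1/models

# Minimal non-streaming chat completion to confirm the model answers.
curl -s http://127.0.0.1:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwen2.5-coder-32b-instruct",
        "messages": [{"role": "user", "content": "Write a one-line Python hello world."}],
        "max_tokens": 64
      }'

# Same request with streaming enabled, which is what Cline Local uses;
# expect incremental "data:" chunks rather than one JSON blob.
curl -N http://127.0.0.1:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwen2.5-coder-32b-instruct",
        "messages": [{"role": "user", "content": "Say hi."}],
        "stream": true
      }'
```

If the first command fails, fix the server before touching Cline settings; if only the streaming request misbehaves, check the CORS setting from step 4.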

Troubleshooting

- Connection refused or no response: confirm the LM Studio server is started on 127.0.0.1:1234 and a model is loaded (step 4).
- Model not found errors: the model name in Cline Local must exactly match the name LM Studio shows (step 5).
- Tokens don't stream: verify CORS is enabled in the LM Studio server settings.

Looking for model selection tips? See Quick Tips.