MCP in VSCode: Fleet Context Where You Actually Write the Scripts
- MacSmithAI

Two posts ago we set up an Intune MCP server on macOS and wired it into Claude Desktop. Last time we added Raycast so the same server was reachable from a keyboard shortcut. This post closes the loop: connecting that same server to VSCode, so fleet context is available in the Copilot Chat pane while you're writing the PowerShell, Python, or Bash script that acts on it.
This is the one I reach for most when the answer isn't "tell me a number" but "help me write something that uses this number." Generating a CSV export of stale devices. Drafting a remediation script for the noncompliant ones. Wiring a Graph API call into a scheduled job. The LLM needs to see both your code and your fleet in the same turn, and Copilot Agent mode plus MCP is the cleanest way to get there.
What you need
VSCode 1.99 or later (the March 2025 release) — MCP support shipped in 1.99, so anything older won't show the feature at all.
GitHub Copilot subscription — Pro, Business, or Enterprise. The free tier doesn't include Agent mode, which is where MCP tools surface.
Agent mode access. If you're on Business or Enterprise, your org admin must enable the "MCP servers in Copilot" policy. If it's off, your mcp.json will be accepted without complaint and then silently do nothing. Check this first.
A working Intune MCP server from the first post in this series.
The one gotcha that will waste your afternoon
VSCode's MCP config format is almost identical to Claude Desktop's and Raycast's, but with one critical difference:
The root key is "servers", not "mcpServers".
Claude Desktop, Cursor, and Raycast all use "mcpServers". If you copy your existing config over without changing that one key, VSCode will parse the file as empty and your Copilot tools panel will be bare — no error, no warning. This is the #1 setup mistake and it's cost me more hours than I'd like to admit. When in doubt, run MCP: List Servers from the Command Palette to confirm the server is actually registered.
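To make the difference concrete, here's the same server stub under both root keys (inner details elided — VSCode treats mcp.json as JSONC, so comments are fine):

```jsonc
// Works in Claude Desktop, Cursor, and Raycast — silently ignored by VSCode:
{
  "mcpServers": { "intune": { /* ... */ } }
}

// What VSCode's mcp.json expects:
{
  "servers": { "intune": { /* ... */ } }
}
```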
Step 1: Enable Agent mode
If you've never used Agent mode before, turn it on first:
Open VSCode Settings (⌘, on macOS).
Search for chat.agent.enabled and make sure it's checked.
Open the Copilot Chat pane, find the mode dropdown at the top, and switch from Ask or Edit to Agent.
MCP tools are invisible in Ask and Edit modes — they only surface when Copilot is in Agent mode. If you're looking at the tools panel and it's empty, this is usually why.
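For reference, flipping the setting directly in your user settings.json is a one-liner:

```json
{
  "chat.agent.enabled": true
}
```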
Step 2: Add the MCP server
You have two scopes: user-level (available in every VSCode window) or workspace-level (scoped to a specific project via .vscode/mcp.json, which can be committed with the repo).
For the Intune server, I go user-level — it's not project-specific and I don't want it in any Git history. Open the Command Palette (⌘⇧P) and run:
MCP: Open User Configuration
That opens your user-level mcp.json. Drop in:
```json
{
  "servers": {
    "intune": {
      "type": "stdio",
      "command": "node",
      "args": ["/Users/yourname/mcp-servers/intune-mcp/dist/index.js"],
      "env": {
        "AZURE_TENANT_ID": "00000000-0000-0000-0000-000000000000",
        "AZURE_CLIENT_ID": "11111111-1111-1111-1111-111111111111",
        "AZURE_CLIENT_SECRET": "your-secret-value-here"
      }
    }
  }
}
```
Note the "servers" root (not "mcpServers") and the explicit "type": "stdio" field — VSCode also supports HTTP and SSE transports, so the type is required to disambiguate.
Save the file. You don't need to restart VSCode — the MCP system picks up changes live. Run MCP: List Servers from the Command Palette to verify. You should see intune listed with a status of "Running" and a tool count.
Step 3: Verify in Copilot Chat
Open the Copilot Chat pane (⌃⌘I on macOS), confirm the mode is Agent, then click the tools icon (🛠️) near the input field. You should see an "intune" section with each tool the server exposes. Toggle them on if they aren't already.
A simple first prompt:
How many macOS devices are enrolled in Intune?
Copilot will ask for permission before running the tool. Approve it, and you're live.
Where this actually pays off
The "what's my device count" query isn't where VSCode shines — Raycast does that faster. Where VSCode pulls ahead is when the fleet data feeds into code you're writing. A few examples from actual use:
Generating a remediation script from live data
With a Python file open, in Agent mode:
Query Intune for all macOS devices pending restart. Then write a Python script using the Microsoft Graph SDK that sends a rebootNow action to each one, with a 2-second delay between calls and a dry-run flag.
Copilot calls the MCP tool to get the actual device list, then writes the script using the real IDs and device names as comments. The dry-run flag defaults to true because I asked for it — read the generated code before flipping it.
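What Copilot generates varies run to run, but a minimal sketch of that script looks roughly like this — using plain urllib against the Graph REST endpoint instead of the Graph SDK to stay dependency-free, with a hypothetical reboot_devices helper and placeholder IDs and token:

```python
import time
import urllib.request

GRAPH = "https://graph.microsoft.com/v1.0"

def reboot_url(device_id: str) -> str:
    # rebootNow is the managed-device action Intune exposes via Graph
    return f"{GRAPH}/deviceManagement/managedDevices/{device_id}/rebootNow"

def reboot_devices(device_ids, token, dry_run=True, delay=2.0):
    """POST rebootNow to each device, pausing `delay` seconds between live calls.

    With dry_run=True (the default) nothing is sent; the planned calls are
    returned so you can eyeball them before flipping the flag.
    """
    results = []
    for device_id in device_ids:
        url = reboot_url(device_id)
        if dry_run:
            results.append(("DRY-RUN", url))
            continue
        req = urllib.request.Request(
            url,
            method="POST",
            headers={"Authorization": f"Bearer {token}", "Content-Length": "0"},
        )
        with urllib.request.urlopen(req) as resp:
            results.append((resp.status, url))
        time.sleep(delay)  # stay clear of Graph throttling limits
    return results

if __name__ == "__main__":
    # Device IDs would come from the MCP query; these are placeholders.
    for status, url in reboot_devices(["<device-guid-1>", "<device-guid-2>"], token="<token>"):
        print(status, url)
```

The dry-run default means running the script as-is sends nothing; only an explicit dry_run=False makes live calls.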
Building a CSV report inline
Get all macOS devices on OS version 14.x, then generate a CSV I can save to disk with: serial, user UPN, OS version, last check-in date, and FileVault escrow status. Save it to ~/reports/sonoma-fleet.csv.
Agent mode will use the MCP tools to pull the data, then use the built-in filesystem tools to write the file. You can approve each step as it goes — which is tedious the first time but genuinely reassuring when an LLM is about to write something to your disk.
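If you'd rather own the file-writing step yourself, the CSV half is a few lines of stdlib Python. A sketch, assuming the MCP tool hands back a list of device dicts in a hypothetical shape with keys matching the prompt's columns:

```python
import csv
import os

FIELDS = ["serial", "upn", "os_version", "last_checkin", "filevault_escrowed"]

def write_fleet_csv(devices, path):
    """Write one row per device dict; keys missing from a device are left blank,
    extra keys are ignored."""
    path = os.path.expanduser(path)
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(devices)
    return path

if __name__ == "__main__":
    sample = [
        {"serial": "C02XX0XXJG5H", "upn": "ana@example.com",
         "os_version": "14.7.2", "last_checkin": "2025-05-01T09:12:00Z",
         "filevault_escrowed": True},
    ]
    print(write_fleet_csv(sample, "sonoma-fleet.csv"))
```

DictWriter's default restval fills missing fields with empty strings, so partially populated devices don't crash the export.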
Debugging a broken deployment
Look at the Chrome install status across the macOS fleet. For any device where the install failed or is stuck, cross-reference with the last check-in date. If the pattern suggests the issue is tied to devices that haven't checked in recently, tell me. Otherwise, suggest what else I should look at.
This is where having the LLM reason over fleet data pays off more than running the query yourself. You're not asking for a number, you're asking for a hypothesis.
Workspace-scoped servers for team projects
If you're working on a repo that's genuinely about fleet tooling — a remediation script collection, an internal Graph API wrapper, a reporting dashboard — commit a .vscode/mcp.json so teammates get the same MCP setup when they clone. The trick is to reference environment variables instead of hardcoding secrets:
```json
{
  "inputs": [
    {
      "type": "promptString",
      "id": "azure-tenant-id",
      "description": "Azure Tenant ID"
    },
    {
      "type": "promptString",
      "id": "azure-client-id",
      "description": "Azure Client ID"
    },
    {
      "type": "promptString",
      "id": "azure-client-secret",
      "description": "Azure Client Secret",
      "password": true
    }
  ],
  "servers": {
    "intune": {
      "type": "stdio",
      "command": "node",
      "args": ["${workspaceFolder}/tools/intune-mcp/dist/index.js"],
      "env": {
        "AZURE_TENANT_ID": "${input:azure-tenant-id}",
        "AZURE_CLIENT_ID": "${input:azure-client-id}",
        "AZURE_CLIENT_SECRET": "${input:azure-client-secret}"
      }
    }
  }
}
```
The inputs block prompts each teammate once per VSCode session for their credentials, and the "password": true flag keeps the secret out of logs and screenshots. Nothing gets committed, nothing leaks in a PR diff.
Premium requests and what they cost
Worth knowing: Agent mode and MCP tool calls consume "premium requests" when you're using a non-base model (Claude, Gemini, o3-mini, etc.). Copilot Pro gets 300/month, Business and Enterprise get more. The base OpenAI model is unlimited but in my experience noticeably weaker at multi-tool reasoning.
For fleet work, I default to Claude Sonnet for the balance of tool-use reliability and cost, and only reach for Opus when the task genuinely needs it. Running a morning status check every day on Sonnet is maybe 30 premium requests a month — well inside the Pro tier. Writing a complex remediation script with five tool calls and multiple iterations will eat through your budget faster than you expect. Keep an eye on it the first couple of weeks.
Useful commands to remember
Memorize these four — they cover 90% of what you'll do:
MCP: List Servers — status check for every configured server
MCP: Add Server — guided setup (handy for one-offs)
MCP: Open User Configuration — edit the user-level config
MCP: Restart Server — kick a server that's hung or picked up stale env vars
If the tools panel is empty after you've added a server, the fix is almost always one of: you're not in Agent mode, you used "mcpServers" instead of "servers", or your org has MCP disabled by policy. Check in that order.
The three-surface split, in one paragraph
If you've followed the series to here, you now have the same Intune MCP server exposed through three surfaces: Claude Desktop for long-form conversations, Raycast for reflex lookups, VSCode for code-adjacent work. The boundaries sort themselves out in practice. Long investigation with multiple hypotheses → Claude Desktop. "What's this device's status?" while reading a ticket → Raycast. "Write a script that uses this data" → VSCode. The same credentials, the same tools, three different interaction modes. That's the payoff for doing the MCP server setup once.
Next in this series: we step away from Intune and look at what changes when you point these same three surfaces at a local filesystem MCP server — and why the security model gets meaningfully more interesting when the LLM can touch your disk.