
Part 2: Building FileSage MCP — A Production-Grade File Intelligence Server

Admin

If you read Part 1, you know the theory cold. You know why MCP exists, you understand the Host-Client-Server separation, and you can explain the initialization handshake, tools, resources, roots, and sampling. That was the mental model.

Part 2 is the payoff. We're building a real server, start to finish, and we're not skipping anything.

We're building FileSage MCP: an intelligent file system server that gives Claude secure, structured access to your local filesystem. It's practical enough to actually use and comprehensive enough to exercise every MCP feature listed below.

By the end, you will have written or touched:

  • Tools (read, write, search, scan, auto-tag)
  • Resources (directory trees, roots listing)
  • Prompts (summarize, code review, find todos)
  • Logging & Progress notifications
  • Roots for security boundaries
  • Sampling (the server asking the LLM)
  • SSE for remote deployment

Let's dive into the architecture.


Project Structure

The first architectural decision is where things live. A monolithic server file that grows to 400 lines is fine for demos, but it quickly becomes hard to navigate. We split by responsibility:

📁 FILESAGE_MCP
├── 📄 main.py               # Application Entrypoint & CLI Orchestrator
├── 📄 mcp_client.py         # MCP Client Lifecycle & Roots Injector
├── 📄 mcp_server.py         # Light Server Bootstrapper
├── 📄 pyproject.toml        # Project Dependencies & Metadata
└── 📁 core/                 # The Application Brains
    ├── 📄 server.py         # Shared FastMCP Instance
    ├── 📄 tools.py          # Action Layer (@mcp.tool)
    ├── 📄 resources.py      # Context Data Layer (@mcp.resource)
    ├── 📄 prompts.py        # Static Templates (@mcp.prompt)
    ├── 📄 security.py       # Path Validation & Firewall Guard
    ├── 📄 utils.py          # Path and URI Parsing Helpers
    ├── 📄 openai.py         # OpenAI Provider Adaptor
    ├── 📄 claude.py         # Anthropic Provider Adaptor + Sampling Callback
    ├── 📄 chat.py           # Core Agentic Message Loop
    ├── 📄 cli_chat.py       # Terminal Chat Loop Subclass
    └── 📄 tool_manager.py   # MCP Tool to LLM Schema Translator

The key insight is core/server.py. It holds the single FastMCP instance. Every other module imports mcp from there and registers its handlers with decorators. mcp_server.py itself becomes nearly empty: it just imports the modules to trigger registration, then calls mcp.run().

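# mcp_server.py: entry point only
import sys
from core.server import mcp

# Imported for their side effects: each module registers its handlers on `mcp`.
import core.tools
import core.resources
import core.prompts

if __name__ == "__main__":
    transport = sys.argv[1] if len(sys.argv) > 1 else "stdio"
    if transport == "sse":
        # In the official `mcp` SDK, FastMCP.run() takes no host/port arguments;
        # the bind address is read from mcp.settings instead.
        mcp.settings.host = "0.0.0.0"
        mcp.settings.port = 8000
        mcp.run(transport="sse")
    else:
        mcp.run(transport="stdio")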
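The import-to-register pattern used by mcp_server.py can be sketched without the SDK. The block below is a toy model only: MiniServer and the two stub tools are hypothetical names, not part of FastMCP or FileSage, and the real decorator does far more (schema generation, validation, async dispatch).

```python
from typing import Callable, Dict


class MiniServer:
    """Toy stand-in for a shared server instance (NOT the real FastMCP class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: Dict[str, Callable[..., object]] = {}

    def tool(self) -> Callable[[Callable[..., object]], Callable[..., object]]:
        """Return a decorator that records the function under its own name."""

        def register(fn: Callable[..., object]) -> Callable[..., object]:
            self.tools[fn.__name__] = fn
            return fn  # the function itself is left unchanged

        return register


# In a real project this line lives in one module and everyone imports it.
server = MiniServer("demo")


# In a real project these definitions live in other modules; merely importing
# those modules runs the decorators and fills server.tools.
@server.tool()
def read_file(path: str) -> str:
    return f"(contents of {path})"


@server.tool()
def write_file(path: str, content: str) -> str:
    return f"wrote {len(content)} bytes to {path}"


registered = sorted(server.tools)
print(registered)
```

Because registration happens at import time, the entry point only needs the imports; it never has to reference the individual handlers.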

And core/server.py is just two meaningful lines:

from mcp.server.fastmcp import FastMCP
mcp = FastMCP("FileSage", log_level="ERROR")

The log_level="ERROR" matters. On the stdio transport, stdout carries the JSON-RPC stream, so any stray output written there (a chatty log handler, a leftover print()) corrupts the protocol. Keeping the level at ERROR keeps the server quiet; set it explicitly on every server you build.
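If you add your own logging to a stdio server, one way to stay clear of the protocol stream is to send records to stderr explicitly. This is a minimal standard-library sketch, not FileSage code; the logger name "filesage.demo" is made up for illustration.

```python
import logging
import sys


def configure_stderr_logging(level: int = logging.ERROR) -> logging.Logger:
    """Route this logger's records to stderr, leaving stdout untouched."""
    logger = logging.getLogger("filesage.demo")  # hypothetical logger name
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False  # keep records away from any root handlers
    return logger


log = configure_stderr_logging()
log.info("dropped: below the ERROR threshold")
log.error("written to stderr, never to stdout")
```

The same idea applies to debugging output: prefer a logger over print() in any module the stdio server imports.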

Source code: https://github.com/BhimPrasadAdhikari/filesage-mcp

Set it up with:

cp .env.example .env   # add your ANTHROPIC_API_KEY
pip install -r requirements.txt
python main.py ~/your-project-dir

1. Tools

In Part 1 we said tools are functions the LLM autonomously decides to call. They have side effects. They act on the world. Writing a good tool takes three things: a clear name, an honest description, and tight type hints. The SDK does the rest.
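To make "the SDK does the rest" concrete, here is a rough sketch of how a name, a docstring, and type hints can be turned into a tool description. It is a simplified illustration built on the standard library, not the SDK's actual implementation; describe_tool and the search_files stub (including its signature) are hypothetical.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build a tool description from a function's name, docstring and hints."""
    hints = get_type_hints(fn)
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def search_files(directory: str, pattern: str, max_results: int = 50) -> str:
    """Search a directory for files whose names match a glob pattern."""
    return ""


schema = describe_tool(search_files)
print(schema["inputSchema"]["required"])
```

The description the model sees is only as good as these three inputs, which is why vague names or missing hints translate directly into worse tool calls.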

FileSage defines five tools. Let's look at the read_file tool:

from pathlib import Path
from mcp.server.fastmcp import Context
from pydantic import Field
from core.server import mcp
from core.security import is_path_allowed

@mcp.tool()
async def read_file(
    path: str = Field(description="Absolute path to the file to read"),
    *,
    ctx: Context,
) -> str:
    """Read the full contents of a file. Path must be within an allowed root."""
    # Resolve first so ".." segments are collapsed before the roots check.
    file_path = Path(path).resolve()

    if not await is_path_allowed(file_path, ctx):
        raise ValueError(f"Access denied: '{path}' is outside the allowed roots.")
    if not file_path.exists():
        raise ValueError(f"File not found: {path}")
    if not file_path.is_file():
        raise ValueError(f"Not a file: {path}")

    # Undecodable bytes are replaced instead of raising.
    return file_path.read_text(encoding="utf-8", errors="replace")

2. Resources

Resources are the data layer. No side effects, no mutations — just structured context the LLM can read before deciding what to do. Think of them as your server's GET endpoints, addressed by URI.

FileSage's two resources live in core/resources.py. The tree resource is the more instructive one:

from pathlib import Path
from mcp.server.fastmcp import Context
from core.server import mcp
from core.security import is_path_allowed
from core.utils import file_url_to_path

@mcp.resource("files://roots")
async def get_roots(ctx: Context) -> str:
    """List all root directories that this server is allowed to access."""
    roots_result = await ctx.session.list_roots()
    if not roots_result.roots:
        return "No root directories have been configured."

    lines = ["Allowed roots:"]
    for root in roots_result.roots:
        root_path = file_url_to_path(root.uri)
        lines.append(f"  • {root.name}: {root_path}")
    return "\n".join(lines)


@mcp.resource("files://tree/{path}")
async def get_tree(path: str, ctx: Context) -> str:
    """Return a visual ASCII directory tree for a given path. Limited to 3 levels deep."""
    dir_path = Path(path).resolve()

    if not await is_path_allowed(dir_path, ctx):
        raise ValueError(f"Access denied: '{path}'")
    if not dir_path.is_dir():
        raise ValueError(f"Not a directory: '{path}'")

    def build_tree(p: Path, prefix: str = "", depth: int = 0) -> list[str]:
        # depth 0 lists the first level, so stopping at 3 yields 3 levels.
        if depth >= 3:
            return [f"{prefix}... (truncated)"]
        # Directories first, then files, each group sorted by name.
        entries = sorted(p.iterdir(), key=lambda x: (x.is_file(), x.name))
        lines = []
        for i, entry in enumerate(entries):
            connector = "└── " if i == len(entries) - 1 else "├── "
            lines.append(f"{prefix}{connector}{entry.name}")
            if entry.is_dir():
                ext = "    " if i == len(entries) - 1 else "│   "
                lines.extend(build_tree(entry, prefix + ext, depth + 1))
        return lines

    return "\n".join([str(dir_path)] + build_tree(dir_path))

The naming distinction matters and is worth internalising: if Claude needs to understand something before acting -> read a resource. If Claude needs to change something -> call a tool. Keep this line clean and your architecture stays coherent.


3. Prompts

As discussed earlier, prompts are effectively slash commands. The user selects one and fills in the arguments, and the host constructs the opening message of the conversation from your template. They are not autonomous: the user triggers them when they want them.

All three prompts live in core/prompts.py and follow the same pattern: a Python function that takes string arguments and returns a formatted string.

from core.server import mcp

@mcp.prompt()
def summarize_file(file_path: str) -> str:
    """A ready-to-use prompt template for summarizing any file."""
    return f"""Please read and summarize the file at: {file_path}

Your summary should cover:
1. What the file does and its overall purpose
2. Key components, functions, classes, or data structures (if code)
3. Any notable patterns, potential issues, or things worth knowing
4. A one-sentence TL;DR at the very end"""


@mcp.prompt()
def code_review(file_path: str) -> str:
    """A structured code review prompt for a given file path."""
    return f"""Please perform a thorough code review of: {file_path}

Evaluate each of the following:

1. Correctness - Bugs, edge cases, or logical errors?
2. Style - Does it follow language conventions and best practices?
3. Performance - Any obvious bottlenecks or inefficiencies?
4. Security - Any vulnerabilities, unsafe inputs, or exposed secrets?
5. Readability - Is the code clear and well-documented?

Conclude with a prioritized list of improvements."""


@mcp.prompt()
def find_todos(directory_path: str) -> str:
    """A prompt template for hunting down TODO/FIXME/HACK comments in a directory."""
    return f"""Search the directory '{directory_path}' for all TODO, FIXME, HACK, and NOTE comments.

For every comment found:
- File path and line number
- The full comment text
- Priority level (FIXME = high, TODO = medium, NOTE/HACK = low)

Group results by file. End with a prioritized action list sorted by severity."""

The client reads these via session.list_prompts() and session.get_prompt("code_review", {"file_path": "/path/to/file.py"}). In the FileSage CLI, typing /prompts lists them all.

4. Logging and Progress Notifications

When a tool does something that takes time, we don't make the user stare at silence. MCP has built-in support for real-time log messages and progress notifications. Let's look at the code below:

from pathlib import Path
from mcp.server.fastmcp import Context
from core.server import mcp
from core.security import is_path_allowed

@mcp.tool()
async def scan_directory(
    path: str,
    *,
    ctx: Context,
) -> dict:
    """
    Deep scan a directory: count files, categorize by extension, sum total size.
    Emits real-time logging messages and progress notifications during the scan.
    """
    dir_path = Path(path).resolve()

    # Same guard as every other tool that touches disk.
    if not await is_path_allowed(dir_path, ctx):
        raise ValueError(f"Access denied: '{path}'")
    if not dir_path.is_dir():
        raise ValueError(f"Not a directory: '{path}'")

    await ctx.info(f"Starting deep scan of: {path}")

    all_entries = list(dir_path.rglob("*"))
    total = len(all_entries)

    stats = {"total_files": 0, "total_dirs": 0, "total_size_bytes": 0, "by_extension": {}}

    for i, entry in enumerate(all_entries):
        # Report roughly every 10% of entries instead of on every iteration.
        if total > 0 and i % max(1, total // 10) == 0:
            await ctx.report_progress(i, total)

        if entry.is_dir():
            stats["total_dirs"] += 1
        elif entry.is_file():
            stats["total_files"] += 1
            stats["total_size_bytes"] += entry.stat().st_size
            ext = entry.suffix.lower() or "(no extension)"
            stats["by_extension"][ext] = stats["by_extension"].get(ext, 0) + 1

    # Final report so the client sees 100%.
    await ctx.report_progress(total, total)
    await ctx.info(f"Scan complete - {stats['total_files']} files, {stats['total_dirs']} dirs")
    return stats

Both the log messages and the progress notifications above are fire-and-forget from the server's perspective. The server emits them, and the client decides how to display them.
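The throttling condition in scan_directory, i % max(1, total // 10) == 0, is easy to misread as "exactly ten updates". A small standalone sketch (progress_checkpoints is a hypothetical helper, not part of FileSage) shows which loop indices actually trigger a notification:

```python
def progress_checkpoints(total: int) -> list[int]:
    """Return the loop indices at which the scan would report progress."""
    step = max(1, total // 10)  # same step computation as scan_directory
    return [i for i in range(total) if i % step == 0]


for total in (0, 7, 25, 100):
    points = progress_checkpoints(total)
    print(f"total={total:>3}: {len(points)} notifications")
```

For 100 entries the step is 10 and ten notifications go out. For 25 entries the step rounds down to 2, so thirteen go out, and for fewer than 20 entries every iteration reports. The condition is best read as "about every 10%, never less often", which is fine for a progress bar.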

5. Roots

This is the feature that keeps the MCP server contained. Without it, a tool call could reach anything on your filesystem that the server process can read or write, without your permission.

As discussed earlier, roots are the directory paths the client exposes to the server; the server fetches them with list_roots() and validates every filesystem path against them. The guard code is in core/security.py and is called at the top of every tool and resource that touches disk.

from pathlib import Path
from mcp.server.fastmcp import Context
from core.utils import file_url_to_path

async def is_path_allowed(requested_path: Path, ctx: Context) -> bool:
    """
    Check whether a path falls inside any of the client-provided roots.
    This is the security boundary between Claude and the filesystem.
    """
    roots_result = await ctx.session.list_roots()
    client_roots = roots_result.roots

    if not requested_path.exists():
        # A file that does not exist yet (e.g. a write target) is judged by its parent.
        check_path = requested_path.parent
        if not check_path.exists():
            return False
    else:
        # Directories are checked directly; files are checked via their parent.
        check_path = (
            requested_path if requested_path.is_dir() else requested_path.parent
        )

    for root in client_roots:
        root_path = file_url_to_path(root.uri)
        try:
            # relative_to() raises ValueError when check_path is not under root_path.
            check_path.relative_to(root_path)
            return True
        except ValueError:
            continue

    return False

The rule is simple: every tool and resource that reads or writes a path must call is_path_allowed() before doing anything else. If you add a new tool, it must call this function too.
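The containment check can be tried in isolation, without an MCP session. The sketch below is an assumption-laden stand-in: file_uri_to_path approximates what a helper like core/utils.file_url_to_path might do (that helper is not shown in this article), it assumes POSIX-style file:// URIs, and is_within_roots is a hypothetical synchronous cousin of is_path_allowed().

```python
import tempfile
from pathlib import Path
from urllib.parse import unquote, urlparse


def file_uri_to_path(uri: str) -> Path:
    """Convert a file:// URI to a resolved Path (POSIX-style sketch)."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError(f"Not a file URI: {uri}")
    return Path(unquote(parsed.path)).resolve()


def is_within_roots(candidate: str, root_uris: list[str]) -> bool:
    """True if the resolved candidate path sits inside any of the roots."""
    target = Path(candidate).resolve()  # collapses ".." before comparing
    for uri in root_uris:
        try:
            target.relative_to(file_uri_to_path(uri))
            return True
        except ValueError:
            continue  # not under this root; try the next one
    return False


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve() / "project"
    (root / "src").mkdir(parents=True)
    roots = [root.as_uri()]

    inside = is_within_roots(str(root / "src" / "app.py"), roots)
    # Two ".." segments climb out of the root before resolve() is applied.
    escaped = is_within_roots(str(root / "src" / ".." / ".." / "secrets.txt"), roots)

print(inside, escaped)
```

The important detail is resolving before comparing: a raw string prefix check would accept the traversal path, while the resolved path lands outside the root and is rejected.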

6. Sampling: The Server Asking the LLM

This is where MCP gets interesting. The flow normally goes client → server. Sampling reverses it: the server asks the client to invoke the LLM on its behalf, and the result comes back through the same channel.

We need it for the auto_tag_file() tool. Tagging a file is a job an LLM does better than keyword matching, so the server has to be able to ask for a completion. Let's look at the code:

from pathlib import Path
from mcp.types import SamplingMessage, TextContent
from mcp.server.fastmcp import Context
from core.server import mcp
from core.security import is_path_allowed

@mcp.tool()
async def auto_tag_file(
    path: str,
    *,
    ctx: Context,
) -> list[str]:
    """Use AI (via server-side sampling) to auto-generate tags for a file."""
    file_path = Path(path).resolve()

    if not await is_path_allowed(file_path, ctx):
        raise ValueError(f"Access denied: '{path}'")

    # Only the first 3000 characters are sent, to keep the sampling request small.
    content = file_path.read_text(encoding="utf-8", errors="replace")[:3000]
    prompt = (
        f"Analyze '{file_path.name}' and generate 5-8 short, lowercase tags.\n\n"
        f"Content:\n{content}\n\nReturn ONLY comma-separated tags."
    )

    # The server asks the client to run the LLM on its behalf.
    result = await ctx.session.create_message(
        messages=[
            SamplingMessage(
                role="user",
                content=TextContent(type="text", text=prompt),
            )
        ],
        max_tokens=200,
        system_prompt="You are a precise file categorization assistant. Return ONLY comma-separated tags.",
    )

    if result.content.type == "text":
        # Split on commas, normalise case, and drop empty fragments.
        return [t.strip().lower() for t in result.content.text.split(",") if t.strip()]

    raise ValueError("Sampling returned an unexpected content type.")

For sampling to work, the sampling_callback must be wired to the ClientSession. In FileSage, core/cli_chat.py does this after the session is initialized. Like this:

# core/cli_chat.py

def _wire_sampling_callback(self) -> None:
    session: ClientSession = self.filesage_client.session()
    session._sampling_callback = self.claude_service.sampling_callback

This is the step most people miss when first implementing sampling. The server-side create_message() call will hang or error if no callback is registered.
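The dependency on the callback is easier to see in a toy model. The sketch below does not use the MCP SDK at all: ToySession, fake_llm, and auto_tag are hypothetical stand-ins that only mimic the shape of the interaction (the server-side code awaits the session, the session delegates to whatever callback the client registered).

```python
import asyncio
from typing import Awaitable, Callable, Optional

SamplingCallback = Callable[[str], Awaitable[str]]


class ToySession:
    """Stand-in for a client session; NOT the SDK's ClientSession."""

    def __init__(self, sampling_callback: Optional[SamplingCallback] = None) -> None:
        self._sampling_callback = sampling_callback

    async def create_message(self, prompt: str) -> str:
        # Without a registered callback there is nobody to run the model.
        if self._sampling_callback is None:
            raise RuntimeError("Sampling not supported: no callback registered")
        return await self._sampling_callback(prompt)


async def fake_llm(prompt: str) -> str:
    """Hypothetical callback; a real one would call the model provider."""
    return "python, mcp, Server , "


async def auto_tag(session: ToySession, text: str) -> list[str]:
    """Server-side logic: ask the client for a completion, then parse tags."""
    reply = await session.create_message(f"Tag this: {text}")
    return [t.strip().lower() for t in reply.split(",") if t.strip()]


# Wired session: the request round-trips through the callback.
tags = asyncio.run(auto_tag(ToySession(fake_llm), "print('hi')"))

# Unwired session: the same server code fails as soon as it asks for a sample.
try:
    asyncio.run(auto_tag(ToySession(), "print('hi')"))
    unwired_error = None
except RuntimeError as exc:
    unwired_error = str(exc)

print(tags, unwired_error)
```

In the toy model the failure is an immediate exception; with a real transport the symptom can instead be an error response or a request that never completes, which is why wiring the callback early matters.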


7. SSE: Deploying to a Remote Server

Everything so far runs over stdio. The server is a subprocess on your local machine. That's the right default for personal use. But if you want a team to share a FileSage instance running on a remote server, you need HTTP.

The switch is a single transport argument, wired in mcp_server.py:

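if __name__ == "__main__":
    transport = sys.argv[1] if len(sys.argv) > 1 else "stdio"
    if transport == "sse":
        # In the official `mcp` SDK, FastMCP.run() takes no host/port arguments;
        # the bind address is read from mcp.settings instead.
        mcp.settings.host = "0.0.0.0"
        mcp.settings.port = 8000
        mcp.run(transport="sse")
    else:
        mcp.run(transport="stdio")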

# Start in SSE mode
python mcp_server.py sse
# Listening on http://0.0.0.0:8000

One caveat from Part 1 worth repeating here: stateless_http=True or json_response=True breaks progress notifications and sampling. If you're deploying via SSE and need those features, keep both settings at their defaults. Weigh that trade-off and choose what fits your deployment.

Wiring It All Together

Here's the complete call chain so you can see how every piece connects before you run it:

Now, to run it:

python main.py ~/your-directory-path ~/your-notes

Then you can ask it things like:

what files are in this directory? 
search for TODO comments in *.py files 
summarize the notes.txt file. 
auto-tag the files listed in the folder. 

Access prompts with /prompts. 

You can do more with it. See the README for more detailed use cases.

Wrapping it up

We built a real, production-ready MCP server from scratch, not just another trendy demo: a server with path security, real-time progress feedback, intelligent tools, and much more. We implemented all seven MCP features listed at the start, and even a client to interact with the server.

Every one of these features had a natural reason to exist in this project, so pay attention to the code in each file. My design principle is simple: "Don't add a feature to demonstrate it. Add it because the problem demands it."

The full code for this project is on GitHub: https://github.com/BhimPrasadAdhikari/filesage-mcp

Thanks for staying till the end, and hats off to your dedication to learning this tech and staying competitive in today's fast-growing job market. In Part 3, we will go deeper on the agentic loop: handling multi-step reasoning, managing the context window across long tool chains, and testing our MCP server in isolation. Subscribe if you don't want to miss it.

Tags: #MCP #FastMCP #Python #Model Context Protocol #Claude #OpenAI
