
MarcoPolo

Developer tools, by Immersa, Inc.
Launched Feb 18, 2026 on ChatGPT

MarcoPolo spins up a secure container where Claude can work with your actual data. Connect to your databases, APIs, S3, lakehouses, CRMs, Jira, logs and much more—using scoped credentials that are never exposed to the model.

Claude gets DuckDB, Python, a shell, and a set of tools to explore, query, transform, and analyze data across systems. The workspace persists over time, so you can build on your work.

Prep a report, understand your data, debug an issue, or review the latest metrics right here in the conversation.

7 ChatGPT Tools
Developer: Immersa, Inc.
Category: Developer tools

Available Tools

Browse

browse
Full Description

List files/directories in storage datasources, with results loaded into DuckDB on the REMOTE server.

**IMPORTANT: Remote Workspace**

This tool operates in a REMOTE sandboxed Linux environment provisioned for you and the user, NOT your local machine.

  • Browse results are stored in DuckDB on the REMOTE server, not locally
  • Use workspace_sync() to transfer files to Google Drive if you need them locally

About Your REMOTE Workspace: This MCP server provides a REMOTE sandboxed Linux data workspace with integrated query execution and analysis capabilities. Browse results are automatically loaded into persistent DuckDB tables (on the REMOTE server), allowing you to:

  • Query file listings with SQL (pattern matching, filtering, sorting)
  • Analyze directory structures programmatically
  • Results persist in DuckDB for iterative exploration

Lists immediate children at the specified path (like 'ls'). The complete listing is loaded into a persistent DuckDB table on the REMOTE server for SQL analysis. The tool response shows the first 10 items as a preview - use query() on the DuckDB table to see all results and apply filters/pattern matching. Large directories (>60s) run async.

Multi-bucket datasources: When the path is empty and multiple buckets exist, the tool returns a bucket list for auto-selection. Analyze the context, select the appropriate bucket, then call browse() again with the bucket name in the path.

Supports: S3, Azure Blob (ABFSS), and Delta Lake storage datasources.
Does NOT support: SQL databases and APIs - use query() for those.

Args:

  • datasource_name: Storage datasource from list_datasources() (check capabilities=["browse"])
  • path: Directory path (default: "" for root, use "/" separators). For multi-bucket datasources: "bucket-name/" or "bucket-name/subdir/"
  • detailed: Include sizes/timestamps (slower, default: False)

Returns:

  • Bucket selection: {execution_mode: "bucket_selection_required", available_buckets: [...]}
  • Sync: {execution_mode: "sync", result: {duckdb_table_name, row_count, preview: [first 10]}}
  • Async: {execution_mode: "async", operation_id, status: "running", next_action}
  • Table schema: name, type, full_path, size (if detailed), modified (if detailed)

Example:

>>> browse("MULTI_BUCKET_S3", "")

Returns a bucket list for selection:

{"execution_mode": "bucket_selection_required", "available_buckets": ["bucket-a", "bucket-b"]}

>>> browse("MULTI_BUCKET_S3", "bucket-a/prod/")

Browses the specified bucket:

{"result": {"duckdb_table_name": "storage_multi_bucket_s3_bucket_a_prod", "row_count": 150}}

>>> query("DUCKDB", "queries/DUCKDB/find_csvs.duckdb")

Analyzes the listing, where find_csvs.duckdb contains:

SELECT * FROM storage_multi_bucket_s3_bucket_a_prod WHERE name LIKE '%.csv'
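The three response shapes above (bucket selection, sync, async) can be dispatched on mechanically. This is an illustrative sketch, not part of the tool: `next_step` is a hypothetical helper, and the response dicts merely mirror the documented shapes, since the real browse() runs on the remote server.

```python
# Sketch: choosing a follow-up action from a browse() response, based on
# the documented execution_mode values. next_step() is hypothetical.

def next_step(response: dict) -> str:
    """Decide the follow-up action from a browse() result."""
    mode = response.get("execution_mode")
    if mode == "bucket_selection_required":
        # Pick a bucket from context, then call browse() again with it in path.
        buckets = response["available_buckets"]
        return f"browse again with path '{buckets[0]}/'"
    if mode == "async":
        # Large directories (>60s) run async; poll using the operation id.
        return f"poll operation {response['operation_id']}"
    # Sync result: the full listing is in the DuckDB table; only 10 rows preview.
    table = response["result"]["duckdb_table_name"]
    return f"query() the DuckDB table {table} for the full listing"

print(next_step({"execution_mode": "bucket_selection_required",
                 "available_buckets": ["bucket-a", "bucket-b"]}))
```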

Parameters (1 required, 2 optional)
Required
  • datasource_name (string)
Optional
  • detailed (boolean), default: False
  • path (string)

Download

download
Full Description

Download files from storage datasources to the REMOTE workspace downloads/ directory.

**⚠️ IMPORTANT: Files Download to the REMOTE Workspace**
  • Files are downloaded to /workspace/downloads/ on the REMOTE server, NOT your local machine
  • This is separate from any other computers/filesystems you may have access to via other MCPs
  • The ONLY way to interact with downloaded files is through THIS MCP's tools
  • Access them using execute_command("cat /workspace/downloads/file.csv") on the REMOTE server
  • To get files to your local machine, use workspace_sync() to sync to Google Drive

About Your REMOTE Workspace: This MCP server provides a REMOTE sandboxed Linux data workspace with integrated query execution and analysis capabilities. Downloaded files are saved to /workspace/downloads/{datasource}/ (on the REMOTE server) where you can:

  • Read and analyze files with shell tools via execute_command (cat, grep, python, etc.)
  • Process data in your sandboxed REMOTE environment
  • Access files persistently throughout your session
  • NOT accessible via local filesystem tools

Downloads specific files for analysis on REMOTE server. Files saved to downloads/{datasource}/ preserving full path structure. Use execute_command to inspect downloaded files on REMOTE server.

Supports: S3, Azure Blob (ABFSS), and Delta Lake storage datasources.
Does NOT support: SQL databases and APIs - they don't have downloadable files.

Args:

  • datasource_name: Storage datasource from list_datasources() (check capabilities=["download"])
  • path: File path relative to datasource root - "prod/data/file.csv" (no leading slash)
  • destination: Optional local destination path (relative to /workspace)

Returns:

  • Success: {success: True, file_path, original_filename, datasource, ...}
  • Error: {success: False, error, next_actions: ["Verify file exists with browse()"]}

Example:

>>> download("OPSRUS_S3", "prod/data.csv")

{"file_path": "downloads/opsrus_s3/prod/data.csv"}

⚠️ File is on REMOTE server - use execute_command to access:

>>> execute_command("cat /workspace/downloads/opsrus_s3/prod/data.csv")
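The success/error shapes above lend themselves to a simple check-then-act pattern. A minimal sketch, assuming the documented response dicts: `after_download` is a hypothetical helper, and the dicts are illustrative stand-ins for real tool output.

```python
# Sketch: checking a download() result and choosing a follow-up, per the
# documented Returns shapes. after_download() is a hypothetical helper.

def after_download(resp: dict) -> str:
    if resp.get("success"):
        # The file lives on the REMOTE server; read it via execute_command.
        return f'execute_command("cat /workspace/{resp["file_path"]}")'
    # On failure the tool suggests next actions, e.g. verifying with browse().
    return resp.get("next_actions", ["Verify file exists with browse()"])[0]

ok = {"success": True, "file_path": "downloads/opsrus_s3/prod/data.csv"}
print(after_download(ok))
```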

Parameters (2 required, 1 optional)
Required
  • datasource_name (string)
  • path (string)
Optional
  • destination (string), default: null

Execute Command

execute_command
Full Description

Execute a shell command securely within your REMOTE sandboxed Linux workspace.

**IMPORTANT: Remote Workspace**

This tool, and all others from this MCP server, operate on a REMOTE sandboxed Linux environment provisioned specifically for you and the user, NOT on your local machine or the user's.

  • This is separate from any other computers/filesystems you may have access to via other MCPs
  • The ONLY way to interact with files and data in this REMOTE workspace is through the tools provided by THIS MCP

About Your REMOTE Workspace: This MCP server provides you a personalized REMOTE sandboxed Linux data workspace with integrated query execution and analysis capabilities. You may work in your home directory (/workspace) where you can run commands, manage files, and analyze data.

Your REMOTE workspace contains:

/workspace (home directory, or ~)
├── examples/   Read-only working query examples (prepopulated, verified to work)
├── docs/       Read-only documentation (prepopulated)
├── downloads/  Downloaded files from storage datasources (created by download tool)
├── queries/    Your query files organized by datasource (e.g., queries/ATHENA/my_query.sql)
└── ...         You can create any other folders/files as needed for your work

When to use:

  • Explore files and directories (ls, find, tree)
  • Search content (grep, awk, sed)
  • Run scripts (python3, node)
  • Process data (sort, uniq, wc, jq)
  • Work with archives (tar, zip, gzip)
  • Create and manage query files in queries/ directory
  • Version control with git (clone repos, manage your own projects)
  • Manage scheduled jobs with dv-schedule (create recurring automated tasks)

Arguments:

  • command: Shell command to execute (e.g., "git status", "ls -la")
  • timeout: Maximum execution time in seconds (default: 30, max: 300)

Returns: { "success": True/False, "exit_code": 0, "stdout": "command output", "stderr": "error output (if any)", "execution_time": 0.234, }

Security:

  • Only executes within your REMOTE home directory (/workspace)

Examples:

1. List workspace contents: >>> execute_command("ls -la /workspace")

2. Create a query file: >>> execute_command("echo 'SELECT * FROM users' > queries/ATHENA/users.sql")

3. Find Python files: >>> execute_command("find . -name '*.py' -type f")

4. Search for pattern in a subdirectory: >>> execute_command("grep -r 'TODO' projects/repo")

5. Run a script: >>> execute_command("cd projects/repo && python3 analyze.py input.csv")

6. Commit changes: >>> execute_command("cd projects/repo && git add . && git commit -m 'Update'")

7. Create a query file (simple): >>> execute_command("mkdir -p queries/ATHENA") >>> execute_command("echo 'SELECT * FROM customers LIMIT 100' > queries/ATHENA/list.sql")

8. Create a query file (multi-line with heredoc):
>>> cmd = "cat <<'EOF' > queries/DUCKDB/analysis.sql\n"
>>> cmd += "SELECT category, COUNT(*) as count\n"
>>> cmd += "FROM products GROUP BY category\nEOF"
>>> execute_command(cmd)

9. Create a scheduled job: >>> execute_command("dv-schedule create daily-sync --script sync_data.py --cron '@daily'")

10. List scheduled jobs: >>> execute_command("dv-schedule list")
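The heredoc construction in example 8 can be wrapped in a small helper so the delimiter and newlines are handled in one place. This is a sketch, not part of the tool: `heredoc_command` is a hypothetical function, and the quoted 'EOF' delimiter keeps the shell from expanding anything inside the body.

```python
# Sketch: assembling a multi-line heredoc command before passing it to
# execute_command(). heredoc_command() is a hypothetical helper.

def heredoc_command(path: str, body: str, delim: str = "EOF") -> str:
    # Quoting the delimiter ('EOF') disables shell expansion in the body.
    return f"cat <<'{delim}' > {path}\n{body}\n{delim}"

sql = "SELECT category, COUNT(*) AS count\nFROM products GROUP BY category"
cmd = heredoc_command("queries/DUCKDB/analysis.sql", sql)
print(cmd.splitlines()[0])  # cat <<'EOF' > queries/DUCKDB/analysis.sql
```

The resulting string would then be passed as `execute_command(cmd)`.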

For Query Files:

  • Store queries in queries/{DATASOURCE}/ directory in the REMOTE workspace (e.g., queries/ATHENA/report.sql)
  • Organize by datasource name as shown in list_datasources()
  • See query_file_workflow prompt for complete guide

Next Steps:

  • Check exit_code to verify success (0 = success)
  • Read stdout for command output
  • Read stderr for error messages
  • Use list_installed_packages() to see available commands
  • Use get_command_help(command) for command documentation
Parameters (1 required, 1 optional)
Required
  • command (string)
Optional
  • timeout (integer), default: 30

Generate Connector URL

generate_connector_url
Full Description

Generate setup URL for configuring new datasource connectors in your REMOTE workspace.

Creates connector-specific URLs for setting up connectors with third-party services (OAuth: HubSpot, Salesforce) or database connections (PostgreSQL, MySQL, etc.) based on the current user's customer context. URLs must be opened manually by the user. Once configured, datasources appear in list_datasources() and can be queried in the REMOTE workspace.

Args: connector_type: Type of connector ("hubspot", "salesforce", "postgres", "mysql", etc.)

Returns: {"url": "...", "connector_type": "...", "workflow_type": "...", "instructions": "..."}

Parameters (1 required)
Required
  • connector_type (string)

Get Schema

get_schema
Full Description

Get schema/documentation for datasources in your REMOTE sandboxed workspace environment.

**IMPORTANT: Remote Workspace**

This tool operates in a REMOTE sandboxed Linux environment provisioned for you and the user, NOT your local machine.

  • Use this to explore schemas before writing queries in the REMOTE workspace

This is a flexible tool that provides hierarchical schema discovery:

  • When called with ONLY datasource_name: Returns list of all databases/schemas
  • When called with datasource_name + database: Returns list of tables in that database
  • When called with datasource_name + database + table: Returns detailed column schema for that table

Use this as your primary schema discovery tool for all datasources.

Supports:

  • SQL databases (PostgreSQL, ClickHouse, Redshift, Athena, etc.)
  • APIs and NoSQL datasources (Jira, Salesforce, MongoDB, etc.) - returns overview documentation
  • Google Sheets: returns database_id for stable spreadsheet access

Does NOT support: Storage datasources (S3, Azure Blob) - use browse() for those

Args:

  • datasource_name: Datasource from list_datasources() (check capabilities=["query"])
  • database: Optional database/schema name (only for SQL datasources)
  • table: Optional table name to get the detailed schema for (requires the database parameter)

Examples:

>>> get_schema("ATHENA")
{"success": True, "databases": ["analytics", "production"], "count": 2}

>>> get_schema("JIRA")
{"success": True, "overview": "JIRA Datasource ...", "generated": True}

>>> get_schema("ATHENA", database="analytics")
{"success": True, "database": "analytics", "tables": [{"database": "analytics", "name": "users", ...}], "count": 5}

>>> get_schema("ATHENA", database="analytics", table="users")
{"success": True, "database": "analytics", "table": "users", "columns": [{"name": "id", "type": "bigint"}], "column_count": 3}
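The three-level drill-down (databases → tables → columns) can be sketched as a loop over a stubbed schema function. This is illustrative only: `fake_schema` stands in for the remote get_schema() tool and simply mirrors the response shapes in the examples above.

```python
# Sketch: hierarchical schema discovery. fake_schema() is a local stub
# that mimics the documented get_schema() response shapes.

def fake_schema(name, database=None, table=None):
    if table:
        return {"success": True, "columns": [{"name": "id", "type": "bigint"}]}
    if database:
        return {"success": True, "tables": [{"name": "users"}]}
    return {"success": True, "databases": ["analytics", "production"]}

# Level 1: databases; level 2: tables in one database; level 3: columns.
dbs = fake_schema("ATHENA")["databases"]
tables = fake_schema("ATHENA", database=dbs[0])["tables"]
cols = fake_schema("ATHENA", database=dbs[0], table=tables[0]["name"])["columns"]
print(dbs[0], tables[0]["name"], cols[0]["name"])
```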

Parameters (1 required, 2 optional)
Required
  • datasource_name (string)
Optional
  • database (string), default: null
  • table (string), default: null

List Datasources

list_datasources
Full Description

Discover all available datasources (SQL databases, APIs, storage) with their capabilities.

**IMPORTANT: Remote Workspace**

This tool operates in a REMOTE sandboxed Linux environment provisioned for you and the user, NOT your local machine.

First step in any data workflow. Returns all accessible datasources with capabilities field indicating which operations each supports.

Returns:

{
  datasources: [{datasource_name, type, capabilities, display_name?, description?, examples?, docs?}, ...],
  count,
  message,
  next_actions,
  display_links: [{label, url, description}]  (links to display to users)
}

Capabilities Guide:

  • ["query"] → Use query() for SQL/API queries (results loaded into DuckDB)
  • ["browse", "download"] → Use browse() to list files, download() to retrieve them

Example:

>>> list_datasources()

{
  "datasources": [
    {"datasource_name": "ATHENA", "capabilities": ["query"]},
    {"datasource_name": "OPSRUS_S3", "capabilities": ["browse", "download"]}
  ]
}
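The capabilities field makes tool routing a one-line decision per datasource. A minimal sketch, assuming the example response above: `pick_tool` is a hypothetical helper, not part of the server.

```python
# Sketch: routing each datasource to the right tool from its capabilities
# field, per the Capabilities Guide. pick_tool() is hypothetical.

def pick_tool(capabilities: list) -> str:
    if "query" in capabilities:
        return "query"        # SQL/API datasource; results land in DuckDB
    if "browse" in capabilities:
        return "browse"       # storage datasource; list files, then download()
    return "unsupported"

datasources = [
    {"datasource_name": "ATHENA", "capabilities": ["query"]},
    {"datasource_name": "OPSRUS_S3", "capabilities": ["browse", "download"]},
]
for ds in datasources:
    print(ds["datasource_name"], "->", pick_tool(ds["capabilities"]))
```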

Query

query
Full Description

Execute queries on datasources with results automatically loaded into DuckDB for analysis.

**IMPORTANT: Remote Workspace**

This tool operates in a REMOTE sandboxed Linux environment provisioned for you and the user, NOT your local machine.

About Your REMOTE Workspace: This MCP server provides a REMOTE sandboxed Linux data workspace with integrated query execution and analysis capabilities. When you run queries, results are automatically loaded into persistent DuckDB tables (on the REMOTE server), creating a REPL-like environment where:

  • Query results stay available in DuckDB tables for further analysis
  • You can write DuckDB queries to transform, join, and analyze the data
  • Data persists across tool calls for iterative analysis
  • Write query files to queries/{DATASOURCE}/ on the REMOTE server

Runs queries on configured datasources and saves results into DuckDB tables on the REMOTE server.

DuckDB Integration (on the REMOTE server): After running a query, you can:

1. Use the returned duckdb_table_name to query the data with DuckDB
2. Write DuckDB queries in queries/DUCKDB/ to transform/analyze results
3. Join multiple query results together in DuckDB
4. Results persist for the session - no need to re-query

Source format for writes: "DATASOURCE:identifier"

  • DUCKDB:table_name → Read from DuckDB table
  • ATHENA:queries/athena/query.sql → Execute query first
  • DUCKDB:queries/duckdb/transform.sql → Execute DuckDB query

Mode: "replace" (clear + write) or "append" (add rows)

Args:

  • datasource_name: Datasource from list_datasources()
  • path: Query file path (e.g., "queries/ATHENA/query.sql")
  • params: Optional Jinja2 parameters
  • response_rows: Rows to return in the response (default 10, full data in DuckDB)

Returns:

  • Read: {duckdb_table_name, rows, dataframe, ...}
  • Write: {operation: "write", spreadsheet, sheet, rows_written, ...}

**Example Workflow: Query and Analyze with DuckDB (on the REMOTE server)**

1. Query datasource (results auto-load to REMOTE DuckDB): >>> query("ATHENA", "queries/ATHENA/sales_data.sql")

Returns: {"duckdb_table_name": "sales_data", "rows": 10000}

2. Analyze results with DuckDB in REMOTE workspace: >>> query("DUCKDB", "queries/DUCKDB/analyze_sales.sql")

analyze_sales.sql: SELECT region, SUM(amount) FROM sales_data GROUP BY region

3. Join with another query in REMOTE DuckDB: >>> query("POSTGRES", "queries/POSTGRES/customer_data.sql")

Returns: {"duckdb_table_name": "customer_data", "rows": 5000}

4. Combine in DuckDB on REMOTE workspace: >>> query("DUCKDB", "queries/DUCKDB/join_sales_customers.sql")

join_sales_customers.sql:

SELECT s.*, c.customer_name
FROM sales_data s
JOIN customer_data c ON s.customer_id = c.id
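The four-step workflow above can be sketched as plain Python, chaining the duckdb_table_name values that each call returns. This is illustrative only: `fake_query` is a local stub for the remote query() tool, and the table-name derivation mimics the names shown in the workflow.

```python
# Sketch: chaining query() results through their duckdb_table_name fields.
# fake_query() is a stub mimicking the documented return shape.

def fake_query(datasource, path):
    # Derive the table name from the query file name, as in the workflow above.
    table = path.rsplit("/", 1)[-1].split(".")[0]
    return {"duckdb_table_name": table}

sales = fake_query("ATHENA", "queries/ATHENA/sales_data.sql")
customers = fake_query("POSTGRES", "queries/POSTGRES/customer_data.sql")
join_sql = (f"SELECT s.*, c.customer_name FROM {sales['duckdb_table_name']} s "
            f"JOIN {customers['duckdb_table_name']} c ON s.customer_id = c.id")
print(join_sql)
```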

Parameters (2 required, 2 optional)
Required
  • datasource_name (string)
  • path (string)
Optional
  • params (object), default: null
  • response_rows (integer), default: 10