
MarcoPolo

Developer tools, by Immersa, Inc.
Launched Feb 18, 2026 on ChatGPT

MarcoPolo spins up a secure container where Claude can work with your actual data. Connect to your databases, APIs, S3, lakehouses, CRMs, Jira, logs and much more—using scoped credentials that are never exposed to the model.

Claude gets DuckDB, Python, a shell, and a set of tools to explore, query, transform, and analyze data across systems. The workspace persists over time, so you can build on your work.

Prep a report, understand your data, debug an issue, or review the latest metrics right here in the conversation.

7 ChatGPT Tools
Developer: Immersa, Inc.
Category: Developer tools

Available Tools

Browse

browse
Full Description

List files/directories in storage datasources, with results loaded into DuckDB on the REMOTE server.

**IMPORTANT: Remote Workspace**

This tool operates in a REMOTE sandboxed Linux environment provisioned for you and the user, NOT your local machine.

  • Browse results are stored in DuckDB on the REMOTE server, not locally
  • Use workspace_sync() to transfer files to Google Drive if you need them locally

About Your REMOTE Workspace: This MCP server provides a REMOTE sandboxed Linux data workspace with integrated query execution and analysis capabilities. Browse results are automatically loaded into persistent DuckDB tables (on the REMOTE server), allowing you to:

  • Query file listings with SQL (pattern matching, filtering, sorting)
  • Analyze directory structures programmatically
  • Results persist in DuckDB for iterative exploration

Lists immediate children at the specified path (like 'ls'). The complete listing is loaded into a persistent DuckDB table on the REMOTE server for SQL analysis. The tool response shows the first 10 items as a preview - use query() on the DuckDB table to see all results and apply filters/pattern matching. Large directories (>60s) run async.

Multi-bucket datasources: When the path is empty and multiple buckets exist, the tool returns a bucket list for auto-selection. Analyze the context, select the appropriate bucket, then call browse() again with the bucket name in the path.

Supports: S3, Azure Blob (ABFSS), and Delta Lake storage datasources.
Does NOT support: SQL databases and APIs - use query() for those.

Args:

  • datasource_name: Storage datasource from list_datasources() (check capabilities=["browse"])
  • path: Directory path (default: "" for root, use "/" separators). For multi-bucket datasources: "bucket-name/" or "bucket-name/subdir/"
  • detailed: Include sizes/timestamps (slower, default: False)

Returns:

  • Bucket selection: {execution_mode: "bucket_selection_required", available_buckets: [...]}
  • Sync: {execution_mode: "sync", result: {duckdb_table_name, row_count, preview: [first 10]}}
  • Async: {execution_mode: "async", operation_id, status: "running", next_action}
  • Table schema: name, type, full_path, size (if detailed), modified (if detailed)

Example:

>>> browse("MULTI_BUCKET_S3", "")

Returns a bucket list for selection:

{"execution_mode": "bucket_selection_required", "available_buckets": ["bucket-a", "bucket-b"]}

>>> browse("MULTI_BUCKET_S3", "bucket-a/prod/")

Browses the specified bucket:

{"result": {"duckdb_table_name": "storage_multi_bucket_s3_bucket_a_prod", "row_count": 150}}

>>> query("DUCKDB", "queries/DUCKDB/find_csvs.duckdb")

Analyzes the listing, where find_csvs.duckdb contains:

SELECT * FROM storage_multi_bucket_s3_bucket_a_prod WHERE name LIKE '%.csv'
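The three response shapes above (bucket selection, sync, async) can be dispatched on mechanically. This is an illustrative sketch, not part of the tool: `next_step` is a hypothetical helper, and the response dicts merely mirror the documented shapes, since the real browse() runs on the remote server.

```python
# Sketch: choosing a follow-up action from a browse() response, based on
# the documented execution_mode values. next_step() is hypothetical.

def next_step(response: dict) -> str:
    """Decide the follow-up action from a browse() result."""
    mode = response.get("execution_mode")
    if mode == "bucket_selection_required":
        # Pick a bucket from context, then call browse() again with it in path.
        buckets = response["available_buckets"]
        return f"browse again with path '{buckets[0]}/'"
    if mode == "async":
        # Large directories (>60s) run async; poll using the operation id.
        return f"poll operation {response['operation_id']}"
    # Sync result: the full listing is in the DuckDB table; only 10 rows preview.
    table = response["result"]["duckdb_table_name"]
    return f"query() the DuckDB table {table} for the full listing"

print(next_step({"execution_mode": "bucket_selection_required",
                 "available_buckets": ["bucket-a", "bucket-b"]}))
```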

Parameters (1 required, 2 optional)
Required
  • datasource_name (string)
Optional
  • detailed (boolean), default: False
  • path (string)

Download

download
Full Description

Download files from storage datasources to the REMOTE workspace downloads/ directory.

**⚠️ IMPORTANT: Files Download to the REMOTE Workspace**
  • Files are downloaded to /workspace/downloads/ on the REMOTE server, NOT your local machine
  • This is separate from any other computers/filesystems you may have access to via other MCPs
  • The ONLY way to interact with downloaded files is through THIS MCP's tools
  • Access them using execute_command("cat /workspace/downloads/file.csv") on the REMOTE server
  • To get files to your local machine, use workspace_sync() to sync to Google Drive

About Your REMOTE Workspace: This MCP server provides a REMOTE sandboxed Linux data workspace with integrated query execution and analysis capabilities. Downloaded files are saved to /workspace/downloads/{datasource}/ (on the REMOTE server) where you can:

  • Read and analyze files with shell tools via execute_command (cat, grep, python, etc.)
  • Process data in your sandboxed REMOTE environment
  • Access files persistently throughout your session
  • NOT accessible via local filesystem tools

Downloads specific files for analysis on REMOTE server. Files saved to downloads/{datasource}/ preserving full path structure. Use execute_command to inspect downloaded files on REMOTE server.

Supports: S3, Azure Blob (ABFSS), and Delta Lake storage datasources.
Does NOT support: SQL databases and APIs - they don't have downloadable files.

Args:

  • datasource_name: Storage datasource from list_datasources() (check capabilities=["download"])
  • path: File path relative to datasource root - "prod/data/file.csv" (no leading slash)
  • destination: Optional local destination path (relative to /workspace)

Returns:

  • Success: {success: True, file_path, original_filename, datasource, ...}
  • Error: {success: False, error, next_actions: ["Verify file exists with browse()"]}

Example:

>>> download("OPSRUS_S3", "prod/data.csv")

{"file_path": "downloads/opsrus_s3/prod/data.csv"}

⚠️ File is on REMOTE server - use execute_command to access:

>>> execute_command("cat /workspace/downloads/opsrus_s3/prod/data.csv")
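The success/error shapes above lend themselves to a simple check-then-act pattern. A minimal sketch, assuming the documented response dicts: `after_download` is a hypothetical helper, and the dicts are illustrative stand-ins for real tool output.

```python
# Sketch: checking a download() result and choosing a follow-up, per the
# documented Returns shapes. after_download() is a hypothetical helper.

def after_download(resp: dict) -> str:
    if resp.get("success"):
        # The file lives on the REMOTE server; read it via execute_command.
        return f'execute_command("cat /workspace/{resp["file_path"]}")'
    # On failure the tool suggests next actions, e.g. verifying with browse().
    return resp.get("next_actions", ["Verify file exists with browse()"])[0]

ok = {"success": True, "file_path": "downloads/opsrus_s3/prod/data.csv"}
print(after_download(ok))
```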

Parameters (2 required, 1 optional)
Required
  • datasource_name (string)
  • path (string)
Optional
  • destination (string), default: null

Execute Command

execute_command
Full Description

Execute a shell command securely within your REMOTE sandboxed Linux workspace.

**IMPORTANT: Remote Workspace**

This tool, and all others from this MCP server, operate on a REMOTE sandboxed Linux environment provisioned specifically for you and the user, NOT on your local machine or the user's.

  • This is separate from any other computers/filesystems you may have access to via other MCPs
  • The ONLY way to interact with files and data in this REMOTE workspace is through the tools provided by THIS MCP

About Your REMOTE Workspace: This MCP server provides you a personalized REMOTE sandboxed Linux data workspace with integrated query execution and analysis capabilities. You may work in your home directory (/workspace) where you can run commands, manage files, and analyze data.

Your REMOTE workspace contains:

/workspace (home directory, or ~)
├── examples/   Read-only working query examples (prepopulated, verified to work)
├── docs/       Read-only documentation (prepopulated)
├── downloads/  Downloaded files from storage datasources (created by download tool)
├── queries/    Your query files organized by datasource (e.g., queries/ATHENA/my_query.sql)
└── ...         You can create any other folders/files as needed for your work

When to use:

  • Explore files and directories (ls, find, tree)
  • Search content (grep, awk, sed)
  • Run scripts (python3, node)
  • Process data (sort, uniq, wc, jq)
  • Work with archives (tar, zip, gzip)
  • Create and manage query files in queries/ directory
  • Version control with git (clone repos, manage your own projects)
  • Manage scheduled jobs with dv-schedule (create recurring automated tasks)

Arguments:

  • command: Shell command to execute (e.g., "git status", "ls -la")
  • timeout: Maximum execution time in seconds (default: 30, max: 300)

Returns: { "success": True/False, "exit_code": 0, "stdout": "command output", "stderr": "error output (if any)", "execution_time": 0.234, }

Security:

  • Only executes within your REMOTE home directory (/workspace)

Examples:

1. List workspace contents: >>> execute_command("ls -la /workspace")

2. Create a query file: >>> execute_command("echo 'SELECT * FROM users' > queries/ATHENA/users.sql")

3. Find Python files: >>> execute_command("find . -name '*.py' -type f")

4. Search for pattern in a subdirectory: >>> execute_command("grep -r 'TODO' projects/repo")

5. Run a script: >>> execute_command("cd projects/repo && python3 analyze.py input.csv")

6. Commit changes: >>> execute_command("cd projects/repo && git add . && git commit -m 'Update'")

7. Create a query file (simple): >>> execute_command("mkdir -p queries/ATHENA") >>> execute_command("echo 'SELECT * FROM customers LIMIT 100' > queries/ATHENA/list.sql")

8. Create a query file (multi-line with heredoc):
>>> cmd = "cat <<'EOF' > queries/DUCKDB/analysis.sql\n"
>>> cmd += "SELECT category, COUNT(*) as count\n"
>>> cmd += "FROM products GROUP BY category\nEOF"
>>> execute_command(cmd)

9. Create a scheduled job: >>> execute_command("dv-schedule create daily-sync --script sync_data.py --cron '@daily'")

10. List scheduled jobs: >>> execute_command("dv-schedule list")
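The heredoc construction in example 8 can be wrapped in a small helper so the delimiter and newlines are handled in one place. This is a sketch, not part of the tool: `heredoc_command` is a hypothetical function, and the quoted 'EOF' delimiter keeps the shell from expanding anything inside the body.

```python
# Sketch: assembling a multi-line heredoc command before passing it to
# execute_command(). heredoc_command() is a hypothetical helper.

def heredoc_command(path: str, body: str, delim: str = "EOF") -> str:
    # Quoting the delimiter ('EOF') disables shell expansion in the body.
    return f"cat <<'{delim}' > {path}\n{body}\n{delim}"

sql = "SELECT category, COUNT(*) AS count\nFROM products GROUP BY category"
cmd = heredoc_command("queries/DUCKDB/analysis.sql", sql)
print(cmd.splitlines()[0])  # cat <<'EOF' > queries/DUCKDB/analysis.sql
```

The resulting string would then be passed as `execute_command(cmd)`.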

For Query Files:

  • Store queries in queries/{DATASOURCE}/ directory in the REMOTE workspace (e.g., queries/ATHENA/report.sql)
  • Organize by datasource name as shown in list_datasources()
  • See query_file_workflow prompt for complete guide

Next Steps:

  • Check exit_code to verify success (0 = success)
  • Read stdout for command output
  • Read stderr for error messages
  • Use list_installed_packages() to see available commands
  • Use get_command_help(command) for command documentation
Parameters (1 required, 1 optional)
Required
  • command (string)
Optional
  • timeout (integer), default: 30

Generate Connector URL

generate_connector_url
Full Description

Generate setup URL for configuring new datasource connectors in your REMOTE workspace.

Creates connector-specific URLs for setting up connectors with third-party services (OAuth: HubSpot, Salesforce) or database connections (PostgreSQL, MySQL, etc.) based on the current user's customer context. URLs must be opened manually by the user. Once configured, datasources appear in list_datasources() and can be queried in the REMOTE workspace.

Args: connector_type: Type of connector ("hubspot", "salesforce", "postgres", "mysql", etc.)

Returns: {"url": "...", "connector_type": "...", "workflow_type": "...", "instructions": "..."}

Parameters (1 required)
Required
  • connector_type (string)

Get Schema

get_schema
Full Description

Get schema/documentation for datasources in your REMOTE sandboxed workspace environment.

**IMPORTANT: Remote Workspace**

This tool operates in a REMOTE sandboxed Linux environment provisioned for you and the user, NOT your local machine.

  • Use this to explore schemas before writing queries in the REMOTE workspace

This is a flexible tool that provides hierarchical schema discovery:

  • When called with ONLY datasource_name: Returns list of all databases/schemas
  • When called with datasource_name + database: Returns list of tables in that database
  • When called with datasource_name + database + table: Returns detailed column schema for that table

Use this as your primary schema discovery tool for all datasources.

Supports:

  • SQL databases (PostgreSQL, ClickHouse, Redshift, Athena, etc.)
  • APIs and NoSQL datasources (Jira, Salesforce, MongoDB, etc.) - returns overview documentation
  • Google Sheets: returns database_id for stable spreadsheet access

Does NOT support: Storage datasources (S3, Azure Blob) - use browse() for those

Args:

  • datasource_name: Datasource from list_datasources() (check capabilities=["query"])
  • database: Optional database/schema name (only for SQL datasources)
  • table: Optional table name to get the detailed schema for (requires the database parameter)

Examples:

>>> get_schema("ATHENA")
{"success": True, "databases": ["analytics", "production"], "count": 2}

>>> get_schema("JIRA")
{"success": True, "overview": "JIRA Datasource ...", "generated": True}

>>> get_schema("ATHENA", database="analytics")
{"success": True, "database": "analytics", "tables": [{"database": "analytics", "name": "users", ...}], "count": 5}

>>> get_schema("ATHENA", database="analytics", table="users")
{"success": True, "database": "analytics", "table": "users", "columns": [{"name": "id", "type": "bigint"}], "column_count": 3}
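The three-level drill-down (databases → tables → columns) can be sketched as a loop over a stubbed schema function. This is illustrative only: `fake_schema` stands in for the remote get_schema() tool and simply mirrors the response shapes in the examples above.

```python
# Sketch: hierarchical schema discovery. fake_schema() is a local stub
# that mimics the documented get_schema() response shapes.

def fake_schema(name, database=None, table=None):
    if table:
        return {"success": True, "columns": [{"name": "id", "type": "bigint"}]}
    if database:
        return {"success": True, "tables": [{"name": "users"}]}
    return {"success": True, "databases": ["analytics", "production"]}

# Level 1: databases; level 2: tables in one database; level 3: columns.
dbs = fake_schema("ATHENA")["databases"]
tables = fake_schema("ATHENA", database=dbs[0])["tables"]
cols = fake_schema("ATHENA", database=dbs[0], table=tables[0]["name"])["columns"]
print(dbs[0], tables[0]["name"], cols[0]["name"])
```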

Parameters (1 required, 2 optional)
Required
  • datasource_name (string)
Optional
  • database (string), default: null
  • table (string), default: null

List Datasources

list_datasources
Full Description

Discover all available datasources (SQL databases, APIs, storage) with their capabilities.

**IMPORTANT: Remote Workspace**

This tool operates in a REMOTE sandboxed Linux environment provisioned for you and the user, NOT your local machine.

First step in any data workflow. Returns all accessible datasources with capabilities field indicating which operations each supports.

Returns:

{
  datasources: [{datasource_name, type, capabilities, display_name?, description?, examples?, docs?}, ...],
  count,
  message,
  next_actions,
  display_links: [{label, url, description}]  (links to display to users)
}

Capabilities Guide:

  • ["query"] → Use query() for SQL/API queries (results loaded into DuckDB)
  • ["browse", "download"] → Use browse() to list files, download() to retrieve them

Example:

>>> list_datasources()

{
  "datasources": [
    {"datasource_name": "ATHENA", "capabilities": ["query"]},
    {"datasource_name": "OPSRUS_S3", "capabilities": ["browse", "download"]}
  ]
}
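The capabilities field makes tool routing a one-line decision per datasource. A minimal sketch, assuming the example response above: `pick_tool` is a hypothetical helper, not part of the server.

```python
# Sketch: routing each datasource to the right tool from its capabilities
# field, per the Capabilities Guide. pick_tool() is hypothetical.

def pick_tool(capabilities: list) -> str:
    if "query" in capabilities:
        return "query"        # SQL/API datasource; results land in DuckDB
    if "browse" in capabilities:
        return "browse"       # storage datasource; list files, then download()
    return "unsupported"

datasources = [
    {"datasource_name": "ATHENA", "capabilities": ["query"]},
    {"datasource_name": "OPSRUS_S3", "capabilities": ["browse", "download"]},
]
for ds in datasources:
    print(ds["datasource_name"], "->", pick_tool(ds["capabilities"]))
```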

Query

query
Full Description

Execute queries on datasources with results automatically loaded into DuckDB for analysis.

**IMPORTANT: Remote Workspace**

This tool operates in a REMOTE sandboxed Linux environment provisioned for you and the user, NOT your local machine.

About Your REMOTE Workspace: This MCP server provides a REMOTE sandboxed Linux data workspace with integrated query execution and analysis capabilities. When you run queries, results are automatically loaded into persistent DuckDB tables (on the REMOTE server), creating a REPL-like environment where:

  • Query results stay available in DuckDB tables for further analysis
  • You can write DuckDB queries to transform, join, and analyze the data
  • Data persists across tool calls for iterative analysis
  • Write query files to queries/{DATASOURCE}/ on the REMOTE server

Runs queries on configured datasources and saves results into DuckDB tables on the REMOTE server.

DuckDB Integration (on the REMOTE server): After running a query, you can:

1. Use the returned duckdb_table_name to query the data with DuckDB
2. Write DuckDB queries in queries/DUCKDB/ to transform/analyze results
3. Join multiple query results together in DuckDB
4. Results persist for the session - no need to re-query

Source format for writes: "DATASOURCE:identifier"

  • DUCKDB:table_name → Read from DuckDB table
  • ATHENA:queries/athena/query.sql → Execute query first
  • DUCKDB:queries/duckdb/transform.sql → Execute DuckDB query

Mode: "replace" (clear + write) or "append" (add rows)

Args:

  • datasource_name: Datasource from list_datasources()
  • path: Query file path (e.g., "queries/ATHENA/query.sql")
  • params: Optional Jinja2 parameters
  • response_rows: Rows to return in the response (default 10, full data in DuckDB)

Returns:

  • Read: {duckdb_table_name, rows, dataframe, ...}
  • Write: {operation: "write", spreadsheet, sheet, rows_written, ...}

**Example Workflow: Query and Analyze with DuckDB (on the REMOTE server)**

1. Query datasource (results auto-load to REMOTE DuckDB): >>> query("ATHENA", "queries/ATHENA/sales_data.sql")

Returns: {"duckdb_table_name": "sales_data", "rows": 10000}

2. Analyze results with DuckDB in REMOTE workspace: >>> query("DUCKDB", "queries/DUCKDB/analyze_sales.sql")

analyze_sales.sql: SELECT region, SUM(amount) FROM sales_data GROUP BY region

3. Join with another query in REMOTE DuckDB: >>> query("POSTGRES", "queries/POSTGRES/customer_data.sql")

Returns: {"duckdb_table_name": "customer_data", "rows": 5000}

4. Combine in DuckDB on REMOTE workspace: >>> query("DUCKDB", "queries/DUCKDB/join_sales_customers.sql")

join_sales_customers.sql:

SELECT s.*, c.customer_name
FROM sales_data s
JOIN customer_data c ON s.customer_id = c.id
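The four-step workflow above can be sketched as plain Python, chaining the duckdb_table_name values that each call returns. This is illustrative only: `fake_query` is a local stub for the remote query() tool, and the table-name derivation mimics the names shown in the workflow.

```python
# Sketch: chaining query() results through their duckdb_table_name fields.
# fake_query() is a stub mimicking the documented return shape.

def fake_query(datasource, path):
    # Derive the table name from the query file name, as in the workflow above.
    table = path.rsplit("/", 1)[-1].split(".")[0]
    return {"duckdb_table_name": table}

sales = fake_query("ATHENA", "queries/ATHENA/sales_data.sql")
customers = fake_query("POSTGRES", "queries/POSTGRES/customer_data.sql")
join_sql = (f"SELECT s.*, c.customer_name FROM {sales['duckdb_table_name']} s "
            f"JOIN {customers['duckdb_table_name']} c ON s.customer_id = c.id")
print(join_sql)
```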

Parameters (2 required, 2 optional)
Required
  • datasource_name (string)
  • path (string)
Optional
  • params (object), default: null
  • response_rows (integer), default: 10