CeresLabX/humata-mcp

critical

Remote HTTP MCP server for Humata AI document chat

purpose: MCP server (purpose undetermined)
threat: network exposed

TypeScript · ★ 0 · ◷ May 20, 2026 · ⚙ May 20, 2026 · GitHub

◆ Vulnerability Analysis [5 findings in 5 blocks]

◷ 5/20/2026

high · 1 finding

src/tools.ts

51  }, async ({ url, folder_id }: { url: string; folder_id?: string }) => {
52    try {
53      validateDocumentUrl(url);
54      const effectiveFolderId = folder_id || process.env.HUMATA_FOLDER_ID;
55      if (!effectiveFolderId) {
56        return {
57          content: [{
58            type: 'text',
59            text: 'Missing Humata folder_id. Set HUMATA_FOLDER_ID in Railway or pass folder_id to the tool.',
60          }],
61          isError: true,
62        };
63      }
64
65      const result = await humata.addDocumentFromUrl(url, effectiveFolderId);

src/index.ts:11→src/tools.ts:5

// Exploitable if MCP is exposed to untrusted prompts (network_exposed).

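The tool 'humata_add_document_from_url' accepts a URL from the caller and passes it to the Humata API's addDocumentFromUrl method. validateDocumentUrl is called first, but it only checks basic URL format and the HTTPS scheme; it does not block private IP ranges (e.g., 10.x.x.x, 192.168.x.x, 127.0.0.1), link-local addresses (169.254.x.x), or internal hostnames. An attacker could therefore supply a URL that points at an internal service and use the resulting fetch for SSRF, potentially reaching cloud metadata or other internal resources. Note that the usual metadata URL, http://169.254.169.254/latest/meta-data/, uses plain HTTP and would be rejected if the HTTPS check works as described; HTTPS endpoints on private addresses, and public hostnames that resolve to private addresses, would still pass.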

Impact: An attacker could use this to probe internal networks, read cloud instance metadata (e.g., AWS, GCP), or interact with internal services reachable by whichever service fetches the URL, leading to information disclosure or further compromise.

Fix: Reject URLs whose host is a private, loopback, or link-local IP address or an internal hostname. Prefer an allowlist of expected hosts over a blocklist where the use case permits. Parse the URL, resolve the hostname, and check every resolved address against the private ranges before the URL is forwarded.
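
The literal-address part of such a check can be sketched as follows. This is a minimal illustration, not the repository's code: `isPrivateIPv4` and `assertPublicHttpsUrl` are hypothetical names, the suffix list for internal hostnames is an assumption, and the sketch deliberately does not resolve DNS, so a public hostname that resolves to a private address would still pass.

```typescript
import { isIP } from "node:net";

// Hypothetical helper (not from the repository): true when a dotted-quad IPv4
// literal is in an unspecified, private, loopback, or link-local range.
function isPrivateIPv4(ip: string): boolean {
  const parts = ip.split(".").map(Number);
  const a = parts[0] ?? -1;
  const b = parts[1] ?? -1;
  return (
    a === 0 ||                            // 0.0.0.0/8 (unspecified)
    a === 10 ||                           // 10.0.0.0/8
    a === 127 ||                          // 127.0.0.0/8 (loopback)
    (a === 169 && b === 254) ||           // 169.254.0.0/16 (link-local, metadata)
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12
    (a === 192 && b === 168)              // 192.168.0.0/16
  );
}

// Hypothetical stricter validator: returns the parsed URL or throws.
function assertPublicHttpsUrl(raw: string): URL {
  const url = new URL(raw); // throws on malformed input
  if (url.protocol !== "https:") {
    throw new Error("Only https URLs are allowed");
  }
  // Strip IPv6 brackets and a trailing dot before classifying the host.
  const host = url.hostname.replace(/^\[|\]$/g, "").replace(/\.$/, "").toLowerCase();
  const family = isIP(host); // 4, 6, or 0 when the host is a name
  if (family === 4 && isPrivateIPv4(host)) {
    throw new Error("Private, loopback, or link-local IPv4 address");
  }
  if (family === 6) {
    // Simplification: this sketch rejects every IPv6 literal.
    throw new Error("IPv6 literals are rejected in this sketch");
  }
  if (family === 0) {
    // Assumed list of internal-looking names; adjust for the deployment.
    const internalName =
      host === "localhost" ||
      host.endsWith(".localhost") ||
      host.endsWith(".local") ||
      host.endsWith(".internal") ||
      !host.includes(".");
    if (internalName) {
      throw new Error("Internal hostname");
    }
  }
  return url;
}
```

A complete defense would also resolve the hostname and apply the same range check to every returned address, ideally in the component that actually performs the fetch.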

high · 1 finding

src/tools.ts

108  }, async ({ document_id }: { document_id: string }) => {
109    try {
110      const doc = await humata.getDocumentStatus(document_id);
111      registry.updateDocumentStatus(document_id, doc.read_status || 'UNKNOWN');
112      
113      return {
114        content: [{
115          type: 'text',
116          text: JSON.stringify({
117            document_id: doc.id,
118            document_name: doc.name,
119            folder_id: doc.folder_id,
120            read_status: doc.read_status,
121            source_url: doc.source_url,
122            created_at: doc.created_at,
123            updated_at: doc.updated_at,
124          }, null, 2),
125        }],
126      };

src/index.ts:11→src/tools.ts:5

// Exploitable if MCP is exposed to untrusted prompts (network_exposed).

Several tools (humata_get_document_status, humata_wait_for_document, humata_create_conversation, humata_chat, etc.) accept document_id or conversation_id with no validation beyond the string type check and pass the value directly to the Humata API. An attacker could submit arbitrary values to probe for valid IDs or trigger unexpected behavior on the Humata backend. The IDs are UUIDs in practice, but no format check is enforced.

Impact: An attacker could attempt to enumerate document or conversation IDs and learn about documents not intended for them. Combined with other vulnerabilities, this could allow unauthorized access to data.

Fix: Validate that document_id and conversation_id match the expected UUID format (e.g., with a regex or zod's uuid validation) and reject inputs that do not conform.
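
A dependency-free version of that format check might look like the sketch below. `assertUuid` is a hypothetical helper, and the pattern assumes the IDs use the canonical 8-4-4-4-12 hexadecimal layout, which the report says holds in practice but does not confirm. If the tool schemas are already defined with zod, `z.string().uuid()` expresses the same constraint.

```typescript
// Canonical 8-4-4-4-12 hexadecimal layout; version and variant bits are not checked.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper: returns the normalized ID or throws a descriptive error.
function assertUuid(value: unknown, field: string): string {
  if (typeof value !== "string" || !UUID_RE.test(value)) {
    throw new Error(`${field} must be a UUID`);
  }
  return value.toLowerCase();
}
```

Format validation only narrows the input space. It does not stop a caller from guessing well-formed IDs, so access control on the backend remains the real safeguard against enumeration.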

medium · 1 finding

src/index.ts

379 app.post('/mcp', authMiddleware, async (req: express.Request, res: express.Response) => {
380  setCorsHeaders(res);
381  
382  try {
383    // Get or create session ID
384    const sessionId = (req.headers['mcp-session-id'] as string) || 
385                      (req.headers['Mcp-Session-Id'] as string) || 
386                      randomUUID();
387    
388    // Get or create transport for this session
389    let session = sessions.get(sessionId);
390    if (!session) {
391      const transport = new StreamableHTTPServerTransport({
392        sessionIdGenerator: () => sessionId,
393      });
394      transport.onclose = () => {
395        sessions.delete(sessionId);
396      };
397      const sessionServer = createMcpServer();
398      await sessionServer.connect(transport);
399
400      session = { transport, server: sessionServer };
401      sessions.set(sessionId, session);
402    }
403
404    // Handle the request
405    await session.transport.handleRequest(req, res, req.body);
406  } catch (error) {

src/index.ts:4-13

// Exploitable if MCP is exposed to untrusted prompts (network_exposed).

The MCP endpoint keys sessions on a session ID header supplied by the client and generates a new ID only when the header is absent. When the header carries an ID the server has not seen before, the handler creates a new session under that client-chosen ID. There is no rate limiting on session creation or on requests, so an attacker could create many sessions or send many requests and exhaust server resources (memory, connections). Because session IDs from headers are trusted without validation, session fixation and replay attacks are also possible.

Impact: An attacker could mount a denial-of-service attack by creating many sessions or sending a high volume of requests. Session fixation could let an attacker hijack a session if they can control the session ID header a victim sends.

Fix: Rate-limit session creation and request handling. Validate session IDs (e.g., ensure they are UUIDs generated by the server). Consider accepting only server-generated session IDs and rejecting unknown client-provided ones.
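
The two ideas in this fix, server-issued session IDs with a cap and a per-client request limit, can be sketched without any framework. Everything below is illustrative: `SessionStore` and `FixedWindowLimiter` are hypothetical classes, the limits are placeholders, and wiring them into the Express handler and the MCP transport is left out.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical store that never adopts a client-chosen session ID.
class SessionStore<T extends object> {
  private readonly sessions = new Map<string, T>();
  private readonly maxSessions: number;

  constructor(maxSessions: number) {
    this.maxSessions = maxSessions;
  }

  // With a client ID: return the matching session or throw if it is unknown.
  // Without one: create a session under a fresh server-generated UUID.
  resolve(clientId: string | undefined, create: () => T): { id: string; session: T; isNew: boolean } {
    if (clientId !== undefined) {
      const existing = this.sessions.get(clientId);
      if (existing === undefined) {
        throw new Error("Unknown session ID");
      }
      return { id: clientId, session: existing, isNew: false };
    }
    if (this.sessions.size >= this.maxSessions) {
      throw new Error("Session limit reached");
    }
    const id = randomUUID();
    const session = create();
    this.sessions.set(id, session);
    return { id, session, isNew: true };
  }

  delete(id: string): void {
    this.sessions.delete(id);
  }
}

// Hypothetical fixed-window limiter keyed by client identity (e.g., IP or token).
class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true while the key has made at most `limit` calls in the current window.
  allow(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);
    if (current === undefined || now - current.start >= this.windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    current.count += 1;
    return current.count <= this.limit;
  }
}
```

A production version would also expire idle sessions and evict stale limiter entries, since both maps otherwise grow with the number of distinct clients.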

medium · 1 finding

src/index.ts

406    const errorMsg = error instanceof Error ? error.message : String(error);
407    console.error('MCP request error:', redactSecrets(errorMsg, HUMATA_API_KEY));
408    
409    if (!res.headersSent) {

src/index.ts:12

// Exploitable if logs are accessible to an attacker (e.g., log aggregation service).

redactSecrets is used to strip HUMATA_API_KEY from error messages before they are logged. Other secrets (e.g., MCP_AUTH_TOKEN, OAUTH_JWT_SECRET, OAUTH_ADMIN_PASSWORD) are not redacted. If an error message contains any of them (e.g., from a stack trace or an API response), they would be logged in plaintext.

Impact: An attacker with access to server logs could obtain sensitive credentials (MCP_AUTH_TOKEN, OAUTH_JWT_SECRET, OAUTH_ADMIN_PASSWORD) and gain unauthorized access to the MCP server or its OAuth functionality.

Fix: Extend redactSecrets to also redact MCP_AUTH_TOKEN, OAUTH_JWT_SECRET, OAUTH_ADMIN_PASSWORD, and any other configured secrets. Alternatively, avoid logging error messages that may contain secrets.
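
One way to generalize the redaction is to pass every configured secret at once. The sketch below is an assumption about how that could look, not the repository's redactSecrets: `redactAll` is a hypothetical name, and the `[REDACTED]` placeholder is arbitrary.

```typescript
// Hypothetical multi-secret redactor. Unset or empty secrets are skipped so that
// an empty string never matches between every character of the message.
function redactAll(message: string, secrets: Array<string | undefined>): string {
  const present = secrets
    .filter((s): s is string => typeof s === "string" && s.length > 0)
    // Longest first, so a secret that contains another is replaced as a whole.
    .sort((x, y) => y.length - x.length);
  let output = message;
  for (const secret of present) {
    // split/join replaces every literal occurrence without regex escaping.
    output = output.split(secret).join("[REDACTED]");
  }
  return output;
}

// Possible call site, using the environment variable names from the finding:
// console.error("MCP request error:", redactAll(errorMsg, [
//   process.env.HUMATA_API_KEY,
//   process.env.MCP_AUTH_TOKEN,
//   process.env.OAUTH_JWT_SECRET,
//   process.env.OAUTH_ADMIN_PASSWORD,
// ]));
```

Exact-match redaction misses encoded or truncated copies of a secret (for example, a base64 form inside a header), so it complements rather than replaces keeping secrets out of error messages.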

low · 1 finding

src/tools.ts

268  }, async ({ conversation_id, question, model = 'gpt-4o', selected_answer_approach = 'Grounded' }: {
269    conversation_id: string;
270    question: string;
271    model?: string;
272    selected_answer_approach?: string;
273  }) => {
274    try {
275      const result = await humata.askQuestion(conversation_id, question, { model, selectedAnswerApproach: selected_answer_approach });

src/index.ts:11→src/tools.ts:5

// Exploitable if MCP is exposed to untrusted prompts (network_exposed).

The 'model' parameter in humata_chat and humata_chat_with_documents is accepted as a free-form string and is not checked against an allowlist. The Humata API may reject invalid models, but an attacker could still submit unexpected values that cause the API to behave differently or reveal internal model names.

Impact: Low risk; the main effect is errors or unexpected behavior. If the Humata API has any injection weakness in its model handling, however, this parameter could serve as a vector.

Fix: Validate the model parameter against a predefined list of allowed models (e.g., 'gpt-4o', 'gpt-4-turbo') and reject any value not in the list.
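
An allowlist check for this parameter could be sketched as below. The two entries are only the names mentioned in this finding and are placeholders; the set of models Humata actually accepts is not stated in the report. `parseModel` is a hypothetical helper, and zod's `z.enum([...])` would express the same rule in a tool schema.

```typescript
// Illustrative allowlist: the default from the handler plus the example from the fix.
const ALLOWED_MODELS = ["gpt-4o", "gpt-4-turbo"] as const;
type AllowedModel = (typeof ALLOWED_MODELS)[number];

// Hypothetical helper: applies the handler's default, then enforces the allowlist.
function parseModel(input: string | undefined): AllowedModel {
  const candidate = input ?? "gpt-4o";
  const match = ALLOWED_MODELS.find((m) => m === candidate);
  if (match === undefined) {
    throw new Error(`Unsupported model: ${JSON.stringify(candidate)}`);
  }
  return match;
}
```

Returning the narrowed `AllowedModel` type means downstream code cannot pass an unchecked string by accident.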

◆ Heuristic Signals

shell.exec · env.exposure · auth.none · network.http

◆ Risk Score

LLM-based

high findings: +50
medium findings: +30
low findings: +5