Day 7: TCP Stream Orchestration & Packet Crafting

Bridging the gap between static analysis and dynamic interaction via TCP socket programming

Author: HungNguyen

#Understanding the TCP Stream

Today’s research moved from reading source code to actively interacting with the llama.cpp RPC server. I focused on building the communication backbone required to trigger the previously identified vulnerability.

#1. Reverse Engineering the Server Handshake

By analyzing the server’s message-handling loop, I mapped out the exact sequence of bytes the RPC backend expects. The server does not simply wait for data; it requires a strict, “protocol-legal” handshake before it will process more complex commands.
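As a rough illustration of what a protocol-legal message could look like from the client side, the sketch below frames a request as a one-byte command, a 64-bit little-endian payload length, and the payload itself. That layout is an assumption made for illustration and is not confirmed by this write-up. `frame_command` and `HYPOTHETICAL_CMD` are made-up names, and real command values have to be read from the `rpc_cmd` enum in the server source.

```python
import struct

# Placeholder only: the real value must come from the server's rpc_cmd enum.
HYPOTHETICAL_CMD = 0x00


def frame_command(cmd: int, payload: bytes = b"") -> bytes:
    """Build one request frame: command byte, 64-bit length, then payload.

    The header layout is an assumption, not a confirmed wire format.
    """
    if not 0 <= cmd <= 0xFF:
        raise ValueError("command must fit in a single byte")
    # "<" selects little-endian with no padding; B = uint8, Q = uint64.
    header = struct.pack("<BQ", cmd, len(payload))
    return header + payload


# An empty-payload frame is just the 9-byte header.
frame = frame_command(HYPOTHETICAL_CMD)
```

Keeping the framing in one function means a later correction to the assumed header layout only has to be made in one place.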

#2. Scripting the Raw Socket Interaction

I wrote a custom Python script that uses the standard socket module to drive the raw TCP stream. Unlike high-level APIs, working directly with the socket allows for:

  • Byte-level Precision: The RPC structs are declared under #pragma pack(push, 1), so every field must be serialized back to back with no padding bytes
  • Timing Control: Deciding when, and in what chunks, the metadata is sent, so that the deserialize_tensor function is reached under the right conditions
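To make the byte-level point concrete, here is a minimal sketch of how Python's `struct` module can reproduce a packed layout. The three-field record (`tag`, `ident`, `count`) is hypothetical and does not correspond to any real RPC struct. The useful part is the difference between the `<` prefix (standard sizes, no padding) and the `@` prefix (native alignment, which may insert padding).

```python
import struct

# Hypothetical record: uint8 tag, uint64 ident, uint32 count.
# "<" = little-endian, standard sizes, no padding: 1 + 8 + 4 = 13 bytes.
PACKED = struct.Struct("<BQI")

# "@" = native byte order and alignment; the size is platform-dependent
# because the compiler-style layout may pad between fields.
NATIVE = struct.Struct("@BQI")


def pack_record(tag: int, ident: int, count: int) -> bytes:
    """Serialize the hypothetical record with no padding between fields."""
    return PACKED.pack(tag, ident, count)


record = pack_record(1, 0x1122334455667788, 7)
unpacked = PACKED.unpack(record)
```

Comparing `PACKED.size` with `NATIVE.size` on the target machine is a quick way to check whether a format string is silently adding padding.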

#3. Logic Flow Analysis

I’ve traced the server’s logic from the moment a packet arrives at the socket until it hits the vulnerable rpc_server layer:

  1. Socket Listen: Server accepts the connection
  2. Command Dispatch: The first few bytes define the rpc_cmd
  3. Metadata Ingestion: The server reads the remaining bytes directly into an rpc_tensor buffer
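The three steps above can be modelled in Python to reason about the receive side. This is a sketch of the general pattern, not the server's actual code. The header layout (one command byte, then a 64-bit little-endian length) and the names `recv_exact` and `read_request` are assumptions. A local `socket.socketpair()` stands in for the real connection so the example runs without a server.

```python
import socket
import struct


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; TCP may deliver them in smaller pieces."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def read_request(sock: socket.socket) -> tuple:
    """Model of the dispatch path: command byte, length, then payload."""
    cmd = recv_exact(sock, 1)[0]                         # command dispatch
    (size,) = struct.unpack("<Q", recv_exact(sock, 8))   # assumed length field
    payload = recv_exact(sock, size)                     # metadata ingestion
    return cmd, payload


client, server = socket.socketpair()
try:
    # Send one request in two separate writes to mimic chunked delivery.
    client.sendall(b"\x07" + struct.pack("<Q", 4) + b"ab")
    client.sendall(b"cd")
    cmd, payload = read_request(server)
finally:
    client.close()
    server.close()
```

Because TCP is a byte stream, the reader cannot assume that one `send` maps to one `recv`. That is why the model loops until the requested byte count has arrived.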

#The Exploit Backbone

The script currently implements the following milestones:

  • TCP Connection Establishment: Successful handshake with the RPC port
  • Packet Serialization: Correctly converting Python objects into the binary format expected by the C++ backend
  • Basic Command Execution: Sending RPC_CMD_HELLO and receiving a valid version response
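For the last milestone, the following sketch shows one way the version reply could be decoded. It assumes the reply is a 64-bit little-endian length followed by three single-byte fields (major, minor, patch). That format is an assumption for illustration, and `parse_version_reply` is a hypothetical helper. The sample bytes are synthetic, not captured from a server.

```python
import struct

LENGTH_PREFIX = struct.Struct("<Q")    # assumed 64-bit little-endian body length
VERSION_BODY = struct.Struct("<BBB")   # assumed major, minor, patch bytes


def parse_version_reply(raw: bytes) -> tuple:
    """Decode an assumed length-prefixed (major, minor, patch) reply."""
    if len(raw) < LENGTH_PREFIX.size:
        raise ValueError("reply is shorter than its length prefix")
    (body_len,) = LENGTH_PREFIX.unpack_from(raw, 0)
    body = raw[LENGTH_PREFIX.size:]
    # Reject replies whose declared and actual sizes disagree.
    if body_len != VERSION_BODY.size or len(body) != body_len:
        raise ValueError("unexpected reply length")
    return VERSION_BODY.unpack(body)


# Synthetic reply for demonstration only.
sample = LENGTH_PREFIX.pack(3) + bytes([1, 2, 3])
version = parse_version_reply(sample)
```

Checking the declared length against the bytes actually received keeps the client from misreading a truncated or malformed reply as a valid version.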