Mapping Solana’s Black Boxes

A Method for Tracking Behavior in Closed-Source Programs

Apr 07, 2025

Introduction

As the Solana ecosystem grows more complex and widely adopted, many of its most active programs are being deployed without open-source code, public SDKs, or IDLs. These closed-source protocols, often fully upgradeable and admin-controlled, make it difficult for analysts, researchers, and users to understand how they work — or how they're being used.

While we often think of reverse engineering as reconstructing program logic, there’s another approach that’s more accessible and broadly useful: behavioral mapping. By tracking on-chain activity and surfacing consistent structures in real transactions, it’s possible to categorize what a contract is doing, who is using it, and how.

This post presents a practical, repeatable methodology for identifying and labeling real-world behavior within closed-source Solana programs. Instead of cracking open the contract’s logic, we infer structure through account patterns, inner instruction traces, and token flow — starting with nothing but the transactions themselves.

The result is a useful framework for surfacing contract activity at scale, even when internal logic is completely opaque. Our hope is that this work helps advance transparency across Solana — giving the community better tools for understanding how opaque programs are functioning in the real world.

Methodology Overview

Our approach is grounded in observing and interpreting the runtime behavior of closed-source Solana programs. Since we cannot inspect the source code or rely on an IDL, we treat the program as a black box — learning how it works by interacting with it and measuring its responses.

At a high level, the methodology involves:

Identifying a target program to track
Crafting and sending transactions with controlled input
Capturing transaction logs and execution traces
Analyzing involved accounts, changes in account data, and log output
Looking for consistent patterns that indicate function behavior, argument structures, and control flow
Refining our understanding through iteration and variation

To illustrate this process in action, we’ll walk through a full example of tracking a single category of user interactions within a closed-source Solana program step by step.

Step 1: Selecting a Target Program

We begin by choosing a program that is actively deployed, lacks an IDL, and has no publicly available source code. For this example, we’ll use pAMMBay6oceH9fJKBRHGP5D4bD4sWpmSwMn52FMfXEA, the address of PumpSwap, a closed-source protocol deployed on Solana mainnet. It is a newly popular DEX, yet its behavior remains undocumented and opaque.

You can discover similar programs by monitoring active addresses in transaction flows, checking top interacting programs on Solana Explorer, or identifying contracts without IDLs on platforms like SolanaFM.

Step 2: Interacting Through the Frontend and Inspecting the Transaction

When a closed-source Solana program has an active frontend, it offers a convenient way to start usage tracking. Even without access to the codebase or an SDK, the UI exposes on-chain functionality that we can manually trigger to produce real transactions — giving us a starting point for deeper analysis.

In this example, we visited the PumpSwap frontend and performed a basic action: submitting a token swap.

After submitting the swap, we opened our wallet’s transaction history and located the relevant transaction. From there, we looked it up on a block explorer (e.g., SolanaFM, Solscan, or Solana Explorer) to inspect its structure in more detail.

Here’s the transaction hash we started with: 4wMYkBw51pZ1ve45xTYcW4mnqHPVJRbNqpMwyFa2VyJKrfjHi13Feeeb7bUFrdJowvZRNasiBRiJaYzWKLrMG11J

We examined:

The program ID that was called (pAMMBay6oceH9fJKBRHGP5D4bD4sWpmSwMn52FMfXEA, i.e., PumpSwap)
The list of accounts involved (user wallet, PDAs, token accounts, etc.)
Log messages emitted during execution
The raw instruction data and argument values
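The same four items can also be pulled programmatically instead of read off a block explorer. The sketch below is one possible way to do that with the standard getTransaction JSON-RPC method; the endpoint URL and the helper names fetch_transaction and summarize are our own illustrative choices, not part of any PumpSwap tooling.

```python
import json
import urllib.request

# Assumption: any Solana mainnet RPC endpoint can be substituted here.
RPC_URL = "https://api.mainnet-beta.solana.com"


def fetch_transaction(signature: str, rpc_url: str = RPC_URL) -> dict:
    """Fetch one transaction with the getTransaction JSON-RPC method."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "getTransaction",
        "params": [
            signature,
            {"encoding": "json", "maxSupportedTransactionVersion": 0},
        ],
    }
    request = urllib.request.Request(
        rpc_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["result"]


def summarize(tx: dict, program_id: str) -> list[dict]:
    """Return one summary per top-level instruction that calls program_id.

    Calls made through CPI live in meta.innerInstructions and are not
    covered by this sketch.
    """
    message = tx["transaction"]["message"]
    meta = tx["meta"]
    # Versioned transactions append addresses loaded from lookup tables.
    loaded = meta.get("loadedAddresses") or {}
    keys = (
        message["accountKeys"]
        + loaded.get("writable", [])
        + loaded.get("readonly", [])
    )
    summaries = []
    for ix in message["instructions"]:
        if keys[ix["programIdIndex"]] != program_id:
            continue
        summaries.append({
            "accounts": [keys[i] for i in ix["accounts"]],
            "data": ix["data"],  # base58-encoded instruction data
            "logs": meta.get("logMessages") or [],
        })
    return summaries
```

Passing the transaction hash above to fetch_transaction and the result to summarize should yield the account list, raw data, and logs for the PumpSwap call, provided the chosen endpoint still retains that transaction.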

Once we had the basic anatomy of a transaction, we browsed the block explorer for similar transactions — ideally those that:

Called the same program ID
Contained similar account structures (e.g., 17 accounts)
Logged "Instruction: Buy" or "Instruction: Sell"
Used different token pairs or users
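These criteria translate directly into a small filter. The following is a minimal sketch, assuming each candidate has already been reduced to a list of account addresses and a list of log lines; the function names and constants are hypothetical.

```python
from typing import Optional

TARGET_ACCOUNT_COUNT = 17
SWAP_LOG_MARKERS = {"buy": "Instruction: Buy", "sell": "Instruction: Sell"}


def swap_direction(logs: list[str]) -> Optional[str]:
    """Return "buy" or "sell" if a matching log line is present, else None."""
    for direction, marker in SWAP_LOG_MARKERS.items():
        if any(marker in line for line in logs):
            return direction
    return None


def looks_like_swap(accounts: list[str], logs: list[str]) -> bool:
    """Apply the account-count and log-message criteria together."""
    return (
        len(accounts) == TARGET_ACCOUNT_COUNT
        and swap_direction(logs) is not None
    )
```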

This gave us a broader sample set of how the program behaves in different scenarios. We could compare how inputs, accounts, and logs change — building an intuition for what is fixed (e.g., PDAs) and what is dynamic (e.g., amount fields, token mints).

This process of exploring related transactions is critical. It shifts us from isolated inspection to behavioral mapping — helping us identify patterns, spot anomalies, and eventually model the core logic of the program.

By probing both the frontend and its resulting transactions, and tracing how similar swaps behave, we lay the foundation for tracking the protocol’s usage from the outside in.

Step 3: Identifying and Generalizing Swap Behavior

Once we had a small sample set of what we were confident were swaps, we began cataloging their structure and execution. Across these examples, we saw clear regularities, which we describe below.

3.1 Finding Valid Swaps Using Account Patterns

We use two Flipside queries for this step, both linked under Example queries at the end of this post. The first counts the number of transactions calling the PumpSwap contract, grouped by the number of accounts involved. This gives us a distribution of account counts and helps us identify common structural patterns. The second filters for transactions that include a specific number of accounts (in this case, 17) and returns a sample set that we manually inspect.

We typically examine 10–20 transactions from this filtered set to validate whether they follow a consistent structure and likely represent swap behavior.
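For readers working outside Flipside, the same two steps can be approximated in a few lines of Python. This is a rough sketch over account lists you have already collected, not a translation of our queries; both function names are illustrative.

```python
from collections import Counter


def account_count_distribution(account_lists: list[list[str]]) -> Counter:
    """Count how many instructions use each number of accounts."""
    return Counter(len(accounts) for accounts in account_lists)


def sample_with_count(
    account_lists: list[list[str]], count: int, limit: int = 20
) -> list[list[str]]:
    """Return up to `limit` account lists with exactly `count` accounts."""
    matches = [accounts for accounts in account_lists if len(accounts) == count]
    return matches[:limit]
```

Calling account_count_distribution(...).most_common() lists the dominant account counts first, which is usually where manual inspection should begin.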

While account count is often the most effective starting point, depending on the contract, you may need to experiment with other signals — such as instruction discriminators, token flow direction, or PDA consistency — to accurately distinguish transaction types. But for many closed-source programs, account count offers the cleanest initial segmentation.
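As an example of one alternative signal, the sketch below groups instructions by the first 8 bytes of their data. It assumes the program follows the Anchor convention, in which those bytes are the first 8 bytes of sha256("global:<instruction_name>"); whether a given closed-source program actually does so is something to verify against real transactions. The base58 decoder is a small stdlib-only helper written for this post.

```python
import hashlib
from collections import Counter

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text: str) -> bytes:
    """Decode a base58 string (the encoding RPC uses for instruction data)."""
    number = 0
    for char in text:
        number = number * 58 + B58_ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Each leading "1" character encodes one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def anchor_discriminator(name: str) -> bytes:
    """Discriminator an Anchor program would use for instruction `name`."""
    return hashlib.sha256(f"global:{name}".encode()).digest()[:8]


def discriminator_counts(encoded_data: list[str]) -> Counter:
    """Group base58 instruction data by its first 8 bytes (as hex)."""
    return Counter(b58decode(data)[:8].hex() for data in encoded_data)
```

If the most common prefixes match anchor_discriminator("buy") and anchor_discriminator("sell"), that is reasonable evidence the guess about the convention holds; if not, the prefixes still work as opaque labels for segmenting transaction types.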

In our sample, every transaction calling PumpSwap with exactly 17 accounts was a swap, either a buy or a sell. Each included:

Signer wallet
Input and output token accounts (user and pool)
Pool state and config accounts
Protocol fee recipient
Supporting programs (Token, System, Associated Token)

This consistency let us begin labeling account roles across multiple transactions and distinguishing user-specific state from fixed PDAs.
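One way to make that distinction mechanical is to compare each account position across the sample: positions that always hold the same address are candidates for fixed PDAs or programs, while positions that vary are user- or pool-specific. The sketch below assumes every sampled instruction has the same number of accounts; the function name is our own.

```python
def split_fixed_and_dynamic(
    account_lists: list[list[str]],
) -> tuple[dict[int, str], list[int]]:
    """Separate positions with one constant address from those that vary.

    Returns (fixed, dynamic): `fixed` maps position -> the constant address,
    `dynamic` lists the positions whose address changes across the sample.
    """
    if not account_lists:
        return {}, []
    width = len(account_lists[0])
    if any(len(accounts) != width for accounts in account_lists):
        raise ValueError("all account lists must have the same length")

    fixed: dict[int, str] = {}
    dynamic: list[int] = []
    for position, column in enumerate(zip(*account_lists)):
        if len(set(column)) == 1:
            fixed[position] = column[0]
        else:
            dynamic.append(position)
    return fixed, dynamic
```

With a small sample, a position can look fixed by coincidence (for example, several swaps against the same pool), so it helps to include transactions from different users and token pairs.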

3.2 Extracting Swap Data

Once we've identified the transactions worth analyzing, the next step is to extract the relevant data from them. For PumpSwap swaps, the useful fields are:

Input token and amount
Output token and amount
Liquidity pool address
Timestamp of the transaction
Transaction ID
Trader (signer wallet)

This data allows us to quantify swap behavior, track token flows, and attribute on-chain actions to specific users. Since the account structure in these transactions is consistent, we can reliably extract this information using account indexing and known patterns within the transaction structure.
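Our own extraction relies on account indexing, but a different technique can serve as a cross-check: comparing the signer's token balances before and after the transaction. The sketch below uses the preTokenBalances and postTokenBalances fields of a getTransaction response and is a swapped-in alternative, not the method used in our query. Amounts are in raw base units, and a swap that wraps and unwraps native SOL within the same transaction may not show up on the SOL side.

```python
from collections import defaultdict
from typing import Optional


def owner_token_deltas(meta: dict, owner: str) -> dict[str, int]:
    """Net change per mint across all token accounts owned by `owner`."""
    deltas: dict[str, int] = defaultdict(int)
    for sign, key in ((-1, "preTokenBalances"), (1, "postTokenBalances")):
        for balance in meta.get(key) or []:
            if balance.get("owner") == owner:
                amount = int(balance["uiTokenAmount"]["amount"])
                deltas[balance["mint"]] += sign * amount
    return {mint: delta for mint, delta in deltas.items() if delta != 0}


def swap_record(meta: dict, signer: str) -> Optional[dict]:
    """Build a swap row if exactly one mint went out and one came in."""
    deltas = owner_token_deltas(meta, signer)
    spent = {mint: -delta for mint, delta in deltas.items() if delta < 0}
    received = {mint: delta for mint, delta in deltas.items() if delta > 0}
    if len(spent) != 1 or len(received) != 1:
        return None  # not a clean two-sided swap; inspect manually
    (token_in, amount_in), = spent.items()
    (token_out, amount_out), = received.items()
    return {
        "trader": signer,
        "token_in": token_in,
        "amount_in": amount_in,
        "token_out": token_out,
        "amount_out": amount_out,
    }
```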

Our Flipside query, linked under Example queries at the end of this post, demonstrates how we extract that information from the events we've identified.

Step 4: Catching Divergent Behavior Through Broken Assumptions

With a working model in place for what a “standard” PumpSwap swap looks like, we assumed that:

Any transaction calling the PumpSwap program with exactly 17 accounts is a swap, with the data about the tokens going in and out of the pool typically found in the 2nd and 3rd inner instructions of the event.

While digging through transactions that fit this framework, we ran into two key irregularities in the Flipside data tables:

Some transactions had the same token listed as both the token in and token out.
Some rows had null for both token_out and amount_out.

We investigated these by examining the transactions in a block explorer. Below are examples of each type of irregularity:

Null for both token_out and amount_out: 51yeRPusTvzsi1DREs6Kni3XAv4auwHJassrWR6ngRXfmuJZxVbxE9y7memath2VTAZxcf5L6n2D29hwFe7AYCTn
Same token listed as both input and output: 2v8fE6tAB2P4dUpRd5vcNZqV3vutqQnZeDJ233DPRZwJwKwwJMED3G8Xh42suVLhxERqouY5s9RRRKnr76ZVyU7Y
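Both irregularities are easy to flag automatically once swap rows have been extracted. Here is a minimal sketch, assuming each row is a dictionary with token_in, token_out, and amount_out keys as in our tables; the labels it returns are our own.

```python
def classify_row(row: dict) -> str:
    """Label a swap row as "null_output", "same_token", or "ok"."""
    if row.get("token_out") is None and row.get("amount_out") is None:
        return "null_output"
    if row.get("token_in") == row.get("token_out"):
        return "same_token"
    return "ok"


def irregular_rows(rows: list[dict]) -> list[dict]:
    """Return only the rows that break the standard swap model."""
    return [row for row in rows if classify_row(row) != "ok"]
```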

4.1 Irregularity: Null Output Data

In the transactions where both token_out and amount_out were null, we found that min_quote_amount_out was set to 0. That means the trader effectively said, “I’ll give you these tokens, and I’m okay receiving nothing in return.” And that’s exactly what happened: the transaction executed, but no tokens were sent back, so the trader in effect deposited tokens into the pool for free. Since these transactions don’t reflect a meaningful token exchange, we excluded them from the dataset.
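A check like this can be run on the raw instruction data. The sketch below assumes a hypothetical layout: an 8-byte discriminator followed by two little-endian u64 arguments, with the second being min_quote_amount_out for a sell. That layout is an assumption made for illustration and should be confirmed against argument values shown in a block explorer before relying on it.

```python
import struct

# Assumed layout (hypothetical): 8-byte discriminator, then two u64 values.
HEADER_LEN = 8
ARGS_FORMAT = "<QQ"  # two little-endian unsigned 64-bit integers


def decode_two_u64_args(data: bytes) -> tuple[bytes, int, int]:
    """Split instruction data into (discriminator, first_arg, second_arg)."""
    needed = HEADER_LEN + struct.calcsize(ARGS_FORMAT)
    if len(data) < needed:
        raise ValueError(f"expected at least {needed} bytes, got {len(data)}")
    first, second = struct.unpack_from(ARGS_FORMAT, data, HEADER_LEN)
    return data[:HEADER_LEN], first, second


def accepts_zero_output(data: bytes) -> bool:
    """True if the assumed minimum-output argument (second u64) is 0."""
    _, _, min_out = decode_two_u64_args(data)
    return min_out == 0
```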

4.2 Irregularity: Same Token In and Out

In the transactions where the token in and out were the same, we discovered that the relevant swap data appeared in the 6th and 7th inner instructions, not the 2nd and 3rd as in our original model. This revealed that the location of the swap data within inner instructions can vary, and must be handled accordingly.
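One way to avoid depending on fixed positions is to scan every inner instruction and keep only the SPL Token transfers, wherever they occur. The sketch below assumes the transaction was fetched with the jsonParsed encoding, where parsed token instructions carry a "parsed" object with "type" and "info" fields; it is an illustration of the idea rather than the logic in our refined query.

```python
TRANSFER_TYPES = {"transfer", "transferChecked"}


def token_transfers(meta: dict) -> list[dict]:
    """Collect SPL Token transfers from all inner instructions, in order."""
    transfers = []
    for group in meta.get("innerInstructions") or []:
        for position, ix in enumerate(group["instructions"]):
            parsed = ix.get("parsed")
            # Unparsed instructions lack "parsed"; some programs parse to a string.
            if ix.get("program") != "spl-token" or not isinstance(parsed, dict):
                continue
            if parsed.get("type") not in TRANSFER_TYPES:
                continue
            info = parsed["info"]
            # "transfer" carries amount directly; "transferChecked" nests it.
            if "amount" in info:
                amount = info["amount"]
            else:
                amount = info["tokenAmount"]["amount"]
            transfers.append({
                "outer_index": group["index"],  # top-level instruction index
                "position": position,           # 0-based slot within the group
                "source": info["source"],
                "destination": info["destination"],
                "amount": int(amount),
            })
    return transfers
```

Matching each transfer's source and destination against the pool and user token accounts then identifies the two swap legs, whether they sit in the 2nd and 3rd slots or the 6th and 7th.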

After identifying and accounting for these variations in how swap data appears, we refined our Flipside query to capture swap activity more accurately and handle edge cases appropriately. The refined query is linked under Example queries at the end of this post.

Conclusion

Through this black-box approach, we were able to track the usage of a closed-source Solana program — identifying swap transactions, labeling account roles, and handling irregular behaviors. Despite not having source code or an IDL, we reconstructed meaningful insights purely from on-chain data.

It's important to note that this methodology depends on studying a large number of example transactions. It doesn't guarantee a perfect or complete reconstruction of the program’s internal logic — nor does it fully reveal all possible execution branches. However, it excels at categorizing the bulk of real-world activity. By surfacing how users interact with the contract in practice, this approach provides strong insight into how the contract — and the different functions within it — are being used.

This methodology is generalizable. With the right tools and process, others in the Solana ecosystem can perform similar usage tracking — whether for research, security, analytics, or transparency purposes.

Example queries

https://flipsidecrypto.xyz/studio/queries/f1ff38c8-218c-417d-8fa8-cbcf17f6f85c
https://flipsidecrypto.xyz/studio/queries/e3359ba2-1525-4387-b798-61b7769789d2
https://flipsidecrypto.xyz/studio/queries/56a91435-86f7-449d-9b36-d6162c42c93d
https://flipsidecrypto.xyz/studio/queries/b69fbb22-8ee5-458a-8941-d07d8841fabc

Pine’s Substack
