Set Shaping Theory
Structure is more important than input size
Help us improve this research site
Set Shaping Theory can be applied to many research fields and has significant potential. To make the simulations as meaningful as possible, we need experts in the relevant research fields who can help define the most relevant problems, methods, metrics, and practical interpretations.
Our simulations indicate that a larger but more structured input can improve performance. In other words, input structure matters more than input size.
The general research methodology is as follows:
- Select a problem and its most widely used solution method.
- Define a metric or function to measure the method's performance.
- Apply SST to evaluate whether this performance improves on average.
- Consult experts in the relevant research fields to determine whether the improvement is practically significant.
If SST provides an improvement, it can be applied directly to the input without changing the underlying method itself. Set Shaping Theory is not in competition with existing theories or algorithms: it aims to improve results by producing more structured inputs.
As computers become increasingly powerful, the calculations required for this approach will become more accessible, making this type of application progressively more widespread.
In the future, we expect problems to be solved using only expanded, more structured inputs.
Build the correspondence table
Reported values: source sequences, expanded sequences, mappings created, status.
Preview
Ordered correspondences
Compute a table to see the results.
| Rank | Sequence \(N\) | \(N H_0\) | Sequence \(N+K\) | \((N+K)H_0\) |
|---|---|---|---|---|
| No result has been computed yet. | | | | |
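The columns of the table above suggest how an ordered correspondence can be built. The following Python sketch is one possible reading, not the site's implementation: it enumerates all sequences of length \(N\) and \(N+K\) over an alphabet of size \(A\), sorts both sets by their score (\(N H_0\) and \((N+K)H_0\)), and pairs them rank by rank. The function names `coded_length` and `correspondence_table` and the tie-breaking rule (lexicographic order) are assumptions made for this example.

```python
from collections import Counter
from itertools import product
from math import log2


def coded_length(seq):
    """Return len(seq) * H_0(seq) in bits, from empirical symbol frequencies."""
    n = len(seq)
    # Sorting the counts makes the floating-point sum identical for
    # sequences that share the same frequency profile.
    return sum(c * log2(n / c) for c in sorted(Counter(seq).values()))


def correspondence_table(alphabet_size, n, k):
    """Pair each length-n sequence with a low-score length-(n + k) sequence.

    Assumption: ties in the score are broken by lexicographic order.
    """
    symbols = range(alphabet_size)
    source = sorted(product(symbols, repeat=n),
                    key=lambda s: (coded_length(s), s))
    expanded = sorted(product(symbols, repeat=n + k),
                      key=lambda s: (coded_length(s), s))
    # Keep only as many expanded sequences as there are source sequences.
    return list(zip(source, expanded[:len(source)]))


if __name__ == "__main__":
    for rank, (s, f) in enumerate(correspondence_table(3, 3, 1), start=1):
        print(rank, s, round(coded_length(s), 3), f, round(coded_length(f), 3))
```

The enumeration grows as \(A^{N+K}\), so this sketch is only practical for very short sequences. The two score columns can be averaged to compare the source set with the selected expanded subset.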
Recommended articles
- S. Kozlov, "Introduction to set shaping theory," arXiv:2111.08369, 2021. https://arxiv.org/abs/2111.08369
- S. Kozlov, "Use of set shaping theory in the development of locally testable codes," arXiv:2202.13152, 2022. https://arxiv.org/abs/2202.13152
- PowerPoint presentation, "Set Shaping Theory." https://www.academia.edu/61997612/Set_Shaping_Theory_the_future_of_information_theory_
Application of Set Shaping Theory to long sequences
Generate \(H\) uniformly random source sequences of length \(N\) over an alphabet of \(A\) symbols, apply each of the \(A^K\) deterministic length-\(N\) modular modifiers to every sequence, and keep the transformed sequence with the lowest empirical entropy score.
The comparison uses \(N H_0(s)\) and \((N+K)H_0(f(s))\), where \(H_0\) is the empirical entropy computed from the symbol frequencies in the sequence. With the approximate method the gain is visible when \(A \ge 5\); with correspondence tables it is already visible when \(A \ge 3\).
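A minimal Python sketch of this comparison follows. The score `coded_length` is \(N H_0\) computed from symbol frequencies, as described above. The modifier family is an assumption made purely for illustration: each of the \(A^K\) keys is repeated cyclically, added modulo \(A\) to the source, and prepended to the result so that the transform can be inverted. It is not the entropic-transform algorithm used by this page, and the names `best_transform` and `invert` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2


def coded_length(seq):
    """Return len(seq) * H_0(seq) in bits, from empirical symbol frequencies."""
    n = len(seq)
    return sum(c * log2(n / c) for c in Counter(seq).values())


def best_transform(seq, alphabet_size, k):
    """Try all A**K keys and return (score, candidate) with the lowest score.

    Assumed modifier: body[i] = (seq[i] + key[i % k]) mod A, with the key
    prepended, so the candidate has length N + K. Requires k >= 1.
    """
    best = None
    for key in product(range(alphabet_size), repeat=k):
        body = [(x + key[i % k]) % alphabet_size for i, x in enumerate(seq)]
        candidate = list(key) + body
        score = coded_length(candidate)
        if best is None or score < best[0]:
            best = (score, candidate)
    return best


def invert(candidate, alphabet_size, k):
    """Recover the source sequence from a candidate built by best_transform."""
    key, body = candidate[:k], candidate[k:]
    return [(y - key[i % k]) % alphabet_size for i, y in enumerate(body)]


if __name__ == "__main__":
    source = [0, 1, 0, 1, 0, 1]
    score, candidate = best_transform(source, alphabet_size=3, k=2)
    print(coded_length(source), score, candidate)
    print(invert(candidate, 3, 2) == source)
```

For the periodic sequence in the usage example, the best candidate scores below the 6 bits of the source even though it is two symbols longer. That sequence is deliberately structured, so it says nothing about the average gain on uniformly random inputs, which is what the simulation measures.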
Reported values: random sequences, transforms per sequence, average gain (bits), status.
Results
Average score comparison
Run the approximation to see the average values.
This program uses the algorithm developed by Glen Tankersley to perform the transformation: https://github.com/gotankersley/entropic-transform.
Recommended articles
- Lang, Mai and Lewis, Logan, "Beyond Shannon: Geometric Transformations and the Foundations of Set Shaping Theory," November 21, 2025. https://ssrn.com/abstract=5781442
- Biereagu, Sochima. 2023. "Introducing the Role of Shaping Order K in Set Shaping Theory." AfricArXiv. October 12. doi:10.31730/osf.io/ywmsr. https://africarxiv.ubuntunet.net/items/bca53a72-0fd8-477f-bb5e-19059847c2d4
- A. Koch and A. Petit, "Set Shaping Theory and the Foundations of RedundancyFree Testable Codes," arXiv:2507.03444, 2025. https://arxiv.org/abs/2507.03444
Set Shaping Theory applied to graphs using the correspondence table method
Enumerate all simple labeled graphs with n nodes and m edges, then enumerate the expanded set with n+k nodes and the same number of edges. The program computes \(H_d\), \(H_{\mathrm{gap}}\), and \(S_J\) for every graph.
The selected subset contains the same number of graphs as the original set. For \(H_d\) and \(H_{\mathrm{gap}}\) it is chosen from the expanded set by minimum entropy; for similarity it is chosen by maximum similarity.
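The enumeration and the minimum-entropy selection can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the site's code: \(p_d(q)\) is taken to be the fraction of nodes with degree \(q\), ties are broken by the lexicographic order of the edge tuples, and the names `degree_entropy`, `all_graphs`, and `select_expanded` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def degree_entropy(num_nodes, edges):
    """H_d in bits, assuming p_d(q) is the fraction of nodes with degree q."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree).values()
    return sum((c / num_nodes) * log2(num_nodes / c) for c in counts)


def all_graphs(num_nodes, num_edges):
    """Enumerate every simple labeled graph as a tuple of (u, v) edges, u < v."""
    pairs = list(combinations(range(num_nodes), 2))
    return list(combinations(pairs, num_edges))


def select_expanded(n, m, k):
    """Return the original set and an equally sized low-H_d expanded subset."""
    original = all_graphs(n, m)
    # Rounding keeps ties stable despite floating-point summation order.
    expanded = sorted(all_graphs(n + k, m),
                      key=lambda g: (round(degree_entropy(n + k, g), 12), g))
    return original, expanded[:len(original)]


if __name__ == "__main__":
    n, m, k = 4, 3, 1
    original, selected = select_expanded(n, m, k)
    mean_original = sum(n * degree_entropy(n, g) for g in original) / len(original)
    mean_selected = sum((n + k) * degree_entropy(n + k, g) for g in selected) / len(selected)
    print(len(original), mean_original, mean_selected)
```

The number of graphs is \(\binom{\binom{n+k}{2}}{m}\), so exhaustive enumeration is feasible only for small \(n\), \(m\), and \(k\).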
Reported values: original graphs, expanded graphs, score difference \(H_d\), status.
Results
Graph metric comparison
Run the simulation to compare the two graph sets with \(H_d\), \(H_{\mathrm{gap}}\), and \(S_J\).
Degree entropy. \[ H_d = -\sum_{q \ge 0} p_d(q)\log_2 p_d(q) \]
Adjacency-gap entropy. \[ H_{\mathrm{gap}} = -\sum_{g \ge 1} p_{\mathrm{gap}}(g)\log_2 p_{\mathrm{gap}}(g) \]
Similarity. \[ S_J = \frac{1}{\binom{n}{2}} \sum_{1 \le i < j \le n} \frac{\left|N(i)\cap N(j)\right|}{\left|N(i)\cup N(j)\right|} \]
Scores. \[ \operatorname{mean}(nH_d),\quad \operatorname{mean}((n+k)H_d),\quad \operatorname{mean}(nH_{\mathrm{gap}}),\quad \operatorname{mean}((n+k)H_{\mathrm{gap}}),\quad \operatorname{mean}(S_J) \]
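As an illustration of the similarity formula, the sketch below computes \(S_J\) for a graph given as an edge list with zero-based node indexes. The formula does not say how to treat a pair of nodes that are both isolated (the ratio is 0/0); this sketch assumes such a pair contributes 0. The function name `jaccard_similarity` is hypothetical.

```python
from itertools import combinations


def jaccard_similarity(num_nodes, edges):
    """Mean Jaccard index of the neighbor sets over all unordered node pairs."""
    neighbors = [set() for _ in range(num_nodes)]
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    total = 0.0
    pairs = 0
    for i, j in combinations(range(num_nodes), 2):
        union = neighbors[i] | neighbors[j]
        if union:  # assumption: a 0/0 pair contributes 0
            total += len(neighbors[i] & neighbors[j]) / len(union)
        pairs += 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    # Path 0 - 1 - 2: only the two end nodes share a neighbor.
    print(jaccard_similarity(3, [(0, 1), (1, 2)]))
```

In the path example, nodes 0 and 2 have the identical neighbor set {1}, and the other two pairs share no neighbor, so one of the three pairs contributes 1 and the result is 1/3.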
Recommended articles
- Petit, Alix and Vdberg, Adrain and Schmidt, Christian, "Set Shaping Theory Applied to Graphs: Reducing Average Graph Entropy via Dimensional Expansion," March 07, 2026. https://ssrn.com/abstract=6364578
Application of Set Shaping Theory to graphs using an approximate method
Generate \(H\) random simple graphs with \(N_d\) nodes and \(A_r\) edges, or read a graph from file. For each graph, the SST transformation expands the graph to \(N_d+K\) nodes.
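The random input graphs can be produced as in the following Python sketch, which draws a simple graph uniformly among those with exactly \(N_d\) nodes and \(A_r\) edges by sampling edges without replacement. The function name `random_simple_graph` and the `seed` parameter are assumptions for this example, and the SST expansion to \(N_d+K\) nodes is not shown.

```python
import random
from itertools import combinations


def random_simple_graph(num_nodes, num_edges, seed=None):
    """Return a sorted list of distinct (u, v) edges with u < v."""
    pairs = list(combinations(range(num_nodes), 2))
    if num_edges > len(pairs):
        raise ValueError("too many edges for a simple graph on this node set")
    rng = random.Random(seed)  # a local generator keeps runs reproducible
    return sorted(rng.sample(pairs, num_edges))


if __name__ == "__main__":
    print(random_simple_graph(num_nodes=6, num_edges=5, seed=1))
```

Sampling from the list of all node pairs rules out self-loops and repeated edges by construction, at the cost of materializing \(\binom{N_d}{2}\) pairs.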
Reported values: input graphs, transforms per graph, average \(H_d\) gain, status.
Results
Run the simulation to compare the original random graphs with the SST-transformed graphs.
Degree entropy. \[ H_d = -\sum_{q \ge 0} p_d(q)\log_2 p_d(q) \]
Adjacency-gap entropy. \[ H_{\mathrm{gap}} = -\sum_{g \ge 1} p_{\mathrm{gap}}(g)\log_2 p_{\mathrm{gap}}(g) \]
Similarity. \[ S_J = \frac{1}{\binom{N_d}{2}} \sum_{1 \le i < j \le N_d} \frac{\left|N(i)\cap N(j)\right|}{\left|N(i)\cup N(j)\right|} \]
Accepted graph file format
Use a text edge list. Each non-empty line contains one undirected edge as two node indexes, separated by spaces, commas, or semicolons. Comments start with #. Node indexes may be zero-based or one-based. The optional line nodes: 20 can be used to include isolated nodes.
Example:
nodes: 5
1 2
1 3
2 4
4 5
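A reader for this format could look like the Python sketch below. It is not the site's parser. Two points are assumptions, because the format description leaves them open: indexes are treated as one-based unless some edge uses index 0, and self-loops are skipped because the graphs are simple. The function name `parse_edge_list` is hypothetical.

```python
import re


def parse_edge_list(text):
    """Parse an edge list and return (num_nodes, sorted zero-based edges)."""
    declared_nodes = 0
    raw_edges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        if line.lower().startswith("nodes:"):
            declared_nodes = int(line.split(":", 1)[1])
            continue
        parts = [p for p in re.split(r"[\s,;]+", line) if p]
        if len(parts) != 2:
            raise ValueError(f"expected two node indexes, got: {line!r}")
        raw_edges.append((int(parts[0]), int(parts[1])))
    # Assumption: the file is one-based unless index 0 appears in an edge.
    offset = 0 if any(0 in edge for edge in raw_edges) else 1
    edges = set()
    for u, v in raw_edges:
        u, v = u - offset, v - offset
        if u != v:  # assumption: self-loops are ignored
            edges.add((min(u, v), max(u, v)))
    highest = max((v for _, v in edges), default=-1)
    return max(declared_nodes, highest + 1), sorted(edges)


if __name__ == "__main__":
    sample = "nodes: 5\n1 2\n1 3\n2 4\n4 5\n"
    print(parse_edge_list(sample))
```

Applied to the example file above, the sketch returns 5 nodes and the zero-based edges (0, 1), (0, 2), (1, 3), (3, 4).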
Loaded graph
Graph read from file and transformed graphs
The loaded graph will appear here after running the file mode.
This program uses the algorithm developed by Glen Tankersley to perform the transformation: https://github.com/gotankersley/entropic-transform.
Database Index Simulation
The main reason to apply Set Shaping Theory here is to shift part of the workload from RAM accesses to CPU computation. This page simulates the structural behavior of dense hash indexes and measures how SST changes the number of collisions per stored record and the maximum cluster length.
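The two structural indicators can be illustrated with the Python sketch below, which measures them on a plain linear-probing table. The page does not specify the index layout, so everything here is an assumption: integer keys, a home slot equal to the key modulo the table size, a collision counted whenever the home slot is already occupied at insertion time, and a cluster defined as a circular run of occupied slots. The names `linear_probe_stats` and `max_cluster` are hypothetical, and the SST transformation itself is not shown.

```python
def max_cluster(occupied):
    """Length of the longest circular run of occupied slots."""
    size = len(occupied)
    if all(occupied):
        return size
    start = occupied.index(False)  # begin right after an empty slot
    longest = run = 0
    for step in range(1, size):
        if occupied[(start + step) % size]:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest


def linear_probe_stats(keys, table_size):
    """Return (collisions per stored record, maximum cluster length)."""
    keys = list(keys)
    if len(keys) > table_size:
        raise ValueError("more keys than slots")
    occupied = [False] * table_size
    collisions = 0
    for key in keys:
        slot = key % table_size  # assumed home slot
        if occupied[slot]:
            collisions += 1
        while occupied[slot]:  # linear probing to the next free slot
            slot = (slot + 1) % table_size
        occupied[slot] = True
    rate = collisions / len(keys) if keys else 0.0
    return rate, max_cluster(occupied)


if __name__ == "__main__":
    print(linear_probe_stats([0, 8, 16, 3], table_size=8))
```

Running the same measurement on a key stream before and after a transformation gives the kind of collision and cluster comparison listed in the results table.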
Reported values: \(K\), load factor, query mode, \(Q/N\), status.
Results
Structural indicators
Run the simulation to compute the structural comparison.
| Configuration | Metadata bits | Collisions/record | Max cluster | Collision reduction | Cluster reduction |
|---|---|---|---|---|---|
| No result has been computed yet. | | | | | |
This program uses the algorithm developed by Glen Tankersley to perform the transformation: https://github.com/gotankersley/entropic-transform.
Recommended articles
- Alix Petit, Mai Lang, Logan Lewis, Lily Scott, and Agi Weber, "Using Set Shaping Theory to Trade RAM Accesses for CPU Computation," arXiv:2605.29700, 2026. https://arxiv.org/abs/2605.29700
Reduce \(D_{\mathrm{KL}}\) with Set Shaping Theory
Select a grayscale or color image, generate or load the binary sequence to embed, and compare the classical LSB insertion with the best SST transformed sequence. The inserted sequence has length \(N+K\).
The method computes the Kullback-Leibler divergence between the original pixel histogram P and the modified histogram Q after LSB embedding. The SST step searches for the transformed message that minimizes \(D_{\mathrm{KL}}(P\parallel Q)\).
Reported values: image size, inserted bits, best \(D_{\mathrm{KL}}\), status.
Results
Run the simulation to compare classical LSB embedding with SST.
Kullback-Leibler divergence. \[ D_{\mathrm{KL}}(P\parallel Q) = \sum_{x=0}^{255} P(x)\log_2\frac{P(x)}{Q(x)} \]
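The divergence and the search over candidate messages can be sketched in Python as follows, for a non-empty list of 8-bit pixel values. This is an illustration under assumptions, not the site's implementation: bits are embedded into the first pixels in order, bins where \(Q(x)=0\) are floored at a small `eps` to keep the sum finite, and the candidate messages are supplied by the caller rather than generated by an SST transform. The function names are hypothetical.

```python
from math import log2


def histogram(pixels):
    """Normalized 256-bin histogram of 8-bit pixel values."""
    counts = [0] * 256
    for p in pixels:
        counts[p] += 1
    return [c / len(pixels) for c in counts]


def lsb_embed(pixels, bits):
    """Write each bit into the least significant bit of the leading pixels."""
    if len(bits) > len(pixels):
        raise ValueError("message longer than the number of pixels")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit
    return stego


def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) in bits; eps is an assumed floor for empty Q bins."""
    return sum(px * log2(px / max(qx, eps)) for px, qx in zip(p, q) if px > 0)


def best_message(pixels, candidates):
    """Return the candidate bit sequence with the lowest D_KL(P || Q)."""
    p = histogram(pixels)
    return min(candidates,
               key=lambda bits: kl_divergence(p, histogram(lsb_embed(pixels, bits))))


if __name__ == "__main__":
    cover = [10, 10, 11, 11]
    print(best_message(cover, [[1, 1, 1, 1], [0, 1, 0, 1]]))
```

In the usage example the message 0101 leaves the histogram unchanged, so its divergence is zero and it is preferred over 1111, which empties the bin for value 10.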
This program uses the algorithm developed by Glen Tankersley to perform the transformation: https://github.com/gotankersley/entropic-transform.
Recommended articles
- Aida Koch, Logan Lewis, Lily Scott, and Agi Weber, "Set Shaping Theory as a Complementary Payload-Shaping Layer for Steganography," arXiv:2605.19885, 2026. https://arxiv.org/abs/2605.19885