---
license: apache-2.0
library_name: synthefy-nori
pipeline_tag: tabular-regression
tags:
- tabular
- tabular-regression
- tabular-foundation-model
- in-context-learning
- synthetic-data
- pytorch
---

<p align="center">
<img src="synthefy_nori_banner.png" alt="Nori" width="100%">
</p>

# Nori

Nori is a tabular foundation model for regression via in-context
learning (ICL). Given a few labeled rows as context, it predicts on new query rows in a
single forward pass, with no task-specific training or fine-tuning. The model is
trained entirely on synthetic data.

- Documentation: https://docs.synthefy.com/nori/
- Repository: https://github.com/Synthefy/synthefy-nori
- Library: `pip install synthefy-nori`
- Checkpoint: `nori.pt` (this repo)
- Parameters: ~5.9M
- License: Apache-2.0

## Results

Mean and median R² of the base model across 96 regression tasks from three
public benchmark suites (single H200, up to 50K context rows per dataset):

| Suite | Datasets | Mean R² | Median R² |
|-------|---------:|--------:|----------:|
| TabArena | 13 | 0.8117 | 0.8757 |
| TALENT | 72 | 0.7569 | 0.8802 |
| OpenML | 11 | 0.6373 | 0.5856 |
| Overall | 96 | 0.7506 | 0.8702 |
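
As a sanity check on how the Overall row relates to the per-suite rows, the dataset-weighted average of the three suite means can be recomputed from the table alone. This is a minimal sketch; the variable names are illustrative and not part of `synthefy-nori`. The overall median cannot be recovered this way, because medians do not combine across groups.

```python
# Per-suite (dataset count, mean R²) copied from the table above.
suites = {
    "TabArena": (13, 0.8117),
    "TALENT": (72, 0.7569),
    "OpenML": (11, 0.6373),
}

total = sum(n for n, _ in suites.values())
weighted_mean = sum(n * r2 for n, r2 in suites.values()) / total

print(total)                    # 96
print(round(weighted_mean, 4))  # 0.7506
```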

Large-N / long-context tables (common in TabArena) are the current focus of the
large-table training stages. These numbers are reproducible end-to-end with one
command; see [Reproducing these numbers](https://github.com/Synthefy/synthefy-nori#reproducing-these-numbers).

> Thinking is an inference-time reasoning extension that improves these
> numbers further. Details are forthcoming.

## Usage

```bash
pip install synthefy-nori
```

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from synthefy_nori import NoriRegressor

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = NoriRegressor()        # downloads these weights from the Hub on first use
model.fit(X_train, y_train)    # "fit" just stores the labeled rows as context
pred = model.predict(X_test)   # predictions in a single forward pass, no training
```

The model uses a GPU when one is available and falls back to CPU. A one-shot helper skips the
estimator object entirely:

```python
from synthefy_nori import predict
pred = predict(X_train, y_train, X_test, task="regression")
```

`predict` follows the `TabPFNRegressor.predict` contract: pass `output_type="mean"`
(default), `"median"`, or `"mode"` to choose the point estimate drawn from the model's
predictive distribution.
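
To make the three `output_type` options concrete, here is a minimal sketch of how a mean, median, and mode could be read off a grid of predicted quantiles for one query row. It assumes equally spaced quantile levels and uses a hypothetical helper `point_estimate`; it is not the library's implementation, and Nori's actual post-processing may differ.

```python
from statistics import fmean

def point_estimate(quantiles, output_type="mean"):
    """Hypothetical helper: reduce sorted, equally spaced quantile values to a point.

    quantiles: predicted values at levels 1/(K+1), ..., K/(K+1), in ascending order.
    """
    k = len(quantiles)
    if output_type == "mean":
        # Averaging equally spaced quantiles approximates the distribution mean.
        return fmean(quantiles)
    if output_type == "median":
        # The middle quantile (level 0.5 when K is odd).
        return quantiles[k // 2]
    if output_type == "mode":
        # Density is highest where adjacent quantiles are closest together.
        gaps = [quantiles[i + 1] - quantiles[i] for i in range(k - 1)]
        i = min(range(k - 1), key=gaps.__getitem__)
        return 0.5 * (quantiles[i] + quantiles[i + 1])
    raise ValueError(f"unknown output_type: {output_type!r}")

# Toy right-skewed distribution described by 9 quantiles.
q = [1.0, 1.2, 1.3, 1.5, 2.0, 3.0, 5.0, 9.0, 20.0]
mean, median, mode = (point_estimate(q, t) for t in ("mean", "median", "mode"))
```

For a right-skewed distribution like this toy example, the three estimates differ (mode < median < mean), which is why the choice matters for skewed targets.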

To run from a local checkpoint instead of the Hub default, pass a path:

```python
model = NoriRegressor(model_path="path/to/checkpoint.pt")
```

This checkpoint is public: the first inference call downloads and caches it
automatically, with no token and no access request. A Hugging Face token (read scope)
is only worth setting if you hit anonymous download rate limits; provide it via
`export HF_TOKEN=hf_...`, `hf auth login`, or `NoriRegressor(token="hf_...")`.

## How it works

### Architecture

Nori is a FeaturesTransformer (~5.9M parameters) that alternates two kinds of attention
and predicts in context:

- Feature attention learns relationships between columns.
- Sample attention learns relationships between rows (context and query).
- In-context learning: predictions condition on labeled context rows, with no
  gradient updates at inference.
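
The alternation between the two attention types can be sketched on a tensor of shape `(rows, feature groups, embed_dim)`. The NumPy code below is an illustrative assumption, not Nori's implementation: it uses single-head attention with no learned projections, and it restricts sample attention so that every row attends only to context rows, which is one common way to keep query rows from influencing each other.

```python
import numpy as np

def attend(q, k, v):
    """Scaled dot-product attention over the second-to-last axis."""
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def alternating_block(x, n_context):
    """x: (rows, groups, dim); the first n_context rows are labeled context."""
    # Feature attention: within each row, feature groups attend to each other.
    x = x + attend(x, x, x)
    # Sample attention: within each feature group, every row attends to context rows.
    xt = np.swapaxes(x, 0, 1)              # (groups, rows, dim)
    ctx = xt[:, :n_context]                # keys/values come from context only
    xt = xt + attend(xt, ctx, ctx)
    return np.swapaxes(xt, 0, 1)

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 4, 8))            # 7 context rows + 3 query rows
out = alternating_block(x, n_context=7)
print(out.shape)                           # (10, 4, 8)
```

With this masking, changing one query row leaves the outputs of every other row unchanged, which matches the idea that predictions condition on the labeled context only.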

Key config: 16 transformer layers, embed_dim 128, hidden 384, 2 heads, the v2-lite
block (SwiGLU + RMSNorm + pre-norm), features grouped in pairs (`features_per_group=2`),
with column-specific y-aware feature attention. Features are encoded with RBF
embeddings; missing values are handled natively via learned mask embeddings. The
regression head predicts a full distribution over 999 quantiles (pinball loss).
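
For reference, the pinball (quantile) loss named above can be written in a few lines. This is the standard textbook definition, sketched in plain Python; how Nori spaces its 999 quantile levels and reduces the loss across them is not stated here, so the equally spaced levels and the plain average in `mean_pinball_loss` are assumptions.

```python
def pinball_loss(y_true, y_pred, tau):
    """Pinball loss for one prediction at quantile level tau in (0, 1)."""
    diff = y_true - y_pred
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1.0) * diff

def mean_pinball_loss(y_true, quantile_preds):
    """Average pinball loss over K predicted quantiles (assumed levels k/(K+1))."""
    k = len(quantile_preds)
    levels = [(i + 1) / (k + 1) for i in range(k)]
    losses = [pinball_loss(y_true, q, t) for q, t in zip(quantile_preds, levels)]
    return sum(losses) / k

print(pinball_loss(3.0, 2.0, 0.75))   # 0.75: under-predicting a high quantile is costly
print(pinball_loss(3.0, 4.0, 0.75))   # 0.25: over-predicting it is cheaper
```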

### Synthetic data

The model never sees real data during training. Its capability comes from a diverse
synthetic data generator covering real-world tabular regimes:

- Structural Causal Models (SCM): hierarchical DAGs with 8 edge-function types
  (MLP, decision tree, piecewise-linear, polynomial, periodic, RBF, log/exp, conv1d).
- Regression priors: 9 target families (dense/sparse linear, GAM, interactions,
  random MLP, random tree, radial/RBF, Fourier features, chained trigonometric).
- Realism augmentations: discretized features, noise features, correlated blocks,
  structural missingness, label noise.
- Learnability filter: an ExtraTrees signal-quality filter rejects unlearnable
  datasets so training compute is spent on learnable tasks.
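
As a rough illustration of the SCM idea, the sketch below samples a small random DAG, assigns each edge a randomly chosen function, and propagates noise through the graph to produce one synthetic regression table. It is a toy under stated assumptions, not Nori's generator: the three edge functions, the flat (non-hierarchical) DAG, and all names such as `sample_scm_table` are hypothetical.

```python
import math
import random

# A small, illustrative subset of edge-function types.
EDGE_FUNCS = [
    lambda v: 0.8 * v,               # linear
    lambda v: math.sin(2.0 * v),     # periodic
    lambda v: v * v - 1.0,           # polynomial
]

def sample_scm_table(n_rows, n_nodes, edge_prob=0.5, noise=0.1, seed=0):
    """Return (X, y): the last node is the target, the rest are features."""
    rng = random.Random(seed)
    # Only edges j -> i with j < i are allowed, so the graph is acyclic.
    parents = {
        i: [(j, rng.choice(EDGE_FUNCS)) for j in range(i) if rng.random() < edge_prob]
        for i in range(n_nodes)
    }
    X, y = [], []
    for _ in range(n_rows):
        values = []
        for i in range(n_nodes):
            # Each node is the sum of its transformed parents plus Gaussian noise.
            signal = sum(f(values[j]) for j, f in parents[i])
            scale = noise if parents[i] else 1.0   # root nodes are pure noise
            values.append(signal + rng.gauss(0.0, scale))
        X.append(values[:-1])
        y.append(values[-1])
    return X, y

X, y = sample_scm_table(n_rows=100, n_nodes=6)
print(len(X), len(X[0]), len(y))     # 100 5 100
```

A learnability filter like the one described above would then score each `(X, y)` pair and discard tables from which no signal can be recovered.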

Training uses only synthetic data and runs to completion. There is no
real-data validation in the loop, so no benchmark data is needed to train and no eval
signal influences checkpoint selection. See the
[training guide](https://github.com/Synthefy/synthefy-nori/blob/main/docs/training.md)
for the full curriculum recipe.

## Intended use & limitations

- Intended for small-to-medium tabular regression where in-context learning is
  attractive (no per-task training).
- Limitations: the current gap vs the best baselines is on large-N / long-context
  TabArena datasets; dense O(N²) sample attention bounds practical context size. Very large
  tables are the focus of the large-table training stages.
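
To see why dense sample attention bounds context size, a back-of-envelope estimate helps. The sketch below counts the bytes needed to hold one N×N attention-score matrix per head. It is an illustrative estimate under stated assumptions (scores fully materialized, 2 bytes per value, one feature group, one layer at a time); fused attention kernels avoid storing this matrix, and Nori's actual memory profile is not documented here. The helper name `attention_score_bytes` is hypothetical.

```python
def attention_score_bytes(n_rows, n_heads=2, bytes_per_value=2):
    """Bytes for one dense (n_rows x n_rows) score matrix per head."""
    return n_rows * n_rows * n_heads * bytes_per_value

for n in (1_000, 10_000, 50_000):
    gib = attention_score_bytes(n) / 2**30
    print(f"{n:>6} rows -> {gib:8.3f} GiB")
```

Because the cost is quadratic, growing the context 5× (from 10K to 50K rows) multiplies this term by 25.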

## Citation

```bibtex
@software{synthefy_nori_2026,
  title = {Nori: A Tabular Foundation Model Trained on Synthetic Data},
  author = {Synthefy},
  year = {2026},
  url = {https://github.com/Synthefy/synthefy-nori}
}
```

## License

Apache-2.0. See
[LICENSE](https://github.com/Synthefy/synthefy-nori/blob/main/LICENSE) and
[NOTICE](https://github.com/Synthefy/synthefy-nori/blob/main/NOTICE).