---
license: apache-2.0
pipeline_tag: tabular-regression
---

# TabPFNMix Regressor

TabPFNMix regressor is a tabular foundation model that is pre-trained on purely synthetic datasets sampled from a mix of random regressors.

## Architecture

TabPFNMix is based on a 12-layer encoder-decoder Transformer with 37M parameters. It is pre-trained with a strategy that incorporates in-context learning, similar to the one used by TabPFN and TabForestPFN.
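
To give a rough intuition for in-context learning on tabular data, the toy sketch below predicts targets for query rows directly from labeled context rows in a single pass, with no gradient updates. It is not the TabPFNMix architecture: it uses one softmax attention step over the context rows (essentially kernel regression) purely for illustration, and `attention_regress` and `temperature` are hypothetical names introduced here.

```python
import math


def attention_regress(context_x, context_y, query_x, temperature=1.0):
    """Predict one target per query row as a softmax-weighted average of
    the context targets (a toy stand-in for in-context learning)."""
    predictions = []
    for q in query_x:
        # Similarity score: negative squared Euclidean distance to each context row.
        scores = [
            -sum((a - b) ** 2 for a, b in zip(q, x)) / temperature
            for x in context_x
        ]
        # Numerically stable softmax over the context rows.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        predictions.append(sum(w * y for w, y in zip(weights, context_y)) / total)
    return predictions


if __name__ == "__main__":
    # Labeled context rows play the role of the "training set" given in context.
    context_x = [[0.0], [1.0], [2.0]]
    context_y = [0.0, 10.0, 20.0]
    print(attention_regress(context_x, context_y, [[1.0]], temperature=0.1))
```

A real model learns its attention weights during pre-training; here the similarity function is fixed by hand to keep the example self-contained.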

## Usage

To use TabPFNMix regressor, install AutoGluon by running:

```sh
pip install autogluon
```

A minimal example showing how to fine-tune TabPFNMix regressor and run inference with it:

```python
import pandas as pd

from autogluon.tabular import TabularPredictor


if __name__ == '__main__':
    # Load the training data and optionally subsample it.
    train_data = pd.read_csv('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
    subsample_size = 5000
    if subsample_size is not None and subsample_size < len(train_data):
        train_data = train_data.sample(n=subsample_size, random_state=0)
    test_data = pd.read_csv('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')

    # TabPFNMix settings: pre-trained checkpoints plus fine-tuning options.
    tabpfnmix_default = {
        "model_path_classifier": "autogluon/tabpfn-mix-1.0-classifier",
        "model_path_regressor": "autogluon/tabpfn-mix-1.0-regressor",
        "n_ensembles": 1,
        "max_epochs": 30,
    }

    # Ask AutoGluon to fit TabPFNMix with the settings above.
    hyperparameters = {
        "TABPFNMIX": [
            tabpfnmix_default,
        ],
    }

    label = "age"
    problem_type = "regression"

    predictor = TabularPredictor(
        label=label,
        problem_type=problem_type,
    )
    predictor = predictor.fit(
        train_data=train_data,
        hyperparameters=hyperparameters,
        verbosity=3,
    )

    # Score the fitted models on the test data and print the results.
    predictor.leaderboard(test_data, display=True)
```
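
The example above only prints a leaderboard. To work with the predictions directly, one option is to call `predictor.predict` on the test features and score the result against the true labels. The sketch below is a small, dependency-free RMSE helper; `rmse` is a hypothetical name introduced here, and the commented usage assumes the `predictor`, `test_data`, and `label` variables from the example above.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences of numbers."""
    y_true, y_pred = list(y_true), list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    squared_errors = ((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sum(squared_errors) / len(y_true))


# Assumed usage with the objects defined in the example above:
# y_pred = predictor.predict(test_data.drop(columns=[label]))
# print(rmse(test_data[label], y_pred))
```

AutoGluon also provides `predictor.evaluate(test_data)`, which reports the predictor's configured evaluation metric without a hand-written helper.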

## Citation

If you find TabPFNMix useful for your research, please consider citing the associated papers:

```
@article{erickson2020autogluon,
  title={Autogluon-tabular: Robust and accurate automl for structured data},
  author={Erickson, Nick and Mueller, Jonas and Shirkov, Alexander and Zhang, Hang and Larroy, Pedro and Li, Mu and Smola, Alexander},
  journal={arXiv preprint arXiv:2003.06505},
  year={2020}
}

@article{hollmann2022tabpfn,
  title={Tabpfn: A transformer that solves small tabular classification problems in a second},
  author={Hollmann, Noah and M{\"u}ller, Samuel and Eggensperger, Katharina and Hutter, Frank},
  journal={arXiv preprint arXiv:2207.01848},
  year={2022}
}

@article{breejen2024context,
  title={Why In-Context Learning Transformers are Tabular Data Classifiers},
  author={Breejen, Felix den and Bae, Sangmin and Cha, Stephen and Yun, Se-Young},
  journal={arXiv preprint arXiv:2405.13396},
  year={2024}
}
```

## License

This project is licensed under the Apache-2.0 License.