---
license: apache-2.0
pipeline_tag: tabular-regression
---

# TabPFNMix Regressor

The TabPFNMix regressor is a tabular foundation model pre-trained purely on synthetic datasets sampled from a mix of random regressors.

## Architecture

TabPFNMix is based on a 12-layer encoder-decoder Transformer with 37M parameters. We pre-train it with a strategy that incorporates in-context learning, similar to that used by TabPFN and TabForestPFN.
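
To give a rough intuition for in-context prediction, the toy sketch below predicts a target for a query row directly from labelled context rows, with no training step. It is only an illustration and not the TabPFNMix architecture: the function `in_context_predict`, its `temperature` parameter, and the distance-based attention scores are all assumptions made for this example. A real model learns how to attend during pre-training.

```python
import math


def in_context_predict(context_x, context_y, query_x, temperature=1.0):
    """Predict a target for query_x from labelled context rows, without any training.

    Toy stand-in for in-context learning (hypothetical, not TabPFNMix itself):
    a single softmax-attention step in which the query attends to the context
    rows, scored by negative squared Euclidean distance.
    """
    # Attention score of the query against each context row.
    scores = [
        -sum((q - c) ** 2 for q, c in zip(query_x, row)) / temperature
        for row in context_x
    ]
    # Numerically stable softmax over the scores.
    max_score = max(scores)
    weights = [math.exp(s - max_score) for s in scores]
    total = sum(weights)
    # The prediction is the attention-weighted average of the context targets.
    return sum(w * y for w, y in zip(weights, context_y)) / total


# Context rows follow y = 2 * x; the query sits midway between two of them.
context_x = [[0.0], [1.0], [2.0], [3.0]]
context_y = [0.0, 2.0, 4.0, 6.0]
prediction = in_context_predict(context_x, context_y, [1.5])
print(prediction)
```

The point of the sketch is that the labelled rows are passed as input at prediction time, so changing the context changes the prediction without updating any weights.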

## Usage

To use the TabPFNMix regressor, install AutoGluon:

```sh
pip install autogluon
```

The following minimal example fine-tunes the TabPFNMix regressor and runs inference on a test set:

```python
import pandas as pd

from autogluon.tabular import TabularPredictor


if __name__ == '__main__':
    # Load the training data and subsample it to keep fine-tuning fast.
    train_data = pd.read_csv('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
    subsample_size = 5000
    if subsample_size is not None and subsample_size < len(train_data):
        train_data = train_data.sample(n=subsample_size, random_state=0)
    test_data = pd.read_csv('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')

    # TabPFNMix settings: pre-trained checkpoints, ensemble size, and epoch budget.
    tabpfnmix_default = {
        "model_path_classifier": "autogluon/tabpfn-mix-1.0-classifier",
        "model_path_regressor": "autogluon/tabpfn-mix-1.0-regressor",
        "n_ensembles": 1,
        "max_epochs": 30,
    }

    # Train only TabPFNMix, using the single configuration above.
    hyperparameters = {
        "TABPFNMIX": [
            tabpfnmix_default,
        ],
    }

    # Predict the numeric "age" column as a regression target.
    label = "age"
    problem_type = "regression"

    predictor = TabularPredictor(
        label=label,
        problem_type=problem_type,
    )
    predictor = predictor.fit(
        train_data=train_data,
        hyperparameters=hyperparameters,
        verbosity=3,
    )

    # Score the fitted model on the held-out test data and print the results.
    predictor.leaderboard(test_data, display=True)
```
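
The `"TABPFNMIX"` entry above is a list, so it can hold more than one configuration. As a hedged sketch, the helper below builds such a dictionary with one configuration per epoch budget. The function `make_tabpfnmix_hyperparameters` is hypothetical and not part of AutoGluon, and the epoch values `10` and `30` are illustrative only; the dictionary keys are the ones used in the example above.

```python
def make_tabpfnmix_hyperparameters(max_epochs_options, n_ensembles=1):
    """Build a hyperparameters dict with one TabPFNMix config per epoch budget.

    Hypothetical convenience helper; it only assembles plain dictionaries
    using the keys shown in this README.
    """
    configs = []
    for max_epochs in max_epochs_options:
        configs.append({
            "model_path_classifier": "autogluon/tabpfn-mix-1.0-classifier",
            "model_path_regressor": "autogluon/tabpfn-mix-1.0-regressor",
            "n_ensembles": n_ensembles,
            "max_epochs": max_epochs,
        })
    return {"TABPFNMIX": configs}


# Two configurations that differ only in their epoch budget (illustrative values).
hyperparameters = make_tabpfnmix_hyperparameters([10, 30])
print(hyperparameters)
```

The resulting dictionary can be passed as the `hyperparameters` argument of `predictor.fit` in place of the one defined above.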

## Citation

If you find TabPFNMix useful for your research, please consider citing the associated papers:

```
@article{erickson2020autogluon,
  title={{AutoGluon-Tabular}: Robust and Accurate {AutoML} for Structured Data},
  author={Erickson, Nick and Mueller, Jonas and Shirkov, Alexander and Zhang, Hang and Larroy, Pedro and Li, Mu and Smola, Alexander},
  journal={arXiv preprint arXiv:2003.06505},
  year={2020}
}

@article{hollmann2022tabpfn,
  title={{TabPFN}: A Transformer That Solves Small Tabular Classification Problems in a Second},
  author={Hollmann, Noah and M{\"u}ller, Samuel and Eggensperger, Katharina and Hutter, Frank},
  journal={arXiv preprint arXiv:2207.01848},
  year={2022}
}

@article{breejen2024context,
  title={Why In-Context Learning Transformers are Tabular Data Classifiers},
  author={Breejen, Felix den and Bae, Sangmin and Cha, Stephen and Yun, Se-Young},
  journal={arXiv preprint arXiv:2405.13396},
  year={2024}
}
```

## License

This project is licensed under the Apache-2.0 License.