This page documents each Schema Model: its intended use, training, evaluation, and known limitations. It is published in support of customers' compliance obligations as deployers under the EU AI Act and analogous frameworks.
The factual content for each card is drawn from the corresponding technical paper, which remains the authoritative source for benchmark methodology and detailed results. As new Schema Models are released, additional cards will be appended to this page.
Capitalised terms used on this page (including "Schema Models," "Base Model," "Fine-Tuned Checkpoint," and "Customer Data") have the meanings set forth in the Schema Model License and the Data Processing Agreement.
## 0. Documented models
| Model | Released | Card version | Status |
|---|---|---|---|
| Schema-1 | April 2026 | 1.1 | Current |
## 1. Schema-1
### 1.1 Model details
| Name | Schema-1 |
|---|---|
| Provider | SchemaLabs, Inc., a Delaware corporation |
| Version | 1.0 |
| Released | April 2026 |
| Model class | Data Language Model (DLM) |
| Modality | Tabular (structured) data |
| Parameters | ~140 million |
| Architecture | Foundation model (pretrained neural network) |
| Distribution | Hosted-only via API and Web App; weights not distributed |
| License | Proprietary |
| Service status | Beta |
### 1.2 Intended use
Primary uses: tabular inference (classification, prediction); customer-specific fine-tuning to produce dedicated Fine-Tuned Checkpoints (Customer Endpoints); integration as the foundation layer for vertical and agentic AI built on structured data.
Primary users: engineering and data science teams building AI products on structured data.
Out of scope: generative natural language tasks; image, audio, or video; safety-critical real-time decisions without human review; high-risk EU AI Act applications without human oversight; any use prohibited by our Use Policy.
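As an illustration of the primary inference use case, a classification request against a Customer Endpoint might be packaged as below. This is a hypothetical sketch: the payload shape, field names, and model identifier are assumptions, not the actual API; consult the API reference for the real interface.

```python
# Hypothetical request-building helper for a hosted tabular-inference
# endpoint. All names here ("schema-1", the payload fields) are
# illustrative assumptions, not documented API surface.
import json

def build_inference_request(rows, target_column):
    """Package a batch of records for a classification/prediction call."""
    return json.dumps({
        "model": "schema-1",       # assumed model identifier
        "target": target_column,   # column whose value should be predicted
        "rows": rows,              # list of {column: value} records
    })

payload = build_inference_request(
    rows=[{"age": 42, "plan": "pro", "churned": None}],
    target_column="churned",
)
```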
### 1.3 Architecture
Schema-1 processes every input through four parallel pathways, which are fused into a unified representation:
- Column semantics: column identifiers and their content
- Per-column distributional summaries: statistics computed per column from cell values
- Cell values: raw numeric and categorical cell values
- Missing value structure: encoding of the pattern of present and absent values
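The four pathways can be illustrated with a minimal sketch over a pandas DataFrame. This is an assumed, simplified analogue of the inputs, not the model's actual encoders, which are described in the technical paper.

```python
# Simplified illustration (NOT the actual implementation) of the four
# input pathways, computed from a pandas DataFrame.
import numpy as np
import pandas as pd

def pathway_inputs(df: pd.DataFrame):
    # 1. Column semantics: column identifiers (content encoding omitted)
    names = list(df.columns)
    # 2. Per-column distributional summaries from cell values
    stats = {
        c: {"mean": df[c].mean(), "std": df[c].std()}
        for c in df.select_dtypes(include=np.number).columns
    }
    # 3. Raw numeric and categorical cell values
    values = df.to_numpy()
    # 4. Missing-value structure: binary mask of present/absent cells
    mask = df.isna().to_numpy()
    return names, stats, values, mask

df = pd.DataFrame({"age": [34, None, 51], "plan": ["pro", "free", None]})
names, stats, values, mask = pathway_inputs(df)
```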
When customers fine-tune Schema-1, the Base Model weights remain frozen. Each fine-tuning run produces a customer-specific isolated checkpoint (also referred to as a Customer Endpoint or Model Endpoint) that is the sole model used in that deployment. No fine-tuning job modifies the Base Model. No customer's data or checkpoint is accessible from any other customer's deployment.
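The frozen-base fine-tuning pattern described above can be sketched in PyTorch. The layer shapes and the trainable head below are assumptions for illustration only; the actual training stack is not public.

```python
# Conceptual sketch of fine-tuning with frozen Base Model weights.
# Architecture details are illustrative assumptions.
import torch
import torch.nn as nn

base = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 64))
head = nn.Linear(64, 2)  # customer-specific component of the checkpoint

for p in base.parameters():
    p.requires_grad = False  # Base Model weights stay frozen

# Only the customer-specific parameters receive gradient updates
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
x, y = torch.randn(8, 16), torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(head(base(x)), y)
loss.backward()
opt.step()
# base.parameters() accumulate no gradients; the shared base is untouched
```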
Detailed architecture is described in the technical paper (arxiv.org/abs/2605.06290).
### 1.4 Training data
Schema-1 was trained on approximately 2,307,000 tabular datasets:
- 2,000,000 synthetic datasets generated from a controlled sector-specific schema covering 10,000 industry sectors
- 307,000 real-world datasets drawn from public and domain-specific sources
No Customer Data was used to train Schema-1. No personally identifiable information was collected for training. No natural-language text corpora protected by copyright were used.
### 1.5 Evaluation
Schema-1 has been evaluated across six benchmarks. Full methodology, dataset lists, and per-condition results are in the technical paper.
| Benchmark | Schema-1 | Best competitor | Margin |
|---|---|---|---|
| OpenML-CC18 mean ROC-AUC (18 datasets) | 0.9849 | 0.9339 (TabPFN+AG) | +0.0510 |
| Missing data robustness, mean AUC (0 to 70%) | 0.9196 | 0.8933 (MIRRAMS) | +0.0263 |
| Tabular imputation, mean NRMSE (lower is better) | 0.163 | 0.235 (Gemini 3.0 Flash) | −31% (relative) |
| Column-agnostic AUC (no column names) | 0.9318 | 0.8658 (TabuLa-8B) | +0.0660 |
| Sector classification top-1 (10,000 classes) | 91.4% | 0.01% (random) | +91.4 pp |
| Sector classification top-5 (10,000 classes) | 97.0% | 0.05% (random) | +97.0 pp |
| Sequential fine-tuning retention | 97.8% | 0% (GBDTs: retrain) | +97.8 pp |
CC18 has been the reference benchmark for tabular methods since 2022. Schema-1 ranks first on every one of the 18 datasets. On the five hardest, performance moves from a 0.71 to 0.88 band into a 0.94 to 0.98 band, a distinct accuracy tier rather than an incremental gain. The 0.0510 gap between Schema-1 and the next-best system (TabPFN+AG) is larger than the entire range spanned by all prior competitors.
Real enterprise data is rarely complete: medical records skip tests, financial systems have dropped fields, sensor archives have gaps. The standard industry response, imputing missing values with column means before prediction, collapses as more data goes missing. Schema-1 declines by 0.0603 ROC-AUC from 0% to 70% missingness, less than one-quarter of XGBoost+Mean's decline. At 70% missingness, Schema-1 (0.8815) outperforms MIRRAMS at 50% missingness (0.8721). Schema-1 does not treat a missing value as an error to repair: the missing-value-structure pathway encodes the pattern of absence itself as a structural signal.
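The mean-imputation baseline criticised above works like the minimal NumPy sketch below. Note that after imputation the matrix carries no trace of which cells were missing, which is exactly the structural signal the text says Schema-1 retains.

```python
# Minimal sketch of the standard mean-imputation baseline.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))                  # fully observed ground truth

mask = np.zeros(X.shape, dtype=bool)         # cells we pretend are missing
mask[0, 0] = mask[2, 1] = mask[4, 2] = True
X_obs = np.where(mask, np.nan, X)

col_means = np.nanmean(X_obs, axis=0)        # fill each gap with its column mean
X_imputed = np.where(np.isnan(X_obs), col_means, X_obs)

# The imputed matrix contains no NaNs, but the pattern of absence
# itself has been discarded before the predictor ever sees the data.
```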
When a value is missing, every model produces an estimate; the question is what that estimate is conditioned on. Frontier LLMs condition on world knowledge from internet-scale text. Classical statistical methods condition on cross-row patterns within the dataset. Schema-1 conditions on neither: it learns the joint distributional relationships between columns within the specific dataset at hand. Across 20 real-world datasets and nine missingness conditions, Schema-1's mean reconstruction error is 31% lower than the best LLM and 46% lower than the best classical method. The advantage widens sharply under MNAR, where domain priors offer no traction.
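For reference, NRMSE is RMSE normalised by a scale of the true values. The sketch below assumes range normalisation, one common convention; the technical paper may normalise differently.

```python
# NRMSE under a range-normalisation convention (an assumption here).
import numpy as np

def nrmse(y_true, y_pred):
    """RMSE divided by the range of the true values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return float(rmse / (y_true.max() - y_true.min()))

# True values spanning 0..10, predictions off by 1 everywhere:
val = nrmse([0, 5, 10], [1, 6, 11])  # RMSE = 1, range = 10
```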
Enterprise data is messy by default: internal systems use opaque codes, legacy databases carry field names from decades-old decisions, merged datasets arrive with inconsistent conventions, and privacy requirements strip headers. Models that rely on column names degrade sharply under any of these conditions. Schema-1 encodes column semantics as one input pathway among four, not as a dependency. With names completely removed, Schema-1 drops 0.0117 (1.24%); TabuLa-8B drops 0.0709 and ConTextTab 0.0748. Schema-1 without any column names still outperforms both semantics-aware models with full names.
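A column-agnostic evaluation of the kind described can be set up by replacing every header with an uninformative positional placeholder before inference, so no model can lean on column-name semantics. The helper below is a hypothetical sketch of that preprocessing step.

```python
# Strip semantic column names for a column-agnostic evaluation run.
import pandas as pd

def strip_column_names(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose headers carry no semantic information."""
    return df.rename(columns={c: f"col_{i}" for i, c in enumerate(df.columns)})

df = pd.DataFrame({"annual_income": [52000], "loan_default": [0]})
anon = strip_column_names(df)  # headers become col_0, col_1; values unchanged
```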
Schema-1 was given 500 real-world datasets it had never seen, with all column names removed, no labels, and no context, and asked to identify which of 10,000 possible industry sectors each came from. It named the correct sector on the first attempt for 91.4% of datasets, and the correct sector appeared among its top five predictions for 97.0%. Random guessing succeeds at a rate of 1 in 10,000. No prior tabular model has a defined mechanism for this task.
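The random baselines reported in the evaluation table follow directly from uniform guessing over the sector vocabulary: with 10,000 equally likely classes, a random guesser's top-k accuracy is k/10,000.

```python
# Derivation of the random-guessing baselines in the evaluation table.
n_classes = 10_000
top1 = 1 / n_classes   # 0.0001 → 0.01%
top5 = 5 / n_classes   # 0.0005 → 0.05%
```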
### 1.6 Limitations
- Probabilistic outputs: all Schema-1 outputs are probabilistic and include confidence scores. They should not be treated as ground truth.
- No automatic human review: customers are responsible for implementing human oversight where required.
- Domain shift: performance on data substantially different from training distribution may be lower than benchmark performance suggests.
- Hosted-only: Schema-1 is available exclusively through SchemaLabs' hosted API and Web App; dedicated regional deployments are available as a paid option for enterprise customers.
- Beta status: availability, performance, and feature set may change.
### 1.7 Bias, risks, and fairness
Customers deploying Schema-1 in contexts affecting individuals (employment, lending, insurance, healthcare, education, criminal justice) bear responsibility for:
- Auditing their fine-tuning data for protected-class proxies and historical bias
- Testing Schema-1 outputs for disparate impact across protected classes
- Implementing human oversight for high-stakes decisions
- Complying with applicable anti-discrimination laws
Use of Schema-1 for illegal discrimination is prohibited under our Use Policy §1.1.
Schema-1 has not been formally evaluated for adversarial robustness against membership inference, model extraction, or adversarial input attacks. Our Use Policy prohibits these attack types. Customers in adversarial-environment deployments should not assume Schema-1 is hardened against such attacks.
### 1.8 Regulatory context
#### EU AI Act
Schema-1 itself is a general-purpose Data Language Model for tabular data and is not inherently classified as high-risk. Customer deployments may fall within high-risk categories under Annex III (credit scoring, employment decisions, insurance, healthcare diagnostics, access to essential services). Customers deploying in these contexts bear deployer-level obligations.
SchemaLabs is the Provider of Schema-1. We maintain the technical documentation for the model (this Model Card and the technical paper). As Schema-1 matures and the EU AI Act's high-risk-system phase-in progresses (August 2026), we will expand our provider-level processes accordingly.
For deployer obligations, see our Use Policy §2.
### 1.9 Contact, citation, and resources
- Technical questions: [email protected]
- Compliance and customer documentation: [email protected]
- Security: [email protected]
- General: [email protected]
Full technical paper: arxiv.org/abs/2605.06290
#### Citation
SchemaLabs, Inc. (2026). Data Language Models: A New Foundation Model Class for Tabular Data. arXiv:2605.06290. arxiv.org/abs/2605.06290