For if you’re feeling a little too reliant on pd.DataFrame()
.
Managing data in python
For years of my engineering career, I’ve gotten by just fine using Pandas DataFrames for nearly all of my data reading, storing, and calculations. They’re the Excel spreadsheet of python - they can do pretty much everything.
Though as those projects grew in size, so did their maintenance burden. The flexibility of pandas encouraged me to mutate DataFrames as I needed, resulting in a variety of different input & output parameters for my functions.
- These were difficult for my teammates to track.
- Each new input/output parameter required even more code for testing.
- SQL code written specifically for reading & saving this data was required for every breed of DataFrame I created.
Solutions to these problems came in the form of two other python packages:
So when I stumbled across a tool that promises to combine the benefits of both, I couldn’t help but jump in.
Enter SQLModel
SQLModel promises to simplify the code you write by combining Pydantic and SQLAlchemy functionality into one model. This means you should be able to:
- Define your data structure once - no need to update your SQL scripts and validation logic when adding new data.
- Quickly read and write your data to a database or an external API if required, with minimal boilerplate code.
- Use all the advanced features of Pydantic and SQLAlchemy - SQLModel uses them under the hood.
- Enjoy an intuitive developer experience, thanks to python’s type hints.
from sqlmodel import SQLModel
class Price(SQLModel, table=True):
interval: datetime
region: Literal["QLD1", "NSW1", "VIC1", "SA1", "TAS1"]
price: float
record = Price(interval='2025-07-30 22:00', region='NSW1', price='72')
>>> record
Price(interval=datetime(2025, 7, 30, 22), region='NSW1', price=72)
Does SQLModel work?
Here’s my experience so far, from 3 small projects I’ve been using it on.
Is it kind to your database?
Yes! ✅
When it comes to simple cases of reading, writing, deleting and updating rows in your database, SQLModel excels. It can transform readable python code into efficient SQL queries, while removing the risk of SQL injection in the process.
# Create all your DB tables
engine = create_engine("sqlite:///prices.db")
SQLModel.metadata.create_all(engine)
# Insert data
with Session(engine) as session:
session.add_all([record])
session.commit()
# Select data
with Session(engine) as session:
statement = select(Price)
.where(Price.region == "NSW1")
.where(Price.interval.between(start, end))
results = session.exec(statement).all()
I have hit two limitations:
- Massive datasets (>100k rows). The database session object provides some of SQLAlchemy’s methods designed to speed up bulk insert operations, or inserting raw data if validation isn’t required.
- Complex queries with nested joins: like SQLAlchemy, SQLModel’s queries take much longer than I’d expect if I wrote the SQL myself. In those edge cases, SQLModel lets you do just that.
Does it do validation?
Yes, but not how you’d expect ⚠️
I would have expected my models to automatically validate the parameters when I instantiate them, just like a Pydantic model would. But it doesn’t - it only works for Models you create that you don’t want to save in your DB!
>>> Price(interval='not_a_datetime')
Price(interval='not_a_datetime') 💥 no validation!
class Region(SQLModel, table=False):
regionid: str
>>> Region(regionid=3)
ValidationError: 1 validation error for Region
regionid
Input should be a valid string ✅ invalid values are rejected.
This is happening because SQLModel is a leaky abstraction:
- Table models defined by
Model(table=True)
behave like SQLAlchemy models - they do not perform validation when you instantiate them. Validation is only done when writing to or reading from the database, but even then, invalid fields will be silently dropped. Probably not what you want. - Non-table models defined by
Model(table=False)
behave like Pydantic models - they perform validation when you instantiate them. They correctly raise problems with bad inputs, but they can’t be saved into your DB.
This caught me out, and I know others will be too. The recommended approach is to define two separate models: one model for validation, and a table model inheriting from the validation model.
class PriceBase(SQLModel, table=False):
interval: datetime
region: Literal["QLD1", "NSW1", "VIC1", "SA1", "TAS1"]
price: float
class Price(PriceBase, table=True):
pass
>>> validated = PriceBase(interval='2025-07-30 22:00', region='NSW1', price='72')
>>> Price(**validated.model_dump())
Price(interval=datetime(2025, 7, 30, 22), region='NSW1', price=72)
So in the end, you have to use two separate models if you want the full functionality promised at the top of this post.
I wasn’t happy with this, so I attempted to develop a hacky workaround for my smaller projects which will validate my table parameters automatically without needing to define the model twice. Check out the Github comment if you’re interested.
Does it integrate with your API?
Yes! ✅
It’s dead simple, as the guy who created SQLModel also created FastAPI. I’ve used it for some toy projects, which I’ve been able to get up and running quickly and reliably, with great documentation autogeneration too. That said, I haven’t used FastAPI on any production-grade systems and might lean towards the battle-tested maturity of Django and the Django REST API Framework if my job depended on it.
Other quirks?
- When using SQLModel with relationships (e.g. foreign key references to other tables), related objects aren’t automatically refreshed when you read the main model from your database. You’ll need to adjust your query to explicitly join or refresh fields you may want to read, such as
Price.region.name
.
Should you use SQLModel?
SQLModel isn’t a perfect abstraction, so you’ll still need to know how to use Pydantic and SQLAlchemy when you get stuck. If you’re new to those libraries, start with them first.
I’m enjoying the benefits so far, and outside the validation issues I haven’t found any blockers that have forced me to go back. So I’m going to persevere with it for now.
Love/hate it? Let me know how you went or what alternatives you prefer.