I mean, yeah, that’s the point of compression. I don’t quite get what you mean by that comment.
I really don’t think that’s a lot either. Nowadays we routinely process terabytes of data.
Oh, I know, believe me. I have some painful first-hand experience with such code.
I think portability and easy parsing are the only advantages of CSV. It’s definitely good enough (maybe even the best) for small datasets, but if you have a lot of data you need a compressed binary format, something like Parquet.
Is 600 MB a lot for pandas? Of course, CSV isn’t really optimal but I would’ve sworn pandas happily works with gigabytes of data.
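It does, especially if you stream the file instead of loading it whole. A minimal sketch of chunked reading with `pandas` (the tiny in-memory CSV and its `value` column are made up for illustration):

```python
import io

import pandas as pd

# Small stand-in for a multi-gigabyte CSV file on disk.
csv_data = io.StringIO("value\n" + "\n".join(str(i) for i in range(10)))

# chunksize streams the file in pieces instead of loading it whole,
# which keeps memory bounded even for files larger than RAM.
total = 0
for chunk in pd.read_csv(csv_data, chunksize=4):
    total += chunk["value"].sum()

print(total)  # sum of 0..9 → 45
```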
I have to mention dataclasses here, especially with frozen=True.
Seriously, use dataclasses whenever possible, they’re great.
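A quick sketch of why frozen=True is nice: instances become immutable and hashable for free (the `Point` class here is just an example, not from the article):

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


p = Point(1.0, 2.0)

# frozen=True generates __hash__, so instances work as dict keys / set members.
assert p == Point(1.0, 2.0)
assert p in {Point(1.0, 2.0)}

# Attribute assignment is blocked at runtime.
try:
    p.x = 3.0
except FrozenInstanceError:
    print("immutable!")
```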
EAFP - “Easier to ask for forgiveness than for permission” - for those who are (like me) unfamiliar with this… acronym?
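For context, the contrast is with LBYL (“look before you leap”). A toy sketch of both styles (the `config` dict and default port are invented):

```python
config = {"host": "localhost"}

# LBYL: check the precondition first.
if "port" in config:
    port = config["port"]
else:
    port = 8080

# EAFP: just try it and handle the failure.
try:
    port = config["port"]
except KeyError:
    port = 8080

print(port)  # → 8080, since "port" isn't in config
```

EAFP tends to be preferred in Python because the check-then-use pattern can race (the state may change between the check and the access), and exceptions are cheap when they don’t fire.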
Right, true!