-
To be honest I have never dealt with a file that big, so I am just guessing: you could either do it in SAS itself (I would expect SAS to be more efficient at handling its own files), or read it once with pyreadstat and convert it to a format that is faster to work with, probably some kind of relational database since you need to join two tables (SQLite would be a good starting point, but you may need something more powerful). You could research what people use for datasets of that size and convert your data into that format. Maybe others have other suggestions.
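A minimal sketch of that SQLite route, in case it helps. This is not tested on files of that size; the file paths, table names, and the `account_id` column are placeholders you would have to adapt. It streams the big file into SQLite with `pyreadstat.read_file_in_chunks`, loads the small table in one go, and lets SQLite do the join.

```python
import sqlite3
import pyreadstat

con = sqlite3.connect("accounts.db")

# stream the big file into SQLite in chunks; each iteration yields (DataFrame, metadata)
reader = pyreadstat.read_file_in_chunks(
    pyreadstat.read_sas7bdat, "big_table.sas7bdat", chunksize=100_000
)
for df, meta in reader:
    df.to_sql("big_table", con, if_exists="append", index=False)

# the small table fits in memory, so read it in one go
small_df, _ = pyreadstat.read_sas7bdat("accounts_small.sas7bdat")
small_df.to_sql("accounts_small", con, if_exists="replace", index=False)

# an index on the join key makes the join much faster
con.execute("CREATE INDEX IF NOT EXISTS idx_big_acct ON big_table(account_id)")
result = con.execute(
    """
    SELECT b.*
    FROM big_table b
    JOIN accounts_small s ON b.account_id = s.account_id
    """
).fetchall()
con.close()
```

The conversion still has to read the whole 150 GB once, but after that the join and any further queries run against the database instead of re-reading the sas7bdat file.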
-
I had to read >1.5 TB from network storage and never managed to read anything, even with the row limit and offset set to around 100 or 1,000 records, no more than 30 columns, and no particularly long text values. So the one purpose I needed this library for was not met.
-
Hi,
I need to read a large sas7bdat file (about 150 GB) and join it with another sas7bdat file (a very small table with just a few account IDs) on the account ID. I have tried reading in chunks and enabling multiprocessing, but it is still taking far too long to read and load the large file. Can pyreadstat handle a 150 GB sas7bdat file? If yes, what is the most efficient way to do that? A little code snippet would be helpful and appreciated! Thanks in advance.
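For reference, a rough sketch of the chunked approach described in the question: read the big file in chunks (optionally letting pyreadstat split the work across processes) and keep only the rows whose account ID appears in the small table, so only the matches are ever held in memory. The file paths and the `account_id` column name are assumptions, not anything from this thread.

```python
import pandas as pd
import pyreadstat

# the small table fits in memory; build a lookup set of the IDs we care about
small_df, _ = pyreadstat.read_sas7bdat("small_accounts.sas7bdat")
wanted_ids = set(small_df["account_id"])

matches = []
reader = pyreadstat.read_file_in_chunks(
    pyreadstat.read_sas7bdat,
    "big_table.sas7bdat",
    chunksize=200_000,
    multiprocess=True,   # available in recent pyreadstat versions
    num_processes=4,
)
for chunk, meta in reader:
    # discard everything that does not match before accumulating
    matches.append(chunk[chunk["account_id"].isin(wanted_ids)])

result = pd.concat(matches, ignore_index=True)
joined = result.merge(small_df, on="account_id", how="inner")
```

Even so, every chunk still has to be decompressed and parsed, so a single pass over 150 GB will take a while; if the file is on network storage, copying it to a local disk first usually helps more than tuning the chunk size.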