Skip to content
Julian Halliwell edited this page Nov 19, 2023 · 4 revisions

Uses memory-efficient streaming to read a large XLSX spreadsheet file into one of the following:

  • a CFML query object;
  • a HTML string;
  • a CSV string.
readLargeFile( src [, format [, sheetName [, headerRow [, includeHeaderRow [, includeBlankRows [, includeHiddenColumns [, includeHiddenRows [, queryColumnNames  [, queryColumnTypes, [ `makeColumnNamesSafe [, password [, csvDelimiter [, streamingReaderOptions, [ returnVisibleValues ] ] ] ] ] ] ] ] ] ] ] ] ] ] )

or

readLargeFile( src [, format [, sheetNumber [, headerRow [, includeHeaderRow [, includeBlankRows [, includeHiddenColumns [, includeHiddenRows [, queryColumnNames  [, queryColumnTypes, [ `makeColumnNamesSafe [, password [, csvDelimiter [, streamingReaderOptions, [ returnVisibleValues ] ] ] ] ] ] ] ] ] ] ] ] ] ] )

Required arguments

  • src string: full path to the XLSX file to read

Optional arguments

  • format string default="query": "query", "html" or "csv".
    • "html" returns the sheet data as a string containing a set of HTML table rows/columns, including the <thead>/<tbody> tags, but excluding the <table> start/end tags.
    • "csv" returns the sheet data as a CSV string with values delimited by commas and qualified by double-quotes. Rows are separated by new lines (CRLF).
  • sheetName string: name of the sheet to read OR
  • sheetNumber numeric default=1: number of the sheet to read (1 based, not zero-based)
  • headerRow numeric: specify which row is the header to be used for the query column names
  • includeHeaderRow boolean default=false: whether to include the header row from the spreadsheet.
  • includeBlankRows boolean default=false: whether to include blank rows from the spreadsheet in the query data set. By default blank rows are suppressed.
  • includeHiddenColumns boolean default=true: if set to false, columns formatted as "hidden" will not be included when reading into a query
  • includeHiddenRows boolean default=true: if set to false, rows formatted as "hidden" will not be included when reading into a query
  • queryColumnNames string OR array: a comma-delimited list or an array of the names to use for the query columns in the order the columns appear in the spreadsheet. Note that specifying queryColumnNames overrides the use of a headerRow for column names.
  • queryColumnTypes string or struct: when reading a spreadsheet into a query, this allows you to specify the column types.
  • makeColumnNamesSafe boolean default=false: ensure that the CSV column names are safe for use in the resulting query (duplicate free and valid CFML variable names).
  • password string: if supplied the file will be treated as encrypted and the password used to try and open it.
  • csvDelimiter string default=",": delimiter to use if reading the file into a CSV string.
  • streamingReaderOptions struct: the keys bufferSize (number of bytes to use) and rowCacheSize (number of rows to process at a time) can be used to tune how the streaming reader wrapper processes the file. The defaults are 1024 bytes and 10 rows.
  • returnVisibleValues boolean default=false: return values as they are visible/formatted in each cell rather than the "raw" values. This mostly affects numbers and dates, where you might for example wish to avoid scientific notation.

Chainable? No.

See Read() for more details and examples.

Notes

  • This method only works with Lucee. It does not currently work with Adobe ColdFusion.
  • Only XLSX (XML format) spreadsheet files can be read using this method. It cannot read binary XLS files.
  • This method cannot return a workbook object because many standard workbook features are not supported by the streaming reader.
  • Not all of the options of Read() are available. In particular you cannot specify the rows/columns to read. It will read the entire sheet.
  • More information about the Streaming Reader by pjfanning

Examples

Read a large spreadsheet file into a query

spreadsheet = New spreadsheet();
filepath = ExpandPath( "large.xlsx" );
myQuery = spreadsheet.readLargeFile( src=filepath );

Tune the streaming reader options

spreadsheet = New spreadsheet();
filepath = ExpandPath( "large.xlsx" );
streamingOptions = { bufferSize: 2048, rowCacheSize: 20 };
myQuery = spreadsheet.readLargeFile( src=filepath, streamingReaderOptions=streamingOptions  );
Clone this wiki locally