This PR is a fix for issue https://github.com/roapi/roapi/issues/259
List of updates/fixes:
* Rename module xlsx to excel.
* Allow reading xls, ods and xlsb formats in addition to xlsx
* Support the Excel DateTime format and convert it to arrow
Timestamp(Second, None)
* Allow NULLs in columns of any data type and emit a null value instead
of the string "null"
* Fix issue with incorrect data type inference when multiple data types
are detected.
* Allow specifying the data schema in the config.
* Add new options:
  - rows_range_start
  - rows_range_end
  - columns_range_start
  - columns_range_end
  - schema_inference_lines
* Make sheet_name optional; if it is not specified, then use the first
sheet by default
* Bump calamine crate to version 0.23.1 and add feature "dates"
(support for the DateTime column format)
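
To illustrate the kind of rule the type inference fix is about, here is a minimal sketch of merging the cell types seen in one column. The `CellType` enum, the `merge` rule (Int + Float widens to Float, any other mix falls back to Text, NULLs are ignored) and the `max_rows` parameter standing in for schema_inference_lines are all assumptions made for this example and are not taken from the PR's code.

```rust
/// Hypothetical cell-level types a spreadsheet reader might report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CellType {
    Null,
    Bool,
    Int,
    Float,
    DateTime,
    Text,
}

/// Merge two observed types into a single column type (assumed rule).
fn merge(a: CellType, b: CellType) -> CellType {
    use CellType::*;
    match (a, b) {
        // Same type on both sides: nothing to reconcile.
        (x, y) if x == y => x,
        // NULL cells do not constrain the column type.
        (Null, x) | (x, Null) => x,
        // Integers widen to floats.
        (Int, Float) | (Float, Int) => Float,
        // Any other combination falls back to text.
        _ => Text,
    }
}

/// Infer a column type from at most `max_rows` sampled cells.
/// Returns `CellType::Null` when every sampled cell is NULL.
fn infer_column_type(cells: &[CellType], max_rows: usize) -> CellType {
    cells
        .iter()
        .take(max_rows)
        .copied()
        .fold(CellType::Null, merge)
}
```

With this rule, a column sampled as `[Int, Null, Float]` resolves to `Float`, while `[Int, Text]` resolves to `Text`. A small sample size can miss later values of a different type, which is one reason to allow an explicit schema in the config.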
Documentation updates: https://github.com/roapi/docs/pull/20
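
For context on the DateTime conversion, Excel stores date-times as a fractional day count, so mapping them to a seconds-resolution timestamp is simple arithmetic. The sketch below is an illustration only: the function name is hypothetical, it assumes the 1900 date system and serial values after 1900-03-01, and it is not necessarily how calamine or this PR performs the conversion.

```rust
/// Serial number of 1970-01-01 in the Excel 1900 date system.
const UNIX_EPOCH_SERIAL: f64 = 25_569.0;
const SECONDS_PER_DAY: f64 = 86_400.0;

/// Convert an Excel serial date-time (days, with the time of day as the
/// fractional part) into seconds since the Unix epoch.
/// Assumption: 1900 date system, no time zone information.
fn excel_serial_to_unix_seconds(serial: f64) -> i64 {
    ((serial - UNIX_EPOCH_SERIAL) * SECONDS_PER_DAY).round() as i64
}
```

For example, the serial value 44197.0 (2021-01-01 00:00:00) maps to 1609459200, and 44197.5 maps to noon on the same day.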
* added MySQL and SQLite datasource support
* updated arrow, datafusion and deltalake to their latest versions
* cleared simd CI test cache to work around a nightly compiler bug
* Allow for delta tables to be directly backed by storage.
Enables experimental support for delta tables that are too large to be
stored in memory. We directly expose `DeltaTable` instead of copying the
data into a datafusion `MemTable`.
Disadvantages:
- S3 is not supported in the new mode
- as we're relying on datafusion to handle the parquet files directly,
nested schemas and certain data types may not work properly.