mirror of
https://github.com/roapi/roapi.git
synced 2026-06-05 21:04:02 +08:00
Some checks failed
build / build (push) Has been cancelled
build / build_ui (push) Has been cancelled
build / database_test (push) Has been cancelled
build / object_store_memory_test (push) Has been cancelled
build / object_store_direct_test (push) Has been cancelled
build / openssl_build (push) Has been cancelled
build / mac_cross_build (push) Has been cancelled
build / Docker Image Build (push) Has been cancelled
columnq-cli release / Validate git tag (push) Has been cancelled
roapi release / Validate git tag (push) Has been cancelled
columnq-cli release / macos (push) Has been cancelled
columnq-cli release / windows (map[features:database-sqlite python-architecture:x64 target:x86_64-pc-windows-msvc]) (push) Has been cancelled
columnq-cli release / linux (map[features:rustls,database-sqlite image_tag:aarch64-musl manylinux:2014 name_suffix: rustflags: target:aarch64-unknown-linux-musl upload:true]) (push) Has been cancelled
columnq-cli release / linux (map[features:rustls,database-sqlite image_tag:x86_64-musl manylinux:2010 name_suffix: rustflags:-C target-cpu=skylake target:x86_64-unknown-linux-musl upload:true]) (push) Has been cancelled
columnq-cli release / PyPI Release (push) Has been cancelled
roapi release / macos (push) Has been cancelled
roapi release / windows (map[features:database-sqlite python-architecture:x64 target:x86_64-pc-windows-msvc]) (push) Has been cancelled
roapi release / linux (map[features:rustls,database-sqlite image_tag:aarch64-musl manylinux:2014 name_suffix: rustflags: target:aarch64-unknown-linux-musl upload:true]) (push) Has been cancelled
roapi release / linux (map[features:rustls,database-sqlite image_tag:x86_64-musl manylinux:2010 name_suffix: rustflags:-C target-cpu=skylake target:x86_64-unknown-linux-musl upload:true]) (push) Has been cancelled
roapi release / PyPI Release (push) Has been cancelled
roapi release / Docker Image Release (push) Has been cancelled
384 lines
10 KiB
Markdown
384 lines
10 KiB
Markdown
# ROAPI
|
|
|
|

|
|
[](https://roapi.github.io/docs/index.html)
|
|
[](https://discord.gg/FchMNYTm7G)
|
|
|
|
ROAPI automatically spins up read-only APIs for static datasets without
|
|
requiring you to write a single line of code. It builds on top of [Apache
|
|
Arrow](https://github.com/apache/arrow) and
|
|
[Datafusion](https://github.com/apache/arrow-datafusion). The
|
|
core of its design can be boiled down to the following:
|
|
|
|
- [Query frontends](https://roapi.github.io/docs/api/query/index.html) to
|
|
translate SQL, FlightSQL, GraphQL and REST API queries into
|
|
Datafusion plans.
|
|
- Datafusion for query plan execution.
|
|
- [Data layer](https://roapi.github.io/docs/config/dataset-formats/index.html)
|
|
to load datasets from a variety of sources and formats with automatic schema
|
|
inference.
|
|
- [Response encoding layer](https://roapi.github.io/docs/api/response.html) to
|
|
serialize intermediate Arrow record batch into various formats requested by
|
|
client.
|
|
|
|
See below for a high level diagram:
|
|
|
|
<img alt="roapi-design-diagram" src="https://roapi.github.io/docs/images/roapi.png">
|
|
|
|
## Installation
|
|
|
|
### Install pre-built binary
|
|
|
|
```bash
|
|
# if you are using homebrew
|
|
brew install roapi
|
|
# or if you prefer pip
|
|
pip install roapi
|
|
```
|
|
|
|
Check out [Github release page](https://github.com/roapi/roapi/releases) for
|
|
pre-built binaries for each platform. Pre-built docker images are also available at
|
|
[ghcr.io/roapi/roapi](https://github.com/orgs/roapi/packages/container/package/roapi).
|
|
|
|
### Install from source
|
|
|
|
```bash
|
|
cargo install --locked --git https://github.com/roapi/roapi --branch main --bins roapi
|
|
```
|
|
|
|
## Usage
|
|
|
|
### Quick start
|
|
|
|
Spin up APIs for `test_data/uk_cities_with_headers.csv` and
|
|
`test_data/spacex_launches.json`:
|
|
|
|
```bash
|
|
roapi \
|
|
--table "uk_cities=test_data/uk_cities_with_headers.csv" \
|
|
--table "test_data/spacex_launches.json"
|
|
```
|
|
|
|
Or using docker:
|
|
|
|
```bash
|
|
docker run -t --rm -p 8080:8080 ghcr.io/roapi/roapi:latest --addr-http 0.0.0.0:8080 \
|
|
--table "uk_cities=test_data/uk_cities_with_headers.csv" \
|
|
--table "test_data/spacex_launches.json"
|
|
```
|
|
|
|
Query data using the builtin web UI at `http://localhost:8080/ui`:
|
|
|
|
<img alt="roapi-design-diagram" src="https://roapi.github.io/docs/images/ui.png">
|
|
|
|
Query data using SQL, GraphQL or REST via curl:
|
|
|
|
```bash
|
|
curl -X POST -d "SELECT city, lat, lng FROM uk_cities LIMIT 2" localhost:8080/api/sql
|
|
curl -X POST -d "query { uk_cities(limit: 2) {city, lat, lng} }" localhost:8080/api/graphql
|
|
curl "localhost:8080/api/tables/uk_cities?columns=city,lat,lng&limit=2"
|
|
```
|
|
|
|
Get inferred schema for all tables:
|
|
|
|
```bash
|
|
curl 'localhost:8080/api/schema'
|
|
```
|
|
|
|
For MySQL and SQLite, specify the table argument like below:
|
|
```
|
|
--table "table_name=mysql://username:password@localhost:3306/database"
|
|
--table "table_name=sqlite://path/to/database"
|
|
```
|
|
|
|
Want dynamic register data? Add parameter `-d` to command. `--table` parameter cannot be ignored for now.
|
|
```bash
|
|
roapi \
|
|
--table "uk_cities=test_data/uk_cities_with_headers.csv" \
|
|
-d
|
|
```
|
|
|
|
Then post config to `/api/table` register data.
|
|
```bash
|
|
curl -X POST http://172.24.16.1:8080/api/table \
|
|
-H 'Content-Type: application/json' \
|
|
-d '[
|
|
{
|
|
"tableName": "uk_cities2",
|
|
"uri": "./test_data/uk_cities_with_headers.csv"
|
|
},
|
|
{
|
|
"tableName": "table_name",
|
|
"uri": "sqlite://path/to/database"
|
|
}
|
|
]'
|
|
```
|
|
|
|
For windows, full scheme(file:// or filesystem://) must filled, and use double quote(") instead of single quote(') to escape windows cmdline limit:
|
|
|
|
```bash
|
|
roapi \
|
|
--table "uk_cities=file://d:/path/to/uk_cities_with_headers.csv" \
|
|
--table "file://d:/path/to/test_data/spacex_launches.json"
|
|
```
|
|
|
|
|
|
### Config file
|
|
|
|
You can also configure multiple table sources using YAML or Toml config, which supports more
|
|
advanced format specific table options:
|
|
|
|
```yaml
|
|
addr:
|
|
http: 0.0.0.0:8080
|
|
postgres: 0.0.0.0:5433
|
|
tables:
|
|
- name: "blogs"
|
|
uri: "test_data/blogs.parquet"
|
|
|
|
- name: "ubuntu_ami"
|
|
uri: "test_data/ubuntu-ami.json"
|
|
option:
|
|
format: "json"
|
|
pointer: "/aaData"
|
|
array_encoded: true
|
|
schema:
|
|
columns:
|
|
- name: "zone"
|
|
data_type: "Utf8"
|
|
- name: "name"
|
|
data_type: "Utf8"
|
|
- name: "version"
|
|
data_type: "Utf8"
|
|
- name: "arch"
|
|
data_type: "Utf8"
|
|
- name: "instance_type"
|
|
data_type: "Utf8"
|
|
- name: "release"
|
|
data_type: "Utf8"
|
|
- name: "ami_id"
|
|
data_type: "Utf8"
|
|
- name: "aki_id"
|
|
data_type: "Utf8"
|
|
|
|
- name: "spacex_launches"
|
|
uri: "https://api.spacexdata.com/v4/launches"
|
|
option:
|
|
format: "json"
|
|
|
|
- name: "github_jobs"
|
|
uri: "https://web.archive.org/web/20210507025928if_/https://jobs.github.com/positions.json"
|
|
```
|
|
|
|
To run serve tables using config file:
|
|
|
|
```bash
|
|
roapi -c ./roapi.yml # or .toml
|
|
```
|
|
|
|
See [config
|
|
documentation](https://roapi.github.io/docs/config/config-file.html) for more
|
|
options including [using Google spreadsheet as a table
|
|
source](https://roapi.github.io/docs/config/dataset-formats/gsheet.html).
|
|
|
|
### Response serialization
|
|
|
|
By default, ROAPI encodes responses in JSON format, but you can request
|
|
different encodings by specifying the `ACCEPT` header:
|
|
|
|
```bash
|
|
curl -X POST \
|
|
-H 'ACCEPT: application/vnd.apache.arrow.stream' \
|
|
-d "SELECT launch_library_id FROM spacex_launches WHERE launch_library_id IS NOT NULL" \
|
|
localhost:8080/api/sql
|
|
```
|
|
|
|
### REST API query interface
|
|
|
|
You can query tables through REST API by sending `GET` requests to
|
|
`/api/tables/{table_name}`. Query operators are specified as query params.
|
|
|
|
REST query frontend currently supports the following query operators:
|
|
|
|
- columns
|
|
- sort
|
|
- limit
|
|
- filter
|
|
|
|
To sort column `col1` in ascending order and `col2` in descending order, set
|
|
query param to: `sort=col1,-col2`.
|
|
|
|
To find all rows with `col1` equal to string `'foo'`, set query param to:
|
|
`filter[col1]='foo'`. You can also do basic comparisons with filters, for
|
|
example predicate `0 <= col2 < 5` can be expressed as
|
|
`filter[col2]gte=0&filter[col2]lt=5`.
|
|
|
|
### GraphQL query interface
|
|
|
|
To query tables using GraphQL, send the query through `POST` request to
|
|
`/api/graphql` endpoint.
|
|
|
|
GraphQL query frontend supports the same set of operators supported by [REST
|
|
query frontend](https://roapi.github.io/docs/api/query/rest.html). Here how is
|
|
you can apply various operators in a query:
|
|
|
|
```graphql
|
|
{
|
|
table_name(
|
|
filter: { col1: false, col2: { gteq: 4, lt: 1000 } }
|
|
sort: [{ field: "col2", order: "desc" }, { field: "col3" }]
|
|
limit: 100
|
|
) {
|
|
col1
|
|
col2
|
|
col3
|
|
}
|
|
}
|
|
```
|
|
|
|
### SQL query interface
|
|
|
|
To query tables using a subset of standard SQL, send the query through `POST`
|
|
request to `/api/sql` endpoint. This is the only query interface that supports
|
|
table joins.
|
|
|
|
|
|
### Key value lookup
|
|
|
|
You can pick two columns from a table to use a key and value to create a quick
|
|
keyvalue store API by adding the following lines to the config:
|
|
|
|
```yaml
|
|
kvstores:
|
|
- name: "launch_name"
|
|
uri: "test_data/spacex_launches.json"
|
|
key: id
|
|
value: name
|
|
```
|
|
|
|
Key value lookup can be done through simple HTTP GET requests:
|
|
|
|
```bash
|
|
curl -v localhost:8080/api/kv/launch_name/600f9a8d8f798e2a4d5f979e
|
|
Starlink-21 (v1.0)%
|
|
```
|
|
|
|
### Query through Postgres wire protocol
|
|
|
|
ROAPI can present itself as a Postgres server so users can use Postgres clients
|
|
to issue SQL queries.
|
|
|
|
```bash
|
|
$ psql -h 127.0.0.1
|
|
psql (12.10 (Ubuntu 12.10-0ubuntu0.20.04.1), server 13)
|
|
WARNING: psql major version 12, server major version 13.
|
|
Some psql features might not work.
|
|
Type "help" for help.
|
|
|
|
houqp=> select count(*) from uk_cities;
|
|
COUNT(UInt8(1))
|
|
-----------------
|
|
37
|
|
(1 row)
|
|
```
|
|
|
|
|
|
## Features
|
|
|
|
Query layer:
|
|
|
|
- [x] REST API GET
|
|
- [x] GraphQL
|
|
- [x] SQL
|
|
- [x] join between tables
|
|
- [x] access to array elements by index
|
|
- [x] access to nested struct fields by key
|
|
- [ ] column index
|
|
- protocol
|
|
- [x] Postgres
|
|
- [x] FlightSQL
|
|
- [x] Key value lookup
|
|
|
|
Response serialization:
|
|
|
|
- [x] JSON `application/json`
|
|
- [x] Arrow `application/vnd.apache.arrow.stream`
|
|
- [x] Parquet `application/vnd.apache.parquet`
|
|
- [ ] msgpack
|
|
|
|
Data layer:
|
|
|
|
- [x] filesystem
|
|
- [x] HTTP/HTTPS
|
|
- [x] S3
|
|
- [x] GCS
|
|
- [x] Azure Storage
|
|
- [x] Google spreadsheet
|
|
- [x] MySQL
|
|
- [x] SQLite
|
|
- [x] Postgres
|
|
- [ ] Airtable
|
|
- Data format
|
|
- [x] CSV
|
|
- [x] JSON
|
|
- [x] NDJSON
|
|
- [x] parquet
|
|
- [x] xls, xlsx, xlsb, ods: https://github.com/tafia/calamine
|
|
- [x] [DeltaLake](https://delta.io/)
|
|
|
|
Misc:
|
|
|
|
- [ ] auto gen OpenAPI doc for rest layer
|
|
- [ ] query input type conversion based on table schema
|
|
- [ ] stream arrow encoding response
|
|
- [ ] authentication layer
|
|
|
|
## Development
|
|
|
|
The core of ROAPI, including query front-ends and data layer, lives in the
|
|
self-contained [columnq](https://github.com/roapi/roapi/tree/main/columnq)
|
|
crate. It takes queries and outputs Arrow record batches. Data sources will
|
|
also be loaded and stored in memory as Arrow record batches.
|
|
|
|
The [roapi](https://github.com/roapi/roapi/tree/main/roapi) crate wraps
|
|
`columnq` with a multi-protocol query layer. It serializes Arrow record batches
|
|
produced by `columnq` into different formats based on client request.
|
|
|
|
### Debug
|
|
|
|
To log all FlightSQL requests in console, set `RUST_LOG=tower_http=trace`.
|
|
|
|
### Build Docker image
|
|
|
|
```bash
|
|
docker build --rm -t ghcr.io/roapi/roapi:latest .
|
|
```
|
|
### VS Code DevContainer
|
|
|
|
#### Requirements
|
|
- [x] Vscode
|
|
- [x] Ensure this extension is installed on your vs code `ms-vscode-remote.remote-containers`
|
|
---
|
|
Once done you will see prompt from left side to reopen the project in dev container or open command palette and search for open with remote container:
|
|
|
|
1. install dependencies
|
|
```bash
|
|
apt-get update && apt-get install --no-install-recommends -y cmake
|
|
```
|
|
2. connect to database from your local using db client of choice using the following credentials
|
|
```
|
|
username: user
|
|
password: user
|
|
database: test
|
|
```
|
|
once done create table so you can map it in `-t` arg or consider using sample in `.devcontainer/db-migration.sql` to populate some tables with data
|
|
|
|
3. run cargo command with mysql db as feature
|
|
```bash
|
|
cargo run --bin roapi --features database -- -a localhost:8080 -t posts=mysql://user:user@db:3306/test
|
|
```
|
|
otherwise if you are looking for other features
|
|
you have to select appropriate one from `roapi/Cargo.toml`
|
|
|
|
|