Best IoT Database    Unifying Space and Time    
Jaguar Technology Product Documents Download Source
Time Series and Location Data All In One

Instant Scalability

JaguarDB is highly scalable, fast NoSQL database for Internet of Things (IoT) which handles large volumes of location data and time series data, unifying events data consisting of space and time. Traditional methods for handling horizontal scalability is to use consistent hashing to reduce the workload of data migration. However, when datasets become large, it still takes long time to scale hundreds of nodes. With JaguarDB, there is absolutely zero data migration during the scaling process. Scaling is done instantly. For example, scaling from 100 nodes to 400 nodes in a JaguarDB cluster takes 10 seconds. That is it. The system keeps running with 500 nodes that are now able to store 5 times more data. The next time the system needs scaling, the prcess will be the same.



Time Series Data

JaguarDB enables fast ingestion of time series data, coupling location-based data. It also can index in both dimensions, space and time. Back-filling time series data is also fast (inserting large volumes of data in past time). Normally time series is a series of data points indexed in time order. In JaguarDB, the time series has a different meaning: it is both a sequence of data points and a series of tick tables holding aggregated data values at specified time spans. For example, a time series table in JaguarDB can have a base table storing data points in time order, and tick tables such as 5 minute, 15 minute, hourly, daily, weekly, monthly tables to store aggregated data within these time spans.

The following formats describe commands to create a time series table:

create table timeseries(TICK:RETENTION, TICK:RETENTION, …|BASERENTION)

BASETABLE (key: KEYCOL1, KEYCOL2, …, value: col rollup VTYPE, …);
     Where:
     TICK:RETENTION specifies a tick type and retention period of the tick table;119
     BASERENTION represents the retention period of the base table;
     BASETABLE is the name of the base table;
     KEYCOL1, KEYCOL2, … are the key columns in the base table;
     Rollup specifies the columns whose values will be rolled up to the tick tables;
     VTYPE is the type of the column to be rolled up.
     The TICK keyword starts with a number and a period type. For example, 15s means 
	 a tick table of 15 seconds; 30m means a tick table of 30 minutes.
     The letter ‘s’ indicates TICKs in seconds. 
     The letter ‘m’ indicates TICKs in minutes. 
     The letter ‘h’ indicates TICKs in hours. 
     The letter ‘d’ indicates TICKs in days. 
     The letter ‘w’ indicates TICKs in weeks. 
     The letter ‘M’ indicates TICKs in months. 
     The letter ‘M’ indicates TICKs in months. 
     The letter ‘q’ indicates TICKs in quarters. 
     The letter ‘y’ indicates TICKs in years. 
     The letter ‘D’ indicates TICKs in decades. 
     Valid TICKs in seconds scale include: 1s, 2s, 3s, 5s, 6s, 10s, 12s, 15s, 20s, 30s. 
     Valid TICKs in minutes include: 1m, 2m, 3m, 5m, 6m, 10m, 12m, 15m, 20m, 30m. 
     Valid TICKs in hours include: 1h, 2h, 3h, 4h, 6h, 8h, 12h. 
     Valid TICKs in days include: 1d, 2d, 3d, 4d, 5d, 6d, 7d, 10d, 15d. 
     Valid TICKs in weeks include: 1w, 2w, 3w, 4w. 
     Valid TICKs in months include: 1M, 2M, 3M, 4M, 6M. 
     Valid TICKs in quarters include: 1q, 2q. 
     Valid TICKs in years can be any number of years. 120
     Valid TICKs in decades can be any number of decades. 
     Multiple TICKs are allowed in the same TICK group. For example, you can have 5m and 15m 
     tables, and 1d and 10d tick tables.
	 

The format for the RETENTION is the same as the TICK format, except that it can have any number of retention periods. The RETENTION specifies how long the data points in the base table should be kept. Examples of RETENTION are 15d, 1M, 3M, 1y, etc. If no RETENTION is provided, the data points in the tick table are not deleted. If the retention period is passed, old data will be deleted from the tick tables. The BASERETNYTION specifies how long the data points in the base table should persist. Data points that are older than the retention period are deleted frequently. If no BASERETENTION is provided, the data points in the base table will not be deleted.
A rollup column in a base table indicates that its value will be rolled to the tick tables. In the tick tables the last stored value of the rollup column is saved from the base table. In addition, aggregated values of ‘sum’, ‘min’, ‘max’, ‘avg’, ‘var’ of the column are computed and stored in the tick tables. Queries for aggregation values are extremely fast because only a single record is read from the database to retrieve the aggregated data without scanning the tables to compute and get the result.

Location Data

JaguarDB is the only database that supports both vector and raster spatial data. Vector spatial shapes include line, square, rectangle, circle, ellipse, triangle, sphere, ellipsoid, cone, cylinder, box, 3D line, 3D square, 3D rectangle, 3D circle, 3D ellipse, and 3D triangle. Raster spatial shapes include point, multipoint, linestring, multilinestring, polygon, multipolgon, and their 3D counterparts. Location-based data can be managed with regular tables. In a table containing spatial data types, the type of a column can have a spatial reference identifier (SRID). If no SRID is provided, the default value is zero, meaning it is a simple geometric coordinate system. In addition to the SRID of the column, the number of metrics associated with location point or a shape can be specified with the “metrics:” keyword. The following examples show how to create tables with spatial columns.
create table if not exists geom ( key: a int, value: pt point(srid:4326), b int );
create table if not exists geom2 ( key: a int, value: pt point(srid:wgs84), b int );
create table if not exists geom3 ( key: a int, value: pt point, b int );
create table dot ( key: a int, pt1 point, b int, pt2 point, value: c int, d int, pt3 point3d );
create table cb ( key: a int, q1 cube, b int, q2 cube, value: c int, q3 cube );
create table es ( key: a int, c ellipsoid, value: d int, e ellipse );
create table linestr ( key: lsw linestring(srid:wgs84), a int, value: lss linestring );
create table pol ( key: a int, value: po2 polygon, po3 polygon3d, tm datetime, ls linestring );
create table mline ( key: a int, value: m multilinestring, m3 multilinestring3d );
create table mpg ( key: a int, value: p multipolygon, p3 multipolygon3d );
create table street ( key: a int, value: pt linestring(srid:wgs84,metrics:10), b int );
create table base ( key: a int, value: pt point(srid:wgs84,metrics:20), char(32) );
The number of metrics is unlimited, as long as the storage space allows. Each metric has a length of 8 bytes, with default value of zero. The metrics are identified by mN, such as:
select col:m1, col:m3 from mytab where a=100 and col:x=200 and col:y=300;
Most shapes can be used to query with these functions:
  • within
  • contain
  • cover
  • coveredby
  • intersect
  • disjoint

Combining Time Series And Location Data

The following example demonstrates how a user can manage time series data and location-based data in one JaguarDB ‘rides’ table. The rides table is created by the following command:
CREATE TABLE timeseries(5m,30m,1d,1M) rides (
 key:
 pickup_datetime datetimesec,
 dropoff_datetime datetimesec,
 driver_name char(16),
 rate_type char(8),
 payment_type char(1),
 value:
 passenger_count rollup int,
 trip_distance rollup float(8.2),
 pickup_location point(srid:wgs84),
 dropoff_location point(srid:wgs84),
 fare_amount rollup float(8.2),
 tip_amount rollup float(6.2),
 tolls_amount float(6.2),
 total_amount rollup float(8.2),144
);
Here the ‘rides’ is the base table, and there are four tick tables created for ticks of five minutes, thirty minutes, one day, and one month. Each rollup column will generate five heap columns in the tick tables. Passenger pickup location and drop off location are represented by points having longitude and latitude coordinates in degrees. Data can be inserted by the following example:
insert into rides values ( '2021-02-11 09:22:12', '2021-02-11 09:50:42', 'DriverAHM', 
'REG', '1', '2', '48.6', point(122.036 37.7), point(122.385 37.622), '56.5', '10.5', 
'5.0', '72.0' );

insert into rides values ( '2021-02-11 09:32:12', '2021-02-11 09:58:42', 'DriverJHS', 
'HYP', '1', '3', '49.2', point(122.035 37.369), point(122.381 37.621), '73.5', '12.5', 
'5.8', '91.8' );

insert into rides values ( '2021-02-12 09:32:12', '2021-02-12 13:50:42', 'DriverAHM', 
'REG', '1', '2', '66.8', point(121.8864 37.3382 ), point(122.382 37.622), '96.1', 
'20.5', '8.0', '124.6' );
With the data we have, we can answer the following questions:
(1) How many rides took place on each day?
select pickup_datetime as day, counter as rides from rides@1d where 
driver_name='*' and rate_type='*' and payment_type='*';
Answer:
day=[2021-02-11 00:00:00] rides=[2]
day=[2021-02-12 00:00:00] rides=[1]

(2) How many rides took place on the day of ‘2021-02-12’?
select pickup_datetime as day, counter as rides from rides@1d where 
driver_name='*' and rate_type='*' and payment_type='*' and 
pickup_datetime='2021-02-11 00:00:00';
Answer:
day=[2021-02-11 00:00:00] rides=[2]

(3) What is the average fare amount?
select avg( fare_amount::avg) avg_fare_mount from rides@1M where 
driver_name='*' and rate_type='*' and payment_type='*';
Answer:
avg_fare_mount=[75.366667]

(4) What is the average fare amount in February of year 2021?145
select pickup_datetime as month, fare_amount::avg avg_fare_mount from rides@1M 
where driver_name='*' and rate_type='*' and payment_type='*' and 
pickup_datetime='2021-02-01 00:00:00';
Answer:
month=[2021-02-01 00:00:00] avg_fare_mount=[75.3666666667]

(5) What is the average fare amount for each driver?
select driver_name, avg( fare_amount::avg) avg_fare_mount from rides@1M where 
driver_name != '*' and rate_type='*' and payment_type='*' group by driver_name;
Answer:
driver_name=[DriverAHM] avg_fare_mount=[76.3]
driver_name=[DriverJHS] avg_fare_mount=[73.5]

(6) How many rides took place for each rate type?
select rate_type, sum(counter) rides from rides@1M where rate_type != '*' and 
driver_name='*' and payment_type='*' group by rate_type;
Answer:
rate_type=[HYP] rides=[1.0]
ate_type=[REG] rides=[2.0]

(7) What are the monthly average trip distance for all drivers?
select pickup_datetime as month, trip_distance::avg from rides@1M where 
rate_type='*' and payment_type='*' and driver_name='*';
Answer:
month=[2021-02-01 00:00:00] trip_distance::avg=[54.8666666667]

(8) What are the monthly average trip distance and maximum average distance for 
each driver?
select driver_name driver, pickup_datetime as month, avg(trip_distance::avg) 
avg_distance, max(trip_distance::avg ) max_avg_distance from rides@1M where 
rate_type='*' and payment_type='*' and driver_name != '*' group by driver_name;
nswer:
driver=[DriverAHM] month=[2021-02-01 00:00:00] avg_distance=[57.7] max_avg_distance=[57.7]
driver=[DriverJHS] month=[2021-02-01 00:00:00] avg_distance=[49.2] max_avg_distance=[49.2]

(9) How many rides took place every 5 minutes for the day of '2021-02-11' ?
elect pickup_datetime time, counter rides from rides@5m where driver_name='*' 
and rate_type='*' and payment_type='*' and pickup_datetime >= '2021-02-11 
00:00::00' and pickup_datetime < '2021-02-12 00:00:00' ;
Answer:146
time=[2021-02-11 09:20:00] rides=[1]
time=[2021-02-11 09:30:00] rides=[1]

(10) How many rides on the day of '2021-02-11' originated from within 10 
kilometers of Sunnyvale, California in 30 minute buckets?
select pickup_datetime as day from rides where distance(pickup_location, 
point( 122.035 37.369 ), 'center' ) < 18000;

11) What is the average total amount by 5 minutes for the day of 2021-02-11?
select pickup_datetime start5min, window(5m, pickup_datetime), 
avg(total_amount) avg_total_amount
from rides
where date(pickup_datetime)='2021-02-11'
group by pickup_datetime;
The window function creates time windows of 5 minutes based on the column ‘pickup_datetime’. The average is taken in the 5 minute windows by grouping the windows.


www.jaguardb.com