logv

overview

Problem: Querying and visualizing data to locate the source of a problem is slow and tedious

Solution: Build a general purpose tool that makes it easy to dig through instrumentation

Product

Highlights

A lifesaver in emergencies

Built for instrumentation (bad at everything else)

Stores recent raw samples for adhoc analysis

Prunes data by time and memory constraints

Emphasizes a comparative analysis workflow

Self Hosted & Open Source Software

Snorkel (the frontend)

Query & Visualization Builder

Built for ease of querying

Many visualizations, one dataset

Comparison queries are first class

Common controls across visualizations

Supports filtering, grouping and aggregating

Supports time series, tables and histograms

Flip between differents views of the same data easily

Sybil (the backend)

Append only OLAP for JSON data

Built for ease of ingestion

Data organized in datasets

Limit table size by time and memory

No table creation step: log and query immediately

Better compression with columnar storage

High performance with multi-threaded aggregation

Queries executed on the fly: full table scans only

Use cases

Organizational

A/B Testing

Firefighting

Bug Tracking

Perf Experiments

Deploy Monitoring

Personal

IoT & HW Sensors

Browser Usage

Web Analytics

Fitness & Health

Tradeoffs

Pros

Good for about 10mm - 30mm rows per query

Performs well as a firefighting tool

Good for digging into anomalies

Get an intuitive feeling for datasets

Cons

>100mm samples is not recommended

No SQL syntax

No joins, no updates

Is not a replacement for monitoring with TSDBs

No support for engagement or retention queries

Comparisons

vs Time Series DBs

Good for monitoring & long term trends

Products

InfluxDB

Graphite

RRDTool

Prometheus

TSDB Pros

Very fast queries and throughput

Has good UI and dashboarding

Long term data retention

TSDB Cons

No Datasets, top level keys instead

Single numeric quantity per key

Only supports time series queries

Does not store raw data

Bad at digging into anomalies

vs Column Stores

Good with analytics, logs, etc

Products

Interana

Memsql

Druid

Presto

ElasticSearch

Big Table

Clickhouse

Column Store Pros

Sharded, supports massive data!

Supports JOINS and advanced queries

Supports engagement queries

Built in SQL query engine

Is a full fledged DB

Column Store Cons

Hard to setup / Requires a cluster

Tables require creation step

Requires schema maintenance

vs Traditional DBs

Good with user & app data, etc

Products

Mysql

Postgres

Vertica

Rethinkdb

SQL server

RDBMS Pros

Battle tested, very robust

Supports JOINs and advanced queries

Supports engagement queries

RDBMS Cons

Tables require creation step

Requires schema maintenance

Single threaded aggregations

Row store architecture (bad for full table scans / arbitrary filters)

Demo

Get in touch

logv.org

Want to try out a demo?

Need help getting set up?

Interested in custom features?

Feel like to making it better?

Please email or open a github issue!