JSONL Resources
Curated collection of tools, libraries, and community resources for JSON Lines
Official Resources
JSONLines.org
Official JSON Lines format specification and documentation
NDJSON GitHub Organization
Community-driven NDJSON tools and specifications
Recommended Books
Fundamentals of Data Engineering
Learn how to plan and build data systems to serve your needs
Designing Data-Intensive Applications
The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
Data Pipelines Pocket Reference
Moving and Processing Data for Analytics
AI Engineering Foundation Models
Building Applications with Foundation Models
Libraries & Parsers
Python
jsonlines
Popular Python library for reading and writing JSONL files with a clean API
ndjson
NDJSON parser with the same API as the built-in json module
orjson
Fast, correct Python JSON library - 6x+ faster than stdlib with dataclass and numpy support
orjsonl
High-performance JSONL reader/writer powered by orjson with gzip, bzip2, and zstd compression
JavaScript / Node.js
Go
Ruby
Java
Command Line Tools
jq
The gold standard command-line JSON processor with full JSONL support
jaq
Rust-based jq alternative focused on correctness and speed - 5-10x faster on most benchmarks
gojq
Pure Go jq implementation with YAML support and arbitrary-precision integers
fx
Interactive terminal JSON viewer with JSONL support and mouse navigation
qsv
Blazing-fast data-wrangling toolkit for CSV, JSONL, and more - successor to xsv
Miller (mlr)
Like awk/sed/cut for name-indexed data such as CSV and JSONL
AI & Machine Learning
All major AI platforms use JSONL as their standard format for fine-tuning training data
OpenAI Fine-Tuning
JSONL format for GPT model fine-tuning with messages array structure
Anthropic Claude Fine-Tuning
Claude fine-tuning via Amazon Bedrock uses JSONL training data format
Google Gemini Fine-Tuning
Gemini supervised tuning on Vertex AI requires JSONL format training data
Mistral Fine-Tuning
Mistral models use JSONL with messages array for SFT training
Hugging Face Datasets
ML datasets library with native JSONL import/export support
TensorFlow Datasets
Collection of ready-to-use datasets with JSONL export capabilities
Data Processing & Databases
Polars
High-performance Rust/Python DataFrame library with native read_ndjson() and lazy JSONL reader
Pandas
Python data analysis with read_json(lines=True) for JSONL files
DuckDB
In-process analytical SQL engine with native NDJSON/JSONL query support
ClickHouse
OLAP database with native JSONL ingestion via JSONEachRow format
Apache Spark
Distributed big data processing with native JSONL read/write support
Apache Arrow
In-memory columnar format with efficient JSONL conversion