Indexly Data Analysis & File Pipeline Overview

Understand how Indexly analyzes CSV, JSON, NDJSON, SQLite, Excel, XML, YAML, and Parquet files through its universal loader and specialized pipelines.

Who This Page Is For

Users deciding between analyze-file, analyze-json, analyze-db, and analyze-autodoctor
Developers tracing how Indexly routes structured files through the loader and orchestrator
Operators analyzing AutoDoctor report JSON, telemetry JSON, or SQLite output with Indexly

What changed recently

Current staging builds include stricter JSON and NDJSON handling: bounded JSON detection, chunk-limited NDJSON materialization with full-stream validation, malformed-line rejection, safer mixed identifier handling, Socrata-style table mapping, and clearer sampling metadata. CSV analysis persists cleaned and raw data through a single orchestrator write path, and AutoDoctor report JSON, telemetry JSON, and SQLite databases have dedicated analysis paths. See Analyze JSON And NDJSON Files for the JSON-specific workflow.

Supported Formats

Indexly provides analysis and summarization for these structured formats:

CSV

Delimiter detection
Summary statistics, optional cleaning, visualization, and persistence via analyze-csv
CSV routing through analyze-file when you want one command for mixed structured files
Statistical inference through the Inference Docs
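
As a rough illustration of what delimiter detection involves, the sketch below uses Python's standard `csv.Sniffer`. It is not Indexly's implementation; `detect_delimiter` and the candidate delimiter set are assumptions made for this example.

```python
import csv
import io


def detect_delimiter(sample: str, candidates: str = ",;\t|") -> str:
    """Guess the delimiter of a CSV sample, falling back to a comma."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Sniffer could not decide (for example, an empty sample).
        return ","


sample = "id;name;amount\n1;apple;3.5\n2;pear;4.0\n"
delimiter = detect_delimiter(sample)
rows = list(csv.reader(io.StringIO(sample), delimiter=delimiter))
print(delimiter, rows[0])
```

Sniffing works on a bounded text sample, so the cost stays small even for large files.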

JSON and NDJSON

Standard list and dictionary JSON
NDJSON / record-list JSON
.json files that contain NDJSON records
Compressed *.json.gz files
Socrata-style columns and data JSON
Indexly search cache JSON
AutoDoctor report JSON
AutoDoctor telemetry JSON
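
To make the Socrata-style case concrete, the sketch below maps a payload with column definitions and row arrays into a list of records. The function name `socrata_to_records` and the assumed layout (columns under `meta.view.columns` or a top-level `columns` key, rows under `data`) are illustrative assumptions, not Indexly's actual mapping code.

```python
from typing import Any


def socrata_to_records(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Pair each data row with column names to build one dict per record."""
    # Assumed layout: column definitions nested under meta.view.columns,
    # or a plain top-level "columns" list as a fallback.
    nested = payload.get("meta", {}).get("view", {}).get("columns")
    columns = nested or payload.get("columns", [])
    names = []
    for col in columns:
        if isinstance(col, dict):
            names.append(col.get("fieldName") or col.get("name"))
        else:
            names.append(str(col))
    return [dict(zip(names, row)) for row in payload.get("data", [])]


payload = {
    "meta": {"view": {"columns": [
        {"name": "Station", "fieldName": "station"},
        {"name": "Reading", "fieldName": "reading"},
    ]}},
    "data": [["north", 12.5], ["south", 9.0]],
}
records = socrata_to_records(payload)
print(records)
```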

SQLite

Generic SQLite profiling
Specialized AutoDoctor DB summaries when the schema matches AutoDoctor tables

Excel, Parquet, XML, YAML

Sheet-aware Excel loading
Efficient Parquet previews
XML structure analysis and tree rendering
Safe YAML loading into JSON-like structures
When persistence is enabled, analyze-file writes YAML results to cleaned_data and stores YAML-specific metadata and artifact references

Choose The Right Command

Scenario	Best command	Why
Known CSV file	`indexly analyze-csv <file>`	Uses the dedicated CSV parser, cleaning flags, visualizations, and CSV analysis exports
CSV file inside a mixed-format workflow	`indexly analyze-file <file> --auto-clean`	Lets the universal dispatcher detect CSV while still accepting CSV-specific options
Unknown structured file	`indexly analyze-file <file>`	Lets the universal loader detect the file and route it automatically
Exported CSVs or reports with inconsistent names	`indexly rename-file <folder> --dry-run` before analysis	Standardizes filenames so later analysis, search, and organizer logs are easier to compare
Large JSON or NDJSON file	`indexly analyze-json <file> --chunk-size 10000`	Uses JSON-specific detection and chunk-limited NDJSON materialization
Generic SQLite inspection	`indexly analyze-db <db>`	Focused on schema, table profiling, and export
AutoDoctor report JSON, telemetry JSON, or `autodoctor.db`	`indexly analyze-autodoctor <path>`	Produces an operational summary instead of a generic table dump
AutoDoctor artifact, but you want auto-detection through the generic path	`indexly analyze-file <path>`	The orchestrator detects AutoDoctor and switches to the specialized path

Command Behaviors

`indexly analyze-csv <file>`

This is the dedicated CSV route.

It is best for:

delimiter detection and numeric summary statistics
optional --auto-clean, --normalize, and --remove-outliers
terminal, static, or interactive CSV visualizations
CSV analysis exports in txt, md, or json

For parser-accurate CSV options, see Analyze CSV Data and Clean CSV Data.

`indexly analyze-file <file>`

This is the universal dispatcher.

It:

detects file type through universal_loader
adds metadata hints for special formats such as AutoDoctor
routes into the correct pipeline through the orchestrator

Use this when you want one command for mixed datasets.

For SQLite files, this route is intentionally a quick preview path. It loads bounded table previews for generic database inspection. Use analyze-db when you need relationship discovery, table profiling controls, diagrams, or exportable database summaries.
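
A bounded table preview can be pictured as a `LIMIT` query per table. The following is a minimal sketch using Python's `sqlite3` module; `preview_tables` and the default limit are hypothetical and do not describe how Indexly loads previews internally.

```python
import sqlite3


def preview_tables(conn: sqlite3.Connection, limit: int = 5) -> dict[str, list[tuple]]:
    """Return at most `limit` rows from every user table."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    previews = {}
    for table in tables:
        # Identifiers cannot be bound as parameters, so quote the name;
        # the row limit is bound normally.
        quoted = '"' + table.replace('"', '""') + '"'
        query = f"SELECT * FROM {quoted} LIMIT ?"
        previews[table] = conn.execute(query, (limit,)).fetchall()
    return previews


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO orders (total) VALUES (?)", [(float(i),) for i in range(100)])
previews = preview_tables(conn, limit=3)
print({name: len(rows) for name, rows in previews.items()})
conn.close()
```

Because only a handful of rows per table are read, the preview cost does not grow with table size.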

For YAML and YML files, persistence is handled by the same orchestrator path used by other structured formats. By default, analyze-file writes the cleaned preview and summary to ~/.indexly/indexly.db (cleaned_data), includes a JSON-safe yaml_table_output block in metadata, and records an auxiliary analysis artifact path with schema indexly.yaml.analysis.v1. Use --no-persist to skip both the analysis database write and YAML artifact creation for that run.

`indexly analyze-json <file>`

This is the JSON-focused route.

It is best for:

plain JSON
NDJSON
compressed JSON
Socrata-style table JSON
JSON files that may need structural fallback logic

It now shares more of its routing logic with the orchestrator. This helps prevent an earlier failure mode in which NDJSON-style .json files were summarized correctly but could not be persisted cleanly.

Use --chunk-size on this command when a newline-delimited source is too large to materialize fully.

See Analyze JSON And NDJSON Files.
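
One way to read "chunk-limited materialization with full-stream validation" is: keep only the first N records in memory, but still parse every line so malformed input is rejected. The sketch below follows that reading. `load_ndjson` and its return shape are assumptions for illustration, not Indexly's API.

```python
import io
import json
from typing import Any, TextIO


def load_ndjson(stream: TextIO, chunk_size: int) -> tuple[list[dict[str, Any]], int]:
    """Keep the first `chunk_size` records but validate every line."""
    kept: list[dict[str, Any]] = []
    total = 0
    for line_no, line in enumerate(stream, start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"malformed NDJSON at line {line_no}: {exc.msg}") from exc
        if not isinstance(record, dict):
            raise ValueError(f"line {line_no} is not a JSON object")
        total += 1
        if len(kept) < chunk_size:
            kept.append(record)  # materialize only up to the chunk limit
    return kept, total


stream = io.StringIO('{"id": 1}\n{"id": 2}\n{"id": 3}\n')
kept, total = load_ndjson(stream, chunk_size=2)
print(len(kept), total)
```

Returning the total alongside the kept records lets a caller report how much of the source was sampled.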

`indexly analyze-db <db>`

This is the database-focused route.

It is best for:

unknown SQLite databases
table-by-table profiling
relationship discovery
schema exports and diagrams

By default, large tables are profiled with bounded sampling. Use --sample-size to choose a profile size, --fast or --fast-mode for lighter metrics, and --all-data only when full-table profiling is required.

When the database matches AutoDoctor’s schema, Indexly switches to an operational summary instead of staying in the generic inspection path.

`indexly analyze-autodoctor <path>`

This is the dedicated operational route for AutoDoctor artifacts.

It supports:

AutoDoctor_Report.json
Telemetry_*.json
autodoctor.db

Use it when you want human-readable summaries first, not raw structure exploration.

See Analyze AutoDoctor Artifacts.

How Routing Works

Each analysis command passes its input through three layers:

analyze-file / analyze-json / analyze-db / analyze-autodoctor
        |
        v
Universal Loader
        |
        v
Analysis Orchestrator
        |
        v
Specialized Pipelines

Universal Loader Responsibilities

detect file type from extension and content
distinguish JSON, NDJSON, SQLite, Excel, XML, YAML, and Parquet
attach metadata hints such as AutoDoctor schema fingerprints
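
Detection from extension and content could look roughly like the sketch below, which checks well-known leading bytes first and then falls back to the file extension. The names `detect_type`, `looks_like_ndjson`, and the extension table are hypothetical; Indexly's universal_loader may use different rules.

```python
import json
from pathlib import PurePath

# Leading bytes that identify a binary format regardless of extension.
MAGIC_PREFIXES = [
    (b"SQLite format 3\x00", "sqlite"),
    (b"PAR1", "parquet"),
]
EXTENSIONS = {
    ".csv": "csv", ".json": "json", ".ndjson": "ndjson", ".xml": "xml",
    ".yaml": "yaml", ".yml": "yaml", ".xlsx": "excel",
    ".parquet": "parquet", ".db": "sqlite", ".sqlite": "sqlite",
}


def looks_like_ndjson(sample: bytes) -> bool:
    """True when the sample holds two or more lines that each parse as a JSON object."""
    # The sample is expected to end on a line boundary.
    text = sample.decode("utf-8", errors="replace")
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        return False
    try:
        return all(isinstance(json.loads(ln), dict) for ln in lines)
    except json.JSONDecodeError:
        return False


def detect_type(filename: str, sample: bytes) -> str:
    """Classify a file from its name and its first bytes."""
    for prefix, kind in MAGIC_PREFIXES:
        if sample.startswith(prefix):
            return kind
    kind = EXTENSIONS.get(PurePath(filename).suffix.lower(), "unknown")
    if kind == "json" and looks_like_ndjson(sample):
        return "ndjson"  # a .json file that actually holds NDJSON records
    return kind


print(detect_type("export.json", b'{"id": 1}\n{"id": 2}\n'))
print(detect_type("data.bin", b"SQLite format 3\x00" + b"\x00" * 8))
print(detect_type("report.yml", b"name: demo\n"))
```

Checking content before the extension is what lets a mislabeled file, such as NDJSON saved as .json, reach the right pipeline.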

Analysis Orchestrator Responsibilities

decide which analysis pipeline should run
preserve JSON-aware persistence behavior
reroute special formats such as AutoDoctor into dedicated summaries
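
The routing decision can be thought of as a dispatch table with an override for metadata hints. This is a conceptual sketch only: the pipeline functions, the `hints` dictionary, and the `autodoctor` key are invented for illustration and are not Indexly's internal names.

```python
from typing import Any, Callable

Pipeline = Callable[[str], dict[str, Any]]


def generic_pipeline(path: str) -> dict[str, Any]:
    return {"pipeline": "generic", "path": path}


def json_pipeline(path: str) -> dict[str, Any]:
    return {"pipeline": "json", "path": path}


def autodoctor_pipeline(path: str) -> dict[str, Any]:
    return {"pipeline": "autodoctor", "path": path}


# Hypothetical mapping from detected type to pipeline.
PIPELINES: dict[str, Pipeline] = {"json": json_pipeline, "ndjson": json_pipeline}


def route(path: str, detected_type: str, hints: dict[str, Any]) -> dict[str, Any]:
    """Pick a pipeline; loader hints take priority over the detected type."""
    if hints.get("autodoctor"):
        return autodoctor_pipeline(path)
    return PIPELINES.get(detected_type, generic_pipeline)(path)


print(route("events.ndjson", "ndjson", {}))
print(route("AutoDoctor_Report.json", "json", {"autodoctor": True}))
```

Checking hints before the type table is what allows a file that is technically JSON to receive a dedicated summary.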

Pipeline Responsibilities

Each specialized pipeline handles:

validation
normalization
preview generation
summary generation
persistence/export handoff

AutoDoctor-Aware Analysis

Indexly now recognizes two AutoDoctor JSON families plus the AutoDoctor SQLite schema:

Artifact	What Indexly shows
`AutoDoctor_Report.json`	Root cause, health score, operational findings, inventory highlights
`Telemetry_*.json`	Run metadata, identity, module success, database sync, system snapshot
`autodoctor.db`	Latest system snapshot, alert summary, module status, baselines, remediation

This avoids flattening operational documents into one synthetic table when a domain-specific summary is more useful.
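
As a simplified stand-in for schema fingerprinting, the sketch below classifies an artifact by the three filename patterns listed on this page. Indexly itself is described as matching on schema, so treat `classify_autodoctor_artifact` and the filename-only approach as assumptions for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PureWindowsPath
from typing import Optional

# Filename patterns taken from the artifact names listed on this page.
ARTIFACT_PATTERNS = [
    ("AutoDoctor_Report.json", "report"),
    ("Telemetry_*.json", "telemetry"),
    ("autodoctor.db", "database"),
]


def classify_autodoctor_artifact(path: str) -> Optional[str]:
    """Return the artifact family for a path, or None if nothing matches."""
    # PureWindowsPath accepts both "\\" and "/" as separators.
    name = PureWindowsPath(path).name
    for pattern, family in ARTIFACT_PATTERNS:
        if fnmatchcase(name, pattern):
            return family
    return None


print(classify_autodoctor_artifact(r".\Telemetry_20260416-081258-BTNB05.json"))
print(classify_autodoctor_artifact("./out/autodoctor.db"))
print(classify_autodoctor_artifact("sales.csv"))
```

A filename check is cheap but easy to fool, which is one reason a content-based fingerprint is the more robust choice.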

For operational examples and artifact selection guidance, see Analyze AutoDoctor Artifacts.

Practical Examples

Preparing exported files before analysis

indexly rename-file ./exports --pattern "{date}-{title}" --recursive --dry-run
indexly rename-file ./exports --pattern "{date}-{title}" --recursive

Use Rename File when exported CSVs, reports, or logs need stable names before analysis or organization.

CSV analysis and cleaning

indexly analyze-csv sales.csv --show-summary
indexly analyze-csv sales.csv --auto-clean --show-summary --no-persist
indexly analyze-csv sales.csv --show-chart ascii --chart-type hist --transform auto

Generic structured-file analysis

indexly analyze-file sales.csv --auto-clean --show-summary
indexly analyze-file data.json --show-summary
indexly analyze-file metrics.parquet --show-summary
indexly analyze-file workbook.xlsx --sheet-name Sheet1 --show-summary

JSON and NDJSON analysis

indexly analyze-json iris.json --show-summary
indexly analyze-json events.ndjson --chunk-size 10000 --show-summary
indexly analyze-json records.json.gz --show-summary

SQLite analysis

indexly analyze-db chinook.db --show-summary --all-tables
indexly analyze-file chinook.db --show-summary

AutoDoctor analysis

indexly analyze-autodoctor .\AutoDoctor_Report.json --show-summary
indexly analyze-autodoctor .\Telemetry_20260416-081258-BTNB05.json --summary-only
indexly analyze-autodoctor .\autodoctor.db --show-summary
