Indexly Data Analysis & File Pipeline Overview

Understand how Indexly analyzes CSV, JSON, NDJSON, SQLite, Excel, XML, YAML, and Parquet files through its universal loader and specialized pipelines.

Who This Page Is For

  • Users deciding between analyze-file, analyze-json, analyze-db, and analyze-autodoctor
  • Developers tracing how Indexly routes structured files through the loader and orchestrator
  • Operators analyzing AutoDoctor report JSON, telemetry JSON, or SQLite output with Indexly

Supported Formats

Indexly provides analysis and summarization for these structured formats:

CSV

  • Delimiter detection
  • Summary statistics, optional cleaning, visualization, and persistence via analyze-csv
  • CSV routing through analyze-file when you want one command for mixed structured files
  • Statistical inference through the Inference Docs

JSON and NDJSON

  • Standard list and dictionary JSON
  • NDJSON / record-list JSON
  • .json files that contain NDJSON records
  • Compressed *.json.gz files
  • Socrata-style columns and data JSON
  • Indexly search cache JSON
  • AutoDoctor report JSON
  • AutoDoctor telemetry JSON

SQLite

  • Generic SQLite profiling
  • Specialized AutoDoctor DB summaries when the schema matches AutoDoctor tables

Excel, Parquet, XML, YAML

  • Sheet-aware Excel loading
  • Efficient Parquet previews
  • XML structure analysis and tree rendering
  • Safe YAML loading into JSON-like structures

Choose The Right Command

Scenario Best command Why
Known CSV file indexly analyze-csv <file> Uses the dedicated CSV parser, cleaning flags, visualizations, and CSV analysis exports
CSV file inside a mixed-format workflow indexly analyze-file <file> --auto-clean Lets the universal dispatcher detect CSV while still accepting CSV-specific options
Unknown structured file indexly analyze-file <file> Lets the universal loader detect the file and route it automatically
Exported CSVs or reports with inconsistent names indexly rename-file <folder> --dry-run before analysis Standardizes filenames so later analysis, search, and organizer logs are easier to compare
Large JSON or NDJSON file indexly analyze-json <file> --chunk-size 10000 Uses JSON-specific detection and chunk-limited NDJSON materialization
Generic SQLite inspection indexly analyze-db <db> Focused on schema, table profiling, and export
AutoDoctor report JSON, telemetry JSON, or autodoctor.db indexly analyze-autodoctor <path> Produces an operational summary instead of a generic table dump
AutoDoctor artifact, but you want auto-detection through the generic path indexly analyze-file <path> The orchestrator detects AutoDoctor and switches to the specialized path

Command Behaviors

indexly analyze-csv <file>

This is the dedicated CSV route.

It is best for:

  • delimiter detection and numeric summary statistics
  • optional --auto-clean, --normalize, and --remove-outliers
  • terminal, static, or interactive CSV visualizations
  • CSV analysis exports in txt, md, or json

For parser-accurate CSV options, see Analyze CSV Data and Clean CSV Data.

indexly analyze-file <file>

This is the universal dispatcher.

It:

  • detects file type through universal_loader
  • adds metadata hints for special formats such as AutoDoctor
  • routes into the correct pipeline through the orchestrator

Use this when you want one command for mixed datasets.

For SQLite files, this route is intentionally a quick preview path. It loads bounded table previews for generic database inspection. Use analyze-db when you need relationship discovery, table profiling controls, diagrams, or exportable database summaries.

indexly analyze-json <file>

This is the JSON-focused route.

It is best for:

  • plain JSON
  • NDJSON
  • compressed JSON
  • Socrata-style table JSON
  • JSON files that may need structural fallback logic

It now shares more routing behavior with the orchestrator, which helps prevent the old failure mode where NDJSON-style .json files summarized correctly but could not persist cleanly.

Use --chunk-size on this command when a newline-delimited source is too large to materialize fully.

See Analyze JSON And NDJSON Files.

indexly analyze-db <db>

This is the database-focused route.

It is best for:

  • unknown SQLite databases
  • table-by-table profiling
  • relationship discovery
  • schema exports and diagrams

By default, large tables are profiled with bounded sampling. Use --sample-size to choose a profile size, --fast or --fast-mode for lighter metrics, and --all-data only when full-table profiling is required.

When the database matches AutoDoctor’s schema, Indexly switches to an operational summary instead of staying in the generic inspection path.

indexly analyze-autodoctor <path>

This is the dedicated operational route for AutoDoctor artifacts.

It supports:

  • AutoDoctor_Report.json
  • Telemetry_*.json
  • autodoctor.db

Use it when you want human-readable summaries first, not raw structure exploration.

See Analyze AutoDoctor Artifacts.

How Routing Works

Indexly’s structured-data analysis has three layers:

analyze-file / analyze-json / analyze-db / analyze-autodoctor
        |
        v
Universal Loader
        |
        v
Analysis Orchestrator
        |
        v
Specialized Pipelines

Universal Loader Responsibilities

  • detect file type from extension and content
  • distinguish JSON, NDJSON, SQLite, Excel, XML, YAML, and Parquet
  • attach metadata hints such as AutoDoctor schema fingerprints

Analysis Orchestrator Responsibilities

  • decide which analysis pipeline should run
  • preserve JSON-aware persistence behavior
  • reroute special formats such as AutoDoctor into dedicated summaries

Pipeline Responsibilities

Each specialized pipeline handles:

  • validation
  • normalization
  • preview generation
  • summary generation
  • persistence/export handoff

AutoDoctor-Aware Analysis

Indexly now recognizes two AutoDoctor JSON families plus the AutoDoctor SQLite schema:

Artifact What Indexly shows
AutoDoctor_Report.json Root cause, health score, operational findings, inventory highlights
Telemetry_*.json Run metadata, identity, module success, database sync, system snapshot
autodoctor.db Latest system snapshot, alert summary, module status, baselines, remediation

This avoids flattening operational documents into one synthetic table when a domain-specific summary is more useful.

For operational examples and artifact selection guidance, see:

Practical Examples

Preparing exported files before analysis

indexly rename-file ./exports --pattern "{date}-{title}" --recursive --dry-run
indexly rename-file ./exports --pattern "{date}-{title}" --recursive

Use Rename File when exported CSVs, reports, or logs need stable names before analysis or organization.

CSV analysis and cleaning

indexly analyze-csv sales.csv --show-summary
indexly analyze-csv sales.csv --auto-clean --show-summary --no-persist
indexly analyze-csv sales.csv --show-chart ascii --chart-type hist --transform auto

Generic structured-file analysis

indexly analyze-file sales.csv --auto-clean --show-summary
indexly analyze-file data.json --show-summary
indexly analyze-file metrics.parquet --show-summary
indexly analyze-file workbook.xlsx --sheet-name Sheet1 --show-summary

JSON and NDJSON analysis

indexly analyze-json iris.json --show-summary
indexly analyze-json events.ndjson --chunk-size 10000 --show-summary
indexly analyze-json records.json.gz --show-summary

SQLite analysis

indexly analyze-db chinook.db --show-summary --all-tables
indexly analyze-file chinook.db --show-summary

AutoDoctor analysis

indexly analyze-autodoctor .\AutoDoctor_Report.json --show-summary
indexly analyze-autodoctor .\Telemetry_20260416-081258-BTNB05.json --summary-only
indexly analyze-autodoctor .\autodoctor.db --show-summary