Indexly Data Analysis & File Pipeline Overview
Who This Page Is For
- Users deciding between analyze-file, analyze-json, analyze-db, and analyze-autodoctor
- Developers tracing how Indexly routes structured files through the loader and orchestrator
- Operators analyzing AutoDoctor report JSON, telemetry JSON, or SQLite output with Indexly
Current staging builds include stricter JSON and NDJSON handling: bounded JSON detection, chunk-limited NDJSON materialization, malformed-line rejection, safer mixed identifier handling, Socrata-style table mapping, and clearer sampling metadata. CSV analysis persists cleaned and raw data through a single orchestrator write path, and AutoDoctor report JSON, telemetry JSON, and SQLite databases have dedicated analysis paths. See Analyze JSON And NDJSON Files for the JSON-specific workflow.
Supported Formats
Indexly provides analysis and summarization for these structured formats:
CSV
- Delimiter detection
- Summary statistics, optional cleaning, visualization, and persistence via analyze-csv
- CSV routing through analyze-file when you want one command for mixed structured files
- Statistical inference through the Inference Docs
JSON and NDJSON
- Standard list and dictionary JSON
- NDJSON / record-list JSON
- .json files that contain NDJSON records
- Compressed *.json.gz files
- Socrata-style columns and data JSON
- Indexly search cache JSON
- AutoDoctor report JSON
- AutoDoctor telemetry JSON
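The Socrata-style entry above refers to documents that keep column definitions and positional rows separately. The following is a minimal sketch of how such a document could be mapped to row dictionaries. It assumes top-level `columns` and `data` keys; the function name `socrata_to_records` is hypothetical and is not Indexly's API.

```python
from typing import Any


def socrata_to_records(doc: dict[str, Any]) -> list[dict[str, Any]]:
    """Map a columns-plus-rows document to a list of row dictionaries."""
    # Assumed layout: "columns" is a list of {"name": ...} objects (or plain
    # strings) and "data" is a list of positional rows.
    names = [c["name"] if isinstance(c, dict) else str(c) for c in doc["columns"]]
    records = []
    for row in doc["data"]:
        # zip() stops at the shorter side, so pad short rows with None first.
        padded = list(row) + [None] * (len(names) - len(row))
        records.append(dict(zip(names, padded)))
    return records


sample = {
    "columns": [{"name": "id"}, {"name": "city"}],
    "data": [[1, "Oslo"], [2]],
}
records = socrata_to_records(sample)
```

Padding short rows keeps every record on the same set of keys, which makes the result easier to summarize as a table.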
SQLite
- Generic SQLite profiling
- Specialized AutoDoctor DB summaries when the schema matches AutoDoctor tables
Excel, Parquet, XML, YAML
- Sheet-aware Excel loading
- Efficient Parquet previews
- XML structure analysis and tree rendering
- Safe YAML loading into JSON-like structures
Choose The Right Command
| Scenario | Best command | Why |
|---|---|---|
| Known CSV file | indexly analyze-csv <file> | Uses the dedicated CSV parser, cleaning flags, visualizations, and CSV analysis exports |
| CSV file inside a mixed-format workflow | indexly analyze-file <file> --auto-clean | Lets the universal dispatcher detect CSV while still accepting CSV-specific options |
| Unknown structured file | indexly analyze-file <file> | Lets the universal loader detect the file and route it automatically |
| Exported CSVs or reports with inconsistent names | indexly rename-file <folder> --dry-run before analysis | Standardizes filenames so later analysis, search, and organizer logs are easier to compare |
| Large JSON or NDJSON file | indexly analyze-json <file> --chunk-size 10000 | Uses JSON-specific detection and chunk-limited NDJSON materialization |
| Generic SQLite inspection | indexly analyze-db <db> | Focused on schema, table profiling, and export |
| AutoDoctor report JSON, telemetry JSON, or autodoctor.db | indexly analyze-autodoctor <path> | Produces an operational summary instead of a generic table dump |
| AutoDoctor artifact, but you want auto-detection through the generic path | indexly analyze-file <path> | The orchestrator detects AutoDoctor and switches to the specialized path |
Command Behaviors
indexly analyze-csv <file>
This is the dedicated CSV route.
It is best for:
- delimiter detection and numeric summary statistics
- optional --auto-clean, --normalize, and --remove-outliers
- terminal, static, or interactive CSV visualizations
- CSV analysis exports in txt, md, or json
For parser-accurate CSV options, see Analyze CSV Data and Clean CSV Data.
indexly analyze-file <file>
This is the universal dispatcher.
It:
- detects file type through universal_loader
- adds metadata hints for special formats such as AutoDoctor
- routes into the correct pipeline through the orchestrator
Use this when you want one command for mixed datasets.
For SQLite files, this route is intentionally a quick preview path. It loads bounded table previews for generic database inspection. Use analyze-db when you need relationship discovery, table profiling controls, diagrams, or exportable database summaries.
indexly analyze-json <file>
This is the JSON-focused route.
It is best for:
- plain JSON
- NDJSON
- compressed JSON
- Socrata-style table JSON
- JSON files that may need structural fallback logic
It now shares more routing behavior with the orchestrator, which helps prevent an earlier failure mode in which NDJSON-style .json files summarized correctly but failed to persist cleanly.
Use --chunk-size on this command when a newline-delimited source is too large to materialize fully.
See Analyze JSON And NDJSON Files.
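To illustrate what chunk-limited materialization and malformed-line rejection can look like, here is a minimal sketch that reads a newline-delimited stream in bounded chunks. It is an illustration under assumptions, not Indexly's implementation: the function name `iter_ndjson_chunks` is hypothetical, and this page does not state whether Indexly processes every chunk or stops after the first.

```python
import io
import json
from typing import Any, Iterator, TextIO


def iter_ndjson_chunks(stream: TextIO, chunk_size: int) -> Iterator[list[dict[str, Any]]]:
    """Yield lists of at most chunk_size parsed records from an NDJSON stream."""
    chunk: list[dict[str, Any]] = []
    for line_no, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are skipped in this sketch
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            # Reject malformed input instead of silently dropping the line.
            raise ValueError(f"malformed NDJSON at line {line_no}: {exc.msg}") from exc
        if not isinstance(record, dict):
            raise ValueError(f"line {line_no} is not a JSON object")
        chunk.append(record)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # final partial chunk


source = io.StringIO('{"a": 1}\n{"a": 2}\n\n{"a": 3}\n')
chunks = list(iter_ndjson_chunks(source, chunk_size=2))
```

Because the generator holds only one chunk at a time, peak memory depends on the chunk size and not on the size of the file.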
indexly analyze-db <db>
This is the database-focused route.
It is best for:
- unknown SQLite databases
- table-by-table profiling
- relationship discovery
- schema exports and diagrams
By default, large tables are profiled with bounded sampling. Use --sample-size to choose a profile size, --fast or --fast-mode for lighter metrics, and --all-data only when full-table profiling is required.
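Bounded sampling of this kind can be sketched with the standard `sqlite3` module. The example assumes the simplest strategy, reading the first `sample_size` rows with `LIMIT`. Indexly's actual sampling method is not described on this page, and `profile_table` is a hypothetical name.

```python
import sqlite3


def profile_table(conn: sqlite3.Connection, table: str, sample_size: int) -> dict:
    """Profile at most sample_size rows of one table."""
    # Table names cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    total = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    cursor = conn.execute(f"SELECT * FROM {quoted} LIMIT ?", (sample_size,))
    columns = [d[0] for d in cursor.description]
    rows = cursor.fetchall()
    # Count NULLs per column within the sample only.
    nulls = {c: sum(r[i] is None for r in rows) for i, c in enumerate(columns)}
    return {
        "table": table,
        "total_rows": total,
        "sampled_rows": len(rows),
        "sampled": len(rows) < total,  # True when the profile is partial
        "null_counts": nulls,
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?)",
    [(i, None if i % 2 else f"t{i}") for i in range(10)],
)
profile = profile_table(conn, "tracks", sample_size=4)
```

Recording both the total and the sampled row count lets a report say explicitly when its metrics come from a partial read.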
When the database matches AutoDoctor’s schema, Indexly switches to an operational summary instead of staying in the generic inspection path.
indexly analyze-autodoctor <path>
This is the dedicated operational route for AutoDoctor artifacts.
It supports:
- AutoDoctor_Report.json
- Telemetry_*.json
- autodoctor.db
Use it when you want human-readable summaries first, not raw structure exploration.
See Analyze AutoDoctor Artifacts.
How Routing Works
Indexly’s structured-data analysis passes every command through three layers:
analyze-file / analyze-json / analyze-db / analyze-autodoctor
|
v
Universal Loader
|
v
Analysis Orchestrator
|
v
Specialized Pipelines
Universal Loader Responsibilities
- detect file type from extension and content
- distinguish JSON, NDJSON, SQLite, Excel, XML, YAML, and Parquet
- attach metadata hints such as AutoDoctor schema fingerprints
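Detection from extension and content, as listed above, can be sketched as follows. The magic byte values are the documented file signatures for SQLite, gzip, Parquet, and ZIP. The function names, the returned labels, and the 4096-byte probe size are assumptions made for illustration and do not describe `universal_loader` itself.

```python
import json
from pathlib import Path

SQLITE_MAGIC = b"SQLite format 3\x00"  # 16-byte header of every SQLite 3 file
GZIP_MAGIC = b"\x1f\x8b"               # gzip member header
PARQUET_MAGIC = b"PAR1"                # Parquet files begin and end with PAR1
ZIP_MAGIC = b"PK\x03\x04"              # .xlsx workbooks are ZIP archives

# Extension fallbacks for text formats that have no magic bytes.
EXTENSION_HINTS = {
    ".csv": "csv", ".json": "json", ".ndjson": "ndjson",
    ".xml": "xml", ".yaml": "yaml", ".yml": "yaml",
}


def looks_like_ndjson(head: bytes) -> bool:
    """True when the first line is a complete JSON object and more data follows."""
    first, newline, rest = head.partition(b"\n")
    if not newline or not rest.strip():
        return False  # one line only: treat as ordinary JSON
    try:
        return isinstance(json.loads(first), dict)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return False


def detect_format(path: Path, probe_bytes: int = 4096) -> str:
    """Classify a file from a bounded read of its head, then its extension."""
    with path.open("rb") as handle:
        head = handle.read(probe_bytes)
    if head.startswith(SQLITE_MAGIC):
        return "sqlite"
    if head.startswith(PARQUET_MAGIC):
        return "parquet"
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    suffix = path.suffix.lower()
    if head.startswith(ZIP_MAGIC) and suffix in {".xlsx", ".xlsm"}:
        return "excel"
    hint = EXTENSION_HINTS.get(suffix, "unknown")
    if hint == "json" and looks_like_ndjson(head):
        return "ndjson"  # content overrides the .json extension
    return hint
```

Reading only a fixed-size head keeps detection cheap on large files, at the cost of falling back to the extension when the first line is longer than the probe.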
Analysis Orchestrator Responsibilities
- decide which analysis pipeline should run
- preserve JSON-aware persistence behavior
- reroute special formats such as AutoDoctor into dedicated summaries
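One way to picture these responsibilities is a dispatch table keyed by detected format, with metadata hints taking priority. This is a hypothetical sketch: `Loaded`, `PIPELINES`, `route`, and the hint key `autodoctor` are illustrative names, not Indexly's internals.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Loaded:
    """What a loader might hand to the orchestrator (hypothetical shape)."""
    fmt: str
    hints: dict = field(default_factory=dict)


def summarize_generic(item: Loaded) -> str:
    return f"generic {item.fmt} summary"


def summarize_autodoctor(item: Loaded) -> str:
    return f"AutoDoctor {item.hints['autodoctor']} summary"


# One pipeline per detected format.
PIPELINES: dict[str, Callable[[Loaded], str]] = {
    "csv": summarize_generic,
    "json": summarize_generic,
    "sqlite": summarize_generic,
}


def route(item: Loaded) -> str:
    """Pick a pipeline; metadata hints win over the detected format."""
    if "autodoctor" in item.hints:
        return summarize_autodoctor(item)
    try:
        pipeline = PIPELINES[item.fmt]
    except KeyError:
        raise ValueError(f"no pipeline for format {item.fmt!r}") from None
    return pipeline(item)


plain = route(Loaded("json"))
special = route(Loaded("json", {"autodoctor": "report"}))
```

Checking hints before the format table is what lets the same JSON file land in either the generic or the specialized path.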
Pipeline Responsibilities
Each specialized pipeline handles:
- validation
- normalization
- preview generation
- summary generation
- persistence/export handoff
AutoDoctor-Aware Analysis
Indexly now recognizes two AutoDoctor JSON families plus the AutoDoctor SQLite schema:
| Artifact | What Indexly shows |
|---|---|
| AutoDoctor_Report.json | Root cause, health score, operational findings, inventory highlights |
| Telemetry_*.json | Run metadata, identity, module success, database sync, system snapshot |
| autodoctor.db | Latest system snapshot, alert summary, module status, baselines, remediation |
This avoids flattening operational documents into one synthetic table when a domain-specific summary is more useful.
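Recognizing the SQLite schema could work by comparing the tables present in a database against an expected set. The sketch below shows that idea only. The table names are placeholders, because the real AutoDoctor schema is not listed on this page, and `matches_fingerprint` is a hypothetical helper.

```python
import sqlite3

# Placeholder table names; the real AutoDoctor schema is not documented here.
EXPECTED_TABLES = frozenset({"system_snapshot", "alerts", "module_status"})


def matches_fingerprint(conn: sqlite3.Connection,
                        expected: frozenset = EXPECTED_TABLES) -> bool:
    """True when every expected table exists in the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    present = {name for (name,) in rows}
    return expected <= present  # subset test: extra tables are allowed


generic = sqlite3.connect(":memory:")
generic.execute("CREATE TABLE albums (id INTEGER)")

operational = sqlite3.connect(":memory:")
for table in EXPECTED_TABLES:
    operational.execute(f"CREATE TABLE {table} (id INTEGER)")

is_generic_match = matches_fingerprint(generic)
is_operational_match = matches_fingerprint(operational)
```

A subset test tolerates databases that gain extra tables over time while still rejecting unrelated schemas.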
For operational examples and artifact selection guidance, see Analyze AutoDoctor Artifacts.
Practical Examples
Preparing exported files before analysis
indexly rename-file ./exports --pattern "{date}-{title}" --recursive --dry-run
indexly rename-file ./exports --pattern "{date}-{title}" --recursive
Use Rename File when exported CSVs, reports, or logs need stable names before analysis or organization.
CSV analysis and cleaning
indexly analyze-csv sales.csv --show-summary
indexly analyze-csv sales.csv --auto-clean --show-summary --no-persist
indexly analyze-csv sales.csv --show-chart ascii --chart-type hist --transform auto
Generic structured-file analysis
indexly analyze-file sales.csv --auto-clean --show-summary
indexly analyze-file data.json --show-summary
indexly analyze-file metrics.parquet --show-summary
indexly analyze-file workbook.xlsx --sheet-name Sheet1 --show-summary
JSON and NDJSON analysis
indexly analyze-json iris.json --show-summary
indexly analyze-json events.ndjson --chunk-size 10000 --show-summary
indexly analyze-json records.json.gz --show-summary
SQLite analysis
indexly analyze-db chinook.db --show-summary --all-tables
indexly analyze-file chinook.db --show-summary
AutoDoctor analysis
indexly analyze-autodoctor .\AutoDoctor_Report.json --show-summary
indexly analyze-autodoctor .\Telemetry_20260416-081258-BTNB05.json --summary-only
indexly analyze-autodoctor .\autodoctor.db --show-summary