Analyze CSV: Visualize, Transform & Understand Your Data

Explore, visualize, and normalize CSV datasets in Indexly using statistical summaries, skew detection, and ASCII visualizations. Perfect for data analysts and developers working with terminal-based data exploration.

Overview

The analyze-csv command in Indexly turns raw CSV files into meaningful insights.
With a single command, you can:

  • Compute detailed summary statistics (mean, median, std, IQR, skew, etc.)
  • Apply numeric transformations (log, sqrt, softplus, exp-log, or auto)
  • Visualize results using ASCII histograms or boxplots
  • Auto-adjust scaling and binning for highly skewed data
  • Export results as Markdown or HTML for reporting

This feature bridges quick terminal exploration and statistical understanding, all without leaving your CLI.


Key Highlights

  • 📈 Smart transformations: detect skew automatically and apply optimal scaling (auto mode).
  • 🎨 ASCII visualizations: view histograms or boxplots directly in the terminal.
  • 🔍 Skew and distribution insight: see before/after skew changes at a glance.
  • ⚙️ Adaptive scaling: use log or sqrt scaling for long-tailed distributions.
  • 🧮 Statistical summary: mean, median, std, and quartiles per column.
  • 🧾 Export options: save as Markdown (--export md) or plot as interactive HTML (--mode interactive).

Quick Start Example

Let's analyze a dataset called sales_data.csv:

indexly analyze-csv sales_data.csv --show-chart ascii --chart-type hist --transform auto

Output Example:

📈 Transformation Statistics Overview
────────────────────────────────────────────
┌─────────┬───────────────┬──────────────┬─────────────────┬────────────────┬──────────────┬─────────────┬───────────────┬──────────────┬───────┐
│ Column  │ Mean (Before) │ Mean (After) │ Median (Before) │ Median (After) │ Std (Before) │ Std (After) │ Skew (Before) │ Skew (After) │ ΔSkew │
├─────────┼───────────────┼──────────────┼─────────────────┼────────────────┼──────────────┼─────────────┼───────────────┼──────────────┼───────┤
│ revenue │ 8123.33       │ 6.21         │ 4100.00         │ 6.02           │ 9255.50      │ 2.12        │ 4.12          │ 0.41         │ -3.71 │
└─────────┴───────────────┴──────────────┴─────────────────┴────────────────┴──────────────┴─────────────┴───────────────┴──────────────┴───────┘

This table compares statistics before and after transformation. Here the skew of revenue falls from 4.12 to 0.41, a change of -3.71.


Statistical Insights

Indexly automatically calculates:

Metric          Description
Count           Total non-null values per column
Nulls           Missing entries
Mean            Average value
Median          50th percentile
Std Dev         Spread of the data
Sum             Total of all values
Q1 / Q3 / IQR   Quartiles and interquartile range
Skew            Measures asymmetry; positive means right-tailed

Skewed data can distort interpretation, so Indexly includes a transformation pipeline to normalize it automatically.
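
For readers who want to see how these metrics are defined, here is a minimal Python sketch that computes them for one column using only the standard library. It is not Indexly's implementation: the function name summarize, the use of None for missing entries, the sample standard deviation, the inclusive quartile method, and the moment-based skewness formula are all assumptions made for illustration.

```python
import math
import statistics


def summarize(values):
    """Summary statistics for one column; None marks a missing entry."""
    present = [float(v) for v in values if v is not None]
    n = len(present)
    mean = statistics.fmean(present)
    # Quartiles by linear interpolation over the observed range (assumed method).
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    # Moment-based skewness: third central moment over the cubed population std dev.
    m2 = sum((x - mean) ** 2 for x in present) / n
    m3 = sum((x - mean) ** 3 for x in present) / n
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {
        "count": n,
        "nulls": len(values) - n,
        "mean": mean,
        "median": median,
        "std": statistics.stdev(present),  # sample std dev (n - 1), assumed
        "sum": math.fsum(present),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "skew": skew,
    }


stats = summarize([1, 2, 2, 3, None, 50])
for name, value in stats.items():
    print(f"{name:>6}: {value}")
```

In this sample the single large value (50) stretches the right tail, so the skew comes out positive.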


Transformation & Scaling

When you run with --transform auto, Indexly examines each numeric column's skewness and selects the most appropriate transformation:

Skew Range   Transformation Applied
> 3          Log transform
1–3          Square root transform
< -1         Softplus transform
otherwise    No transform
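
The selection rule in the table above can be sketched as a small Python function. The name choose_transform is hypothetical, and the table does not say which side a skew of exactly 1 or exactly 3 falls on, so treating both boundaries as square root is an assumption.

```python
def choose_transform(skew):
    """Pick a transform name from a column's skewness, following the table above."""
    if skew > 3:
        return "log"
    if skew >= 1:  # assumption: the 1-3 range includes both endpoints
        return "sqrt"
    if skew < -1:
        return "softplus"
    return "none"


for value in (4.12, 2.0, 0.41, -1.5):
    print(value, "->", choose_transform(value))
```

With the Quick Start numbers, revenue starts at a skew of 4.12, which lands in the log branch.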

For manual control, use:

--transform log
--transform sqrt
--transform softplus
--transform exp-log
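
As a rough guide to what three of these options compute, the sketch below gives one common definition of each in Python. These are assumptions rather than Indexly's code: log is written as log(1 + x) so that zeros stay finite, log and sqrt assume non-negative input, and exp-log is left out because this page does not define it.

```python
import math


def log_transform(x):
    # log(1 + x): compresses large values and maps 0 to 0
    return math.log1p(x)


def sqrt_transform(x):
    # milder compression than log, for moderate right skew
    return math.sqrt(x)


def softplus(x):
    # log(1 + e^x), rearranged so that large |x| cannot overflow
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


revenue = [0.0, 100.0, 4100.0, 50000.0]
print([round(log_transform(v), 2) for v in revenue])
print([round(sqrt_transform(v), 2) for v in revenue])
```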

Adaptive Scaling

Histograms automatically switch to log scaling if the ratio between the highest and lowest bin counts exceeds 1,000, which keeps extremely uneven distributions readable.
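
The switch described above might look like the following Python sketch. The helper name bar_lengths, the 44-character bar width, the use of log(1 + count) for compression, and the choice to ignore empty bins when finding the lowest count are all assumptions; only the 1,000 ratio comes from this page.

```python
import math


def bar_lengths(counts, width=44, ratio_limit=1000):
    """Scale bin counts to bar lengths, using log scaling for very uneven bins."""
    nonzero = [c for c in counts if c > 0]
    if not nonzero:
        return [0] * len(counts)
    # Assumption: the "lowest" bin is the smallest non-empty one.
    use_log = max(nonzero) / min(nonzero) > ratio_limit
    scaled = [math.log1p(c) if use_log else float(c) for c in counts]
    top = max(scaled)
    return [round(width * s / top) for s in scaled]


print(bar_lengths([589, 89, 45]))     # ratio about 13: linear scaling
print(bar_lengths([100000, 10, 1]))   # ratio 100000: log scaling
```

Under linear scaling a bin holding 1 row next to a bin holding 100,000 would round to an empty bar; log scaling keeps it visible.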


Visual Exploration

You can visualize your data directly in the terminal:

1. Histogram Mode

indexly analyze-csv sales_data.csv --chart-type hist

Produces an ASCII histogram like this:

[revenue (Δskew=-3.71)]
Min: 0.00   Q1: 2.10   Median: 6.02   Q3: 8.20   Max: 10.80
[0.00, 1.08]  ████████████████████████████████████████████  62.3% (589)
[1.08, 2.16]  ████                                          9.4% (89)
[2.16, 3.24]  ██                                            4.8% (45)
...

Bars scale dynamically based on bin counts. Extremely small bins (<0.1%) display as <0.1%, ensuring even sparse data remains visible.
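
A histogram row of this shape can be produced with a few lines of Python. This is a sketch under stated assumptions, not Indexly's renderer: the function names format_share and render_histogram and the 44-character bar width are hypothetical, and bars are scaled linearly against the largest bin.

```python
def format_share(count, total):
    """Share of rows in a bin; tiny non-empty bins read '<0.1%' rather than '0.0%'."""
    share = 100.0 * count / total
    if 0 < share < 0.1:
        return "<0.1%"
    return f"{share:.1f}%"


def render_histogram(edges, counts, width=44):
    """Return one text row per bin: '[lo, hi]  bar  share (count)'."""
    total, top = sum(counts), max(counts)
    if total == 0:
        return []
    rows = []
    for lo, hi, count in zip(edges, edges[1:], counts):
        bar = "█" * round(width * count / top)  # longest bar fills the full width
        share = format_share(count, total)
        rows.append(f"[{lo:.2f}, {hi:.2f}]  {bar:<{width}}  {share} ({count})")
    return rows


for row in render_histogram([0.0, 1.08, 2.16, 3.24], [589, 89, 45]):
    print(row)
```

The percentages printed here differ from the sample output above because only three of its bins are included in the total.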


2. Boxplot Mode

indexly analyze-csv sales_data.csv --chart-type box

Shows an ASCII boxplot with quartiles and median indicators:

[revenue] (transform=log)
   0.00 ╞═════════│═════════════════════════════════════╡ 10.80
           Q1          Med                Q3
→ Range=10.80, IQR=3.25, Median=6.02
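
To show how a five-number summary maps onto character positions, here is a minimal Python sketch of a one-line boxplot. The function name render_boxplot, the 48-character width, and the glyphs chosen for whiskers, box, and median are assumptions, so the drawing will not match Indexly's output exactly.

```python
def render_boxplot(vmin, q1, median, q3, vmax, width=48):
    """Draw min..max as one text line, with the Q1..Q3 box and the median marked."""
    span = vmax - vmin

    def column(value):
        # Map a data value to a character position between 0 and width - 1.
        return 0 if span == 0 else round((width - 1) * (value - vmin) / span)

    cells = ["─"] * width                      # whiskers
    for i in range(column(q1), column(q3) + 1):
        cells[i] = "═"                         # interquartile box
    cells[column(median)] = "│"                # median marker
    return f"{vmin:.2f} ├{''.join(cells)}┤ {vmax:.2f}"


line = render_boxplot(0.00, 2.10, 6.02, 8.20, 10.80)
print(line)
```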

Export & Integration

Export as Markdown

indexly analyze-csv sales_data.csv --export md

Saves a Markdown table of all summary statistics for documentation or reports.
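
For a sense of what such a table looks like in text form, the Python sketch below renders a list of per-column statistics as a pipe-delimited Markdown table. The function name to_markdown_table, the column names, and the two-decimal formatting are assumptions; the layout of Indexly's actual export may differ.

```python
def to_markdown_table(rows, columns):
    """Render a list of dicts as a pipe-delimited Markdown table."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = []
    for row in rows:
        # Floats get two decimals; everything else is printed as-is.
        cells = [f"{row[c]:.2f}" if isinstance(row[c], float) else str(row[c])
                 for c in columns]
        body.append("| " + " | ".join(cells) + " |")
    return "\n".join([header, divider, *body])


table = to_markdown_table(
    [{"column": "revenue", "mean": 8123.33, "median": 4100.0, "skew": 4.12}],
    ["column", "mean", "median", "skew"],
)
print(table)
```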

Generate Interactive Charts

indexly analyze-csv sales_data.csv --mode interactive

Uses Plotly to produce dynamic visualizations viewable in the browser.


Behind the Scenes

  • Binning Strategy:

    • For normal data or mild skew, uses equal-width bins.
    • For extreme skew (|skew| > 5), switches to quantile-based binning for better visibility.
  • Adaptive Decimal Precision: Decimal places adjust automatically based on bin width using:

    decimals = max(2, int(-np.floor(np.log10(bin_width))))
    
  • ΔSkew Calculation: Displayed as (After - Before) to show the direction of improvement. Example: Δskew=-3.71 means skew reduced by 3.71 after transformation.
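
The three rules above can be sketched together in Python. The function names are hypothetical, the quantile edges are taken at evenly spaced ranks of the sorted data (one of several possible definitions), and label_decimals restates the formula above with the standard library, assuming a positive bin width.

```python
import math


def bin_edges(values, bins, skew):
    """Equal-width edges normally; quantile-based edges when |skew| > 5."""
    data = sorted(values)
    lo, hi = data[0], data[-1]
    if abs(skew) > 5:
        # Edges at evenly spaced ranks, so each bin holds a similar number of rows.
        # Heavily tied data can produce repeated edges.
        last = len(data) - 1
        return [data[round(i * last / bins)] for i in range(bins + 1)]
    step = (hi - lo) / bins
    return [lo + i * step for i in range(bins + 1)]


def label_decimals(bin_width):
    """Narrower bins get more decimal places, never fewer than two."""
    return max(2, int(-math.floor(math.log10(bin_width))))


def delta_skew(before, after):
    """Negative when the transform pulls skew down from a positive value."""
    return after - before


print(bin_edges(range(11), bins=5, skew=0.2))
print(bin_edges([1, 1, 1, 2, 2, 3, 5, 9, 40, 1000, 100000], bins=5, skew=6.0))
print(label_decimals(1.08), label_decimals(0.0004))
print(round(delta_skew(4.12, 0.41), 2))
```

With equal-width bins the second dataset would put ten of its eleven values in the first bin; the quantile edges spread them out instead.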


Pro Tips

  • Use --transform auto for mixed datasets: Indexly will normalize each column automatically.

  • Use --scale sqrt for moderate skew instead of full log scaling.

  • For quick terminal analysis, combine with:

    indexly analyze-csv data.csv --show-chart ascii --chart-type hist --bins 15
    
  • Export results for documentation:

    indexly analyze-csv data.csv --export md > analysis.md
    

Next Steps

Continue exploring Indexly's analytical capabilities:


✨ Indexly makes your data talk: visually, statistically, and intelligently.