Analyze CSV: Visualize, Transform & Understand Your Data
Overview
The analyze-csv command in Indexly turns raw CSV files into meaningful insights.
With a single command, you can:
- Compute detailed summary statistics (mean, median, std, IQR, skew, etc.)
- Apply numeric transformations (log, sqrt, softplus, exp-log, or auto)
- Visualize results using ASCII histograms or boxplots
- Auto-adjust scaling and binning for highly skewed data
- Export results as Markdown or HTML for reporting
This feature bridges quick terminal exploration with statistical understanding, all without leaving your CLI.
Key Highlights
- Smart transformations: detect skew automatically and apply optimal scaling (auto mode).
- ASCII visualizations: view histograms or boxplots directly in the terminal.
- Skew and distribution insight: see before/after skew changes at a glance.
- Adaptive scaling: use log or sqrt scaling for long-tailed distributions.
- Statistical summary: mean, median, std, and quartiles per column.
- Export options: save as Markdown (--export md) or plot as interactive HTML (--mode interactive).
Quick Start Example
Let's analyze a dataset called sales_data.csv:
indexly analyze-csv sales_data.csv --show-chart ascii --chart-type hist --transform auto
Output Example:
Transformation Statistics Overview
───────────────────────────────────────────
┌─────────┬───────────────┬──────────────┬─────────────────┬────────────────┬──────────────┬─────────────┬───────────────┬──────────────┬───────┐
│ Column  │ Mean (Before) │ Mean (After) │ Median (Before) │ Median (After) │ Std (Before) │ Std (After) │ Skew (Before) │ Skew (After) │ ΔSkew │
├─────────┼───────────────┼──────────────┼─────────────────┼────────────────┼──────────────┼─────────────┼───────────────┼──────────────┼───────┤
│ revenue │ 8123.33       │ 6.21         │ 4100.00         │ 6.02           │ 9255.50      │ 2.12        │ 4.12          │ 0.41         │ -3.71 │
└─────────┴───────────────┴──────────────┴─────────────────┴────────────────┴──────────────┴─────────────┴───────────────┴──────────────┴───────┘
This table compares pre- and post-transformation statistics. Here the skew of revenue drops from 4.12 to 0.41, a change of -3.71.
Statistical Insights
Indexly automatically calculates:
| Metric | Description |
|---|---|
| Count | Total non-null values per column |
| Nulls | Missing entries |
| Mean | Average value |
| Median | 50th percentile |
| Std Dev | Spread of the data |
| Sum | Total cumulative value |
| Q1 / Q3 / IQR | Quartiles and interquartile range |
| Skew | Measures asymmetry of the distribution; positive means right-tailed |
Skewed data can distort interpretation, so Indexly includes a transformation pipeline to normalize it automatically.
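As a rough illustration of how these metrics relate to each other, the sketch below computes them for a single column with NumPy. The function name `summarize`, the use of NaN to mark missing entries, the sample standard deviation (`ddof=1`), and the moment-based skew formula are assumptions made for this example; Indexly's own implementation may differ.

```python
import numpy as np

def summarize(values):
    """Compute the metrics from the table above for one numeric column."""
    arr = np.asarray(values, dtype=float)
    clean = arr[~np.isnan(arr)]            # NaN marks a missing entry (assumption)
    mean = clean.mean()
    q1, median, q3 = np.percentile(clean, [25, 50, 75])
    # Moment-based skewness: third central moment divided by variance ** 1.5
    m2 = np.mean((clean - mean) ** 2)
    m3 = np.mean((clean - mean) ** 3)
    return {
        "count": int(clean.size),
        "nulls": int(arr.size - clean.size),
        "mean": float(mean),
        "median": float(median),
        "std": float(clean.std(ddof=1)),   # sample standard deviation (assumption)
        "sum": float(clean.sum()),
        "q1": float(q1),
        "q3": float(q3),
        "iqr": float(q3 - q1),
        "skew": float(m3 / m2 ** 1.5) if m2 > 0 else 0.0,
    }

# One large value pulls the mean far above the median and makes the skew positive.
stats = summarize([1, 2, 3, 4, 100, float("nan")])
print(stats)
```

A mean well above the median, as in this toy column, is the typical signature of the right-tailed data that the transformations in the next section target.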
Transformation & Scaling
When you run with --transform auto, Indexly examines each numeric column's skewness and selects the most appropriate transformation:
| Skew Range | Transformation Applied |
|---|---|
| > 3 | Log transform |
| 1 to 3 | Square root transform |
| < -1 | Softplus transform |
| otherwise | No transform |
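The selection rule in the table can be sketched in a few lines of Python. This is a minimal illustration, not Indexly's code: the helper names (`skewness`, `choose_transform`, `apply_transform`) are hypothetical, `log1p` is assumed for the log transform so that zeros stay finite, and the log and square root branches assume non-negative values.

```python
import numpy as np

def skewness(x):
    """Moment-based skewness of a 1-D array."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    return float(np.mean(d ** 3) / m2 ** 1.5) if m2 > 0 else 0.0

def choose_transform(skew):
    """Map a skew value to a transform name using the thresholds in the table."""
    if skew > 3:
        return "log"
    if skew >= 1:
        return "sqrt"
    if skew < -1:
        return "softplus"
    return "none"

def apply_transform(x, name):
    x = np.asarray(x, dtype=float)
    if name == "log":
        return np.log1p(x)            # log(1 + x); assumes x >= 0
    if name == "sqrt":
        return np.sqrt(x)             # assumes x >= 0
    if name == "softplus":
        return np.logaddexp(0.0, x)   # log(1 + exp(x)), numerically stable
    return x

# Synthetic long-tailed column: values grow geometrically from 1 to e**20.
values = np.exp(np.linspace(0, 20, 200))
before = skewness(values)
name = choose_transform(before)
after = skewness(apply_transform(values, name))
print(name, round(before, 2), round(after, 2), round(after - before, 2))
```

The last printed value is the difference After minus Before, which is how the ΔSkew column in the output example is defined.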
For manual control, use:
--transform log
--transform sqrt
--transform softplus
--transform exp-log
Adaptive Scaling
Histograms automatically switch to log scaling if the ratio between the highest and lowest bin counts exceeds 1,000, which keeps extremely uneven distributions readable.
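One way to picture this rule is the sketch below. It is an assumption-laden illustration: the function names are hypothetical, empty bins are ignored when computing the ratio (otherwise the ratio would be undefined), and `log1p` is assumed as the log scale.

```python
import numpy as np

def needs_log_scale(counts, threshold=1000):
    """True when the largest non-empty bin is more than `threshold` times the smallest."""
    counts = np.asarray(counts, dtype=float)
    nonzero = counts[counts > 0]      # assumption: empty bins do not count
    if nonzero.size == 0:
        return False
    return bool(nonzero.max() / nonzero.min() > threshold)

def bar_lengths(counts, width=40, threshold=1000):
    """Bar length per bin, switching to a log scale for very uneven counts."""
    counts = np.asarray(counts, dtype=float)
    if counts.max() <= 0:
        return np.zeros(counts.shape, dtype=int)
    scaled = np.log1p(counts) if needs_log_scale(counts, threshold) else counts
    # ceil keeps every non-empty bin at least one character wide
    return np.ceil(scaled / scaled.max() * width).astype(int)

lengths = bar_lengths([5000, 3, 0])
print(lengths)
```

With linear scaling the bin holding 3 values would shrink to a single character next to the bin holding 5000; on the log scale it stays clearly visible.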
Visual Exploration
You can visualize your data directly in the terminal:
1. Histogram Mode
indexly analyze-csv sales_data.csv --chart-type hist
Produces an ASCII histogram like this:
[revenue (Δskew=-3.71)]
Min: 0.00 Q1: 2.10 Median: 6.02 Q3: 8.20 Max: 10.80
[0.00, 1.08] ██████████████████████████████████████████████ 62.3% (589)
[1.08, 2.16] ████ 9.4% (89)
[2.16, 3.24] ██ 4.8% (45)
...
Bars scale dynamically based on bin counts. Extremely small bins (<0.1%) display as <0.1%, ensuring even sparse data remains visible.
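The rendering described above can be approximated as follows. This is a hypothetical sketch rather than Indexly's renderer: the function name `ascii_hist`, the default width, and the exact line layout are assumptions chosen to resemble the sample output.

```python
import numpy as np

def ascii_hist(values, bins=10, width=40):
    """Return one text line per bin: range, bar, percentage, and count."""
    values = np.asarray(values, dtype=float)
    counts, edges = np.histogram(values, bins=bins)
    total = counts.sum()
    lines = []
    for i, count in enumerate(counts):
        pct = 100.0 * count / total if total else 0.0
        # Non-empty bins always get at least one block so they stay visible
        bar_len = int(np.ceil(width * count / counts.max())) if count else 0
        label = "<0.1%" if 0 < pct < 0.1 else f"{pct:.1f}%"
        bar = "█" * bar_len
        lines.append(f"[{edges[i]:.2f}, {edges[i + 1]:.2f}] {bar} {label} ({count})")
    return lines

# 2000 zeros and a single outlier: the sparse bin is labeled "<0.1%".
lines = ascii_hist([0.0] * 2000 + [10.0], bins=2)
for line in lines:
    print(line)
```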
2. Boxplot Mode
indexly analyze-csv sales_data.csv --chart-type box
Shows an ASCII boxplot with quartiles and median indicators:
[revenue] (transform=log)
0.00 ────────────────────────────────────────────────╡ 10.80
Q1 Med Q3
Range=10.80, IQR=3.25, Median=6.02
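To show how quartiles map onto a fixed-width line, here is a minimal boxplot sketch. The glyphs (`-` for whiskers, `=` for the box, `|` for the median), the function name `ascii_boxplot`, and the summary format are assumptions for illustration and are not taken from Indexly's output.

```python
import numpy as np

def ascii_boxplot(values, width=50):
    """Return (chart_line, summary_line) for a one-row ASCII boxplot."""
    values = np.asarray(values, dtype=float)
    lo, q1, med, q3, hi = np.percentile(values, [0, 25, 50, 75, 100])
    span = hi - lo

    def pos(v):
        # Map a data value to a character column between 0 and width - 1
        return int(round((v - lo) / span * (width - 1))) if span > 0 else 0

    line = ["-"] * width                  # whiskers
    for i in range(pos(q1), pos(q3) + 1):
        line[i] = "="                     # interquartile box
    line[pos(med)] = "|"                  # median marker
    chart = f"{lo:.2f} {''.join(line)} {hi:.2f}"
    summary = f"Range={span:.2f}, IQR={q3 - q1:.2f}, Median={med:.2f}"
    return chart, summary

chart, summary = ascii_boxplot(range(101), width=101)
print(chart)
print(summary)
```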
Export & Integration
Export as Markdown
indexly analyze-csv sales_data.csv --export md
Saves a Markdown table of all summary statistics for documentation or reports.
Generate Interactive Charts
indexly analyze-csv sales_data.csv --mode interactive
Uses Plotly to produce dynamic visualizations viewable in the browser.
Behind the Scenes
- Binning Strategy:
  - For normal data or mild skew, uses equal-width bins.
  - For extreme skew (|skew| > 5), switches to quantile-based binning for better visibility.
- Adaptive Decimal Precision: Decimal places adjust automatically based on bin width using:
  decimals = max(2, int(-np.floor(np.log10(bin_width))))
- ΔSkew Calculation: Displayed as (After - Before) to show the direction of improvement. Example: Δskew=-3.71 means skew reduced by 3.71 after transformation.
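The three points above can be sketched together in Python. The decimal-precision expression is the one quoted in the list; everything else (the function names, the `extreme` parameter, and dropping duplicate quantile edges) is an assumption made for this example.

```python
import numpy as np

def bin_edges(values, bins=10, skew=0.0, extreme=5.0):
    """Equal-width edges by default, quantile-based edges when |skew| > extreme."""
    values = np.asarray(values, dtype=float)
    if abs(skew) > extreme:
        # Quantile-based: each bin holds roughly the same number of values
        edges = np.quantile(values, np.linspace(0.0, 1.0, bins + 1))
        return np.unique(edges)       # drop duplicate edges caused by ties
    # Equal-width: every bin spans the same numeric range
    return np.linspace(values.min(), values.max(), bins + 1)

def label_decimals(bin_width):
    """Decimal places for bin labels; bin_width must be positive."""
    return max(2, int(-np.floor(np.log10(bin_width))))

def delta_skew(before, after):
    """Negative when the transform reduced a positive skew."""
    return after - before

values = np.exp(np.linspace(0, 20, 200))          # synthetic long-tailed data
equal_counts, _ = np.histogram(values, bins=bin_edges(values, bins=4))
quant_counts, _ = np.histogram(values, bins=bin_edges(values, bins=4, skew=6.0))
print(equal_counts, quant_counts)
print(label_decimals(1.08), label_decimals(0.0004))
print(round(delta_skew(4.12, 0.41), 2))
```

On long-tailed data, equal-width bins crowd almost every value into the first bin, while quantile edges spread the values evenly across bins, which is why the switch improves visibility.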
Pro Tips
- Use --transform auto for mixed datasets; Indexly will normalize each column automatically.
- Use --scale sqrt for moderate skew instead of full log scaling.
- For quick terminal analysis, combine with:
  indexly analyze-csv data.csv --show-chart ascii --chart-type hist --bins 15
- Export results for documentation:
  indexly analyze-csv data.csv --export md > analysis.md
Next Steps
Continue exploring Indexly's analytical capabilities:
- Configuration & Optimization
- Tagging & Metadata Management
- Real-Time Watchdog Indexing
- Search & Filter with FTS5
Indexly makes your data talk: visually, statistically, and intelligently.