Legacy .log Logging System – Full Documentation

Complete documentation of Indexly’s legacy .log-based logging system. Learn how classic log files are parsed, cleaned, normalized, exported, and migrated to the modern NDJSON logging standard.

This part of the documentation explains how the old .log-based logging system works. Although Indexly now uses ndjson logging by default, some users may still rely on the classic workflow for exporting, cleaning, or analysing older logs. To learn how the current logging system works, see:

➡️ Logging with ndjson


1. What Are Legacy .log Files?

Before the ndjson upgrade, Indexly saved indexing activity in daily files such as:

2024-11-03_index.log
2024-11-04_index.log

These logs contain raw indexing paths and timestamps. Unlike ndjson logs, they must be cleaned and processed before they can be analysed or converted.


2. How Indexly Parses .log Files

Indexly uses the following logic:

✓ Identifying log files

Files must match the pattern:

YYYY-MM-DD_index.log
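
The filename check can be sketched in Python as follows. This is a minimal illustration, not Indexly's actual code: the regular expression and the helper name log_date are assumptions derived only from the YYYY-MM-DD_index.log pattern above.

```python
import re
from datetime import datetime

# Assumed pattern, derived from the YYYY-MM-DD_index.log naming convention.
LOG_NAME_RE = re.compile(r"(\d{4}-\d{2}-\d{2})_index\.log")


def log_date(filename):
    """Return the date encoded in a legacy log filename, or None if it does not match.

    Hypothetical helper for illustration only.
    """
    match = LOG_NAME_RE.fullmatch(filename)
    if match is None:
        return None
    try:
        # Reject strings that look like dates but are not valid (e.g. month 13).
        return datetime.strptime(match.group(1), "%Y-%m-%d").date()
    except ValueError:
        return None
```

Validating the date, rather than only matching digits, keeps files such as 2024-13-40_index.log out of the batch.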

✓ Extracting metadata

Each line is scanned for:

  • timestamp
  • file path
  • filename and extension
  • optional metadata from directory structure: /year/month/customer/filename

Example:

2024-11-04T10:32:22Z /projects/2024/05/acme/report.docx

Extracted result:

{
  "path": "projects/2024/05/acme/report.docx",
  "filename": "report.docx",
  "extension": "docx",
  "customer": "acme",
  "year": "2024",
  "month": "05"
}
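
One way such a line could be parsed is sketched below. The function name parse_line is hypothetical, and the sketch assumes the timestamp and path are separated by whitespace and that the year/month/customer directories sit directly above the filename. Unlike the extracted result above, it also keeps the timestamp in the returned dictionary, which is an addition made for illustration.

```python
from pathlib import PurePosixPath


def parse_line(line):
    """Split one legacy log line into a metadata dictionary (illustrative sketch)."""
    # Assumption: the first whitespace-separated token is the timestamp,
    # and everything after it is the path (which may itself contain spaces).
    timestamp, raw_path = line.strip().split(maxsplit=1)
    path = PurePosixPath(raw_path.lstrip("/"))

    entry = {
        "timestamp": timestamp,  # not part of the documented result; kept for illustration
        "path": str(path),
        "filename": path.name,
        "extension": path.suffix.lstrip("."),
    }

    # Optional metadata from a .../year/month/customer/filename layout.
    parts = path.parts
    if len(parts) >= 4:
        year, month, customer = parts[-4], parts[-3], parts[-2]
        if len(year) == 4 and year.isdigit() and len(month) == 2 and month.isdigit():
            entry.update(customer=customer, year=year, month=month)
    return entry
```

Paths that do not follow the year/month/customer layout simply yield an entry without those three keys.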

3. Cleaning and Normalization

Before exporting, Indexly automatically:

  • normalizes slashes
  • fixes duplicate separators
  • cleans filenames (spaces → dashes)
  • removes duplicates across logs
  • extracts year/month/customer if present
  • computes a SHA-1 hash of each log file for integrity tracking
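
The cleaning steps above might look roughly like this in Python. All three function names are hypothetical, and two details are assumptions not stated in the list: spaces are replaced only in the filename (not in directory names), and the first occurrence of a duplicated path is the one that is kept.

```python
import hashlib
import re


def normalize_path(raw):
    """Normalize slashes, collapse duplicate separators, and clean the filename."""
    path = raw.replace("\\", "/")        # normalize slashes
    path = re.sub(r"/{2,}", "/", path)   # fix duplicate separators
    head, sep, name = path.rpartition("/")
    name = name.replace(" ", "-")        # spaces -> dashes (filename only, by assumption)
    return f"{head}{sep}{name}"


def dedupe(paths):
    """Remove duplicate paths while preserving first-seen order."""
    seen = set()
    unique = []
    for p in paths:
        if p not in seen:
            seen.add(p)
            unique.append(p)
    return unique


def sha1_of_file(path):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading the file in chunks keeps memory use flat even for large log files.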

4. Exporting Legacy Logs

Legacy logs can be exported to JSON, NDJSON, or CSV.

Single Log Example

indexly log-clean ./2024-11-03_index.log --export json

Batch / Directory Example

indexly log-clean ./logs/ --export ndjson --combine-log

Export functions:

  • _export_json()
  • _export_ndjson()
  • _export_csv()
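
The signatures of the export functions are not shown here, so the following is only a sketch of what the three output formats could look like. The names to_json, to_ndjson, and to_csv are hypothetical stand-ins, and each one returns a string instead of writing a file.

```python
import csv
import io
import json


def to_json(entries):
    """Serialize all entries as one JSON array."""
    return json.dumps(entries, indent=2)


def to_ndjson(entries):
    """Serialize entries as NDJSON: one JSON object per line."""
    return "".join(json.dumps(entry) + "\n" for entry in entries)


def to_csv(entries):
    """Serialize entries as CSV, using the union of all keys as columns."""
    fields = sorted({key for entry in entries for key in entry})
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)  # keys missing from an entry become empty cells
    return buf.getvalue()
```

NDJSON is the only one of the three that can be written and read one record at a time, which is why it suits large combined exports.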

5. Combined vs Individual Export

Individual Mode

Each .log file becomes its own cleaned output:

2024-11-03_cleaned.json
2024-11-04_cleaned.json

Combined Mode

All logs are merged into a single output:

index-cleaned-all.ndjson
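
The naming scheme for the two modes can be expressed as a small helper. This is a sketch with a hypothetical function name. It also assumes the combined file keeps the index-cleaned-all stem for every export format, which the example above only confirms for ndjson.

```python
from pathlib import Path


def output_name(log_file, fmt, combine=False):
    """Return the output filename for a cleaned log (illustrative sketch)."""
    if combine:
        # Combined mode: one merged file for all logs.
        return f"index-cleaned-all.{fmt}"
    # Individual mode: 2024-11-03_index.log -> 2024-11-03_cleaned.<fmt>
    name = Path(log_file).name
    suffix = "_index.log"
    date_part = name[: -len(suffix)] if name.endswith(suffix) else Path(name).stem
    return f"{date_part}_cleaned.{fmt}"
```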

6. Summary Output

A human-readable summary is generated. It includes:

  • log dates
  • number of entries
  • earliest/latest timestamps
  • per-customer file count
  • duplicate path detection
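
The figures in such a summary could be computed along these lines. The function name summarize and the dictionary keys are hypothetical. The sketch assumes each entry carries a "timestamp" in a uniform ISO 8601 UTC form (so that string ordering matches time ordering) and derives the log dates from those timestamps.

```python
from collections import Counter


def summarize(entries):
    """Compute summary figures for a list of cleaned entries (illustrative sketch)."""
    # Assumption: uniform ISO 8601 timestamps sort correctly as plain strings.
    timestamps = sorted(entry["timestamp"] for entry in entries)
    path_counts = Counter(entry["path"] for entry in entries)
    customers = Counter(entry["customer"] for entry in entries if "customer" in entry)
    return {
        "log_dates": sorted({ts[:10] for ts in timestamps}),
        "entries": len(entries),
        "earliest": timestamps[0] if timestamps else None,
        "latest": timestamps[-1] if timestamps else None,
        "per_customer": dict(customers),
        "duplicate_paths": sorted(p for p, n in path_counts.items() if n > 1),
    }
```

Running this on entries before deduplication shows which paths were logged more than once.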

7. Migration Note

Although .log files remain fully supported, the new ndjson logging system is recommended because:

  • metadata extraction is automatic
  • no cleaning step is required
  • analysis is faster (stream-friendly format)
  • works directly with analyze-json and analyze-file

To continue, see: ➡️ Logging with ndjson (New Standard)