File Tracking¶
LDA's file tracking system records provenance for every file in your project. Each modification is logged with a timestamp, a content hash, and the analyst who made it, so any change can be traced back to its origin.
Overview¶
The tracking system maintains a complete audit trail of all file operations:
- File creation - When files are first added to the project
- Modifications - All changes with before/after hashes
- Deletions - When files are removed from tracking
- Analyst attribution - Who made each change
Using the Track Command¶
The lda track command manages file tracking in your project:
# Track all files in current section
lda track
# Track specific files
lda track data/input.csv outputs/results.png
# Track with custom message
lda track --message "Updated preprocessing pipeline"
# Force tracking even if no changes
lda track --force
Manifest Structure¶
Each section maintains a manifest.json file that records:
{
  "version": "1.0",
  "section": "sec01_preprocessing",
  "created": "2024-01-15T10:30:00Z",
  "analyst": "john.doe",
  "files": {
    "inputs/raw_data.csv": {
      "hash": "sha256:abcd1234...",
      "modified": "2024-01-15T10:30:00Z",
      "size": 12345,
      "analyst": "john.doe"
    }
  },
  "history": [
    {
      "timestamp": "2024-01-15T10:30:00Z",
      "action": "created",
      "files": ["inputs/raw_data.csv"],
      "analyst": "john.doe",
      "message": "Initial data import"
    }
  ]
}
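To make the per-file fields concrete, here is a minimal Python sketch that builds a dictionary shaped like one value under "files". It is not LDA's implementation. The helper names `sha256_of` and `build_entry` are hypothetical, and it assumes that "hash" covers the raw file bytes and that "modified" is the file's modification time in UTC.

```python
import hashlib
import os
from datetime import datetime, timezone


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_entry(path, analyst):
    """Build a dict shaped like one value under "files" in manifest.json.

    Assumption: "modified" is the file's mtime in UTC and "hash" covers
    the raw file bytes. LDA itself may compute these differently.
    """
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "hash": "sha256:" + sha256_of(path),
        "modified": modified.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "size": info.st_size,
        "analyst": analyst,
    }
```

Reading the file in chunks keeps memory use flat for large inputs, which matters when the tracked files are datasets, not source code.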
Viewing Changes¶
Use the changes command to see file modifications:
# Show all changes in current section
lda changes
# Show changes since specific date
lda changes --since "2024-01-01"
# Show changes by specific analyst
lda changes --analyst john.doe
# Show detailed diff
lda changes --diff
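The same kind of filtering can be applied by hand to the "history" array of a manifest. The sketch below is an illustration, not a description of how `lda changes` works internally. `parse_timestamp` and `filter_history` are hypothetical helpers, and timestamps are assumed to follow the `YYYY-MM-DDTHH:MM:SSZ` form shown in the manifest example.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse a manifest timestamp such as 2024-01-15T10:30:00Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def filter_history(history, since=None, analyst=None):
    """Return entries on or after `since` (YYYY-MM-DD) made by `analyst`.

    Either filter is skipped when its argument is None.
    """
    cutoff = None
    if since is not None:
        cutoff = datetime.strptime(since, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    selected = []
    for entry in history:
        if cutoff is not None and parse_timestamp(entry["timestamp"]) < cutoff:
            continue
        if analyst is not None and entry["analyst"] != analyst:
            continue
        selected.append(entry)
    return selected


# Example entry copied from the manifest structure above.
history = [
    {
        "timestamp": "2024-01-15T10:30:00Z",
        "action": "created",
        "files": ["inputs/raw_data.csv"],
        "analyst": "john.doe",
        "message": "Initial data import",
    }
]
recent = filter_history(history, since="2024-01-01", analyst="john.doe")
```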
History and Provenance¶
View complete file history with the history command:
# Show full history
lda history
# History for specific file
lda history outputs/figure1.png
# Export history to file
lda history --output history.json --format json
Best Practices¶
- Track Early and Often
  - Run lda track after every significant change
  - Use meaningful messages to describe changes
- Review Before Committing
  - Use lda changes to review modifications
  - Ensure all expected files are tracked
- Maintain Clean History
  - Use clear, descriptive messages
  - Track related changes together
- Regular Validation
  - Run lda validate to check manifest integrity
  - Fix any issues before they accumulate
Integration with Git¶
LDA tracking complements Git version control:
- Git tracks code changes
- LDA tracks data and output provenance
- Together they provide complete project history
# Typical workflow
lda track --message "Updated analysis parameters"
git add .
git commit -m "Updated analysis parameters"
Troubleshooting¶
Missing Files¶
If files are missing from tracking:
# Check current status
lda status
# Force re-scan
lda track --scan
# Validate manifest
lda validate --fix
Hash Mismatches¶
A hash mismatch occurs when a file's contents change without the change being tracked.
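One way to see which files are affected is to recompute each hash and compare it with the recorded value. The Python sketch below does this outside of LDA. It assumes that manifest.json sits in the section directory, that file paths in it are relative to that directory, and that hashes use the `algorithm:hexdigest` form from the manifest example. `file_digest` and `find_mismatches` are hypothetical names.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path, algorithm="sha256"):
    """Return the hex digest of a file using the named hashlib algorithm."""
    digest = hashlib.new(algorithm)
    digest.update(Path(path).read_bytes())
    return digest.hexdigest()


def find_mismatches(section_dir):
    """Return {relative_path: reason} for files that differ from manifest.json.

    Assumption: paths under "files" are relative to section_dir and each
    "hash" value looks like "sha256:<hexdigest>".
    """
    section = Path(section_dir)
    manifest = json.loads((section / "manifest.json").read_text())
    problems = {}
    for rel_path, entry in manifest.get("files", {}).items():
        target = section / rel_path
        if not target.is_file():
            problems[rel_path] = "missing"
            continue
        algorithm, _, recorded = entry["hash"].partition(":")
        if file_digest(target, algorithm) != recorded:
            problems[rel_path] = "hash mismatch"
    return problems
```

An empty result means every recorded file is present and unchanged. Any other result lists the files to re-track or restore.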
Permission Issues¶
Ensure proper file permissions.
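As a rough check, the following Python sketch lists files the current user cannot read and reports a manifest that cannot be written. It rests on an assumption not stated in this guide: that tracking needs read access to every file in the section and write access to manifest.json. `permission_problems` is a hypothetical helper.

```python
import os


def permission_problems(section_dir):
    """List (path, problem) pairs for permission issues under section_dir.

    Assumption: tracking must read every file and write manifest.json.
    """
    problems = []
    for folder, _dirs, files in os.walk(section_dir):
        for name in files:
            path = os.path.join(folder, name)
            if not os.access(path, os.R_OK):
                rel_path = os.path.relpath(path, section_dir)
                problems.append((rel_path, "not readable"))
    manifest = os.path.join(section_dir, "manifest.json")
    if os.path.exists(manifest) and not os.access(manifest, os.W_OK):
        problems.append(("manifest.json", "not writable"))
    return problems
```

Note that `os.access` answers for the user running the script, so run it as the same user that runs lda.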
Advanced Features¶
Custom Hash Algorithms¶
Configure the hash algorithm in lda_config.yaml.
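The configuration keys are not reproduced here, so the Python sketch below only illustrates what a choice of algorithm changes in practice: the digest and the algorithm prefix that the manifest example uses (`sha256:...`). The `SUPPORTED` tuple and `tagged_hash` are hypothetical, and the algorithms LDA actually accepts may differ.

```python
import hashlib

# Hypothetical allow-list; LDA's accepted algorithms may differ.
SUPPORTED = ("sha1", "sha256", "sha512", "blake2b")


def tagged_hash(data, algorithm="sha256"):
    """Return "<algorithm>:<hexdigest>" for the given bytes."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    return f"{algorithm}:{hashlib.new(algorithm, data).hexdigest()}"
```

Storing the algorithm name alongside the digest means a project can switch algorithms later without making older entries ambiguous.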
Automated Tracking¶
Set up automated tracking with file watchers.
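If no dedicated watcher is available, a simple polling loop can serve as one. The Python sketch below uses only the standard library and is an illustration, not a built-in LDA feature. `snapshot`, `changed_paths`, and `watch` are hypothetical helpers, and `watch` assumes that `lda track --message` is run from the section directory.

```python
import os
import subprocess
import time


def snapshot(root):
    """Map each file under root to its (mtime_ns, size) pair."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # File vanished between listing and stat.
            state[os.path.relpath(path, root)] = (info.st_mtime_ns, info.st_size)
    return state


def changed_paths(before, after):
    """Return sorted paths that were added, removed, or modified."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )


def watch(root, interval=5.0):
    """Poll root forever and run `lda track` whenever something changes."""
    previous = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        changed = changed_paths(previous, current)
        if changed:
            message = "Auto-track: " + ", ".join(changed)
            # Assumption: lda is on PATH and root is the section directory.
            subprocess.run(["lda", "track", "--message", message], cwd=root, check=False)
            # Re-snapshot so manifest updates written by tracking
            # do not trigger another run on the next poll.
            current = snapshot(root)
        previous = current
```

Polling trades some latency for portability. An event-based watcher reacts faster but needs a platform-specific backend or a third-party package.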
Remote Synchronization¶
Sync tracking data with remote servers:
tracking:
  remote:
    enabled: true
    endpoint: "https://tracking.example.com"
    api_key: "${TRACKING_API_KEY}"
See Also¶
- Configuration - Tracking configuration options
- Workflows - Integrating tracking into workflows
- CLI Reference - Complete track command reference