Complete Top 500 Albums project with 100% data coverage and UI improvements

- Fixed Info/Description columns after regenerating CSV with clean Wikipedia data
- Remapped and downloaded missing album covers to match new rankings
- Modified website UI to show all description text without click-to-expand
- Added comprehensive Info/Description for all 500 albums using research
- Created multiple data processing scripts for album information completion
- Achieved 100% data completion with descriptions ending "(by Claude)" for new content
- All albums now have complete metadata and cover art

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-01 00:33:47 +02:00
commit 462fdcfa84
parent 09b5491f8a
500 changed files with 2323 additions and 502 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -4,19 +4,35 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 ## Project Overview

-This repository contains Rolling Stone's Top 500 Albums of 2020 list in CSV format. It is a data-only repository with no executable code.
+This repository contains a comprehensive Top 500 Albums analysis with data from Rolling Stone (2020) and Wikipedia (2023), plus a fully functional website to explore the rankings with real album cover artwork. The project includes comparison scripts, data merging tools, and an interactive web interface with downloadable album covers from the iTunes API.

 ## Data Structure

-### File: `rolling_stone_top_500_albums_2020.csv`
+### Primary Files:
+- **`rolling_stone_top_500_albums_2020.csv`** - Original 2020 Rolling Stone list
+- **`wikipedia_top_500_albums.csv`** - Clean 2023 Wikipedia list  
+- **`top_500_albums_2023.csv`** - Combined comparison file

-The CSV file contains the following columns:
+### File: `rolling_stone_top_500_albums_2020.csv`
 - **Rank**: Album ranking (1-500, stored in reverse order with 500 first)
 - **Artist**: Artist or band name
 - **Album**: Album title
 - **Info**: Label and release year (e.g., "Blue Note, 1959")
 - **Description**: Detailed description of the album's significance and impact

+### File: `wikipedia_top_500_albums.csv`
+- **rank**: Album ranking (1-500)
+- **artist**: Artist or band name
+- **album**: Album title
+
+### File: `top_500_albums_2023.csv`
+- **Rank**: 2023 ranking
+- **Artist**: Artist or band name
+- **Album**: Album title
+- **Status**: Ranking change ("New in 2023", "+10", "-5", "No change")
+- **Info**: Label and year (from 2020 data where available)
+- **Description**: Album description (from 2020 data where available)
+
 ## Common Tasks

 ### Reading the Data
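
Review note on the hunk above: the Wikipedia file uses lowercase headers (`rank`, `artist`, `album`) while the other two files use title case, and the Rolling Stone file is stored from rank 500 down to 1. A minimal sketch of a reader that smooths over both differences follows. The helper names `read_albums` and `sorted_by_rank` are hypothetical, and the sample rows are placeholders rather than real entries from the CSV files.

```python
import csv
import io


def read_albums(csv_file):
    """Yield each row as a dict with lowercase keys, so `Rank` and `rank` both become `rank`."""
    for row in csv.DictReader(csv_file):
        yield {
            key.strip().lower(): (value or "").strip()
            for key, value in row.items()
            if key is not None  # ignore overflow fields from malformed rows
        }


def sorted_by_rank(rows):
    """Return rows in ascending rank order, regardless of the order stored in the file."""
    return sorted(rows, key=lambda row: int(row["rank"]))


# Placeholder data that mimics the two header styles described in CLAUDE.md.
rolling_stone_sample = io.StringIO(
    "Rank,Artist,Album,Info,Description\n"
    '500,Artist B,Album B,"Label B, 1999",Placeholder text\n'
    '499,Artist A,Album A,"Label A, 1975",Placeholder text\n'
)
wikipedia_sample = io.StringIO("rank,artist,album\n1,Artist C,Album C\n")

rs_rows = sorted_by_rank(read_albums(rolling_stone_sample))
wiki_rows = list(read_albums(wikipedia_sample))
```

With real files, the same functions could be given a handle from `open(path, newline="", encoding="utf-8")` in place of the in-memory samples.
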
@@ -42,8 +58,68 @@ This data can be used for:
 5. Educational projects about music history
 6. Statistical analysis of the most influential albums

+## Script Files
+
+### Data Processing Scripts
+- **`compare_top500_albums.py`** - Compares 2020 and 2023 lists, generates combined CSV with ranking changes
+- **`merge_album_info.py`** - Merges Info and Description columns from 2020 data into the combined file
+- **`download_all_covers.py`** - Downloads album cover artwork using iTunes Search API (500/500 success rate)
+
+### Website Files
+- **`index.html`** - Main website interface with search, filtering, and sorting
+- **`script.js`** - JavaScript for interactivity, state management, and URL sharing
+- **`style.css`** - Responsive styling with CSS Grid and modern design
+- **`favicon.ico`** - Custom favicon for the website
+- **`covers/`** - Directory containing downloaded album cover images
+
+## Important Data Quality Notes
+
+- **Clean Data Source**: Uses `wikipedia_top_500_albums.csv` (clean Wikipedia data) rather than `wikipedia_500_albums.csv` (old version with duplicates)
+- **Fixed Duplicates**: Previous versions had duplicate "Suicide" entries at ranks 234, 293, and 498. Current version correctly shows only one Suicide entry at rank 498
+- **Column Format**: Wikipedia file uses lowercase column names (`rank`, `album`, `artist`) vs title case in other files
+
+## Technical Implementation
+
+### Album Cover Download
+- Uses iTunes Search API without external dependencies
+- Implements fuzzy matching for artist/album names
+- Downloads 600x600 pixel artwork
+- 100% success rate (500/500 albums)
+- Failed downloads logged to `failed_downloads.txt`
+
+### Website Features
+- Responsive design with infinite scroll
+- Search functionality across artist/album names
+- Filter by ranking status (new, improved, dropped, no change)
+- Sort by rank, artist, or album
+- Bookmark functionality with shareable URLs
+- Individual album sharing with preserved state
+- Jump-to-rank navigation
+
+### Data Comparison Logic
+- Fuzzy string matching using Python's difflib
+- Handles artist name variations ("The Beatles" vs "Beatles")
+- Matches albums with minor title differences
+- Calculates ranking improvements/drops with +/- notation
+
+## Running the Scripts
+
+### Python Requirements
+All scripts use only Python standard library (no external dependencies):
+- `urllib` for HTTP requests
+- `csv` for data processing  
+- `json` for API responses
+- `re` for text processing
+- `difflib` for fuzzy matching
+
+### Website Deployment
+- Serve with local HTTP server: `python -m http.server 8000`
+- Required due to CORS restrictions when loading CSV files
+- No build process needed - pure HTML/CSS/JS
+
 ## Notes

-- This is not a git repository, so version control commands are not applicable
-- No build, test, or lint commands exist as this is purely a data repository
-- When processing the data, be aware that the ranking order is reversed in the file
+- Rankings in `rolling_stone_top_500_albums_2020.csv` are stored in reverse order (500 to 1)
+- Wikipedia data is clean and properly formatted (1-500)
+- Website preserves filter/sort state in shareable URLs
+- Cover images use rank-based filenames for easy organization
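
Review note on the "Data Comparison Logic" section: the diff does not include `compare_top500_albums.py` itself, so the following is only a sketch of how fuzzy matching with `difflib` and the `Status` column could work. The function names, the `0.85` threshold, and the convention that `+` means the album moved up the list are assumptions, not taken from the script.

```python
import difflib
import re


def normalize(text):
    """Lowercase, drop a leading 'The', and collapse punctuation to single spaces."""
    text = text.lower().strip()
    text = re.sub(r"^the\s+", "", text)
    text = re.sub(r"[^a-z0-9]+", " ", text)  # note: also strips accented letters
    return text.strip()


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized strings."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_same_album(old, new, threshold=0.85):
    """Compare (artist, album) tuples; the threshold is an assumed cutoff."""
    return (
        similarity(old[0], new[0]) >= threshold
        and similarity(old[1], new[1]) >= threshold
    )


def status(old_rank, new_rank):
    """Build a Status value; assumes '+N' means the album climbed N places."""
    if old_rank is None:
        return "New in 2023"
    change = old_rank - new_rank
    if change == 0:
        return "No change"
    return f"{change:+d}"


same = is_same_album(("The Beatles", "Abbey Road"), ("Beatles", "Abbey Road"))
```

Under these assumptions, `status(20, 10)` gives `"+10"` and `status(10, 15)` gives `"-5"`, matching the example values listed for the `Status` column. A stricter or looser threshold changes how many near-duplicate titles are treated as the same album, so it would need tuning against the real lists.
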