Flickr Metadata Export Script
This page documents a local Python script used to export Flickr metadata and image URLs to CSV for later reuse in Archives workflows and other descriptive systems.
Purpose
Use this script when you need a spreadsheet of Flickr item metadata after upload, especially when you want stable Flickr URLs for linking in finding aids, digital object records, or local project tracking.
Requirements
- Python 3
- The `requests` package installed in the Python environment used to run the script
- A Flickr API key
- The Flickr account user ID for the account being exported
- A local configuration file named `flickr_accounts.ini` stored in the same folder as the script
Recommended Setup
If you are new to running Python scripts, this is a practical baseline setup:
- install Python from python.org for Windows
- install Visual Studio Code
- install the Python extension in VS Code
- review the VS Code Python Quick Start
- review the official Flickr API documentation
After installation, confirm Python is available in PowerShell:
python --version
If that does not work on Windows, try:
py --version
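If the `requests` package is not already installed, it can usually be added from the same PowerShell session with pip (swap `py` for `python` depending on which launcher responded above):

py -m pip install requests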
VS Code is recommended here because it provides a clean text editor, an integrated terminal, and a straightforward way for future staff to inspect or modify the script safely.
Configuration File
Expected format:
[default]
api_key = YOUR_FLICKR_API_KEY_HERE
user_id = YOUR_FLICKR_USER_ID_NSID_HERE
Example with obvious placeholders:
[default]
api_key = abc123replace_me_with_your_real_key
user_id = 12345678@N00
Notes:
- `api_key` is the Flickr API application key
- `user_id` is the Flickr NSID for the account owner
- the account alias or screen name is not a substitute for `user_id`
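As a quick sanity check before the first export, a minimal sketch like this (run from the same folder as the INI file) confirms that the file parses and that both values are present:

import configparser
from pathlib import Path

# Read flickr_accounts.ini from the current folder and confirm both values exist.
config = configparser.ConfigParser()
config.read(Path("flickr_accounts.ini"), encoding="utf-8")

api_key = config["default"].get("api_key", "").strip()
user_id = config["default"].get("user_id", "").strip()
print("api_key present:", bool(api_key))
print("user_id present:", bool(user_id))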
How the Script Works
The script prompts for one export mode:
- `album`: calls `flickr.photosets.getPhotos` for one Flickr album
- `date`: calls `flickr.people.getPhotos` for a Flickr upload-date range
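Both modes go through the same Flickr REST endpoint with `format=json` and `nojsoncallback=1`. As a rough sketch of what one page of the album call looks like outside the script (the API key, user ID, and album ID below are placeholders):

import requests

# Placeholder values for illustration only.
API_URL = "https://api.flickr.com/services/rest/"
params = {
    "method": "flickr.photosets.getPhotos",
    "api_key": "YOUR_FLICKR_API_KEY_HERE",
    "user_id": "12345678@N00",
    "photoset_id": "72177720300000000",
    "format": "json",
    "nojsoncallback": 1,
}
response = requests.get(API_URL, params=params, timeout=60)
data = response.json()
print(data.get("stat"), len(data.get("photoset", {}).get("photo", [])))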
For each photo, it collects or derives values such as:
- Flickr photo ID
- title
- description
- upload date
- taken date
- owner name and path alias
- tags
- view count
- original format and dimensions
- direct image URLs
- Flickr photo page URL
If an original-size URL is not already present in the first API response, the script makes an additional `flickr.photos.getSizes` request for that photo.
Identifier Extraction
The script attempts to populate `image_supplier_image_id` from either:
- recognized machine tags
- an `ID:` pattern inside the Flickr description text
This is useful when the local workflow embeds a legacy identifier or filename stem in the description or machine tags before upload.
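For example, the same `ID:` pattern used in the script pulls the token out of a description like this (the description text here is made up):

import re

# Same pattern the script uses to find "ID: <value>" in a description.
ID_PATTERN = re.compile(r"\bID:\s*([A-Za-z0-9_-]+)\b", re.IGNORECASE)

description = "Campus aerial view, 1968. ID: ms123_0042"  # made-up example text
match = ID_PATTERN.search(description)
print(match.group(1) if match else "")  # -> ms123_0042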
Running the Script
Example:
python .\flickr_metadata_export.py
Interactive prompts:
- Choose `album` or `date`.
- Enter the album ID or album URL, or enter the minimum and maximum upload dates.
- Enter an output CSV path or accept the default filename.
Output Columns
The current script writes these columns:
- image_supplier_image_id
- photo_id
- title
- description
- date_upload_unix
- date_taken
- owner_nsid
- owner_name
- path_alias
- license
- media
- tags
- views
- original_format
- original_width
- original_height
- original_url
- embed_url_original
- large_url
- medium_url
- small_url
- photo_page_url
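Because the header row is fixed, downstream reuse can be a simple `csv.DictReader` pass. This sketch assumes the default album-mode filename and prints each photo ID with its Flickr page URL:

import csv

# Read the export and list the Flickr page URL for each photo ID.
with open("flickr_export_album.csv", newline="", encoding="utf-8-sig") as f:
    for row in csv.DictReader(f):
        print(row["photo_id"], row["photo_page_url"])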
Recommended Use
- Save the CSV beside the project files or processing documentation for that Flickr batch.
- Reuse exported URLs in downstream systems instead of manually copying them from the Flickr interface one at a time.
- Keep the script and the local INI file out of public repos unless secrets are removed and the setup is rewritten safely.
Script Copy
#!/usr/bin/env python3
import configparser
import csv
import datetime as dt
import re
import sys
import time
from pathlib import Path
from urllib.parse import parse_qs, urlparse
import requests
API_URL = "https://api.flickr.com/services/rest/"
EXTRAS = ",".join([
"date_upload", "date_taken", "description", "license", "media",
"machine_tags", "o_dims", "original_format", "owner_name", "path_alias",
"tags", "url_o", "url_l", "url_m", "url_z", "views"
])
ID_PATTERN = re.compile(r"\bID:\s*([A-Za-z0-9_-]+)\b", re.IGNORECASE)
# Configuration note:
# This script expects a `flickr_accounts.ini` file in the same directory as
# this script. That INI file should include the Flickr API key and the Flickr
# user ID (NSID) for the account being exported.
#
# Important:
# - `api_key` is your Flickr API application key.
# - `user_id` is the Flickr account NSID for the account owner.
# - Do not put the API secret in the `user_id` field.
#
# How to get an API key:
# - You must have a Flickr subscription.
# - In Flickr, go to Account Settings > Sharing and Extending > API Keys.
# - Create a new key there, or view an existing one if you already have it.
#
# How to get the Flickr user ID (NSID):
# - The Flickr screen name or account URL alias is not the same as the user ID.
# - Use the Flickr API method `flickr.urls.lookupUser` with your account URL.
# - Example:
# https://www.flickr.com/services/rest/?method=flickr.urls.lookupUser&api_key=YOUR_API_KEY&url=https://www.flickr.com/people/valdosta_archives/&format=json&nojsoncallback=1
# - The response will include `user.id`, which is the value to place in
# `flickr_accounts.ini` as `user_id`.
def flickr_call(api_key, method, pause=0, **params):
r = requests.get(API_URL, params={
"method": method,
"api_key": api_key,
"format": "json",
"nojsoncallback": 1,
**params,
}, timeout=60)
r.raise_for_status()
data = r.json()
if data.get("stat") != "ok":
raise RuntimeError(f"{method} failed: {data.get('message')}")
if pause:
time.sleep(pause)
return data
def load_settings():
    script_dir = Path(__file__).resolve().parent
config_path = script_dir / "flickr_accounts.ini"
if not config_path.exists():
raise FileNotFoundError(f"Config file not found: {config_path}")
config = configparser.ConfigParser()
config.read(config_path, encoding="utf-8")
if "default" not in config:
raise KeyError(f"Missing [default] section in {config_path}")
api_key = config["default"].get("api_key", "").strip()
user_id = config["default"].get("user_id", "").strip()
if not api_key:
raise ValueError("Missing api_key in [default] section.")
if not user_id:
raise ValueError("Missing user_id in [default] section.")
return {
"api_key": api_key,
"user_id": user_id,
"config_path": config_path,
"script_dir": script_dir,
}
def parse_album_id(value):
value = value.strip()
if not value:
raise ValueError("Album ID or album URL is required.")
if value.isdigit():
return value
parsed = urlparse(value)
parts = [p for p in parsed.path.split("/") if p]
if "albums" in parts:
i = parts.index("albums")
if i + 1 < len(parts):
return parts[i + 1]
query = parse_qs(parsed.query)
for key in ("set", "photoset"):
if query.get(key):
return query[key][0]
raise ValueError(f"Could not extract album ID from: {value}")
def parse_date(value):
return int(
dt.datetime.strptime(value, "%Y-%m-%d")
.replace(tzinfo=dt.timezone.utc)
.timestamp()
)
def get_original_url(api_key, photo_id):
data = flickr_call(api_key, "flickr.photos.getSizes", photo_id=photo_id)
for size in data.get("sizes", {}).get("size", []):
if size.get("label") == "Original":
return size.get("source", "")
return ""
def photo_page_url(photo):
owner = photo.get("pathalias") or photo.get("owner", "")
return f"https://www.flickr.com/photos/{owner}/{photo['id']}/"
def normalize_description(photo):
desc = photo.get("description", "")
return desc.get("_content", "") if isinstance(desc, dict) else desc
def extract_machine_tag_value(machine_tags):
for tag in machine_tags.split():
if ":" not in tag or "=" not in tag:
continue
namespace_predicate, value = tag.split("=", 1)
_, predicate = namespace_predicate.split(":", 1)
normalized = predicate.lower().replace("-", "").replace("_", "")
if normalized in {
"imagesupplierimageid",
"supplierimageid",
"imageid",
"identifier",
}:
return value.strip('"')
return ""
def extract_image_supplier_image_id(photo):
value = extract_machine_tag_value(photo.get("machine_tags", ""))
if value:
return Path(value).stem
description = normalize_description(photo)
match = ID_PATTERN.search(description)
if match:
return match.group(1)
return ""
def prompt_mode():
while True:
value = input("Export by album or upload date? Enter 'album' or 'date': ").strip().lower()
if value in ("album", "date"):
return value
print("Please enter 'album' or 'date'.")
def prompt_album_id():
value = input("Enter the Flickr album ID or album URL: ").strip()
return parse_album_id(value)
def prompt_date_range():
min_date = input("Enter the minimum upload date (YYYY-MM-DD): ").strip()
max_date = input("Enter the maximum upload date (YYYY-MM-DD): ").strip()
parse_date(min_date)
parse_date(max_date)
return min_date, max_date
def prompt_output_path(script_dir, mode):
default_name = f"flickr_export_{mode}.csv"
value = input(f"Enter output CSV path [{default_name}]: ").strip()
if not value:
value = default_name
output_path = Path(value)
if not output_path.is_absolute():
output_path = script_dir / output_path
return output_path
def collect_album(api_key, user_id, album_id):
page, rows = 1, []
while True:
data = flickr_call(
api_key,
"flickr.photosets.getPhotos",
user_id=user_id,
photoset_id=album_id,
extras=EXTRAS,
per_page=500,
page=page,
)
photoset = data["photoset"]
rows.extend(photoset.get("photo", []))
if page >= int(photoset["pages"]):
return rows
page += 1
def collect_date_range(api_key, user_id, min_upload_date, max_upload_date):
page, rows = 1, []
min_date = parse_date(min_upload_date)
max_date = parse_date(max_upload_date) + 86399
while True:
data = flickr_call(
api_key,
"flickr.people.getPhotos",
user_id=user_id,
extras=EXTRAS,
per_page=500,
page=page,
min_upload_date=min_date,
max_upload_date=max_date,
)
photos = data["photos"]
rows.extend(photos.get("photo", []))
if page >= int(photos["pages"]):
return rows
page += 1
def to_row(api_key, photo):
original_url = photo.get("url_o", "")
if not original_url:
original_url = get_original_url(api_key, photo["id"])
return {
"image_supplier_image_id": extract_image_supplier_image_id(photo),
"photo_id": photo.get("id", ""),
"title": photo.get("title", ""),
"description": normalize_description(photo),
"date_upload_unix": photo.get("dateupload", ""),
"date_taken": photo.get("datetaken", ""),
"owner_nsid": photo.get("owner", ""),
"owner_name": photo.get("ownername", ""),
"path_alias": photo.get("pathalias", ""),
"license": photo.get("license", ""),
"media": photo.get("media", ""),
"tags": photo.get("tags", ""),
"views": photo.get("views", ""),
"original_format": photo.get("originalformat", ""),
"original_width": photo.get("width_o", ""),
"original_height": photo.get("height_o", ""),
"original_url": original_url,
"embed_url_original": original_url,
"large_url": photo.get("url_l", ""),
"medium_url": photo.get("url_m", ""),
"small_url": photo.get("url_z", ""),
"photo_page_url": photo_page_url(photo),
}
def write_csv(output_path, rows):
fieldnames = [
"image_supplier_image_id", "photo_id", "title", "description",
"date_upload_unix", "date_taken", "owner_nsid", "owner_name",
"path_alias", "license", "media", "tags", "views",
"original_format", "original_width", "original_height",
"original_url", "embed_url_original", "large_url", "medium_url",
"small_url", "photo_page_url"
]
output_path.parent.mkdir(parents=True, exist_ok=True)
with output_path.open("w", newline="", encoding="utf-8-sig") as f:
writer = csv.DictWriter(f, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
def main():
settings = load_settings()
api_key = settings["api_key"]
user_id = settings["user_id"]
script_dir = settings["script_dir"]
print("Flickr Metadata Export")
print(f"Using config: {settings['config_path']}")
print("Using profile: [default]")
print()
mode = prompt_mode()
if mode == "album":
album_id = prompt_album_id()
output_path = prompt_output_path(script_dir, "album")
photos = collect_album(api_key, user_id, album_id)
else:
min_date, max_date = prompt_date_range()
output_path = prompt_output_path(script_dir, "date")
photos = collect_date_range(api_key, user_id, min_date, max_date)
rows = [to_row(api_key, photo) for photo in photos]
write_csv(output_path, rows)
print()
print(f"Exported {len(rows)} photos to:")
print(output_path)
if __name__ == "__main__":
try:
main()
except Exception as e:
print(f"Error: {e}", file=sys.stderr)
input("Press Enter to close...")
sys.exit(1)
Extended EXIF Export Script
This variant is useful when you want a richer export for local preservation tracking, website reuse, or downstream metadata cleanup. It adds:
- EXIF export for each photo
- direct Flickr page link
- thumbnail URL
- medium URL
- original URL
- a JSON-packed `exif_json` column containing all returned EXIF tags
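Because the EXIF data lands in one JSON-packed cell per photo, it can be expanded again later. This sketch assumes the default album-mode filename for this variant and prints the raw tag values for the first photo:

import csv
import json

# Expand the exif_json column back into individual tag/value pairs.
with open("flickr_export_exif_album.csv", newline="", encoding="utf-8-sig") as f:
    for row in csv.DictReader(f):
        for entry in json.loads(row["exif_json"] or "[]"):
            print(row["photo_id"], entry["tag"], entry["raw"])
        break  # first photo only, as an illustration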
Note
This version makes additional API calls per photo. Large exports will take longer than the basic script.
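If a large export needs to be gentler on the API, the script's own `flickr_call` helper already accepts a `pause` argument. As one possible adjustment (not the current default), the EXIF request inside `get_exif_json` could be throttled like this:

# Optional throttle: sleep half a second after each EXIF request (illustrative value).
data = flickr_call(api_key, "flickr.photos.getExif", pause=0.5, photo_id=photo_id)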
#!/usr/bin/env python3
import configparser
import csv
import datetime as dt
import json
import re
import sys
import time
from pathlib import Path
from urllib.parse import parse_qs, urlparse
import requests
API_URL = "https://api.flickr.com/services/rest/"
EXTRAS = ",".join([
"date_upload", "date_taken", "description", "license", "media",
"machine_tags", "o_dims", "original_format", "owner_name", "path_alias",
"tags", "url_o", "url_l", "url_m", "url_z", "views"
])
ID_PATTERN = re.compile(r"\bID:\s*([A-Za-z0-9_-]+)\b", re.IGNORECASE)
def flickr_call(api_key, method, pause=0, **params):
response = requests.get(API_URL, params={
"method": method,
"api_key": api_key,
"format": "json",
"nojsoncallback": 1,
**params,
}, timeout=60)
response.raise_for_status()
data = response.json()
if data.get("stat") != "ok":
raise RuntimeError(f"{method} failed: {data.get('message')}")
if pause:
time.sleep(pause)
return data
def load_settings():
script_dir = Path(__file__).resolve().parent
config_path = script_dir / "flickr_accounts.ini"
if not config_path.exists():
raise FileNotFoundError(f"Config file not found: {config_path}")
config = configparser.ConfigParser()
config.read(config_path, encoding="utf-8")
if "default" not in config:
raise KeyError(f"Missing [default] section in {config_path}")
api_key = config["default"].get("api_key", "").strip()
user_id = config["default"].get("user_id", "").strip()
if not api_key:
raise ValueError("Missing api_key in [default] section.")
if not user_id:
raise ValueError("Missing user_id in [default] section.")
return {
"api_key": api_key,
"user_id": user_id,
"config_path": config_path,
"script_dir": script_dir,
}
def parse_album_id(value):
value = value.strip()
if not value:
raise ValueError("Album ID or album URL is required.")
if value.isdigit():
return value
parsed = urlparse(value)
parts = [part for part in parsed.path.split("/") if part]
if "albums" in parts:
index = parts.index("albums")
if index + 1 < len(parts):
return parts[index + 1]
query = parse_qs(parsed.query)
for key in ("set", "photoset"):
if query.get(key):
return query[key][0]
raise ValueError(f"Could not extract album ID from: {value}")
def parse_date(value):
return int(
dt.datetime.strptime(value, "%Y-%m-%d")
.replace(tzinfo=dt.timezone.utc)
.timestamp()
)
def normalize_description(photo):
desc = photo.get("description", "")
return desc.get("_content", "") if isinstance(desc, dict) else desc
def extract_machine_tag_value(machine_tags):
for tag in machine_tags.split():
if ":" not in tag or "=" not in tag:
continue
namespace_predicate, value = tag.split("=", 1)
_, predicate = namespace_predicate.split(":", 1)
normalized = predicate.lower().replace("-", "").replace("_", "")
if normalized in {
"imagesupplierimageid",
"supplierimageid",
"imageid",
"identifier",
}:
return value.strip('"')
return ""
def extract_image_supplier_image_id(photo):
value = extract_machine_tag_value(photo.get("machine_tags", ""))
if value:
return Path(value).stem
description = normalize_description(photo)
match = ID_PATTERN.search(description)
if match:
return match.group(1)
return ""
def photo_page_url(photo):
owner = photo.get("pathalias") or photo.get("owner", "")
return f"https://www.flickr.com/photos/{owner}/{photo['id']}/"
def prompt_mode():
while True:
value = input("Export by album or upload date? Enter 'album' or 'date': ").strip().lower()
if value in ("album", "date"):
return value
print("Please enter 'album' or 'date'.")
def prompt_album_id():
value = input("Enter the Flickr album ID or album URL: ").strip()
return parse_album_id(value)
def prompt_date_range():
min_date = input("Enter the minimum upload date (YYYY-MM-DD): ").strip()
max_date = input("Enter the maximum upload date (YYYY-MM-DD): ").strip()
parse_date(min_date)
parse_date(max_date)
return min_date, max_date
def prompt_output_path(script_dir, mode):
default_name = f"flickr_export_exif_{mode}.csv"
value = input(f"Enter output CSV path [{default_name}]: ").strip()
if not value:
value = default_name
output_path = Path(value)
if not output_path.is_absolute():
output_path = script_dir / output_path
return output_path
def collect_album(api_key, user_id, album_id):
page, rows = 1, []
while True:
data = flickr_call(
api_key,
"flickr.photosets.getPhotos",
user_id=user_id,
photoset_id=album_id,
extras=EXTRAS,
per_page=500,
page=page,
)
photoset = data["photoset"]
rows.extend(photoset.get("photo", []))
if page >= int(photoset["pages"]):
return rows
page += 1
def collect_date_range(api_key, user_id, min_upload_date, max_upload_date):
page, rows = 1, []
min_date = parse_date(min_upload_date)
max_date = parse_date(max_upload_date) + 86399
while True:
data = flickr_call(
api_key,
"flickr.people.getPhotos",
user_id=user_id,
extras=EXTRAS,
per_page=500,
page=page,
min_upload_date=min_date,
max_upload_date=max_date,
)
photos = data["photos"]
rows.extend(photos.get("photo", []))
if page >= int(photos["pages"]):
return rows
page += 1
def get_sizes_map(api_key, photo_id):
data = flickr_call(api_key, "flickr.photos.getSizes", photo_id=photo_id)
sizes = {}
for size in data.get("sizes", {}).get("size", []):
label = size.get("label", "").lower()
sizes[label] = size.get("source", "")
return sizes
def get_exif_json(api_key, photo_id):
try:
data = flickr_call(api_key, "flickr.photos.getExif", photo_id=photo_id)
except RuntimeError as err:
message = str(err).lower()
if "permission denied" in message or "photo not found" in message:
return ""
raise
exif_entries = []
for entry in data.get("photo", {}).get("exif", []):
exif_entries.append({
"tagspace": entry.get("tagspace", ""),
"tagspaceid": entry.get("tagspaceid", ""),
"tag": entry.get("tag", ""),
"label": entry.get("label", ""),
"raw": entry.get("raw", {}).get("_content", ""),
"clean": entry.get("clean", {}).get("_content", ""),
})
return json.dumps(exif_entries, ensure_ascii=False)
def to_row(api_key, photo):
sizes = get_sizes_map(api_key, photo["id"])
original_url = photo.get("url_o", "") or sizes.get("original", "")
medium_url = photo.get("url_m", "") or sizes.get("medium", "") or sizes.get("medium 640", "")
thumbnail_url = sizes.get("thumbnail", "") or sizes.get("square", "") or sizes.get("small square", "")
return {
"image_supplier_image_id": extract_image_supplier_image_id(photo),
"photo_id": photo.get("id", ""),
"title": photo.get("title", ""),
"description": normalize_description(photo),
"date_upload_unix": photo.get("dateupload", ""),
"date_taken": photo.get("datetaken", ""),
"owner_nsid": photo.get("owner", ""),
"owner_name": photo.get("ownername", ""),
"path_alias": photo.get("pathalias", ""),
"license": photo.get("license", ""),
"media": photo.get("media", ""),
"tags": photo.get("tags", ""),
"views": photo.get("views", ""),
"original_format": photo.get("originalformat", ""),
"original_width": photo.get("width_o", ""),
"original_height": photo.get("height_o", ""),
"thumbnail_url": thumbnail_url,
"medium_url": medium_url,
"original_url": original_url,
"photo_page_url": photo_page_url(photo),
"exif_json": get_exif_json(api_key, photo["id"]),
}
def write_csv(output_path, rows):
fieldnames = [
"image_supplier_image_id", "photo_id", "title", "description",
"date_upload_unix", "date_taken", "owner_nsid", "owner_name",
"path_alias", "license", "media", "tags", "views",
"original_format", "original_width", "original_height",
"thumbnail_url", "medium_url", "original_url", "photo_page_url",
"exif_json",
]
output_path.parent.mkdir(parents=True, exist_ok=True)
with output_path.open("w", newline="", encoding="utf-8-sig") as handle:
writer = csv.DictWriter(handle, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
def main():
settings = load_settings()
api_key = settings["api_key"]
user_id = settings["user_id"]
script_dir = settings["script_dir"]
print("Flickr Metadata + EXIF Export")
print(f"Using config: {settings['config_path']}")
print("Using profile: [default]")
print()
mode = prompt_mode()
if mode == "album":
album_id = prompt_album_id()
output_path = prompt_output_path(script_dir, "album")
photos = collect_album(api_key, user_id, album_id)
else:
min_date, max_date = prompt_date_range()
output_path = prompt_output_path(script_dir, "date")
photos = collect_date_range(api_key, user_id, min_date, max_date)
rows = [to_row(api_key, photo) for photo in photos]
write_csv(output_path, rows)
print()
print(f"Exported {len(rows)} photos to:")
print(output_path)
if __name__ == "__main__":
try:
main()
except Exception as err:
print(f"Error: {err}", file=sys.stderr)
input("Press Enter to close...")
sys.exit(1)