berkmancenter/transparency-hub

Transparency Hub

A comprehensive platform for tracking and analyzing platform policies over time.

🌐 Live Site: https://hub.transparency.berkmancenter.org

📚 Project Page: https://asml.cyber.harvard.edu/transparency-hub/

About

Transparency Hub is an open-source initiative by the Berkman Klein Center for Internet & Society at Harvard University. It provides researchers, journalists, and the public with tools to:

  • Track policy changes across major platforms
  • Compare policies between different platforms
  • Access historical policy documents
  • Analyze transparency report data

Features

  • Policy Index: Browse and search platform policies by company
  • Comparison Tool: Side-by-side comparison of policies across platforms
  • Project Database: Explore research projects related to platform transparency
  • WARC Archive Integration: Access archived policy documents
  • Change Tracking: Monitor when policies are updated
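As a rough illustration of what change tracking involves, the sketch below compares successive snapshots of a policy and reports the capture dates at which the text differs from the previous snapshot. It is not taken from this repository: the `PolicySnapshot` shape, the `findChangeDates` function, and the whitespace normalization rule are all assumptions made for the example.

```typescript
// Hypothetical snapshot shape for illustration; the real schema may differ.
interface PolicySnapshot {
  capturedAt: string; // ISO 8601 date, e.g. "2024-01-01"
  text: string;
}

// Collapse runs of whitespace so that pure formatting differences
// are not reported as policy changes.
function normalize(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

// Return the capture dates at which the normalized text differs
// from the snapshot immediately before it (in date order).
function findChangeDates(snapshots: PolicySnapshot[]): string[] {
  // Sort a copy by date; same-format ISO 8601 strings order correctly as text.
  const ordered = [...snapshots].sort((a, b) =>
    a.capturedAt < b.capturedAt ? -1 : a.capturedAt > b.capturedAt ? 1 : 0
  );

  const changes: string[] = [];
  let previous: string | undefined;
  for (const snapshot of ordered) {
    const current = normalize(snapshot.text);
    if (previous !== undefined && current !== previous) {
      changes.push(snapshot.capturedAt);
    }
    previous = current;
  }
  return changes;
}
```

Comparing normalized text keeps the example simple; a production pipeline would more likely store a content hash per snapshot and diff only when the hash changes.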

Tech Stack

  • Framework: Next.js 16 with App Router
  • Language: TypeScript
  • Database: MongoDB
  • Storage: Google Cloud Storage
  • Styling: Tailwind CSS
  • Analytics: Vercel Analytics

Getting Started

Prerequisites

  • Node.js 20.9+ and npm (Next.js 16 no longer supports Node.js 18)
  • MongoDB instance
  • Google Cloud Storage bucket (for WARC files)

Installation

  1. Clone the repository:

    git clone https://github.com/berkmancenter/transparency-hub.git
    cd transparency-hub
  2. Install dependencies:

    npm install
  3. Set up environment variables:

    cp .env.example .env.local

    Then edit .env.local with your configuration (see Configuration below).

  4. Run the development server:

    npm run dev
  5. Open http://localhost:3002 in your browser.

Configuration

Create a .env.local file in the root directory with the following variables:

# MongoDB Configuration
NEXT_ATLAS_URI=mongodb://localhost:27017
NEXT_ATLAS_DATABASE=transparency_hub

# Google Cloud Storage (optional, for WARC proxy)
GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
GCS_BUCKET_NAME=your-bucket-name

See .env.example for more details.
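Next.js loads .env.local into process.env automatically, so the application can read these values at startup. The sketch below shows one way to validate them early and fail with a clear message. It is an assumption-laden example, not the project's actual code: the `HubConfig` type and `loadConfig` function are hypothetical names, and only the four variable names come from the listing above.

```typescript
// Hypothetical config shape for illustration; not taken from the repository.
interface HubConfig {
  mongoUri: string;
  mongoDatabase: string;
  // Optional: only needed when the WARC proxy is used.
  gcs?: { credentialsPath: string; bucketName: string };
}

type Env = Record<string, string | undefined>;

// Validate the environment and return a typed config object.
// Throws if a required MongoDB variable is missing or empty.
function loadConfig(env: Env): HubConfig {
  const mongoUri = env.NEXT_ATLAS_URI;
  const mongoDatabase = env.NEXT_ATLAS_DATABASE;
  if (!mongoUri || !mongoDatabase) {
    throw new Error("NEXT_ATLAS_URI and NEXT_ATLAS_DATABASE must be set");
  }

  const config: HubConfig = { mongoUri, mongoDatabase };

  // Include the GCS settings only when both values are present.
  const credentialsPath = env.GOOGLE_APPLICATION_CREDENTIALS;
  const bucketName = env.GCS_BUCKET_NAME;
  if (credentialsPath && bucketName) {
    config.gcs = { credentialsPath, bucketName };
  }
  return config;
}
```

In server-side code this would typically be called once as `loadConfig(process.env)`, so a missing variable surfaces at startup instead of at the first database query.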

To request access to the live Transparency Hub data, please fill out this form.

Project Structure

transparency_hub/
├── app/                    # Next.js App Router pages
│   ├── api/                # API routes
│   ├── policy_index/       # Policy browsing pages
│   ├── comparison_tool/    # Policy comparison interface
│   ├── projects/           # Research projects showcase
│   └── lib/                # Database connections
├── components/             # React components
│   ├── ui/                 # UI components
│   └── lib/                # Shared component utilities
├── public/                 # Static assets
└── types/                  # TypeScript type definitions

Development

Available Scripts

  • npm run dev - Start development server on port 3002
  • npm run build - Build for production
  • npm start - Start production server
  • npm run lint - Run ESLint

Code Style

This project uses:

  • ESLint for code linting
  • TypeScript for type safety
  • Prettier-style formatting

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Data Collection

The platform aggregates data from publicly available sources. For information about our data collection practices, see our Privacy Policy.

License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). That license applies to the source code only. This project depends on caniuse-lite, a dataset licensed under CC BY 4.0 by caniuse.com. The CC BY 4.0 license applies to that dataset; the AGPL-3.0 license does not.

The AGPL-3.0 license requires that:

  • Source code must be made available when the software is run as a network service
  • Modifications must also be released under AGPL-3.0
  • Users interacting with the software over a network must be able to access the source code

Related Repositories

This project is part of the Transparency Hub ecosystem:

Repository                    Description
Transparency Hub (this repo)  Next.js frontend: the public-facing website
Transparency Archiver         Python pipeline that crawls and archives policy documents
Browsertrix Crawler Fork      Custom fork of Browsertrix Crawler used by the archiver

Acknowledgments

Contact

For questions or collaboration inquiries, please visit our project page or open an issue on GitHub.

Links