A comprehensive platform for tracking and analyzing platform policies over time.
Live Site: https://hub.transparency.berkmancenter.org
Project Page: https://asml.cyber.harvard.edu/transparency-hub/
Transparency Hub is an open-source initiative by the Berkman Klein Center for Internet & Society at Harvard University. It provides researchers, journalists, and the public with tools to:
- Track policy changes across major platforms
- Compare policies between different platforms
- Access historical policy documents
- Analyze transparency report data
Features:
- Policy Index: Browse and search platform policies by company
- Comparison Tool: Side-by-side comparison of policies across platforms
- Project Database: Explore research projects related to platform transparency
- WARC Archive Integration: Access archived policy documents
- Change Tracking: Monitor when policies are updated
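As a rough illustration of what change tracking involves, the sketch below flags the capture dates at which a policy's text differs from the previous snapshot by comparing SHA-256 fingerprints. It is not taken from this codebase: the `PolicySnapshot` shape and the `fingerprint` and `findChangeDates` helpers are hypothetical names, and whitespace normalization is an assumed design choice.

```typescript
import { createHash } from "node:crypto";

// Hypothetical snapshot shape; not the project's actual schema.
interface PolicySnapshot {
  capturedAt: string; // ISO 8601 timestamp, assumed to use one consistent format
  text: string; // extracted policy text
}

// Collapse whitespace so formatting-only differences are not reported as changes.
function fingerprint(text: string): string {
  const normalized = text.replace(/\s+/g, " ").trim();
  return createHash("sha256").update(normalized, "utf8").digest("hex");
}

// Return the capture times whose text differs from the preceding snapshot.
function findChangeDates(snapshots: PolicySnapshot[]): string[] {
  const ordered = [...snapshots].sort((a, b) =>
    a.capturedAt < b.capturedAt ? -1 : a.capturedAt > b.capturedAt ? 1 : 0
  );
  const changes: string[] = [];
  let previous: string | undefined;
  for (const snap of ordered) {
    const current = fingerprint(snap.text);
    if (previous !== undefined && current !== previous) {
      changes.push(snap.capturedAt);
    }
    previous = current;
  }
  return changes;
}
```

Hashing keeps the comparison cheap when only the fact of a change matters; showing what changed would need a text diff instead.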
Tech stack:
- Framework: Next.js 16 with App Router
- Language: TypeScript
- Database: MongoDB
- Storage: Google Cloud Storage
- Styling: Tailwind CSS
- Analytics: Vercel Analytics
Prerequisites:
- Node.js 18+ and npm
- MongoDB instance
- Google Cloud Storage bucket (for WARC files)
- Clone the repository:

      git clone https://github.com/berkmancenter/transparency-hub.git
      cd transparency_hub

- Install dependencies:

      npm install

- Set up environment variables:

      cp .env.example .env.local

  Then edit .env.local with your configuration (see Configuration below).

- Run the development server:

      npm run dev

- Open http://localhost:3002 in your browser.
Create a .env.local file in the root directory with the following variables:
# MongoDB Configuration
NEXT_ATLAS_URI=mongodb://localhost:27017
NEXT_ATLAS_DATABASE=transparency_hub
# Google Cloud Storage (optional, for WARC proxy)
GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
GCS_BUCKET_NAME=your-bucket-name

See .env.example for more details.
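One way to read these variables with early failure on missing values is sketched below. This is an illustration rather than the project's actual loader: `HubConfig` and `loadConfig` are hypothetical names, and treating the two MongoDB variables as required is an assumption based on MongoDB being a listed dependency and the storage settings being marked optional.

```typescript
// Hypothetical typed view of the variables listed above.
interface HubConfig {
  mongoUri: string;
  mongoDatabase: string;
  gcsBucketName?: string; // optional: only used by the WARC proxy
}

// Accepts any env-like object so it can be exercised without touching process.env.
function loadConfig(env: Record<string, string | undefined>): HubConfig {
  const required = (name: string): string => {
    const value = env[name];
    if (!value) {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
  };

  return {
    mongoUri: required("NEXT_ATLAS_URI"),
    mongoDatabase: required("NEXT_ATLAS_DATABASE"),
    // An empty string is treated the same as an unset variable.
    gcsBucketName: env["GCS_BUCKET_NAME"] || undefined,
  };
}
```

In server-side code this would typically be called once as `loadConfig(process.env)`, so that a misconfigured .env.local surfaces at startup rather than on the first database query.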
To request access to the live Transparency Hub data, please fill out this form.
transparency_hub/
├── app/                    # Next.js App Router pages
│   ├── api/                # API routes
│   ├── policy_index/       # Policy browsing pages
│   ├── comparison_tool/    # Policy comparison interface
│   ├── projects/           # Research projects showcase
│   └── lib/                # Database connections
├── components/             # React components
│   ├── ui/                 # UI components
│   └── lib/                # Shared component utilities
├── public/                 # Static assets
└── types/                  # TypeScript type definitions
- npm run dev: Start development server on port 3002
- npm run build: Build for production
- npm start: Start production server
- npm run lint: Run ESLint
This project uses:
- ESLint for code linting
- TypeScript for type safety
- Prettier-style formatting
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
The platform aggregates data from publicly available sources. For information about our data collection practices, see our Privacy Policy.
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). That license applies to the source code only. This project depends on caniuse-lite, a dataset licensed under CC BY 4.0 by caniuse.com. The CC BY 4.0 license applies to that dataset; the AGPL-3.0 license does not.
The AGPL-3.0 license requires that:
- Source code must be made available when the software is run as a network service
- Modifications must also be released under AGPL-3.0
- Users interacting with the software over a network must be able to access the source code
This project is part of the Transparency Hub ecosystem:
| Repository | Description |
|---|---|
| Transparency Hub (this repo) | Next.js frontend (the public-facing website) |
| Transparency Archiver | Python pipeline that crawls and archives policy documents |
| Browsertrix Crawler Fork | Custom fork of Browsertrix Crawler used by the archiver |
- Developed by the Berkman Klein Center for Internet & Society
- A project of the Applied Social Media Lab initiative
For questions or collaboration inquiries, please visit our project page or open an issue on GitHub.