A W3C Entity Reconciliation API compliant service for matching and reconciling cultural data against the Artsdata Knowledge Graph. This service enables data integration workflows for cultural organizations, allowing them to match their event, organization, person, place, and other entity data against Artsdata's authoritative cultural database.
Built with NestJS, this service provides a robust and scalable reconciliation endpoint compatible with tools like OpenRefine.
- W3C Reconciliation API Compliant: Full support for the W3C Entity Reconciliation API specification
- Multiple Entity Types: Reconcile events, organizations, people, places, agents, concepts, and more
- Batch Reconciliation: Process multiple entities in a single request
- GraphDB Integration: Direct connection to Artsdata's GraphDB instance
- OpenAPI/Swagger Documentation: Interactive API documentation available at /api
- Docker Support: Easy deployment with Docker and Docker Compose
- Heroku Ready: Configured for seamless Heroku deployment
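
To illustrate the batch reconciliation feature listed above, here is a minimal sketch of how a client might assemble several queries into one request body. It assumes the request shape of version 0.2 of the W3C specification (a form-encoded `queries` parameter holding a JSON object keyed by query ID); the version this service actually implements should be confirmed in its Swagger documentation. The helper names, the sample entity names, and the `schema:Place` / `schema:Organization` type identifiers are hypothetical.

```typescript
// Sketch only: helper names and type identifiers are illustrative assumptions,
// not taken from the service's source code.

interface ReconciliationQuery {
  query: string;   // free-text name to match
  type?: string;   // entity type identifier (hypothetical values below)
  limit?: number;  // maximum number of candidates to return
}

type QueryBatch = Record<string, ReconciliationQuery>;

// Give each query a key (q0, q1, ...) so results can be matched back to inputs.
function buildQueryBatch(queries: ReconciliationQuery[]): QueryBatch {
  const batch: QueryBatch = {};
  queries.forEach((q, i) => {
    batch[`q${i}`] = q;
  });
  return batch;
}

// Encode the batch as a form body with a single "queries" parameter.
function encodeFormBody(batch: QueryBatch): string {
  return "queries=" + encodeURIComponent(JSON.stringify(batch));
}

const sampleBatch = buildQueryBatch([
  { query: "Example Concert Hall", type: "schema:Place", limit: 3 },
  { query: "Example Theatre Company", type: "schema:Organization" },
]);
const formBody = encodeFormBody(sampleBatch);
```

The resulting string would be sent as the body of a POST request with the `Content-Type: application/x-www-form-urlencoded` header. The keys `q0`, `q1`, and so on are arbitrary; a response in this style reuses them so each result set can be paired with its query.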
- Node.js: ^22.17.0
- npm: ^10.9.2
- Docker (optional): For containerized deployment
- GraphDB: Access to an Artsdata GraphDB instance (or use staging)
The application uses the following environment variables. Create a .env file in the root directory based on .env.sample:
| Variable | Description | Default Value |
|---|---|---|
| ENVIRONMENT | Deployment environment (staging/production) | staging |
| PORT | HTTP port for the server | 3000 |
| ARTSDATA_ENDPOINT | GraphDB endpoint URL | https://staging.db.artsdata.ca/ |
| REPOSITORY | GraphDB repository name | artsdata |
| ARTSDATA_USER | GraphDB username (optional) | - |
| ARTSDATA_PASSWORD | GraphDB password (optional) | - |
| Index Configuration | | |
| EVENT | Event entity index name | event-index |
| ENTITY | General entity index name | entity-index |
| PLACE | Place entity index name | place-index |
| ORGANIZATION | Organization entity index name | organization-index |
| PERSON | Person entity index name | person-index |
| AGENT | Agent entity index name | agent-index |
| CONCEPT | Concept entity index name | concept-index |
| EVENT_TYPE | Event type index name | event-type-index |
| LIVE_PERFORMANCE_WORK | Live performance work index name | live-performance-work-index |
| DEFAULT | Default resource index name | resource-index |
| PROPERTY | Property index name | property-index |
| TYPE | Type index name | type-index |
| LABELLED_ENTITIES | Labeled entities index name | all-literals |
| Feature Flags | | |
| ENABLE_EVENT_BATCH_RECONCILIATION | Enable batch reconciliation for events | true |
| LOG_QUERIES | Enable query logging for debugging | true |
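
As a rough illustration of how these variables and their defaults fit together, the sketch below resolves a configuration object from an environment map. It is not the service's actual configuration code: `loadConfig`, `parseFlag`, and `AppConfig` are hypothetical names, the index variables are omitted for brevity, and the rule that only the string "true" enables a flag is an assumption.

```typescript
// Sketch only: hypothetical names, not the service's implementation.
type Env = Record<string, string | undefined>;

interface AppConfig {
  environment: string;
  port: number;
  endpoint: string;
  repository: string;
  user: string | undefined;      // optional GraphDB credentials
  password: string | undefined;
  enableEventBatchReconciliation: boolean;
  logQueries: boolean;
}

// Assumption: only "true" (any case) enables a flag; unset values use the fallback.
function parseFlag(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value.trim() === "") return fallback;
  return value.trim().toLowerCase() === "true";
}

function loadConfig(env: Env): AppConfig {
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`Invalid PORT value: ${env["PORT"]}`);
  }
  return {
    environment: env["ENVIRONMENT"] ?? "staging",
    port,
    endpoint: env["ARTSDATA_ENDPOINT"] ?? "https://staging.db.artsdata.ca/",
    repository: env["REPOSITORY"] ?? "artsdata",
    user: env["ARTSDATA_USER"],
    password: env["ARTSDATA_PASSWORD"],
    enableEventBatchReconciliation: parseFlag(env["ENABLE_EVENT_BATCH_RECONCILIATION"], true),
    logQueries: parseFlag(env["LOG_QUERIES"], true),
  };
}
```

In a Node.js process this would be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test with a plain object.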
- Copy the sample environment file:
  cp .env.sample .env
- Edit .env and update the values as needed for your environment.
- Clone the repository:
  git clone https://github.com/culturecreates/artsdata-reconciliation.git
  cd artsdata-reconciliation
- Install dependencies:
  npm install
- Configure environment variables:
  cp .env.sample .env
  # Edit .env with your configuration
- Build the application:
  npm run build
- Start the application:
  # Development mode with hot-reload
  npm run start:dev
  # Production mode
  npm run start:prod
- Access the application:
  - API: http://localhost:3000
  - Swagger Documentation: http://localhost:3000/api
Docker Compose provides the easiest way to run the application in a containerized environment:
# Build and start the container
docker-compose up --build
# Run in detached mode (background)
docker-compose up -d --build
# View logs
docker-compose logs -f
# Stop the container
docker-compose down

The application will be available at http://localhost:3000.
If you prefer to use Docker without Docker Compose:
# Build the Docker image
docker build -t artsdata-reconciliation .
# Run the container
docker run -p 3000:3000 --env-file .env artsdata-reconciliation
# Or with environment variables directly
docker run -p 3000:3000 \
-e ARTSDATA_ENDPOINT=https://staging.db.artsdata.ca/ \
-e REPOSITORY=artsdata \
artsdata-reconciliation

This application is configured for deployment on Heroku with the included Procfile.
- Heroku CLI installed
- A Heroku account
- Log in to Heroku:
  heroku login
- Create a Heroku application:
  heroku create your-app-name
- Set environment variables on Heroku:
  # Set required environment variables
  heroku config:set ENVIRONMENT=production
  heroku config:set ARTSDATA_ENDPOINT=https://db.artsdata.ca/
  heroku config:set REPOSITORY=artsdata
  # Set optional credentials if needed
  heroku config:set ARTSDATA_USER=your_username
  heroku config:set ARTSDATA_PASSWORD=your_password
  # Set feature flags
  heroku config:set ENABLE_EVENT_BATCH_RECONCILIATION=true
  heroku config:set LOG_QUERIES=false
  # View all configured variables
  heroku config
- Deploy to Heroku:
  # Deploy from main branch
  git push heroku main
  # Or deploy from a different branch
  git push heroku your-branch:main
- Open your application:
  heroku open
- View logs:
  # Stream logs in real-time
  heroku logs --tail
  # View recent logs
  heroku logs --num 200
- The Procfile is configured to run npm run start:prod
- Heroku automatically sets the PORT environment variable
- The Node.js version is specified in the package.json engines field
- The buildpack will automatically detect and use Node.js
# Run all tests
npm test
# Run tests in watch mode
npm run test:watch
Once the application is running, you can access the interactive API documentation:
- Swagger UI: http://localhost:3000/api
The API follows the W3C Entity Reconciliation API specification.
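
To give a sense of what a client does with a reconciliation result, the sketch below picks one candidate per query from a response. It assumes the version 0.2 response shape of the W3C specification (an object keyed by query ID, each with a `result` array of candidates carrying `id`, `name`, `score`, and `match`); the version this service actually implements should be confirmed in its Swagger documentation. The function name, the selection rule, the score threshold, and the sample identifiers are all hypothetical.

```typescript
// Sketch only: the response shape is assumed from version 0.2 of the W3C
// specification, and the sample data below is made up.

interface Candidate {
  id: string;
  name: string;
  score: number;
  match: boolean; // true when the service considers the candidate a confident match
}

interface QueryResult {
  result: Candidate[];
}

type ReconciliationResponse = Record<string, QueryResult>;

// For each query, prefer the highest-scoring candidate flagged as a match;
// otherwise accept the highest-scoring candidate at or above minScore.
function pickBestCandidates(
  response: ReconciliationResponse,
  minScore: number,
): Record<string, Candidate | null> {
  const best: Record<string, Candidate | null> = {};
  for (const key of Object.keys(response)) {
    const candidates = response[key]?.result ?? [];
    const sorted = [...candidates].sort((a, b) => b.score - a.score);
    const chosen = sorted.find((c) => c.match) ?? sorted.find((c) => c.score >= minScore);
    best[key] = chosen ?? null;
  }
  return best;
}

const sampleResponse: ReconciliationResponse = {
  q0: {
    result: [
      { id: "example-2", name: "Example Hall (annex)", score: 62, match: false },
      { id: "example-1", name: "Example Concert Hall", score: 97, match: true },
    ],
  },
  q1: { result: [{ id: "example-3", name: "Another Venue", score: 40, match: false }] },
  q2: { result: [] },
};

const bestCandidates = pickBestCandidates(sampleResponse, 50);
```

Queries that come back as `null` here are the ones a person would typically review by hand, which is the same workflow tools like OpenRefine support.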
- Node.js - JavaScript runtime built on Chrome's V8 engine
- NestJS - Progressive Node.js framework for scalable server-side applications
- TypeScript - Typed superset of JavaScript
- GraphDB - Semantic graph database for RDF and SPARQL
- Docker - Containerization platform
- Swagger/OpenAPI - API documentation and testing
artsdata-reconciliation/
├── src/
│ ├── config/ # Configuration files
│ ├── controller/ # API controllers
│ ├── service/ # Business logic services
│ ├── dto/ # Data Transfer Objects
│ ├── enum/ # Enumerations
│ ├── interface/ # TypeScript interfaces
│ ├── helper/ # Helper utilities
│ └── main.ts # Application entry point
├── test/ # Test files
├── seed/ # Database seeding scripts
├── graph-db/ # GraphDB configuration
├── Dockerfile # Docker configuration
├── docker-compose.yml # Docker Compose configuration
├── Procfile # Heroku process configuration
└── .env.sample # Sample environment variables
If you encounter connection issues with GraphDB:
- Verify your ARTSDATA_ENDPOINT is correct
- Check if authentication is required (set ARTSDATA_USER and ARTSDATA_PASSWORD)
- Ensure network connectivity to the GraphDB instance
- Check the application logs for detailed error messages
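
One way to test the first three points independently of the service is to send a trivial SPARQL query straight to the repository. The sketch below only builds the URL for such a check. It assumes the repository is exposed at GraphDB's standard RDF4J-style path, `<endpoint>/repositories/<repository>`; whether this service uses exactly that path is not stated here, and the function names are hypothetical.

```typescript
// Sketch only: function names are hypothetical, and the /repositories/<name>
// path is assumed from GraphDB's standard RDF4J-style REST layout.

// Build the SPARQL endpoint URL for a repository, tolerating trailing slashes.
function repositoryUrl(endpoint: string, repository: string): string {
  const base = endpoint.replace(/\/+$/, "");
  return `${base}/repositories/${encodeURIComponent(repository)}`;
}

// Build a GET URL that runs a trivial ASK query against the repository.
function connectivityCheckUrl(endpoint: string, repository: string): string {
  const ask = "ASK { ?s ?p ?o }";
  return `${repositoryUrl(endpoint, repository)}?query=${encodeURIComponent(ask)}`;
}

const checkUrl = connectivityCheckUrl("https://staging.db.artsdata.ca/", "artsdata");
```

Requesting `checkUrl` with an `Accept: application/sparql-results+json` header (for example with curl or `fetch`) should return a small JSON document if the endpoint and repository name are right. A 401 response would suggest that ARTSDATA_USER and ARTSDATA_PASSWORD need to be set, and a timeout points to a network problem rather than a configuration one.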
If port 3000 is already in use:
# Set a different port
export PORT=3001
npm run start
# Or in .env file
PORT=3001

If Docker containers fail to start:
# Remove existing containers and rebuild
docker-compose down
docker-compose up --build
# View detailed logs
docker-compose logs -f

For questions and support:
- W3C Community Group: public-reconciliation@w3.org
- Artsdata: https://artsdata.ca
- GitHub Issues: Report an issue