10 changes: 6 additions & 4 deletions README.md
@@ -10,19 +10,21 @@ The name "Combine", pronounced /kämˌbīn/, is a nod to the [combine harvester

[![Documentation Status](https://readthedocs.org/projects/combine/badge/?version=master)](http://combine.readthedocs.io/en/master/?badge=master)

Documentation is available at [Read the Docs](http://combine.readthedocs.io/).
The Combine team is in the process of updating the documentation. The installation process and user interface have changed significantly. In the meantime, some out-of-date documentation is available at [Read the Docs](http://combine.readthedocs.io/).

## Installation

Combine has a fair amount of server components, dependencies, and configurations that must be in place to work, as it leverages [Apache Spark](https://spark.apache.org/), among other applications, for processing on the backend. There are a couple of deployment options.
Combine has a fair amount of server components, dependencies, and configurations that must be in place to work, as it leverages [Apache Spark](https://spark.apache.org/), among other applications, for processing on the backend. Previous versions of Combine offered a couple of deployment options. However, for the current and future versions (v0.11.1 and later), only the Docker option is available.

### Docker

A GitHub repository [Combine-Docker](https://github.com/MI-DPLA/combine-docker) exists to help stand up an instance of Combine as a series of interconnected Docker containers.

### Server Provisioning with Vagrant and/or Ansible
### Security Warning

To this end, use the repository, [Combine-playbook](https://github.com/MI-DPLA/combine-playbook), which has been created to assist with provisioning a server with everything neccessary, and in place, to run Combine. This repository provides routes for server provisioning via [Vagrant](https://www.vagrantup.com/) and/or [Ansible](https://www.ansible.com/). Please visit the [Combine-playbook](https://github.com/MI-DPLA/combine-playbook) repository for more information about installation.
Combine code should be run behind your institution's firewall on a secured server. Access to Combine should be protected by your institution-wide identity and password system, preferably using two-factor authentication. If your institution supports VPN access to the server's network, that is a good additional step.

This is in addition to Combine's own passwords. While we don't yet have explicit documentation on how to set up SSL inside the nginx provided with Combine, it is possible and strongly recommended.

## Tech Stack Details

2 changes: 1 addition & 1 deletion combine/settings.py
@@ -89,7 +89,7 @@
'USER': 'combine',
'PASSWORD': 'combine',
'HOST': '127.0.0.1',
'PORT': '3306',
'PORT': '3307',
}
}
# SILENCED_SYSTEM_CHECKS = ['mysql.E001']
1 change: 1 addition & 0 deletions core/models/job.py
@@ -503,6 +503,7 @@ def get_total_input_job_record_count(self):

def get_detailed_job_record_count(self, force_recount=False):

force_recount = False
'''
Return details of record counts for input jobs, successes, and errors

7 changes: 5 additions & 2 deletions core/models/record_group.py
@@ -123,5 +123,8 @@ def all_jobs(self):
@property
def last_modified(self):
jobs = self.job_set.all()
timestamps = [job.timestamp for job in jobs]
return max(timestamps)
if not jobs:
return None
else:
timestamps = [job.timestamp for job in jobs]
return max(timestamps)
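
The guard added to `last_modified` avoids calling `max()` on an empty sequence, which raises `ValueError`. A minimal standalone sketch of the same behavior, assuming a plain list stands in for the Django queryset; the `latest_timestamp` helper is hypothetical and not part of Combine:

```python
from datetime import datetime
from typing import Iterable, Optional


def latest_timestamp(timestamps: Iterable[datetime]) -> Optional[datetime]:
    """Return the most recent timestamp, or None when there are none.

    Hypothetical helper for illustration; Combine's property works on
    self.job_set.all() instead of a plain iterable.
    """
    # max() raises ValueError on an empty iterable unless a default is given.
    return max(timestamps, default=None)
```

Using `default=None` is one possible alternative to the explicit `if not jobs` branch; both return `None` for a record group with no jobs.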
2 changes: 1 addition & 1 deletion core/mongo.py
@@ -11,5 +11,5 @@
# import pymongo and establish client
import pymongo

mongoengine.connect('combine', host=settings.MONGO_HOST, port=27017)
mongoengine.connect('combine', host=settings.MONGO_HOST, port=27017, connectTimeoutMS=60000, serverSelectionTimeoutMS=90000)
mc_handle = pymongo.MongoClient(host=settings.MONGO_HOST, port=27017)
2 changes: 2 additions & 0 deletions core/views/external_background_tasks.py
@@ -163,6 +163,8 @@ def system_bg_status(request):
livy_session.refresh_from_livy()
# set status
livy_status = livy_session.status
else:
livy_status = 'unknown'
else:
livy_status = 'stopped'
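
The added `else` branch appears intended to ensure `livy_status` is assigned on every path, so the view cannot reference an unbound variable. A rough sketch of that idea follows. The enclosing conditions are not visible in this hunk, so the `active` flag, the `FakeSession` class, and the `resolve_livy_status` function are all assumptions made for illustration, not Combine's actual code:

```python
from typing import Optional


class FakeSession:
    """Hypothetical stand-in for a Livy session record (not Combine's model)."""

    def __init__(self, active: bool, status: str) -> None:
        self.active = active  # assumed flag; the real condition is not shown in the diff
        self.status = status


def resolve_livy_status(session: Optional[FakeSession]) -> str:
    """Return a status string on every code path."""
    if session is None:
        # No session found at all.
        return 'stopped'
    if session.active:
        # Session exists and its status can be read.
        return session.status
    # Session exists but its state cannot be determined.
    return 'unknown'
```

Returning from each branch, rather than assigning in nested `if`/`else` blocks, makes it easier to see that no path leaves the status undefined.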

25 changes: 12 additions & 13 deletions requirements.txt
@@ -1,8 +1,7 @@
appnope==0.1.0
avro-python3==1.8.2
backports.shutil-get-terminal-size==1.0.0
avro==1.11.3
blinker==1.4
boto3
boto3==1.5.2
bs4==0.0.1
redis==3.3.11
celery==4.4.0rc2
@@ -19,7 +18,7 @@ django-datatables-view==1.14.0
django-extensions==1.9.0
django-filter==1.1.0
django-tables2==1.13.0
elasticsearch==5.4.0
elasticsearch==5.5.2
elasticsearch-dsl==5.3.0
enum34==1.1.6
filelock==2.0.6
@@ -32,12 +31,12 @@ jedi==0.10.2
jsonschema==2.6.0
kombu==4.6.3
livy==0.3.0
lxml==4.3.3
mongoengine==0.15.3
lxml==4.8.0
mongoengine==0.24.1
mysqlclient==1.4.2
numpy==1.16.3
numpy==1.21.6
openpyxl==2.4.9
pandas==0.24.2
pandas==1.3.5
pathlib2==2.3.0
pexpect==4.7
pickleshare==0.7.4
@@ -49,17 +48,17 @@ py4j==0.10.7
pydot==1.2.3
Pygments==2.2.0
pyjxslt==0.7.0
pykerberos==1.1.14
pymongo==3.7.1
pykerberos==1.2.4
pymongo==4.1.1
pyparsing==2.2.0
pytest==4.5.0
pytest-ordering==0.6
python-dateutil==2.6.1
pytz==2017.2
python-dateutil==2.8.2
pytz==2022.1
requests==2.22.0
requests-kerberos==0.11.0
responses==0.8.1
scandir==1.5
scandir==1.10.0
Sickle==0.6.2
simplegeneric==0.8.1
six==1.10.0