diff --git a/_episodes/01_introduction.md b/_episodes/01_introduction.md index 5f822569..8130686f 100644 --- a/_episodes/01_introduction.md +++ b/_episodes/01_introduction.md @@ -22,7 +22,7 @@ OTN partners with regional acoustic telemetry networks around the world to enabl ### How does a Node benefit its users? -OTN and affiliated networks provide automated cross-referencing of your detection data with other tags in the system to help resolve "mystery detections" and provide detection data to taggers in other regions. OTN Data Managers perform extensive quality control on submitted metadata to ensure the most accurate records possible are stored in the database and shared with researchers. OTN's database and Data Portal website are well-suited for archiving datasets for future use and sharing with collaborators. The OTN system and data workflows include pathways to publish datasets with the Ocean Biodiversity Information System, and for sharing via open data portals such as ERDDAP and GeoServer. The data product returned by OTN is directly ingestible by populare acoustic telemetry data analysis packages including `glatos`, `actel`, `remora`, and `resonATe`. In addition to the data curation materials, OTN offers continuous support and workshop materials detailing the use of these packages and tools. +OTN and affiliated networks provide automated cross-referencing of your detection data with other tags in the system to help resolve "mystery detections" and provide detection data to taggers in other regions. OTN Data Managers perform extensive quality control on submitted metadata to ensure the most accurate records possible are stored in the database and shared with researchers. OTN's database and Data Portal website are well-suited for archiving datasets for future use and sharing with collaborators. 
The OTN system and data workflows include pathways to publish datasets with the Ocean Biodiversity Information System, and for sharing via open data portals such as ERDDAP and GeoServer. The data product returned by OTN is directly ingestible by popular acoustic telemetry data analysis packages including `glatos`, `actel`, `remora`, and `resonATe`. In addition to the data curation materials, OTN offers continuous support and workshop materials detailing the use of these packages and tools. Below is a link to a presentation from current Node Managers, describing the relationship between OTN and its Nodes, the benefits of the Node system as a community outgrows more organic person-to-person sharing, as well as a realistic understanding of the work involved in hosting/maintaining a Node. diff --git a/_episodes/02_data_requests.md b/_episodes/02_data_requests.md index 4c173094..a438450c 100644 --- a/_episodes/02_data_requests.md +++ b/_episodes/02_data_requests.md @@ -19,14 +19,15 @@ OTN recommends creating a "Data Request Response Policy". ## External Requests - Restricted Data - + This is an example of how OTN handles these requests: -1. Request for data (from person other than data owner) submitted -2. Data request is scoped, in a GitLab Issue. All details from requester is included. -3. Impacted PIs are identified, and contacted, seeking written permission for requester to access the information. -4. Written permission is documented in GitLab Issue, to preserve the paper-trail. -5. Data request report is compiled and provided to requester, once all permissions have been received. +1. As OTN registered projects are public and open by default, check if the data requested is available publicly for sharing. +2. Request for data (from person other than data owner) submitted +3. Data request is scoped, in a GitLab Ticket. All details from the requester are included. +4. 
Impacted PIs are identified, and contacted, seeking written permission for requester to access the information. +5. Written permission is documented in GitLab Ticket, to preserve the paper-trail. +6. Data request report is compiled and provided to requester, once all permissions have been received. ## Internal Requests diff --git a/_episodes/03_OTN_System_and _structure_and_Outputs.md b/_episodes/03_OTN_System_and _structure_and_Outputs.md index 7892b21b..836bd697 100644 --- a/_episodes/03_OTN_System_and _structure_and_Outputs.md +++ b/_episodes/03_OTN_System_and _structure_and_Outputs.md @@ -25,14 +25,16 @@ Affiliated acoustic telemetry partner Networks may become an OTN Node by deployi # Basic Structure The basic structural decision at the centre of an OTN-style Database is that each of a Node's projects will be subdivided into their own database `schemas`. These schemas contain only the relevant tables and data to that project. The tables included in each schema are created and updated based on which types of data each project is reporting. - + + Projects can have the type `tracker`, `deployment`, or `data`. - Tracker projects only submit data about tag releases and animals. They get tables based on the tags, animals, and detections of those tags. - Deployment projects only submit data about receivers and their collected data. These projects get tables related to receiver deployments and detections on their receivers. - Data projects are projects that deploy both tags and receivers and will submit data related tags, animals, receivers, and detections and will get all the related tables. +When loading the project metadata using the Create and Update projects.ipynb (shown in subsequent training materials: 07_project_metadata.md), the data loader will select one of the above project types. In addition to the project-specific `schemas`, there are some important common schemas in the Database that Node Managers will interact with. 
These additional schemas include the `obis`, `erddap`, `geoserver`, `vendor`, and `discovery` schemas. These schemas are found across all Nodes and are used to create important end-products and for processing. -- The `obis` schema holds summary data describing each project contained in the Node as well as the aggregated data from those projects. When data goes into a final table of the project schema it will be inherited into a table in `obis` (generally with a similar name). +- The `obis` schema holds summary data describing each project contained in the Node as well as the aggregated data from those projects. When data goes into a final table of the project schema it will be inherited into a table in `obis` (generally with a similar name). Of note, `obis` is a legacy OTN schema name and bears no connection to the Ocean Biodiversity Information System (OBIS), although OTN does also share publicly available data with OBIS. - The `erddap` schema holds aggregated data re-formatted to be used to serve telemetry data via an ERDDAP data portal. - The `geoserver` schema holds aggregated data re-formatted to be used to create geospatial data products published to a GeoServer. - The `vendor` schema holds manufacturer specifications for tags and receivers, used for quality control purposes. @@ -111,7 +113,11 @@ flowchart BT `Project` data has a unique workflow from the other input data and metadata that flows into an OTN Node, it is generally the first bit of information received about a project, and will be used to create the new `schema` in the Database for a project. The type of project selected (`tracker`, `deployment`, or `data`) will determine the format of the tables in the newly created `schema`. The type of project will also impact the loading tools and processes that will be used later on. 
The general journey of project data is: - To register a new project a researcher will fill out a [project metadata template](https://members.oceantrack.org/data/data-collection) and submit it to the Node Manager. -- The Node Manager will visually evaluate the template to catch any obvious errors and then run the data through the OTN Nodebook responsible for creating and updating projects (`Create and Update Projects`). +- The Node Manager will visually evaluate the template to catch any obvious errors and then run the data through the OTN Nodebook responsible for creating and updating projects (`Create and Update Projects`). Things to check during visual inspection include: +-> Researcher names, affiliations and emails are listed row-wise per researcher, and a data repository role has been chosen. +-> Obvious spelling errors. +-> Empty sections where no information has been provided. +-> Non-English text characters. These are flagged by the Nodebook, but catching them visually helps you estimate the time required to load project metadata. They are flagged because the database cannot store them readily, so time is required to parse them. If loading project metadata raises an error that gives a Unicode character code such as U+00E1, you can look up the code to identify the character, then visually scan for it or find-and-replace it in the metadata text. - The `Create and Update Projects` notebook will make a new schema in the Database for that project, and fill it with the required tables based on the type of project. - Summary tables are populated at this time (`scientificnames`, `contacts`, `otn_resources` etc). - After this, OTN analysts will verify the project one last time to make sure every necessary field is filled out and properly defined. @@ -120,9 +126,9 @@ flowchart BT Even though `tag`, `deployment`, and `detections` data all have their own loading tools and processes, their general path through the database is the same. 
- Their data workflows all begin with a submission of data or metadata files from a researcher. -- The Node Manager ensures there is a copy of the file on the Node's document management website. +- The Node Manager ensures there is a copy of the file on the Node's document management website. This is typically the Plone members repository, but the choice of management website is made when the Node is created. - The Node Manager carries out visual quality control to catch any obvious errors. -- The data is then processed through the relevant OTN Nodebooks. This process is outlined by the task list associated with the GitLab Issue made for this data. +- The data is then processed through the relevant OTN Nodebooks. This process is outlined by the task list associated with the GitLab Ticket made for this data. - The data will first be loaded into the "raw" tables. This is the table that holds the raw data as submitted by the researcher (the naming convention for raw tables is that they always have the prefix `c_` and will have a suffix indicating the date it was loaded, typically `YYYY_MM`). - After the raw data table is verified, the data will move to the "intermediate" tables which act as a staging area for partially-processed data. - After the intermediate table is verified, data will move to the "upper" tables, where the data is finished processing and is in its final form. This is the data that will be used for aggregation tables such as `obis` and for outputs such as Detection Extracts. @@ -138,7 +144,7 @@ In order to create meaningful Detection Extracts, OTN and affiliated Nodes only - Summary schemas like `discovery`, `erddap`, and `geoserver` are updated with the newly verified data. 
Summary schema records can be used to create maps and other record overviews such as this map of active OTN receivers: - + Summary Map # Backing Up Your Data diff --git a/_episodes/04_Setup_and_Installing_Needed_Software.md b/_episodes/04_Setup_and_Installing_Needed_Software.md index 55732ea1..7a9d7da0 100644 --- a/_episodes/04_Setup_and_Installing_Needed_Software.md +++ b/_episodes/04_Setup_and_Installing_Needed_Software.md @@ -111,6 +111,9 @@ In order to work efficiently as a Node Manager, the following programs are neces **Visual Studio Code** - An advanced code editing integrated development environment (IDE). Also contains extensions that can run JuPyTeR notebooks, open CSV files in a visually appealing way, as well as handle updating your Git repositories. * [https://code.visualstudio.com/](https://code.visualstudio.com/) +**Positron** - An advanced code editing integrated development environment (IDE), an RStudio version of VSCode. Also contains extensions that can run JuPyTeR notebooks, open CSV files in a visually appealing way, as well as handle updating your Git repositories. +* [https://positron.posit.co/](https://positron.posit.co/) + ## For WINDOWS users **Path Copy Copy** - For copying path links from your file browser. Since many of the notebooks require you to provide the path to the file you wish to load, being able to copy and paste the entire path at once can save a lot of time. diff --git a/_episodes/05_workflow_gitlab_dbeaver.md b/_episodes/05_workflow_gitlab_dbeaver.md index 52b76a86..fc86ea91 100644 --- a/_episodes/05_workflow_gitlab_dbeaver.md +++ b/_episodes/05_workflow_gitlab_dbeaver.md @@ -8,7 +8,7 @@ questions: - "How can a Node Manager interact with their database directly?" 
objectives: - "Understand the data-loading workflow" -- "Understand how to create and use GitLab Issues" +- "Understand how to create and use GitLab Work Items" - "Understand how to access and query your database tables" - "Understand how to use the `AUTH - Create and Update` notebook to maintain your database credentials file" keypoints: @@ -16,10 +16,10 @@ keypoints: - "Data submissions and QC processes should be trackable and archived" - "OTN is always here to help with any step of the process" --- - + Data Managers receive data from a researcher and then begin the process of QA/QC and data matching: -1. Records are received and a GitLab Issue is created. +1. Records are received and a GitLab Ticket is created. 1. Data are QA/QC'd using the OTN Nodebooks, and all progress is tracked in GitLab. Feedback between Data Manager and researchers happens at this stage, until data is clean and all GitLab tasks are completed. 1. Successful processing can be checked by using DBeaver to query and explore the database. @@ -27,7 +27,7 @@ Data Managers receive data from a researcher and then begin the process of QA/QC flowchart LR data_start(( )) --> get_data(Receive metadata
from researchers) style data_start fill:#00FF00,stroke:#00FF00,stroke-width:4px - get_data --> gitlab(Create Gitlab issue
with template) + get_data --> gitlab(Create Gitlab work item
with template) gitlab --> viz{Visually inspect,
does metadata have errors?} viz --yes--> req(Request corrected data
from researchers) req --> end1(( )) @@ -50,7 +50,7 @@ OTN-managed Nodes can always use the same Plone file management portal software The FACT Network currently uses a custom instance of Research Workspace for the same purpose. -The ACT and GLATOS Networks use a custom data-submission form managed through their networks' web sites. +The GLATOS Network uses a custom data-submission form managed through its network's web site. Its common for groups of researchers to use DropBox, Google Drive, or something similar to share data/metadata when the Network is still small. This can be a great, accessible option but the caveat is that is is much more difficult to control access to each individual folder to protect the Data Policy, and it may be difficult to determine when new data has been submitted. @@ -62,57 +62,63 @@ It is **not** recommended to use a personal email account for this, since all th # Documenting data submission +*NOTE: GitLab Work Items are often referred to as "tickets" or "issues"; GitLab now uses "work items" and "issues" interchangeably to refer to tickets. "Work items" is the prominent term on the GitLab website. Here we will generally use the term "tickets", except where we direct the user to specific GitLab features.* + Using one of the suggested means above, a user has submitted data and metadata to the Node Manager. Now what? -OTN uses GitLab Issues with templates of task-lists to ensure we NEVER forget a step in data loading, and that no file is ever lost/forgotten in an inbox. +OTN uses GitLab Tickets with templates of task-lists to ensure we NEVER forget a step in data loading, and that no file is ever lost/forgotten in an inbox. A ticket (a general name) functions as a live tracker of actions related to a piece of data or query/request. In GitLab, a 'Work Item' is a ticket. 
When a new work item is created, there will be a number of drop down menus and text boxes that can be used to classify and organise the ticket for tracking. This makes the ticket searchable within GitLab by label or text, so tickets can be found again and updated as required until the required action is resolved. Immediately upon receipt of a data file, you are advised to login to [OTN's GitLab](https://gitlab.oceantrack.org). You will have a project for your Node named . This is where you will navigate to. You may want to bookmark this webpage! -Once on the GitLab project page, you should navigate to the **Issues** menu option, on the left side. Think of your GitLab issues as your running "TODO List"! You will want to create a new Issue for each piece of data that is submitted. +Once on the GitLab project page, you should navigate to the **Work items** menu option, on the left side. Think of your GitLab tickets as your running "TODO List"! You will want to create a new Ticket for each piece of data that is submitted. + -*NOTE: GitLab Issues are often referred to as "tickets"* -### Creating GitLab issues +### Creating GitLab tickets -By choosing the **New Issue** button in the top-right of your screen, you will be taken to a new, blank, issue form. You will need to fill out the following fields: +By choosing the **New item** button in the top-right of your screen, you will be taken to a new, blank, ticket form. You will need to fill out the following fields: - Title: Write the project name/code, the type of data submitted, and the submission date, this makes the ticket searchable in the future (eg: `HFX tag metadata 2022-02`) - Type: Should be type `Issue`. - Description: - * There are pre-made *Templates* to choose from here, using the drop down menu. Ensure you choose the relevant checklist for the type of data that was submitted (eg: `Tag_metadata`). This will populate the large description field! 
- * Ensure you include the link to the submitted data file OR use the `Attach a file` option to attach a copy of the submitted data file to the issue. + * There are pre-made *Templates* to choose from here, using the drop down menu. Ensure you choose the relevant checklist for the type of data that was submitted (eg: `Tag_metadata`, `Receiver_metadata`, `Detections`). This will populate the large description field! + * Ensure you include the link to the submitted data file OR use the `Attach a file` option to attach a copy of the submitted data file to the ticket. - Assignee: Assign to yourself if this is a task for you, or to anyone else to whom you want to delegate. -- Milestone: These are the upcoming Data Push dates. You should choose the nearest future PUSH date as the Milestone for this issue. -- Labels: This is for your reference - choose a label that will help you remember what stage of processing this issue is in. Some common examples include `Needs QC`, `Waiting for Metadata`, `Waiting for VRLs`, `Request PI Clarification` etc. You can create new labels at any time to help sort your tickets. +- Milestone: These are the upcoming Data Push dates. You should choose the nearest future PUSH date as the Milestone for this ticket. +- Labels: This is for your reference - choose a label that will help you remember what stage of processing this ticket is in. Some common examples include `Ready to Load`, `Loading records`, `Needs QC`, `Waiting for Metadata`, `Waiting for VRLs`, `Request PI Clarification` etc. You can create new labels at any time to help sort your tickets. Additionally, there are now labels for `Project metadata`, `receiver metadata`, `Tagging metadata`, `Detection data`, and `Gliders/Movers`. It is standard practice in OTN to label each ticket with the data type it covers for searchability in GitLab; labels tend to provide the best user-determined filtering system in GitLab. With the above information supplied, you can click the **Create Issue** button. 
### Using GitLab to track progress -As you approach the deadline for data-loading, before a data PUSH, you should begin to work on your Issues which fall under that Milestone. When you open an issue, you will be able to see the remaining tasks to properly load/process that data along with the name of the OTN Nodebook you should use to complete each task. +As you approach the deadline for data-loading, before a data PUSH, you should begin to work on your Items which fall under that Milestone. When you open an item, you will be able to see the remaining tasks to properly load/process that data along with the name of the OTN Nodebook you should use to complete each task. -Keep GitLab open in your browser as you work through the relevant Nodebooks. You should check off the tasks as you complete them, and insert any comments you have into the bottom of the ticket. Comments can include error messages from the Nodebook, questions you have for the researcher, any re-formatting required, etc. At any time you can change the Labels on the issue, to help you remember the issue's status at a glance. +Keep GitLab open in your browser as you work through the relevant Nodebooks. You should check off the tasks as you complete them, and insert any comments you have into the bottom of the ticket. Comments can include error messages from the Nodebook, questions you have for the researcher, any re-formatting required, etc. At any time you can change the Labels on the item, to help you remember the item's status at a glance. Once you are done for the day, you'll be able to come back and see exactly where you left off, thanks to the checklist! -You can tag anyone from the OTN Data Team in your GitLab issue (using the `@NAME` syntax). We will be notified via email to come and check out the Issue and answer any questions that have been commented. +You can tag anyone from the OTN Data Team in your GitLab ticket (using the `@NAME` syntax). 
We will be notified via email to come and check out the ticket and answer any questions that have been commented. -Once you have completed all the tasks in the template, you can edit the `Assignee` value in the top-right corner, and assign to someone from OTN's Database team (currently, Angela or Yinghuan). They will complete the final verification of the data, and close the issue when completed. At this time, you can change the issue Label to `Verify`, or something similar, to help visually "mark it off" your issue list on the main page. +Once you have completed all the tasks in the template, you can edit the `Assignee` value in the top-right corner, and assign to someone from OTN's Database team (currently, Angela or Yinghuan). They will complete the final verification of the data, and close the item when completed. At this time, you can change the item Label to `Verify`, or something similar, to help visually "mark it off" your item list on the main page. ## GitLab practice -At this time we will take a moment to practice making GitLab Issues, and explore other pages on our GitLab like, `Milestones`, `Repository`, `Snippets`, and `Wiki`. +At this time we will take a moment to practice making GitLab Work Items, and explore other pages on our GitLab like `Milestones`, `Repository`, `Snippets`, and `Wiki`. +- Milestones can be found under the Plan heading on the left-hand side of the GitLab page, within your Node-DAQ GitLab project. Reviewing this page can give insight into the progression of ticket processing, from unstarted through ongoing to completed. +- Repository is also found on the left-hand side of the GitLab page, under the Code heading. This page shows the files stored in the GitLab project, such as templates (click the .gitlab folder in the upper left under the Files heading, then issue_templates). 
+- Snippets is found under the same Code heading. This is a good place to store copy-and-pasteable email templates for repeated queries to researchers or responses to frequently asked questions. You can also store short pieces of code here, such as SQL queries for common searches of the Node database. +- Wiki is a place to store things like how-to guides or process guides that may not have a template associated yet, such as data policy agreement tracking. ## Database access As part of the OTN workflow, it may be prudent to use a database client like DBeaver to view the contents of your Node's database directly and make sure the data has been loaded as expected. -DBeaver is an open-source application for interacting directly with databases. There are lots of built-in tools for query writing and data exploration. We will assume that workshop attendees are novices in using this application. +DBeaver is an open-source application for interacting directly with databases. There are lots of built-in tools for query writing and data exploration. We will assume that workshop attendees are novices in using this application. [https://dbeaver.io/](https://dbeaver.io/) (free and open access - **recommended**) + ### Connecting to your database For this training we will connect to a Node Training test database, as practice. Once you open DBeaver, you will need to click on the `Database` menu item, and choose `New Database Connection`. A popup will appear, and you will choose the `PostreSQL` logo (the elephant) then click Next. Using the `.auth` file provided to you by OTNDC you will complete the following fields: - - Host: this could be something like `matos.asascience.com` for your DB, but we will use the IP address: `129.173.48.161` for our Node Training DB. - Database: this will be your database name, something like `pathnode`. For training, it will be `nodetraining`. - Port: this is specified in your `.auth` file and will be four digits. 
For training, this port will be set to `5432`. @@ -125,7 +131,6 @@ On the left-side you should now see a `Database Navigator` tab, and a list of al ### Writing a query in DBeaver If you wish to write a query to see a specific portion of your already-loaded data, you should first open a new SQL console. Choose `SQL Editor` from the top menu, then `New SQL Script`. A blank form should appear. - While writing SQL is out of the scope of this course, there are many great SQL resources available online. The general premise involves creating conditional `select` statements to specify the data you're interested in. As an example, `select * from hfx.rcvr_locations where rcv_serial_no = '12345';` will select all records from the HFX schema's rcvr_locations table where the serial number is 12345. To run a query, ensure your cursor (the vertical line that shows where you are editing text) is on the line you want to run, then either 1) right-click, and choose Execute, or 2) press CTRL-ENTER (CMD-ENTER for Mac). The results of your query will be displayed in the window below the SQL console. diff --git a/_episodes/06_auth_notebook.md b/_episodes/06_auth_notebook.md index a43abf8f..11d976f6 100644 --- a/_episodes/06_auth_notebook.md +++ b/_episodes/06_auth_notebook.md @@ -48,7 +48,6 @@ Run this cell. You will be prompted to create a password for the file (if it is ### Create or Update Main Connections Run this cell. This section will have an editable form. If it is a new file, all fields will be blank. If it is an existing file, the previously-entered information will display. You may now edit the information, pressing the blue button when you are finished to save your results. - - Conn Name: this is customizable - what is the name of this connection? We recommend choosing something like "OTN Database" to help you remember. 
- Host: this will be something like `matos.asascience.com` for your DB, but for training purposes we will use the IP of our Node Training DB: `129.173.48.161`. - Port: this is specified in your `.auth` file and will be four digits. Use `5432` for Node Training. @@ -113,7 +112,7 @@ Press `Save` to change the password of your `.kdbx`. **Ensure that you remember This will be relevant for users of the `Database Fix` suite of Nodebooks only. If you are not going to use these tools, you can skip this cell in the Nodebooks. -A Gitlab Access Token will allow Nodebooks to access your GitLab account and insert comments into an Issue directly, as you are working on it. This has been developed for the Database Fix Notebooks to ensure all changes made within the notebooks are documented in GitLab properly. The automation is part of the `OTNGitlabAutomation` package. +A Gitlab Access Token will allow Nodebooks to access your GitLab account and insert comments into a ticket directly, as you are working on it. This has been developed for the Database Fix Notebooks to ensure all changes made within the notebooks are documented in GitLab properly. The automation is part of the `OTNGitlabAutomation` package. Instructions to create a Personal Access Token are found on our wiki [here](https://gitlab.oceantrack.org/otn-partner-nodes/otngitlabautomation/-/wikis/How-to-create-an-personal-access-token) diff --git a/_episodes/07_project_metadata.md b/_episodes/07_project_metadata.md index 025dc211..8b15e3b7 100644 --- a/_episodes/07_project_metadata.md +++ b/_episodes/07_project_metadata.md @@ -15,12 +15,13 @@ keypoints: --- ## Process workflow + The process workflow for project metadata is as follows:
 flowchart LR
     proj_start(( )) --> get_meta(Receive 
project metadata
from researchers) style proj_start fill:#00FF00,stroke:#00FF00,stroke-width:4px - get_meta --> gitlab(Create
Gitlab
issue) + get_meta --> gitlab(Create
Gitlab
Work item) gitlab --> inspect(Visually
inspect) inspect --> nodebook(QC with
nodebooks) nodebook --> plone(Verify repository
folder
is correct) @@ -34,31 +35,45 @@ The **first** step when you are contacted by a researcher who wants to register ## Completed Metadata -Immediately upon receipt of the metadata, you must create a new Gitlab Issue. Please use the `Project Metadata` Issue checklist template. - -Here is the Issue checklist, for reference: +Immediately upon receipt of the metadata, you must create a new Gitlab Ticket (aka Work Item). Please use the `Project Metadata` work item checklist template found in the drop down menu under the 'Description'. +Here is the Work item checklist, for reference: + ~~~ Project Metadata - [ ] - NAME add label *'loading records'* -- [ ] - NAME define type of project **select here one of Data, Deployment, Tracker** +- [ ] - NAME define type of project **select one: Data, Deployment, Tracker** - [ ] - NAME create schema and project records (`Creating and Updating project metadata` notebook) - [ ] - NAME add project contact information (`Creating and Updating project metadata` notebook) - [ ] - NAME add scientificnames (`Creating and Updating project metadata` notebook) +- [ ] - NAME [OTN only] manually identify if this is a loan, if so add record to obis.loan_tracking (`Creating and Updating project metadata` notebook) - [ ] - NAME verify all of above (`Creating and Updating project metadata` notebook) -- [ ] - NAME [Plone-users only] create new project repository users (`Create Plone Folders and Add Users` notebook) -- [ ] - NAME [Plone-users only] create project repository folder (`Create Plone Folders and Add Users` notebook) -- [ ] - NAME [Plone-users only] add project repository users to folder (`Create Plone Folders and Add Users` notebook) -- [ ] - NAME [Plone-users only] access project repository double-check project repository creation and user access -- [ ] - NAME add project metadata file to relevant project folder (Plone site, Research Workspace etc) +- [ ] - NAME [OTN only] create new project repo users (`Create Plone Folders and Add Users` 
notebook) +- [ ] - NAME [OTN only] create project repo folder (`Create Plone Folders and Add Users` notebook) +- [ ] - NAME [OTN only] add project repo users to folder (`Create Plone Folders and Add Users` notebook) +- [ ] - NAME [OTN only] access project repo double-check project repository creation and user access - **post repository URL HERE** +- [ ] - NAME add project metadata file to project folder (OTN members.oceantrack.org, FACT RW etc) - [ ] - NAME email notification of updated metadata file to PI and individual who submitted -- [ ] - NAME send onboarding email to all contacts -- [ ] - NAME label issue with *'Verify'* -- [ ] - NAME pass issue to OTN DAQ staff -- [ ] - NAME [OTN only] manually identify if this is a loan, if so add record to otnunit.obis.loan_tracking (`Creating and Updating project metadata` notebook) +- [ ] - NAME send onboarding email to PIs using https://gitlab.oceantrack.org/otndc/otn-data-acquisition/-/snippets/203 - [ ] - NAME [OTN only] if this is a loan, update links for PMO -- [ ] - NAME pass issue to OTN analyst for final verification +- [ ] - NAME [OTN only] check inbox for any email asking for an embargo on this project + +_If there is an embargo request:_ +- [ ] - NAME if there is an embargo request: create a new ticket using the correct task list (either 'Embargo_request_two_year' or 'Embargo_request_extended') and then skip down to the **Verification** steps. 
**Paste link to new ticket here**
+
+_If there is **no** embargo request:_
+- [ ] - NAME update the 'Publication of Tag Data' section with the embargo date as current date and select the *PI Approval* box (`Publication Control Table Update` notebook)
+- [ ] - NAME update the 'Publication of Detection Data' section with the embargo date as current date and select the *PI Approval* box (`Publication Control Table Update` notebook)
+- [ ] - NAME select **yes** for publish to OBIS, unless otherwise specified (`Publication Control Table Update` notebook)
+- [ ] - NAME select **yes** for publish to ERDDAP, unless otherwise specified (`Publication Control Table Update` notebook)
+- [ ] - NAME add signed data policy to the Background folder of the project folder **Paste link to file**
+- [ ] - NAME email PIs about making the Plone repo public as well _(do not need to wait on a response, pass ticket on to verification once email has been sent)_
+
+**Verification**
+- [ ] - NAME label work item with *'Verify'*
+- [ ] - NAME reassign work item to OTN data analyst for final verification
 - [ ] - NAME verify project in database
+- [ ] - NAME verify obis.publication_control updates
 
 **project metadata txt file**
 ~~~
@@ -70,7 +85,7 @@ Once the researcher provides the completed file, the Data Manager should complet
 
 Please make sure of the following:
 
-1. Is the PI-provided collection code unique/appropriate? Do you need to create one yourself? Existing schemas/collection codes can be seen in the database.
+1. Is the PI-provided collection code unique/appropriate? Do you need to create one yourself? Existing schemas/collection codes can be seen and checked for pre-existence in the database.
 1. Are there typos in the title or abstract?
 1. Are the contacts formatted correctly?
 1. Are the species formatted correctly?
@@ -147,7 +162,7 @@ format: Lastname, I., Lastname, I. YYYY. 
[Title from question 1 or suitable alte ## Quality Control - Create and Update Projects -Each step in the Issue checklist will be discussed here, along with other important notes required to use the Nodebooks. +Each step in the Work item checklist will be discussed here, along with other important notes required to use the Nodebooks. ### Imports Cell @@ -158,7 +173,7 @@ You will have to edit one section: `engine = get_engine()` - On MacOS computers, you can usually find and copy the path to your database `.kdbx` file by right-clicking on the file and holding down the "option" key. On Windows, we recommend using the installed software Path Copy Copy, so you can copy a unix-style path by right-clicking. - The path should look like `engine = get_engine('C:/Users/username/Desktop/Auth files/database_connection.kdbx')`. -### Project Metadata Parser +### Project Metadata Parser This cell is where you input the information contained in the Project Metadata `.txt` file. There are two ways to do this: @@ -173,8 +188,8 @@ The output will have useful information: - Are there strange characters in the collection code, project title, or abstract? - Were the names and affiliations of each contact successfully parsed? Are there any affiliated institutions which are not found? Are there any contacts which were not found that you expected to be? - Is the project URL formatted correctly? -- Are all the species studied found in WoRMS? Are any of them non-accepted taxonomy (entries with accepted taxonomies will have a success message of the format `INFO: Genus species is an accepted taxon, and has Aphia ID XXXXXX.`, followed by a URL)? Which ones have common names that do **not** match the WoRMS records (look at bottom of each species record for success: `OK: Animal name is an acceptable vernacular name for Genus species`)? 
**NOTE: any mismatches with common name can be fixed at a later stage, make a note in the Issue for your records** -- Is the suggested Bounding Box appropriate based on the abstract? **NOTE: any issues with the scale of the bounding box can be fixed at a later stage, make a note in the Issue for your records** +- Are all the species studied found in WoRMS? Are any of them non-accepted taxonomy (entries with accepted taxonomies will have a success message of the format `INFO: Genus species is an accepted taxon, and has Aphia ID XXXXXX.`, followed by a URL)? Which ones have common names that do **not** match the WoRMS records (look at bottom of each species record for success: `OK: Animal name is an acceptable vernacular name for Genus species`)? **NOTE: any mismatches with common name can be fixed at a later stage, make a note in the Ticket for your records** +- Is the suggested Bounding Box appropriate based on the abstract? **NOTE: any issues with the scale of the bounding box can be fixed at a later stage, make a note in the Ticket for your records** - Are the start and end dates formatted correctly? Generally, most of the error messages arise from the **Contacts** and **Species** sections. @@ -188,12 +203,12 @@ There are some fields which need to be set up by the Data Manager, rather than Run the cell to generate a fillable form with these fields: 1. Node: select your node -1. Collaboration Type: based on the abstract, are they deploying only tags (`Tracker` project), only receivers (`Deployment` project) or both tags and receivers (`Data` project)? +1. Collaboration Type: make an assessment based on the abstract, are they deploying only tags (`Tracker` project), only receivers (`Deployment` project) or both tags and receivers (`Data` project)? 1. Ocean: choose the most appropriate ocean region based on the abstract. 1. Shortname: usually a summarised version of the project title, which will be used as the name of the Data Portal folder. ex: `OTN Blue Sharks`. 
1. Longname: use the Title provided by the researcher, or something else, which is in "scientific-paper" style. ex: `Understanding the movements of Blue sharks through Nova Scotia waters, using acoustic telemetry.` 1. Series Code: this will generally be the name of your node. Compare to values found in the database `obis.otn_resources` if you’re unsure. -1. Institution Code: The main institution responsible for maintaining the project. Compare to values found in the database `obis.institution_codes` and `obis.otn_resources` if you’re unsure. **If this is a new Institution, please make a note in the Issue, so you can add it later on** +1. Institution Code: The main institution responsible for maintaining the project. Compare to values found in the database `obis.institution_codes` and `obis.otn_resources` if you’re unsure. **If this is a new Institution, please make a note in the Ticket, so you can add it later on** 1. Country: based upon the abstract. Multiple countries can be listed as such: `CANADA, USA, EGYPT` etc. 1. State: based upon the abstract. Multiple states can be listed as such: `NOVA SCOTIA, NEWFOUNDLAND` etc. 1. Local Area: based upon the abstract. Location information. ex: `Halifax` @@ -208,7 +223,7 @@ Verify the output from the parser cell, looking for several things: If anything is wrong, please begin again from the Manual Field input cell. -If the institution code **IS NOT** found - compare to values found in the database `obis.institution_codes` and `obis.otn_resources`. **If this is a new Institution, please make a note in the Issue, so you can add it later on** +If the institution code **IS NOT** found - compare to values found in the database `obis.institution_codes` and `obis.otn_resources`. 
**If this is a new Institution, please make a note in the Ticket, so you can add it later on** #### Task List Checkpoint @@ -511,13 +526,28 @@ Successful output will be of this format: > {: .language-plaintext .example} > > Then you may choose `Add another user` and begin again. -> -> The acceptable folder permissions may vary depending on the project role of the contact. Here are some guidelines: -> - Principal Investigator: all permissions -> - Researcher: all permissions except `Reviewer` -> - Student: all permissions except `Reviewer` -> - Technician: only `Contributor` and `Reader` -> - Collaborator: only `Contributor` and `Reader` +> +Upon creation of a project or in cases of adding a new person to your project, you are asked to specify their role on the project. This role relates to the level of authority they have with regards to the repository itself and the data contained therein. + +The actions a person can perform in the data repository are as follows: + +Can add: A user can access the repository and share data files to be uploaded to the repository. +Can edit: A user can access the repository and can add files, edit files, remove files and make folders. +Can view: A user can access the repository and view files. +Can Review: A user can access the repository and has final say on data sharing. + +The acceptable folder permissions may vary depending on the project role of the contact. Here are some guidelines: +• Principal Investigator: all permissions +• Researcher: all permissions except Reviewer +• Student: all permissions except Reviewer +• Technician: only Contributor and Reader +• Collaborator: only Contributor and Reader + +Roles and abilities in the OTN Data repository: +Reader: Can view but cannot make changes. +Contributor: Can view the repository, can edit and can add. +Reviewer: Can do all of above and also has role of authority on what happens to the data in respect to sharing. + > > This is very fluid and can be edited at any time. 
These are guidelines only!
 >
diff --git a/_episodes/08_tag_metadata.md b/_episodes/08_tag_metadata.md
index 8252b104..dd12ecd3 100644
--- a/_episodes/08_tag_metadata.md
+++ b/_episodes/08_tag_metadata.md
@@ -14,7 +14,7 @@ keypoints:
 - "Loading tagging metadata requires judgement from the Data Manager"
 - "Communication with the researcher is essential when errors are found"
 ---
-
+
 ## Process workflow
 
 The process workflow for tag metadata is as follows:
@@ -32,30 +32,32 @@ flowchart LR
 
 Once a project has been registered, the next step (for `Tracker` and `Data` project types) is to begin to quality control and load the project's tagging metadata into the database. Tagging metadata should be reported to your Node in the template provided [here](https://members.oceantrack.org/data/data-collection). This file holds information about the deployment of any and all tags (acoustic, PIT, satellite, floy etc.) in or on animals for the purposes of tracking their movements using either listening stations or via mark/recapture. Any biological metrics that were measured at tagging time, i.e. length, weight, population, are also able to be recorded for association with the tagging event, permitting future analyses.
 
-Recall that there are multiple levels of data tables in the database for tagging records: `raw tables` ("raw"), `cache tables` ("intermediate") and `otn tables` ("upper"). The process for loading tagging metadata evaluates and promotes the data through each of these levels, as reflected by the GitLab task list.
+Recall that there are multiple levels of data tables in the database for tagging records: `raw tables` ("raw"), `cache tables` ("intermediate") and `otn tables` ("upper"). The process for loading tagging metadata evaluates and promotes the data through each of these levels, as reflected by the GitLab Ticket checklist template found in the drop-down menu under the 'Description'.
+
 
 ## Completed Metadata
 
-Immediately, upon receipt of the metadata, create a new GitLab issue. Please use the `Tag Metadata` Issue checklist template.
+Immediately upon receipt of the metadata, create a new GitLab Ticket. Please use the `Tag Metadata` Work item checklist template.
 
-Here is the Issue checklist, for reference:
+Here is the Work item checklist, for reference:
 
 ~~~
-Tag Meta Data
+Tag Metadata
 - [ ] - NAME add label *'loading records'*
-- [ ] - NAME load raw tag metadata (`tag-1` notebook) **put_table_name_in_ticket**
+- [ ] - NAME load raw tag metadata (`tag-1` notebook) **:fish: put_table_name_in_ticket**
 - [ ] - NAME confirm no duplicates in raw table, review and remove (`tag-1b` notebook)
 - [ ] - NAME verify raw table (`tag-2` notebook)
-- [ ] - NAME post updated metadata to project folder (OTN members.oceantrack.org, FACT RW etc) if needed
+- [ ] - NAME post updated metadata to project folder (OTN members.oceantrack.org, FACT RW etc) if needed **:fish: put_link_to_updated_metadata**
 - [ ] - NAME email notification of updated metadata file to PI and individual who submitted
 - [ ] - NAME build cache tables (`tag-2` notebook)
 - [ ] - NAME verify cache tables (`tag-2` notebook)
 - [ ] - NAME load otn tables (`tag-2` notebook)
 - [ ] - NAME verify otn tables (`tag-2` notebook)
 - [ ] - NAME verify tags are not part of another collection (`tag-2` notebook)
-- [ ] - NAME label issue with *'Verify'*
-- [ ] - NAME pass issue to analyst for final verification
-- [ ] - NAME check for double reporting (verification_notebooks/Tag Verification notebook)
+- [ ] - NAME label work item with *'Verify'*
+- [ ] - NAME reassign work item to OTN data analyst for final verification
+- [ ] - NAME check for double reporting (verification_notebooks/`Tag Verification` notebook)
+
 ~~~
 {: .language-plaintext .example}
 
@@ -107,7 +109,7 @@ The metadata template [available here](https://members.oceantrack.org/data/data-
 
 # Quality Control - Tag-1 Nodebook
 
-Each step in the Issue checklist will be discussed here, along with other important notes required to use the Nodebook.
+Each step in the Work item checklist will be discussed here, along with other important notes required to use the Nodebook.
 
 ### Imports cell
 
@@ -325,7 +327,7 @@ In Gitlab, this task can be completed at this stage:
 
 `- [ ] - NAME confirm no duplicates in raw table, review and remove ("tag-1b" notebook)`
 
-**Ensure you paste the `no_dup` table name (ex: c_tag_meta_2021_09_no_dup), if relevant, into the Issue** before you check the box. This is now the raw table that will be used for the result of the data-loading process.
+**Ensure you paste the `no_dup` table name (ex: c_tag_meta_2021_09_no_dup), if relevant, into the Ticket** before you check the box. This is now the raw table that will be used for the rest of the data-loading process.
 
 # Quality Control - Tag-2 Nodebook
 
@@ -528,7 +530,7 @@ First: you should access the Repository folder in your browser and add the clean
 
 Then, please email a copy of this file to the researcher who submitted it, so they can use the "cleaned" version in the future.
 
-Finally, the Issue can be passed off to an OTN-analyst for final verification in the database.
+Finally, the Ticket can be passed off to an OTN-analyst for final verification in the database.
 
 {% include links.md %}
 
diff --git a/_episodes/09_deploy_metadata.md b/_episodes/09_deploy_metadata.md
index 5706a8bb..243ce95a 100644
--- a/_episodes/09_deploy_metadata.md
+++ b/_episodes/09_deploy_metadata.md
@@ -19,7 +19,8 @@ The process workflow for deployment metadata is as follows:
 flowchart LR
     tag_start(( )) --> get_meta(Receive 
deployment metadata
from researchers) style tag_start fill:#00FF00,stroke:#00FF00,stroke-width:4px - get_meta --> gitlab(Create
Gitlab
issue) + + get_meta --> gitlab(Create
Gitlab
Work item) gitlab --> inspect(Visually
inspect) inspect --> nodebook(Process and verify
with nodebooks) nodebook --> plone(Add metadata
to repository folder) @@ -33,20 +34,23 @@ Once a project has been registered, the next step (for `Deployment` and `Data` p Recall that there are multiple levels of data-tables in the database for deployment records: `raw tables`, `rcvr_locations`, `stations` and `moorings`. The process for loading instrument metadata reflects this, as does the GitLab task list. -## Submitted Metadata +**Check deployment sheet information for misplaced Movers data i.e. VMT metadata** -Immediately upon receipt of the metadata, create a new GitLab issue. Please use the `Receiver_metadata` Issue checklist template. +## Submitted Metadata -Here is the Issue checklist, for reference: +Immediately upon receipt of the metadata, create a new GitLab ticket. Please use the `Receiver_metadata` Work item checklist template found in the drop down menu under the 'Description'. +Here is the Work item checklist, for reference: + ~~~ Receiver Metadata - [ ] - NAME add label *'loading records'* -- [ ] - NAME load raw receiver metadata (`deploy` notebook) **put_table_name_in_ticket** +- [ ] - NAME load raw receiver metadata (`deploy` notebook) **:fish: put_table_name_in_ticket** +- [ ] - NAME [OTN only] check for *new* lost indicator in recovery column, list receiver serial numbers for OTN inventory updating, tag OTN daq personnel (only for current deployments/recoveries, not historical) - [ ] - NAME check that station locations have not changed station "NAMES" since last submission (manual check) - [ ] - NAME verify raw table (`deploy` notebook) -- [ ] - NAME post updated metadata file to project repository (OTN members.oceantrack.org, FACT RW etc) -- [ ] - NAME email notification of updated metadata file to PI and individual who submitted +- [ ] - NAME post updated metadata file to project repository (OTN members.oceantrack.org, FACT RW etc) **:fish: put_link_to_updated_metadata** +- [ ] - NAME email notification of updated metadata file to PI and individual who submitted - [ ] - NAME load station 
records (`deploy` notebook) - [ ] - NAME verify stations (`deploy` notebook) - [ ] - NAME load to rcvr_locations (`deploy` notebook) @@ -54,11 +58,10 @@ Receiver Metadata - [ ] - NAME add transmitter records receivers with integral pingers (`deploy` notebook) - [ ] - NAME load to moorings (`deploy` notebook) - [ ] - NAME verify moorings (`deploy` notebook) -- [ ] - NAME label issue with *'Verify'* -- [ ] - NAME pass issue to OTN DAQ for reassignment to analyst -- [ ] - NAME check if project is OTN loan, if yes, check for lost indicator in recovery column, list receiver serial numbers for OTN inventory updating. -- [ ] - NAME pass issue to OTN analyst for final verification -- [ ] - NAME check for double reporting (verification_notebooks/Deployment Verification notebook) +- [ ] - NAME label work item with *'Verify'* +- [ ] - NAME reassign work item to OTN data analyst for final verification +- [ ] - NAME check for double reporting (verification_notebooks/`Deployment Verification` notebook) + **receiver deployment files/path:** ~~~ @@ -84,7 +87,7 @@ Check for the following in the deployment metadata: * recover_date_time 2. If any of the above mandatory fields are blank, follow-up with the researcher will be required if: * you cannot discern the values yourself. - * you do not have access to the Tag or Receiver Specifications from the manufacturer (relevant for the columns containing transmitter information). + * you do not have access to the Tag or Receiver Specifications from the manufacturer, these can be checked in DBeaver (relevant for the columns containing transmitter information). 3. Are the station names in the metadata consistent with those already loaded to the database (ex. '_yyyy' appended to station names or special characters in the metadata)? 4. Are all lat/longs in the correct sign? Are they in the correct format (decimal degrees)? 5. Do all transceivers/test tags have their transmitters provided? 
@@ -111,7 +114,7 @@ In GitLab, this task can be completed at this stage: # Quality Control - Deploy Nodebook -Each step in the Issue checklist will be discussed here, along with other important notes required to use the Nodebook. +Each step in the Ticket checklist will be discussed here, along with other important notes required to use the Nodebook. ### Imports Cell diff --git a/_episodes/10_Detections.md b/_episodes/10_Detections.md index c6a1de81..234ff4af 100644 --- a/_episodes/10_Detections.md +++ b/_episodes/10_Detections.md @@ -11,7 +11,7 @@ objectives: - "Learn common errors and pitfalls that come up when loading detections" keypoints: - "Its important to handle errors when they come up as they can have implications on detections" -- "OTN finishes off detections Issues by running Matching and sensor tag processing" +- "OTN finishes off detections Tickets by running Matching and sensor tag processing" --- ## Process workflow @@ -20,7 +20,7 @@ The process workflow for detection data is as follows: flowchart LR tag_start(( )) --> get_meta(Receive
detection data
from researchers) style tag_start fill:#00FF00,stroke:#00FF00,stroke-width:4px - get_meta --> gitlab(Create
Gitlab
issue) + get_meta --> gitlab(Create
Gitlab
Work item) gitlab --> inspect(Visually
inspect) inspect --> convert(Convert to
CSVs) convert --> nodebook(Process and verify
with nodebooks) @@ -34,39 +34,39 @@ Once `deployment metadata` has been processed for a project, the related detecti ## Submitted Records -Immediately upon receipt of the data files, you must create a new GitLab issue. Please use the `Detections` Issue checklist template. - -Here is the Issue checklist, for reference: +Immediately upon receipt of the data files, you must create a new GitLab Work item. Please use the `Detections` Work item checklist template found in the drop down menu under the 'Description'. +Here is the Work item checklist, for reference: + ~~~ Detections - [ ] - NAME add label *'loading records'* -- [ ] - NAME load raw detections and events `(detections-1` notebook and `events-1` notebook **OR** `Convert - Fathom Export` notebook and `detections-1` notebook) **(put table names here)** +- [ ] - NAME load raw detections and events `(detections-1` notebook and `events-1` notebook **OR** `convert - Fathom (vdat) Export - VRL to CSV` notebook and `detections-1` notebook) **:fish:(put table names here)** - [ ] - NAME upload raw detections to project folder (OTN members.oceantrack.org, FACT RW etc) if needed - [ ] - NAME verify raw detections table (`detections-1` notebook) - [ ] - NAME load raw events to events table (`events-2` notebook) -- [ ] - NAME load to detections_yyyy (`detections-2` notebook) **(put detection years that were loaded here)** +- [ ] - NAME load to detections_yyyy (`detections-2` notebook) **:fish:(put detection years that were loaded here)** - [ ] - NAME verify detections_yyyy (looking for duplicates) (`detections-2` notebook) -- [ ] - NAME load to sensor_match_yyyy (`detections-2` notebook) **(put sensor years that were loaded here)** +- [ ] - NAME load to sensor_match_yyyy (`detections-2` notebook) **:fish:(put sensor years that were loaded here)** - [ ] - NAME timedrift correction for affected detection and sensor years (`detections-2b` notebook) -- [ ] - NAME verify timedrift corrections (`detections-2b` notebook) -- [ ] - 
NAME manually check for open, unverified receiver metadata, **STOP** if it exists! (**put Gitlab issue number here**)
------
-- [ ] - NAME load to otn_detections_yyyy (`detections-3` notebook) **(put affected years here)**
+- [ ] - NAME verify timedrift corrections (`detections-2b` notebook) **:fish:(put affected years here)**
+- [ ] - NAME manually check for open, unverified receiver metadata, **STOP** if it exists! **(put Gitlab work item number here)**
+------
+- [ ] - NAME load to otn_detections_yyyy (`detections-3` notebook) **:fish:(put affected years here)**
+- [ ] - NAME load sentinel records (`detections-3` notebook)
 - [ ] - NAME verify otn_detections_yyyy (`detections-3` notebook)
-- [ ] - NAME load sentinel records (`detections-3` notebook)
-- [ ] - NAME check for missing receiver metadata (`detections-3b` notebook)
-- [ ] - NAME check for missing data records (`detections-3c` notebook)
+- [ ] - NAME check for missing receiver metadata (`detections-3b` notebook) and use Missing Metadata template to create a ticket **(put Gitlab work item number here)**
+- [ ] - NAME check for missing data records (`detections-3c` notebook) and use the Missing Detections template to create a ticket **(put Gitlab work item number here)**
 - [ ] - NAME load download records (`events-3` notebook)
 - [ ] - NAME verify download records (`events-3` notebook)
 - [ ] - NAME process receiver configuration (`events-4` notebook)
-- [ ] - NAME label issue with *'Verify'*
-- [ ] - NAME pass issue to OTN analyst for final steps
-- [ ] - NAME check for double reporting (verification_notebooks/`Detection Verification` notebook)
+- [ ] - NAME label work item with *'Verify'*, remove label *'loading records'*
+- [ ] - NAME reassign work item to OTN data analyst for final steps
+- [ ] - NAME run verification_notebooks/`Detection Verification` notebook
 - [ ] - NAME match tags to animals (`detections-4` notebook)
 - [ ] - NAME overwrite sentinel tags with animal tags (`detections-4b` notebook)
 - 
[ ] - NAME do sensor tag processing (`detections-5` notebook) - only done if vendor specifications are available -- [ ] - NAME update detection extract table + **detections files/path:** ~~~ @@ -109,7 +109,7 @@ Once the raw files are obtained, the data must often be converted to `.csv` form - Use the `ComPort` software to open the `.tbdb` file and export as CSV **For Lotek** -- Exporting to CSV is more complicated, please reach out to OTN for specific steps for a given instrument model +- Exporting to CSV is more complicated, please reach out to OTN for specific steps for a given instrument model. For **all other manufacturers**, contact OTN staff to get specifics on the detection data loading workflow. @@ -118,7 +118,7 @@ This will use the `vdat.exe` executable to export from VRL/VDAT to CSV. **IMPORTANT NOTE:** newer versions of `vdat.exe` are only being supported on Windows. Mac users will not be able to use this Nodebook. For instructions on using a program like Wine to run windows programs on other operating systems, contact the OTN Data Centre. -Before you begin, you will need to ensure you have access to a Fathom vdat executable. This executable ships with Fathom Connect for desktop computers as `vdat.exe` +Before you begin, you will need to ensure you have access to a Fathom vdat executable. This executable ships with Fathom Connect for Windows computers as `vdat.exe` - Access the Vemco (Innovasea) website to download `Fathom Connect` - [https://support.fishtracking.innovasea.com/s/downloads](https://support.fishtracking.innovasea.com/s/downloads) - Agree to the Licence @@ -126,8 +126,10 @@ Before you begin, you will need to ensure you have access to a Fathom vdat execu - Locate your ProgramFiles on your computer. Locate the `InnovaSea` subfolder, and the `Fathom` folder within. 
- Copy the full filepath to your `vdat.exe` file for use in the Nodebook - this will look like `C:/Program Files/Innovasea/Fathom/vdat.exe` + NOTE: Older versions of VDAT may have unintended consequences when converting newer files (like Open Protocol-enabled Innovasea receivers), and should not be used. Versions newer than `vdat-9.3.0-20240207-74ad8e-release` are safe to process Open Protocol data. **NOT RECOMMENDED BY OTN:** If you are desperate for an older version of `vdat.exe` you can find them [here](https://gitlab.oceantrack.org/otndc/vdat-working-group/-/tree/master/releases?ref_type=heads) + - **MAC Users Only** - Locate the vdat executable in your terminal by navigating with the command `cd /path/to/vdat/file` - Enable execution by running `chmod +x vdat` @@ -710,7 +712,7 @@ If a Push is ongoing, or if verification has not yet occurred, you **must** wait In GitLab, this task can be completed at this stage: -`- [ ] - NAME manually check for open, unverified receiver metadata, **STOP** if it exists! **(put GitLab issue number here)**` +`- [ ] - NAME manually check for open, unverified receiver metadata, **STOP** if it exists! **(put GitLab Work item number here)**` ### Creating detection views and loading to otn_detections @@ -1185,6 +1187,6 @@ The remaining steps in the GitLab Checklist are completed outside the Nodebooks. First: you should access the Repository folder in your browser and ensure the raw detections are posted in the `Data and Metadata` folder. -Finally, the Issue can be passed off to an OTN-analyst for final verification in the database. +Finally, the Work item can be passed off to an OTN-analyst for final verification in the database. 
{% include links.md %} diff --git a/_episodes/11_movers_notebooks.md b/_episodes/11_movers_notebooks.md index 63e058ad..b31818b1 100644 --- a/_episodes/11_movers_notebooks.md +++ b/_episodes/11_movers_notebooks.md @@ -15,44 +15,55 @@ keypoints: - "`mission metadata`, `telemetry data` and `detection data` should be submitted prior to the Moving platform data loading process." --- -Here is the issue checklist in the OTN Gitlab `Moving Platforms` template, for reference: +Here is the Work item checklist in the OTN Gitlab `Moving Platforms` template, for reference: ~~~ Moving platform -- [ ] - NAME load raw metadata file (`movers-1` notebook)**(:fish: table name: c_moving_platform_missions_yyyy)** + +metadata: **(put metadata plone link here)** + +data: **(put data plone link here)** + +telemetry: **(put telemetry plone link here)** + +- [ ] - NAME add label *'loading records'* +- [ ] - NAME add label *'Gliders/Movers'* +- [ ] - NAME verify raw mission metadata file (`movers-1` notebook) +- [ ] - NAME load raw mission metadata file (`movers-1` notebook)**(:fish: table name: c_moving_platform_missions_yyyy)** - [ ] - NAME load raw telemetry files (`movers-2` notebook) **(:fish: table name: c_moving_platform_telemetry_yyyy**) - [ ] - NAME create telemetry table from raw table (`movers-2` notebook) **(:fish: table name: moving_platform_telemetry_yyyy**) +- [ ] - NAME verify telemetry table (`movers-2` notebook) - [ ] - NAME combine mission metadata with telemetry (`movers-2` notebook) **(:fish: table name: moving_platform_mission_telemetry_yyyy)** -- [ ] - NAME load to raw detections (`detections-1` notebook) **(:fish: table name: c_detections_yyyy)** -- [ ] - NAME verify raw detections table (`detections-1` notebook) -- [ ] - NAME load raw events (`events-1` notebook) **(:fish: table name: c_events_yyyy )** -- [ ] - NAME load raw events to events table (`events-2` notebook) -- [ ] - NAME load to detections_yyyy_movers (`movers-2` notebook) **(:fish: put affected years 
here)** -- [ ] - NAME delete self detections (`movers-3` notebook) -- [ ] - NAME timedrift correction for affected detection (`movers-3` notebook) -- [ ] - NAME verify timedrift corrections (`movers-3` notebook) -- [ ] - NAME verify detections_yyyy_movers (looking for duplicates) (`movers-3` notebook) -- [ ] - NAME load to sensor match (`movers-3` notebook) **(:fish: put affected years here)** +- [ ] - NAME verify joined table (`movers-2` notebook) +- [ ] - NAME load to raw detections (if detections available) (`detections-1` notebook) **(:fish: table name: c_detections_yyyy)** +- [ ] - NAME verify raw detections table (if detections available) (`detections-1` notebook) +- [ ] - NAME load raw events (if events available) (`events-1` notebook) **(:fish: table name: c_events_yyyy )** +- [ ] - NAME load raw events to events table (if events available) (`events-2` notebook) +- [ ] - NAME load to detections_yyyy_movers (if detections available) (`movers-3` notebook) **(:fish: put number of detections here)** +- [ ] - NAME compare detections to telemetry (if detections available) (`movers-3b` notebook) +- [ ] - NAME delete self detections (if detections available) (`movers-3` notebook) **(:fish: add number of detections removed here)** +- [ ] - NAME add time drift factors (if events available) (`movers-3` notebook) **(:fish: put affected years here)** +- [ ] - NAME timedrift correction for affected detection (if detections available) (`movers-3` notebook) **(:fish: put affected years here)** +- [ ] - NAME verify timedrift corrections (if detections available) (`movers-3` notebook) +- [ ] - NAME verify detections_yyyy_movers (looking for duplicates) (if detections available) (`movers-3` notebook) +- [ ] - NAME load to sensor match (if detections available) (`movers-3` notebook) **(:fish: put affected years here)** - [ ] - NAME load formatted telemetry tables (`movers-4` notebook) **(:fish: put affected years here)** -- [ ] - NAME load reduced telemetry tables (`movers-4` 
notebook) **(:fish: put affected years here)** -- [ ] - NAME load glider as receiver tables (`movers-4` notebook) **(:fish: put affected years here)** -- [ ] - NAME load into vw_detections_yyyy_movers (`movers-4` notebook) **(:fish: put affected years here)** -- [ ] - NAME load view detections into otn_detections_yyyy (`movers-4` notebook) **(:fish: put affected years here)** -- [ ] - NAME verify otn_detections_yyyy (`movers-4` notebook) +- [ ] - NAME load moving_platform_as_receiver table (`movers-4` notebook) +- [ ] - NAME load into vw_detections_yyyy_movers (if detections available) (`movers-4` notebook) **(:fish: put affected years here)** +- [ ] - NAME load view detections into otn_detections_yyyy (if detections available) (`movers-4` notebook) **(:fish: put affected years here)** +- [ ] - NAME verify otn_detections_yyyy (if detections available) (`movers-4` notebook) - [ ] - NAME create mission and receiver records in moorings (`movers-4` notebook) -- [ ] - NAME load download records (`events-3` notebook) -- [ ] - NAME verify download records (`events-3` notebook) -- [ ] - NAME process receiver configuration (`events-4` notebook) -- [ ] - NAME label issue with *'Verify'* -- [ ] - NAME pass issue to analyst for final steps +- [ ] - NAME check for missing receiver metadata (`detections-3b` notebook) +- [ ] - NAME check for missing data records (`detections-3c` notebook) +- [ ] - NAME load download records (if events available) (`events-3` notebook) +- [ ] - NAME verify download records (if events available) (`events-3` notebook) +- [ ] - NAME process receiver configuration (if events available) (`events-4` notebook) +- [ ] - NAME label work item with *'Verify'* +- [ ] - NAME reassign work item to OTN data analyst for final steps - [ ] - NAME match tags to animals (`detections-4` notebook) -- [ ] - NAME update detection extract table - -metadata: **(put metadata repository link here)** - -data: **(put data repository link here)** +- [ ] - NAME overwrite sentinel 
tags with animal tags (`detections-4b` notebook) +- [ ] - NAME do sensor tag processing (`detections-5` notebook) -telemetry: **(put telemetry repository link here)** ~~~ diff --git a/_episodes/13_Database fix notebooks.md b/_episodes/13_Database fix notebooks.md index e849e27c..079f8e94 100644 --- a/_episodes/13_Database fix notebooks.md +++ b/_episodes/13_Database fix notebooks.md @@ -41,7 +41,7 @@ flowchart LR style E fill:#ffffff,color:#000000 E --> F{Does notebook exist?} style F fill:#000000,color:#ffffff - F -- No --> G[Create feature issue for
notebook creation and do fix
manually] + F -- No --> G[Create feature ticket for
notebook creation and do fix
manually] style G fill:#ffffff,color:#000000 G --> H(( )) style H fill:#FF0000,stroke:#FF0000 @@ -91,11 +91,11 @@ An exciting feature of the Database Fix Notebooks is that if you add a Gitlab to To integrate the Gitlab token into your kdbx file, please use the instructions found at the bottom of the [AUTH - Create and Update](https://gitlab.oceantrack.org/otn-partner-nodes/ipython-utilities/-/blob/main/AUTH%20-%20Create%20and%20Update.ipynb) notebook in ipython-utilities. -## Issue Creation +## Work Item Creation -The **first** step when you have confirmed an incorrect dataqbase value is to create a new Gitlab Issue with the `DB Fix` Issue checklist template. +The **first** step when you have confirmed an incorrect database value is to create a new Gitlab work item with the `DB Fix` work item checklist template. -Here is the Issue checklist, for reference: +Here is the work item checklist, for reference: ~~~ # **DB Fix Issue** @@ -124,7 +124,7 @@ Some of the Database Fix Notebooks require the user to provide a spreadsheet of The required columns will be shown in the description as well. Once input, if there are missing required columns, the notebook will display an error identifiying which columns are missing. -The spreadsheet should be created and added to the created Gitlab issue, either in the description or in a comment. +The spreadsheet should be created and added to the created Gitlab work item, either in the description or in a comment. ## Examples Once you know which notebook to use and have created the spreadsheet (if needed), you can open the correct Database Fix Notebook. This notebook will consist of a single cell to run. @@ -136,7 +136,7 @@ The notebooks have similar formats so four examples will be demonstrated below. 
### Example 1: Changing a receiver serial Let's say for the first example, a researcher has emailed saying that they made a typo in the receiver metadata and that serial 87654321 should actually be 12345678 for receivers 'CODE-87654321-2020-03-10' and 'CODE-87654321-2024-09-09' in project CODE. -The first step is to create a Gitlab issue with the relevant information titled 'CODE Change receiver serial'. +The first step is to create a Gitlab work item with the relevant information titled 'CODE Change receiver serial'. The next step would be to figure out which notebook to use to make this change. Running the first cell in `0. Which notebook should I use` gives the following results: @@ -175,7 +175,7 @@ If you have a gitlab token authorization associated with your kdbx, as mentioned ### Example 2: Changing tag end date Let's say for the second example, a researcher has emailed saying that they had forgotten to add the harvest date '2024-09-09 10:00:00' to tag 'A69-1303-12345' on animal 'CODE-Jane', which was released on '2024-01-01 13:00:00', which should be used instead of the estimated tag life '365 days'. -The first step is to create a Gitlab issue with the relevant information titled 'CODE Change tag end date with harvest date' or something with relevant information. +The first step is to create a Gitlab work item with the relevant information titled 'CODE Change tag end date with harvest date' or something with relevant information. The next step would be to figure out which notebook to use to make this change. Running the first cell in `0. Which notebook should I use` gives the following results: @@ -213,7 +213,7 @@ If you have a gitlab token authorization associated with your kdbx, as mentioned ### Example 3: Fixing the_geom Let's say for the third example, you are verifying tag metadata for project 'NSBS' and an error comes up from ipython-utilities saying that the_geom is incorrect and the instructions direct you to the 'fix the_geom' Database Fix Notebook. 
-The first step is to create a Gitlab issue with the relevant information titled 'NSBS fix the_geom'. +The first step is to create a Gitlab work item with the relevant information titled 'NSBS fix the_geom'. You can then open the `Fix the_geom` notebook as the ipython-utilities nodebook will direct you. In this notebook, there is a description that does not have a spreadsheet so no spreadsheet is needed. @@ -314,7 +314,7 @@ Once 'Update' is pressed, the notebook will display a success message describing If your .kdbx file includes a GitLab Access Token, as mentioned above, the Nodebook will automatically comment all updates and success messages in the created Gitlab ticket. Otherwise, you must copy and paste this information into the Issue manually. For Nodebooks that require updates to the databases of other nodes: -1. If you have a GitLab Access Token: the Nodebook will provide the SQL needed to update other nodes and will automatically add a 'Cross-node executions' label to the Gitlab Issue. +1. If you have a GitLab Access Token: the Nodebook will provide the SQL needed to update other nodes and will automatically add a 'Cross-node executions' label to the Gitlab work item. 2. If you **do not** have a GitLab Access Token: the Nodebook will provide the SQL needed to update other nodes. Please add a 'Cross-node executions' label onto the Issue manually. A text box to enter a super-user authorization to automatically run the SQL on the other node will be displayed: @@ -323,11 +323,11 @@ A text box to enter a super-user authorization to automatically run the SQL on t If you have a super-user authorization for the other Node (ie; you are the Data Manager for multiple Nodes, or are OTN staff): 1. You may enter the filepath of the super-user authorization in the above text box and will receive a success message, which will run the SQL on the other Node. -2. Please remove the 'Cross-node executions' label from the Issue once completed +2. 
Please remove the 'Cross-node executions' label from the work item once completed ![Example Node Authorization - Filled](../fig/nmt_dbfix_genex_nodeauth_filled_redacted.png) If you **do not** have the super-user authorizations for the other Node : 1. Please inform the associated Node Manager that you have SQL for them to run and send them the SQL. -2. After they have run the SQL, please add a comment to the issue saying they have run the SQL +2. After they have run the SQL, please add a comment to the work item saying they have run the SQL 3. Remove the 'Cross-node executions' label. \ No newline at end of file diff --git a/_episodes/14_The Push.md b/_episodes/14_The Push.md index 15b6bb6a..8c1d84e7 100644 --- a/_episodes/14_The Push.md +++ b/_episodes/14_The Push.md @@ -55,7 +55,7 @@ We have created an OTNDC Bot that announces updates through OTN's node Slack cha ## Push Reports -Once a push is completed, statistics are gathered about the overall push as well as metrics about each node. This process creates a snapshot of what each node looked like at the time of that push. The statistics tracked include metrics such as the number of issues in the push, the number of projects a node is managing, the total number of detections, and the size of the database. +Once a push is completed, statistics are gathered about the overall push as well as metrics about each node. This process creates a snapshot of what each node looked like at the time of that push. The statistics tracked include metrics such as the number of tickets in the push, the number of projects a node is managing, the total number of detections, and the size of the database. Using this data, a push report is generated for each node. These reports provide a summary of the push, including graphs and figures that illustrate how each node is growing over time. In addition to sharing these reports, we try to schedule a check-in meeting with nodes. 
These meetings are not only a chance for OTN to get information to the nodes but also for you to relay any information to us. @@ -83,11 +83,11 @@ Detection Extract files are formatted for direct ingestion by analysis packages During the Push process, any new detection matches that are made are noted in the `obis.detection_extracts_list` table of your Node. These entries will have several pieces of useful information: - `detection_extract`: this contains the project code, year, and type of extract that needs to be created. * ex: `ABC,2022,t` will suggest that project ABC needs the extract `matched to animals 2022` (tracker format) created. -- `git_issue_link`: the issue in which these detection matches were impacted +- `git_issue_link`: the ticket in which these detection matches were impacted - `push_date`: the date of the Push when this extract will have to be made Using these fields, the `detections-create detection extracts` Nodebook can determine which extracts need to be created for each push. - + **As of December 2024, please ensure you are on the `main` branch of ipython utilities before running this Nodebook** To switch branches in Git, please follow the instructions on this page [https://gitlab.oceantrack.org/otn-partner-nodes/ipython-utilities/-/wikis/updating-notebooks-after-bugfixes-and-new-features#changing-branches-of-ipython-utilities](https://gitlab.oceantrack.org/otn-partner-nodes/ipython-utilities/-/wikis/updating-notebooks-after-bugfixes-and-new-features#changing-branches-of-ipython-utilities) @@ -193,7 +193,6 @@ The last cell of the section sends the emails that you have constructed. 
-------------------------------------- **The following section is relevant to Nodes who use Plone as their document management system** - > ### Emailing Researchers - Plone (This method will be considered obsolete once the database contact method is fully adopted) > > Using the Plone users system, its possible to identify which researchers require an email notification. diff --git a/_episodes/15_Supplementary notebooks.md b/_episodes/15_Supplementary notebooks.md index 3b5f1e24..8407e9e4 100644 --- a/_episodes/15_Supplementary notebooks.md +++ b/_episodes/15_Supplementary notebooks.md @@ -9,7 +9,7 @@ objectives: keypoints: - "ipython-utilities has many useful notebooks for Node Managers to help them" --- - + OTN maintains several additional Nodebooks that fall outside the core `tag`, `deployment` and `detection` tools. These may be useful to Node managers who also deal with these particular scenarios. ## Check Environment diff --git a/_episodes/16_feature_requests.md b/_episodes/16_feature_requests.md index 4dfe5905..4591e11e 100644 --- a/_episodes/16_feature_requests.md +++ b/_episodes/16_feature_requests.md @@ -55,7 +55,7 @@ If you encounter an error in your Nodebooks, its possible there is an issue with To identify a bug, here are the steps to take: 1. Ask in our Slack channels to see if the error is caused by your dataset. This can include posting an error message, or just describing the output from the Nodebook, and why it is not as expected. -2. If OTN developers identify that the problem is not your dataset, the next step will be to create a GitLab Issue [here](https://gitlab.oceantrack.org/otn-partner-nodes/ipython-utilities/-/issues), using the `bug` template. You should assign to one of the OTN developers, and use the label `bugfix`. +2. 
If OTN developers identify that the problem is not your dataset, the next step will be to create a GitLab work item [here](https://gitlab.oceantrack.org/otn-partner-nodes/ipython-utilities/-/work_items), using the `bug` template. You should assign it to one of the OTN developers, and use the label `bugfix`. Here is the "bug" template, for your information: diff --git a/_episodes/17_Important Data Tables.md b/_episodes/17_Important Data Tables.md index d12bffa1..25856309 100644 --- a/_episodes/17_Important Data Tables.md +++ b/_episodes/17_Important Data Tables.md @@ -8,6 +8,8 @@ questions: + + **1. Vendor Schema**