Updates for the Issue #15 #27
Introduced 00_getting_started_tutorial.ipynb as a beginner-friendly, interactive Jupyter notebook covering environment setup, model loading, first prediction, batch analysis, visualizations, and hands-on exercises. Updated docs/notebooks.md to document the new notebook and its features, and revised the recommended reading order for new users, SOC analysts, ML engineers, and contributors.
Pull request overview
This PR introduces a beginner-friendly tutorial notebook (00_getting_started_tutorial.ipynb) to help new users get started with AlertSage. The tutorial covers environment setup, model loading, predictions, batch analysis, visualizations, and hands-on exercises.
Key Changes:
- New interactive Jupyter notebook with 9 sections covering AlertSage fundamentals
- Updated documentation in docs/notebooks.md to include the new notebook in the sequence
- Revised recommended reading order for different user types (new users, SOC analysts, ML engineers, contributors)
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| notebooks/00_getting_started_tutorial.ipynb | New comprehensive getting started tutorial with setup verification, model loading, single/batch predictions, 4 interactive visualizations, uncertainty analysis, LLM overview, and 3 hands-on exercises |
| docs/notebooks.md | Updated notebook count from 10 to 11, added section documenting new tutorial features, and revised recommended reading order to start with notebook 00 for all user types |
| "🔍 Sample Incidents Preview:\n", | ||
| "\n", | ||
| " event_id event_type severity description\n", | ||
| " 171216 access_abuse medium Bsaed on current evidence repeated account lockuots for gina.t associated with sign-in attempts frmo unrecognized locations. auth telemetry shwos unusual IPs including 158.173.238.165 and 59.154.26.63, which do not match historical baselines. This pattern aligns with MITRE ATT&CK technique T1110 (Burte Force).\n", |
Multiple spelling errors in this sample incident description that appear to be intentional (simulating real-world typos): "Bsaed" should be "Based", "frmo" should be "from", "shwos" should be "shows", "Burte" should be "Brute".
| " 171216 access_abuse medium Bsaed on current evidence repeated account lockuots for gina.t associated with sign-in attempts frmo unrecognized locations. auth telemetry shwos unusual IPs including 158.173.238.165 and 59.154.26.63, which do not match historical baselines. This pattern aligns with MITRE ATT&CK technique T1110 (Burte Force).\n", | |
| " 171216 access_abuse medium Based on current evidence repeated account lockuots for gina.t associated with sign-in attempts from unrecognized locations. auth telemetry shows unusual IPs including 158.173.238.165 and 59.154.26.63, which do not match historical baselines. This pattern aligns with MITRE ATT&CK technique T1110 (Brute Force).\n", |
| " 171216 access_abuse medium Bsaed on current evidence repeated account lockuots for gina.t associated with sign-in attempts frmo unrecognized locations. auth telemetry shwos unusual IPs including 158.173.238.165 and 59.154.26.63, which do not match historical baselines. This pattern aligns with MITRE ATT&CK technique T1110 (Burte Force).\n", | ||
| " 420689 access_abuse info Preliminary analysis indicates repeated failed login attempts for alice.w from 201.48.110.54 (RU), followed by a successful loign outsied normal working huors. The system lokced the account briefly and then allowed access, whcih is consistent with password guessign. This pattenr aligns wtih MITRE ATT&CK technique T1110 (Brute Force). Login lcoation matches known travel but device is unrecognized.\n", | ||
| " 244999 access_abuse medium First-level review shows repeated account lockouts for leo.v associated wtih sign-in attempts form unrecognized locations. Authentication telemetry shows unusual IPs including 229.254.100.75 and 8.41.173.22,8 which do not match historical baselines. This behavior algins with MITRE ATTC&K technique T1021 (Remtoe Services).\n", | ||
| " 252969 benign_activity low Prleiminary analysis indiactes taht EDR alert on MACBOOK-SEC-01 is a false psoitive. The flagged process is a legitimate business application communicating wtih approved cloud srevices at 243.65.61.87:22. Security team verified the digital signature and confirmed this is authorized sfotware opertaing normally. Tuning rule 'Monitoring - Capacity threshold alret' to reduce noies.\n", |
Multiple spelling errors in this line: "Prleiminary" should be "Preliminary", "indiactes" should be "indicates", "taht" should be "that", "psoitive" should be "positive", "srevices" should be "services", "sfotware" should be "software", "opertaing" should be "operating", "alret" should be "alert".
| " 252969 benign_activity low Prleiminary analysis indiactes taht EDR alert on MACBOOK-SEC-01 is a false psoitive. The flagged process is a legitimate business application communicating wtih approved cloud srevices at 243.65.61.87:22. Security team verified the digital signature and confirmed this is authorized sfotware opertaing normally. Tuning rule 'Monitoring - Capacity threshold alret' to reduce noies.\n", | |
| " 252969 benign_activity low Preliminary analysis indicates that EDR alert on MACBOOK-SEC-01 is a false positive. The flagged process is a legitimate business application communicating wtih approved cloud services at 243.65.61.87:22. Security team verified the digital signature and confirmed this is authorized software operating normally. Tuning rule 'Monitoring - Capacity threshold alert' to reduce noies.\n", |
| "Top 3 Probabilities: suspicious_network_activity:0.52, web_attack:0.33, benign_activity:0.14\n", | ||
| "\n", | ||
| "Incident Description (first 200 chars):\n", | ||
| "Perliminary analysis indicates command-and-control-style traffic form WIN10-LAPTOP-01 (147.6.5.4205) to external infratsructure at 1188.4.105.112:3389 (CN). NetFlow analysis revelas DNS queries to sus...\n", |
Multiple spelling errors in the incident description: "Perliminary" should be "Preliminary", "form" should be "from", "infratsructure" should be "infrastructure", "revelas" should be "reveals".
| "Perliminary analysis indicates command-and-control-style traffic form WIN10-LAPTOP-01 (147.6.5.4205) to external infratsructure at 1188.4.105.112:3389 (CN). NetFlow analysis revelas DNS queries to sus...\n", | |
| "Preliminary analysis indicates command-and-control-style traffic from WIN10-LAPTOP-01 (147.6.5.4205) to external infrastructure at 1188.4.105.112:3389 (CN). NetFlow analysis reveals DNS queries to sus...\n", |
@KnightofInd The spelling errors Copilot caught are intended to mimic incident responders entering event descriptions in various ticketing systems. Since this NLP system utilizes a synthetic dataset, a dataset without injected typos or noise would have made the confusion matrix perfect and caused the baseline model to classify events inaccurately. I'll review the Jupyter notebook shortly and if it looks good, this PR will be squashed and merged. Thanks again for your help!
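For readers unfamiliar with this kind of noise injection, here is a minimal sketch of the idea, not the project's actual generator. The function name `inject_typos`, the `rate` parameter, and the rule of only swapping adjacent interior letters of alphabetic words are all assumptions made for illustration; the sample output above shows that the real dataset also perturbs IP addresses and other tokens.

```python
import random
from typing import Optional


def inject_typos(text: str, rate: float = 0.1, seed: Optional[int] = None) -> str:
    """Hypothetical helper: swap two adjacent interior letters in some words.

    Each purely alphabetic word of four or more letters is corrupted with
    probability `rate`. The first and last letters are left in place.
    """
    rng = random.Random(seed)  # local RNG so results are reproducible per seed
    noisy_words = []
    for word in text.split(" "):
        if len(word) >= 4 and word.isalpha() and rng.random() < rate:
            # Pick an interior position i so that i and i + 1 are both interior.
            i = rng.randrange(1, len(word) - 2)
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        noisy_words.append(word)
    return " ".join(noisy_words)


if __name__ == "__main__":
    sample = "Based on current evidence repeated account lockouts for gina.t"
    print(inject_typos(sample, rate=0.5, seed=7))
```

Because the swap only reorders characters, the noisy text has the same length and the same characters as the input, and fixing `seed` makes a given corruption reproducible across runs.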
texasbe2trill left a comment
@KnightofInd I reviewed your changes and fixes for issue #15. The implemented changes are good to squash and merge into main. Thanks again for your support and contribution; this was excellent work!
@texasbe2trill Thank you for the thorough review and the encouraging feedback. I’m glad the changes fit well with your goals. Appreciate the opportunity to contribute and happy to help again.
@KnightofInd Feel free to pick up another issue or create issues you see fit to enhance or fix. Thanks again for your excellent work and all of your help!