It would be better for find_keywords() not to rely on clean_job_descriptions() (or some other function in the future) to pass in data that has already been prepared in a very specific way. For example, the regex pattern in find_keywords() assumes that:
- All unnecessary punctuation (except those found in tech keywords) is going to be removed ahead of time
- The text will be lower-cased to match the case of the keywords set
- All white space will be normalized to one space between words and newlines will be removed
If any of these assumptions does not hold, the pattern can miss keywords or produce false positives.
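One way to reduce that coupling is to have find_keywords() normalize its own input before matching. The following is only a rough sketch: the KEYWORDS set, the _normalize() helper, and the lookaround pattern are all assumptions for illustration, since the actual keyword set and regex are not shown here.

```python
import re

# Hypothetical keyword set; the real one in the reviewed code may differ.
KEYWORDS = {"python", "c++", "c#", "node.js", "sql"}


def _normalize(text: str) -> str:
    """Lower-case the text and collapse whitespace runs (including newlines) to one space."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def find_keywords(text: str, keywords=KEYWORDS) -> set:
    """Return the keywords found in text, without assuming it was cleaned beforehand."""
    normalized = _normalize(text)
    found = set()
    for kw in keywords:
        # Lookarounds are used instead of \b so that keywords ending in
        # punctuation (c++, c#) match, while "sql" inside "mysql" does not.
        pattern = r"(?<![\w+#])" + re.escape(kw) + r"(?![\w+#])"
        if re.search(pattern, normalized):
            found.add(kw)
    return found


print(find_keywords("Experience with Python,\n  C++ and Node.js."))
```

With this shape, clean_job_descriptions() can still do its own cleanup, but find_keywords() no longer breaks if a caller passes mixed-case text, stray trailing punctuation, or embedded newlines.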