This is a React app for the Data Masking Platform. It provides a user interface to interact with a backend service that masks sensitive data in text.
- Enter text to be processed and masked.
- Submit the text to the backend service for processing.
- Display the masked output text.
- Handle error cases gracefully.
To install and run the app locally:
- Clone the repository: `git clone https://github.com/your/repository.git`
- Navigate to the project directory: `cd project-directory`
- Install the dependencies: `npm install`
- Start the development server: `npm start`
- Access the app in your browser at http://localhost:3000.

To use the app:
- Enter the text you want to process in the input field.
- Click the "Submit" button to send the text to the backend service for processing.
- The processed and masked output will be displayed below the input field.
The app is configured to send requests to the backend service at http://127.0.0.1:5000/process_text. If your backend service is running on a different URL, you can modify the endpoint in the handleSubmit function of the App component.
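To check that the backend endpoint is reachable independently of the UI, a small script along these lines may help. It is only a sketch: the default URL is the one given above, but the JSON payload shape (a single `text` field) and the helper names `build_request` and `send_text` are assumptions for illustration, not something this README specifies.

```python
import json
import urllib.request

# Default endpoint from this README; pass a different URL if your backend runs elsewhere.
DEFAULT_ENDPOINT = "http://127.0.0.1:5000/process_text"


def build_request(text, endpoint=DEFAULT_ENDPOINT):
    """Build a POST request carrying the text to be masked."""
    # ASSUMPTION: the backend accepts a JSON body with a "text" field.
    # Adjust the payload to match what handleSubmit actually sends.
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_text(text, endpoint=DEFAULT_ENDPOINT, timeout=10):
    """Send the request and return the raw response body as a string."""
    with urllib.request.urlopen(build_request(text, endpoint), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the backend running, calling `send_text("some text")` should return the raw response body. Keeping the URL in one place mirrors the single point of change in `handleSubmit`.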
- React: JavaScript library for building user interfaces.
- Axios: Promise-based HTTP client for making API requests.
Contributions are welcome! If you have any suggestions, bug reports, or feature requests, please open an issue or submit a pull request.
This project is licensed under the MIT License.
NER model code. Here is an explanation of the code:
- The code begins by importing the necessary libraries: `csv` for reading training data from a CSV file, `spacy` for natural language processing, `random` for shuffling the training data, and `Example` from `spacy.training.example` for creating training examples.
- The function `offsets_to_biluo_tags` converts entity character offsets into a list of token-level tags. It takes a spaCy `doc` object and a list of `entities` as input and returns a list of tags. The name refers to the BILUO scheme (begin, in, last, unit, out), which extends the BIO (beginning-inside-outside) scheme with explicit tags for the last token of an entity and for single-token entities.
- The function `train_ner_model` trains a named entity recognition (NER) model using the provided training data. It takes `training_data` as input, which is a list of tuples containing the full text, masked text, entity spans, and other information.
  - It initializes a blank English pipeline using `spacy.blank("en")`.
  - It adds the NER component to the pipeline.
  - It extracts the unique entity labels from the training data and adds them as labels in the NER component.
  - It prepares the training data in spaCy format by converting the entity spans to the required format.
  - It trains the NER component, spaCy's transition-based neural entity recognizer, for a specified number of iterations.
  - Finally, it returns the trained NER model.
- The code reads the training data from a CSV file named `data.csv` and stores it in the `training_data` list. The CSV file should have columns for full text, masked text, entity spans, PII (Personally Identifiable Information) entities, and other entities.
- The `train_ner_model` function is called with the `training_data` to obtain the trained NER model.
- The trained NER model is saved to disk using the `to_disk` method, in a directory named `ner_model`.
- The code tests the NER model on a sample text by creating a spaCy `doc` object using the `ner_model` and the sample text.
- The `masked_text` variable is initialized with the sample text. Then, for each entity (`ent`) in `doc.ents`, the corresponding entity text is replaced with the string "{{MASKED}}".
- Finally, `masked_text` is printed; it contains the sample text with the identified entities replaced by "{{MASKED}}".
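To make the tagging step in the list above more concrete, here is a minimal pure-Python sketch of how character offsets map to BILUO tags. It is neither the project's `offsets_to_biluo_tags` nor spaCy's implementation. It assumes whitespace tokenization and non-overlapping entities given as `(start_char, end_char, label)` tuples, and the names `tokenize_with_offsets` and `biluo_tags` are hypothetical.

```python
import re


def tokenize_with_offsets(text):
    """Split on whitespace and return (token, start_char, end_char) triples."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def biluo_tags(text, entities):
    """Return one BILUO tag per whitespace token.

    entities: list of (start_char, end_char, label) tuples, assumed to be
    non-overlapping and aligned to token boundaries (a simplification).
    """
    tokens = tokenize_with_offsets(text)
    tags = ["O"] * len(tokens)  # "O" marks tokens outside any entity
    for start, end, label in entities:
        # Indices of the tokens that lie fully inside the entity span.
        covered = [i for i, (_, s, e) in enumerate(tokens) if s >= start and e <= end]
        if not covered:
            continue  # span matches no whole token; leave those tokens as "O"
        if len(covered) == 1:
            tags[covered[0]] = f"U-{label}"  # single-token (unit) entity
        else:
            tags[covered[0]] = f"B-{label}"  # first token of the entity
            for i in covered[1:-1]:
                tags[i] = f"I-{label}"  # tokens inside the entity
            tags[covered[-1]] = f"L-{label}"  # last token of the entity
    return tags


text = "Alice Smith lives in Paris"
entities = [(0, 11, "PERSON"), (21, 26, "GPE")]
print(biluo_tags(text, entities))
# ['B-PERSON', 'L-PERSON', 'O', 'O', 'U-GPE']
```

A real tokenizer splits punctuation from words, so actual token boundaries will differ from this whitespace split.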
In short, the code trains an NER model on the provided training data and demonstrates its use by masking the entities in a sample text.
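The masking step can be illustrated without a trained model. The sketch below assumes the entities are available as `(start_char, end_char)` character spans, which with spaCy could be collected as `(ent.start_char, ent.end_char)` for each `ent` in `doc.ents`. It replaces by offset rather than by entity text; the description above does not say which approach the original code takes, so treat this as one possible variant. The helper name `mask_spans` is hypothetical.

```python
MASK = "{{MASKED}}"


def mask_spans(text, spans, mask=MASK):
    """Replace each (start_char, end_char) span in text with the mask string."""
    masked = text
    # Work from right to left so that earlier offsets stay valid
    # after each replacement changes the length of the string.
    for start, end in sorted(spans, reverse=True):
        masked = masked[:start] + mask + masked[end:]
    return masked


sample = "Contact John Doe at john@example.com"
spans = [(8, 16), (20, 36)]  # "John Doe" and "john@example.com"
print(mask_spans(sample, spans))
# Contact {{MASKED}} at {{MASKED}}
```

Replacing by offset masks exactly the detected span. A plain text replacement would also mask any other occurrence of the same string elsewhere in the input, which may or may not be what you want.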