Design Proposal

Kyle Zhou edited this page Dec 15, 2024 · 1 revision

Architecture

1. High-level component diagrams

[Figure: Design_Proposal component diagram]

The front-end client and the backend cloud service are internal to our system. The transcription and summarization APIs are external.

2. Architecture style

EchoNote will use the MVVM architectural pattern to manage user interactions.

  • The Model represents the data structures for user accounts, lecture recordings, summaries, and flashcards. It also includes the classes responsible for fetching data from cloud or local storage and for performing data operations.
  • The View is all of the UI elements: the screens for recording lectures, viewing summaries, browsing the folder system, managing flashcards, etc.
  • The ViewModel classes handle the actions between the View and the Model. For example, when the user starts a lecture recording through the View (by pressing the record button), the ViewModel responds by calling a method to start recording and updates its internal state.
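The record-button example can be sketched in plain Kotlin, without Android dependencies. This is only an illustration of the MVVM split: `LectureRecording`, `AudioRecorder`, `RecordingState`, and `RecordingViewModel` are hypothetical names, and a real implementation would likely build on Android's ViewModel and an observable state holder instead of a plain property.

```kotlin
// Hypothetical names throughout; a plain-Kotlin sketch with no Android dependencies.

// Model: data structure for a lecture recording.
data class LectureRecording(val id: Int, val title: String, val audioPath: String)

// Model: abstraction over whatever component captures audio (assumed interface).
interface AudioRecorder {
    fun start(outputPath: String)
    fun stop()
}

// UI state that the View would observe and render.
sealed interface RecordingState {
    object Idle : RecordingState
    data class Recording(val outputPath: String) : RecordingState
    data class Finished(val recording: LectureRecording) : RecordingState
}

// ViewModel: reacts to View events, calls the Model, and updates its internal state.
class RecordingViewModel(private val recorder: AudioRecorder) {
    var state: RecordingState = RecordingState.Idle
        private set

    private var nextId = 1

    // Called by the View when the user presses the record button.
    fun onRecordPressed(outputPath: String) {
        if (state is RecordingState.Recording) return // ignore duplicate presses
        recorder.start(outputPath)
        state = RecordingState.Recording(outputPath)
    }

    // Called by the View when the user presses stop.
    fun onStopPressed(title: String) {
        val current = state as? RecordingState.Recording ?: return
        recorder.stop()
        state = RecordingState.Finished(LectureRecording(nextId++, title, current.outputPath))
    }
}
```

Because the ViewModel depends only on the `AudioRecorder` interface, it can be unit tested with a fake recorder and never needs to touch the microphone.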

The application’s speech-to-summary feature will use a pipeline-based architecture to handle sequential data processing steps. This pipeline will ensure that the recording, transcription, summarization, and storage happen in an orderly and efficient manner. The stages of the pipeline would be as follows:

  • Recording: The user presses the record button to capture their lecture or office hour session. Audio data is recorded and temporarily stored for processing.
  • Transcription: The recorded audio is sent through a speech-to-text API. The API converts the speech into text in real time or after the recording session ends. The transcribed text is stored and ready for summarization.
  • Summarization: The transcribed text is processed by a summarization API. This API takes the raw transcription and identifies key points, generating a concise summary of the most important information presented during the lecture or meeting. The summarized text is stored in the Model, organized within user-created folders.
  • Storage: The summarized content is categorized based on the user's organization (e.g., by folder, class, or topic). Summaries are saved in the database, associated with metadata (class, date, topic), and made available for retrieval later. We will also be storing user-specific data (username, password).
  • Flash Cards (Stretch Goal): If the flashcard feature is enabled, summarized content may also be automatically turned into flashcards and stored.
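The stages above can be sketched as a chain in which each stage consumes the previous stage's output. Since the transcription and summarization APIs have not been chosen yet, they are modeled here as hypothetical interfaces (`Transcriber`, `Summarizer`, `FlashcardMaker`, `SummaryStore`); none of these names come from a real library, and a real version would probably use suspending functions for the network calls.

```kotlin
// All names below are hypothetical; the external APIs are stand-in interfaces.

data class NoteMetadata(val folder: String, val className: String, val date: String, val topic: String)
data class StoredSummary(val metadata: NoteMetadata, val summary: String, val flashcards: List<String>)

fun interface Transcriber { fun transcribe(audio: ByteArray): String }
fun interface Summarizer { fun summarize(transcript: String): String }
fun interface FlashcardMaker { fun makeCards(summary: String): List<String> }

interface SummaryStore { fun save(item: StoredSummary) }

// In-memory stand-in for the eventual database.
class InMemorySummaryStore : SummaryStore {
    val items = mutableListOf<StoredSummary>()
    override fun save(item: StoredSummary) { items.add(item) }
}

class SpeechToSummaryPipeline(
    private val transcriber: Transcriber,
    private val summarizer: Summarizer,
    private val store: SummaryStore,
    private val flashcardMaker: FlashcardMaker? = null, // stretch goal: null when disabled
) {
    // Runs transcription, summarization, optional flashcards, and storage in order.
    fun process(recordedAudio: ByteArray, metadata: NoteMetadata): StoredSummary {
        val transcript = transcriber.transcribe(recordedAudio)
        val summary = summarizer.summarize(transcript)
        val cards = flashcardMaker?.makeCards(summary) ?: emptyList()
        val stored = StoredSummary(metadata, summary, cards)
        store.save(stored)
        return stored
    }
}
```

Keeping each stage behind its own interface means a provider can be swapped after the first-sprint research without changing the rest of the pipeline.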

3. Components Outside of System

  • We will need a cloud service, such as AWS or Firebase, to host our backend for processing and storing the data.
  • We need to call external APIs to integrate specific functionality, such as:
    • Transcribing the audio files into text
    • Creating a summarized version of that text

4. Security and privacy

  • We must ensure that all data is securely stored and transmitted, with access controls so that each user has exclusive access to their own information.
  • The APIs used for summarizing and transcribing recordings need to be safeguarded to maintain data privacy.
  • Building an online application raises security and privacy concerns that are less critical in local apps, so we will need more comprehensive protection mechanisms to ensure a safer user experience.
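One way to picture the "exclusive access" requirement is an ownership check at the data layer. The sketch below is an in-memory stand-in, not the actual backend: `OwnedNote`, `NoteRepository`, and `AccessDeniedException` are hypothetical names, and in practice the same rule would more likely be enforced by the cloud service's own access rules.

```kotlin
// Hypothetical names; an in-memory stand-in for the eventual cloud database.

data class OwnedNote(val id: Int, val ownerId: String, val text: String)

class AccessDeniedException(message: String) : Exception(message)

class NoteRepository {
    private val notes = mutableMapOf<Int, OwnedNote>()

    fun save(note: OwnedNote) { notes[note.id] = note }

    // Returns the note only if the requester owns it; null if it does not exist.
    fun get(requesterId: String, noteId: Int): OwnedNote? {
        val note = notes[noteId] ?: return null
        if (note.ownerId != requesterId) {
            throw AccessDeniedException("User $requesterId may not read note $noteId")
        }
        return note
    }

    // Lists only the notes that belong to the requester.
    fun listFor(requesterId: String): List<OwnedNote> =
        notes.values.filter { it.ownerId == requesterId }
}
```

Every read path takes the requester's ID, so there is no way to fetch a note without going through the ownership check.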

Application Features

1. Target platform and OS:

  • Android

2. How we test:

  • Yarik has an Android phone for testing on a physical device, and the rest of the team will use an Android emulator on their laptops. We therefore do not anticipate any testing risks or issues.

3. Features that you do not know how to implement:

  • Which specific external APIs to use, how to call them, and what data to pass in what format
  • Implementing user authentication
  • Setting up a cloud backend service (AWS?)
  • How will these unknowns be addressed?
    • We will research how to implement these features and the best practices to do so. If necessary we will consult with our mentors.

4. Low-fidelity Prototypes of Our UI/Screens

[Image: low-fidelity prototypes of screens 1-9]

  • Welcome Screen Transition (Screens 1-3)
    • Start with the EchoNote welcome splash screen (screen 1). After a few seconds (or manual interaction), smoothly transition to the login/register screen (screens 2 & 3).
  • Home Screen with Category Folders (Screen 4)
    • After successful login/registration, the Home screen (screen 4) displays folder categories.
    • A default folder named "default" is preloaded, and users can add new categories with an empty item by clicking the '+' button.
    • Clicking a category should toggle open the list of contents (lecture notes, office hours, etc.).
  • Note Screen (Screen 5)
    • Clicking on an item within a category brings the user to a note screen.
    • Users can toggle between different subjects within the same category without navigating back (dropdowns).
  • Testing Mode (Screen 6)
    • Test Mode is accessible from both the “Test Me” button on notes and the “Test” icon in the footer.
    • In Test Mode, flashcards are generated from the user’s summarized notes in the selected category and subject.
  • Adding a New Note (Screens 7-9):
    • The Add New Note flow begins when users click the "+" button on the home or notes screen.
    • The Add screen should first ask the user to choose a category and subject (required fields).
    • Clicking the record button starts the recording.
    • After or during recording, the user can also write their own text by clicking on the text toggle.
    • Once the user is satisfied with their recording/text, they can finalize the note and Submit (screen 9). Ensure all required fields (category, subject, content) are filled before allowing submission.
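The required-field check before submission could look like the sketch below. `NoteDraft`, `missingFields`, and `canSubmit` are hypothetical names, and the sketch assumes that "content" is satisfied by either a recording or typed text, which the proposal does not state explicitly.

```kotlin
// Hypothetical names; assumes "content" means a recording, typed text, or both.

data class NoteDraft(
    val category: String = "",
    val subject: String = "",
    val recordingPath: String? = null,
    val typedText: String = "",
)

// Returns the names of the required fields that are still missing.
fun missingFields(draft: NoteDraft): List<String> = listOfNotNull(
    "category".takeIf { draft.category.isBlank() },
    "subject".takeIf { draft.subject.isBlank() },
    "content".takeIf { draft.recordingPath == null && draft.typedText.isBlank() },
)

// The Submit button would be enabled only when nothing is missing.
fun canSubmit(draft: NoteDraft): Boolean = missingFields(draft).isEmpty()
```

Returning the list of missing fields, not just a boolean, lets the Add screen highlight exactly which inputs still need attention.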

External Dependencies

  • Audio Recording API
    • Android has a native API called MediaRecorder that we will explore for implementing our audio recording functionality
  • UI
    • We will likely use Jetpack Compose to build the UI
  • Transcribing API
    • We plan on researching and testing out transcribing APIs in the first sprint
  • Summarizing API
    • We plan on researching and testing out summarizing APIs in the first sprint
