mingchungx/Curio

Curio

Curio is a research-based, unsupervised topic modelling pipeline for social media, written in Swift. It builds on existing libraries to support each stage of the pipeline: data collection, document encoding (e.g., Core ML, Model2Vec, Apple's Natural Language framework), dimensionality reduction (e.g., PCA, t-SNE, UMAP), clustering (e.g., HDBSCAN, k-means), and topic modelling. Our goal is to provide a modular, efficient set of tools that works across a variety of data sources. We use modern Swift concurrency and libraries like MLX to provide fast, safe implementations that run well on commodity Mac hardware. Curio aims to enable the development of new qualitative data analysis tools for edge devices such as laptops, tablets, and smartphones.
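As a rough illustration of the modular design described above, the stages can be pictured as small protocols chained by an async pipeline. This is only a sketch: every type and method name below (DocumentEncoder, DimensionalityReducer, Clusterer, TopicPipeline) is hypothetical and is not taken from Curio's actual API.

```swift
// Hypothetical stage protocols. These names are illustrative assumptions,
// NOT Curio's real types.
protocol DocumentEncoder {
    /// Turns each document into an embedding vector.
    func encode(_ documents: [String]) async throws -> [[Float]]
}

protocol DimensionalityReducer {
    /// Projects embeddings into a lower-dimensional space (e.g., PCA or UMAP).
    func reduce(_ vectors: [[Float]]) async throws -> [[Float]]
}

protocol Clusterer {
    /// Returns one cluster label per input vector (e.g., HDBSCAN or k-means).
    func cluster(_ vectors: [[Float]]) async throws -> [Int]
}

/// Chains the three stages and groups the documents by cluster label.
struct TopicPipeline<E: DocumentEncoder, R: DimensionalityReducer, C: Clusterer> {
    let encoder: E
    let reducer: R
    let clusterer: C

    func run(on documents: [String]) async throws -> [Int: [String]] {
        let embeddings = try await encoder.encode(documents)
        let reduced = try await reducer.reduce(embeddings)
        let labels = try await clusterer.cluster(reduced)

        // Collect the documents that share a cluster label.
        var groups: [Int: [String]] = [:]
        for (document, label) in zip(documents, labels) {
            groups[label, default: []].append(document)
        }
        return groups
    }
}
```

Because each stage sits behind a protocol, one encoder or clustering algorithm could be swapped for another without touching the rest of the pipeline.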

Roadmap

Installation

Curio is distributed as a Swift package. To use it with Swift Package Manager, add the following entry to the dependencies array in your Package.swift:

.package(url: "https://git.uwaterloo.ca/jrWallac/curio.git", from: "0.0.8")
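For context, a complete manifest might look like the sketch below. Only the .package(url:from:) line comes from this README. The package and target name MyAnalysisTool is hypothetical, and the tools version, the platform requirement, and the product name "Curio" are assumptions that should be checked against Curio's own Package.swift.

```swift
// swift-tools-version:5.9
// Minimal sketch of a manifest that depends on Curio.
import PackageDescription

let package = Package(
    name: "MyAnalysisTool",            // hypothetical package name
    platforms: [.macOS(.v14)],         // assumption: match whatever Curio requires
    dependencies: [
        // Dependency line taken from the installation instructions.
        .package(url: "https://git.uwaterloo.ca/jrWallac/curio.git", from: "0.0.8")
    ],
    targets: [
        .executableTarget(
            name: "MyAnalysisTool",
            dependencies: [
                // Assumption: the library product is named "Curio".
                .product(name: "Curio", package: "curio")
            ]
        )
    ]
)
```

With from: "0.0.8", Swift Package Manager resolves to the newest available release from 0.0.8 up to, but not including, 1.0.0.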

Contributing

This project is developed by a team of researchers from the Human-Computer Interaction and Health Lab at the University of Waterloo. The project is led by Prof. Jim Wallace, with contributions from:

  • Jason Zhao
  • Nicole Mathis
  • Peter Li
  • Adrian Davila
  • Henry Tian
  • Jean Nordmann
  • Mingchung Xia
  • Abhinav Jain
  • George Wang
  • Ali Raza Zaidi

If you would like to contribute to the project, contact Prof. Wallace with "Curio" in the subject line, and mention one or more of the roadmap items above that you would like to work on.

License

All original code is released under the MIT license for commercial and non-commercial use.
