Fast Elixir RSS feed parser, a NIF wrapper around the Rust RSS crate

BurntCaramel/fast_rss

FastRSS

Parse RSS feeds very quickly

Intro | Compatibility | Installation | Usage | Benchmarks | Deploying | License


Intro

Parse RSS feeds very quickly

  • This is a Rust NIF built using rustler
  • Uses the Rust RSS crate to do the actual RSS parsing

Speed

FastRSS is already much faster than most of the pure Elixir/Erlang packages available. In the benchmarks, it was between 6.12x and 50.09x faster than the next fastest package tested (feeder_ex).

Compared to the slowest Elixir options tested (feed_raptor, elixir_feed_parser), FastRSS was in some cases 259.91x faster and used 5,412,308.17x less memory (0.00156 MB vs 8423.70 MB).

See the full benchmarks below.

Compatibility

FastRSS requires a minimum combination of Elixir 1.6.0 and Erlang/OTP 20.0, and is tested with a maximum combination of Elixir 1.11.1 and Erlang/OTP 22.0.

Installation

This package is available on Hex.

It can be installed by adding fast_rss to your list of dependencies in mix.exs:

def deps do
  [
    {:fast_rss, "~> 0.3.4"}
  ]
end

You also need the Rust compiler installed: https://www.rust-lang.org/tools/install

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Usage

There is only one function, FastRSS.parse/1. It takes an RSS string and returns {:ok, map}, where the map has string keys.

iex(1)> {:ok, map_of_rss} = FastRSS.parse("...rss_feed_string...")
iex(2)> Map.keys(map_of_rss)
["categories", "cloud", "copyright", "description", "docs", "dublin_core_ext",
 "extensions", "generator", "image", "items", "itunes_ext", "language",
 "last_build_date", "link", "managing_editor", "namespaces", "pub_date",
 "rating", "skip_days", "skip_hours", "syndication_ext", "text_input", "title",
 "ttl", "webmaster"]

The docs can be found at https://hexdocs.pm/fast_rss.
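
As a minimal sketch of working with the string-keyed map, the helper below pulls out the channel title and the item titles. `FeedSummary` and `summarize/1` are hypothetical names, not part of FastRSS. The "title" and "items" keys come from the key list above; the shape of each item (a string-keyed map with a "title" key) is an assumption. A hand-built map stands in for parser output so the snippet runs on its own.

```elixir
defmodule FeedSummary do
  @moduledoc """
  Hypothetical helper (not part of FastRSS) that pulls a few fields out of
  the string-keyed map returned in `{:ok, map}`.
  """

  # "title" and "items" are top-level keys listed in the output above.
  # Assumption: each entry in "items" is itself a string-keyed map with a
  # "title" key.
  def summarize(%{} = feed) do
    # Fall back to an empty list when "items" is missing or nil.
    items = Map.get(feed, "items") || []

    %{
      title: Map.get(feed, "title"),
      item_titles: Enum.map(items, fn item -> Map.get(item, "title") end)
    }
  end
end

# Hand-built stand-in for parser output, so the sketch runs without a feed.
feed = %{
  "title" => "Example Feed",
  "items" => [%{"title" => "First post"}, %{"title" => "Second post"}]
}

IO.inspect(FeedSummary.summarize(feed))
```

With a real feed, the map would come from `{:ok, feed} = FastRSS.parse(xml)`, as in the iex session above.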

Supported Feeds

Reading from the following RSS versions and extensions is supported:

  • RSS 0.90
  • RSS 0.91
  • RSS 0.92
  • RSS 1.0
  • RSS 2.0
  • iTunes
  • Dublin Core

Benchmark

HTML: https://avencera.github.io/fast_rss/

Benchmark run from 2020-02-22 05:23:47.524699Z UTC

System

Benchmark suite executing on the following system:

Operating System           macOS
CPU Information            Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
Number of Available Cores  16
Available Memory           32 GB
Elixir Version             1.10.1
Erlang Version             22.2.6

Configuration

Benchmark suite executing with the following configuration:

:time      30 s
:parallel  1
:warmup    5 s

Statistics

Input: anxiety

Run Time

Name                IPS     Average     Deviation  Median      99th %
fast_rss            188.57  5.30 ms     ±8.26%     5.45 ms     6.43 ms
feeder_ex           3.70    269.92 ms   ±5.34%     268.12 ms   316.12 ms
feed_raptor         2.99    334.01 ms   ±2.44%     331.03 ms   371.28 ms
elixir_feed_parser  1.94    515.72 ms   ±1.94%     516.10 ms   536.05 ms

Comparison

Name                IPS     Slower
fast_rss            188.57
feeder_ex           3.70    50.9x
feed_raptor         2.99    62.99x
elixir_feed_parser  1.94    97.25x

Memory Usage

Name                Memory      Factor
fast_rss            0.00156 MB
feeder_ex           17.21 MB    11004.73x
feed_raptor         268.53 MB   171693.91x
elixir_feed_parser  313.30 MB   200316.09x

Input: ben

Run Time

Name                IPS     Average     Deviation  Median      99th %
fast_rss            83.95   11.91 ms    ±10.29%    12.23 ms    16.17 ms
feeder_ex           13.33   75.04 ms    ±4.38%     74.21 ms    89.72 ms
elixir_feed_parser  3.52    284.18 ms   ±3.89%     283.83 ms   324.08 ms
feed_raptor         0.48    2078.76 ms  ±0.52%     2076.27 ms  2097.44 ms

Comparison

Name                IPS     Slower
fast_rss            83.95
feeder_ex           13.33   6.3x
elixir_feed_parser  3.52    23.86x
feed_raptor         0.48    174.51x

Memory Usage

Name                Memory      Factor
fast_rss            0.00155 MB
feeder_ex           27.86 MB    17990.96x
elixir_feed_parser  163.88 MB   105811.88x
feed_raptor         1577.41 MB  1018492.36x

Input: daily

Run Time

Name                IPS     Average     Deviation  Median      99th %
fast_rss            32.98   0.0303 s    ±7.62%     0.0313 s    0.0339 s
feeder_ex           4.94    0.20 s      ±4.61%     0.199 s     0.24 s
elixir_feed_parser  0.64    1.57 s      ±1.50%     1.57 s      1.63 s
feed_raptor         0.127   7.88 s      ±0.23%     7.88 s      7.90 s

Comparison

Name                IPS     Slower
fast_rss            32.98
feeder_ex           4.94    6.68x
elixir_feed_parser  0.64    51.86x
feed_raptor         0.127   259.91x

Memory Usage

Name                Memory      Factor
fast_rss            0.00153 MB
feeder_ex           109.73 MB   71555.78x
elixir_feed_parser  880.51 MB   574178.95x
feed_raptor         6386.12 MB  4164382.64x

Input: dave

Run Time

Name                IPS     Average     Deviation  Median      99th %
fast_rss            407.08  2.46 ms     ±9.83%     2.41 ms     3.16 ms
feeder_ex           56.52   17.69 ms    ±6.14%     17.37 ms    22.51 ms
elixir_feed_parser  8.90    112.31 ms   ±4.12%     111.93 ms   127.60 ms
feed_raptor         1.59    628.45 ms   ±1.60%     626.71 ms   656.74 ms

Comparison

Name                IPS     Slower
fast_rss            407.08
feeder_ex           56.52   7.2x
elixir_feed_parser  8.90    45.72x
feed_raptor         1.59    255.83x

Memory Usage

Name                Memory      Factor
fast_rss            0.00157 MB
feeder_ex           9.25 MB     5886.17x
elixir_feed_parser  80.42 MB    51170.23x
feed_raptor         571.18 MB   363425.45x

Input: sleepy

Run Time

Name                IPS     Average     Deviation  Median      99th %
fast_rss            760.30  1.32 ms     ±16.62%    1.21 ms     2.03 ms
feeder_ex           124.28  8.05 ms     ±6.94%     8.03 ms     10.32 ms
elixir_feed_parser  26.26   38.09 ms    ±5.08%     37.81 ms    44.42 ms
feed_raptor         3.21    311.16 ms   ±2.85%     307.86 ms   345.09 ms

Comparison

Name                IPS     Slower
fast_rss            760.30
feeder_ex           124.28  6.12x
elixir_feed_parser  26.26   28.96x
feed_raptor         3.21    236.57x

Memory Usage

Name                Memory      Factor
fast_rss            0.00157 MB
feeder_ex           4.28 MB     2726.19x
elixir_feed_parser  35.88 MB    22829.92x
feed_raptor         274.98 MB   174963.99x

Input: stuff

Run Time

Name                IPS     Average     Deviation  Median      99th %
fast_rss            19.19   0.0521 s    ±9.19%     0.0546 s    0.0635 s
feeder_ex           0.93    1.07 s      ±2.49%     1.07 s      1.15 s
elixir_feed_parser  0.53    1.88 s      ±1.22%     1.89 s      1.92 s
feed_raptor         0.0797  12.54 s     ±1.61%     12.44 s     12.77 s

Comparison

Name                IPS     Slower
fast_rss            19.19
feeder_ex           0.93    20.59x
elixir_feed_parser  0.53    36.11x
feed_raptor         0.0797  240.68x

Memory Usage

Name                Memory      Factor
fast_rss            0.00154 MB
feeder_ex           140.58 MB   91220.55x
elixir_feed_parser  1018.78 MB  661058.28x
feed_raptor         8424.44 MB  5466379.81x
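
As a reading aid for the comparison tables: the Slower column appears to be the fast_rss IPS divided by the other parser's IPS. The sketch below recomputes that ratio from the printed values. `BenchFactor` and `slower/2` are hypothetical names introduced for illustration, and the interpretation of the column is an assumption based on the numbers shown, not something the benchmark output states.

```elixir
defmodule BenchFactor do
  @moduledoc "Hypothetical helper for reading the comparison tables."

  # Ratio of the baseline's iterations per second to another parser's,
  # rounded to two decimals.
  def slower(baseline_ips, other_ips) when other_ips > 0 do
    Float.round(baseline_ips / other_ips, 2)
  end
end

# IPS values copied from the "dave" and "ben" run-time tables.
IO.inspect(BenchFactor.slower(407.08, 56.52), label: "dave, feeder_ex")
IO.inspect(BenchFactor.slower(83.95, 13.33), label: "ben, feeder_ex")
```

Because the tables print rounded IPS values, a recomputed factor can differ slightly from the Slower column. For the same reason, the memory factors cannot be reproduced exactly from the rounded MB figures.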

Deploying

Deploying Rust NIFs can be a little annoying because you have to install the Rust compiler. If you are having trouble deploying this package, open an issue and I will try to help you out.

I will then add the answer to the FAQ below.

Q. How do I deploy using an Alpine Dockerfile?

A. I recommend using a multi-stage Dockerfile and doing the following:

  1. In the stages where you build your deps and your release, make sure to install build-base and libgcc:

    # This step installs all the build tools we'll need
    RUN apk update && \
        apk upgrade --no-cache && \
        apk add --no-cache \
        git \
        curl \
        build-base \
        libgcc  && \
        mix local.rebar --force && \
        mix local.hex --force

  2. Install the Rust compiler and allow dynamic linking to the C library by setting RUSTFLAGS:

    # install rustup
    RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
    ENV RUSTUP_HOME=/root/.rustup \
        RUSTFLAGS="-C target-feature=-crt-static" \
        CARGO_HOME=/root/.cargo  \
        PATH="/root/.cargo/bin:$PATH"

  3. In the stage where you actually run your Elixir release, install libgcc:

    ################################################################################
    ## STEP 4 - FINAL
    FROM alpine:3.11
    
    ENV MIX_ENV=prod
    
    RUN apk update && \
        apk add --no-cache \
        bash \
        libgcc \
        openssl-dev
    
    COPY --from=release-builder /opt/built /app
    WORKDIR /app
    CMD ["/app/my_app/bin/my_app", "start"]

License

FastRSS is released under the Apache License 2.0 - see the LICENSE file.
