Skip to content

Improve performance #43

@lo48576

Description

@lo48576

There is a lot of room for improvement.

Forced validation which can fail only by a bug

For example, this part.

iri-string/src/normalize.rs

Lines 641 to 657 in f95d50a

#[cfg(feature = "alloc")]
impl<S: Spec> ToDedicatedString for Normalized<'_, RiStr<S>> {
type Target = RiString<S>;
fn try_to_dedicated_string(&self) -> Result<Self::Target, TryReserveError> {
let s = self.try_to_string()?;
Ok(TryFrom::try_from(s).expect("[validity] the normalization result must be a valid IRI"))
}
}
#[cfg(feature = "alloc")]
impl<S: Spec> From<Normalized<'_, RiStr<S>>> for RiString<S> {
#[inline]
fn from(v: Normalized<'_, RiStr<S>>) -> Self {
v.to_dedicated_string()
}
}

When the syntax of the normalization result is already known, the algorithm guarantees that the resulting string is conformant to the syntax. However, Some part of this crate uses general and public interface to convert string types (such as TryFrom::try_from instead of unsafe counterparts and debug_assert!).
This kind of "should fail only in case of bug" test should basically only be run in debug build if the performance is important.

Non-allocating algorithm even when alloc or std is available

For example, this part.

// None: No segments are written yet.
// Some(false): Something other than `/` is already written as the path.
// Some(true): Only a `/` is written as the path.
let mut only_a_slash_is_written = None;
let mut too_deep_area_may_have_dot_segments = true;
while !rest.is_empty() && too_deep_area_may_have_dot_segments {
/// The size of the queue to track the path segments.
///
/// This should be nonzero.
const QUEUE_SIZE: usize = 8;

let mut queue: [Option<&'_ str>; QUEUE_SIZE] = Default::default();
let mut level: usize = 0;
let mut first_segment_has_leading_slash = false;
// Find higher path segments.
let mut end = 0;
for seg in PathSegmentsIter::new(&rest) {
let kind = seg.kind(&rest);
match kind {
SegmentKind::Dot => {
too_deep_area_may_have_dot_segments = true;
}
SegmentKind::DotDot => {
level = level.saturating_sub(1);
too_deep_area_may_have_dot_segments = true;

In this algorithm, the extra string iteration is necessary if . or .. exists in the deep compontent of the path, because the constant-sized queue is used in order to support non-alloc usage.
However, most of the use case of this crate would be web-related and thus std would be available, so in that case more efficient implementation would be possible. (For example, pre-allocating extra buffer and modifying it inline.)

Other parts that uses core::fmt::Display for serialization could also be improved using the memory allocation.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions