Skip to content

Conversation

@arnsholt
Copy link
Contributor

As we discussed in #31, From<> implementations for the types that seem to make sense (for EliasFano .iter().collect() should suffice, and the RMQ types are Deref and aren't constructed from BitVecs). While using the library I've also felt the need for the extend and split functionality, so I added those while I was at it.

Adds:

  • From<T> for BitVec implementations for converting RsVec, BpTree and WaveletMatrix back into BitVecs
  • impl Extend<BitVec> for BitVec and BitVec.extend_vec() to append many bits at once
  • BitVec.split_at() and ._split_at_unchecked() to split a BitVec into two halves

@arnsholt
Copy link
Contributor Author

And I see now that between me looking at this again yesterday and deciding to do it, and actually doing it today, you did most of it anyways. Oops. =)

@Cydhra
Copy link
Owner

Cydhra commented May 12, 2025

Well it doesn't matter, too much, the main work I did yesterday was fix the padding thing in b478799, which enables the From implementation to work in the first place. The work on the issue_31 branch isn't very substantial. You could rebase onto the branch though, if you want, as I already did the documentation work for the conversion methods, so you could rewire your From implementations to call those methods and profit from the existing documentation. If you think this is redundant, we can also ignore the branch.

@Cydhra Cydhra added the enhancement New feature or request label May 12, 2025
@Cydhra Cydhra linked an issue May 12, 2025 that may be closed by this pull request
Cydhra and others added 4 commits May 12, 2025 15:34
Adds:
- `From<T> for BitVec` implementations for converting RsVec, BpTree and
  WaveletMatrix back into BitVecs
- `impl Extend<BitVec> for BitVec` and `BitVec.extend_vec()` to append
  many bits at once
- `BitVec.split_at()` and `._split_at_unchecked()` to split a BitVec
  into two halves
@arnsholt
Copy link
Contributor Author

I rebased onto your branch and updated the From impls for RsVec and BpTree. Left WaveletMatrix as is for now.

I'm pretty sure I added edit for you on the PR, so if you have a clear idea of what tests you want you can put them in. Otherwise, I can try to add some tests tomorrow.

@Cydhra Cydhra self-requested a review May 12, 2025 19:53
Copy link
Owner

@Cydhra Cydhra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added test cases and fixed three bugs in split_at. I also removed the From<WaveletMatrix> implementation.


let iter_limbs = if leading_partial > 0 {
other.append_bits_unchecked(
self.data[first_limb] >> (WORD_SIZE - leading_partial),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails if the limb isn't WORD_SIZE long, see red test case.

@arnsholt
Copy link
Contributor Author

Wow, I clearly didn't think the splitting logic through properly. Sorry for rushing making the PR and making you find the bugs first! I had a proper think about the algorithm and ended up with a factoring where the initial logic was more or less sound, but with an added edge case that needed handling where the split point is inside a partial last limb of the input vector. I also added a few more test cases.

@arnsholt
Copy link
Contributor Author

And I completely failed to see the unchecked set of tests just above. Sorry for being so sloppy again!

@arnsholt
Copy link
Contributor Author

There, now it passes the full test suite here.

@Cydhra Cydhra merged commit b820503 into Cydhra:master May 15, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Conversion between data structures and BitVec

2 participants