Make SpEnoteStore serializable #25
DangerousFreedom1984 wants to merge 37 commits into UkoeHB:seraphis_lib
Conversation
rbrunner7
left a comment
Took quite some work to grind through all those classes; nice that you took that on, so the rest of us can just profit :)
Nice code, but I had a number of question marks popping up.
Just a question to further my knowledge: What's the connection between serialization and equality comparison operators? Are they needed for serialization in one way or another?
No, they are not needed for serialization. Their only purpose is testing. Maybe they will be useful later, too.
src/seraphis_impl/checkpoint_cache.h
Outdated
Little nitpick: CheckpointCache&other looks strange to me without a blank. Saw it already in one or two other places.
Thanks. I will fix it.
src/seraphis_impl/checkpoint_cache.h
Outdated
Not sure my understanding is correct, but isn't turning this into serializable_map a "creeping in" of serialization into the library class that we want to avoid, with great effort, by using special serialization classes like here ser_CheckpointCache?
Not sure how it's already done as a pattern in other similar cases, and not sure whether having to copy maps one element at a time because of different type between this class and ser_CheckpointCache is a good trade-off for absolute "purity" here.
Why do we want to avoid the classes from serialization/containers.h? We are still using binary_archive with them. What do we actually want to avoid?
Well, I am not the best person to ask, because I myself, as "master" over the Seraphis library, probably wouldn't go through all the complications with those ser_ classes and would add serialization instrumentation directly to the Seraphis library structs and classes ...
But now, as we do keep them separate, I think we could well go the extra mile and keep any serialization stuff out of the Seraphis library, as a matter of principle and to avoid mixing things.
Maybe one day somebody will want to serialize Seraphis library structures with a completely different technology than binary_archive, where such a serializable_map would be unhelpful at best and, with some bad luck, a problem.
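To make the trade-off concrete, here is a minimal plain-C++ sketch (type and function names are hypothetical, not the PR's actual code) of the pattern under discussion: the library class keeps a plain std::map, and a ser_ mirror copies it element by element, so the library itself never depends on serializable_map:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Library-side type: plain STL, no serialization dependency.
struct CacheLike { std::map<std::uint64_t, std::uint64_t> checkpoints; };

// Serialization-side mirror (stand-in for a ser_ struct).
struct ser_CacheLike { std::vector<std::pair<std::uint64_t, std::uint64_t>> checkpoints; };

inline ser_CacheLike make_serializable_cache(const CacheLike &cache)
{
    ser_CacheLike out{};
    for (const auto &entry : cache.checkpoints)  // copy one element at a time
        out.checkpoints.emplace_back(entry.first, entry.second);
    return out;
}

inline CacheLike recover_cache(const ser_CacheLike &ser)
{
    CacheLike out{};
    for (const auto &entry : ser.checkpoints)
        out.checkpoints.emplace(entry.first, entry.second);
    return out;
}
```

The cost is the element-wise copy on every conversion; the benefit is that the library class could later be serialized with a completely different technology without touching its field types.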
src/seraphis_impl/enote_store.h
Outdated
Same question, of course, of using serializable things in Seraphis library objects as in the checkpoint cache.
Shouldn't we, quite in general, version all these classes with a version field? I can imagine smaller changes in the Seraphis enote origin context class that alone will not yet lead to SpEnoteOriginContextV2 but will lead to small changes in ser_SpEnoteOriginContextV1 and its serialization that we absolutely want to handle cleanly with the help of a version field?
Consider that the very simple serialization format used here has no meta info stored together with the data, thus without version fields you would have to resort to ugly heuristics in such cases.
I wonder whether we should treat any new such SERIALIZE_OBJECT block without VERSION_FIELD as defective and make version fields strictly mandatory ...
Yeah, that's a good point. If we change the type of one variable, for example, then we wouldn't know that in the serialization. On the other hand, if you don't break the serialization format, you don't need to know whether changes were made, since it would still deserialize correctly. I can't think of an example of "small changes" right now that would really require adding a version field. If the changes are big (for example, adding another field element), then we would need another ser_ struct. Can you think of an example that better supports the use of a version field?
Well, imagine you need one more uint64 in that struct. Or origin_status changes from char to int. Or something similar. The issue is: how would you read "old" files, files that do not yet have that extra int? How would you know, without seeing "Ah, this is still version 0, not yet version 1 with the new int", that you have to read the old data differently?
See here for a nice example in the existing codebase.
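For illustration, here is a self-contained plain-C++ sketch of what a version field buys (this is not the monero serialization macros; names are invented): the writer stores a version byte first, and the reader uses it to decide whether the uint64 added in "version 1" is present in the stream:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct RecordV1
{
    std::uint8_t origin_status{0};
    std::uint64_t extra{0};  // field added in "version 1"
};

inline std::vector<std::uint8_t> write_record(const RecordV1 &r, std::uint8_t version)
{
    std::vector<std::uint8_t> out;
    out.push_back(version);          // version stored up front
    out.push_back(r.origin_status);
    if (version >= 1)                // new field only exists in v1+ files
        for (int i = 0; i < 8; ++i)
            out.push_back(static_cast<std::uint8_t>(r.extra >> (8 * i)));
    return out;
}

inline RecordV1 read_record(const std::vector<std::uint8_t> &in)
{
    RecordV1 r{};
    const std::uint8_t version = in.at(0);
    r.origin_status = in.at(1);
    if (version >= 1)                // old v0 files simply stop here
        for (int i = 0; i < 8; ++i)
            r.extra |= static_cast<std::uint64_t>(in.at(2 + i)) << (8 * i);
    return r;
}
```

Without the leading version byte, a reader has no way to tell a short v0 record from a truncated v1 record, which is exactly the heuristic mess described above.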
I may misremember, but I have something at the back of my mind that either @jeffro256 or @j-berman once explained that before variants correctly serialize with binary_archive you have to add some additional declarations somewhere for them. Stuff like this that is now in the file cryptonote_basic.h:
VARIANT_TAG(binary_archive, cryptonote::txin_to_key, 0x2);
As I do not see you adding something like that: Am I mistaken? Is that not needed in your cases of variants?
Just now saw your own question regarding this: "(why) do I need to use the VARIANT_TAG when serializing variants?" A guess: It works without such tags, but only up to the first change, when you have to support more variants?
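A minimal plain-C++ sketch of the role such tags play (this stands in for what VARIANT_TAG does for binary_archive; it is not the actual mechanism): the writer stores a tag byte identifying the active alternative, and the reader dispatches on it. If tags are implicit and the set of alternatives later changes, old data decodes as the wrong type, which is why fixed explicit tags matter:

```cpp
#include <cassert>
#include <cstdint>
#include <variant>
#include <vector>

using Toy = std::variant<std::uint8_t, std::uint32_t>;

inline std::vector<std::uint8_t> write_variant(const Toy &v)
{
    std::vector<std::uint8_t> out;
    out.push_back(static_cast<std::uint8_t>(v.index()));  // the "tag"
    if (v.index() == 0)
        out.push_back(std::get<0>(v));
    else
        for (int i = 0; i < 4; ++i)
            out.push_back(static_cast<std::uint8_t>(std::get<1>(v) >> (8 * i)));
    return out;
}

inline Toy read_variant(const std::vector<std::uint8_t> &in)
{
    if (in.at(0) == 0)               // dispatch on the stored tag
        return Toy{in.at(1)};
    std::uint32_t x = 0;
    for (int i = 0; i < 4; ++i)
        x |= static_cast<std::uint32_t>(in.at(1 + i)) << (8 * i);
    return Toy{x};
}
```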
In all my own programming I tend to add asserts at the end of such constructs, here with a final else, so that if we ever have LegacyEnoteV6 and I forget to adjust this code to handle it, in addition to maybe 20 other places that have to deal with that 6th type, I will notice immediately.
Thanks. I agree. I will add the final conditional.
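As a sketch of that final conditional (the enote type names here are placeholders, not the actual Seraphis types): exhaustively handle each known alternative and fail loudly on anything else, so an unhandled new version cannot pass silently:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <variant>

struct LegacyEnoteV1 {};
struct LegacyEnoteV2 {};
using LegacyEnoteVariant = std::variant<LegacyEnoteV1, LegacyEnoteV2>;

inline std::string enote_version_string(const LegacyEnoteVariant &enote)
{
    if (std::holds_alternative<LegacyEnoteV1>(enote))
        return "v1";
    else if (std::holds_alternative<LegacyEnoteV2>(enote))
        return "v2";
    else  // the "final conditional": trips as soon as a new version is added
        throw std::logic_error("unhandled legacy enote version");
}
```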
That memcpy looks suspicious to me, and I wonder why such a blunt "hammer" should be necessary to just copy something, but I don't know the codebase enough to really judge. Maybe just have a good look again?
Yeah, sure. I didn't think much about it and just followed the way koe did it. I also think that there is a better way to do it.
Shouldn't that be make_serializable_checkpoint_cache_config?
I don't think so, as the struct just has one big name. But yeah, it looks better with underscores.
As far as I can see, you test converting between the "real" classes and the ser_ classes, which is certainly nice, but you do not test the actual serialization to bytes itself. Wouldn't it be important to test that as well, with a binary_archive and a memory stream for example? I mean, just one forgotten FIELD line, and all goes wrong, right?
Yeah, definitely. Actually my code is not working as it is now. I just spotted many mistakes, which I will correct soon. I definitely need more unit tests, and I will work on them.
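A byte-level round-trip test could look roughly like this plain-C++ sketch (a stand-in for binary_archive plus a memory stream; all names are invented): serialize to bytes, parse the bytes back, and compare with operator== so that a forgotten field shows up immediately as an inequality:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

struct ToyEnote
{
    std::uint64_t amount{0};
    std::uint64_t block_index{0};
    bool operator==(const ToyEnote &o) const
    { return amount == o.amount && block_index == o.block_index; }
};

// serialize to raw bytes (stand-in for serialization::dump_binary)
inline std::vector<std::uint8_t> dump_bytes(const ToyEnote &e)
{
    std::vector<std::uint8_t> out(sizeof(e.amount) + sizeof(e.block_index));
    std::memcpy(out.data(), &e.amount, sizeof(e.amount));
    std::memcpy(out.data() + sizeof(e.amount), &e.block_index, sizeof(e.block_index));
    return out;
}

// parse back from raw bytes (stand-in for serialization::parse_binary)
inline ToyEnote parse_bytes(const std::vector<std::uint8_t> &in)
{
    ToyEnote e{};
    std::memcpy(&e.amount, in.data(), sizeof(e.amount));
    std::memcpy(&e.block_index, in.data() + sizeof(e.amount), sizeof(e.block_index));
    return e;
}
```

This is exactly where the equality operators mentioned earlier in the review earn their keep.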
Hey @DangerousFreedom1984! I wrote a PR here monero-project#9077 that would allow you to serialize/deserialize the STL containers directly, without needing the `serializable_*` container classes. You can simply use the STL containers as-is.
Thank you @jeffro256! I think it will look much better with the changes that you propose. I will rebuild this assuming that your changes will be approved. It looks clean to me as it is now, because if I want to serialize something, I just need to search for the serializable definition and then apply the corresponding functions to do it. On the other hand, all structs (or almost all) should be serializable, and it should not be a big deal to call the serialization functions on them. It also seems to me that it is more efficient to do that directly. From my understanding now, I would vote to not have these intermediate serializable types.
It was the decision of @UkoeHB to not directly equip the Seraphis library classes with serialization capability, under the impression of existing serialization code that pretty much degenerated into a mess over time, so it may indeed be a good idea to make sure that this never happens to parts of the Seraphis library. I for one have the tendency to grant him as the architect a strong vote in such matters, one that we shouldn't overrule lightly. And as I argued already elsewhere, there are a number of serialization technologies around that may also get used for the Seraphis classes, which would be another argument for being "agnostic" in the library itself.
Some downstream code (most notably PR UkoeHB#25) wants to use the src/serialization lib for storing information persistently. When one builds classes/machines wishing to serialize containers, they must use the `serializable_*` container classes. In this case, this makes the Seraphis library code unnecessarily tightly coupled with the src/serialization code, since one cannot swap out the storage format without major refactoring of class field types. By serializing STL containers directly, we can abstract the serialization details away, making for a much cleaner design. A small bonus side effect of this change is that STL containers with custom Comparators, Allocators, and Hashers are serializable. `std::multimap` is added to the list of serializable containers. Depends upon monero-project#9069.
I just want to point out that after PR monero-project#9069, you can actually write free functions for serialization, which means that you can serialize directly from that function and don't have to create new types.
Nice! So I can use something like that for (almost) all structs in seraphis?
BLOB_SERIALIZER(SomeStruct);
serialization::dump_binary(some_struct, blob_struct);
serialization::parse_binary(blob_struct, some_struct_recovered);
I will try to do so and simplify this PR. Thanks.
Yes, you can do that. And for non-blob types, you can write a do_serialize() function. I might make changes to the serialization header to make this an easier process.
Actually it might be better to leave as-is for now so it matches with the rest of the file's serialization. I will probably refactor all these classes out later
I have been working on that all day. Now I can serialize the enote_store and I don't use the serialization_demo_*. I'm trying to make a tx serializable now without using the serialization_demo_*. I have to fix some bugs. I will post my changes as soon as I fix them.
…is_lib_hist_05_15_23 branch for commit history
* make JamtisDestinationV1 serializable --------- Co-authored-by: DangerousFreedom <monero-inflation-checker@protonmail.com>
* add operator== to JamtisPaymentProposals --------- Co-authored-by: DangerousFreedom <monero-inflation-checker@protonmail.com>
* make JamtisPaymentProposal serializable --------- Co-authored-by: DangerousFreedom <monero-inflation-checker@protonmail.com>
* derive view_balance from master key Co-authored-by: UkoeHB <37489173+UkoeHB@users.noreply.github.com> --------- Co-authored-by: DangerousFreedom1984 <monero-inflation-checker@protonmail.com> Co-authored-by: UkoeHB <37489173+UkoeHB@users.noreply.github.com>
* SpTxCoinbaseV1: remove block_reward field Not storing/serializing `block_reward` saves us a few bytes on coinbase transactions, and makes it so that you can't initialize a coinbase transaction that has a block reward not matching its output sum.
--------- Co-authored-by: UkoeHB <37489173+UkoeHB@users.noreply.github.com>
* direct & compact tx serialization: txs are [de]serialized directly from their classes, and sizes of containers are not serialized if they can be implied.
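The "implied size" idea in that commit can be sketched in a few lines of self-contained C++ (toy types, not the actual tx format): the input count is stored once, and the per-input vectors are written without their own size prefixes:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct ToyTx
{
    std::vector<std::uint8_t> key_images;  // one per input
    std::vector<std::uint8_t> amounts;     // one per input
};

inline std::vector<std::uint8_t> write_tx(const ToyTx &tx)
{
    std::vector<std::uint8_t> out;
    out.push_back(static_cast<std::uint8_t>(tx.key_images.size()));  // count stored once
    out.insert(out.end(), tx.key_images.begin(), tx.key_images.end());
    out.insert(out.end(), tx.amounts.begin(), tx.amounts.end());     // size implied
    return out;
}

inline ToyTx read_tx(const std::vector<std::uint8_t> &in)
{
    const std::size_t n = in.at(0);  // implies the size of both vectors
    ToyTx tx{};
    tx.key_images.assign(in.begin() + 1, in.begin() + 1 + n);
    tx.amounts.assign(in.begin() + 1 + n, in.begin() + 1 + 2 * n);
    return tx;
}
```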
Implement async wallet scanner.
Adds a new functional test for direct wallet2 -> live RPC daemon
interactions. This sets up a framework to test pointing the
Seraphis wallet lib to a live daemon.
Tests sending and scanning:
- a normal transfer
- a sweep single (0 change)
- to a single subaddress
- to 3 subaddresses (i.e. scanning using additional pub keys)
* scan machine: option to force reorg avoidance increment first pass
- when pointing to a daemon that does not support returning empty
blocks when the client requests too high of a height, we have to
be careful in our scanner to always request blocks below the chain
tip, in every request.
- by forcing the reorg avoidance increment on first pass, we make
sure clients will always include the reorg avoidance increment
when requesting blocks from the daemon, so the client can expect
the request for blocks should *always* return an ok height.
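The bullet points above amount to simple height arithmetic; a hedged sketch (the function name and parameters are hypothetical, not the scan machine's actual API):

```cpp
#include <cassert>
#include <cstdint>

// Always back the first requested start height off by the reorg avoidance
// increment, so the request stays at a height the daemon can serve.
inline std::uint64_t first_scan_start_height(std::uint64_t wallet_refresh_height,
                                             std::uint64_t reorg_avoidance_increment)
{
    return wallet_refresh_height > reorg_avoidance_increment
        ? wallet_refresh_height - reorg_avoidance_increment
        : 0;
}
```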
* core tests: check conversion tool on all legacy enote version types
Still TODO:
- check complete scanning on all enote types
- hit every branch condition for all enote versions
* conn pool mock: epee http client connection pool
- Enables concurrent network requests using the epee http client.
- Still TODO for production:
1) close_connections
2) require the pool respect max_connections
* enote finding context: IN LegacyUnscannedChunk, OUT ChunkData
- finds owned enotes by legacy view scanning a chunk of blocks
* async: function to remove minimum element from token queue
- Useful when we want to remove elements of the token queue in an
order that is different than insertion order.
* async scanner: scan via RPC, fetching & scanning parallel chunks
*How it works*
Assume the user's wallet must start scanning blocks from height
5000.
1. The scanner begins by launching 10 RPC requests in parallel to
fetch chunks of blocks as follows:
```
request 0: { start_height: 5000, max_block_count: 20 }
request 1: { start_height: 5020, max_block_count: 20 }
...
request 9: { start_height: 5180, max_block_count: 20 }
```
2. As soon as any single request completes, the wallet immediately
parses that chunk.
- This is all in parallel. For example, as soon as request 7
responds, the wallet immediately begins parsing that chunk in
parallel to any other chunks it's already parsing.
3. If a chunk does not include a total of max_block_count blocks,
and the chunk is not the tip of the chain, this means there was a
"gap" in the chunk request. The scanner launches another parallel
RPC request to fill in the gap.
- This gap can occur because the server will **not** return a
chunk of blocks greater than 100mb (or 20k txs) via the
`/getblocks.bin` RPC endpoint
([`FIND_BLOCKCHAIN_SUPPLEMENT_MAX_SIZE`](https://github.com/monero-project/monero/blob/053ba2cf07649cea8134f8a188685ab7a5365e5c/src/cryptonote_core/blockchain.cpp#L65))
- The gap is from `(req.start_height + res.blocks.size())` to
`(req.start_height + req.max_block_count)`.
4. As soon as the scanner finishes parsing the chunk, it
immediately submits another parallel RPC request.
5. In parallel, the scanner identifies a user's received (and
spent) enotes in each chunk.
- For example, as soon as request 7 responds and the wallet
parses it, the wallet scans that chunk in parallel to any other
chunks it's already scanning.
6. Once a single chunk is fully scanned locally, the scanner
launches a parallel task to fetch and scan the next chunk.
7. Once the scanner reaches the tip of the chain (the terminal
chunk), the scanner terminates.
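Steps 1 and 3 above boil down to simple height arithmetic, sketched here in self-contained C++ (struct and function names are illustrative, not the scanner's actual API):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct ChunkRequest { std::uint64_t start_height; std::uint64_t max_block_count; };
struct GapRange     { std::uint64_t start_height; std::uint64_t end_height; };  // [start, end)

// step 1: launch num_requests parallel chunk requests back to back
inline std::vector<ChunkRequest> make_initial_requests(std::uint64_t start_height,
                                                       std::uint64_t max_block_count,
                                                       std::size_t num_requests)
{
    std::vector<ChunkRequest> requests;
    requests.reserve(num_requests);
    for (std::size_t i = 0; i < num_requests; ++i)
        requests.push_back({start_height + i * max_block_count, max_block_count});
    return requests;
}

// step 3: the gap left when the daemon returns fewer blocks than requested
inline GapRange chunk_gap(const ChunkRequest &req, std::uint64_t num_blocks_returned)
{
    return {req.start_height + num_blocks_returned,
            req.start_height + req.max_block_count};
}
```

With the example numbers above (start 5000, chunks of 20, 10 requests), request 9 starts at height 5180, matching the request layout shown in step 1.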
*Some technical highlights*
- The wallet scanner is backwards compatible with existing daemons
(though it would need to point to an updated daemon to realize the
perf speed-up).
- On error cases such as the daemon going offline, the same wallet
errors that wallet2 uses (that the wallet API expects) are
propagated up to the higher-level Seraphis lib.
- The implementation uses an http client connection pool (reusing
the epee http client) to support parallel network requests
([related](seraphis-migration/wallet3#58)).
- A developer using the scanner can "bring their own blocks/network
implementation" to the scanner by providing a callback function of
the following type as a param to the async scanner constructor:
`std::function<bool(const cryptonote::COMMAND_RPC_GET_BLOCKS_FAST::request, cryptonote::COMMAND_RPC_GET_BLOCKS_FAST::response)>`
---------
Co-authored-by: jeffro256 <jeffro256@tutanota.com>
--------- Co-authored-by: SNeedlewoods <sneedlewoods_1@protonmail.com>
This PR removes "universal"-style indexing for legacy CLSAG rings, and replaces it with a reference set scheme that uses (amount, index in amount) indexing pairs to reference on-chain enotes. This is the same method that Cryptonote txs use, and is how the current Monero Core LMDB database is referenced. Doing things this way means that the database will not have to be re-indexed, saving at a very minimum 1.6 GB (100M on-chain enotes * 16 bytes for extra table keys) of storage space, and an expensive database migration involving moving all existing enote data to a new table. We change the MockLedgerContext to support this indexing scheme. In practice, serialized txs under this method shouldn't take up much more space than pre-PR if compressed cleverly, and assuming most ring members will be RingCT enotes. We also add LegacyEnoteOriginContext for contextualized enote records so we can better keep track of scanned legacy enotes under the legacy indexing scheme. Co-authored-by: SNeedlewoods <sneedlewoods_1@protonmail.com>
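The (amount, index in amount) referencing scheme can be sketched as a two-level lookup (toy types, not the actual LMDB layout): on-chain enotes are grouped per amount, and a ring member is referenced by its position within its amount's list rather than by a single global index:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using Enote = std::uint64_t;  // stand-in for a full enote record
using EnotesByAmount = std::map<std::uint64_t, std::vector<Enote>>;

// reference an on-chain enote by its (amount, index in amount) pair
inline const Enote &lookup_enote(const EnotesByAmount &ledger,
                                 std::uint64_t amount,
                                 std::uint64_t index_in_amount)
{
    return ledger.at(amount).at(index_in_amount);
}
```

For RingCT outputs the amount key is 0, so those all live in one list, which is why this pairing adds little overhead when most ring members are RingCT enotes.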
The idea is to make the SpEnoteStore class serializable so we can store/load it into/from files, as it is really important for the wallet to recover the owned/spent enotes. I tried to keep the serialization pattern used in seraphis (using binary_archive) and I made the following modifications:
- serialization_demo_types.h
- serialization_demo_utils
- checkpoint_cache.* (and removed the const of the variables there)
- enote_store

Depends upon:
monero-project#9069
monero-project#9077