Skip to content

Export user preferences on project export#3968

Open
kleintom wants to merge 8 commits intoSpeciesFileGroup:developmentfrom
kleintom:export_prefs
Open

Export user preferences on project export#3968
kleintom wants to merge 8 commits intoSpeciesFileGroup:developmentfrom
kleintom:export_prefs

Conversation

@kleintom
Copy link
Contributor

@kleintom kleintom commented Jun 5, 2024

I tested this locally with 1) default (reset) preferences, and 2)

{"layout"=>
  {"tasks::digitize::collectionObjects::hideBuffered"=>true,
   "tasks::digitize::collectionObjects::hideCatalogNumber"=>true,
   "tasks::digitize::LeftColumnOrder"=>["BiologicalAssociation", "TaxonDeterminationLayout", "TypeMaterial"],
   "tasks::extract::componentsOrder"=>["ByComponent", "MadeComponent", "ProtocolsComponent", "OriginComponent", "IdentifierComponent", "RepositoryComponent", "CustomAttributes"]},
 "disable_chime"=>false,
 "items_per_list_page"=>20,
 "taxon_name_autocomplete_redirects_to_valid"=>false,
 "default_hub_tab_order"=>["tasks", "data", "favorite"]}

Two caveats, please send back to me if there's any concern:

  • I didn't try all of the preferences, maybe there are some that are more complicated?
  • as far as I could tell the quoting gymnastics are necessary, but I didn't check/don't know if they do the right thing if there's already an escaped " in the preferences (since I'm expecting that never happens...)

kleintom added 2 commits June 4, 2024 22:21
Otherwise you get exceptions such as when tasks like New CE try to use Preferences.layout
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.78%. Comparing base (de5c9f0) to head (b764c70).
Report is 7 commits behind head on development.

Current head b764c70 differs from pull request most recent head f994a57

Please upload reports for the commit f994a57 to get more accurate results.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@               Coverage Diff                @@
##           development    #3968       +/-   ##
================================================
+ Coverage        58.17%   72.78%   +14.60%     
================================================
  Files             1655     1935      +280     
  Lines            52473    66524    +14051     
================================================
+ Hits             30527    48419    +17892     
+ Misses           21946    18105     -3841     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mjy mjy requested a review from LocoDelAssembly June 5, 2024 17:28
<h3> Export SQL </h3>
<p><em>Generate a downloadable copy of the database (PostgreSQL dump) with all data referenced in this project. Includes Community data like Sources, People, and Repositories.</em></p>

<p><em>All exported user passwords are set to</em> 'taxonworks'.</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of this (that I totally forgot about) then would be nice to add a warning that it is not a good idea at all to build a public TW instance out of the generated dump.

is_administrator: user.is_administrator || 'NULL',
hub_tab_order: "'{#{conn.escape_string(user.hub_tab_order.join(','))}}'"
hub_tab_order: "'{#{conn.escape_string(user.hub_tab_order.join(','))}}'",
preferences: %['"#{conn.escape_string(JSON.generate(user.preferences)).gsub('"', '\"')}"']
Copy link
Contributor

@LocoDelAssembly LocoDelAssembly Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried preferences: "'#{conn.escape_string(user.preferences.to_json)}'"? I think ' is the char that needs to be escaped, not ".

3.3.1 :017 > puts ActiveRecord::Base.connection.raw_connection.escape_string(%["hello" 'bye'])
"hello" ''bye''

@LocoDelAssembly
Copy link
Contributor

As general comment, my concern with this would be if there a potential to leak semi-sensitive (or private) information from other project members. If not a problem, after doing some revisions I think it is good to merge, otherwise I'd seek to make the user model tolerant to empty preferences and initialize them properly if so.

@LocoDelAssembly
Copy link
Contributor

Correction, it would export preferences for all users, not just project members (although non-members have their email and name fields redacted).

@mjy
Copy link
Member

mjy commented Jun 5, 2024

I think we can review preferences to ensure nothing sensitive is there. There shouldn't be, if there is we need to strike it from there and handle it elsewhere. We should never store credentials, or any personal information beyond things like "I want the form layed out this way".

@jlpereira
Copy link
Member

We don't save any sensitive information there, just layout preferences and copy paste shortcuts. Maybe we could add a checkbox option for this?
Like:

[x] Export user preferences

If it is not checked, initialize the users preferences. I'm not sure if export projects will always need export user preferences or not

@LocoDelAssembly
Copy link
Contributor

The initial motivation for this I guess was that @kleintom experienced software crashes on some tasks because preferences were not initialized. If that is the main goal here, then I think would be best to instead fix the model so User's preferences attribute returns a default set of preferences if none are stored in the database or have an invalid structure.

@mjy
Copy link
Member

mjy commented Jun 5, 2024

. If that is the main goal here, then I think would be best to instead fix the model so User's preferences attribute returns a default set of preferences if none are stored in the databa

Yes, that was intended already. Might just be that we have to re-initialize a legal JSON base, not sure, '{}'

kleintom added 2 commits June 5, 2024 18:39
This reverts commit f994a57.

Decided to go the route of making sure preferences have a default value instead - see the discussion at SpeciesFileGroup#3968
Because all passwords are set to 'taxonworks'.
@kleintom
Copy link
Contributor Author

kleintom commented Jun 6, 2024

The initial motivation for this I guess was that @kleintom experienced software crashes on some tasks because preferences were not initialized. If that is the main goal here, then I think would be best to instead fix the model so User's preferences attribute returns a default set of preferences if none are stored in the database or have an invalid structure.

Defaults are merged into any preferences being saved on before_save here:

before_save :fill_preferences

def fill_preferences
if preferences.empty?
reset_preferences
else
BASE_PREFERENCES.keys.each do |k|
preferences[k] = BASE_PREFERENCES[k] if send(k).nil?
end
end
true
end

The issue is that in the export/import case, users are created at the pg level directly by psql, and the default preferences in that case are {}.

With those empty preferences we get exceptions like here:

where we try to do preferences.layout.

I don't think we can make the default for User preferences be BASE_PREFERENCES (can we?); if preferences are non-sensitive/will stay non-sensitive, then always importing preferences, for everyone, seems to me like the easiest solution here. (I'm not clear on why non-project users are exported, maybe so you're guaranteed to have an existing administrator account?) Other thoughts?

@kleintom
Copy link
Contributor Author

kleintom commented Jun 6, 2024

Have you tried preferences: "'#{conn.escape_string(user.preferences.to_json)}'"?

Based on what I'm seeing online I expected that to work, but it didn't.

With the current release code, if I do select preferences from users (where I only have one user), it shows
"{\"layout\":{},\"disable_chime\":false,\"items_per_list_page\":20,\"taxon_name_autocomplete_redirects_to_valid\":false,\"default_hub_tab_order\":[\"tasks\",\"data\",\"favorite\"]}".

If I import using "'#{conn.escape_string(user.preferences.to_json)}'" and then do the select it shows {"layout":{},"disable_chime":false,"items_per_list_page":20,"taxon_name_autocomplete_redirects_to_valid":false,"default_hub_tab_order":["tasks","data","favorite"]}. (Which is what you would expect to see in a json field, no?)

But with that second imported preferences value, if I try to load a page that uses those preferences, or from the rails console I do User.first.preferences, I get an error no implicit conversion of Hash into String (TypeError) - that error hangs the app when you try to load a page that uses preferences.

It seems to expect the db value of preferences to be serialized json (which is what it appears to be with release code), not a json hash (though it's not clear to me in what sense the db value is a hash in this case). Schema.rb says t.json "preferences", default: {} for preferences.

If you update preferences to be {} (no quotes) it's fine with that, even though it's a "hash", not a double-quoted string...

I haven't been able to make sense of all of that yet; thoughts?

@LocoDelAssembly
Copy link
Contributor

The initial motivation for this I guess was that @kleintom experienced software crashes on some tasks because preferences were not initialized. If that is the main goal here, then I think would be best to instead fix the model so User's preferences attribute returns a default set of preferences if none are stored in the database or have an invalid structure.

Defaults are merged into any preferences being saved on before_save here:

before_save :fill_preferences

def fill_preferences
if preferences.empty?
reset_preferences
else
BASE_PREFERENCES.keys.each do |k|
preferences[k] = BASE_PREFERENCES[k] if send(k).nil?
end
end
true
end

The issue is that in the export/import case, users are created at the pg level directly by psql, and the default preferences in that case are {}.

In this case I would add this in user/preferences.rb

# Assuming this method will be instance method of User, if not the case then needs to be relocated (if possible)
def preferences
  prefs = read_attribute(:preferences)
  return prefs unless prefs.empty?

  reset_preferences
  read_attribute(:preferences)
end

What do you think? (cc @mjy)

With those empty preferences we get exceptions like here:


where we try to do preferences.layout.

I don't think we can make the default for User preferences be BASE_PREFERENCES (can we?); if preferences are non-sensitive/will stay non-sensitive, then always importing preferences, for everyone, seems to me like the easiest solution here. (I'm not clear on why non-project users are exported, maybe so you're guaranteed to have an existing administrator account?) Other thoughts?

Non project users are exported because housekeeping of community data might reference them.

@kleintom
Copy link
Contributor Author

kleintom commented Jun 6, 2024

That fixes the exception issue, though it doesn't persist the new prefs to the database. I don't know, would you want to call save in the accessor in that case?

Do we want to not export preferences then? (Could still export plus add the reader for future-proofing in other export cases.)

@mjy
Copy link
Member

mjy commented Jun 6, 2024

would you want to call save in the accessor in that case?

No, never :).

@LocoDelAssembly
Copy link
Contributor

Would be nice to figure out exactly how is ActiveRecord serializing the preferences attribute to make sure gsub is the correct solution. If we can be certain the export is done properly, then I guess it is fine to start exporting preferences. Would also include the custom getter even if the export includes preferences.

@mjy
Copy link
Member

mjy commented Jun 7, 2024

I remember battling this. I just realized this is a json not jsonb field, IIRC the latter was my preference. So first we should probalby unify all our json to one type. Can check migrations for pattern/comments.

kleintom added 3 commits June 10, 2024 09:34
Thanks to Hernán for the suggestion and code.

For when preferences are empty, in this case because the project was imported via psql and so avoided the before_save check that normally would fill any unset preferences.

Empty prefs lead to exceptions on things like preferences.layout reads, which assume default preferences exist.

Note that in general the preferences returned here won't be saved to the database.
`change_column` is always irreversible, so we provide down here.
@kleintom
Copy link
Contributor Author

The issue with user preferences showing up in the database as double-quoted strings: "{\"layout\":{},\"disable_chime\":false,\"items_per_list_page\":20,\"taxon_name_autocomplete_redirects_to_valid\":false,\"default_hub_tab_order\":[\"tasks\",\"data\",\"favorite\"]}" seems to be because there's a store property defined on that field:

store :preferences, accessors: BASE_PREFERENCES.symbolize_keys.keys, coder: JSON
From https://api.rubyonrails.org/classes/ActiveRecord/Store.html "Store gives you a thin wrapper around serialize", and the coder: JSON allows the serialization to be stored on a JSON field.

Seems strange that you can't insert actual json into your json column anymore (and how does it work if you only have a store accessor on one of your many json keys?). In any case, the link suggests that if your underlying field is already json, then you should use store_accessor instead of store. I confirmed that does work: you can insert "normal" json such as '{"a": true}' and still read it as User.a. But if you make the switch to store_accessor without migrating your old serialized data then you get errors lik a 500 in user/preferences.json.jbuilder when you try to do json.merge! @user.preferences on a string.

So I'm skipping exporting user preferences for now on project export, but let me know if there's more you'd like me to work on here.

@kleintom
Copy link
Contributor Author

Oops, I'll check out the errors.

@kleintom
Copy link
Contributor Author

I modeled the json -> jsonb migration that I wrote on previous migrations doing the same thing:

change_column :sqed_depictions, :metadata_map, :jsonb, using: 'metadata_map::jsonb', null: false, default: '{}'
change_column :sqed_depictions, :specimen_coordinates, :jsonb, using: 'specimen_coordinates::jsonb', null: false, default: '{}'

change_column :projects, :workbench_settings, :jsonb, using: 'workbench_settings::jsonb', null: false, default: '{}'

Note the default set to '{}' and not {} in all cases.

Most of the failing tests from my previous push are at

fp['recently_visited'] ||= []

The issue seems to be that we're trying to read the value of the footprints hash with key recently_visited when the current value of the hash is "{}" (not {}). This would work if footprints was a stored field, but it's not (as determined by User.stored_attributes). So in this case the default value of footprints should be {} and not '{}'. (Yuk.) Doing that fixes those tests. (Thank goodness for the tests!)

The Project migration linked above (which also renames workbench_settings to preferences) should be fine/correct because preferences in Project is also stored.

The sqed_depiction migration I think might be in error. specimen_coordinates is currently unused and unaccessed, so no errors there, but if you create two sqed depictions and then visit /tasks/accessions/breakdown/buffered_data/thumb_navigator/2.html you get a 500 on undefined method 'inject' for an instance of String here:

metadata_map.inject({}){|hsh, i| hsh.merge(i[0].to_i => i[1].to_sym)}
because metadata_map is the string "{}" (the default value) and you're trying to do an inject on it. If I change those default metadata_map to '{}' instead of "{}" then the page loads without error.

So I think the way forward is to do another migration to:

  • migrate user :footprint's default to {} (not '{}')
  • migrate sqed_depiction's metadata_map and specimen_coordinates defaults to {} (not '{}')
  • maybe migrate all existing sqed_depiction metadata_map and specimen_coordinates default values already in the database to {} (not '{}').

I can keep working on this - in which case let me know if everything sounds right - but it's a little hinky with the quotes vs unquotes and now the migrations are changing more, so feel free to take it if that works better.

@kleintom
Copy link
Contributor Author

Coming back to this again since it's still a pain and becoming moreso now that we store more user prefs.

Reminder of the basic issue:

  • when we export projects we don't export user preferences. When you load an exported project, some tasks will raise exceptions because they don't handle user prefs being empty.
    • User has to go to user settings and manually Reset their prefs.
  • since User prefs are a json field, we wanted to take this opportunity to migrate all json fields to be jsonb (or perhaps vice versa)
  1. New explanation for the struggles I was having interpreting the db values for User prefs I was seeing (via Claude of course):
    t.json "preferences", default: {}

    store :preferences, accessors: [:disable_chime], coder: JSON

These two json interpretations are ADDITIVE, not complementary :( PG does its conversion, AR does its own additional conversion. (Project may have the same issue with its preferences.)
In other words,

  # IMPORTANT: Double-encoding behavior with `store` and `json` column type
  #
  # The preferences column is a PostgreSQL `json` type (see db/schema.rb), but
  # we also use `store` with `coder: JSON` here (below). This creates a
  # double-encoding/decoding flow:
  #
  # When WRITING to the database:
  #   1. User model has a preferences Hash: { disable_chime: false }
  #   2. ActiveRecord::Type::Serialized (from `store`) encodes it:
  #      '{"disable_chime":false}'
  #   3. PostgreSQL json type stores this JSON string AS a json value:
  #      '"{"disable_chime":false}"' (Note: The database stores a JSON string
  #      containing another JSON string)
  #
  # When READING from the database:
  #   1. PostgreSQL returns the json column value: '{"disable_chime":false}' (a
  #      Ruby String)
  #   2. ActiveRecord::Type::Serialized (from `store`) decodes it: {
  #      disable_chime: false }
  #   3. User model receives the preferences Hash
  #
  # This means:
  #   - In PostgreSQL, preferences are stored as:
  #     '"{\\"disable_chime\\":false}"' (a JSON string value, not a JSON object)
  #   - Project exports must use JSON.generate(JSON.generate(user.preferences))
  #     to match this double-encoding behavior (see
  #     lib/export/project_data/sql.rb)
  #   - Raw SQL queries will return a JSON string, not a parsed object
  #
  # Historical note: This pattern works but is unconventional. Typically with a
  # `json` column, you would NOT use `coder: JSON` since PostgreSQL's adapter
  # already handles JSON encoding. Changing this would require a migration and
  # careful data conversion.
  store :preferences, accessors: [:disable_chime], coder: JSON

Possible actions regarding the double encoding

a) Just deal with it. It's possible now that we understand it (see new commit comment below).
b) Remove the store :preferences designation on User and (likely) on Project (those are the only two uses I'm seeing). As Claude says, this will require a migration (since currently prefs are stored in the db double-encoded).

  • I expect this will clear up my confusion around what formerly seemed like 'do-this-because-it-seems-to-work-not-because-you-know-why' changes I was considering and allow for:
    c) switching all json columns to jsonb with confidence.

(We could add backup columns as we have in other similar cases, if we do change things.)

Discussion regarding exporting prefs with Project export

  • It seemed we all felt pretty okay about exporting user prefs, provided we were confident we weren't storing any personal information in them.
    • That can now be done "cleanly" whether or not we make json changes (see new commit, where it works, cleanly (but dumbly)).

@kleintom
Copy link
Contributor Author

(This whole branch will need to be cleaned up of course, I'm not looking to push anything to dev right away.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants