Fixed queue issue #84

Open
rishitha-ravi wants to merge 9 commits into upgrade-to-java21-es-update from java-21-taxonomy-propagation

@rishitha-ravi

Contributor

No description provided.

@rishitha-ravi rishitha-ravi requested a review from arunasd463 June 23, 2026 04:54
Session session = sessionFactory.openSession();
List<Long> documentIds = null;
try {
String hql = "SELECT DISTINCT docSciName.documentId FROM DocSciName docSciName "

You can name this variable `sql`, just to maintain uniformity.

query.setParameter("taxonConceptIds", taxonConceptIds);
documentIds = query.list();

logger.info("Found {} unique documentIds for taxonConceptIds: {}",

@arunasd463 arunasd463 Jun 23, 2026

Do we need info-level logging in DAO functions? We already log this in the error case.
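A minimal sketch of what a quieter DAO-style method could look like, assuming the goal is to log only on failure. `QuietDaoDemo` and `fetchIds` are hypothetical names, `java.util.logging` stands in for the project's own logger, and returning an empty list on failure is a choice made for this illustration rather than the PR's behavior.

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietDaoDemo {

    private static final Logger LOGGER = Logger.getLogger(QuietDaoDemo.class.getName());

    // Hypothetical stand-in for a DAO query method: it runs the supplied
    // query and stays silent on success; only a failure is logged.
    static List<Long> fetchIds(Supplier<List<Long>> query) {
        try {
            return query.get();
        } catch (RuntimeException e) {
            // The error path keeps its log entry, so failures remain visible.
            LOGGER.log(Level.SEVERE, "Query failed", e);
            return List.of();
        }
    }

    public static void main(String[] args) {
        // Success: nothing is logged, the result is simply returned.
        System.out.println(fetchIds(() -> List.of(10L, 20L)));

        // Failure: one SEVERE entry is logged and an empty list comes back.
        System.out.println(fetchIds(() -> {
            throw new IllegalStateException("simulated database error");
        }));
    }
}
```

If the result counts are still useful while debugging, a debug-level message in the calling service is one place they could go instead.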

return documentIds;
}

public void deleteByDocumentId(Long documentId) {

The same changes are needed here as in getDocumentIdsByTaxonConceptIds.

query.setParameter("bulkIds", bulkIds);
results = query.list();

logger.info("Found {} documents for bulkIds: {}", results != null ? results.size() : 0, bulkIds);

This info log is not needed.

}

public List<Document> findByBulkIds(List<Long> bulkIds) {
logger.info("Fetching documents by bulkIds: {}", bulkIds);

We can avoid this info log.

}

public List<Long> getDocumentIdsByTaxonConceptIds(List<Long> taxonConceptIds) {
logger.info("Fetching unique documentIds by taxonConceptIds: {}", taxonConceptIds);

Do we need this info log?

System.out.println("------------name finder process started-----------");
parsePdfWithGNFinder(ufile.getPath(), documentId);
}
if (externalUrl != null && externalUrl.startsWith("http")) {

Please check this with @thomvee. Both blocks will execute if both conditions are true.
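To make the concern concrete, here is a small self-contained sketch. The class `BranchDemo`, its method names, and the string labels are hypothetical, and the first condition (a non-null file path) is an assumption, since the diff hunk does not show it. It only demonstrates that two independent `if` statements both run when both conditions hold, while `else if` runs at most one.

```java
import java.util.ArrayList;
import java.util.List;

public class BranchDemo {

    // Same shape as the reviewed code: two independent checks.
    static List<String> independentIfs(String filePath, String externalUrl) {
        List<String> ran = new ArrayList<>();
        if (filePath != null) {
            ran.add("parse-file");
        }
        if (externalUrl != null && externalUrl.startsWith("http")) {
            ran.add("parse-url");
        }
        return ran;
    }

    // Alternative: at most one branch runs (the file wins if both are set).
    static List<String> exclusiveBranches(String filePath, String externalUrl) {
        List<String> ran = new ArrayList<>();
        if (filePath != null) {
            ran.add("parse-file");
        } else if (externalUrl != null && externalUrl.startsWith("http")) {
            ran.add("parse-url");
        }
        return ran;
    }

    public static void main(String[] args) {
        String path = "/tmp/a.pdf";
        String url = "http://example.org/a.pdf";
        System.out.println(independentIfs(path, url));    // both labels
        System.out.println(exclusiveBranches(path, url)); // only the file label
    }
}
```

Whether both should run, or which one should take priority, is the open question for the code owner.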

try {
List<Long> documentIds = docSciNameDao.getDocumentIdsByTaxonConceptIds(updateData.getBulkIds());
List<Document> documents = documentDao.findByBulkIds(documentIds);
for (Document doc : documents) {

Return early if no documents are found.

}

// Rerun for deleted
if (updateData.getDeleteRecoIds() != null) {

@arunasd463 arunasd463 Jun 23, 2026

This logic can be improved by using a Set, since each branch performs the same action. Please refer to the following:

public void handleTaxonByName(TaxonomyUpdateData updateData) {
        if (updateData == null) return;

        // 1. Accumulate all unique Taxon IDs that need processing across all scenarios
        Set<Long> taxonIdsToProcess = new HashSet<>();

        if (updateData.getBulkIds() != null) {
            taxonIdsToProcess.addAll(updateData.getBulkIds());
        }

        if (updateData.getDeleteRecoIds() != null) {
            taxonIdsToProcess.addAll(updateData.getDeleteRecoIds());
        }

        if (!Objects.equals(updateData.getOldName(), updateData.getName()) && updateData.getTargetId() != null) {
            taxonIdsToProcess.add(updateData.getTargetId());
        }

        // 2. Early exit if there is absolutely nothing to process
        if (taxonIdsToProcess.isEmpty()) {
            return;
        }

        // 3. Delegate to a single, optimized processing pipeline
        processDocumentsForTaxonIds(taxonIdsToProcess);
    }

    private void processDocumentsForTaxonIds(Set<Long> taxonIds) {
        try {
            // Fetch all associated document IDs in one single database call
            List<Long> documentIds = docSciNameDao.getDocumentIdsByTaxonConceptIds(new ArrayList<>(taxonIds));
            if (documentIds == null || documentIds.isEmpty()) {
                logger.info("No documents found for the requested taxon updates.");
                return;
            }

            // Fetch all Document objects in bulk
            List<Document> documents = documentDao.findByBulkIds(documentIds);
            
            // Local Cache Map to avoid hitting the resource API repeatedly for duplicate uFileIds
            Map<String, UFile> uFileCache = new HashMap<>();

            for (Document doc : documents) {
                try {
                    UFile resource = null;
                    
                    if (doc.getuFileId() != null) {
                        String uFileIdStr = doc.getuFileId().toString();
                        
                        // Look up in the local cache first before making an API call
                        if (uFileCache.containsKey(uFileIdStr)) {
                            resource = uFileCache.get(uFileIdStr);
                        } else {
                            logger.debug("Fetching resource from service for uFileId: {}", uFileIdStr);
                            resource = resourceService.getUFilePath(uFileIdStr);
                            
                            if (resource != null && resource.getPath() != null) {
                                resource.setPath(resource.getPath().replace("/documents", ""));
                            }
                            // Save to local cache (even if null, to avoid repeating a failing/empty request)
                            uFileCache.put(uFileIdStr, resource);
                        }
                    }

                    // Process the scientific names extraction safely
                    updateScientificNames(doc.getId(), resource, doc.getExternalUrl());

                } catch (com.strandls.resource.ApiException e) {
                    logger.error("API error fetching resource for documentId: {} (uFileId: {}). Skipping.", 
                            doc.getId(), doc.getuFileId(), e);
                } catch (Exception e) {
                    logger.error("Unexpected error processing documentId: {}. Skipping.", doc.getId(), e);
                }
            }
        } catch (Exception e) {
            logger.error("Critical error while batch processing taxonomy documents", e);
        }
    }
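The two ideas in the suggestion above (merging the ID sources into a set, and caching lookups including null results) can be tried in isolation. The sketch below is illustrative only: `TaxonBatchDemo`, `mergeIds`, and `CachingLoader` are hypothetical names, the lambda stands in for the resource service call, and the name-change check on the target ID is left out for brevity.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class TaxonBatchDemo {

    // Merge nullable ID sources into one de-duplicated, insertion-ordered set.
    static Set<Long> mergeIds(List<Long> bulkIds, List<Long> deleteIds, Long targetId) {
        Set<Long> ids = new LinkedHashSet<>();
        if (bulkIds != null) ids.addAll(bulkIds);
        if (deleteIds != null) ids.addAll(deleteIds);
        if (targetId != null) ids.add(targetId);
        return ids;
    }

    // Wraps a lookup so each key reaches the delegate at most once.
    static class CachingLoader<K, V> {
        private final Function<K, V> delegate;
        private final Map<K, V> cache = new HashMap<>();
        private int delegateCalls = 0;

        CachingLoader(Function<K, V> delegate) {
            this.delegate = delegate;
        }

        V get(K key) {
            // containsKey rather than a null check, so cached nulls are
            // honoured and a failing or empty lookup is not repeated.
            if (cache.containsKey(key)) {
                return cache.get(key);
            }
            delegateCalls++;
            V value = delegate.apply(key);
            cache.put(key, value);
            return value;
        }

        int delegateCalls() {
            return delegateCalls;
        }
    }

    public static void main(String[] args) {
        // IDs 1 and 2 appear in more than one source but are kept once.
        System.out.println(mergeIds(List.of(1L, 2L), List.of(2L, 3L), 1L));

        // The stand-in lookup returns null for "42" to mimic a missing file.
        CachingLoader<String, String> loader =
                new CachingLoader<>(id -> id.equals("42") ? null : "/files/" + id);
        for (String id : List.of("7", "42", "7", "42")) {
            loader.get(id);
        }
        System.out.println(loader.delegateCalls()); // two distinct keys, two calls
    }
}
```

A `LinkedHashSet` is used here only to keep the processing order predictable; a plain `HashSet`, as in the suggestion, de-duplicates just as well.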

@arunasd463 arunasd463 left a comment

Please check my comments and update the code.
